Machine Learning for Sciences — Nonlinear Feature Selection for High-Dimensional Data

Published: 2018-10-12

Lecture title: Machine Learning for Sciences — Nonlinear Feature Selection for High-Dimensional Data

Speaker: Makoto Yamada

Time: Monday, October 15, 2018, 09:00-11:00

Venue: Multi-Function Hall, 2nd floor, Dongrong Conference Center, Central Campus

Organizers: School of Artificial Intelligence; International Cooperation Joint Laboratory for Future Science

Abstract

Feature selection is an important machine learning problem. However, there are few methods that can select features from large, ultra-high-dimensional data (more than a million features) in a nonlinear way. In this talk, we first introduce the Hilbert-Schmidt Independence Criterion Lasso (HSIC Lasso), which can efficiently select non-redundant features from small, high-dimensional data in a nonlinear way. A key advantage of HSIC Lasso is that it is a convex method and can therefore find a globally optimal solution. We then extend the method to handle ultra-high-dimensional data by incorporating a distributed computing framework. Finally, we introduce two newly proposed algorithms, the localized lasso and hsicInf: the localized lasso is useful for selecting a set of features for each sub-cluster, and hsicInf can obtain p-values for the selected features from any type of data.
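
For readers unfamiliar with the idea behind HSIC Lasso, the following is a minimal sketch and not the speaker's implementation. It assumes a Gaussian kernel with fixed bandwidth on standardized inputs and solves the non-negative lasso over centered Gram matrices with plain coordinate descent. The function names (`centered_gram`, `hsic_lasso`), the bandwidth `sigma`, the regularization value `lam`, and the toy data are all illustrative assumptions.

```python
import numpy as np


def centered_gram(v, sigma=1.0):
    """Centered, Frobenius-normalized Gaussian Gram matrix of a 1-D sample vector."""
    v = np.asarray(v, dtype=float)
    v = (v - v.mean()) / (v.std() + 1e-12)          # standardize (assumed preprocessing)
    sq_dist = (v[:, None] - v[None, :]) ** 2
    K = np.exp(-sq_dist / (2.0 * sigma ** 2))
    n = v.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n             # centering matrix
    Kc = H @ K @ H
    return Kc / (np.linalg.norm(Kc) + 1e-12)        # Frobenius normalization


def hsic_lasso(X, y, lam=0.05, n_iter=200):
    """Sketch of: min_{alpha >= 0} 0.5 * ||L - sum_k alpha_k K_k||_F^2 + lam * sum_k alpha_k."""
    n, d = X.shape
    # Each column of A is one feature's vectorized centered Gram matrix.
    A = np.stack([centered_gram(X[:, k]).ravel() for k in range(d)], axis=1)
    b = centered_gram(y).ravel()
    G = A.T @ A   # feature-feature dependence (penalizes redundant features)
    c = A.T @ b   # feature-output dependence (rewards relevant features)
    alpha = np.zeros(d)
    for _ in range(n_iter):
        for k in range(d):
            if G[k, k] <= 0.0:                      # constant feature: nothing to fit
                continue
            # Correlation with the residual, leaving feature k out.
            r = c[k] - G[k] @ alpha + G[k, k] * alpha[k]
            alpha[k] = max(0.0, r - lam) / G[k, k]
    return alpha


# Toy data (hypothetical): the output depends nonlinearly on feature 0 only.
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(80, 5))
y = X[:, 0] ** 2 + 0.1 * rng.normal(size=80)
alpha = hsic_lasso(X, y, lam=0.05)
print(np.round(alpha, 3))
```

Because the objective is a least-squares term plus a linear penalty over a non-negative orthant, it is convex, which is the property the abstract highlights. Features with nonzero weight are the selected ones; the vectorized Gram matrices need memory quadratic in the number of samples, which is one reason this naive form suits small-sample data.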

About the speaker:

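Makoto Yamada, Ph.D., is an associate professor at Kyoto University, Japan, and a unit leader at RIKEN AIP. He received an M.S. degree in electrical engineering from Colorado State University, Fort Collins, USA, in 2005 and a Ph.D. in statistical science from the Graduate University for Advanced Studies, Japan, in 2010. He has worked as a postdoctoral researcher at Tokyo Institute of Technology, a researcher at NTT Communication Science Laboratories, and a research scientist at Yahoo Labs. His research interests include machine learning, natural language processing, signal processing, and computer vision. In recent years he has published more than 30 research papers in top conferences and journals, received the WSDM 2016 Best Paper Award, and published a paper in Cell in 2018.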