Machine Learning for Sciences — Nonlinear Feature Selection for High-Dimensional Data

Published: 2018-10-12

Talk title: Machine Learning for Sciences — Nonlinear Feature Selection for High-Dimensional Data

Speaker: Makoto Yamada





Feature selection is an important machine learning problem. However, few methods can select features from large and ultra-high-dimensional data (more than a million features) in a nonlinear way. In this talk, we first introduce the Hilbert-Schmidt Independence Criterion Lasso (HSIC Lasso), which can efficiently select non-redundant features from small, high-dimensional data in a nonlinear way. A key advantage of HSIC Lasso is that it is a convex method and can therefore find a globally optimal solution. We then extend the proposed method to handle ultra-high-dimensional data by incorporating a distributed computing framework. Moreover, we introduce two newly proposed algorithms, the Localized Lasso and hsicInf: the Localized Lasso is useful for selecting a set of features from each sub-cluster, and hsicInf can obtain p-values of the selected features from any type of data.
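To make the idea above concrete, here is a minimal sketch of HSIC-Lasso-style feature selection: each feature and the output are turned into centered (and here, Frobenius-normalized) Gaussian-kernel Gram matrices, and a non-negative lasso is fit over the feature kernels to reconstruct the output kernel. This is an illustrative toy, not the talk's implementation — the kernel choice, the normalization, and the projected-gradient solver with default parameters (`lam`, `n_iter`, `lr`) are all assumptions made for this sketch; the actual method uses a dedicated optimizer.

```python
import numpy as np

def gaussian_gram(v, sigma):
    # Gram matrix of a 1-D variable under a Gaussian (RBF) kernel.
    d = v[:, None] - v[None, :]
    return np.exp(-d ** 2 / (2.0 * sigma ** 2))

def centered_unit(K):
    # Double-center the Gram matrix and scale it to unit Frobenius norm.
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    Kc = H @ K @ H
    return Kc / (np.linalg.norm(Kc) + 1e-12)

def hsic_lasso(X, y, lam=0.01, n_iter=500, lr=None):
    """Toy HSIC Lasso: minimize 0.5 * ||L - sum_k a_k K_k||_F^2 + lam * sum_k a_k
    over a_k >= 0, via projected gradient descent (illustrative solver choice)."""
    n, d = X.shape
    if lr is None:
        lr = 1.0 / d  # safe step size: unit-norm kernels bound the curvature by d
    L = centered_unit(gaussian_gram(y, np.std(y) + 1e-12))
    Ks = [centered_unit(gaussian_gram(X[:, k], np.std(X[:, k]) + 1e-12))
          for k in range(d)]
    alpha = np.zeros(d)
    for _ in range(n_iter):
        resid = L - sum(a * K for a, K in zip(alpha, Ks))
        grad = np.array([-(K * resid).sum() for K in Ks])
        alpha = np.maximum(0.0, alpha - lr * (grad + lam))  # project onto a >= 0
    return alpha

# Toy demo (illustrative data): y depends nonlinearly on feature 0 only,
# so its weight should dominate while the noise features shrink toward zero.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(60, 5))
y = np.sin(3 * X[:, 0]) + 0.05 * rng.normal(size=60)
weights = hsic_lasso(X, y)
```

Because the objective is convex in the non-negative weights, any solver reaching a stationary point recovers the same global solution, which is the property highlighted in the abstract.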


Makoto Yamada, Ph.D., is currently an Associate Professor at Kyoto University and a unit leader at RIKEN AIP. He received an M.S. in electrical engineering from Colorado State University, Fort Collins, in 2005, and a Ph.D. in statistical science from the Graduate University for Advanced Studies (SOKENDAI), Japan, in 2010. He has previously worked as a postdoctoral researcher at Tokyo Institute of Technology, a researcher at NTT Communication Science Laboratories, and a research scientist at Yahoo Labs. His research interests include machine learning, natural language processing, signal processing, and computer vision. In recent years he has published more than 30 papers at top conferences and in leading journals, won the WSDM 2016 Best Paper Award, and published a paper in Cell in 2018.





Copyright 2018 Jilin University · 吉ICP备06002985号-1