School of Artificial Intelligence 2021 Academic Event Series: Academic Talk by Dr. Shao Jingyu (邵靖宇), Australian National University

Published: 2021-07-23

Title: Entity Resolution with Active Learning

Talk title: Entity Resolution and Active Learning

Abstract: Entity Resolution (ER) refers to the process of identifying records which represent the same real-world entity from one or more datasets. However, traditional methods for ER suffer from several challenges, such as imbalanced classes, limited labelling budgets and model overfitting. In this talk, I will first introduce a novel blocking scheme learning approach based on active learning techniques. Two strategies called active sampling and active branching are proposed to select samples and generate blocking schemes efficiently. Then, I will propose a skyblocking method, aiming to learn blocking scheme skylines with respect to different blocking criteria using three novel algorithms. Based on these blocking techniques, I will further develop a general active learning framework for classification, called Learning-To-Sample (LTS). This LTS framework has two key components: a sampling model and a boosting model, which learn from each other over successive iterations to improve each other's performance. Finally, to address the overfitting problem, I will propose a semi-supervised generative adversarial network, namely ErGAN. This model contains a label generator and a discriminator, which are optimized alternately through adversarial learning.

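Talk abstract: Entity resolution (ER) is the process of identifying, from one or more datasets, the records that represent the same real-world entity. However, traditional ER methods face several challenges, such as imbalanced classes, limited labelling budgets and model overfitting. In this talk, I will first introduce a new approach to learning blocking schemes based on active learning techniques. Two strategies, active sampling and active branching, are proposed to select samples and generate blocking schemes efficiently. Next, I will present a skyblocking method, which uses three new algorithms to learn blocking scheme skylines under different blocking criteria. Building on these blocking techniques, I will further develop a general active-learning-based classification framework called Learning-To-Sample (LTS). The LTS framework has two key components, a sampling model and a boosting model, which learn from each other across iterations and improve each other's performance. Finally, to address the potential overfitting problem, I will propose a generative adversarial network based on semi-supervised learning, namely ErGAN. The model contains a label generator and a discriminator, which are optimized alternately through adversarial learning.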
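For readers unfamiliar with blocking, the following minimal Python sketch illustrates the general idea the abstract builds on: records are grouped by a blocking key, and only records within the same block are compared. It is not the speaker's method (it involves no active learning, skyline computation, or learned scheme); the sample records, the `candidate_pairs` helper, and the `surname_key` function are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(records, blocking_key):
    """Group records by a blocking key and return the pairs that share a block."""
    blocks = defaultdict(list)
    for rec_id, rec in records.items():
        blocks[blocking_key(rec)].append(rec_id)

    pairs = set()
    for ids in blocks.values():
        # Only records inside the same block are ever compared.
        pairs.update(combinations(sorted(ids), 2))
    return pairs


# Hypothetical toy records, for illustration only.
records = {
    1: {"name": "Jon Smith", "city": "Canberra"},
    2: {"name": "John Smith", "city": "Canberra"},
    3: {"name": "Jane Doe", "city": "Sydney"},
    4: {"name": "J. Doe", "city": "Sydney"},
}


def surname_key(rec):
    """A hand-written blocking key (assumption): the lowercased last name token."""
    return rec["name"].split()[-1].lower()


pairs = candidate_pairs(records, surname_key)
all_pairs = set(combinations(sorted(records), 2))
print(f"{len(pairs)} candidate pairs instead of {len(all_pairs)}")
```

Here the key is written by hand; the talk concerns learning such schemes from a limited number of labels, which this sketch does not attempt.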

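About the speaker: Shao Jingyu (邵靖宇), male, holds a PhD in computer science from the Australian National University. He received his bachelor's degree from Beihang University and his master's degree from the University of Technology Sydney. His main research interests are entity resolution, data mining, active learning and machine learning.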

Time: Monday, July 26, 2021, 9:00-10:00 a.m.

Platform: Tencent Meeting

Meeting ID: 688 511 698

Meeting password: 202107


