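Title: Positive and Unlabeled Data: Model, Estimation, Inference, and Classification
Speaker: Qinglong Tian (田庆隆), Assistant Professor, Department of Statistics and Actuarial Science, University of Waterloo, Canada
Time: October 25, 2024, 15:00
Venue: Conference Room 401, Wenbo Building (文波楼)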
Abstract: This study introduces a new approach to positive and unlabeled (PU) data through the double exponential tilting model (DETM). Traditional methods often fall short because they apply only to selected completely at random (SCAR) PU data, where the labeled positive and unlabeled positive data are assumed to come from the same distribution. In contrast, DETM's dual structure accommodates the more complex and underexplored selected at random (SAR) PU data, where the labeled and unlabeled positive data can come from different distributions. We rigorously establish the theoretical foundations of DETM, including identifiability, parameter estimation, and asymptotic properties. We then turn to statistical inference, developing a goodness-of-fit test for the SCAR condition and constructing confidence intervals for the proportion of positive instances in the target domain. For classification tasks, we use an approximated Bayes classifier and demonstrate DETM's robust predictive performance. Through theoretical insights and practical applications, this study highlights DETM as a comprehensive framework for addressing the challenges of PU data.
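Speaker bio: Qinglong Tian is an Assistant Professor in the Department of Statistics and Actuarial Science at the University of Waterloo, Canada. He received his bachelor's degree in statistics from Renmin University of China in 2016 and his Ph.D. in statistics from Iowa State University in 2021. His current research interests include transfer learning, domain adaptation, and out-of-distribution detection.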