diff --git a/papers/kdd/kdd2022.md b/papers/kdd/kdd2022.md
index 898fd5fe..35263e0b 100644
--- a/papers/kdd/kdd2022.md
+++ b/papers/kdd/kdd2022.md
@@ -12,7 +12,7 @@
|[MSDR: Multi-Step Dependency Relation Networks for Spatial Temporal Forecasting](https://doi.org/10.1145/3534678.3539397)|Dachuan Liu, Jin Wang, Shuo Shang, Peng Han||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=MSDR:+Multi-Step+Dependency+Relation+Networks+for+Spatial+Temporal+Forecasting)|7| |[Joint Knowledge Graph Completion and Question Answering](https://doi.org/10.1145/3534678.3539289)|Lihui Liu, Boxin Du, Jiejun Xu, Yinglong Xia, Hanghang Tong||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Joint+Knowledge+Graph+Completion+and+Question+Answering)|7| |[Graph Neural Networks for Multimodal Single-Cell Data Integration](https://doi.org/10.1145/3534678.3539213)|Hongzhi Wen, Jiayuan Ding, Wei Jin, Yiqi Wang, Yuying Xie, Jiliang Tang||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Graph+Neural+Networks+for+Multimodal+Single-Cell+Data+Integration)|7| -|[Multi-Behavior Hypergraph-Enhanced Transformer for Sequential Recommendation](https://doi.org/10.1145/3534678.3539342)|Yuhao Yang, Chao Huang, Lianghao Xia, Yuxuan Liang, Yanwei Yu, Chenliang Li|Wuhan University, Wuhan, China; National University of Singapore, Singapore, Singapore; University of Hong Kong, Hong Kong, China; Ocean University of China, Qingdao, China|Learning dynamic user preference has become an increasingly important component for many online platforms (e.g., video-sharing sites, e-commerce systems) to make sequential recommendations. Previous works have made many efforts to model item-item transitions over user interaction sequences, based on various architectures, e.g., recurrent neural networks and self-attention mechanism. Recently emerged graph neural networks also serve as useful backbone models to capture item dependencies in sequential recommendation scenarios. Despite their effectiveness, existing methods have far focused on item sequence representation with singular type of interactions, and thus are limited to capture dynamic heterogeneous relational structures between users and items (e.g., page view, add-to-favorite, purchase). To tackle this challenge, we design a Multi-Behavior Hypergraph-enhanced T ransformer framework (MBHT) to capture both short-term and long-term cross-type behavior dependencies. Specifically, a multi-scale Transformer is equipped with low-rank self-attention to jointly encode behavior-aware sequential patterns from fine-grained and coarse-grained levels. Additionally,we incorporate the global multi-behavior dependency into the hypergraph neural architecture to capture the hierarchical long-range item correlations in a customized manner. Experimental results demonstrate the superiority of our MBHT over various state-of- the-art recommendation solutions across different settings. Further ablation studies validate the effectiveness of our model design and benefits of the new MBHT framework. 
Our implementation code is released at: https://github.com/yuh-yang/MBHT-KDD22.|学习动态用户偏好已经成为许多在线平台(如视频分享网站、电子商务系统)提供顺序推荐的一个越来越重要的组成部分。以往的研究基于多种体系结构,如递归神经网络和自我注意机制,对用户交互序列上的项目-项目转换进行了大量的研究。最近出现的图形神经网络也可以作为有用的骨干模型,以捕获项目依赖的顺序推荐场景。尽管现有的方法很有效,但是现有的方法都集中在单一交互类型的项目序列表示上,因此仅限于捕获用户和项目之间的动态异构关系结构(例如,页面查看、添加到收藏夹、购买)。为了应对这一挑战,我们设计了一个多行为超图增强型 T 变换器框架(MBHT)来捕获短期和长期的跨类型行为依赖。具体而言,多尺度变压器配备低级自注意,以从细粒度和粗粒度级别联合编码行为感知的序列模式。此外,我们将全局多行为依赖引入到超图神经结构中,以自定义的方式获取层次化的远程项目相关性。实验结果表明,我们的 MBHT 优于不同设置的各种最先进的推荐解决方案。进一步的消融研究验证了我们的模型设计的有效性和新的 MBHT 框架的好处。我们的实施代码在以下 https://github.com/yuh-yang/mbht-kdd22发布:。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Multi-Behavior+Hypergraph-Enhanced+Transformer+for+Sequential+Recommendation)|6| +|[Multi-Behavior Hypergraph-Enhanced Transformer for Sequential Recommendation](https://doi.org/10.1145/3534678.3539342)|Yuhao Yang, Chao Huang, Lianghao Xia, Yuxuan Liang, Yanwei Yu, Chenliang Li|University of Hong Kong, Hong Kong, China; National University of Singapore, Singapore, Singapore; Ocean University of China, Qingdao, China; Wuhan University, Wuhan, China|Learning dynamic user preference has become an increasingly important component for many online platforms (e.g., video-sharing sites, e-commerce systems) to make sequential recommendations. Previous works have made many efforts to model item-item transitions over user interaction sequences, based on various architectures, e.g., recurrent neural networks and self-attention mechanisms. Recently emerged graph neural networks also serve as useful backbone models to capture item dependencies in sequential recommendation scenarios. Despite their effectiveness, existing methods have thus far focused on item sequence representation with a single type of interaction, and are thus limited in capturing dynamic heterogeneous relational structures between users and items (e.g., page view, add-to-favorite, purchase). To tackle this challenge, we design a Multi-Behavior Hypergraph-enhanced Transformer framework (MBHT) to capture both short-term and long-term cross-type behavior dependencies. Specifically, a multi-scale Transformer is equipped with low-rank self-attention to jointly encode behavior-aware sequential patterns from fine-grained and coarse-grained levels. Additionally, we incorporate the global multi-behavior dependency into the hypergraph neural architecture to capture the hierarchical long-range item correlations in a customized manner. Experimental results demonstrate the superiority of our MBHT over various state-of-the-art recommendation solutions across different settings. Further ablation studies validate the effectiveness of our model design and the benefits of the new MBHT framework. 
Our implementation code is released at: https://github.com/yuh-yang/MBHT-KDD22.|学习动态用户偏好已经成为许多在线平台(如视频分享网站、电子商务系统)提供顺序推荐的一个越来越重要的组成部分。以往的研究基于多种体系结构,如递归神经网络和自我注意机制,对用户交互序列上的项目-项目转换进行了大量的研究。最近出现的图形神经网络也可以作为有用的骨干模型,以捕获项目依赖的顺序推荐场景。尽管现有的方法很有效,但是现有的方法都集中在单一交互类型的项目序列表示上,因此仅限于捕获用户和项目之间的动态异构关系结构(例如,页面查看、添加到收藏夹、购买)。为了应对这一挑战,我们设计了一个多行为超图增强型 T 变换器框架(MBHT)来捕获短期和长期的跨类型行为依赖。具体而言,多尺度变压器配备低级自注意,以从细粒度和粗粒度级别联合编码行为感知的序列模式。此外,我们将全局多行为依赖引入到超图神经结构中,以自定义的方式获取层次化的远程项目相关性。实验结果表明,我们的 MBHT 优于不同设置的各种最先进的推荐解决方案。进一步的消融研究验证了我们的模型设计的有效性和新的 MBHT 框架的好处。我们的实施代码在以下 https://github.com/yuh-yang/mbht-kdd22发布:。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Multi-Behavior+Hypergraph-Enhanced+Transformer+for+Sequential+Recommendation)|6| |[CommerceMM: Large-Scale Commerce MultiModal Representation Learning with Omni Retrieval](https://doi.org/10.1145/3534678.3539151)|Licheng Yu, Jun Chen, Animesh Sinha, Mengjiao Wang, Yu Chen, Tamara L. Berg, Ning Zhang|Meta AI, Menlo Park, CA, USA|We introduce CommerceMM - a multimodal model capable of providing a diverse and granular understanding of commerce topics associated to the given piece of content (image, text, image+text), and having the capability to generalize to a wide range of tasks, including Multimodal Categorization, Image-Text Retrieval, Query-to-Product Retrieval, Image-to-Product Retrieval, etc. We follow the pre-training + fine-tuning training regime and present 5 effective pre-training tasks on image-text pairs. To embrace more common and diverse commerce data with text-to-multimodal, image-to-multimodal, and multimodal-to-multimodal mapping, we propose another 9 novel cross-modal and cross-pair retrieval tasks, called Omni-Retrieval pre-training. We also propose a novel approach of modality randomization to dynamically adjust our model under different efficiency constraints. The pre-training is conducted in an efficient manner with only two forward/backward updates for the combined 14 tasks. Extensive experiments and analysis show the effectiveness of each task. 
When combining all pre-training tasks, our model achieves state-of-the-art performance on 7 commerce-related downstream tasks after fine-tuning.|我们介绍 CommerceMM ——一个多模态模型,它能够提供对与给定内容(图像、文本、图像 + 文本)相关的商业主题的多样化和细粒度的理解,并且能够泛化到广泛的任务,包括多模态分类、图像-文本检索、查询到产品检索、图像到产品检索等。我们遵循预先训练 + 微调训练制度,提出了5个有效的图像-文本对预先训练任务。为了使用文本到多模式、图像到多模式以及多模式到多模式映射来接受更多常见和多样化的商业数据,我们提出了另外9个新的跨模式和交叉对检索任务,称为 Omni-Retrieval pre-training。提出了一种新的模态随机化方法,在不同的效率约束下动态调整模型。预先培训是在一个有效的方式进行,只有两个向前/向后更新的合并14个任务。大量的实验和分析表明了每个任务的有效性。当结合所有的预训练任务时,我们的模型在经过微调后在7个与商业相关的下游任务上达到了最先进的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=CommerceMM:+Large-Scale+Commerce+MultiModal+Representation+Learning+with+Omni+Retrieval)|6| |[Learning to Rotate: Quaternion Transformer for Complicated Periodical Time Series Forecasting](https://doi.org/10.1145/3534678.3539234)|Weiqi Chen, Wenwei Wang, Bingqing Peng, Qingsong Wen, Tian Zhou, Liang Sun||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+to+Rotate:+Quaternion+Transformer+for+Complicated+Periodical+Time+Series+Forecasting)|6| |[FederatedScope-GNN: Towards a Unified, Comprehensive and Efficient Package for Federated Graph Learning](https://doi.org/10.1145/3534678.3539112)|Zhen Wang, Weirui Kuang, Yuexiang Xie, Liuyi Yao, Yaliang Li, Bolin Ding, Jingren Zhou||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=FederatedScope-GNN:+Towards+a+Unified,+Comprehensive+and+Efficient+Package+for+Federated+Graph+Learning)|6| @@ -24,10 +24,10 @@ |[ERNIE-GeoL: A Geography-and-Language Pre-trained Model and its Applications in Baidu Maps](https://doi.org/10.1145/3534678.3539021)|Jizhou Huang, Haifeng Wang, Yibo Sun, Yunsheng Shi, Zhengjie Huang, An Zhuo, Shikun Feng||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ERNIE-GeoL:+A+Geography-and-Language+Pre-trained+Model+and+its+Applications+in+Baidu+Maps)|5| |[ChemicalX: A Deep Learning Library for Drug Pair Scoring](https://doi.org/10.1145/3534678.3539023)|Benedek Rozemberczki, Charles Tapley Hoyt, Anna Gogleva, Piotr Grabowski, Klas Karis, Andrej Lamov, Andriy Nikolov, Sebastian Nilsson, Michaël Ughetto, Yu Wang, Tyler Derr, Benjamin M. Gyori||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ChemicalX:+A+Deep+Learning+Library+for+Drug+Pair+Scoring)|5| |[DuARE: Automatic Road Extraction with Aerial Images and Trajectory Data at Baidu Maps](https://doi.org/10.1145/3534678.3539029)|Jianzhong Yang, Xiaoqing Ye, Bin Wu, Yanlei Gu, Ziyu Wang, Deguo Xia, Jizhou Huang||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=DuARE:+Automatic+Road+Extraction+with+Aerial+Images+and+Trajectory+Data+at+Baidu+Maps)|5| -|[TwHIN: Embedding the Twitter Heterogeneous Information Network for Personalized Recommendation](https://doi.org/10.1145/3534678.3539080)|Ahmed ElKishky, Thomas Markovich, Serim Park, Chetan Verma, Baekjin Kim, Ramy Eskander, Yury Malkov, Frank Portman, Sofía Samaniego, Ying Xiao, Aria Haghighi|Twitter Cortex, San Francisco, CA, USA; Twitter Cortex, Seattle, WA, USA; Twitter Cortex, Boston, MA, USA; Twitter Cortex, New York, NY, USA; Twitter, San Francisco, CA, USA|Social networks, such as Twitter, form a heterogeneous information network (HIN) where nodes represent domain entities (e.g., user, content, advertiser, etc.) and edges represent one of many entity interactions (e.g, a user re-sharing content or "following" another). 
Interactions from multiple relation types can encode valuable information about social network entities not fully captured by a single relation; for instance, a user's preference for accounts to follow may depend on both user-content engagement interactions and the other users they follow. In this work, we investigate knowledge-graph embeddings for entities in the Twitter HIN (TwHIN); we show that these pretrained representations yield significant offline and online improvement for a diverse range of downstream recommendation and classification tasks: personalized ads rankings, account follow-recommendation, offensive content detection, and search ranking. We discuss design choices and practical challenges of deploying industry-scale HIN embeddings, including compressing them to reduce end-to-end model latency and handling parameter drift across versions.|社交网络,如 Twitter,形成了一个异构的信息网络(HIN) ,其中节点代表领域实体(例如,用户,内容,广告商等) ,边缘代表许多实体交互之一(例如,用户重新分享内容或“关注”另一个)。来自多种关系类型的交互可以编码关于社交网络实体的有价值的信息,而这些信息并没有被单个关系完全捕获; 例如,用户对账户的偏好可能同时取决于用户内容参与交互和他们所关注的其他用户。在这项工作中,我们调查了知识图表嵌入实体在 Twitter HIN (TwHIN) ; 我们表明,这些预先训练的表示产生了显着的离线和在线改善的下游推荐和分类任务的范围: 个性化广告排名,帐户跟踪推荐,攻击性内容检测和搜索排名。我们讨论了部署行业规模的 HIN 嵌入的设计选择和实际挑战,包括压缩它们以减少端到端模型延迟和处理跨版本的参数漂移。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=TwHIN:+Embedding+the+Twitter+Heterogeneous+Information+Network+for+Personalized+Recommendation)|4| -|[Multi-Task Fusion via Reinforcement Learning for Long-Term User Satisfaction in Recommender Systems](https://doi.org/10.1145/3534678.3539040)|Qihua Zhang, Junning Liu, Yuzhuo Dai, Yiyan Qi, Yifan Yuan, Kunlun Zheng, Fan Huang, Xianfeng Tan|Tencent, Beijing, China; Tencent, Shenzhen, China|Recommender System (RS) is an important online application that affects billions of users every day. The mainstream RS ranking framework is composed of two parts: a Multi-Task Learning model (MTL) that predicts various user feedback, i.e., clicks, likes, sharings, and a Multi-Task Fusion model (MTF) that combines the multi-task outputs into one final ranking score with respect to user satisfaction. There has not been much research on the fusion model while it has great impact on the final recommendation as the last crucial process of the ranking. To optimize long-term user satisfaction rather than obtain instant returns greedily, we formulate MTF task as Markov Decision Process (MDP) within a recommendation session and propose a Batch Reinforcement Learning (RL) based Multi-Task Fusion framework (BatchRL-MTF) that includes a Batch RL framework and an online exploration. The former exploits Batch RL to learn an optimal recommendation policy from the fixed batch data offline for long-term user satisfaction, while the latter explores potential high-value actions online to break through the local optimal dilemma. With a comprehensive investigation on user behaviors, we model the user satisfaction reward with subtle heuristics from two aspects of user stickiness and user activeness. Finally, we conduct extensive experiments on a billion-sample level real-world dataset to show the effectiveness of our model. We propose a conservative offline policy estimator (Conservative-OPEstimator) to test our model offline. Furthermore, we take online experiments in a real recommendation environment to compare performance of different models. 
As one of few Batch RL researches applied in MTF task successfully, our model has also been deployed on a large-scale industrial short video platform, serving hundreds of millions of users.|推荐系统(RS)是一个重要的在线应用程序,每天影响数十亿用户。RS 的主流排名框架由两部分组成: 一个是多任务学习模型(Multi-Task Learning model,MTL) ,它预测用户的各种反馈,即点击、喜欢、分享; 另一个是多任务融合模型(Multi-Task Fusion model,MTF) ,它将多任务输出结合成一个用户满意度的最终排名得分。融合模型作为排名的最后一个关键过程,对最终推荐有着重要的影响。为了优化长期用户满意度,而不是贪婪地获得即时回报,我们在一个推荐会话中将 MTF 任务制定为马可夫决策过程(mDP) ,并提出了一个基于批处理强化学习(RL)的多任务融合框架(BatchRL-MTF) ,其中包括一个批处理强化学习框架和一个在线探索。前者利用批量 RL 从离线的固定批量数据中学习最优推荐策略以获得长期用户满意度,后者利用在线的潜在高价值行为来突破局部最优困境。通过对用户行为的全面调查,从用户粘性和用户主动性两个方面采用微妙的启发式方法建立了用户满意奖励模型。最后,我们在十亿个样本级别的真实世界数据集上进行了广泛的实验,以显示我们的模型的有效性。我们提出了一个保守的离线策略估计(保守-最优估计)来测试我们的模型离线。此外,我们在一个真实的推荐环境中进行在线实验,比较不同模型的性能。作为少数几个成功应用于 MTF 任务的批量 RL 研究之一,我们的模型也已经部署在一个大型工业短视频平台上,为数亿用户服务。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Multi-Task+Fusion+via+Reinforcement+Learning+for+Long-Term+User+Satisfaction+in+Recommender+Systems)|4| -|[Feature-aware Diversified Re-ranking with Disentangled Representations for Relevant Recommendation](https://doi.org/10.1145/3534678.3539130)|Zihan Lin, Hui Wang, Jingshu Mao, Wayne Xin Zhao, Cheng Wang, Peng Jiang, JiRong Wen|Renmin University of China, Beijing Key Laboratory of Big Data Management and Analysis Methods, & Beijing Academy of Artificial Intelligence, Beijing, China; Renmin University of China, Beijing, China; Kuaishou Inc., Beijing, China|Relevant recommendation is a special recommendation scenario which provides relevant items when users express interests on one target item (e.g., click, like and purchase). Besides considering the relevance between recommendations and trigger item, the recommendations should also be diversified to avoid information cocoons. However, existing diversified recommendation methods mainly focus on item-level diversity which is insufficient when the recommended items are all relevant to the target item. Moreover, redundant or noisy item features might affect the performance of simple feature-aware recommendation approaches. Faced with these issues, we propose a Feature Disentanglement Self-Balancing Re-ranking framework (FDSB) to capture feature- aware diversity. The framework consists of two major modules, namely disentangled attention encoder (DAE) and self-balanced multi-aspect ranker. In DAE, we use multi-head attention to learn disentangled aspects from rich item features. In the ranker, we develop an aspect-specific ranking mechanism that is able to adaptively balance the relevance and diversity for each aspect. In experiments, we conduct offline evaluation on the collected dataset and deploy FDSB on KuaiShou app for online ??/?? test on the function of relevant recommendation. The significant improvements on both recommendation quality and user experience verify the effectiveness of our approach.|相关推荐是一种特殊的推荐场景,当用户对一个目标项目表示兴趣时(例如,点击、喜欢和购买) ,它会提供相关的项目。除了考虑建议与触发项目之间的相关性之外,建议还应当多样化,以避免信息茧。然而,现有的多样化推荐方法主要侧重于项目层次的多样性,当推荐项目都与目标项目相关时,这种多样性是不够的。此外,冗余或嘈杂的项目特征可能会影响简单的特征感知推荐方法的性能。针对这些问题,我们提出了一种特征分离自平衡重排框架(FDSB)来捕获特征感知的多样性。该框架包括两个主要模块,即分离注意编码器(DAE)和自平衡多方面排序器。在 DAE 中,我们使用多头注意从丰富的项目特征中学习分离的方面。在排名中,我们开发了一个方面特定的排名机制,能够自适应地平衡每个方面的相关性和多样性。在实验中,我们对收集到的数据集进行离线评估,并在快手应用上部署 FDSB 以实现在线? ? 
?/??检验有关推荐的作用。在推荐质量和用户体验方面的重大改进验证了我们方法的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Feature-aware+Diversified+Re-ranking+with+Disentangled+Representations+for+Relevant+Recommendation)|4| -|[Counteracting User Attention Bias in Music Streaming Recommendation via Reward Modification](https://doi.org/10.1145/3534678.3539393)|Xiao Zhang, Sunhao Dai, Jun Xu, Zhenhua Dong, Quanyu Dai, JiRong Wen|Huawei Noah's Ark Lab, Shenzhen, China; Renmin University of China, Beijing, China|In streaming media applications, like music Apps, songs are recommended in a continuous way in users' daily life. The recommended songs are played automatically although users may not pay any attention to them, posing a challenge of user attention bias in training recommendation models, i.e., the training instances contain a large number of false-positive labels (users' feedback). Existing approaches either directly use the auto-feedbacks or heuristically delete the potential false-positive labels. Both of the approaches lead to biased results because the false-positive labels cause the shift of training data distribution, hurting the accuracy of the recommendation models. In this paper, we propose a learning-based counterfactual approach to adjusting the user auto-feedbacks and learning the recommendation models using Neural Dueling Bandit algorithm, called NDB. Specifically, NDB maintains two neural networks: a user attention network for computing the importance weights that are used for modifying the original rewards, and another random network trained with dueling bandit for conducting online recommendations based on the modified rewards. Theoretical analysis showed that the modified rewards are statistically unbiased, and the learned bandit policy enjoys a sub-linear regret bound. Experimental results demonstrated that NDB can significantly outperform the state-of-the-art baselines.|在流媒体应用程序中,比如音乐应用程序,歌曲被持续推荐到用户的日常生活中。虽然用户可能没有注意到这些歌曲,但推荐的歌曲会自动播放,这对训练推荐模型中的用户注意偏差提出了挑战,即训练实例中包含大量假阳性标签(用户反馈)。现有的方法要么直接使用自动反馈,要么启发性地删除潜在的假阳性标签。这两种方法都会导致结果偏差,因为假阳性标签会引起训练数据分布的变化,从而影响推荐模型的准确性。在本文中,我们提出了一种基于学习的反事实方法来调整用户自动反馈和学习推荐模型的神经决斗盗贼算法,称为 NDB。具体来说,新开发银行维护两个神经网络: 一个是用户注意力网络,用于计算用于修改原始奖励的重要性权重,另一个是与决斗强盗一起训练的随机网络,用于根据修改后的奖励进行在线推荐。理论分析表明,修正后的奖励具有统计上的无偏性,学会的土匪政策具有亚线性后悔界限。实验结果表明,新数据库的性能明显优于最先进的基线。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Counteracting+User+Attention+Bias+in+Music+Streaming+Recommendation+via+Reward+Modification)|4| +|[TwHIN: Embedding the Twitter Heterogeneous Information Network for Personalized Recommendation](https://doi.org/10.1145/3534678.3539080)|Ahmed ElKishky, Thomas Markovich, Serim Park, Chetan Verma, Baekjin Kim, Ramy Eskander, Yury Malkov, Frank Portman, Sofía Samaniego, Ying Xiao, Aria Haghighi|Twitter Cortex, San Francisco, CA, USA; Twitter, San Francisco, CA, USA; Twitter Cortex, Seattle, WA, USA; Twitter Cortex, Boston, MA, USA; Twitter Cortex, New York, NY, USA|Social networks, such as Twitter, form a heterogeneous information network (HIN) where nodes represent domain entities (e.g., user, content, advertiser, etc.) and edges represent one of many entity interactions (e.g., a user re-sharing content or "following" another). Interactions from multiple relation types can encode valuable information about social network entities not fully captured by a single relation; for instance, a user's preference for accounts to follow may depend on both user-content engagement interactions and the other users they follow. 
In this work, we investigate knowledge-graph embeddings for entities in the Twitter HIN (TwHIN); we show that these pretrained representations yield significant offline and online improvement for a diverse range of downstream recommendation and classification tasks: personalized ads rankings, account follow-recommendation, offensive content detection, and search ranking. We discuss design choices and practical challenges of deploying industry-scale HIN embeddings, including compressing them to reduce end-to-end model latency and handling parameter drift across versions.|社交网络,如 Twitter,形成了一个异构的信息网络(HIN) ,其中节点代表领域实体(例如,用户,内容,广告商等) ,边缘代表许多实体交互之一(例如,用户重新分享内容或“关注”另一个)。来自多种关系类型的交互可以编码关于社交网络实体的有价值的信息,而这些信息并没有被单个关系完全捕获; 例如,用户对账户的偏好可能同时取决于用户内容参与交互和他们所关注的其他用户。在这项工作中,我们调查了知识图表嵌入实体在 Twitter HIN (TwHIN) ; 我们表明,这些预先训练的表示产生了显着的离线和在线改善的下游推荐和分类任务的范围: 个性化广告排名,帐户跟踪推荐,攻击性内容检测和搜索排名。我们讨论了部署行业规模的 HIN 嵌入的设计选择和实际挑战,包括压缩它们以减少端到端模型延迟和处理跨版本的参数漂移。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=TwHIN:+Embedding+the+Twitter+Heterogeneous+Information+Network+for+Personalized+Recommendation)|4| +|[Multi-Task Fusion via Reinforcement Learning for Long-Term User Satisfaction in Recommender Systems](https://doi.org/10.1145/3534678.3539040)|Qihua Zhang, Junning Liu, Yuzhuo Dai, Yiyan Qi, Yifan Yuan, Kunlun Zheng, Fan Huang, Xianfeng Tan|Tencent, Shenzhen, China; Tencent, Beijing, China|Recommender System (RS) is an important online application that affects billions of users every day. The mainstream RS ranking framework is composed of two parts: a Multi-Task Learning model (MTL) that predicts various user feedback, i.e., clicks, likes, sharings, and a Multi-Task Fusion model (MTF) that combines the multi-task outputs into one final ranking score with respect to user satisfaction. There has not been much research on the fusion model while it has great impact on the final recommendation as the last crucial process of the ranking. To optimize long-term user satisfaction rather than obtain instant returns greedily, we formulate MTF task as Markov Decision Process (MDP) within a recommendation session and propose a Batch Reinforcement Learning (RL) based Multi-Task Fusion framework (BatchRL-MTF) that includes a Batch RL framework and an online exploration. The former exploits Batch RL to learn an optimal recommendation policy from the fixed batch data offline for long-term user satisfaction, while the latter explores potential high-value actions online to break through the local optimal dilemma. With a comprehensive investigation on user behaviors, we model the user satisfaction reward with subtle heuristics from two aspects of user stickiness and user activeness. Finally, we conduct extensive experiments on a billion-sample level real-world dataset to show the effectiveness of our model. We propose a conservative offline policy estimator (Conservative-OPEstimator) to test our model offline. Furthermore, we take online experiments in a real recommendation environment to compare performance of different models. 
As one of the few Batch RL studies successfully applied to the MTF task, our model has also been deployed on a large-scale industrial short video platform, serving hundreds of millions of users.|推荐系统(RS)是一个重要的在线应用程序,每天影响数十亿用户。RS 的主流排名框架由两部分组成: 一个是多任务学习模型(Multi-Task Learning model,MTL) ,它预测用户的各种反馈,即点击、喜欢、分享; 另一个是多任务融合模型(Multi-Task Fusion model,MTF) ,它将多任务输出结合成一个用户满意度的最终排名得分。融合模型作为排名的最后一个关键过程,对最终推荐有着重要的影响。为了优化长期用户满意度,而不是贪婪地获得即时回报,我们在一个推荐会话中将 MTF 任务制定为马可夫决策过程(mDP) ,并提出了一个基于批处理强化学习(RL)的多任务融合框架(BatchRL-MTF) ,其中包括一个批处理强化学习框架和一个在线探索。前者利用批量 RL 从离线的固定批量数据中学习最优推荐策略以获得长期用户满意度,后者利用在线的潜在高价值行为来突破局部最优困境。通过对用户行为的全面调查,从用户粘性和用户主动性两个方面采用微妙的启发式方法建立了用户满意奖励模型。最后,我们在十亿个样本级别的真实世界数据集上进行了广泛的实验,以显示我们的模型的有效性。我们提出了一个保守的离线策略估计(保守-最优估计)来测试我们的模型离线。此外,我们在一个真实的推荐环境中进行在线实验,比较不同模型的性能。作为少数几个成功应用于 MTF 任务的批量 RL 研究之一,我们的模型也已经部署在一个大型工业短视频平台上,为数亿用户服务。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Multi-Task+Fusion+via+Reinforcement+Learning+for+Long-Term+User+Satisfaction+in+Recommender+Systems)|4| +|[Feature-aware Diversified Re-ranking with Disentangled Representations for Relevant Recommendation](https://doi.org/10.1145/3534678.3539130)|Zihan Lin, Hui Wang, Jingshu Mao, Wayne Xin Zhao, Cheng Wang, Peng Jiang, JiRong Wen|Renmin University of China, Beijing, China; Renmin University of China, Beijing Key Laboratory of Big Data Management and Analysis Methods, & Beijing Academy of Artificial Intelligence, Beijing, China; Kuaishou Inc., Beijing, China|Relevant recommendation is a special recommendation scenario which provides relevant items when users express interests on one target item (e.g., click, like and purchase). Besides considering the relevance between recommendations and trigger item, the recommendations should also be diversified to avoid information cocoons. However, existing diversified recommendation methods mainly focus on item-level diversity which is insufficient when the recommended items are all relevant to the target item. Moreover, redundant or noisy item features might affect the performance of simple feature-aware recommendation approaches. Faced with these issues, we propose a Feature Disentanglement Self-Balancing Re-ranking framework (FDSB) to capture feature-aware diversity. The framework consists of two major modules, namely disentangled attention encoder (DAE) and self-balanced multi-aspect ranker. In DAE, we use multi-head attention to learn disentangled aspects from rich item features. In the ranker, we develop an aspect-specific ranking mechanism that is able to adaptively balance the relevance and diversity for each aspect. In experiments, we conduct offline evaluation on the collected dataset and deploy FDSB on KuaiShou app for online A/B test on the function of relevant recommendation. The significant improvements on both recommendation quality and user experience verify the effectiveness of our approach.|相关推荐是一种特殊的推荐场景,当用户对一个目标项目表示兴趣时(例如,点击、喜欢和购买) ,它会提供相关的项目。除了考虑建议与触发项目之间的相关性之外,建议还应当多样化,以避免信息茧。然而,现有的多样化推荐方法主要侧重于项目层次的多样性,当推荐项目都与目标项目相关时,这种多样性是不够的。此外,冗余或嘈杂的项目特征可能会影响简单的特征感知推荐方法的性能。针对这些问题,我们提出了一种特征分离自平衡重排框架(FDSB)来捕获特征感知的多样性。该框架包括两个主要模块,即分离注意编码器(DAE)和自平衡多方面排序器。在 DAE 中,我们使用多头注意从丰富的项目特征中学习分离的方面。在排名中,我们开发了一个方面特定的排名机制,能够自适应地平衡每个方面的相关性和多样性。在实验中,我们对收集到的数据集进行离线评估,并在快手应用上部署 FDSB 以实现在线? ? 
?/??检验有关推荐的作用。在推荐质量和用户体验方面的重大改进验证了我们方法的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Feature-aware+Diversified+Re-ranking+with+Disentangled+Representations+for+Relevant+Recommendation)|4| +|[Counteracting User Attention Bias in Music Streaming Recommendation via Reward Modification](https://doi.org/10.1145/3534678.3539393)|Xiao Zhang, Sunhao Dai, Jun Xu, Zhenhua Dong, Quanyu Dai, JiRong Wen|Renmin University of China, Beijing, China; Huawei Noah's Ark Lab, Shenzhen, China|In streaming media applications, like music Apps, songs are recommended in a continuous way in users' daily life. The recommended songs are played automatically although users may not pay any attention to them, posing a challenge of user attention bias in training recommendation models, i.e., the training instances contain a large number of false-positive labels (users' feedback). Existing approaches either directly use the auto-feedbacks or heuristically delete the potential false-positive labels. Both of the approaches lead to biased results because the false-positive labels cause the shift of training data distribution, hurting the accuracy of the recommendation models. In this paper, we propose a learning-based counterfactual approach to adjusting the user auto-feedbacks and learning the recommendation models using Neural Dueling Bandit algorithm, called NDB. Specifically, NDB maintains two neural networks: a user attention network for computing the importance weights that are used for modifying the original rewards, and another random network trained with dueling bandit for conducting online recommendations based on the modified rewards. Theoretical analysis showed that the modified rewards are statistically unbiased, and the learned bandit policy enjoys a sub-linear regret bound. 
Experimental results demonstrated that NDB can significantly outperform the state-of-the-art baselines.|在流媒体应用程序中,比如音乐应用程序,歌曲被持续推荐到用户的日常生活中。虽然用户可能没有注意到这些歌曲,但推荐的歌曲会自动播放,这对训练推荐模型中的用户注意偏差提出了挑战,即训练实例中包含大量假阳性标签(用户反馈)。现有的方法要么直接使用自动反馈,要么启发性地删除潜在的假阳性标签。这两种方法都会导致结果偏差,因为假阳性标签会引起训练数据分布的变化,从而影响推荐模型的准确性。在本文中,我们提出了一种基于学习的反事实方法来调整用户自动反馈和学习推荐模型的神经决斗盗贼算法,称为 NDB。具体来说,新开发银行维护两个神经网络: 一个是用户注意力网络,用于计算用于修改原始奖励的重要性权重,另一个是与决斗强盗一起训练的随机网络,用于根据修改后的奖励进行在线推荐。理论分析表明,修正后的奖励具有统计上的无偏性,学会的土匪政策具有亚线性后悔界限。实验结果表明,新数据库的性能明显优于最先进的基线。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Counteracting+User+Attention+Bias+in+Music+Streaming+Recommendation+via+Reward+Modification)|4| |[Knowledge-enhanced Black-box Attacks for Recommendations](https://doi.org/10.1145/3534678.3539359)|Jingfan Chen, Wenqi Fan, Guanghui Zhu, Xiangyu Zhao, Chunfeng Yuan, Qing Li, Yihua Huang||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Knowledge-enhanced+Black-box+Attacks+for+Recommendations)|4| |[Towards Universal Sequence Representation Learning for Recommender Systems](https://doi.org/10.1145/3534678.3539381)|Yupeng Hou, Shanlei Mu, Wayne Xin Zhao, Yaliang Li, Bolin Ding, JiRong Wen||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Towards+Universal+Sequence+Representation+Learning+for+Recommender+Systems)|4| |[On Structural Explanation of Bias in Graph Neural Networks](https://doi.org/10.1145/3534678.3539319)|Yushun Dong, Song Wang, Yu Wang, Tyler Derr, Jundong Li||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=On+Structural+Explanation+of+Bias+in+Graph+Neural+Networks)|4| @@ -49,7 +49,7 @@ |[Variational Flow Graphical Model](https://doi.org/10.1145/3534678.3539450)|Shaogang Ren, Belhal Karimi, Dingcheng Li, Ping Li||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Variational+Flow+Graphical+Model)|3| |[Interpretability, Then What? Editing Machine Learning Models to Reflect Human Knowledge and Values](https://doi.org/10.1145/3534678.3539074)|Zijie J. Wang, Alex Kale, Harsha Nori, Peter Stella, Mark E. Nunnally, Duen Horng Chau, Mihaela Vorvoreanu, Jennifer Wortman Vaughan, Rich Caruana||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Interpretability,+Then+What?+Editing+Machine+Learning+Models+to+Reflect+Human+Knowledge+and+Values)|3| |[GBPNet: Universal Geometric Representation Learning on Protein Structures](https://doi.org/10.1145/3534678.3539441)|Sarp Aykent, Tian Xia|Auburn University, Auburn, AL, USA|Representation learning of protein 3D structures is challenging and essential for applications, e.g., computational protein design or protein engineering. Recently, geometric deep learning has achieved great success in non-Euclidean domains. Although protein can be represented as a graph naturally, it remains under-explored mainly due to the significant challenges in modeling the complex representations and capturing the inherent correlation in the 3D structure modeling. Several challenges include: 1) It is challenging to extract and preserve multi-level rotation and translation equivariant information during learning. 2) Difficulty in developing appropriate tools to effectively leverage the input spatial representations to capture complex geometries across the spatial dimension. 3) Difficulty in incorporating various geometric features and preserving the inherent structural relations. In this work, we introduce geometric bottleneck perceptron, and a general SO(3)-equivariant message passing neural network built on top of it for protein structure representation learning. 
The proposed geometric bottleneck perceptron can be incorporated into diverse network architecture backbones to process geometric data in different domains. This research shed new light on geometric deep learning in 3D structure studies. Empirically, we demonstrate the strength of our proposed approach on three core downstream tasks, where our model achieves significant improvements and outperforms existing benchmarks. The implementation is available at https://github.com/sarpaykent/GBPNet.|蛋白质三维结构的表示学习是具有挑战性和必要的应用,例如,计算蛋白质设计或蛋白质工程。近年来,几何深度学习在非欧几里德领域取得了巨大的成功。虽然蛋白质可以自然地表示为一个图形,但是它仍然没有得到充分的开发,主要是由于在建模复杂的表示和捕获三维结构建模中的内在关联方面的重大挑战。这些挑战包括: 1)在学习过程中提取和保存多层次旋转和翻译等变信息是一个挑战。2)难以开发合适的工具来有效地利用输入空间表示来捕获跨空间维度的复杂几何图形。3)难以结合各种几何特征和保持固有的结构关系。本文介绍了几何瓶颈感知器,并在此基础上构建了一个通用的 SO (3)等变信息传递神经网络,用于蛋白质结构表示学习。提出的几何瓶颈感知器可以整合到不同的网络结构骨架中,用于处理不同领域的几何数据。本研究为三维结构研究中的几何深度学习提供了新的思路。实际上,我们在三个核心的下游任务中展示了我们提议的方法的优势,在这些任务中,我们的模型实现了显著的改进,并优于现有的基准测试。有关实施方案可于 https://github.com/sarpaykent/gbpnet 索取。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=GBPNet:+Universal+Geometric+Representation+Learning+on+Protein+Structures)|3| -|[Motif Prediction with Graph Neural Networks](https://doi.org/10.1145/3534678.3539343)|Maciej Besta, Raphael Grob, Cesare Miglioli, Nicola Bernold, Grzegorz Kwasniewski, Gabriel Gjini, Raghavendra Kanakagiri, Saleh Ashkboos, Lukas Gianinazzi, Nikoli Dryden, Torsten Hoefler|University of Geneva, Geneva, Switzerland; ETH Zürich, Zurich, Switzerland; University of Illinois at Urbana-Champaign, Urbana-Champaign, IL, USA; ETH Zurich, Zurich, Switzerland|Link prediction is one of the central problems in graph mining. However, recent studies highlight the importance of higher-order network analysis, where complex structures called motifs are the first-class citizens. We first show that existing link prediction schemes fail to effectively predict motifs. To alleviate this, we establish a general motif prediction problem and we propose several heuristics that assess the chances for a specified motif to appear. To make the scores realistic, our heuristics consider - among others - correlations between links, i.e., the potential impact of some arriving links on the appearance of other links in a given motif. Finally, for highest accuracy, we develop a graph neural network (GNN) architecture for motif prediction. Our architecture offers vertex features and sampling schemes that capture the rich structural properties of motifs. While our heuristics are fast and do not need any training, GNNs ensure highest accuracy of predicting motifs, both for dense (e.g., k-cliques) and for sparse ones (e.g., k-stars). We consistently outperform the best available competitor by more than 10% on average and up to 32% in area under the curve. Importantly, the advantages of our approach over schemes based on uncorrelated link prediction increase with the increasing motif size and complexity. 
We also successfully apply our architecture for predicting more arbitrary clusters and communities, illustrating its potential for graph mining beyond motif analysis.|链接预测是图挖掘的核心问题之一。然而,最近的研究强调了高阶网络分析的重要性,在这种网络分析中,被称为图案的复杂结构是一等公民。我们首先证明了现有的链路预测方案不能有效地预测图案。为了解决这个问题,我们建立了一个通用的主题预测问题,并提出了几种启发式算法来评估特定主题出现的可能性。为了使得分数更加真实,我们的启发式方法考虑了链接之间的相关性,也就是说,一些到达的链接对给定主题中其他链接的外观的潜在影响。最后,为了获得最高的精度,我们开发了一个图形神经网络(GNN)结构用于模体预测。我们的体系结构提供了顶点特征和抽样方案,这些特征和抽样方案捕获了图案丰富的结构属性。虽然我们的启发式算法是快速的,不需要任何训练,GNN 确保预测图案的最高准确性,无论是对于密集的(例如,k- 团)和稀疏的(例如,k- 星)。我们始终超越最好的竞争对手超过10% 的平均水平和高达32% 的面积下的曲线。重要的是,与基于不相关链路预测的方案相比,我们的方法的优势随着基序大小和复杂度的增加而增加。我们还成功地应用了我们的体系结构来预测更多的任意集群和社区,说明了它在图形挖掘方面的潜力超越了主题分析。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Motif+Prediction+with+Graph+Neural+Networks)|3| +|[Motif Prediction with Graph Neural Networks](https://doi.org/10.1145/3534678.3539343)|Maciej Besta, Raphael Grob, Cesare Miglioli, Nicola Bernold, Grzegorz Kwasniewski, Gabriel Gjini, Raghavendra Kanakagiri, Saleh Ashkboos, Lukas Gianinazzi, Nikoli Dryden, Torsten Hoefler|ETH Zürich, Zurich, Switzerland; ETH Zurich, Zurich, Switzerland; University of Illinois at Urbana-Champaign, Urbana-Champaign, IL, USA; University of Geneva, Geneva, Switzerland|Link prediction is one of the central problems in graph mining. However, recent studies highlight the importance of higher-order network analysis, where complex structures called motifs are the first-class citizens. We first show that existing link prediction schemes fail to effectively predict motifs. To alleviate this, we establish a general motif prediction problem and we propose several heuristics that assess the chances for a specified motif to appear. To make the scores realistic, our heuristics consider - among others - correlations between links, i.e., the potential impact of some arriving links on the appearance of other links in a given motif. Finally, for highest accuracy, we develop a graph neural network (GNN) architecture for motif prediction. Our architecture offers vertex features and sampling schemes that capture the rich structural properties of motifs. While our heuristics are fast and do not need any training, GNNs ensure highest accuracy of predicting motifs, both for dense (e.g., k-cliques) and for sparse ones (e.g., k-stars). We consistently outperform the best available competitor by more than 10% on average and up to 32% in area under the curve. Importantly, the advantages of our approach over schemes based on uncorrelated link prediction increase with the increasing motif size and complexity. We also successfully apply our architecture for predicting more arbitrary clusters and communities, illustrating its potential for graph mining beyond motif analysis.|链接预测是图挖掘的核心问题之一。然而,最近的研究强调了高阶网络分析的重要性,在这种网络分析中,被称为图案的复杂结构是一等公民。我们首先证明了现有的链路预测方案不能有效地预测图案。为了解决这个问题,我们建立了一个通用的主题预测问题,并提出了几种启发式算法来评估特定主题出现的可能性。为了使得分数更加真实,我们的启发式方法考虑了链接之间的相关性,也就是说,一些到达的链接对给定主题中其他链接的外观的潜在影响。最后,为了获得最高的精度,我们开发了一个图形神经网络(GNN)结构用于模体预测。我们的体系结构提供了顶点特征和抽样方案,这些特征和抽样方案捕获了图案丰富的结构属性。虽然我们的启发式算法是快速的,不需要任何训练,GNN 确保预测图案的最高准确性,无论是对于密集的(例如,k- 团)和稀疏的(例如,k- 星)。我们始终超越最好的竞争对手超过10% 的平均水平和高达32% 的面积下的曲线。重要的是,与基于不相关链路预测的方案相比,我们的方法的优势随着基序大小和复杂度的增加而增加。我们还成功地应用了我们的体系结构来预测更多的任意集群和社区,说明了它在图形挖掘方面的潜力超越了主题分析。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Motif+Prediction+with+Graph+Neural+Networks)|3| |[Efficient Orthogonal Multi-view Subspace Clustering](https://doi.org/10.1145/3534678.3539282)|Mansheng Chen, ChangDong Wang, Dong Huang, JianHuang Lai, Philip S. 
Yu||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Efficient+Orthogonal+Multi-view+Subspace+Clustering)|3| |[Local Evaluation of Time Series Anomaly Detection Algorithms](https://doi.org/10.1145/3534678.3539339)|Alexis Huet, José Manuel Navarro, Dario Rossi||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Local+Evaluation+of+Time+Series+Anomaly+Detection+Algorithms)|3| |[Feature Overcorrelation in Deep Graph Neural Networks: A New Perspective](https://doi.org/10.1145/3534678.3539445)|Wei Jin, Xiaorui Liu, Yao Ma, Charu C. Aggarwal, Jiliang Tang||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Feature+Overcorrelation+in+Deep+Graph+Neural+Networks:+A+New+Perspective)|3| @@ -68,10 +68,10 @@ |[Graph Attention Multi-Layer Perceptron](https://doi.org/10.1145/3534678.3539121)|Wentao Zhang, Ziqi Yin, Zeang Sheng, Yang Li, Wen Ouyang, Xiaosen Li, Yangyu Tao, Zhi Yang, Bin Cui||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Graph+Attention+Multi-Layer+Perceptron)|3| |[ItemSage: Learning Product Embeddings for Shopping Recommendations at Pinterest](https://doi.org/10.1145/3534678.3539170)|Paul Baltescu, Haoyu Chen, Nikil Pancha, Andrew Zhai, Jure Leskovec, Charles Rosenberg|Pinterest, San Francisco, CA, USA|Learned embeddings for products are an important building block for web-scale e-commerce recommendation systems. At Pinterest, we build a single set of product embeddings called ItemSage to provide relevant recommendations in all shopping use cases including user, image and search based recommendations. This approach has led to significant improvements in engagement and conversion metrics, while reducing both infrastructure and maintenance cost. While most prior work focuses on building product embeddings from features coming from a single modality, we introduce a transformer-based architecture capable of aggregating information from both text and image modalities and show that it significantly outperforms single modality baselines. We also utilize multi-task learning to make ItemSage optimized for several engagement types, leading to a candidate generation system that is efficient for all of the engagement objectives of the end-to-end recommendation system. Extensive offline experiments are conducted to illustrate the effectiveness of our approach and results from online A/B experiments show substantial gains in key business metrics (up to +7% gross merchandise value/user and +11% click volume).|产品学习嵌入是网络电子商务推荐系统的重要组成部分。在 Pinterest,我们构建了一套名为 ItemSage 的产品嵌入,在所有购物用例中提供相关推荐,包括用户、图片和基于搜索的推荐。这种方法显著改善了参与度和转换度量,同时降低了基础设施和维护成本。虽然大多数先前的工作集中于构建来自单一模式的特征的产品嵌入,但是我们引入了一个基于转换器的体系结构,该体系结构能够聚合来自文本和图像模式的信息,并表明它明显优于单一模式基线。我们还利用多任务学习来使 ItemSage 针对几种参与类型进行优化,从而产生一个对端到端推荐系统的所有参与目标都有效的候选人生成系统。为了说明我们的方法的有效性,我们进行了大量的离线实验,在线 A/B 实验的结果显示,在关键的商业指标方面取得了实质性的进展(高达7% 的商品总价值/用户和 + 11% 的点击量)。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ItemSage:+Learning+Product+Embeddings+for+Shopping+Recommendations+at+Pinterest)|2| |[Towards Unified Conversational Recommender Systems via Knowledge-Enhanced Prompt Learning](https://doi.org/10.1145/3534678.3539382)|Xiaolei Wang, Kun Zhou, JiRong Wen, Wayne Xin Zhao|Renmin University of China, Beijing, China|Conversational recommender systems (CRS) aim to proactively elicit user preference and recommend high-quality items through natural language conversations. Typically, a CRS consists of a recommendation module to predict preferred items for users and a conversation module to generate appropriate responses. 
To develop an effective CRS, it is essential to seamlessly integrate the two modules. Existing works either design semantic alignment strategies, or share knowledge resources and representations between the two modules. However, these approaches still rely on different architectures or techniques to develop the two modules, making it difficult for effective module integration. To address this problem, we propose a unified CRS model named UniCRS based on knowledge-enhanced prompt learning. Our approach unifies the recommendation and conversation subtasks into the prompt learning paradigm, and utilizes knowledge-enhanced prompts based on a fixed pre-trained language model (PLM) to fulfill both subtasks in a unified approach. In the prompt design, we include fused knowledge representations, task-specific soft tokens, and the dialogue context, which can provide sufficient contextual information to adapt the PLM for the CRS task. Besides, for the recommendation subtask, we also incorporate the generated response template as an important part of the prompt, to enhance the information interaction between the two subtasks. Extensive experiments on two public CRS datasets have demonstrated the effectiveness of our approach. Our code is publicly available at the link: https://github.com/RUCAIBox/UniCRS.|会话推荐系统(CRS)的目标是通过自然语言的对话主动地引导用户偏好,并推荐高质量的项目。通常,CRS 由一个推荐模块和一个会话模块组成,前者用于预测用户的首选项,后者用于生成适当的响应。为了开发一个有效的 CRS 系统,必须将这两个模块无缝地结合起来。现有的工作或者设计语义对齐策略,或者在两个模块之间共享知识资源和表示。然而,这些方法仍然依赖于不同的体系结构或技术来开发这两个模块,这使得有效的模块集成变得困难。针对这一问题,提出了一种基于知识增强的快速学习的统一 CRS 模型 UniCRS。该方法将推荐子任务和会话子任务统一到快速学习范式中,并利用基于固定预训练语言模型(PLM)的知识增强提示来统一实现推荐子任务和会话子任务。在快速设计中,我们包括融合知识表示、任务特定的软标记和对话上下文,它们可以提供足够的上下文信息来使 PLM 适应 CRS 任务。此外,对于推荐子任务,我们还将生成的响应模板作为提示的重要组成部分,以增强两个子任务之间的信息交互。在两个公共 CRS 数据集上的大量实验已经证明了我们方法的有效性。我们的代码可在以下 https://github.com/rucaibox/unicrs 公开获得:。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Towards+Unified+Conversational+Recommender+Systems+via+Knowledge-Enhanced+Prompt+Learning)|2| -|[Device-cloud Collaborative Recommendation via Meta Controller](https://doi.org/10.1145/3534678.3539181)|Jiangchao Yao, Feng Wang, Xichen Ding, Shaohu Chen, Bo Han, Jingren Zhou, Hongxia Yang|Hong Kong Baptist University, Hong Kong, China; CMIC, Shanghai Jiao Tong University, Shanghai, China; DAMO Academy, Alibaba Group, Hangzhou, China; Ant Group, Beijing, China|On-device machine learning enables the lightweight deployment of recommendation models in local clients, which reduces the burden of the cloud-based recommenders and simultaneously incorporates more real-time user features. Nevertheless, the cloud-based recommendation in the industry is still very important considering its powerful model capacity and the efficient candidate generation from the billion-scale item pool. Previous attempts to integrate the merits of both paradigms mainly resort to a sequential mechanism, which builds the on-device recommender on top of the cloud-based recommendation. However, such a design is inflexible when user interests dramatically change: the on-device model is stuck by the limited item cache while the cloud-based recommendation based on the large item pool do not respond without the new re-fresh feedback. To overcome this issue, we propose a meta controller to dynamically manage the collaboration between the on-device recommender and the cloud-based recommender, and introduce a novel efficient sample construction from the causal perspective to solve the dataset absence issue of meta controller. 
On the basis of the counterfactual samples and the extended training, extensive experiments in the industrial recommendation scenarios show the promise of meta controller in the device-cloud collaboration.|设备上的机器学习支持在本地客户机中轻量级部署推荐模型,这减轻了基于云的推荐模型的负担,同时包含了更多的实时用户特性。尽管如此,考虑到其强大的模型容量和从数十亿规模的项目库中有效地生成候选项,基于云的推荐在业界仍然非常重要。以前整合这两种模式优点的尝试主要依赖于一种顺序机制,这种机制在基于云的推荐之上构建设备上的推荐。然而,当用户的兴趣发生巨大变化时,这样的设计是不灵活的: 设备上的模型被有限的项目缓存卡住了,而基于大型项目池的基于云的推荐在没有新的更新反馈的情况下不会响应。针对这一问题,本文提出了一种元控制器来动态管理设备上的推荐器和基于云的推荐器之间的协作,并从因果关系的角度提出了一种新的有效的样本结构来解决元控制器数据集缺失的问题。在反事实样本和扩展训练的基础上,在工业推荐场景中的大量实验显示了元控制器在设备-云协作中的应用前景。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Device-cloud+Collaborative+Recommendation+via+Meta+Controller)|2| -|[MolSearch: Search-based Multi-objective Molecular Generation and Property Optimization](https://doi.org/10.1145/3534678.3542676)|Mengying Sun, Jing Xing, Han Meng, Huijun Wang, Bin Chen, Jiayu Zhou|Agios Pharmaceuticals, Cambridge, MA, USA; Michigan State University, Grand Rapids, MI, USA; Michigan State University, East Lansing, MI, USA|Leveraging computational methods to generate small molecules with desired properties has been an active research area in the drug discovery field. Towards real-world applications, however, efficient generation of molecules that satisfy multiple property requirements simultaneously remains a key challenge. In this paper, we tackle this challenge using a search-based approach and propose a simple yet effective framework called MolSearch for multi-objective molecular generation (optimization).We show that given proper design and sufficient domain information, search-based methods can achieve performance comparable or even better than deep learning methods while being computationally efficient. Such efficiency enables massive exploration of chemical space given constrained computational resources. In particular, MolSearch starts with existing molecules and uses a two-stage search strategy to gradually modify them into new ones, based on transformation rules derived systematically and exhaustively from large compound libraries. We evaluate MolSearch in multiple benchmark generation settings and demonstrate its effectiveness and efficiency.|利用计算方法生成具有期望特性的小分子已经成为药物发现领域的一个活跃的研究领域。然而,对于实际应用来说,同时满足多种性能要求的高效生产分子仍然是一个关键的挑战。在本文中,我们使用基于搜索的方法来解决这一挑战,并提出了一个简单而有效的框架,称为 MolSearch 的多目标分子生成(优化)。我们表明,给定适当的设计和充分的领域信息,基于搜索的方法可以实现性能可比,甚至比深度学习方法更好,同时计算效率。这样的效率使得在计算资源有限的情况下对化学空间进行大规模探索成为可能。特别是,MolSearch 从现有的分子开始,使用一个两阶段的搜索策略,逐渐修改成新的,基于转换规则,系统地和详尽地从大型化合物库。我们评估了 MolSearch 在多个基准测试生成环境中的性能,并证明了它的有效性和效率。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=MolSearch:+Search-based+Multi-objective+Molecular+Generation+and+Property+Optimization)|2| -|[Invariant Preference Learning for General Debiasing in Recommendation](https://doi.org/10.1145/3534678.3539439)|Zimu Wang, Yue He, Jiashuo Liu, Wenchao Zou, Philip S. Yu, Peng Cui|Siemens China, Shanghai, China; Tsinghua University, Beijing, China; University of Illinois at Chicago, Chicago, IL, USA|Current recommender systems have achieved great successes in online services, such as E-commerce and social media. However, they still suffer from the performance degradation in real scenarios, because various biases always occur in the generation process of user behaviors. Despite the recent development of addressing some specific type of bias, a variety of data bias, some of which are even unknown, are often mixed up in real applications. 
Although the uniform (or unbiased) data may help for the purpose of general debiasing, such data can either be hardly available or induce high experimental cost. In this paper, we consider a more practical setting where we aim to conduct general debiasing with the biased observational data alone. We assume that the observational user behaviors are determined by invariant preference (i.e. a user's true preference) and the variant preference (affected by some unobserved confounders). We propose a novel recommendation framework called InvPref which iteratively decomposes the invariant preference and variant preference from biased observational user behaviors by estimating heterogeneous environments corresponding to different types of latent bias. Extensive experiments, including the settings of general debiasing and specific debiasing, verify the advantages of our method.|现有的推荐系统在电子商务和社交媒体等在线服务领域取得了巨大的成功。然而,在实际场景中,它们仍然会受到性能下降的影响,因为在用户行为的生成过程中总是会出现各种偏差。尽管最近的发展解决一些特定类型的偏差,各种各样的数据偏差,其中一些甚至是未知的,往往是混合在实际应用。虽然统一(或无偏)的数据可能有助于一般的去偏目的,这样的数据可能难以获得或诱导高实验成本。在本文中,我们考虑一个更实际的设置,其中我们的目的是进行一般的去偏与有偏的观测数据单独。我们假设观察用户行为是由不变偏好(即用户的真实偏好)和变异偏好(受一些未观察到的混杂因素的影响)决定的。提出了一种新的推荐框架 InvPref,该框架通过估计不同类型潜在偏差对应的异质环境,迭代分解有偏差的观察用户行为的不变偏好和变异偏好。广泛的实验,包括一般消偏和具体消偏的设置,验证了我们的方法的优点。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Invariant+Preference+Learning+for+General+Debiasing+in+Recommendation)|2| -|[Automatic Controllable Product Copywriting for E-Commerce](https://doi.org/10.1145/3534678.3539171)|Xiaojie Guo, Qingkai Zeng, Meng Jiang, Yun Xiao, Bo Long, Lingfei Wu|JD.COM, Beijing, China; University of Notre Dame, Notre Dame, IN, USA; JD.COM Silicon Valley Research Center, Mountain View, CA, USA|Automatic product description generation for e-commerce has witnessed significant advancement in the past decade. Product copy- writing aims to attract users' interest and improve user experience by highlighting product characteristics with textual descriptions. As the services provided by e-commerce platforms become diverse, it is necessary to adapt the patterns of automatically-generated descriptions dynamically. In this paper, we report our experience in deploying an E-commerce Prefix-based Controllable Copywriting Generation (EPCCG) system into the JD.com e-commerce product recommendation platform. The development of the system contains two main components: 1) copywriting aspect extraction; 2) weakly supervised aspect labelling; 3) text generation with a prefix-based language model; and 4) copywriting quality control. We conduct experiments to validate the effectiveness of the proposed EPCCG. In addition, we introduce the deployed architecture which cooperates the EPCCG into the real-time JD.com e-commerce recommendation platform and the significant payoff since deployment. 
The codes for implementation are provided at https://github.com/xguo7/Automatic-Controllable-Product-Copywriting-for-E-Commerce.git.|电子商务中的产品描述自动生成技术在过去的十年中取得了长足的进步。产品文案的目的是通过文字描述突出产品特征,吸引用户的兴趣,提高用户体验。随着电子商务平台提供的服务变得多样化,有必要动态调整自动生成描述的模式。在本文中,我们报告了在京东电子商务产品推荐平台上部署基于前缀的可控文案生成(ECCG)系统的经验。该系统的开发包括两个主要组成部分: 1)文案方面提取; 2)弱监督方面标注; 3)基于前缀语言模型的文本生成; 4)文案质量控制。我们进行了实验,以验证所提出的心电图的有效性。此外,我们还将 EPCCG 协同的已部署体系结构引入到实时 JD.com 电子商务推荐平台中,并且从部署以来获得了显著的回报。实施守则载于 https://github.com/xguo7/automatic-controllable-product-copywriting-for-e-commerce.git。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Automatic+Controllable+Product+Copywriting+for+E-Commerce)|2| +|[Device-cloud Collaborative Recommendation via Meta Controller](https://doi.org/10.1145/3534678.3539181)|Jiangchao Yao, Feng Wang, Xichen Ding, Shaohu Chen, Bo Han, Jingren Zhou, Hongxia Yang|Ant Group, Beijing, China; Hong Kong Baptist University, Hong Kong, China; DAMO Academy, Alibaba Group, Hangzhou, China; CMIC, Shanghai Jiao Tong University, Shanghai, China|On-device machine learning enables the lightweight deployment of recommendation models in local clients, which reduces the burden of the cloud-based recommenders and simultaneously incorporates more real-time user features. Nevertheless, the cloud-based recommendation in the industry is still very important considering its powerful model capacity and the efficient candidate generation from the billion-scale item pool. Previous attempts to integrate the merits of both paradigms mainly resort to a sequential mechanism, which builds the on-device recommender on top of the cloud-based recommendation. However, such a design is inflexible when user interests dramatically change: the on-device model is stuck by the limited item cache while the cloud-based recommendation based on the large item pool does not respond without fresh feedback. To overcome this issue, we propose a meta controller to dynamically manage the collaboration between the on-device recommender and the cloud-based recommender, and introduce a novel efficient sample construction from the causal perspective to solve the dataset absence issue of the meta controller. On the basis of the counterfactual samples and the extended training, extensive experiments in the industrial recommendation scenarios show the promise of the meta controller in the device-cloud collaboration.|设备上的机器学习支持在本地客户机中轻量级部署推荐模型,这减轻了基于云的推荐模型的负担,同时包含了更多的实时用户特性。尽管如此,考虑到其强大的模型容量和从数十亿规模的项目库中有效地生成候选项,基于云的推荐在业界仍然非常重要。以前整合这两种模式优点的尝试主要依赖于一种顺序机制,这种机制在基于云的推荐之上构建设备上的推荐。然而,当用户的兴趣发生巨大变化时,这样的设计是不灵活的: 设备上的模型被有限的项目缓存卡住了,而基于大型项目池的基于云的推荐在没有新的更新反馈的情况下不会响应。针对这一问题,本文提出了一种元控制器来动态管理设备上的推荐器和基于云的推荐器之间的协作,并从因果关系的角度提出了一种新的有效的样本结构来解决元控制器数据集缺失的问题。在反事实样本和扩展训练的基础上,在工业推荐场景中的大量实验显示了元控制器在设备-云协作中的应用前景。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Device-cloud+Collaborative+Recommendation+via+Meta+Controller)|2| +|[MolSearch: Search-based Multi-objective Molecular Generation and Property Optimization](https://doi.org/10.1145/3534678.3542676)|Mengying Sun, Jing Xing, Han Meng, Huijun Wang, Bin Chen, Jiayu Zhou|Michigan State University, East Lansing, MI, USA; Michigan State University, Grand Rapids, MI, USA; Agios Pharmaceuticals, Cambridge, MA, USA|Leveraging computational methods to generate small molecules with desired properties has been an active research area in the drug discovery field. Towards real-world applications, however, efficient generation of molecules that satisfy multiple property requirements simultaneously remains a key challenge. 
In this paper, we tackle this challenge using a search-based approach and propose a simple yet effective framework called MolSearch for multi-objective molecular generation (optimization). We show that given proper design and sufficient domain information, search-based methods can achieve performance comparable to or even better than deep learning methods while being computationally efficient. Such efficiency enables massive exploration of chemical space given constrained computational resources. In particular, MolSearch starts with existing molecules and uses a two-stage search strategy to gradually modify them into new ones, based on transformation rules derived systematically and exhaustively from large compound libraries. We evaluate MolSearch in multiple benchmark generation settings and demonstrate its effectiveness and efficiency.|利用计算方法生成具有期望特性的小分子已经成为药物发现领域的一个活跃的研究领域。然而,对于实际应用来说,同时满足多种性能要求的高效生产分子仍然是一个关键的挑战。在本文中,我们使用基于搜索的方法来解决这一挑战,并提出了一个简单而有效的框架,称为 MolSearch 的多目标分子生成(优化)。我们表明,给定适当的设计和充分的领域信息,基于搜索的方法可以实现性能可比,甚至比深度学习方法更好,同时计算效率。这样的效率使得在计算资源有限的情况下对化学空间进行大规模探索成为可能。特别是,MolSearch 从现有的分子开始,使用一个两阶段的搜索策略,逐渐修改成新的,基于转换规则,系统地和详尽地从大型化合物库。我们评估了 MolSearch 在多个基准测试生成环境中的性能,并证明了它的有效性和效率。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=MolSearch:+Search-based+Multi-objective+Molecular+Generation+and+Property+Optimization)|2| +|[Invariant Preference Learning for General Debiasing in Recommendation](https://doi.org/10.1145/3534678.3539439)|Zimu Wang, Yue He, Jiashuo Liu, Wenchao Zou, Philip S. Yu, Peng Cui|Siemens China, Shanghai, China; University of Illinois at Chicago, Chicago, IL, USA; Tsinghua University, Beijing, China|Current recommender systems have achieved great successes in online services, such as E-commerce and social media. However, they still suffer from the performance degradation in real scenarios, because various biases always occur in the generation process of user behaviors. Despite the recent development of addressing some specific type of bias, a variety of data bias, some of which are even unknown, are often mixed up in real applications. Although the uniform (or unbiased) data may help for the purpose of general debiasing, such data can either be hardly available or induce high experimental cost. In this paper, we consider a more practical setting where we aim to conduct general debiasing with the biased observational data alone. We assume that the observational user behaviors are determined by invariant preference (i.e. a user's true preference) and the variant preference (affected by some unobserved confounders). We propose a novel recommendation framework called InvPref which iteratively decomposes the invariant preference and variant preference from biased observational user behaviors by estimating heterogeneous environments corresponding to different types of latent bias. 
Extensive experiments, including the settings of general debiasing and specific debiasing, verify the advantages of our method.|现有的推荐系统在电子商务和社交媒体等在线服务领域取得了巨大的成功。然而,在实际场景中,它们仍然会受到性能下降的影响,因为在用户行为的生成过程中总是会出现各种偏差。尽管最近在解决某些特定类型偏差方面有所进展,但各种各样的数据偏差(其中一些甚至是未知的)往往混杂在实际应用中。虽然均匀(或无偏)的数据可能有助于一般性去偏,但这类数据要么难以获得,要么会带来高昂的实验成本。在本文中,我们考虑一个更实际的设定:仅利用有偏的观测数据进行一般性去偏。我们假设观察用户行为是由不变偏好(即用户的真实偏好)和变异偏好(受一些未观察到的混杂因素的影响)决定的。提出了一种新的推荐框架 InvPref,该框架通过估计不同类型潜在偏差对应的异质环境,迭代分解有偏差的观察用户行为的不变偏好和变异偏好。包括一般去偏和特定去偏设置在内的大量实验验证了我们方法的优势。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Invariant+Preference+Learning+for+General+Debiasing+in+Recommendation)|2|
+|[Automatic Controllable Product Copywriting for E-Commerce](https://doi.org/10.1145/3534678.3539171)|Xiaojie Guo, Qingkai Zeng, Meng Jiang, Yun Xiao, Bo Long, Lingfei Wu|JD.COM Silicon Valley Research Center, Mountain View, CA, USA; JD.COM, Beijing, China; University of Notre Dame, Notre Dame, IN, USA|Automatic product description generation for e-commerce has witnessed significant advancement in the past decade. Product copywriting aims to attract users' interest and improve user experience by highlighting product characteristics with textual descriptions. As the services provided by e-commerce platforms become diverse, it is necessary to adapt the patterns of automatically-generated descriptions dynamically. In this paper, we report our experience in deploying an E-commerce Prefix-based Controllable Copywriting Generation (EPCCG) system into the JD.com e-commerce product recommendation platform. The development of the system contains four main components: 1) copywriting aspect extraction; 2) weakly supervised aspect labelling; 3) text generation with a prefix-based language model; and 4) copywriting quality control. We conduct experiments to validate the effectiveness of the proposed EPCCG. In addition, we introduce the deployed architecture which incorporates the EPCCG into the real-time JD.com e-commerce recommendation platform and the significant payoff achieved since deployment. 
The codes for implementation are provided at https://github.com/xguo7/Automatic-Controllable-Product-Copywriting-for-E-Commerce.git.|电子商务中的产品描述自动生成技术在过去的十年中取得了长足的进步。产品文案的目的是通过文字描述突出产品特征,吸引用户的兴趣,提高用户体验。随着电子商务平台提供的服务变得多样化,有必要动态调整自动生成描述的模式。在本文中,我们报告了在京东电子商务产品推荐平台上部署基于前缀的可控文案生成(EPCCG)系统的经验。该系统的开发包括四个主要组成部分: 1)文案方面提取; 2)弱监督方面标注; 3)基于前缀语言模型的文本生成; 4)文案质量控制。我们进行了实验,以验证所提出的 EPCCG 的有效性。此外,我们还介绍了将 EPCCG 集成到实时 JD.com 电子商务推荐平台中的已部署体系结构,以及部署以来获得的显著收益。实现代码见 https://github.com/xguo7/Automatic-Controllable-Product-Copywriting-for-E-Commerce.git。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Automatic+Controllable+Product+Copywriting+for+E-Commerce)|2|
|[Multi-task Hierarchical Classification for Disk Failure Prediction in Online Service Systems](https://doi.org/10.1145/3534678.3539176)|Yudong Liu, Hailan Yang, Pu Zhao, Minghua Ma, Chengwu Wen, Hongyu Zhang, Chuan Luo, Qingwei Lin, Chang Yi, Jiaojian Wang, Chenjian Zhang, Paul Wang, Yingnong Dang, Saravan Rajmohan, Dongmei Zhang||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Multi-task+Hierarchical+Classification+for+Disk+Failure+Prediction+in+Online+Service+Systems)|2|
|[EGM: Enhanced Graph-based Model for Large-scale Video Advertisement Search](https://doi.org/10.1145/3534678.3539061)|Tan Yu, Jie Liu, Yi Yang, Yi Li, Hongliang Fei, Ping Li||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=EGM:+Enhanced+Graph-based+Model+for+Large-scale+Video+Advertisement+Search)|2|
|[Saliency-Regularized Deep Multi-Task Learning](https://doi.org/10.1145/3534678.3539442)|Guangji Bai, Liang Zhao|Emory University, Atlanta, GA, USA|Multi-task learning (MTL) is a framework that enforces multiple learning tasks to share their knowledge to improve their generalization abilities. While shallow multi-task learning can learn task relations, it can only handle pre-defined features. Modern deep multi-task learning can jointly learn latent features and task sharing, but their task relations remain obscure. Also, they pre-define which layers and neurons should be shared across tasks and cannot learn adaptively. To address these challenges, this paper proposes a new multi-task learning framework that jointly learns latent features and explicit task relations by complementing the strength of existing shallow and deep multitask learning scenarios. Specifically, we propose to model the task relation as the similarity between tasks' input gradients, with a theoretical analysis of their equivalency. In addition, we innovatively propose a multi-task learning objective that explicitly learns task relations by a new regularizer. Theoretical analysis shows that the generalization error is reduced thanks to the proposed regularizer. 
Extensive experiments on several multi-task learning and image classification benchmarks demonstrate the proposed method's effectiveness and efficiency, as well as the reasonableness of the learned task relation patterns.|多任务学习(MTL)是一种强制多任务共享知识以提高其泛化能力的学习框架。浅层多任务学习虽然可以学习任务关系,但只能处理预定义的特征。现代深度多任务学习可以联合学习任务的潜在特征和任务共享,但在任务关系方面较为模糊。此外,他们预先定义了哪些层和神经元应该跨任务共享,而不能自适应地学习。针对这些挑战,本文提出了一种新的多任务学习框架,通过补充现有浅层和深层多任务学习场景的优势,联合学习潜在特征和显性任务关系。具体来说,我们提出将任务关系建模为任务输入梯度之间的相似性,并对其等效性进行了理论分析。此外,我们创新地提出了一个多任务学习目标,通过一个新的正则化器显式学习任务关系。理论分析表明,该正则化器可以减小泛化误差。通过对多个多任务学习和图像分类基准的大量实验,证明了该方法在学习任务关系模式方面的有效性、高效性和合理性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Saliency-Regularized+Deep+Multi-Task+Learning)|2|
@@ -112,14 +112,14 @@
|[Robust Time Series Analysis and Applications: An Industrial Perspective](https://doi.org/10.1145/3534678.3542612)|Qingsong Wen, Linxiao Yang, Tian Zhou, Liang Sun||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Robust+Time+Series+Analysis+and+Applications:+An+Industrial+Perspective)|2|
|[PECOS: Prediction for Enormous and Correlated Output Spaces](https://doi.org/10.1145/3534678.3542629)|Hsiang-Fu Yu, Jiong Zhang, Wei-Cheng Chang, Jyun-Yu Jiang, Wei Li, Cho-Jui Hsieh||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=PECOS:+Prediction+for+Enormous+and+Correlated+Output+Spaces)|2|
|[Extracting Relevant Information from User's Utterances in Conversational Search and Recommendation](https://doi.org/10.1145/3534678.3539471)|Ali Montazeralghaem, James Allan|University of Massachusetts Amherst, Amherst, MA, USA|Conversational search and recommendation systems can ask clarifying questions through the conversation and collect valuable information from users. However, an important question remains: how can we extract relevant information from the user's utterances and use it in the retrieval or recommendation in the next turn of the conversation? Utilizing relevant information from users' utterances leads the system to better results at the end of the conversation. In this paper, we propose a model based on reinforcement learning, namely RelInCo, which takes the user's utterances and the context of the conversation and classifies each word in the user's utterances as belonging to the relevant or non-relevant class. RelInCo uses two Actors: 1) Arrangement-Actor, which finds the most relevant order of words in the user's utterances, and 2) Selector-Actor, which determines which words, in the order provided by the Arrangement-Actor, can bring the system closer to the target of the conversation. In this way, we can find relevant information in the user's utterance and use it in the conversation. 
The objective function in our model is designed in such a way that it can maximize any desired retrieval and recommendation metrics (i.e., the ultimate|会话搜索和推荐系统可以通过会话提出澄清问题,并从用户那里收集有价值的信息。然而,一个重要的问题仍然存在: 我们如何从用户的话语中提取相关信息,并将其用于下一轮对话中的检索或推荐?利用用户话语中的相关信息,可以使系统在对话结束时获得更好的结果。在本文中,我们提出了一个基于强化学习的模型,即 RelInCo,该模型根据用户的话语和对话的上下文,将用户话语中的每个单词归类为相关或非相关类别。RelInCo 使用了两个参与者: 1)安排-参与者,它找到用户话语中最相关的词语顺序; 2)选择-参与者,它根据安排-参与者提供的顺序决定哪些词语可以使系统更接近对话的目标。通过这种方式,我们可以在用户的话语中找到相关信息,并在对话中加以利用。我们模型中的目标函数是这样设计的,它可以最大化任何所需的检索和推荐指标(即,最终的|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Extracting+Relevant+Information+from+User's+Utterances+in+Conversational+Search+and+Recommendation)|1|
-|[Uni-Retriever: Towards Learning the Unified Embedding Based Retriever in Bing Sponsored Search](https://doi.org/10.1145/3534678.3539212)|Jianjin Zhang, Zheng Liu, Weihao Han, Shitao Xiao, Ruicheng Zheng, Yingxia Shao, Hao Sun, Hanqing Zhu, Premkumar Srinivasan, Weiwei Deng, Qi Zhang, Xing Xie|Microsoft, Newark, NJ, USA; Microsoft, Seattle, DC, USA; Microsoft, Beijing, China; Beijing University of Posts and Telecommunications, Beijing, China|Embedding based retrieval (EBR) is a fundamental building block in many web applications. However, EBR in sponsored search is distinguished from other generic scenarios and technically challenging due to the need of serving multiple retrieval purposes: firstly, it has to retrieve high-relevance ads, which may exactly serve user's search intent; secondly, it needs to retrieve high-CTR ads so as to maximize the overall user clicks. In this paper, we present a novel representation learning framework Uni-Retriever developed for Bing Search, which unifies two different training modes, knowledge distillation and contrastive learning, to realize both required objectives. On one hand, the capability of making high-relevance retrieval is established by distilling knowledge from the "relevance teacher model''. On the other hand, the capability of making high-CTR retrieval is optimized by learning to discriminate user's clicked ads from the entire corpus. The two training modes are jointly performed as a multi-objective learning process, such that the ads of high relevance and CTR can be favored by the generated embeddings. Besides the learning strategy, we also elaborate our solution for the EBR serving pipeline built upon the substantially optimized DiskANN, where massive-scale EBR can be performed with competitive time and memory efficiency, and accomplished in high quality. We conduct comprehensive offline and online experiments to evaluate the proposed techniques, whose findings may provide useful insights for the future development of EBR systems. 
Uni-Retriever has been mainstreamed as the major retrieval path in Bing's production thanks to the notable improvements in representation and EBR serving quality.|基于嵌入的检索(EBR)是许多 Web 应用程序的基础构件。然而,由于需要服务于多种检索目的,赞助商搜索中的 EBR 不同于其他一般情况,在技术上具有挑战性: 首先,它必须检索高相关度的广告,这可能恰好服务于用户的搜索意图; 其次,它需要检索高点击率的广告,以最大限度地提高用户的总体点击率。本文提出了一种新的面向 Bing 搜索的 Uni-Retriever 表示学习框架,该框架将知识蒸馏和对比学习两种不同的训练模式相结合,以同时实现上述两个目标。一方面,通过从“相关性教师模型”中蒸馏知识,建立高相关性检索能力。另一方面,通过学习从整个语料库中区分用户点击广告,优化了高点击率检索的能力。这两种训练模式作为一个多目标学习过程联合执行,使生成的嵌入能够偏好高相关度和高点击率的广告。除了学习策略,我们还详细阐述了基于深度优化的 DiskANN 构建的 EBR 服务流水线,使大规模 EBR 能够以有竞争力的时间和内存效率高质量地完成。我们进行了全面的离线和在线实验来评估所提出的技术,其结果可能为未来 EBR 系统的发展提供有用的见解。得益于表示质量和 EBR 服务质量的显著改善,Uni-Retriever 已成为必应生产环境中的主要检索路径。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Uni-Retriever:+Towards+Learning+the+Unified+Embedding+Based+Retriever+in+Bing+Sponsored+Search)|1|
+|[Uni-Retriever: Towards Learning the Unified Embedding Based Retriever in Bing Sponsored Search](https://doi.org/10.1145/3534678.3539212)|Jianjin Zhang, Zheng Liu, Weihao Han, Shitao Xiao, Ruicheng Zheng, Yingxia Shao, Hao Sun, Hanqing Zhu, Premkumar Srinivasan, Weiwei Deng, Qi Zhang, Xing Xie|Microsoft, Newark, NJ, USA; Microsoft, Beijing, China; Beijing University of Posts and Telecommunications, Beijing, China; Microsoft, Seattle, DC, USA|Embedding based retrieval (EBR) is a fundamental building block in many web applications. However, EBR in sponsored search is distinguished from other generic scenarios and technically challenging due to the need of serving multiple retrieval purposes: firstly, it has to retrieve high-relevance ads, which may exactly serve user's search intent; secondly, it needs to retrieve high-CTR ads so as to maximize the overall user clicks. In this paper, we present a novel representation learning framework Uni-Retriever developed for Bing Search, which unifies two different training modes, knowledge distillation and contrastive learning, to realize both required objectives. On one hand, the capability of making high-relevance retrieval is established by distilling knowledge from the "relevance teacher model''. On the other hand, the capability of making high-CTR retrieval is optimized by learning to discriminate user's clicked ads from the entire corpus. The two training modes are jointly performed as a multi-objective learning process, such that the ads of high relevance and CTR can be favored by the generated embeddings. Besides the learning strategy, we also elaborate our solution for the EBR serving pipeline built upon the substantially optimized DiskANN, where massive-scale EBR can be performed with competitive time and memory efficiency, and accomplished in high quality. We conduct comprehensive offline and online experiments to evaluate the proposed techniques, whose findings may provide useful insights for the future development of EBR systems. 
Uni-Retriever has been mainstreamed as the major retrieval path in Bing's production thanks to the notable improvements in representation and EBR serving quality.|基于嵌入的检索(EBR)是许多 Web 应用程序的基础构件。然而,由于需要服务于多种检索目的,赞助商搜索中的 EBR 不同于其他一般情况,在技术上具有挑战性: 首先,它必须检索高相关度的广告,这可能恰好服务于用户的搜索意图; 其次,它需要检索高点击率的广告,以最大限度地提高用户的总体点击率。本文提出了一种新的面向 Bing 搜索的 Uni-Retriever 表示学习框架,该框架将知识蒸馏和对比学习两种不同的训练模式相结合,以同时实现上述两个目标。一方面,通过从“相关性教师模型”中蒸馏知识,建立高相关性检索能力。另一方面,通过学习从整个语料库中区分用户点击广告,优化了高点击率检索的能力。这两种训练模式作为一个多目标学习过程联合执行,使生成的嵌入能够偏好高相关度和高点击率的广告。除了学习策略,我们还详细阐述了基于深度优化的 DiskANN 构建的 EBR 服务流水线,使大规模 EBR 能够以有竞争力的时间和内存效率高质量地完成。我们进行了全面的离线和在线实验来评估所提出的技术,其结果可能为未来 EBR 系统的发展提供有用的见解。得益于表示质量和 EBR 服务质量的显著改善,Uni-Retriever 已成为必应生产环境中的主要检索路径。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Uni-Retriever:+Towards+Learning+the+Unified+Embedding+Based+Retriever+in+Bing+Sponsored+Search)|1|
|[An Online Multi-task Learning Framework for Google Feed Ads Auction Models](https://doi.org/10.1145/3534678.3539055)|Ning Ma, Mustafa Ispir, Yuan Li, Yongpeng Yang, Zhe Chen, Derek Zhiyuan Cheng, Lan Nie, Kishor Barman|Google Inc., Mountain View, CA, USA|In this paper, we introduce a large scale online multi-task deep learning framework for modeling multiple feed ads auction prediction tasks on an industry-scale feed ads recommendation platform. Multiple prediction tasks are combined into one single model which is continuously trained on real time new ads data. Multi-tasking ads auction models in real-time faces many real-world challenges. For example, each task may be trained on different sets of training data; the labels of different tasks may have different arrival times due to label delay; different tasks will interact with each other; combining the losses of each task is non-trivial. We tackle these challenges using practical and novel techniques such as multi-stage training for handling label delay, Multi-gate Mixture-of-Experts (MMoE) to optimize model interaction and an auto-parameter learning algorithm to optimize the loss weights of different tasks. We demonstrate that our proposed techniques can lead to quality improvements and substantial resource saving compared to modeling each single task independently.|本文介绍了一个大规模的在线多任务深度学习框架,在一个行业规模的推荐平台上对多种推广广告拍卖预测任务进行建模。将多个预测任务组合成一个单独的模型,对实时的新广告数据进行连续的训练。实时多任务广告拍卖模型在现实生活中面临着许多挑战。例如,每个任务可以在不同的训练数据集上进行训练; 由于标签延迟,不同任务的标签可能有不同的到达时间; 不同的任务会相互影响; 而组合各任务的损失并非易事。针对这些问题,我们采用了多阶段训练来处理标签延迟,多门专家混合(MMoE)来优化模型交互,以及自动参数学习算法来优化不同任务的损失权重。我们证明,与独立建模每个单独的任务相比,我们提出的技术可以导致质量改进和大量资源节省。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=An+Online+Multi-task+Learning+Framework+for+Google+Feed+Ads+Auction+Models)|1|
|[NxtPost: User To Post Recommendations In Facebook Groups](https://doi.org/10.1145/3534678.3539042)|Kaushik Rangadurai, Yiqun Liu, Siddarth Malreddy, Xiaoyi Liu, Piyush Maheshwari, Vishwanath Sangale, Fedor Borisyuk|Meta Platforms Inc., Menlo Park, CA, USA|In this paper, we present NxtPost, a deployed user-to-post content based sequential recommender system for Facebook Groups. Inspired by recent advances in NLP, we have adapted a Transformer-based model to the domain of sequential recommendation. We explore causal masked multi-head attention that optimizes both short and long-term user interests. From a user's past activities validated by a defined safety process, NxtPost seeks to learn a representation for the user's dynamic content preference and to predict the next post the user may be interested in. In contrast to previous Transformer-based methods, we do not assume that the recommendable posts have a fixed corpus. 
Accordingly, we use an external item/token embedding to extend a sequence-based approach to a large vocabulary. We achieve 49% abs. improvement in offline evaluation. As a result of NxtPost deployment, 0.6% more users are meeting new people, engaging with the community, sharing knowledge and getting support. The paper shares our experience in developing a personalized sequential recommender system, lessons deploying the model for cold start users, how to deal with freshness, and tuning strategies to reach higher efficiency in online A/B experiments.|在本文中,我们介绍了 NxtPost,这是一个为 Facebook Groups 部署的基于用户到发布内容的顺序推荐系统。受自然语言处理最新进展的启发,我们将一个基于 Transformer 的模型应用于顺序推荐领域。我们探索因果掩盖多头注意,优化短期和长期用户的兴趣。通过定义的安全过程验证用户过去的活动,NxtPost 试图学习用户动态内容偏好的表示,并预测下一个帖子用户可能感兴趣的内容。与以前基于 Transformer 的方法相比,我们不假定推荐的帖子具有固定的语料库。因此,我们使用外部项/令牌嵌入来将基于序列的方法扩展到大型词汇表。我们在离线评估中取得了49% 的绝对提升。作为 NxtPost 部署的结果,0.6% 的用户正在结识新朋友,参与社区活动,分享知识并获得支持。本文分享了我们在开发个性化连续推荐系统的经验、为冷启动用户部署模型的教训、如何处理新鲜感,以及在线 A/B 实验中为提高效率而调整策略的经验。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=NxtPost:+User+To+Post+Recommendations+In+Facebook+Groups)|1|
|[ReprBERT: Distilling BERT to an Efficient Representation-Based Relevance Model for E-Commerce](https://doi.org/10.1145/3534678.3539090)|Shaowei Yao, Jiwei Tan, Xi Chen, Juhao Zhang, Xiaoyi Zeng, Keping Yang|Alibaba Group, Hangzhou, China|Text relevance or text matching of query and product is an essential technique for an e-commerce search engine, which helps users find the desirable products and is also crucial to ensuring user experience. A major difficulty for e-commerce text relevance is the severe vocabulary gap between query and product. Recently, neural networks have been the mainstream for the text matching task owing to the better performance for semantic matching. Practical e-commerce relevance models are usually representation-based architectures, which can pre-compute representations offline and are therefore online efficient. Interaction-based models, although they can achieve better performance, are mostly time-consuming and hard to be deployed online. Recently BERT has achieved significant progress on many NLP tasks including text matching, and it is of great value but also a big challenge to deploy BERT to the e-commerce relevance task. To realize this goal, we propose ReprBERT, which has the advantages of both excellent performance and low latency, by distilling the interaction-based BERT model to a representation-based architecture. To reduce the performance decline, we investigate the key reasons and propose two novel interaction strategies to resolve the absence of representation interaction and low-level semantic interaction. Finally, ReprBERT can achieve only about 1.5% AUC loss from the interaction-based BERT, but has more than 10% AUC improvement compared to previous state-of-the-art representation-based models. 
ReprBERT has already been deployed on the search engine of Taobao, serving the entire search traffic and achieving significant gains in user experience and business profit.|查询和产品的文本相关性或文本匹配是电子商务搜索引擎的关键技术,它可以帮助用户找到想要的产品,也是保证用户体验的关键。电子商务文本相关性的一个主要困难是查询和产品之间严重的词汇差距。近年来,神经网络以其较好的语义匹配性能成为文本匹配的主流。实用的电子商务相关性模型通常是基于表示的体系结构,它可以离线预先计算表示,因此具有在线效率。基于交互的模型,尽管可以获得更好的性能,但是大部分都是耗时的,并且很难在线部署。近年来,BERT 在包括文本匹配在内的许多自然语言处理任务中取得了显著的进展,将 BERT 部署到电子商务相关任务中具有很大的价值,但也面临很大的挑战。为了实现这一目标,我们提出了 ReprBERT,它具有良好的性能和低延迟的优点,通过提炼基于交互的 BERT 模型到一个基于表示的体系结构。为了减少性能下降,本文研究了其关键原因,并提出了两种新的交互策略,以解决表示交互和低层语义交互缺失的问题。最后,相比基于交互的 BERT,ReprBERT 仅损失约1.5% 的 AUC,但与以前最先进的基于表示的模型相比,AUC 提升超过10%。ReprBERT 已经部署在淘宝的搜索引擎上,服务于整个搜索流量,取得了显著的用户体验和商业利润收益。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ReprBERT:+Distilling+BERT+to+an+Efficient+Representation-Based+Relevance+Model+for+E-Commerce)|1|
-|[Learning Supplementary NLP Features for CTR Prediction in Sponsored Search](https://doi.org/10.1145/3534678.3539064)|Dong Wang, Shaoguang Yan, Yunqing Xia, Kavé Salamatian, Weiwei Deng, Qi Zhang|University of Savoie & Tallinn University of Technology, Annecy, France; Microsoft Corporation, Beijing, China|In sponsored search engines, pre-trained language models have shown promising performance improvements on Click-Through-Rate (CTR) prediction. A widely used approach for utilizing pre-trained language models in CTR prediction consists of fine-tuning the language models with click labels and early stopping on the peak value of the obtained Area Under the ROC Curve (AUC). Thereafter the output of these fine-tuned models, i.e., the final score or intermediate embedding generated by language model, is used as a new Natural Language Processing (NLP) feature into the CTR prediction baseline. This cascade approach avoids complicating the CTR prediction baseline, while keeping flexibility and agility. However, we show in this work that calibrating the language model separately based on the peak single-model AUC does not always yield NLP features that ultimately give the best performance in the CTR prediction model. Our analysis reveals that the misalignment is due to overlap and redundancy between the new NLP features and the existing features in the CTR prediction baseline. In other words, the NLP features can improve CTR prediction better if such overlap can be reduced. For this purpose, we introduce a simple and general joint-training framework for fine-tuning of language models, combined with the already existing features in the CTR prediction baseline, to extract supplementary knowledge for the NLP feature. Moreover, we develop an efficient Supplementary Knowledge Distillation (SuKD) that transfers the supplementary knowledge learned by a heavy language model to a light and serviceable model. Comprehensive experiments on both public data and commercial data presented in this work demonstrate that the new NLP features resulting from the joint-training framework can significantly outperform the ones from independent fine-tuning based on click labels. 
We also show that the light model distilled with SuKD can provide an obvious AUC improvement in CTR prediction over traditional feature-based knowledge distillation.|在赞助商搜索引擎中,预先训练好的语言模型在点击率(Click-Through-Rate,CTR)预测方面显示出有希望的性能改进。在点击率预测中使用预训练语言模型的一种常见做法是:用点击标签微调语言模型,并在所得 ROC 曲线下面积(AUC)达到峰值时提前停止。然后,这些微调模型的输出,即语言模型生成的最终分数或中间嵌入,被用作 CTR 预测基线的一个新的自然语言处理(NLP)特征。这种级联方法避免了使 CTR 预测基线复杂化,同时保持了灵活性和敏捷性。然而,我们的工作表明,仅基于单模型 AUC 峰值单独校准语言模型,所得到的 NLP 特征并不总能使 CTR 预测模型最终达到最佳性能。我们的分析表明,这种不一致源于新 NLP 特征与 CTR 预测基线中已有特征之间的重叠和冗余。换句话说,如果能够减少这种重叠,NLP 特征能够更好地提高 CTR 预测。为此,本文提出了一种简单通用的语言模型微调联合训练框架,结合 CTR 预测基线中已有的特征,提取 NLP 特征的补充知识。此外,我们开发了一种高效的补充知识蒸馏方法(SuKD),将重量级语言模型学到的补充知识迁移到一个轻量级、可用的模型中。对公共数据和商业数据的综合实验表明,联合训练框架所产生的新的自然语言处理特征可以显著优于基于点击标签独立微调得到的特征。与传统的基于特征的知识蒸馏相比,用 SuKD 蒸馏得到的轻量模型在 CTR 预测上可带来明显的 AUC 提升。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+Supplementary+NLP+Features+for+CTR+Prediction+in+Sponsored+Search)|1|
-|[AutoShard: Automated Embedding Table Sharding for Recommender Systems](https://doi.org/10.1145/3534678.3539034)|Daochen Zha, Louis Feng, Bhargav Bhushanam, Dhruv Choudhary, Jade Nie, Yuandong Tian, Jay Chae, Yinbin Ma, Arun Kejariwal, Xia Hu|Rice University, Houston, TX, USA; Meta Platforms, Inc., Menlo Park, CA, USA|Embedding learning is an important technique in deep recommendation models to map categorical features to dense vectors. However, the embedding tables often demand an extremely large number of parameters, which become the storage and efficiency bottlenecks. Distributed training solutions have been adopted to partition the embedding tables into multiple devices. However, the embedding tables can easily lead to imbalances if not carefully partitioned. This is a significant design challenge in distributed systems, named embedding table sharding, i.e., how we should partition the embedding tables to balance the costs across devices, which is a non-trivial task because 1) it is hard to efficiently and precisely measure the cost, and 2) the partition problem is known to be NP-hard. In this work, we introduce our novel practice in Meta, namely AutoShard, which uses a neural cost model to directly predict the multi-table costs and leverages deep reinforcement learning to solve the partition problem. Experimental results on an open-sourced large-scale synthetic dataset and Meta's production dataset demonstrate the superiority of AutoShard over the heuristics. Moreover, the learned policy of AutoShard can transfer to sharding tasks with various numbers of tables and different ratios of the unseen tables without any fine-tuning. Furthermore, AutoShard can efficiently shard hundreds of tables in seconds. The effectiveness, transferability, and efficiency of AutoShard make it desirable for production use. Our algorithms have been deployed in Meta's production environment. 
A prototype is available at https://github.com/daochenzha/autoshard|嵌入学习是深度推荐模型中将分类特征映射到密集向量的一项重要技术。然而,嵌入表往往需要大量的参数,成为存储和效率的瓶颈。采用分布式训练解决方案将嵌入表划分为多个设备。然而,如果不仔细分区,嵌入表很容易导致不平衡。这是分布式系统嵌入表分片的一个重大设计挑战,即我们应该如何划分嵌入表来平衡设备之间的成本,这绝非易事,因为1)很难有效和精确地度量成本,2)划分问题是已知的 NP 难题。在这项工作中,我们介绍了我们在 Meta 中的新实践,即 AutoShard,它使用一个神经成本模型来直接预测多表成本,并利用深度强化学习来解决分区问题。在一个开源的大规模合成数据集和 Meta 生产数据集上的实验结果证明了 AutoShard 相对于启发式算法的优越性。此外,AutoShard 学到的策略无需任何微调即可迁移到具有不同表数量和不同未见表比例的分片任务。此外,AutoShard 可以在几秒钟内高效地切分数百个表。AutoShard 的有效性、可迁移性和效率使其适合生产使用。我们的算法已经部署在 Meta 的生产环境中。原型代码见 https://github.com/daochenzha/autoshard。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=AutoShard:+Automated+Embedding+Table+Sharding+for+Recommender+Systems)|1|
-|[On-Device Learning for Model Personalization with Large-Scale Cloud-Coordinated Domain Adaption](https://doi.org/10.1145/3534678.3539263)|Yikai Yan, Chaoyue Niu, Renjie Gu, Fan Wu, Shaojie Tang, Lifeng Hua, Chengfei Lyu, Guihai Chen|Shanghai Jiao Tong University, Shanghai, China; Alibaba Group, Hangzhou, China; University of Texas at Dallas, Richardson, TX, USA|Cloud-based learning is currently the mainstream in both academia and industry. However, the global data distribution, as a mixture of all the users' data distributions, for training a global model may deviate from each user's local distribution for inference, making the global model non-optimal for each individual user. To mitigate distribution discrepancy, on-device training over local data for model personalization is a potential solution, but suffers from serious overfitting. In this work, we propose a new device-cloud collaborative learning framework under the paradigm of domain adaption, called MPDA, to break the dilemmas of purely cloud-based learning and on-device training. From the perspective of a certain user, the general idea of MPDA is to retrieve some similar data from the cloud's global pool, which functions as large-scale source domains, to augment the user's local data as the target domain. The key principle for choosing which outside data to use depends on whether the model trained over these data can generalize well over the local data. We theoretically analyze that MPDA can reduce distribution discrepancy and overfitting risk. We also extensively evaluate over the public MovieLens 20M and Amazon Electronics datasets, as well as an industrial dataset collected from Mobile Taobao over a period of 30 days. We finally build a device-tunnel-cloud system pipeline, deploy MPDA in the icon area of Mobile Taobao for click-through rate prediction, and conduct online A/B testing. 
Both offline and online results demonstrate that MPDA outperforms the baselines of cloud-based learning and of on-device training only over local data, on multiple offline and online metrics.|基于云的学习是目前学术界和工业界的主流。然而,用于训练全局模型的全局数据分布是所有用户数据分布的混合,可能偏离每个用户推理时的本地分布,使得全局模型对每个用户而言并非最优。为了缓解分布差异,在本地数据上进行设备端训练以实现模型个性化是一个潜在的解决方案,但存在严重的过拟合问题。在这项工作中,我们在领域自适应的范式下提出了一个新的设备-云协同学习框架 MPDA,以打破纯基于云的学习和设备端训练的困境。从某个用户的角度来看,MPDA 的总体思想是从作为大规模源域的云端全局池中检索一些相似数据,以增广作为目标域的用户本地数据。选择哪些外部数据的关键原则,在于在这些数据上训练出的模型能否在本地数据上良好泛化。我们从理论上分析了 MPDA 可以降低分布差异和过拟合风险。我们还广泛评估了公开的 MovieLens 20M 和亚马逊电子数据集,以及在30天内从移动淘宝收集的工业数据集。最后,我们建立了设备-隧道-云系统流水线,在移动淘宝的图标区域部署 MPDA 进行点击率预测,并进行在线 A/B 测试。离线和在线结果都表明,在多个离线和在线指标上,MPDA 均优于纯基于云的学习以及仅使用本地数据的设备端训练这两类基线。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=On-Device+Learning+for+Model+Personalization+with+Large-Scale+Cloud-Coordinated+Domain+Adaption)|1|
-|[Debiasing Learning for Membership Inference Attacks Against Recommender Systems](https://doi.org/10.1145/3534678.3539392)|Zihan Wang, Na Huang, Fei Sun, Pengjie Ren, Zhumin Chen, Hengliang Luo, Maarten de Rijke, Zhaochun Ren|University of Amsterdam, Amsterdam, Netherlands; Meituan, Beijing, China; Alibaba Group, Beijing, China; Shandong University, Qingdao, China|Learned recommender systems may inadvertently leak information about their training data, leading to privacy violations. We investigate privacy threats faced by recommender systems through the lens of membership inference. In such attacks, an adversary aims to infer whether a user's data is used to train the target recommender. To achieve this, previous work has used a shadow recommender to derive training data for the attack model, and then predicts the membership by calculating difference vectors between users' historical interactions and recommended items. State-of-the-art methods face two challenging problems: (i) training data for the attack model is biased due to the gap between shadow and target recommenders, and (ii) hidden states in recommenders are not observational, resulting in inaccurate estimations of difference vectors. To address the above limitations, we propose a Debiasing Learning for Membership Inference Attacks against recommender systems (DL-MIA) framework that has four main components: (i) a difference vector generator, (ii) a disentangled encoder, (iii) a weight estimator, and (iv) an attack model. To mitigate the gap between recommenders, a variational auto-encoder (VAE) based disentangled encoder is devised to identify recommender invariant and specific features. To reduce the estimation bias, we design a weight estimator, assigning a truth-level score for each difference vector to indicate estimation accuracy. We evaluate DL-MIA against both general recommenders and sequential recommenders on three real-world datasets. 
Experimental results show that DL-MIA effectively alleviates training and estimation biases simultaneously, and achieves state-of-the-art attack performance.|经过学习得到的推荐系统可能无意中泄露其训练数据的相关信息,从而导致隐私侵犯。我们通过成员推理的视角来研究推荐系统所面临的隐私威胁。在这种攻击中,对手的目的是推断用户的数据是否被用来训练目标推荐器。为了实现这一目标,以前的工作是使用阴影推荐来获取攻击模型的训练数据,然后通过计算用户历史交互和推荐项目之间的差异向量来预测成员关系。最先进的方法面临两个具有挑战性的问题: (i)攻击模型的训练数据由于阴影和目标推荐器之间的差距而有偏差,以及(ii)推荐器中的隐藏状态不是观察性的,导致差异向量的估计不准确。为了解决上述局限性,我们提出了针对推荐系统(DL-MIA)的成员推断攻击的去偏学习框架,其具有四个主要组成部分: (i)差分矢量生成器,(ii)分离编码器,(iii)权重估计器和(iv)攻击模型。为了缩小推荐器之间的差距,设计了一种基于变分自动编码器(VAE)的解纠缠编码器来识别推荐器的不变性和特定特征。为了减少估计偏差,我们设计了一个权重估计器,为每个差异向量指定一个真值水平分数来表示估计的准确性。我们在三个真实世界的数据集上评估 DL-MIA 与通用推荐和顺序推荐的对比。实验结果表明,DL-MIA 同时有效地减小了训练偏差和估计偏差,并取得了一流的攻击性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Debiasing+Learning+for+Membership+Inference+Attacks+Against+Recommender+Systems)|1|
+|[Learning Supplementary NLP Features for CTR Prediction in Sponsored Search](https://doi.org/10.1145/3534678.3539064)|Dong Wang, Shaoguang Yan, Yunqing Xia, Kavé Salamatian, Weiwei Deng, Qi Zhang|Microsoft Corporation, Beijing, China; University of Savoie & Tallinn University of Technology, Annecy, France|In sponsored search engines, pre-trained language models have shown promising performance improvements on Click-Through-Rate (CTR) prediction. A widely used approach for utilizing pre-trained language models in CTR prediction consists of fine-tuning the language models with click labels and early stopping on the peak value of the obtained Area Under the ROC Curve (AUC). Thereafter the output of these fine-tuned models, i.e., the final score or intermediate embedding generated by language model, is used as a new Natural Language Processing (NLP) feature into the CTR prediction baseline. This cascade approach avoids complicating the CTR prediction baseline, while keeping flexibility and agility. However, we show in this work that calibrating the language model separately based on the peak single-model AUC does not always yield NLP features that ultimately give the best performance in the CTR prediction model. Our analysis reveals that the misalignment is due to overlap and redundancy between the new NLP features and the existing features in the CTR prediction baseline. In other words, the NLP features can improve CTR prediction better if such overlap can be reduced. For this purpose, we introduce a simple and general joint-training framework for fine-tuning of language models, combined with the already existing features in the CTR prediction baseline, to extract supplementary knowledge for the NLP feature. Moreover, we develop an efficient Supplementary Knowledge Distillation (SuKD) that transfers the supplementary knowledge learned by a heavy language model to a light and serviceable model. Comprehensive experiments on both public data and commercial data presented in this work demonstrate that the new NLP features resulting from the joint-training framework can significantly outperform the ones from independent fine-tuning based on click labels. 
We also show that the light model distilled with SuKD can provide an obvious AUC improvement in CTR prediction over traditional feature-based knowledge distillation.|在赞助商搜索引擎中,预先训练好的语言模型在点击率(Click-Through-Rate,CTR)预测方面显示出有希望的性能改进。在点击率预测中使用预训练语言模型的一种常见做法是:用点击标签微调语言模型,并在所得 ROC 曲线下面积(AUC)达到峰值时提前停止。然后,这些微调模型的输出,即语言模型生成的最终分数或中间嵌入,被用作 CTR 预测基线的一个新的自然语言处理(NLP)特征。这种级联方法避免了使 CTR 预测基线复杂化,同时保持了灵活性和敏捷性。然而,我们的工作表明,仅基于单模型 AUC 峰值单独校准语言模型,所得到的 NLP 特征并不总能使 CTR 预测模型最终达到最佳性能。我们的分析表明,这种不一致源于新 NLP 特征与 CTR 预测基线中已有特征之间的重叠和冗余。换句话说,如果能够减少这种重叠,NLP 特征能够更好地提高 CTR 预测。为此,本文提出了一种简单通用的语言模型微调联合训练框架,结合 CTR 预测基线中已有的特征,提取 NLP 特征的补充知识。此外,我们开发了一种高效的补充知识蒸馏方法(SuKD),将重量级语言模型学到的补充知识迁移到一个轻量级、可用的模型中。对公共数据和商业数据的综合实验表明,联合训练框架所产生的新的自然语言处理特征可以显著优于基于点击标签独立微调得到的特征。与传统的基于特征的知识蒸馏相比,用 SuKD 蒸馏得到的轻量模型在 CTR 预测上可带来明显的 AUC 提升。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+Supplementary+NLP+Features+for+CTR+Prediction+in+Sponsored+Search)|1|
+|[AutoShard: Automated Embedding Table Sharding for Recommender Systems](https://doi.org/10.1145/3534678.3539034)|Daochen Zha, Louis Feng, Bhargav Bhushanam, Dhruv Choudhary, Jade Nie, Yuandong Tian, Jay Chae, Yinbin Ma, Arun Kejariwal, Xia Hu|Meta Platforms, Inc., Menlo Park, CA, USA; Rice University, Houston, TX, USA|Embedding learning is an important technique in deep recommendation models to map categorical features to dense vectors. However, the embedding tables often demand an extremely large number of parameters, which become the storage and efficiency bottlenecks. Distributed training solutions have been adopted to partition the embedding tables into multiple devices. However, the embedding tables can easily lead to imbalances if not carefully partitioned. This is a significant design challenge in distributed systems, named embedding table sharding, i.e., how we should partition the embedding tables to balance the costs across devices, which is a non-trivial task because 1) it is hard to efficiently and precisely measure the cost, and 2) the partition problem is known to be NP-hard. In this work, we introduce our novel practice in Meta, namely AutoShard, which uses a neural cost model to directly predict the multi-table costs and leverages deep reinforcement learning to solve the partition problem. Experimental results on an open-sourced large-scale synthetic dataset and Meta's production dataset demonstrate the superiority of AutoShard over the heuristics. Moreover, the learned policy of AutoShard can transfer to sharding tasks with various numbers of tables and different ratios of the unseen tables without any fine-tuning. Furthermore, AutoShard can efficiently shard hundreds of tables in seconds. The effectiveness, transferability, and efficiency of AutoShard make it desirable for production use. Our algorithms have been deployed in Meta's production environment. 
A prototype is available at https://github.com/daochenzha/autoshard|嵌入学习是深度推荐模型中将分类特征映射到密集向量的一项重要技术。然而,嵌入表往往需要大量的参数,成为存储和效率的瓶颈。采用分布式训练解决方案将嵌入表划分为多个设备。然而,如果不仔细分区,嵌入表很容易导致不平衡。这是分布式系统嵌入表分片的一个重大设计挑战,即我们应该如何划分嵌入表来平衡设备之间的成本,这绝非易事,因为1)很难有效和精确地度量成本,2)划分问题是已知的 NP 难题。在这项工作中,我们介绍了我们在 Meta 中的新实践,即 AutoShard,它使用一个神经成本模型来直接预测多表成本,并利用深度强化学习来解决分区问题。在一个开源的大规模合成数据集和 Meta 生产数据集上的实验结果证明了 AutoShard 相对于启发式算法的优越性。此外,AutoShard 学到的策略无需任何微调即可迁移到具有不同表数量和不同未见表比例的分片任务。此外,AutoShard 可以在几秒钟内高效地切分数百个表。AutoShard 的有效性、可迁移性和效率使其适合生产使用。我们的算法已经部署在 Meta 的生产环境中。原型代码见 https://github.com/daochenzha/autoshard。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=AutoShard:+Automated+Embedding+Table+Sharding+for+Recommender+Systems)|1|
+|[On-Device Learning for Model Personalization with Large-Scale Cloud-Coordinated Domain Adaption](https://doi.org/10.1145/3534678.3539263)|Yikai Yan, Chaoyue Niu, Renjie Gu, Fan Wu, Shaojie Tang, Lifeng Hua, Chengfei Lyu, Guihai Chen|University of Texas at Dallas, Richardson, TX, USA; Shanghai Jiao Tong University, Shanghai, China; Alibaba Group, Hangzhou, China|Cloud-based learning is currently the mainstream in both academia and industry. However, the global data distribution, as a mixture of all the users' data distributions, for training a global model may deviate from each user's local distribution for inference, making the global model non-optimal for each individual user. To mitigate distribution discrepancy, on-device training over local data for model personalization is a potential solution, but suffers from serious overfitting. In this work, we propose a new device-cloud collaborative learning framework under the paradigm of domain adaption, called MPDA, to break the dilemmas of purely cloud-based learning and on-device training. From the perspective of a certain user, the general idea of MPDA is to retrieve some similar data from the cloud's global pool, which functions as large-scale source domains, to augment the user's local data as the target domain. The key principle for choosing which outside data to use depends on whether the model trained over these data can generalize well over the local data. We theoretically analyze that MPDA can reduce distribution discrepancy and overfitting risk. We also extensively evaluate over the public MovieLens 20M and Amazon Electronics datasets, as well as an industrial dataset collected from Mobile Taobao over a period of 30 days. We finally build a device-tunnel-cloud system pipeline, deploy MPDA in the icon area of Mobile Taobao for click-through rate prediction, and conduct online A/B testing. 
Both offline and online results demonstrate that MPDA outperforms the baselines of cloud-based learning and of on-device training only over local data, on multiple offline and online metrics.|基于云的学习是目前学术界和工业界的主流。然而,用于训练全局模型的全局数据分布是所有用户数据分布的混合,可能偏离每个用户推理时的本地分布,使得全局模型对每个用户而言并非最优。为了缓解分布差异,在本地数据上进行设备端训练以实现模型个性化是一个潜在的解决方案,但存在严重的过拟合问题。在这项工作中,我们在领域自适应的范式下提出了一个新的设备-云协同学习框架 MPDA,以打破纯基于云的学习和设备端训练的困境。从某个用户的角度来看,MPDA 的总体思想是从作为大规模源域的云端全局池中检索一些相似数据,以增广作为目标域的用户本地数据。选择哪些外部数据的关键原则,在于在这些数据上训练出的模型能否在本地数据上良好泛化。我们从理论上分析了 MPDA 可以降低分布差异和过拟合风险。我们还广泛评估了公开的 MovieLens 20M 和亚马逊电子数据集,以及在30天内从移动淘宝收集的工业数据集。最后,我们建立了设备-隧道-云系统流水线,在移动淘宝的图标区域部署 MPDA 进行点击率预测,并进行在线 A/B 测试。离线和在线结果都表明,在多个离线和在线指标上,MPDA 均优于纯基于云的学习以及仅使用本地数据的设备端训练这两类基线。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=On-Device+Learning+for+Model+Personalization+with+Large-Scale+Cloud-Coordinated+Domain+Adaption)|1|
+|[Debiasing Learning for Membership Inference Attacks Against Recommender Systems](https://doi.org/10.1145/3534678.3539392)|Zihan Wang, Na Huang, Fei Sun, Pengjie Ren, Zhumin Chen, Hengliang Luo, Maarten de Rijke, Zhaochun Ren|Meituan, Beijing, China; Alibaba Group, Beijing, China; Shandong University, Qingdao, China; University of Amsterdam, Amsterdam, Netherlands|Learned recommender systems may inadvertently leak information about their training data, leading to privacy violations. We investigate privacy threats faced by recommender systems through the lens of membership inference. In such attacks, an adversary aims to infer whether a user's data is used to train the target recommender. To achieve this, previous work has used a shadow recommender to derive training data for the attack model, and then predicts the membership by calculating difference vectors between users' historical interactions and recommended items. State-of-the-art methods face two challenging problems: (i) training data for the attack model is biased due to the gap between shadow and target recommenders, and (ii) hidden states in recommenders are not observational, resulting in inaccurate estimations of difference vectors. To address the above limitations, we propose a Debiasing Learning for Membership Inference Attacks against recommender systems (DL-MIA) framework that has four main components: (i) a difference vector generator, (ii) a disentangled encoder, (iii) a weight estimator, and (iv) an attack model. To mitigate the gap between recommenders, a variational auto-encoder (VAE) based disentangled encoder is devised to identify recommender invariant and specific features. To reduce the estimation bias, we design a weight estimator, assigning a truth-level score for each difference vector to indicate estimation accuracy. We evaluate DL-MIA against both general recommenders and sequential recommenders on three real-world datasets. 
Experimental results show that DL-MIA effectively alleviates training and estimation biases simultaneously, and achieves state-of-the-art attack performance.|经过学习得到的推荐系统可能无意中泄露其训练数据的相关信息,从而导致隐私侵犯。我们通过成员推理的视角来研究推荐系统所面临的隐私威胁。在这种攻击中,对手的目的是推断用户的数据是否被用来训练目标推荐器。为了实现这一目标,以前的工作是使用阴影推荐来获取攻击模型的训练数据,然后通过计算用户历史交互和推荐项目之间的差异向量来预测成员关系。最先进的方法面临两个具有挑战性的问题: (i)攻击模型的训练数据由于阴影和目标推荐器之间的差距而有偏差,以及(ii)推荐器中的隐藏状态不是观察性的,导致差异向量的估计不准确。为了解决上述局限性,我们提出了针对推荐系统(DL-MIA)的成员推断攻击的去偏学习框架,其具有四个主要组成部分: (i)差分矢量生成器,(ii)分离编码器,(iii)权重估计器和(iv)攻击模型。为了缩小推荐器之间的差距,设计了一种基于变分自动编码器(VAE)的解纠缠编码器来识别推荐器的不变性和特定特征。为了减少估计偏差,我们设计了一个权重估计器,为每个差异向量指定一个真值水平分数来表示估计的准确性。我们在三个真实世界的数据集上评估 DL-MIA 与通用推荐和顺序推荐的对比。实验结果表明,DL-MIA 同时有效地减小了训练偏差和估计偏差,并取得了一流的攻击性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Debiasing+Learning+for+Membership+Inference+Attacks+Against+Recommender+Systems)|1|
|[Automatic Generation of Product-Image Sequence in E-commerce](https://doi.org/10.1145/3534678.3539149)|Xiaochuan Fan, Chi Zhang, Yong Yang, Yue Shang, Xueying Zhang, Zhen He, Yun Xiao, Bo Long, Lingfei Wu|JD.COM, Beijing, China; JD.COM Research, Mountain View, CA, USA|Product images are essential for providing a desirable user experience in an e-commerce platform. For a platform with billions of products, it is extremely time-costly and labor-expensive to manually pick and organize qualified images. Furthermore, there are numerous and complicated image rules that a product image needs to comply with in order to be generated/selected. To address these challenges, in this paper, we present a new learning framework in order to achieve Automatic Generation of Product-Image Sequence (AGPIS) in e-commerce. To this end, we propose a Multi-modality Unified Image-sequence Classifier (MUIsC), which is able to simultaneously detect all categories of rule violations through learning. MUIsC leverages textual review feedback as the additional training target and utilizes product textual description to provide extra semantic information. Without using prior knowledge or manually-crafted tasks, a single MUIsC model is able to learn the holistic knowledge of image reviewing and detect all categories of rule violations simultaneously. Based on offline evaluations, we show that the proposed MUIsC significantly outperforms various baselines. Besides MUIsC, we also integrate some other important modules in the proposed framework, such as primary image selection, non-compliant content detection, and image deduplication. With all these modules, our framework works effectively and efficiently in the JD.com recommendation platform. By Dec 2021, our AGPIS framework has generated high-standard images for about 1.5 million products and achieves a 13.6% reject rate. 
Code of this work is available at https://github.com/efan3000/muisc.|在电子商务平台中,产品图像对于提供理想的用户体验至关重要。对于一个拥有数十亿产品的平台来说,手动挑选和组织合格的图像是非常耗费时间和人力的。此外,还有许多复杂的图像规则,产品图像需要遵守这些规则才能生成/选择。针对这些挑战,本文提出了一种新的学习框架,以实现电子商务中产品图像序列(AGPIS)的自动生成。为此,我们提出了一种多模态统一图像序列分类器(MUIsC),它能够通过学习同时检测所有类别的违规行为。MUIsC 利用文本评论反馈作为额外的训练目标,并利用产品文本描述提供额外的语义信息。在不使用先验知识或手工设计任务的情况下,单一的 MUIsC 模型能够学习图像审查的整体知识,并同时检测所有类别的违规行为。基于离线评估,我们表明所提出的 MUIsC 明显优于各种基线。除了 MUIsC,我们还整合了一些其他的重要模块,如主图选择、违规内容检测和图像去重。通过所有这些模块,我们的框架在 JD.com 推荐平台上高效地工作。到2021年12月,我们的 AGPIS 框架已经为大约150万个产品生成了高标准的图像,并且实现了13.6% 的拒绝率。这项工作的代码可在 https://github.com/efan3000/muisc 查阅。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Automatic+Generation+of+Product-Image+Sequence+in+E-commerce)|1|
|[Semantic Retrieval at Walmart](https://doi.org/10.1145/3534678.3539164)|Alessandro Magnani, Feng Liu, Suthee Chaidaroon, Sachin Yadav, Praveen Reddy Suram, Ajit Puthenputhussery, Sijie Chen, Min Xie, Anirudh Kashi, Tony Lee, Ciya Liao||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Semantic+Retrieval+at+Walmart)|1|
|[Training Large-Scale News Recommenders with Pretrained Language Models in the Loop](https://doi.org/10.1145/3534678.3539120)|Shitao Xiao, Zheng Liu, Yingxia Shao, Tao Di, Bhuvan Middha, Fangzhao Wu, Xing Xie||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Training+Large-Scale+News+Recommenders+with+Pretrained+Language+Models+in+the+Loop)|1|
@@ -167,7 +167,7 @@
|[User Engagement in Mobile Health Applications](https://doi.org/10.1145/3534678.3542681)|Babaniyi Yusuf Olaniyi, Ana Fernández del Río, África Periáñez, Lauren Bellhouse||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=User+Engagement+in+Mobile+Health+Applications)|1|
|[Advances in Exploratory Data Analysis, Visualisation and Quality for Data Centric AI Systems](https://doi.org/10.1145/3534678.3542604)|Hima Patel, Shanmukha C. Guttula, Ruhi Sharma Mittal, Naresh Manwani, Laure Berti-Équille, Abhijit Manatkar||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Advances+in+Exploratory+Data+Analysis,+Visualisation+and+Quality+for+Data+Centric+AI+Systems)|1|
|[Submodular Feature Selection for Partial Label Learning](https://doi.org/10.1145/3534678.3539292)|Wei-Xuan Bao, Jun-Yi Hang, Min-Ling Zhang|Southeast University & Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, Nanjing, China|Partial label learning induces a multi-class classifier from training examples each associated with a candidate label set where the ground-truth label is concealed. Feature selection improves the generalization ability of the learning system via selecting essential features for classification from the original feature set, while the task of partial label feature selection is challenging due to ambiguous labeling information. In this paper, the first attempt towards partial label feature selection is investigated via mutual-information-based dependency maximization. Specifically, the proposed approach SAUTE iteratively maximizes the dependency between selected features and labeling information, where the value of mutual information is estimated from confidence-based latent variable inference. In each iteration, the near-optimal features are selected greedily according to properties of the submodular mutual information function, while the density of the latent label variable is inferred with the help of updated labeling confidences over candidate labels by resorting to kNN aggregation in the induced lower-dimensional feature space. 
Extensive experiments over synthetic as well as real-world partial label data sets show that the generalization ability of well-established partial label learning algorithms can be significantly improved after coupling with the proposed feature selection approach.|部分标签学习从训练样本中归纳出一个多类分类器,每个样本与一个隐藏了真实标签的候选标签集相关联。特征选择通过从原始特征集中选择分类所需的基本特征来提高学习系统的泛化能力,而部分标记特征选择则由于标记信息不明确而面临挑战。本文首次研究了基于互信息的依赖最大化方法在部分标签特征选择中的应用。特别地,所提出的 SAUTE 方法迭代地最大化所选特征与标记信息之间的依赖性,其中互信息的值通过基于置信度的潜变量推断来估计。在每次迭代中,根据子模互信息函数的性质贪婪地选择接近最优的特征,利用诱导的低维特征空间中的 kNN 聚集,借助候选标签上更新的标签置信度推断潜在标签变量的密度。通过对合成和实际部分标签数据集的大量实验表明,与所提出的特征选择方法相结合,可以显著提高已有部分标签学习算法的泛化能力。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Submodular+Feature+Selection+for+Partial+Label+Learning)|1|
-|[Practical Lossless Federated Singular Vector Decomposition over Billion-Scale Data](https://doi.org/10.1145/3534678.3539402)|Di Chai, Leye Wang, Junxue Zhang, Liu Yang, Shuowei Cai, Kai Chen, Qiang Yang|Hong Kong University of Science and Technology, Hong Kong, China; Peking University, Beijing, China|With the enactment of privacy-preserving regulations, e.g., GDPR, federated SVD is proposed to enable SVD-based applications over different data sources without revealing the original data. However, many SVD-based applications cannot be well supported by existing federated SVD solutions. The crux is that these solutions, adopting either differential privacy (DP) or homomorphic encryption (HE), suffer from accuracy loss caused by unremovable noise or degraded efficiency due to inflated data. In this paper, we propose FedSVD, a practical lossless federated SVD method over billion-scale data, which can simultaneously achieve lossless accuracy and high efficiency. At the heart of FedSVD is a lossless matrix masking scheme delicately designed for SVD: 1) While adopting the masks to protect private data, FedSVD completely removes them from the final results of SVD to achieve lossless accuracy; and 2) As the masks do not inflate the data, FedSVD avoids extra computation and communication overhead during the factorization to maintain high efficiency. Experiments with real-world datasets show that FedSVD is over 10000x faster than the HE-based method and has 10 orders of magnitude smaller error than the DP-based solution (ε=0.1, δ=0.1) on SVD tasks. We further build and evaluate FedSVD over three real-world applications: principal components analysis (PCA), linear regression (LR), and latent semantic analysis (LSA), to show its superior performance in practice. 
On federated LR tasks, compared with two state-of-the-art solutions: FATE [17] and SecureML [19], FedSVD-LR is 100x faster than SecureML and 10x faster than FATE.|随着 GDPR 等隐私保护规则的制定,联邦奇异值分解被提出,以使基于奇异值分解的应用能够在不同数据源之间进行而不暴露原始数据。然而,现有的联邦 SVD 解决方案不能很好地支持许多基于 SVD 的应用程序。问题的关键在于,这些解决方案无论采用差分隐私(DP)还是同态加密(HE),要么因不可去除的噪声而损失精度,要么因数据膨胀而降低效率。在本文中,我们提出了 FedSVD,一种实用的无损联邦 SVD 方法,它可以同时达到无损精度和高效率。FedSVD 的核心是一种为 SVD 精心设计的无损矩阵掩蔽方案: 1)在采用掩蔽保护私有数据的同时,FedSVD 从 SVD 的最终结果中完全去除掩蔽,以实现无损精度; 2)由于掩蔽不会使数据膨胀,FedSVD 避免了因子分解过程中的额外计算和通信开销,保持了高效率。实际数据集的实验表明,在奇异值分解任务中,FedSVD 比基于 HE 的方法快10000倍以上,并且比基于 DP 的方法(ε = 0.1,δ = 0.1)误差小10个数量级。我们进一步构建和评估 FedSVD 在三个现实世界中的应用: 主成分分析(PCA)、线性回归(LR)和潜在语义分析(LSA),以显示其在实践中的卓越性能。在联邦 LR 任务上,与 FATE [17]和 SecureML [19]这两种最先进的解决方案相比,FedSVD-LR 比 SecureML 快100倍,比 FATE 快10倍。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Practical+Lossless+Federated+Singular+Vector+Decomposition+over+Billion-Scale+Data)|1|
+|[Practical Lossless Federated Singular Vector Decomposition over Billion-Scale Data](https://doi.org/10.1145/3534678.3539402)|Di Chai, Leye Wang, Junxue Zhang, Liu Yang, Shuowei Cai, Kai Chen, Qiang Yang|Peking University, Beijing, China; Hong Kong University of Science and Technology, Hong Kong, China|With the enactment of privacy-preserving regulations, e.g., GDPR, federated SVD is proposed to enable SVD-based applications over different data sources without revealing the original data. However, many SVD-based applications cannot be well supported by existing federated SVD solutions. The crux is that these solutions, adopting either differential privacy (DP) or homomorphic encryption (HE), suffer from accuracy loss caused by unremovable noise or degraded efficiency due to inflated data. In this paper, we propose FedSVD, a practical lossless federated SVD method over billion-scale data, which can simultaneously achieve lossless accuracy and high efficiency. At the heart of FedSVD is a lossless matrix masking scheme delicately designed for SVD: 1) While adopting the masks to protect private data, FedSVD completely removes them from the final results of SVD to achieve lossless accuracy; and 2) As the masks do not inflate the data, FedSVD avoids extra computation and communication overhead during the factorization to maintain high efficiency. Experiments with real-world datasets show that FedSVD is over 10000x faster than the HE-based method and has 10 orders of magnitude smaller error than the DP-based solution (ε=0.1, δ=0.1) on SVD tasks. We further build and evaluate FedSVD over three real-world applications: principal components analysis (PCA), linear regression (LR), and latent semantic analysis (LSA), to show its superior performance in practice. 
On federated LR tasks, compared with two state-of-the-art solutions: FATE [17] and SecureML [19], FedSVD-LR is 100x faster than SecureML and 10x faster than FATE.|随着 GDPR 等隐私保护规则的制定,联邦奇异值分解被提出,以使基于奇异值分解的应用能够在不同数据源之间进行而不暴露原始数据。然而,现有的联邦 SVD 解决方案不能很好地支持许多基于 SVD 的应用程序。问题的关键在于,这些解决方案无论采用差分隐私(DP)还是同态加密(HE),要么因不可去除的噪声而损失精度,要么因数据膨胀而降低效率。在本文中,我们提出了 FedSVD,一种实用的无损联邦 SVD 方法,它可以同时达到无损精度和高效率。FedSVD 的核心是一种为 SVD 精心设计的无损矩阵掩蔽方案: 1)在采用掩蔽保护私有数据的同时,FedSVD 从 SVD 的最终结果中完全去除掩蔽,以实现无损精度; 2)由于掩蔽不会使数据膨胀,FedSVD 避免了因子分解过程中的额外计算和通信开销,保持了高效率。实际数据集的实验表明,在奇异值分解任务中,FedSVD 比基于 HE 的方法快10000倍以上,并且比基于 DP 的方法(ε = 0.1,δ = 0.1)误差小10个数量级。我们进一步构建和评估 FedSVD 在三个现实世界中的应用: 主成分分析(PCA)、线性回归(LR)和潜在语义分析(LSA),以显示其在实践中的卓越性能。在联邦 LR 任务上,与 FATE [17]和 SecureML [19]这两种最先进的解决方案相比,FedSVD-LR 比 SecureML 快100倍,比 FATE 快10倍。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Practical+Lossless+Federated+Singular+Vector+Decomposition+over+Billion-Scale+Data)|1|
|[Efficient Join Order Selection Learning with Graph-based Representation](https://doi.org/10.1145/3534678.3539303)|Jin Chen, Guanyu Ye, Yan Zhao, Shuncheng Liu, Liwei Deng, Xu Chen, Rui Zhou, Kai Zheng||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Efficient+Join+Order+Selection+Learning+with+Graph-based+Representation)|1|
|[RLogic: Recursive Logical Rule Learning from Knowledge Graphs](https://doi.org/10.1145/3534678.3539421)|Kewei Cheng, Jiahao Liu, Wei Wang, Yizhou Sun||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=RLogic:+Recursive+Logical+Rule+Learning+from+Knowledge+Graphs)|1|
|[TARNet: Task-Aware Reconstruction for Time-Series Transformer](https://doi.org/10.1145/3534678.3539329)|Ranak Roy Chowdhury, Xiyuan Zhang, Jingbo Shang, Rajesh K. Gupta, Dezhi Hong||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=TARNet:+Task-Aware+Reconstruction+for+Time-Series+Transformer)|1|
@@ -221,32 +221,32 @@
|[A Practical Introduction to Federated Learning](https://doi.org/10.1145/3534678.3542631)|Yaliang Li, Bolin Ding, Jingren Zhou||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Practical+Introduction+to+Federated+Learning)|1|
|[Toolkit for Time Series Anomaly Detection](https://doi.org/10.1145/3534678.3542625)|Dhaval Patel, Dzung Phan, Markus Mueller, Amaresh Rajasekharan||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Toolkit+for+Time+Series+Anomaly+Detection)|1|
|[Epidemic Forecasting with a Data-Centric Lens](https://doi.org/10.1145/3534678.3542620)|Alexander Rodríguez, Harshavardhan Kamarthi, B. Aditya Prakash||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Epidemic+Forecasting+with+a+Data-Centric+Lens)|1|
-|[EXTR: Click-Through Rate Prediction with Externalities in E-Commerce Sponsored Search](https://doi.org/10.1145/3534678.3539053)|Chi Chen, Hui Chen, Kangzhi Zhao, Junsheng Zhou, Li He, Hongbo Deng, Jian Xu, Bo Zheng, Yong Zhang, Chunxiao Xing|Alibaba Group, Beijing, China; Tsinghua University, Beijing, China|Click-Through Rate (CTR) prediction, estimating the probability of a user clicking on items, plays a key fundamental role in sponsored search. E-commerce platforms display organic search results and advertisements (ads), collectively called items, together as a mixed list. The items displayed around the predicted ad, i.e. external items, may affect the user clicking on the predicted ad. Previous CTR models assume the user click only relies on the ad itself, which overlooks the effects of external items, referred to as external effects, or externalities. 
During the advertising prediction, the organic results have been generated by the organic system, while the final displayed ads on multiple ad slots have not been figured out, which leads to two challenges: 1) the predicted (target) ad may win any ad slot, bringing about diverse externalities. 2) external ads are undetermined, resulting in incomplete externalities. Facing the above challenges, inspired by the Transformer, we propose EXternality TRansformer (EXTR) which regards the target ad with all slots as the query and external items as the key&value to model externalities in all exposure situations in parallel. Furthermore, we design a Potential Allocation Generator (PAG) for EXTR, to learn the allocation of potential external ads to complete the externalities. Extensive experimental results on Alibaba datasets demonstrate the effectiveness of externalities in the task of CTR prediction and illustrate that our proposed approach can bring significant profits to the real-world e-commerce platform. EXTR has now been successfully deployed in the online search advertising system in Alibaba, serving the main traffic.|点击率(CTR)预测,估计用户点击项目的概率,在赞助商搜索中起着关键的基础作用。电子商务平台显示有机搜索结果和广告(广告) ,统称项目,一起作为一个混合清单。在预测广告周围显示的项目,即外部项目,可能会影响用户点击预测广告。以前的 CTR 模型假设用户的点击只依赖于广告本身,它忽略了外部项目的影响,称为外部影响,或外部性。在广告预测过程中,有机结果是由有机系统产生的,而最终在多个广告时段上显示的广告还没有计算出来,这就带来了两个挑战: 1)预测的(目标)广告可能赢得任何一个广告时段,带来不同的外部性。2)外部广告尚未确定,导致外部性不完整。面对上述挑战,受 Transformer 的启发,我们提出了外部性 Transformer(EXTR),将带有所有广告位的目标广告视为 query、外部项目视为 key&value,从而并行建模所有曝光情形下的外部性。此外,我们还为 EXTR 设计了一个潜在分配生成器(PAG) ,学习潜在外部广告的分配以补全外部性。对阿里巴巴数据集的大量实验结果显示了外部性在点击率预测任务中的有效性,并说明我们建议的方法可以为现实世界的电子商务平台带来显著的利润。EXTR 现已成功应用于阿里巴巴的在线搜索广告系统,为主要流量提供服务。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=EXTR:+Click-Through+Rate+Prediction+with+Externalities+in+E-Commerce+Sponsored+Search)|0|
+|[EXTR: Click-Through Rate Prediction with Externalities in E-Commerce Sponsored Search](https://doi.org/10.1145/3534678.3539053)|Chi Chen, Hui Chen, Kangzhi Zhao, Junsheng Zhou, Li He, Hongbo Deng, Jian Xu, Bo Zheng, Yong Zhang, Chunxiao Xing|Tsinghua University, Beijing, China; Alibaba Group, Beijing, China|Click-Through Rate (CTR) prediction, estimating the probability of a user clicking on items, plays a key fundamental role in sponsored search. E-commerce platforms display organic search results and advertisements (ads), collectively called items, together as a mixed list. The items displayed around the predicted ad, i.e. external items, may affect the user clicking on the predicted ad. Previous CTR models assume the user click only relies on the ad itself, which overlooks the effects of external items, referred to as external effects, or externalities. During the advertising prediction, the organic results have been generated by the organic system, while the final displayed ads on multiple ad slots have not been figured out, which leads to two challenges: 1) the predicted (target) ad may win any ad slot, bringing about diverse externalities. 2) external ads are undetermined, resulting in incomplete externalities. Facing the above challenges, inspired by the Transformer, we propose EXternality TRansformer (EXTR) which regards the target ad with all slots as the query and external items as the key&value to model externalities in all exposure situations in parallel. Furthermore, we design a Potential Allocation Generator (PAG) for EXTR, to learn the allocation of potential external ads to complete the externalities. 
Extensive experimental results on Alibaba datasets demonstrate the effectiveness of externalities in the task of CTR prediction and illustrate that our proposed approach can bring significant profits to the real-world e-commerce platform. EXTR now has been successfully deployed in the online search advertising system in Alibaba, serving the main traffic.|点进率(ctrl)预测,估计用户点击项目的概率,在赞助商搜索中起着关键的基础作用。电子商务平台显示有机搜索结果和广告(广告) ,统称项目,一起作为一个混合清单。在预测广告周围显示的项目,即外部项目,可能会影响用户点击预测广告。以前的 CTR 模型假设用户的点击只依赖于广告本身,它忽略了外部项目的影响,称为外部影响,或外部性。在广告预测过程中,有机结果是由有机系统产生的,而最终在多个广告时段上显示的广告还没有计算出来,这就带来了两个挑战: 1)预测的(目标)广告可能赢得任何一个广告时段,带来不同的外部性。2)外部广告不确定性,导致外部性不完全。面对上述挑战,我们提出外部性变压器(EXTR)的启发,以所有时隙为查询目标广告和外部项目为关键和价值模型的外部性在所有曝光情况下并行。此外,我们还为 EXTR 设计了一个潜在分配生成器(PAG) ,学习如何分配潜在的外部广告来完成外部性。对阿里巴巴数据集的大量实验结果显示了外部性在点击率预测任务中的有效性,并说明我们建议的方法可以为现实世界的电子商务平台带来显著的利润。EXTR 现已成功应用于阿里巴巴的在线搜索广告系统,为主要流量提供服务。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=EXTR:+Click-Through+Rate+Prediction+with+Externalities+in+E-Commerce+Sponsored+Search)|0| |[PARSRec: Explainable Personalized Attention-fused Recurrent Sequential Recommendation Using Session Partial Actions](https://doi.org/10.1145/3534678.3539432)|Ehsan Gholami, Mohammad Motamedi, Ashwin Aravindakshan|University of California, Davis, Davis, CA, USA|The emerging meta- and multi-verse landscape is yet another step towards the more prevalent use of already ubiquitous online markets. In such markets, recommender systems play critical roles by offering items of interest to the users, thereby narrowing down a vast search space that comprises hundreds of thousands of products. Recommender systems are usually designed to learn common user behaviors and rely on them for inference. This approach, while effective, is oblivious to subtle idiosyncrasies that differentiate humans from each other. Focusing on this observation, we propose an architecture that relies on common patterns as well as individual behaviors to tailor its recommendations for each person. Simulations under a controlled environment show that our proposed model learns interpretable personalized user behaviors. Our empirical results on Nielsen Consumer Panel dataset indicate that the proposed approach achieves up to 27.9% performance improvement compared to the state-of-the-art.|新兴的元和多元宇宙景观是朝着更普遍地使用已经无处不在的在线市场迈出的又一步。在这样的市场中,推荐系统通过向用户提供感兴趣的项目发挥着关键作用,从而缩小了由成千上万个产品组成的巨大搜索空间。推荐系统通常被设计用来学习常见的用户行为,并依赖它们进行推理。这种方法虽然有效,却忽略了区分人与人之间的微妙特质。基于这一观察,我们提出了一个依赖于公共模式和个人行为的体系结构,以便为每个人量身定制其建议。在受控环境下的仿真表明,我们提出的模型学习可解释的个性化用户行为。我们对 AC尼尔森面板数据集的实验结果表明,与最先进的技术相比,提出的方法实现了高达27.9% 的性能改进。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=PARSRec:+Explainable+Personalized+Attention-fused+Recurrent+Sequential+Recommendation+Using+Session+Partial+Actions)|0| -|[Pretraining Representations of Multi-modal Multi-query E-commerce Search](https://doi.org/10.1145/3534678.3539200)|Xinyi Liu, Wanxian Guan, Lianyun Li, Hui Li, Chen Lin, Xubin Li, Si Chen, Jian Xu, Hongbo Deng, Bo Zheng|Alibaba Group, Hangzhou, China; Xiamen University, Xiamen, China|The importance of modeling contextual information within a search session has been widely acknowledged. However, learning representations of multi-query multi-modal (MM) search, in which Mobile Taobao users repeatedly submit textual and visual queries, remains unexplored in literature. Previous work which learns task-specific representations of textual query sessions fails to capture diverse query types and correlations in MM search sessions. 
This paper presents to represent MM search sessions by heterogeneous graph neural network (HGN). A multi-view contrastive learning framework is proposed to pretrain the HGN, with two views to model different intra-query, inter-query, and inter-modality information diffusion in MM search. Extensive experiments demonstrate that, the pretrained session representation can benefit state-of-the-art baselines on various downstream tasks, such as personalized click prediction, query suggestion, and intent classification.|在搜索会话中建模上下文信息的重要性已经得到了广泛的认可。然而,多查询多模态(MM)搜索的学习表征,其中移动淘宝用户重复提交文本和视觉查询,仍然没有文献探索。前面的工作学习了文本查询会话的特定任务表示,但未能在 MM 搜索会话中捕获不同的查询类型和相关性。本文提出用异构图神经网络(HGN)来表示 MM 搜索会话。提出了一种多视图对比学习框架对 HGN 进行预训练,使用两种视图对 MM 搜索中不同的查询内、查询间和模态间信息扩散进行建模。大量的实验表明,预先训练的会话表示可以使各种下游任务的最先进的基线受益,例如个性化的点击预测、查询建议和意图分类。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Pretraining+Representations+of+Multi-modal+Multi-query+E-commerce+Search)|0| -|[Deep Search Relevance Ranking in Practice](https://doi.org/10.1145/3534678.3542632)|Linsey Pang, Wei Liu, Kenghao Chang, Xue Li, Moumita Bhattacharya, Xianjing Liu, Stephen Guo|Microsoft, Mountain View, CA, USA; Netflix, Los Gatos, CA, USA; Walmart Global Tech, Sunnyvale, CA, USA; Twitter, San Jose , CA, USA; University of Technology Sydney, Sydney, Australia; Salesforce, San Francisco, CA, USA|Machine learning techniques for developing industry-scale search engines have long been a prominent part of most domains and their online products. Search relevance algorithms are key components of products across different fields, including e-commerce, streaming services, and social networks. In this tutorial, we give an introduction to such large-scale search ranking systems, specifically focusing on deep learning techniques in this area. The topics we cover are the following: (1) Overview of search ranking systems in practice, including classical and machine learning techniques; (2) Introduction to sequential and language models in the context of search ranking; and (3) Knowledge distillation approaches for this area. For each of the aforementioned sessions, we first give an introductory talk and then go over an hands-on tutorial to really hone in on the concepts. We cover fundamental concepts using demos, case studies, and hands-on examples, including the latest Deep Learning methods that have achieved state-of-the-art results in generating the most relevant search results. Moreover, we show example implementations of these methods in python, leveraging a variety of open-source machine-learning/deep-learning libraries as well as real industrial data or open-source data.|用于开发行业规模搜索引擎的机器学习技术长期以来一直是大多数领域及其在线产品的重要组成部分。搜索相关算法是不同领域产品的关键组成部分,包括电子商务、流媒体服务和社交网络。在本教程中,我们将介绍这种大规模的搜索排名系统,特别关注这一领域的深度学习技术。我们讨论的主题如下: (1)搜索排名系统在实践中的概述,包括经典的和机器学习技术; (2)在搜索排名的背景下序列和语言模型的介绍; 和(3)这个领域的知识提取方法。对于前面提到的每一个会议,我们首先做一个介绍性的演讲,然后通过一个实践教程来真正地深入理解这些概念。我们使用演示、案例研究和实践例子介绍基本概念,包括最新的深度学习方法,这些方法在生成最相关的搜索结果时取得了最先进的结果。此外,我们还展示了这些方法在 python 中的实现示例,利用了各种开源机器学习/深度学习库以及真实的工业数据或开源数据。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Deep+Search+Relevance+Ranking+in+Practice)|0| +|[Pretraining Representations of Multi-modal Multi-query E-commerce Search](https://doi.org/10.1145/3534678.3539200)|Xinyi Liu, Wanxian Guan, Lianyun Li, Hui Li, Chen Lin, Xubin Li, Si Chen, Jian Xu, Hongbo Deng, Bo Zheng|Xiamen University, Xiamen, China; Alibaba Group, Hangzhou, China|The importance of modeling contextual information within a search session has been widely acknowledged. 
However, learning representations of multi-query multi-modal (MM) search, in which Mobile Taobao users repeatedly submit textual and visual queries, remains unexplored in literature. Previous work which learns task-specific representations of textual query sessions fails to capture diverse query types and correlations in MM search sessions. This paper presents to represent MM search sessions by heterogeneous graph neural network (HGN). A multi-view contrastive learning framework is proposed to pretrain the HGN, with two views to model different intra-query, inter-query, and inter-modality information diffusion in MM search. Extensive experiments demonstrate that, the pretrained session representation can benefit state-of-the-art baselines on various downstream tasks, such as personalized click prediction, query suggestion, and intent classification.|在搜索会话中建模上下文信息的重要性已经得到了广泛的认可。然而,多查询多模态(MM)搜索的学习表征,其中移动淘宝用户重复提交文本和视觉查询,仍然没有文献探索。前面的工作学习了文本查询会话的特定任务表示,但未能在 MM 搜索会话中捕获不同的查询类型和相关性。本文提出用异构图神经网络(HGN)来表示 MM 搜索会话。提出了一种多视图对比学习框架对 HGN 进行预训练,使用两种视图对 MM 搜索中不同的查询内、查询间和模态间信息扩散进行建模。大量的实验表明,预先训练的会话表示可以使各种下游任务的最先进的基线受益,例如个性化的点击预测、查询建议和意图分类。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Pretraining+Representations+of+Multi-modal+Multi-query+E-commerce+Search)|0| +|[Deep Search Relevance Ranking in Practice](https://doi.org/10.1145/3534678.3542632)|Linsey Pang, Wei Liu, Kenghao Chang, Xue Li, Moumita Bhattacharya, Xianjing Liu, Stephen Guo|Walmart Global Tech, Sunnyvale, CA, USA; Netflix, Los Gatos, CA, USA; Microsoft, Mountain View, CA, USA; University of Technology Sydney, Sydney, Australia; Twitter, San Jose , CA, USA; Salesforce, San Francisco, CA, USA|Machine learning techniques for developing industry-scale search engines have long been a prominent part of most domains and their online products. Search relevance algorithms are key components of products across different fields, including e-commerce, streaming services, and social networks. In this tutorial, we give an introduction to such large-scale search ranking systems, specifically focusing on deep learning techniques in this area. The topics we cover are the following: (1) Overview of search ranking systems in practice, including classical and machine learning techniques; (2) Introduction to sequential and language models in the context of search ranking; and (3) Knowledge distillation approaches for this area. For each of the aforementioned sessions, we first give an introductory talk and then go over an hands-on tutorial to really hone in on the concepts. We cover fundamental concepts using demos, case studies, and hands-on examples, including the latest Deep Learning methods that have achieved state-of-the-art results in generating the most relevant search results. 
Moreover, we show example implementations of these methods in python, leveraging a variety of open-source machine-learning/deep-learning libraries as well as real industrial data or open-source data.|用于开发行业规模搜索引擎的机器学习技术长期以来一直是大多数领域及其在线产品的重要组成部分。搜索相关算法是不同领域产品的关键组成部分,包括电子商务、流媒体服务和社交网络。在本教程中,我们将介绍这种大规模的搜索排名系统,特别关注这一领域的深度学习技术。我们讨论的主题如下: (1)搜索排名系统在实践中的概述,包括经典的和机器学习技术; (2)在搜索排名的背景下序列和语言模型的介绍; 和(3)这个领域的知识提取方法。对于前面提到的每一个会议,我们首先做一个介绍性的演讲,然后通过一个实践教程来真正地深入理解这些概念。我们使用演示、案例研究和实践例子介绍基本概念,包括最新的深度学习方法,这些方法在生成最相关的搜索结果时取得了最先进的结果。此外,我们还展示了这些方法在 python 中的实现示例,利用了各种开源机器学习/深度学习库以及真实的工业数据或开源数据。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Deep+Search+Relevance+Ranking+in+Practice)|0| |[Debiasing the Cloze Task in Sequential Recommendation with Bidirectional Transformers](https://doi.org/10.1145/3534678.3539430)|Khalil Damak, Sami Khenissi, Olfa Nasraoui|University of Louisville, Louisville, KY, USA|Bidirectional Transformer architectures are state-of-the-art sequential recommendation models that use a bi-directional representation capacity based on the Cloze task, a.k.a. Masked Language Modeling. The latter aims to predict randomly masked items within the sequence. Because they assume that the true interacted item is the most relevant one, an exposure bias results, where non-interacted items with low exposure propensities are assumed to be irrelevant. The most common approach to mitigating exposure bias in recommendation has been Inverse Propensity Scoring (IPS), which consists of down-weighting the interacted predictions in the loss function in proportion to their propensities of exposure, yielding a theoretically unbiased learning. In this work, we argue and prove that IPS does not extend to sequential recommendation because it fails to account for the temporal nature of the problem. We then propose a novel propensity scoring mechanism, which can theoretically debias the Cloze task in sequential recommendation. Finally we empirically demonstrate the debiasing capabilities of our proposed approach and its robustness to the severity of exposure bias.|双向转换器体系结构是最先进的顺序推荐模型,它使用基于完形填空任务的双向表示能力,也就是掩码语言建模。后者旨在预测序列中随机掩盖的项目。因为他们假设真正的相互作用的项目是最相关的一个,暴露偏差的结果,其中没有相互作用的项目低暴露倾向被认为是无关紧要的。减轻推荐中暴露偏倚的最常见方法是逆倾向评分(IPS) ,其包括按照暴露倾向的比例降低损失函数中的相互作用预测的权重,从而产生理论上无偏倚的学习。在这项工作中,我们争论和证明 IPS 没有扩展到顺序推荐,因为它没有考虑到问题的时间性质。然后,我们提出了一种新的倾向评分机制,它可以在理论上降低完形填空任务的顺序推荐。最后,我们通过实证证明了我们提出的方法的去偏能力及其对暴露偏差严重程度的鲁棒性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Debiasing+the+Cloze+Task+in+Sequential+Recommendation+with+Bidirectional+Transformers)|0| -|[A Generalized Doubly Robust Learning Framework for Debiasing Post-Click Conversion Rate Prediction](https://doi.org/10.1145/3534678.3539270)|Quanyu Dai, Haoxuan Li, Peng Wu, Zhenhua Dong, XiaoHua Zhou, Rui Zhang, Rui Zhang, Jie Sun|Huawei Noah's Ark Lab, Shenzhen, China; Huawei Hong Kong Theory Lab, Hong Kong, China; ruizhang.info, Shenzhen, China; Beijing Technology and Business University, Beijing, China; Peking University, Beijing, China|Post-click conversion rate (CVR) prediction is an essential task for discovering user interests and increasing platform revenues in a range of industrial applications. One of the most challenging problems of this task is the existence of severe selection bias caused by the inherent self-selection behavior of users and the item selection process of systems. Currently, doubly robust (DR) learning approaches achieve the state-of-the-art performance for debiasing CVR prediction. 
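The Inverse Propensity Scoring recipe that the Debiasing-the-Cloze-Task entry above builds on, down-weighting each observed interaction in the loss by its estimated exposure propensity, fits in a few lines. A sketch that assumes propensities come from some upstream estimator (names are ours):

```python
import torch

def ips_loss(scores, clicks, propensities, eps=1e-6):
    """Pointwise IPS objective: every observed interaction is re-weighted by
    1 / P(exposure), which removes exposure bias in expectation when the
    propensity estimates are correct."""
    w = clicks / propensities.clamp(min=eps)
    return -(w * torch.log(torch.sigmoid(scores) + eps)).mean()

loss = ips_loss(torch.randn(8), torch.randint(0, 2, (8,)).float(), torch.rand(8))
```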
However, in this paper, by theoretically analyzing the bias, variance and generalization bounds of DR methods, we find that existing DR approaches may have poor generalization caused by inaccurate estimation of propensity scores and imputation errors, which often occur in practice. Motivated by such analysis, we propose a generalized learning framework that not only unifies existing DR methods, but also provides a valuable opportunity to develop a series of new debiasing techniques to accommodate different application scenarios. Based on the framework, we propose two new DR methods, namely DR-BIAS and DR-MSE. DR-BIAS directly controls the bias of DR loss, while DR-MSE balances the bias and variance flexibly, which achieves better generalization performance. In addition, we propose a novel tri-level joint learning optimization method for DR-MSE in CVR prediction, and an efficient training algorithm correspondingly. We conduct extensive experiments on both real-world and semi-synthetic datasets, which validate the effectiveness of our proposed methods.|点击后转换率(CVR)预测是发现用户兴趣和增加平台收入的一个重要任务,在一系列的工业应用。这项任务最具挑战性的问题之一是由于用户固有的自我选择行为和系统的项目选择过程所引起的严重选择偏差的存在。目前,双鲁棒(DR)学习方法在降低 CVR 预测偏差方面取得了最好的效果。然而,通过对 DR 方法的偏差、方差和泛化界限的理论分析,我们发现现有的 DR 方法可能由于在实际应用中经常出现的倾向分数估计不准确和插补错误而导致泛化能力较差。基于这样的分析,我们提出了一个通用的学习框架,它不仅统一了现有的 DR 方法,而且为开发一系列新的去偏技术以适应不同的应用场景提供了宝贵的机会。在此基础上,提出了两种新的 DR 方法: DR-BIAS 和 DR-MSE。DR-BIAS 直接控制 DR 损失的偏差,而 DR-MSE 灵活地平衡偏差和方差,从而获得更好的泛化性能。此外,本文还提出了一种新的基于 DR-MSE 的 CVR 预测三层联合学习优化方法,并给出了相应的训练算法。我们在真实世界和半合成数据集上进行了广泛的实验,验证了我们提出的方法的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Generalized+Doubly+Robust+Learning+Framework+for+Debiasing+Post-Click+Conversion+Rate+Prediction)|0| -|[User-Event Graph Embedding Learning for Context-Aware Recommendation](https://doi.org/10.1145/3534678.3539458)|Dugang Liu, Mingkai He, Jinwei Luo, Jiangxu Lin, Meng Wang, Xiaolian Zhang, Weike Pan, Zhong Ming|Shenzhen University, Shenzhen, China; Southeast University, Nanjing, China; Huawei Technologies Co Ltd, Shenzhen, China; Shenzhen University & Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen, China|Most methods for context-aware recommendation focus on improving the feature interaction layer, but overlook the embedding layer. However, an embedding layer with random initialization often suffers in practice from the sparsity of the contextual features, as well as the interactions between the users (or items) and context. In this paper, we propose a novel user-event graph embedding learning (UEG-EL) framework to address these two sparsity challenges. Specifically, our UEG-EL contains three modules: 1) a graph construction module is used to obtain a user-event graph containing nodes for users, intents and items, where the intent nodes are generated by applying intent node attention (INA) on nodes of the contextual features; 2) a user-event collaborative graph convolution module is designed to obtain the refined embeddings of all features by executing a new convolution strategy on the user-event graph, where each intent node acts as a hub to efficiently propagate the information among different features; 3) a recommendation module is equipped to integrate some existing context-aware recommendation model, where the feature embeddings are directly initialized with the obtained refined embeddings. Moreover, we identify a unique challenge of the basic framework, that is, the contextual features associated with too many instances may suffer from noise when aggregating the information. 
We thus further propose a simple but effective variant, i.e., UEG-EL-V, in order to prune the information propagation of the contextual features. Finally, we conduct extensive experiments on three public datasets to verify the effectiveness and compatibility of our UEG-EL and its variant.|大多数上下文感知的推荐方法侧重于改进特征交互层,而忽略了嵌入层。然而,具有随机初始化的嵌入层在实践中经常受到上下文特征稀疏性以及用户(或项目)与上下文之间交互的影响。本文提出了一种新的用户事件图嵌入学习(UEG-EL)框架来解决这两个稀疏性问题。具体来说,我们的 UEG-EL 包含三个模块: 1)一个图形构造模块用于获得一个包含用户、意图和项目节点的用户事件图,其中意图节点是通过在上下文特征的节点上应用意图节点注意力(INA)来生成的; 2)一个用户事件协作图卷积模块用于通过在用户事件图上执行一个新的卷积策略来获得所有特征的精细嵌入,其中每个意图节点作为一个中心来有效地传播不同特征之间的信息; 3)一个推荐模块用于集成一些现有的上下文感知的推荐模型,其中特征嵌入是直接初始化。此外,我们发现了基本框架的一个独特的挑战,即与太多实例相关的上下文特征在聚合信息时可能会受到噪声的影响。因此,我们进一步提出了一个简单而有效的变体,即 UEG-EL-V,以修剪信息传播的上下文特征。最后,我们在三个公共数据集上进行了广泛的实验,以验证我们的 UEG-EL 及其变体的有效性和兼容性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=User-Event+Graph+Embedding+Learning+for+Context-Aware+Recommendation)|0| +|[A Generalized Doubly Robust Learning Framework for Debiasing Post-Click Conversion Rate Prediction](https://doi.org/10.1145/3534678.3539270)|Quanyu Dai, Haoxuan Li, Peng Wu, Zhenhua Dong, XiaoHua Zhou, Rui Zhang, Rui Zhang, Jie Sun|Huawei Hong Kong Theory Lab, Hong Kong, China; Beijing Technology and Business University, Beijing, China; Peking University, Beijing, China; Huawei Noah's Ark Lab, Shenzhen, China; ruizhang.info, Shenzhen, China|Post-click conversion rate (CVR) prediction is an essential task for discovering user interests and increasing platform revenues in a range of industrial applications. One of the most challenging problems of this task is the existence of severe selection bias caused by the inherent self-selection behavior of users and the item selection process of systems. Currently, doubly robust (DR) learning approaches achieve the state-of-the-art performance for debiasing CVR prediction. However, in this paper, by theoretically analyzing the bias, variance and generalization bounds of DR methods, we find that existing DR approaches may have poor generalization caused by inaccurate estimation of propensity scores and imputation errors, which often occur in practice. Motivated by such analysis, we propose a generalized learning framework that not only unifies existing DR methods, but also provides a valuable opportunity to develop a series of new debiasing techniques to accommodate different application scenarios. Based on the framework, we propose two new DR methods, namely DR-BIAS and DR-MSE. DR-BIAS directly controls the bias of DR loss, while DR-MSE balances the bias and variance flexibly, which achieves better generalization performance. In addition, we propose a novel tri-level joint learning optimization method for DR-MSE in CVR prediction, and an efficient training algorithm correspondingly. 
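For orientation on the doubly robust family discussed in this entry: the textbook DR estimator combines an imputed error with a propensity-weighted correction on observed entries, and remains unbiased if either component is accurate. A generic sketch of that construction, not of the paper's DR-BIAS or DR-MSE variants:

```python
import numpy as np

def dr_error(e_hat, e_obs, observed, p_hat, eps=1e-6):
    """Doubly robust estimate of the average prediction error: impute e_hat
    everywhere, then correct on observed entries with a propensity-weighted
    residual; unbiased if either the imputation model or the propensity
    model is accurate, hence 'doubly' robust. Entries of e_obs where
    observed == 0 are masked out and may hold zeros."""
    p = np.clip(p_hat, eps, 1.0)
    return np.mean(e_hat + observed * (e_obs - e_hat) / p)
```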
We conduct extensive experiments on both real-world and semi-synthetic datasets, which validate the effectiveness of our proposed methods.|点击后转换率(CVR)预测是发现用户兴趣和增加平台收入的一个重要任务,在一系列的工业应用。这项任务最具挑战性的问题之一是由于用户固有的自我选择行为和系统的项目选择过程所引起的严重选择偏差的存在。目前,双鲁棒(DR)学习方法在降低 CVR 预测偏差方面取得了最好的效果。然而,通过对 DR 方法的偏差、方差和泛化界限的理论分析,我们发现现有的 DR 方法可能由于在实际应用中经常出现的倾向分数估计不准确和插补错误而导致泛化能力较差。基于这样的分析,我们提出了一个通用的学习框架,它不仅统一了现有的 DR 方法,而且为开发一系列新的去偏技术以适应不同的应用场景提供了宝贵的机会。在此基础上,提出了两种新的 DR 方法: DR-BIAS 和 DR-MSE。DR-BIAS 直接控制 DR 损失的偏差,而 DR-MSE 灵活地平衡偏差和方差,从而获得更好的泛化性能。此外,本文还提出了一种新的基于 DR-MSE 的 CVR 预测三层联合学习优化方法,并给出了相应的训练算法。我们在真实世界和半合成数据集上进行了广泛的实验,验证了我们提出的方法的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Generalized+Doubly+Robust+Learning+Framework+for+Debiasing+Post-Click+Conversion+Rate+Prediction)|0| +|[User-Event Graph Embedding Learning for Context-Aware Recommendation](https://doi.org/10.1145/3534678.3539458)|Dugang Liu, Mingkai He, Jinwei Luo, Jiangxu Lin, Meng Wang, Xiaolian Zhang, Weike Pan, Zhong Ming|Huawei Technologies Co Ltd, Shenzhen, China; Shenzhen University & Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen, China; Shenzhen University, Shenzhen, China; Southeast University, Nanjing, China|Most methods for context-aware recommendation focus on improving the feature interaction layer, but overlook the embedding layer. However, an embedding layer with random initialization often suffers in practice from the sparsity of the contextual features, as well as the interactions between the users (or items) and context. In this paper, we propose a novel user-event graph embedding learning (UEG-EL) framework to address these two sparsity challenges. Specifically, our UEG-EL contains three modules: 1) a graph construction module is used to obtain a user-event graph containing nodes for users, intents and items, where the intent nodes are generated by applying intent node attention (INA) on nodes of the contextual features; 2) a user-event collaborative graph convolution module is designed to obtain the refined embeddings of all features by executing a new convolution strategy on the user-event graph, where each intent node acts as a hub to efficiently propagate the information among different features; 3) a recommendation module is equipped to integrate some existing context-aware recommendation model, where the feature embeddings are directly initialized with the obtained refined embeddings. Moreover, we identify a unique challenge of the basic framework, that is, the contextual features associated with too many instances may suffer from noise when aggregating the information. We thus further propose a simple but effective variant, i.e., UEG-EL-V, in order to prune the information propagation of the contextual features. 
Finally, we conduct extensive experiments on three public datasets to verify the effectiveness and compatibility of our UEG-EL and its variant.|大多数上下文感知的推荐方法侧重于改进特征交互层,而忽略了嵌入层。然而,具有随机初始化的嵌入层在实践中经常受到上下文特征稀疏性以及用户(或项目)与上下文之间交互的影响。本文提出了一种新的用户事件图嵌入学习(UEG-EL)框架来解决这两个稀疏性问题。具体来说,我们的 UEG-EL 包含三个模块: 1)一个图形构造模块用于获得一个包含用户、意图和项目节点的用户事件图,其中意图节点是通过在上下文特征的节点上应用意图节点注意力(INA)来生成的; 2)一个用户事件协作图卷积模块用于通过在用户事件图上执行一个新的卷积策略来获得所有特征的精细嵌入,其中每个意图节点作为一个中心来有效地传播不同特征之间的信息; 3)一个推荐模块用于集成一些现有的上下文感知的推荐模型,其中特征嵌入是直接初始化。此外,我们发现了基本框架的一个独特的挑战,即与太多实例相关的上下文特征在聚合信息时可能会受到噪声的影响。因此,我们进一步提出了一个简单而有效的变体,即 UEG-EL-V,以修剪信息传播的上下文特征。最后,我们在三个公共数据集上进行了广泛的实验,以验证我们的 UEG-EL 及其变体的有效性和兼容性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=User-Event+Graph+Embedding+Learning+for+Context-Aware+Recommendation)|0| |[Adversarial Gradient Driven Exploration for Deep Click-Through Rate Prediction](https://doi.org/10.1145/3534678.3539461)|Kailun Wu, Weijie Bian, Zhangming Chan, Lejian Ren, Shiming Xiang, Shuguang Han, Hongbo Deng, Bo Zheng|Alibaba Group, Beijing, China; Institute of Automation, Chinese Academy of Sciences, Beijing, China|Exploration-Exploitation (E& E) algorithms are commonly adopted to deal with the feedback-loop issue in large-scale online recommender systems. Most of existing studies believe that high uncertainty can be a good indicator of potential reward, and thus primarily focus on the estimation of model uncertainty. We argue that such an approach overlooks the subsequent effect of exploration on model training. From the perspective of online learning, the adoption of an exploration strategy would also affect the collecting of training data, which further influences model learning. To understand the interaction between exploration and training, we design a Pseudo-Exploration module that simulates the model updating process after a certain item is explored and the corresponding feedback is received. We further show that such a process is equivalent to adding an adversarial perturbation to the model input, and thereby name our proposed approach as an the Adversarial Gradient Driven Exploration (AGE). For production deployment, we propose a dynamic gating unit to pre-determine the utility of an exploration. This enables us to utilize the limited amount of resources for exploration, and avoid wasting pageview resources on ineffective exploration. The effectiveness of AGE was firstly examined through an extensive number of ablation studies on an academic dataset. Meanwhile, AGE has also been deployed to one of the world-leading display advertising platforms, and we observe significant improvements on various top-line evaluation metrics.|在大规模在线推荐系统中,探索-开发(E & E)算法是处理反馈回路问题的常用算法。大多数已有的研究认为高不确定性可以作为潜在报酬的一个很好的指标,因此主要集中在模型不确定性的估计上。我们认为这种方法忽视了探索对模型训练的后续影响。从在线学习的角度来看,探索策略的采用也会影响训练数据的收集,从而进一步影响模型学习。为了理解探索与训练的相互作用,我们设计了一个拟探索模块,模拟探索某一项目并收到相应反馈后的模型更新过程。我们进一步表明,这样一个过程是相当于添加一个对抗扰动的模型输入,从而命名我们提出的方法作为一个对抗梯度驱动探索(AGE)。对于生产部署,我们提出了一个动态门控单元来预先确定勘探的效用。这使我们能够利用有限的资源进行探索,避免在无效探索上浪费页面浏览资源。AGE 的有效性首先通过一个学术数据集上的大量消融研究进行了检验。与此同时,AGE 也被部署到世界领先的展示广告平台之一,我们观察到各种顶线评估指标的显著改进。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Adversarial+Gradient+Driven+Exploration+for+Deep+Click-Through+Rate+Prediction)|0| -|[Graph-based Multilingual Language Model: Leveraging Product Relations for Search Relevance](https://doi.org/10.1145/3534678.3539158)|Nurendra Choudhary, Nikhil Rao, Karthik Subbian, Chandan K. 
Reddy|Amazon, Palo Alto, CA, USA; Virginia Tech, Arlington, VA, USA|The large-scale nature of product catalog and the changing demands of customer queries makes product search a challenging problem. The customer queries are ambiguous and implicit. They may be looking for an exact match of their query, or a functional equivalent (i.e., substitute), or an accessory to go with it (i.e., complement). It is important to distinguish these three categories from merely classifying an item for a customer query as relevant or not. This information can help direct the customer and improve search applications to understand the customer mission. In this paper, we formulate search relevance as a multi-class classification problem and propose a graph-based solution to classify a given query-item pair as exact, substitute, complement, or irrelevant (ESCI). The customer engagement (clicks, add-to-cart, and purchases) between query and items serve as a crucial information for this problem. However, existing approaches rely purely on the textual information (such as BERT) and do not sufficiently focus on the structural relationships. Another challenge in including the structural information is the sparsity of such data in some regions. We propose Structure-Aware multilingual LAnguage Model (SALAM), that utilizes a language model along with a graph neural network, to extract region-specific semantics as well as relational information for the classification of query-product pairs. Our model is first pre-trained on a large region-agnostic dataset and behavioral graph data and then fine-tuned on region-specific versions to address the sparsity. We show in our experiments that SALAM significantly outperforms the current matching frameworks on the ESCI classification task in several regions. We also demonstrate the effectiveness of using a two-phased training setup (i.e., pre-training and fine-tuning) in capturing region-specific information. Also, we provide various challenges and solutions for using the model in an industrial setting and outline its contribution to the e-commerce engine.|产品目录的大规模性和客户查询需求的变化使得产品搜索成为一个具有挑战性的问题。客户查询是模糊和隐式的。他们可能在寻找与他们的查询完全匹配的查询,或者功能等价的查询(即替代查询) ,或者附属查询(即补充查询)。区分这三个类别与仅仅为客户查询分类一个项目是否相关是很重要的。这些信息可以帮助指导客户并改进搜索应用程序,以理解客户的使命。本文将搜索相关性表述为一个多类分类问题,并提出了一种基于图的解决方案,将给定的查询项对分类为精确、替代、补充或不相关(ESCI)。查询和项目之间的客户参与(单击、添加到购物车和购买)是解决此问题的关键信息。然而,现有的方法仅仅依赖于文本信息(比如 BERT) ,并没有充分关注结构关系。在纳入结构信息方面的另一个挑战是,一些区域的此类数据稀少。提出了一种基于结构感知的多语言语言模型(SALAM) ,该模型利用语言模型和图神经网络提取区域特定的语义和关系信息,用于查询产品对的分类。我们的模型首先在大型区域不可知数据集和行为图数据上进行预训练,然后在区域特定版本上进行微调,以解决稀疏性问题。我们的实验表明,在 ESCI 分类任务中,SALAM 在几个地区的性能明显优于目前的匹配框架。我们还演示了使用两阶段的训练设置(即预训练和微调)在捕获特定区域的信息方面的有效性。此外,我们提供了在工业环境中使用该模型的各种挑战和解决方案,并概述了其对电子商务引擎的贡献。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Graph-based+Multilingual+Language+Model:+Leveraging+Product+Relations+for+Search+Relevance)|0| +|[Graph-based Multilingual Language Model: Leveraging Product Relations for Search Relevance](https://doi.org/10.1145/3534678.3539158)|Nurendra Choudhary, Nikhil Rao, Karthik Subbian, Chandan K. Reddy|Virginia Tech, Arlington, VA, USA; Amazon, Palo Alto, CA, USA|The large-scale nature of product catalog and the changing demands of customer queries makes product search a challenging problem. The customer queries are ambiguous and implicit. They may be looking for an exact match of their query, or a functional equivalent (i.e., substitute), or an accessory to go with it (i.e., complement). 
It is important to distinguish these three categories from merely classifying an item for a customer query as relevant or not. This information can help direct the customer and improve search applications to understand the customer mission. In this paper, we formulate search relevance as a multi-class classification problem and propose a graph-based solution to classify a given query-item pair as exact, substitute, complement, or irrelevant (ESCI). The customer engagement (clicks, add-to-cart, and purchases) between query and items serve as a crucial information for this problem. However, existing approaches rely purely on the textual information (such as BERT) and do not sufficiently focus on the structural relationships. Another challenge in including the structural information is the sparsity of such data in some regions. We propose Structure-Aware multilingual LAnguage Model (SALAM), that utilizes a language model along with a graph neural network, to extract region-specific semantics as well as relational information for the classification of query-product pairs. Our model is first pre-trained on a large region-agnostic dataset and behavioral graph data and then fine-tuned on region-specific versions to address the sparsity. We show in our experiments that SALAM significantly outperforms the current matching frameworks on the ESCI classification task in several regions. We also demonstrate the effectiveness of using a two-phased training setup (i.e., pre-training and fine-tuning) in capturing region-specific information. Also, we provide various challenges and solutions for using the model in an industrial setting and outline its contribution to the e-commerce engine.|产品目录的大规模性和客户查询需求的变化使得产品搜索成为一个具有挑战性的问题。客户查询是模糊和隐式的。他们可能在寻找与他们的查询完全匹配的查询,或者功能等价的查询(即替代查询) ,或者附属查询(即补充查询)。区分这三个类别与仅仅为客户查询分类一个项目是否相关是很重要的。这些信息可以帮助指导客户并改进搜索应用程序,以理解客户的使命。本文将搜索相关性表述为一个多类分类问题,并提出了一种基于图的解决方案,将给定的查询项对分类为精确、替代、补充或不相关(ESCI)。查询和项目之间的客户参与(单击、添加到购物车和购买)是解决此问题的关键信息。然而,现有的方法仅仅依赖于文本信息(比如 BERT) ,并没有充分关注结构关系。在纳入结构信息方面的另一个挑战是,一些区域的此类数据稀少。提出了一种基于结构感知的多语言语言模型(SALAM) ,该模型利用语言模型和图神经网络提取区域特定的语义和关系信息,用于查询产品对的分类。我们的模型首先在大型区域不可知数据集和行为图数据上进行预训练,然后在区域特定版本上进行微调,以解决稀疏性问题。我们的实验表明,在 ESCI 分类任务中,SALAM 在几个地区的性能明显优于目前的匹配框架。我们还演示了使用两阶段的训练设置(即预训练和微调)在捕获特定区域的信息方面的有效性。此外,我们提供了在工业环境中使用该模型的各种挑战和解决方案,并概述了其对电子商务引擎的贡献。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Graph-based+Multilingual+Language+Model:+Leveraging+Product+Relations+for+Search+Relevance)|0| |[ASPIRE: Air Shipping Recommendation for E-commerce Products via Causal Inference Framework](https://doi.org/10.1145/3534678.3539197)|Abhirup Mondal, Anirban Majumder, Vineet Chaoji|Amazon, Bengaluru, India|Speed of delivery is critical for the success of e-commerce platforms. Faster delivery promise to the customer results in increased conversion and revenue. There are typically two mechanisms to control the delivery speed - a) replication of products across warehouses, and b) air-shipping the product. In this paper, we present a machine learning based framework to recommend air-shipping eligibility for products. Specifically, we develop a causal inference framework (referred to as Air Shipping Recommendation or ASPIRE) that balances the trade-off between revenue or conversion and delivery cost to decide whether a product should be shipped via air. We propose a doubly-robust estimation technique followed by an optimization algorithm to determine air eligibility of products and calculate the uplift in revenue and shipping cost. 
We ran extensive experiments (both offline and online) to demonstrate the superiority of our technique as compared to the incumbent policies and baseline approaches. ASPIRE resulted in a lift of +79 bps of revenue as measured through an A/B experiment in an emerging marketplace on Amazon.|交付速度对电子商务平台的成功至关重要。更快的交付承诺给客户的结果增加转换和收入。通常有两种机制来控制交付速度: a)在仓库之间复制产品,b)空运产品。在本文中,我们提出了一个基于机器学习的框架来推荐产品的空运资格。具体来说,我们开发了一个因果推理框架(称为航空运输建议书或 ASPIRE) ,平衡收入或转换和交付成本之间的权衡,以决定是否应该通过空运运输产品。我们提出了一个双稳健估计技术和一个优化算法来确定产品的空气合格性,并计算收入和运输成本的提高。我们进行了大量的实验(线下和线上) ,以证明我们的技术相对于现有的策略和基线方法的优越性。通过在亚马逊新兴市场的 A/B 实验,ASPIRE 的收入提高了79个基点。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ASPIRE:+Air+Shipping+Recommendation+for+E-commerce+Products+via+Causal+Inference+Framework)|0| -|[Improving Relevance Modeling via Heterogeneous Behavior Graph Learning in Bing Ads](https://doi.org/10.1145/3534678.3539128)|Bochen Pang, Chaozhuo Li, Yuming Liu, Jianxun Lian, Jianan Zhao, Hao Sun, Weiwei Deng, Xing Xie, Qi Zhang|Microsoft Research Asia, Beijing, China; Microsoft, Beijing, China; University of Notre Dame, Indiana, IN, USA|As the fundamental basis of sponsored search, relevance modeling measures the closeness between the input queries and the candidate ads. Conventional relevance models solely rely on the textual data, which suffer from the scarce semantic signals within the short queries. Recently, user historical click behaviors are incorporated in the format of click graphs to provide additional correlations beyond pure textual semantics, which contributes to advancing the relevance modeling performance. However, user behaviors are usually arbitrary and unpredictable, leading to the noisy and sparse graph topology. In addition, there exist other types of user behaviors besides clicks, which may also provide complementary information. In this paper, we study the novel problem of heterogeneous behavior graph learning to facilitate relevance modeling task. Our motivation lies in learning an optimal and task-relevant heterogeneous behavior graph consisting of multiple types of user behaviors. We further propose a novel HBGLR model to learn the behavior graph structure by mining the sophisticated correlations between node semantics and graph topology, and encode the textual semantics and structural heterogeneity into the learned representations. Our proposal is evaluated over real-world industry datasets, and has been mainstreamed in the Bing ads. Both offline and online experimental results demonstrate its superiority.|作为赞助商搜索的基础,相关性建模测量了输入查询和候选广告之间的密切程度。传统的关联模型仅仅依赖于文本数据,而文本数据受到短查询中语义信号稀缺的影响。近年来,用户的历史点击行为被整合到点击图的格式中,提供了超越纯文本语义的额外相关性,这有助于提高相关性建模的性能。然而,用户行为通常是任意和不可预测的,导致噪声和稀疏图拓扑。此外,除了点击之外,还存在其他类型的用户行为,这些行为也可能提供补充信息。本文研究了异构行为图学习的新问题,以促进相关建模任务的完成。我们的动机在于学习一个由多种类型的用户行为组成的最优的和与任务相关的异构行为图。我们进一步提出了一种新的 HBGLR 模型,通过挖掘节点语义和图拓扑之间复杂的相关性来学习行为图结构,并将文本语义和结构异质性编码到所学习的表示中。我们的建议是评估在现实世界的行业数据集,并已成为主流的必应广告。离线和在线实验结果都证明了该方法的优越性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Improving+Relevance+Modeling+via+Heterogeneous+Behavior+Graph+Learning+in+Bing+Ads)|0| +|[Improving Relevance Modeling via Heterogeneous Behavior Graph Learning in Bing Ads](https://doi.org/10.1145/3534678.3539128)|Bochen Pang, Chaozhuo Li, Yuming Liu, Jianxun Lian, Jianan Zhao, Hao Sun, Weiwei Deng, Xing Xie, Qi Zhang|Microsoft, Beijing, China; Microsoft Research Asia, Beijing, China; University of Notre Dame, Indiana, IN, USA|As the fundamental basis of sponsored search, relevance modeling measures the closeness between the input queries and the candidate ads. 
Conventional relevance models solely rely on the textual data, which suffer from the scarce semantic signals within the short queries. Recently, user historical click behaviors are incorporated in the format of click graphs to provide additional correlations beyond pure textual semantics, which contributes to advancing the relevance modeling performance. However, user behaviors are usually arbitrary and unpredictable, leading to the noisy and sparse graph topology. In addition, there exist other types of user behaviors besides clicks, which may also provide complementary information. In this paper, we study the novel problem of heterogeneous behavior graph learning to facilitate relevance modeling task. Our motivation lies in learning an optimal and task-relevant heterogeneous behavior graph consisting of multiple types of user behaviors. We further propose a novel HBGLR model to learn the behavior graph structure by mining the sophisticated correlations between node semantics and graph topology, and encode the textual semantics and structural heterogeneity into the learned representations. Our proposal is evaluated over real-world industry datasets, and has been mainstreamed in the Bing ads. Both offline and online experimental results demonstrate its superiority.|作为赞助商搜索的基础,相关性建模测量了输入查询和候选广告之间的密切程度。传统的关联模型仅仅依赖于文本数据,而文本数据受到短查询中语义信号稀缺的影响。近年来,用户的历史点击行为被整合到点击图的格式中,提供了超越纯文本语义的额外相关性,这有助于提高相关性建模的性能。然而,用户行为通常是任意和不可预测的,导致噪声和稀疏图拓扑。此外,除了点击之外,还存在其他类型的用户行为,这些行为也可能提供补充信息。本文研究了异构行为图学习的新问题,以促进相关建模任务的完成。我们的动机在于学习一个由多种类型的用户行为组成的最优的和与任务相关的异构行为图。我们进一步提出了一种新的 HBGLR 模型,通过挖掘节点语义和图拓扑之间复杂的相关性来学习行为图结构,并将文本语义和结构异质性编码到所学习的表示中。我们的建议是评估在现实世界的行业数据集,并已成为主流的必应广告。离线和在线实验结果都证明了该方法的优越性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Improving+Relevance+Modeling+via+Heterogeneous+Behavior+Graph+Learning+in+Bing+Ads)|0| |[Type Linking for Query Understanding and Semantic Search](https://doi.org/10.1145/3534678.3539067)|Giorgos Stoilos, Nikos Papasarantopoulos, Pavlos Vougiouklis, Patrik Bansky|Huawei Technologies, Edinburgh, United Kingdom|Huawei is currently undertaking an effort to build map and web search services using query understanding and semantic search techniques. We present our efforts to built a low-latency type mention detection and linking service for map search. In addition to latency challenges, we only had access to low quality and biased training data plus we had to support 13 languages. Consequently, our service is based mostly on unsupervised term- and vector-based methods. Nevertheless, we trained a Transformer-based query tagger which we integrated with the rest of the pipeline using a reward and penalisation approach. We present techniques that we designed in order to address challenges with the type dictionary, incompatibilities in scoring between the term-based and vector-based methods as well as over-segmentation issues in Thai, Chinese, and Japanese. 
We have evaluated our approach on the Huawei map search use case as well as on community Question Answering benchmarks.|华为目前正致力于利用查询理解和语义搜索技术建立地图和网络搜索服务。我们介绍了我们的努力,建立一个低延迟类型提及检测和地图搜索链接服务。除了延迟挑战,我们只能访问低质量和有偏见的培训数据,加上我们必须支持13种语言。因此,我们的服务主要是基于无监督的术语和向量方法。尽管如此,我们还是训练了一个基于 Transformer 的查询标记器,并使用奖励和惩罚方法将其与管道的其他部分集成在一起。我们提出的技术,我们设计的目的是为了解决类型字典的挑战,在评分之间的基于术语和基于向量的方法以及在泰国,中国和日本的过分割问题。我们评估了华为地图搜索用例和社区问答基准的方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Type+Linking+for+Query+Understanding+and+Semantic+Search)|0| |[Combo-Fashion: Fashion Clothes Matching CTR Prediction with Item History](https://doi.org/10.1145/3534678.3539101)|Chenxu Zhu, Peng Du, Weinan Zhang, Yong Yu, Yang Cao|Shanghai Jiao Tong University, Shanghai, China; Alibaba Group, Hangzhou, China|As one of the fundamental trends for future development of recommender systems, Fashion Clothes Matching Recommendation for click-through rate (CTR) prediction has become an increasingly essential task. Unlike traditional single-item recommendation, a combo item, composed of a top item (e.g. a shirt) and a bottom item (e.g. a skirt), is recommended. In such a task, the matching effect between these two single items plays a crucial role, and greatly influences the users' preferences; however, it is usually neglected by previous approaches in CTR prediction. In this work, we tackle this problem by designing a novel algorithm called Combo-Fashion, which extracts the matching effect by introducing the matching history of the combo item with two cascaded modules: (i) Matching Search Module (MSM) seeks the popular combo items and undesirable ones as a positive set and a negative set, respectively; (ii) Matching Prediction Module (MPM) models the precise relationship between the candidate combo item and the positive/negative set by an attention-based deep model. Besides, the CPM Fashion Attribute, considered from characteristic, pattern and material, is applied to capture the matching effect further. As part of this work, we release two large-scale datasets consisting of 3.56 million and 6.01 million user behaviors with rich context and fashion information in millions of combo items. The experimental results over these two real-world datasets have demonstrated the superiority of our proposed model with significant improvements. 
Furthermore, we have deployed Combo-Fashion onto the platform of Taobao to recommend the combo items to the users, where an 8-day online A/B test proved the effectiveness of Combo-Fashion with an improvement of pCTR by 1.02% and uCTR by 0.70%.|作为推荐系统未来发展的基本趋势之一,服装搭配推荐系统的点进率预测已经成为一项日益重要的任务。不同于传统的单一项目推荐,一个组合项目,组成的顶部项目(如衬衫)和底部项目(如裙子) ,是推荐的。在这样一个任务中,这两个项目之间的匹配效果起着至关重要的作用,并且对用户的偏好有很大的影响,但是在以往的 CTR 预测方法中往往忽略了这一点。针对这一问题,本文设计了一种新的组合时尚算法,该算法通过引入组合项目的匹配历史来提取匹配效果,该算法由两个级联模块组成: (1)匹配搜索模块(MSM)分别将流行的组合项目和不受欢迎的组合项目作为一个正集和一个负集来搜索; (2)匹配预测模块(MPM)通过基于注意的深度模型来建立候选组合项目与正/负集之间的精确关系。此外,从特征、图案和材质三个方面考虑,运用 CPM 时尚属性进一步捕捉匹配效果。作为这项工作的一部分,我们发布了两个大型数据集,包括356万和601万用户行为,其中包含数百万个组合项目的丰富上下文和时尚信息。在这两个实际数据集上的实验结果显示了我们提出的模型的优越性,并有显著的改进。此外,我们还在淘宝平台上部署了 Combo-Fashion,向用户推荐组合项目,通过8天的在线 A/B 测试证明了 Combo-Fashion 的有效性,pCTR 提高了1.02% ,uCTR 提高了0.70% 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Combo-Fashion:+Fashion+Clothes+Matching+CTR+Prediction+with+Item+History)|0| |[Reward Optimizing Recommendation using Deep Learning and Fast Maximum Inner Product Search](https://doi.org/10.1145/3534678.3542622)|Imad Aouali, Amine Benhalloum, Martin Bompaire, Achraf Ait Sidi Hammou, Sergey Ivanov, Benjamin Heymann, David Rohde, Otmane Sakhi, Flavian Vasile, Maxime Vono|Criteo, Paris, France|How can we build and optimize a recommender system that must rapidly fill slates (i.e. banners) of personalized recommendations? The combination of deep learning stacks with fast maximum inner product search (MIPS) algorithms have shown it is possible to deploy flexible models in production that can rapidly deliver personalized recommendations to users. Albeit promising, this methodology is unfortunately not sufficient to build a recommender system which maximizes the reward, e.g. the probability of click. Usually instead a proxy loss is optimized and A/B testing is used to test if the system actually improved performance. This tutorial takes participants through the necessary steps to model the reward and directly optimize the reward of recommendation engines built upon fast search algorithms to produce high-performance reward-optimizing recommender systems.|我们如何构建和优化一个必须快速填充个性化推荐板块(即横幅)的推荐系统?深度学习栈与快速最大内部产品搜索(MIPS)算法的结合表明,在生产中部署灵活的模型可以迅速向用户提供个性化的建议。尽管这种方法很有前途,但不幸的是,它不足以建立一个最大化回报的推荐系统,例如点击的概率。通常代理丢失是优化和 A/B 测试用于测试系统是否实际上提高了性能。本教程将带领参与者通过必要的步骤来建立奖励模型,并直接优化建立在快速搜索算法基础上的推荐引擎的奖励,从而产生高性能的奖励优化推荐系统。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Reward+Optimizing+Recommendation+using+Deep+Learning+and+Fast+Maximum+Inner+Product+Search)|0| |[Low-rank Nonnegative Tensor Decomposition in Hyperbolic Space](https://doi.org/10.1145/3534678.3539317)|Bo Hui, WeiShinn Ku|Auburn University, Auburn, AL, USA|Tensor decomposition aims to factorize an input tensor into a number of latent factors. Due to the low-rank nature of tensor in real applications, the latent factors can be used to perform tensor completion in numerous tasks, such as knowledge graph completion and timely recommendation. However, existing works solve the problem in Euclidean space, where the tensor is decomposed into Euclidean vectors. Recent studies show that hyperbolic space is roomier than Euclidean space. With the same dimension, a hyperbolic vector can represent richer information (e.g., hierarchical structure) than a Euclidean vector. In this paper, we propose to decompose tensor in hyperbolic space. 
Considering that the most popular optimization tools (e.g., SGD, Adam) have not been generalized in hyperbolic space, we design an adaptive optimization algorithm according to the distinctive property of hyperbolic manifold. To address the non-convex property of the problem, we adopt gradient ascent in our optimization algorithm to avoid getting trapped in local optimal landscapes. We conduct experiments on various tensor completion tasks and the result validates the superiority of our method over these baselines that solve the problem in Euclidean space.|张量分解旨在将输入张量分解为若干潜在因子。由于张量在实际应用中的低秩特性,潜在因子可以用来完成许多任务,如知识图的完成和及时推荐。然而,现有的工作解决了欧氏空间中的问题,其中张量分解成欧氏向量。最近的研究表明双曲空间比欧几里得空间更宽敞。在相同的维度下,双曲向量可以比矢量表示更丰富的信息(例如,层次结构)。在这篇文章中,我们提出在双曲空间中分解张量。考虑到最流行的优化工具(例如,SGD,Adam)还没有在双曲空间中推广,我们根据双曲流形的独特性质设计了一个自适应优化算法。为了解决该问题的非凸性,在优化算法中采用了梯度上升的方法,以避免陷入局部最优景观中。我们对各种张量完成任务进行了实验,实验结果验证了该方法相对于这些基线解决欧氏空间问题的优越性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Low-rank+Nonnegative+Tensor+Decomposition+in+Hyperbolic+Space)|0| -|[Personalized Chit-Chat Generation for Recommendation Using External Chat Corpora](https://doi.org/10.1145/3534678.3539215)|Changyu Chen, Xiting Wang, Xiaoyuan Yi, Fangzhao Wu, Xing Xie, Rui Yan|Microsoft Research Asia, Beijing, China; Renmin University of China, Beijing, China|Chit-chat has been shown effective in engaging users in human-computer interaction. We find with a user study that generating appropriate chit-chat for news articles can help expand user interest and increase the probability that a user reads a recommended news article. Based on this observation, we propose a method to generate personalized chit-chat for news recommendation.
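The adaptive hyperbolic optimizer in the tensor-decomposition entry above is specific to that paper, but its basic ingredient is standard Riemannian SGD on the Poincaré ball: rescale the Euclidean gradient by the inverse metric, step, then retract into the open ball. A sketch under our own naming:

```python
import numpy as np

def poincare_sgd_step(x, egrad, lr=0.01, eps=1e-5):
    """One Riemannian SGD step on the Poincare ball: rescale the Euclidean
    gradient by the inverse metric ((1 - |x|^2)^2 / 4), take the step, then
    retract back inside the open unit ball."""
    sq = np.sum(x * x)
    rgrad = egrad * (1.0 - sq) ** 2 / 4.0
    x_new = x - lr * rgrad
    norm = np.linalg.norm(x_new)
    if norm >= 1.0:                      # retraction keeps the point valid
        x_new = x_new / norm * (1.0 - eps)
    return x_new
```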
Different from existing methods for personalized text generation, our method only requires an external chat corpus obtained from an online forum, which can be disconnected from the recommendation dataset from both the user and item (news) perspectives. This is achieved by designing a weak supervision method for estimating users' personalized interest in a chit-chat post by transferring knowledge learned by a news recommendation model. Based on the method for estimating user interest, a reinforcement learning framework is proposed to generate personalized chit-chat. Extensive experiments, including the automatic offline evaluation and user studies, demonstrate the effectiveness of our method.|聊天已被证明能有效地吸引用户参与人机交互。我们通过用户研究发现,为新闻文章产生适当的闲聊可以帮助扩大用户的兴趣,并增加用户阅读推荐新闻文章的可能性。在此基础上,本文提出了一种新闻推荐个性化聊天的生成方法。与现有的个性化文本生成方法不同,该方法只需要一个从在线论坛获得的外部聊天语料库,该语料库可以从用户和项目(新闻)的角度与推荐数据集分离。这是通过设计一种弱监督方法,通过传递新闻推荐模型中学到的知识来估计用户在闲聊帖子中的个性化兴趣来实现的。基于评估用户兴趣的方法,提出了一个强化学习框架来生成个性化的聊天。广泛的实验,包括自动离线评估和用户研究,证明了我们的方法的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Personalized+Chit-Chat+Generation+for+Recommendation+Using+External+Chat+Corpora)|0| |[G2NET: A General Geography-Aware Representation Network for Hotel Search Ranking](https://doi.org/10.1145/3534678.3539025)|Jia Xu, Fei Xiong, Zulong Chen, Mingyuan Tao, Liangyue Li, Quan Lu|Guangxi University, Nanning, China; Alibaba Group, Hangzhou, China|Hotel search ranking is the core function of Online Travel Platforms (OTPs), while geography information of location entities involved in it plays a critically important role in guaranteeing its ranking quality. The closest line of works to the hotel search ranking problem is thus the next POI (or location) recommendation problem, which has extensive works but fails to cope with two new challenges, i.e., consideration of two more location entities and effective utilization of geographical information, in a hotel search ranking scenario. To this end, we propose a General Geography-aware representation NETwork (G2NET for short) to better represent geography information of location entities so as to optimize the hotel search ranking. In G2NET, to address the first challenge, we first propose the concept of Geography Interaction Schema (GIS) which is a meta template for representing the arbitrary number of location entity types and their interactions. Then, a novel geography interaction encoder is devised providing general representation ability for an instance of GIS, followed by an attentive operation that aggregates representations of instances corresponding to all historically interacted hotels of a user in a weighted manner. The second challenge is handled by the combined application of three proposed geography embedding modules in G2NET, each of which focuses on computing embeddings of location entities based on a certain aspect of geographical information of location entities. Moreover, a self-attention layer is deployed in G2NET, to capture correlations among historically interacted hotels of a user which provides non-trivial functionality of understanding the user's behaviors. Both offline and online experiments show that G2NET outperforms the state-of-the-art methods. 
G2NET has now been successfully deployed to provide the high-quality hotel search ranking service at Fliggy, one of the most popular OTPs in China, serving tens of millions of users.|酒店搜索排名是在线旅游平台(OTP)的核心功能,而位置实体的地理信息对于保证其排名质量起着至关重要的作用。因此,与酒店搜索排名问题最接近的工作是下一个 POI (或位置)推荐问题,这个问题有大量的工作,但未能应对两个新的挑战,即在一个酒店搜索排名场景中考虑另外两个位置实体和有效利用地理信息。为此,我们提出了一个通用地理感知表示网络(G2NET) ,以更好地表示位置实体的地理信息,从而优化酒店搜索排名。在 G2NET 中,为了应对第一个挑战,我们首先提出了地理交互模式(GIS)的概念,它是一个元模板,用于表示任意数量的位置实体类型及其交互。然后,设计了一种新颖的地理交互编码器,提供了 GIS 实例的一般表示能力,然后进行了注意操作,以加权的方式聚合了对应于用户的所有历史交互酒店的实例表示。第二个挑战是通过在 G2NET 中联合应用三个地理嵌入模块来解决,每个模块的重点都是基于位置实体的地理信息的某一方面来计算位置实体的嵌入。此外,在 G2NET 中部署了一个自我关注层,以捕获用户在历史上交互的酒店之间的相关性,从而提供了理解用户行为的重要功能。离线和在线实验都表明,G2NET 的性能优于最先进的方法。目前,g2NET 已成功部署到 Fliggy,为数千万用户提供高质量的酒店搜索排名服务。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=G2NET:+A+General+Geography-Aware+Representation+Network+for+Hotel+Search+Ranking)|0| |[Avoiding Biases due to Similarity Assumptions in Node Embeddings](https://doi.org/10.1145/3534678.3539287)|Deepayan Chakrabarti|University of Texas at Austin, Austin, TX, USA|Node embeddings are vectors, one per node, that capture a graph's structure. The basic structure is the adjacency matrix of the graph. Recent methods also make assumptions about the similarity of unlinked nodes. However, such assumptions can lead to unintentional but systematic biases against groups of nodes. Calculating similarities between far-off nodes is also difficult under privacy constraints and in dynamic graphs. Our proposed embedding, called NEWS, makes no similarity assumptions, avoiding potential risks to privacy and fairness. NEWS is parameter-free, enables fast link prediction, and has linear complexity. These gains from avoiding assumptions do not significantly affect accuracy, as we show via comparisons against several existing methods on $21$ real-world networks. Code is available at https://github.com/deepayan12/news.|节点嵌入是向量,每个节点一个,它捕获图的结构。基本结构是图形的邻接矩阵。最近的方法也对未链接节点的相似性做了假设。然而,这样的假设可能会导致对节点群的无意的但是系统性的偏见。在隐私约束和动态图中,计算远程节点之间的相似度也很困难。我们提出的嵌入,称为新闻,没有相似的假设,避免了隐私和公平的潜在风险。NEWS 是无参数的,支持快速链路预测,具有线性复杂度。这些从避免假设中获得的收益不会显著影响准确性,正如我们通过比较现有的几种方法在 $21 $真实世界的网络上所显示的。密码可于 https://github.com/deepayan12/news 索取。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Avoiding+Biases+due+to+Similarity+Assumptions+in+Node+Embeddings)|0| -|[Task-optimized User Clustering based on Mobile App Usage for Cold-start Recommendations](https://doi.org/10.1145/3534678.3539105)|Bulou Liu, Bing Bai, Weibang Xie, Yiwen Guo, Hao Chen|Tencent Security Big Data Lab, Beijing, China; University of California, Davis, Davis, CA, USA; Independent Researcher, Beijing, China; Tencent Inc., Guangzhou, China|This paper reports our recent practice of recommending articles to cold-start users at Tencent. Transferring knowledge from information-rich domains to help user modeling is an effective way to address the user-side cold-start problem. Our previous work demonstrated that general-purpose user embeddings based on mobile app usage helped article recommendations. However, high-dimensional embeddings are cumbersome for online usage, thus limiting the adoption. On the other hand, user clustering, which partitions users into several groups, can provide a lightweight, online-friendly, and explainable way to help recommendations. 
Effective user clustering for article recommendations based on mobile app usage faces unique challenges, including (1) the gap between an active user's behavior of mobile app usage and article reading, and (2) the gap between mobile app usage patterns of active and cold-start users. To address the challenges, we propose a tailored Dual Alignment User Clustering (DAUC) model, which applies a sample-wise contrastive alignment to eliminate the gap between active users' mobile app usage and article reading behavior, and a distribution-wise adversarial alignment to eliminate the gap between active users' and cold-start users' app usage behavior. With DAUC, cold-start recommendation-optimized user clustering based on mobile app usage can be achieved. On top of the user clusters, we further build candidate generation strategies, real-time features, and corresponding ranking models without much engineering difficulty. Both online and offline experiments demonstrate the effectiveness of our work.|本文报道了我们最近向腾讯的冷启动用户推荐文章的做法。从信息丰富的领域转移知识以帮助用户建模是解决用户端冷启动问题的有效途径。我们以前的工作表明,基于移动应用程序使用的通用用户嵌入有助于文章推荐。但是,高维嵌入对于在线使用来说很麻烦,因此限制了采用。另一方面,用户集群(将用户划分为几个组)可以提供一种轻量级的、在线友好的、可解释的方式来帮助推荐。基于移动应用使用的文章推荐的有效用户聚类面临独特的挑战,包括(1)活跃用户的移动应用使用行为和文章阅读之间的差距,以及(2)活跃用户和冷启动用户的移动应用使用模式之间的差距。为了应对这些挑战,我们提出了一个定制的双对齐用户聚类(DAUC)模型,该模型应用样本对比对齐来消除活跃用户的移动应用程序使用和文章阅读行为之间的差距,以及分布式对抗对齐来消除活跃用户和冷启动用户的应用程序使用行为之间的差距。利用 DAUC,可以实现基于移动应用使用情况的冷启动推荐优化用户聚类。在用户集群的基础上,我们进一步构建了候选生成策略、实时特征以及相应的排序模型,这些都不需要很大的工程难度。这两个在线和离线实验都证明了我们工作的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Task-optimized+User+Clustering+based+on+Mobile+App+Usage+for+Cold-start+Recommendations)|0| +|[Task-optimized User Clustering based on Mobile App Usage for Cold-start Recommendations](https://doi.org/10.1145/3534678.3539105)|Bulou Liu, Bing Bai, Weibang Xie, Yiwen Guo, Hao Chen|Independent Researcher, Beijing, China; Tencent Inc., Guangzhou, China; University of California, Davis, Davis, CA, USA; Tencent Security Big Data Lab, Beijing, China|This paper reports our recent practice of recommending articles to cold-start users at Tencent. Transferring knowledge from information-rich domains to help user modeling is an effective way to address the user-side cold-start problem. Our previous work demonstrated that general-purpose user embeddings based on mobile app usage helped article recommendations. However, high-dimensional embeddings are cumbersome for online usage, thus limiting the adoption. On the other hand, user clustering, which partitions users into several groups, can provide a lightweight, online-friendly, and explainable way to help recommendations. Effective user clustering for article recommendations based on mobile app usage faces unique challenges, including (1) the gap between an active user's behavior of mobile app usage and article reading, and (2) the gap between mobile app usage patterns of active and cold-start users. To address the challenges, we propose a tailored Dual Alignment User Clustering (DAUC) model, which applies a sample-wise contrastive alignment to eliminate the gap between active users' mobile app usage and article reading behavior, and a distribution-wise adversarial alignment to eliminate the gap between active users' and cold-start users' app usage behavior. With DAUC, cold-start recommendation-optimized user clustering based on mobile app usage can be achieved. On top of the user clusters, we further build candidate generation strategies, real-time features, and corresponding ranking models without much engineering difficulty. 
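The sample-wise contrastive alignment in DAUC above pairs each user's app-usage view with the same user's reading view against in-batch negatives, which is the usual InfoNCE pattern. A sketch with assumed batch-of-embeddings inputs, not the authors' code:

```python
import torch
import torch.nn.functional as F

def alignment_loss(z_app, z_read, tau=0.1):
    """Sample-wise contrastive alignment: user i's app-usage embedding should
    match user i's article-reading embedding against in-batch negatives."""
    z1 = F.normalize(z_app, dim=-1)
    z2 = F.normalize(z_read, dim=-1)
    logits = z1 @ z2.T / tau                 # (B, B) cosine similarities
    labels = torch.arange(z1.size(0))        # positives sit on the diagonal
    return F.cross_entropy(logits, labels)
```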
Both online and offline experiments demonstrate the effectiveness of our work.|本文报道了我们最近向腾讯的冷启动用户推荐文章的做法。从信息丰富的领域转移知识以帮助用户建模是解决用户端冷启动问题的有效途径。我们以前的工作表明,基于移动应用程序使用的通用用户嵌入有助于文章推荐。但是,高维嵌入对于在线使用来说很麻烦,因此限制了采用。另一方面,用户聚类(将用户划分为若干组)可以提供一种轻量级的、在线友好的、可解释的方式来帮助推荐。基于移动应用使用的文章推荐的有效用户聚类面临独特的挑战,包括(1)活跃用户的移动应用使用行为和文章阅读之间的差距,以及(2)活跃用户和冷启动用户的移动应用使用模式之间的差距。为了应对这些挑战,我们提出了一个定制的双对齐用户聚类(DAUC)模型,该模型应用样本对比对齐来消除活跃用户的移动应用程序使用和文章阅读行为之间的差距,以及分布式对抗对齐来消除活跃用户和冷启动用户的应用程序使用行为之间的差距。利用 DAUC,可以实现基于移动应用使用情况的冷启动推荐优化用户聚类。在用户聚类的基础上,我们进一步构建了候选生成策略、实时特征以及相应的排序模型,且没有太大的工程难度。线上和线下实验都证明了我们工作的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Task-optimized+User+Clustering+based+on+Mobile+App+Usage+for+Cold-start+Recommendations)|0| |[Promotheus: An End-to-End Machine Learning Framework for Optimizing Markdown in Online Fashion E-commerce](https://doi.org/10.1145/3534678.3539148)|Eleanor Loh, Jalaj Khandelwal, Brian Regan, Duncan A. Little|ASOS.com, London, United Kingdom|Managing discount promotional events ("markdown") is a significant part of running an e-commerce business, and inefficiencies here can significantly hamper a retailer's profitability. Traditional approaches for tackling this problem rely heavily on price elasticity modelling. However, the partial information nature of price elasticity modelling, together with the non-negotiable responsibility for protecting profitability, means that machine learning practitioners must often go to great lengths to define strategies for measuring offline model quality. In the face of this, many retailers fall back on rule-based methods, thus forgoing significant gains in profitability that can be captured by machine learning. In this paper, we introduce two novel end-to-end markdown management systems for optimising markdown at different stages of a retailer's journey. The first system, "Ithax," enacts a rational supply-side pricing strategy without demand estimation, and can be usefully deployed as a "cold start" solution to collect markdown data while maintaining revenue control. The second system, "Promotheus," presents a full framework for markdown optimization with price elasticity. We describe in detail the specific modelling and validation procedures that, within our experience, have been crucial to building a system that performs robustly in the real world. Both markdown systems achieve superior profitability compared to decisions made by our experienced operations teams in a controlled online test, with improvements of 86% (Promotheus) and 79% (Ithax) relative to manual strategies.
These systems have been deployed to manage markdown at ASOS.com, and both systems can be fruitfully deployed for price optimization across a wide variety of retail e-commerce settings.|管理折扣促销活动(“降价”)是经营电子商务业务的一个重要组成部分,这里的低效率会严重阻碍零售商的盈利能力。解决这一问题的传统方法在很大程度上依赖于价格弹性模型。然而,价格弹性建模的部分信息性质,加上保护盈利能力的不可协商的责任,意味着机器学习从业人员必须经常花费大量的时间来确定衡量离线模型质量的策略。面对这种情况,许多零售商退回到基于规则的方法,因此放弃了可以通过机器学习获得的利润率的显著增长。在本文中,我们介绍了两个新颖的端到端降价管理系统,用于在零售商发展历程的不同阶段优化降价。第一个系统“Ithax”在不进行需求估计的情况下实施合理的供应侧定价策略,可以作为“冷启动”解决方案部署,在保持收入控制的同时收集降价数据。第二个系统“Promotheus”提出了一个基于价格弹性的完整降价优化框架。我们详细描述了具体的建模和验证程序,根据我们的经验,这些程序对于建立一个在现实世界中运行良好的系统至关重要。与我们经验丰富的运营团队在受控的在线测试中做出的决策相比,这两种降价系统都实现了更高的盈利能力,相对于人工策略分别提升了86%(Promotheus)和79%(Ithax)。这些系统已经部署到 ASOS.com 管理降价,这两个系统都可以在各种零售电子商务环境中进行价格优化,从而取得丰硕成果。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Promotheus:+An+End-to-End+Machine+Learning+Framework+for+Optimizing+Markdown+in+Online+Fashion+E-commerce)|0| |[Scalar is Not Enough: Vectorization-based Unbiased Learning to Rank](https://doi.org/10.1145/3534678.3539468)|Mouxiang Chen, Chenghao Liu, Zemin Liu, Jianling Sun|Salesforce Research Area, Singapore, Singapore; Singapore Management University, Singapore, Singapore; Zhejiang University & Alibaba-Zhejiang University Joint Institute of Frontier Technologies, Hangzhou, China|Unbiased learning to rank (ULTR) aims to train an unbiased ranking model from biased user click logs. Most of the current ULTR methods are based on the examination hypothesis (EH), which assumes that the click probability can be factorized into two scalar functions, one related to ranking features and the other related to bias factors. Unfortunately, the interactions among features, bias factors and clicks are complicated in practice, and usually cannot be factorized in this independent way. Fitting click data with EH could lead to model misspecification and bring the approximation error. In this paper, we propose a vector-based EH and formulate the click probability as a dot product of two vector functions. This solution is complete due to its universality in fitting arbitrary click functions. Based on it, we propose a novel model named Vectorization to adaptively learn the relevance embeddings and sort documents by projecting embeddings onto a base vector. Extensive experiments show that our method significantly outperforms the state-of-the-art ULTR methods on complex real clicks as well as simple simulated clicks.|无偏学习排名(ULTR)的目的是从有偏见的用户点击日志中训练一个无偏见的排名模型。目前的 ULTR 方法大多基于检验假设(EH) ,假设点击概率可以分解为两个标量函数,一个与排序特征有关,另一个与偏差因子有关。遗憾的是,特征、偏差因素和点击之间的相互作用在实践中是复杂的,通常不能以这种独立的方式进行因子分解。用 EH 拟合点击数据可能导致模型设定错误,并带来逼近误差。本文提出了一种基于向量的 EH,并将点击概率表示为两个向量函数的点乘。该解决方案是完备的,因为它具有拟合任意点击函数的通用性。在此基础上,我们提出了一种名为 Vectorization 的新模型,自适应地学习相关性嵌入,并通过将嵌入投影到基向量上来对文档排序。大量的实验表明,我们的方法在复杂的真实点击和简单的模拟点击方面明显优于最先进的 ULTR 方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Scalar+is+Not+Enough:+Vectorization-based+Unbiased+Learning+to+Rank)|0| |[Efficient Approximate Algorithms for Empirical Variance with Hashed Block Sampling](https://doi.org/10.1145/3534678.3539377)|Xingguang Chen, Fangyuan Zhang, Sibo Wang|The Chinese University of Hong Kong, Hong Kong, China|Empirical variance is a fundamental concept widely used in data management and data analytics, e.g., query optimization, approximate query processing, and feature selection. A direct solution to derive the empirical variance is scanning the whole data table, which is expensive when the data size is huge. Hence, most current works focus on approximate answers by sampling.
For results with approximation guarantees, the samples usually need to be uniformly and independently random, incurring high cache miss rates especially in compact columnar style layouts. An alternative uses block sampling to avoid this issue, which directly samples a block of consecutive records fitting page sizes instead of sampling one record each time. However, this provides no theoretical guarantee. Existing studies show that the practical estimations can be inaccurate as the records within a block can be correlated. Motivated by this, we investigate how to provide approximation guarantees for empirical variances with block sampling from a theoretical perspective. Our results show that if the records stored in a table are 4-wise independent of each other according to keys, a slightly modified block sampling can provide the same approximation guarantee with the same asymptotic sampling cost as that of independent random sampling. In practice, storing records via hash clusters or hash-organized tables is a typical scenario in modern commercial database systems. Thus, for data analysis on tables in the data lake or OLAP stores that are exported from such hash-based storage, our strategy can be easily integrated to improve the sampling efficiency. Based on our sampling strategy, we present an approximate algorithm for empirical variance and an approximate top-k algorithm to return the k columns with the highest empirical variance scores. Extensive experiments show that our solutions outperform existing solutions by up to an order of magnitude.|经验方差是数据管理和数据分析中广泛使用的一个基本概念,如查询优化、近似查询处理和特征选择。推导经验方差的直接方法是对整个数据表进行扫描,当数据量很大时,扫描成本很高。因此,目前大多数的工作集中在抽样近似答案。对于具有近似保证的结果,样本通常需要是均匀且独立随机的,这会导致较高的缓存未命中率,特别是在紧凑的列式布局中。另一种方法是使用块抽样来避免这个问题,即直接抽样一个连续的记录块来适应页面大小,而不是每次抽样一个记录。然而,这并不能提供理论上的保证。现有的研究表明,实际的估计可能是不准确的,因为一个区块内的记录可能相关。在此基础上,我们从理论的角度研究了如何为区组抽样的经验方差提供近似保证。结果表明,如果存储在表中的记录按键满足4维独立(4-wise independent),稍加修改的块抽样可以提供与独立随机抽样相同的渐近抽样代价的近似保证。在实践中,通过散列集群或散列组织表存储记录是现代商业数据库系统中的典型场景。因此,对于从这种基于散列的存储器导出的数据湖或 OLAP 存储器中的表的数据分析,可以很容易地将我们的策略集成起来以提高采样效率。基于我们的抽样策略,我们提出了一个经验方差的近似算法和一个近似 top-k 算法来返回经验方差得分最高的 k 列。大量的实验表明,我们的解决方案比现有解决方案的性能高出一个数量级。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Efficient+Approximate+Algorithms+for+Empirical+Variance+with+Hashed+Block+Sampling)|0| |[Towards a Native Quantum Paradigm for Graph Representation Learning: A Sampling-based Recurrent Embedding Approach](https://doi.org/10.1145/3534678.3539327)|Ge Yan, Yehui Tang, Junchi Yan|Shanghai Jiao Tong University, Shanghai, China|Graph representation learning has been extensively studied, and recent models can well incorporate both node features and graph structures. Despite this progress, the inherent scalability challenge for classical computers of processing graph data and solving the downstream tasks (many are NP-hard) is still a bottleneck for existing classical graph learning models. On the other hand, quantum computing is known as a promising direction for its theoretically verified scalability as well as the increasing evidence for access to physical quantum machines in the near term. Different from many existing classical-quantum hybrid machine learning models on graphs, in this paper we take a more aggressive initiative for developing a native quantum paradigm for (attributed) graph representation learning, which, to the best of our knowledge, has not been fulfilled in the literature yet. Specifically, our model adopts well-established theory and techniques in quantum computing, e.g., quantum random walk, and adapts them to the attributed graph.
Then the node attribute quantum state sequence is fed into a quantum recurrent network to obtain the final node embedding. Experimental results on three public datasets show the effectiveness of our quantum model, which also notably outperforms a classical learning approach, GraphRNA, in terms of efficiency even on a classical computer. Though model parameter training is still restricted to the classical loss-based learning paradigm with gradient descent, our computing scheme is compatible with quantum computing without involving classical computers. This is in fact largely in contrast to many hybrid quantum graph learning models which often involve many steps and modules having to be performed on classical computers.|图表示学习已经得到了广泛的研究,现有的模型能够很好地结合节点特征和图结构。尽管取得了这些进展,经典计算机在处理图数据和解决下游任务(许多是 NP 难的)方面固有的可伸缩性挑战仍然是现有经典图学习模型的瓶颈。另一方面,量子计算因其在理论上被证实的可扩展性,以及近期有望使用物理量子计算机的越来越多的证据,而被认为是一个有前途的方向。与许多现有的经典-量子混合机器学习模型不同,本文更进一步,为(属性)图表示学习开发了一个原生量子范式,据我们所知,这在文献中尚未实现。具体地说,我们的模型采用了量子计算中已经成熟的理论和技术,例如量子随机游走,并将其适配到属性图。然后将节点属性量子状态序列输入到量子递归网络中,得到最终的节点嵌入。在三个公共数据集上的实验结果表明了量子模型的有效性,即使在经典计算机上,量子模型的效率也明显优于经典的学习方法 GraphRNA。虽然模型参数训练仍然局限于基于损失、采用梯度下降的经典学习范式,但我们的计算方案与量子计算兼容,无需使用经典计算机。这实际上在很大程度上与许多混合量子图学习模型形成对比,这些模型通常涉及许多步骤和模块,必须在经典计算机上执行。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Towards+a+Native+Quantum+Paradigm+for+Graph+Representation+Learning:+A+Sampling-based+Recurrent+Embedding+Approach)|0| -|[Toward Real-life Dialogue State Tracking Involving Negative Feedback Utterances](https://doi.org/10.1145/3534678.3539385)|Puhai Yang, Heyan Huang, Wei Wei, XianLing Mao|Beijing Institute of Technology, Beijing, China; Beijing Institute of Technology & Beijing Engineering Research Center of High Volume Language Information Processing and Cloud Computing Applications, Beijing, China; Huazhong University of Science and Technology, Wuhan, China|Recently, the research of dialogue systems has been widely concerned, especially task-oriented dialogue systems, which have received increased attention due to their wide application prospect. As a core component, dialogue state tracking (DST) plays a key role in task-oriented dialogue systems, and its function is to parse natural language dialogues into dialogue state formed by slot-value pairs. It is well known that dialogue state tracking has been well studied and explored on current benchmark datasets such as the MultiWOZ. However, almost all current research completely ignores the user negative feedback utterances that exist in real-life conversations when a system error occurs, which often contains user-provided corrective information for the system error. Obviously, user negative feedback utterances can be used to correct the inevitable errors in automatic speech recognition and model generalization. Thus, in this paper, we will explore the role of negative feedback utterances in dialogue state tracking in detail through simulated negative feedback utterances. Specifically, due to the lack of dataset involving negative feedback utterances, first, we have to define the schema of user negative feedback utterances and propose a joint modeling method for feedback utterance generation and filtering. Then, we explore three aspects of interaction mechanism that should be considered in real-life conversations involving negative feedback utterances and propose evaluation metrics related to negative feedback utterances.
Finally, on WOZ2.0 and MultiWOZ2.1 datasets, by constructing simulated negative feedback utterances in training and testing, we not only verify the important role of negative feedback utterances in dialogue state tracking, but also analyze the advantages and disadvantages of different interaction mechanisms involving negative feedback utterances, lighting future research on negative feedback utterances.|近年来,对话系统的研究受到了广泛的关注,尤其是面向任务的对话系统,由于其广阔的应用前景而受到越来越多的关注。对话状态跟踪(DST)是任务导向对话系统的核心组成部分,其功能是将自然语言对话解析为由插槽值对形成的对话状态。众所周知,对话状态跟踪已经在当前的基准数据集(如 MultiWOZ)上得到了很好的研究和探索。然而,目前几乎所有的研究都完全忽视了系统错误发生时用户在现实交谈中的负面反馈语,其中往往包含用户提供的系统错误纠正信息。显然,用户负反馈话语可以用来纠正语音自动识别和模型推广中不可避免的错误。因此,本文将通过模拟负反馈话语来详细探讨负反馈话语在对话状态跟踪中的作用。具体来说,由于缺乏涉及负反馈话语的数据集,首先,我们必须定义用户负反馈话语的模式,并提出一种联合建模的方法来生成和过滤反馈话语。然后,从三个方面探讨了负反馈话语在现实会话中应该考虑的互动机制,并提出了与负反馈话语相关的评价指标。最后,在 WOZ2.0和 MultiWOZ2.1数据集上,通过构建训练和测试中的模拟负反馈话语,不仅验证了负反馈话语在对话状态跟踪中的重要作用,而且分析了负反馈话语不同交互机制的优缺点,为进一步研究负反馈话语提供参考。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Toward+Real-life+Dialogue+State+Tracking+Involving+Negative+Feedback+Utterances)|0| -|[M-Mix: Generating Hard Negatives via Multi-sample Mixing for Contrastive Learning](https://doi.org/10.1145/3534678.3539248)|Shaofeng Zhang, Meng Liu, Junchi Yan, Hengrui Zhang, Lingxiao Huang, Xiaokang Yang, Pinyan Lu|Shanghai Jiao Tong University, Shanghai, China; Shanghai University of Finance and Economics & Huawei TCS Lab, Shanghai, China; Huawei TCS Lab, Shanghai, China|Negative pairs, especially hard negatives as combined with common negatives (easy to discriminate), are essential in contrastive learning, which plays a role of avoiding degenerate solutions in the sense of constant representation across different instances. Inspired by recent hard negative mining methods via pairwise mixup operation in vision, we propose M-Mix, which dynamically generates a sequence of hard negatives. Compared with previous methods, M-Mix mainly has three features: 1) adaptively choose samples to mix; 2) simultaneously mix multiple samples; 3) automatically assign different mixing weights to the selected samples. We evaluate our method on two image datasets (CIFAR-10, CIFAR-100), five node classification datasets (PPI, DBLP, Pubmed, etc), five graph classification datasets (IMDB, PTC_MR, etc), and two downstream combinatorial tasks (graph edit distance and node clustering). Results show that it achieves state-of-the-art performance under self-supervised settings. Code is available at: https://github.com/Sherrylone/m-mix.|否定对,尤其是硬否定与普通否定(容易区分)的结合,在对比学习中是必不可少的,对比学习的作用是避免退化的解决方案在不同情况下的持续表征。受当前视觉硬负片挖掘方法的启发,提出了 M-Mix 算法,该算法动态生成硬负片序列。与以往的混合方法相比,M-Mix 方法主要有三个特点: 1)自适应地选择混合样本; 2)同时混合多个样本; 3)自动分配不同的混合权重给选定的样本。我们在两个图像数据集(CIFAR-10,CIFAR-100) ,五个节点分类数据集(PPI,DBLP,Pubmed 等) ,五个图形分类数据集(IMDB,PTC _ MR 等)和两个下游组合任务(图形编辑距离和节点聚类)上评估我们的方法。结果表明,该算法在自监督设置下达到了最佳性能。密码可于以下 https://github.com/sherrylone/m-mix 索取:。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=M-Mix:+Generating+Hard+Negatives+via+Multi-sample+Mixing+for+Contrastive+Learning)|0| -|[Modeling Persuasion Factor of User Decision for Recommendation](https://doi.org/10.1145/3534678.3539114)|Chang Liu, Chen Gao, Yuan Yuan, Chen Bai, Lingrui Luo, Xiaoyi Du, Xinlei Shi, Hengliang Luo, Depeng Jin, Yong Li|Meituan Inc., Beijing, China; Tsinghua University, Beijing, China|In online information systems, users make decisions based on factors of several specific aspects, such as brand, price, etc. 
Existing recommendation engines ignore the explicit modeling of these factors, leading to sub-optimal recommendation performance. In this paper, we focus on the real-world scenario where these factors can be explicitly captured (the users are exposed with decision factor-based persuasion texts, i.e., persuasion factors). Although it allows us for explicit modeling of user-decision process, there are critical challenges including the persuasion factor's representation learning and effect estimation, along with the data-sparsity problem. To address them, in this work, we present our POEM (short for Persuasion factOr Effect Modeling) system. We first propose the persuasion-factor graph convolutional layers for encoding and learning representations from the persuasion-aware interaction data. Then we develop a prediction layer that fully considers the user sensitivity to the persuasion factors. Finally, to address the data-sparsity issue, we propose a counterfactual learning-based data augmentation method to enhance the supervision signal. Real-world experiments demonstrate the effectiveness of our proposed framework of modeling the effect of persuasion factors.|在网络信息系统中,用户根据品牌、价格等几个具体方面的因素进行决策。现有的推荐引擎忽略了这些因素的显式建模,导致推荐性能不理想。在本文中,我们关注的是真实世界中这些因素可以被明确地捕获的场景(用户暴露在基于决策因素的说服文本中,即,说服因素)。尽管它允许我们对用户决策过程进行明确的建模,但是仍然存在一些关键的挑战,包括说服因子的表示学习和效果估计,以及数据稀疏问题。为了解决这些问题,在本文中,我们提出了我们的 POEM (劝导因素效果建模的缩写)系统。我们首先提出了说服因子图卷积层,用于从感知说服的交互数据中编码和学习表示。然后我们开发了一个预测层,充分考虑了用户对说服因素的敏感性。最后,针对数据稀疏问题,提出了一种基于反事实学习的数据增强方法来增强监控信号。现实世界的实验证明了我们提出的说服因素效应建模框架的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Modeling+Persuasion+Factor+of+User+Decision+for+Recommendation)|0| +|[Toward Real-life Dialogue State Tracking Involving Negative Feedback Utterances](https://doi.org/10.1145/3534678.3539385)|Puhai Yang, Heyan Huang, Wei Wei, XianLing Mao|Beijing Institute of Technology, Beijing, China; Huazhong University of Science and Technology, Wuhan, China; Beijing Institute of Technology & Beijing Engineering Research Center of High Volume Language Information Processing and Cloud Computing Applications, Beijing, China|Recently, research on dialogue systems has received wide attention; task-oriented dialogue systems in particular have received increased attention due to their wide application prospects. As a core component, dialogue state tracking (DST) plays a key role in task-oriented dialogue systems, and its function is to parse natural language dialogues into a dialogue state formed by slot-value pairs. It is well known that dialogue state tracking has been well studied and explored on current benchmark datasets such as MultiWOZ. However, almost all current research completely ignores the user negative feedback utterances that exist in real-life conversations when a system error occurs, which often contain user-provided corrective information for the system error. Obviously, user negative feedback utterances can be used to correct the inevitable errors in automatic speech recognition and model generalization. Thus, in this paper, we will explore the role of negative feedback utterances in dialogue state tracking in detail through simulated negative feedback utterances. Specifically, due to the lack of datasets involving negative feedback utterances, first, we have to define the schema of user negative feedback utterances and propose a joint modeling method for feedback utterance generation and filtering.
Then, we explore three aspects of interaction mechanism that should be considered in real-life conversations involving negative feedback utterances and propose evaluation metrics related to negative feedback utterances. Finally, on WOZ2.0 and MultiWOZ2.1 datasets, by constructing simulated negative feedback utterances in training and testing, we not only verify the important role of negative feedback utterances in dialogue state tracking, but also analyze the advantages and disadvantages of different interaction mechanisms involving negative feedback utterances, illuminating future research on negative feedback utterances.|近年来,对话系统的研究受到了广泛的关注,尤其是面向任务的对话系统,由于其广阔的应用前景而受到越来越多的关注。对话状态跟踪(DST)是任务导向对话系统的核心组成部分,其功能是将自然语言对话解析为由插槽值对形成的对话状态。众所周知,对话状态跟踪已经在当前的基准数据集(如 MultiWOZ)上得到了很好的研究和探索。然而,目前几乎所有的研究都完全忽视了系统错误发生时用户在现实交谈中的负面反馈语,其中往往包含用户提供的系统错误纠正信息。显然,用户负反馈话语可以用来纠正自动语音识别和模型泛化中不可避免的错误。因此,本文将通过模拟负反馈话语来详细探讨负反馈话语在对话状态跟踪中的作用。具体来说,由于缺乏涉及负反馈话语的数据集,首先,我们必须定义用户负反馈话语的模式,并提出一种联合建模的方法来生成和过滤反馈话语。然后,从三个方面探讨了负反馈话语在现实会话中应该考虑的互动机制,并提出了与负反馈话语相关的评价指标。最后,在 WOZ2.0和 MultiWOZ2.1数据集上,通过构建训练和测试中的模拟负反馈话语,不仅验证了负反馈话语在对话状态跟踪中的重要作用,而且分析了负反馈话语不同交互机制的优缺点,为进一步研究负反馈话语提供参考。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Toward+Real-life+Dialogue+State+Tracking+Involving+Negative+Feedback+Utterances)|0| +|[M-Mix: Generating Hard Negatives via Multi-sample Mixing for Contrastive Learning](https://doi.org/10.1145/3534678.3539248)|Shaofeng Zhang, Meng Liu, Junchi Yan, Hengrui Zhang, Lingxiao Huang, Xiaokang Yang, Pinyan Lu|Huawei TCS Lab, Shanghai, China; Shanghai University of Finance and Economics & Huawei TCS Lab, Shanghai, China; Shanghai Jiao Tong University, Shanghai, China|Negative pairs, especially hard negatives as combined with common negatives (easy to discriminate), are essential in contrastive learning, where they play a role in avoiding degenerate solutions in the sense of constant representations across different instances. Inspired by recent hard negative mining methods via pairwise mixup operation in vision, we propose M-Mix, which dynamically generates a sequence of hard negatives. Compared with previous methods, M-Mix mainly has three features: 1) adaptively choose samples to mix; 2) simultaneously mix multiple samples; 3) automatically assign different mixing weights to the selected samples. We evaluate our method on two image datasets (CIFAR-10, CIFAR-100), five node classification datasets (PPI, DBLP, Pubmed, etc), five graph classification datasets (IMDB, PTC_MR, etc), and two downstream combinatorial tasks (graph edit distance and node clustering). Results show that it achieves state-of-the-art performance under self-supervised settings.
Code is available at: https://github.com/Sherrylone/m-mix.|负样本对,尤其是难负样本与普通负样本(易于区分)的结合,在对比学习中是必不可少的,其作用是避免不同实例的表征恒定不变这种退化解。受近期视觉领域通过成对 mixup 操作进行难负样本挖掘方法的启发,我们提出了 M-Mix,该算法动态生成难负样本序列。与以往的混合方法相比,M-Mix 方法主要有三个特点: 1)自适应地选择混合样本; 2)同时混合多个样本; 3)自动分配不同的混合权重给选定的样本。我们在两个图像数据集(CIFAR-10,CIFAR-100) ,五个节点分类数据集(PPI,DBLP,Pubmed 等) ,五个图分类数据集(IMDB,PTC_MR 等)和两个下游组合任务(图编辑距离和节点聚类)上评估我们的方法。结果表明,该算法在自监督设置下达到了最佳性能。代码可在 https://github.com/Sherrylone/m-mix 获取。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=M-Mix:+Generating+Hard+Negatives+via+Multi-sample+Mixing+for+Contrastive+Learning)|0| +|[Modeling Persuasion Factor of User Decision for Recommendation](https://doi.org/10.1145/3534678.3539114)|Chang Liu, Chen Gao, Yuan Yuan, Chen Bai, Lingrui Luo, Xiaoyi Du, Xinlei Shi, Hengliang Luo, Depeng Jin, Yong Li|Tsinghua University, Beijing, China; Meituan Inc., Beijing, China|In online information systems, users make decisions based on factors of several specific aspects, such as brand, price, etc. Existing recommendation engines ignore the explicit modeling of these factors, leading to sub-optimal recommendation performance. In this paper, we focus on the real-world scenario where these factors can be explicitly captured (the users are exposed with decision factor-based persuasion texts, i.e., persuasion factors). Although it allows us for explicit modeling of user-decision process, there are critical challenges including the persuasion factor's representation learning and effect estimation, along with the data-sparsity problem. To address them, in this work, we present our POEM (short for Persuasion factOr Effect Modeling) system. We first propose the persuasion-factor graph convolutional layers for encoding and learning representations from the persuasion-aware interaction data. Then we develop a prediction layer that fully considers the user sensitivity to the persuasion factors. Finally, to address the data-sparsity issue, we propose a counterfactual learning-based data augmentation method to enhance the supervision signal. Real-world experiments demonstrate the effectiveness of our proposed framework of modeling the effect of persuasion factors.|在网络信息系统中,用户根据品牌、价格等几个具体方面的因素进行决策。现有的推荐引擎忽略了这些因素的显式建模,导致推荐性能不理想。在本文中,我们关注的是真实世界中这些因素可以被明确地捕获的场景(用户暴露在基于决策因素的说服文本中,即说服因素)。尽管它允许我们对用户决策过程进行明确的建模,但是仍然存在一些关键的挑战,包括说服因子的表示学习和效果估计,以及数据稀疏问题。为了解决这些问题,在本文中,我们提出了我们的 POEM (劝导因素效果建模的缩写)系统。我们首先提出了说服因子图卷积层,用于从感知说服的交互数据中编码和学习表示。然后我们开发了一个预测层,充分考虑了用户对说服因素的敏感性。最后,针对数据稀疏问题,提出了一种基于反事实学习的数据增强方法来增强监督信号。现实世界的实验证明了我们提出的说服因素效应建模框架的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Modeling+Persuasion+Factor+of+User+Decision+for+Recommendation)|0|
In this paper, we present a GPU-accelerated online serving system, namely Lion, which consists of a staged event-driven heterogeneous pipeline, a unified memory manager, and an automatic execution optimizer to handle web-scale traffic in a real-time and cost-effective way. Moreover, Lion provides a heterogeneous template library to enable fast development and migration for diverse in-house web-scale recommendation systems without requiring knowledge of heterogeneous programming. The system is currently deployed at Baidu, supporting over twenty recommendation services, including news feed, short video clips, and the search engine. Extensive experimental studies on five real-world deployed online recommendation services demonstrate the superiority of the proposed GPU-accelerated online serving system. Since its launch in early 2020, Lion has answered billions of recommendation requests per day, and has helped Baidu successfully save millions of U.S. dollars in hardware and utility costs per year.|基于深度神经网络(DNN)的推荐系统广泛应用于现代互联网行业的各种服务。然而,应用场景的快速扩展和全球互联网流量的爆炸性增长,使得业界面临着越来越多的挑战,以服务复杂的推荐工作流,包括在线推荐效率和计算资源开销。本文提出了一种基于 GPU 加速的在线服务系统 Lion,该系统由分级事件驱动的异构流水线、统一内存管理器和自动执行优化器组成,能够以实时且经济高效的方式处理网络规模的流量。此外,Lion 还提供了一个异构模板库,可以在不需要异构编程知识的情况下快速开发和迁移各种内部 Web 规模的推荐系统。该系统目前部署在百度,支持超过20种推荐服务,包括新闻信息流、短视频和搜索引擎。通过对五个实际部署的在线推荐服务的大量实验研究,证明了所提出的 GPU 加速在线服务系统的优越性。自2020年初推出以来,Lion 每天响应数十亿条推荐请求,并帮助百度每年成功节省了数百万美元的硬件和公用事业成本。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Lion:+A+GPU-Accelerated+Online+Serving+System+for+Web-Scale+Recommendation+at+Baidu)|0| |[CognitionNet: A Collaborative Neural Network for Play Style Discovery in Online Skill Gaming Platform](https://doi.org/10.1145/3534678.3539179)|Rukma Talwadker, Surajit Chakrabarty, Aditya Pareek, Tridib Mukherjee, Deepak Saini||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=CognitionNet:+A+Collaborative+Neural+Network+for+Play+Style+Discovery+in+Online+Skill+Gaming+Platform)|0| |[FedAttack: Effective and Covert Poisoning Attack on Federated Recommendation via Hard Sampling](https://doi.org/10.1145/3534678.3539119)|Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang, Xing Xie||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=FedAttack:+Effective+and+Covert+Poisoning+Attack+on+Federated+Recommendation+via+Hard+Sampling)|0| diff --git a/papers/kdd/kdd2023.md b/papers/kdd/kdd2023.md index d7200dc6..7594e913 100644 --- a/papers/kdd/kdd2023.md +++ b/papers/kdd/kdd2023.md @@ -17,21 +17,21 @@ |[An Empirical Study of Selection Bias in Pinterest Ads Retrieval](https://doi.org/10.1145/3580305.3599771)|Yuan Wang, Peifeng Yin, Zhiqiang Tao, Hari Venkatesan, Jin Lai, Yi Fang, PJ Xiao||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=An+Empirical+Study+of+Selection+Bias+in+Pinterest+Ads+Retrieval)|0| |[PIER: Permutation-Level Interest-Based End-to-End Re-ranking Framework in E-commerce](https://doi.org/10.1145/3580305.3599886)|Xiaowen Shi, Fan Yang, Ze Wang, Xiaoxu Wu, Muzhi Guan, Guogang Liao, Yongkang Wang, Xingxing Wang, Dong Wang|Meituan|Re-ranking, which rearranges the ranking list by modeling the mutual influence among items to better meet users' demands, has drawn increased attention from both academia and industry. Many existing re-ranking methods directly take the initial ranking list as input, and generate the optimal permutation through a well-designed context-wise model, which brings the evaluation-before-reranking problem. Meanwhile, evaluating all candidate permutations brings unacceptable computational costs in practice.
Thus, to better balance efficiency and effectiveness, online systems usually use a two-stage architecture which first uses heuristic methods such as beam search to generate a suitable number of candidate permutations, which are then fed into the evaluation model to get the optimal permutation. However, existing methods in both stages can be improved through the following aspects. As for the generation stage, heuristic methods only use point-wise prediction scores and lack an effective judgment. As for the evaluation stage, most existing context-wise evaluation models only consider the item context and lack more fine-grained feature context modeling. This paper presents a novel end-to-end re-ranking framework named PIER to tackle the above challenges, which still follows the two-stage architecture and contains two main modules named FPSM and OCPM. We apply SimHash in FPSM to select top-K candidates from the full permutation based on the user's permutation-level interest in an efficient way. Then we design a novel omnidirectional attention mechanism in OCPM to capture the context information in the permutation. Finally, we jointly train these two modules end-to-end by introducing a comparative learning loss. Offline experiment results demonstrate that PIER outperforms baseline models on both public and industrial datasets, and we have successfully deployed PIER on the Meituan food delivery platform.|重新排序通过对项目间相互影响进行建模来重排排序列表,以更好地满足用户需求,正受到学术界和工业界越来越多的关注。许多现有的重新排序方法直接以初始排序列表为输入,通过设计良好的上下文感知模型生成最优排列,从而带来“重排前评估”问题。同时,评估所有候选排列在实践中带来不可接受的计算成本。因此,为了更好地平衡效率和有效性,在线系统通常采用两阶段的体系结构,先使用一些启发式的方法(如束搜索)生成适当数量的候选排列,然后将其送入评估模型,以获得最优的排列。然而,这两个阶段的现有方法可以通过以下几个方面进行改进。对于生成阶段,启发式方法只使用逐点预测得分,缺乏有效的判断。在评价阶段,现有的基于上下文的评价模型大多只考虑项目上下文,缺乏更细粒度的特征上下文建模。本文提出了一种新的端到端重新排序框架 PIER,以解决上述挑战,该框架仍然遵循两阶段的体系结构,包含两个主要模块: FPSM 和 OCPM。我们在 FPSM 中应用 SimHash,基于用户的排列级兴趣,高效地从全排列中选出 top-K 候选。然后在 OCPM 中设计了一种新的全方位注意机制来捕获排列中的上下文信息。最后,通过引入比较学习损失,对这两个模块进行了端到端的联合训练。离线实验结果显示 PIER 在公共和工业数据集上都优于基线模型,我们已经成功地在美团食品配送平台上部署 PIER。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=PIER:+Permutation-Level+Interest-Based+End-to-End+Re-ranking+Framework+in+E-commerce)|0| |[Exploiting Intent Evolution in E-commercial Query Recommendation](https://doi.org/10.1145/3580305.3599821)|Yu Wang, Zhengyang Wang, Hengrui Zhang, Qingyu Yin, Xianfeng Tang, Yinghan Wang, Danqing Zhang, Limeng Cui, Monica Cheng, Bing Yin, Suhang Wang, Philip S. Yu||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Exploiting+Intent+Evolution+in+E-commercial+Query+Recommendation)|0| -|[QUERT: Continual Pre-training of Language Model for Query Understanding in Travel Domain Search](https://doi.org/10.1145/3580305.3599891)|Jian Xie, Yidan Liang, Jingping Liu, Yanghua Xiao, Baohua Wu, Shenghua Ni|Alibaba Group; School of Information Science and Engineering, East China University of Science and Technology; Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University|In light of the success of the pre-trained language models (PLMs), continual pre-training of generic PLMs has been the paradigm of domain adaption. In this paper, we propose QUERT, A Continual Pre-trained Language Model for QUERy Understanding in Travel Domain Search. QUERT is jointly trained on four tailored pre-training tasks to the characteristics of query in travel domain search: Geography-aware Mask Prediction, Geohash Code Prediction, User Click Behavior Learning, and Phrase and Token Order Prediction.
Performance improvement of downstream tasks and ablation experiment demonstrate the effectiveness of our proposed pre-training tasks. To be specific, the average performance of downstream tasks increases by 2.02% and 30.93% in supervised and unsupervised settings, respectively. To check on the improvement of QUERT to online business, we deploy QUERT and perform A/B testing on Fliggy APP. The feedback results show that QUERT increases the Unique Click-Through Rate and Page Click-Through Rate by 0.89% and 1.03% when applying QUERT as the encoder. Our code and downstream task data will be released for future research.|鉴于预训练语言模型(PLM)的成功,通用 PLM 的连续预训练已经成为领域适应的范例。本文提出了一种连续预训练语言模型 QUERT,用于旅游领域搜索中的查询理解。QUERT 针对旅游领域搜索中查询的特点,共同接受了四项量身定制的预先培训任务: 地理感知掩码预测、 Geohash 代码预测、用户点击行为学习以及短语和令牌顺序预测。下游任务的性能改进和烧蚀实验验证了我们提出的预训练任务的有效性。具体来说,在监督和非监督环境下,下游任务的平均性能分别提高了2.02% 和30.93% 。为了检查 QUERT 对在线业务的改进,我们部署 QUERT 并在 Fliggy APP 上进行 A/B 测试。反馈结果显示,当应用 QUERT 作为编码器时,QUERT 增加了0.89% 和1.03% 的唯一点进率和页面点进率。我们的代码和下游任务数据将被公布,以供未来研究使用。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=QUERT:+Continual+Pre-training+of+Language+Model+for+Query+Understanding+in+Travel+Domain+Search)|0| +|[QUERT: Continual Pre-training of Language Model for Query Understanding in Travel Domain Search](https://doi.org/10.1145/3580305.3599891)|Jian Xie, Yidan Liang, Jingping Liu, Yanghua Xiao, Baohua Wu, Shenghua Ni|Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University; School of Information Science and Engineering, East China University of Science and Technology; Alibaba Group|In light of the success of the pre-trained language models (PLMs), continual pre-training of generic PLMs has been the paradigm of domain adaptation. In this paper, we propose QUERT, A Continual Pre-trained Language Model for QUERy Understanding in Travel Domain Search. QUERT is jointly trained on four pre-training tasks tailored to the characteristics of queries in travel domain search: Geography-aware Mask Prediction, Geohash Code Prediction, User Click Behavior Learning, and Phrase and Token Order Prediction. Performance improvements on downstream tasks and ablation experiments demonstrate the effectiveness of our proposed pre-training tasks. To be specific, the average performance of downstream tasks increases by 2.02% and 30.93% in supervised and unsupervised settings, respectively. To check the improvement QUERT brings to online business, we deploy QUERT and perform A/B testing on the Fliggy APP. The feedback results show that QUERT increases the Unique Click-Through Rate and Page Click-Through Rate by 0.89% and 1.03% when applying QUERT as the encoder.
Our code and downstream task data will be released for future research.|鉴于预训练语言模型(PLM)的成功,通用 PLM 的连续预训练已经成为领域适应的范例。本文提出了一种连续预训练语言模型 QUERT,用于旅游领域搜索中的查询理解。QUERT 针对旅游领域搜索中查询的特点,在四项量身定制的预训练任务上进行联合训练: 地理感知掩码预测、 Geohash 代码预测、用户点击行为学习以及短语和令牌顺序预测。下游任务的性能提升和消融实验验证了我们提出的预训练任务的有效性。具体来说,在监督和非监督环境下,下游任务的平均性能分别提高了2.02% 和30.93% 。为了检查 QUERT 对在线业务的改进,我们部署 QUERT 并在 Fliggy APP 上进行 A/B 测试。反馈结果显示,当应用 QUERT 作为编码器时,QUERT 将唯一点进率和页面点进率分别提升了0.89% 和1.03% 。我们的代码和下游任务数据将被公布,以供未来研究使用。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=QUERT:+Continual+Pre-training+of+Language+Model+for+Query+Understanding+in+Travel+Domain+Search)|0| |[A Collaborative Transfer Learning Framework for Cross-domain Recommendation](https://doi.org/10.1145/3580305.3599758)|Wei Zhang, Pengye Zhang, Bo Zhang, Xingxing Wang, Dong Wang|Meituan|In recommendation systems, there are multiple business domains to meet the diverse interests and needs of users, and the click-through rate (CTR) of each domain can be quite different, which leads to the demand for CTR prediction modeling for different business domains. The industry solution is to use domain-specific models or transfer learning techniques for each domain. The disadvantage of the former is that the data from other domains is not utilized by a single domain model, while the latter leverages all the data from different domains, but the fine-tuned model of transfer learning may trap the model in a local optimum of the source domain, making it difficult to fit the target domain. Meanwhile, significant differences in data quantity and feature schemas between different domains, known as domain shift, may lead to negative transfer in the process of transferring. To overcome these challenges, we propose the Collaborative Cross-Domain Transfer Learning Framework (CCTL). CCTL evaluates the information gain of the source domain on the target domain using a symmetric companion network and adjusts the information transfer weight of each source domain sample using the information flow network. This approach enables full utilization of other domain data while avoiding negative transfer. Additionally, a representation enhancement network is used as an auxiliary task to preserve domain-specific features. In comprehensive experiments on both public and real-world industrial datasets, CCTL achieved SOTA scores on offline metrics.
At the same time, the CCTL algorithm has been deployed in Meituan, bringing a 4.37% CTR lift and a 5.43% GMV lift, which is significant to the business.|在推荐系统中,有多个业务领域可以满足用户的不同兴趣和需求,而每个领域的点进率可能有很大差异,因此需要为不同的业务领域建立点击率预测模型。行业解决方案是对每个领域使用特定于领域的模型或转移学习技术。前者的缺点是其他领域的数据不能被单一的领域模型所利用,而后者则利用来自不同领域的所有数据,但是经过微调的迁移学习模型可能使模型陷入源领域的局部最优,从而难以适应目标领域。同时,不同领域间数据量和特征模式的显著差异(称为领域偏移)可能导致迁移过程中的负迁移。为了克服这些挑战,我们提出了协作跨域转移学习框架(CCTL)。CCTL 使用对称伴侣网络对源域在目标域上的信息增益进行评估,并使用信息流网络调整每个源域样本的信息传输权重。这种方法可以充分利用其他域数据,同时避免负迁移。此外,表示增强网络用作辅助任务,以保持特定领域的特征。CCTL 在公共和现实世界的工业数据集上进行了全面的实验,在离线指标上取得了 SOTA 评分。与此同时,CCTL 算法已经在美团中部署,带来了4.37% 的点击率提升和5.43% 的 GMV 提升,这对业务具有重要意义。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Collaborative+Transfer+Learning+Framework+for+Cross-domain+Recommendation)|0| -|[Towards Disentangling Relevance and Bias in Unbiased Learning to Rank](https://doi.org/10.1145/3580305.3599914)|Yunan Zhang, Le Yan, Zhen Qin, Honglei Zhuang, Jiaming Shen, Xuanhui Wang, Michael Bendersky, Marc Najork|Google; Google Research; University of Illinois at Urbana-Champaign|Unbiased learning to rank (ULTR) studies the problem of mitigating various biases from implicit user feedback data such as clicks, and has been receiving considerable attention recently. A popular ULTR approach for real-world applications uses a two-tower architecture, where click modeling is factorized into a relevance tower with regular input features, and a bias tower with bias-relevant inputs such as the position of a document. A successful factorization will allow the relevance tower to be exempt from biases. In this work, we identify a critical issue that existing ULTR methods ignored - the bias tower can be confounded with the relevance tower via the underlying true relevance. In particular, the positions were determined by the logging policy, i.e., the previous production model, which would possess relevance information. We give both theoretical analysis and empirical results to show the negative effects on relevance tower due to such a correlation. We then propose three methods to mitigate the negative confounding effects by better disentangling relevance and bias. Empirical results on both controlled public datasets and a large-scale industry dataset show the effectiveness of the proposed approaches.|无偏学习排序(ULTR)研究的是如何减轻隐性用户反馈数据(如点击)中的各种偏差,近年来受到了广泛的关注。一种流行的 ULTR 方法用于现实世界的应用程序使用一个双塔架构,其中点击建模被分解为一个具有常规输入特征的相关塔,以及一个具有偏倚相关输入(如文档的位置)的偏倚塔。一个成功的因子分解将使相关塔免于偏见。在这项工作中,我们确定了一个关键问题,现有的 ULTR 方法忽略-偏倚塔可以混淆与相关塔通过潜在的真实相关性。具体来说,位置是由测井策略决定的,即先前的生产模型,它将拥有相关信息。我们给出了理论分析和实证结果来说明这种相关性对关联塔的负面影响。然后,我们提出了三种方法,通过更好地分离相关性和偏倚来减轻负面混杂效应。对受控公共数据集和大规模行业数据集的实证结果表明了该方法的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Towards+Disentangling+Relevance+and+Bias+in+Unbiased+Learning+to+Rank)|0| +|[Towards Disentangling Relevance and Bias in Unbiased Learning to Rank](https://doi.org/10.1145/3580305.3599914)|Yunan Zhang, Le Yan, Zhen Qin, Honglei Zhuang, Jiaming Shen, Xuanhui Wang, Michael Bendersky, Marc Najork|Google Research; University of Illinois at Urbana-Champaign; Google|Unbiased learning to rank (ULTR) studies the problem of mitigating various biases from implicit user feedback data such as clicks, and has been receiving considerable attention recently. A popular ULTR approach for real-world applications uses a two-tower architecture, where click modeling is factorized into a relevance tower with regular input features, and a bias tower with bias-relevant inputs such as the position of a document.
A successful factorization will allow the relevance tower to be exempt from biases. In this work, we identify a critical issue that existing ULTR methods ignored: the bias tower can be confounded with the relevance tower via the underlying true relevance. In particular, the positions were determined by the logging policy, i.e., the previous production model, which would possess relevance information. We give both theoretical analysis and empirical results to show the negative effects on the relevance tower due to such a correlation. We then propose three methods to mitigate the negative confounding effects by better disentangling relevance and bias. Empirical results on both controlled public datasets and a large-scale industry dataset show the effectiveness of the proposed approaches.|无偏学习排序(ULTR)研究的是如何减轻隐性用户反馈数据(如点击)中的各种偏差,近年来受到了广泛的关注。现实应用中一种流行的 ULTR 方法使用双塔架构,其中点击建模被分解为一个具有常规输入特征的相关塔,以及一个具有偏倚相关输入(如文档的位置)的偏倚塔。一个成功的因子分解将使相关塔免于偏见。在这项工作中,我们指出了现有 ULTR 方法所忽略的一个关键问题: 偏倚塔可能通过潜在的真实相关性与相关塔相混淆。具体来说,位置是由日志策略(即先前的生产模型)决定的,而该模型拥有相关性信息。我们给出了理论分析和实证结果来说明这种相关性对相关塔的负面影响。然后,我们提出了三种方法,通过更好地分离相关性和偏倚来减轻负面混杂效应。对受控公共数据集和大规模行业数据集的实证结果表明了该方法的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Towards+Disentangling+Relevance+and+Bias+in+Unbiased+Learning+to+Rank)|0| |[M5: Multi-Modal Multi-Interest Multi-Scenario Matching for Over-the-Top Recommendation](https://doi.org/10.1145/3580305.3599863)|Pengyu Zhao, Xin Gao, Chunxu Xu, Liang Chen||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=M5:+Multi-Modal+Multi-Interest+Multi-Scenario+Matching+for+Over-the-Top+Recommendation)|0| -|[Accelerating Personalized PageRank Vector Computation](https://doi.org/10.1145/3580305.3599251)|Zhen Chen, Xingzhi Guo, Baojian Zhou, Deqing Yang, Steven Skiena|Fudan University; State University of New York at Stony Brook|Personalized PageRank Vectors are widely used as fundamental graph-learning tools for detecting anomalous spammers, learning graph embeddings, and training graph neural networks. The well-known local FwdPush algorithm approximates PPVs and has a sublinear rate of $O\big(\frac{1}{\alpha\epsilon}\big)$. A recent study found that when high precision is required, FwdPush is similar to the power iteration method, and its run time is pessimistically bounded by $O\big(\frac{m}{\alpha} \log\frac{1}{\epsilon}\big)$. This paper looks closely at calculating PPVs for both directed and undirected graphs. By leveraging the linear invariant property, we show that FwdPush is a variant of Gauss-Seidel and propose a Successive Over-Relaxation based method, FwdPushSOR to speed it up by slightly modifying FwdPush. Additionally, we prove FwdPush has local linear convergence rate $O\big(\tfrac{\text{vol}(S)}{\alpha} \log\tfrac{1}{\epsilon}\big)$ enjoying advantages of two existing bounds. We also design a new local heuristic push method that reduces the number of operations by 10-50 percent compared to FwdPush. For undirected graphs, we propose two momentum-based acceleration methods that can be expressed as one-line updates and speed up non-acceleration methods by$\mathcal{O}\big(\tfrac{1}{\sqrt{\alpha}}\big)$.
Our experiments on six real-world graph datasets confirm the efficiency of FwdPushSOR and the acceleration methods for directed and undirected graphs, respectively.|个性化 PageRank 向量广泛用作基本的图形学习工具,用于检测异常垃圾邮件发送者、学习图形嵌入和训练图形神经网络。著名的局部 FwdPush 算法近似于 PPV,其次线性速率为 $O big (frac {1}{ alpha epsilon } big) $。最近的一项研究发现,当需要高精度时,FwdPush 类似于幂迭代法,其运行时间悲观地受到 $O big (frac { m }{ alpha } log frac {1}{ epsilon } big) $的限制。本文主要研究有向图和无向图的 PPV 的计算。通过利用线性不变性,我们证明了 FwdPush 是 Gauss-Seidel 的一个变体,并提出了一个基于逐次超松驰法的方法,FwdPushSOR,通过稍微修改 FwdPush 来加速它。另外,我们证明了 FwdPush 具有局部线性收敛速度 $O big (tfrac { text { vol }(S)}{ alpha } log tfrac {1}{ epsilon } big) $具有两个现有界的优点。我们还设计了一种新的局部启发式推送方法,与 FwdPush 相比减少了10-50% 的操作次数。对于无向图,我们提出了两种基于动量的加速方法,它们可以表示为一行更新,并且可以通过 $mathcal { O } big (tfrac {1}{ sqrt { alpha }} big) $来加速非加速方法。我们在六个实际图形数据集上的实验分别证实了 FwdPushSOR 和有向图和无向图加速方法的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Accelerating+Personalized+PageRank+Vector+Computation)|0| -|[Text Is All You Need: Learning Language Representations for Sequential Recommendation](https://doi.org/10.1145/3580305.3599519)|Jiacheng Li, Ming Wang, Jin Li, Jinmiao Fu, Xin Shen, Jingbo Shang, Julian J. McAuley|University of California, San Diego; Amazon|Sequential recommendation aims to model dynamic user behavior from historical interactions. Existing methods rely on either explicit item IDs or general textual features for sequence modeling to understand user preferences. While promising, these approaches still struggle to model cold-start items or transfer knowledge to new datasets. In this paper, we propose to model user preferences and item features as language representations that can be generalized to new items and datasets. To this end, we present a novel framework, named Recformer, which effectively learns language representations for sequential recommendation. Specifically, we propose to formulate an item as a "sentence" (word sequence) by flattening item key-value attributes described by text so that an item sequence for a user becomes a sequence of sentences. For recommendation, Recformer is trained to understand the "sentence" sequence and retrieve the next "sentence". To encode item sequences, we design a bi-directional Transformer similar to the model Longformer but with different embedding layers for sequential recommendation. For effective representation learning, we propose novel pretraining and finetuning methods which combine language understanding and recommendation tasks. Therefore, Recformer can effectively recommend the next item based on language representations. 
Extensive experiments conducted on six datasets demonstrate the effectiveness of Recformer for sequential recommendation, especially in low-resource and cold-start settings.|顺序推荐旨在从历史交互中建立动态用户行为模型。现有的方法依赖于显式的项 ID 或一般的文本特性来进行序列建模,以理解用户的首选项。尽管这些方法很有前途,但它们仍然难以对冷启动项目进行建模或将知识转移到新的数据集中。本文提出将用户偏好和项目特征建模为语言表示,并将其推广到新的项目和数据集。为此,我们提出了一个新的框架,称为 Recformer,它有效地学习语言表示顺序推荐。具体来说,我们建议通过将文本所描述的项目键值属性扁平化,将项目表述为“句子”(单词序列) ,从而使用户的项目序列成为一个句子序列。为了便于推荐,Recformer 接受了理解“句子”序列和检索下一个“句子”的训练。为了对项目序列进行编码,我们设计了一个类似于 Longform 模型但具有不同嵌入层的双向变压器用于顺序推荐。为了有效地进行表征学习,我们提出了一种新的预训练和微调方法,将语言理解和推荐任务结合起来。因此,Recformer 可以根据语言表示有效地推荐下一个项目。在六个数据集上进行的大量实验证明了 Recformer 对于顺序推荐的有效性,特别是在低资源和冷启动环境下。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Text+Is+All+You+Need:+Learning+Language+Representations+for+Sequential+Recommendation)|0| +|[Accelerating Personalized PageRank Vector Computation](https://doi.org/10.1145/3580305.3599251)|Zhen Chen, Xingzhi Guo, Baojian Zhou, Deqing Yang, Steven Skiena|State University of New York at Stony Brook; Fudan University|Personalized PageRank Vectors are widely used as fundamental graph-learning tools for detecting anomalous spammers, learning graph embeddings, and training graph neural networks. The well-known local FwdPush algorithm approximates PPVs and has a sublinear rate of $O\big(\frac{1}{\alpha\epsilon}\big)$. A recent study found that when high precision is required, FwdPush is similar to the power iteration method, and its run time is pessimistically bounded by $O\big(\frac{m}{\alpha} \log\frac{1}{\epsilon}\big)$. This paper looks closely at calculating PPVs for both directed and undirected graphs. By leveraging the linear invariant property, we show that FwdPush is a variant of Gauss-Seidel and propose a Successive Over-Relaxation based method, FwdPushSOR, to speed it up by slightly modifying FwdPush. Additionally, we prove FwdPush has a local linear convergence rate $O\big(\tfrac{\text{vol}(S)}{\alpha} \log\tfrac{1}{\epsilon}\big)$, enjoying the advantages of two existing bounds. We also design a new local heuristic push method that reduces the number of operations by 10-50 percent compared to FwdPush. For undirected graphs, we propose two momentum-based acceleration methods that can be expressed as one-line updates and speed up non-acceleration methods by $\mathcal{O}\big(\tfrac{1}{\sqrt{\alpha}}\big)$. Our experiments on six real-world graph datasets confirm the efficiency of FwdPushSOR and the acceleration methods for directed and undirected graphs, respectively.|个性化 PageRank 向量广泛用作基本的图学习工具,用于检测异常垃圾邮件发送者、学习图嵌入和训练图神经网络。著名的局部 FwdPush 算法用于近似计算 PPV,具有 $O\big(\frac{1}{\alpha\epsilon}\big)$ 的次线性速率。最近的一项研究发现,当需要高精度时,FwdPush 类似于幂迭代法,其运行时间的悲观上界为 $O\big(\frac{m}{\alpha}\log\frac{1}{\epsilon}\big)$。本文主要研究有向图和无向图的 PPV 的计算。通过利用线性不变性,我们证明了 FwdPush 是 Gauss-Seidel 的一个变体,并提出了一个基于逐次超松弛法的方法 FwdPushSOR,通过稍微修改 FwdPush 来加速它。另外,我们证明了 FwdPush 具有局部线性收敛速率 $O\big(\tfrac{\text{vol}(S)}{\alpha}\log\tfrac{1}{\epsilon}\big)$,兼具两个现有界的优点。我们还设计了一种新的局部启发式推送方法,与 FwdPush 相比减少了10-50% 的操作次数。对于无向图,我们提出了两种基于动量的加速方法,它们可以表示为单行更新,并将非加速方法加速 $\mathcal{O}\big(\tfrac{1}{\sqrt{\alpha}}\big)$。我们在六个真实图数据集上的实验分别证实了 FwdPushSOR 以及针对有向图和无向图的加速方法的效率。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Accelerating+Personalized+PageRank+Vector+Computation)|0| +|[Text Is All You Need: Learning Language Representations for Sequential Recommendation](https://doi.org/10.1145/3580305.3599519)|Jiacheng Li, Ming Wang, Jin Li, Jinmiao Fu, Xin Shen, Jingbo Shang, Julian J.
McAuley|Amazon; University of California, San Diego|Sequential recommendation aims to model dynamic user behavior from historical interactions. Existing methods rely on either explicit item IDs or general textual features for sequence modeling to understand user preferences. While promising, these approaches still struggle to model cold-start items or transfer knowledge to new datasets. In this paper, we propose to model user preferences and item features as language representations that can be generalized to new items and datasets. To this end, we present a novel framework, named Recformer, which effectively learns language representations for sequential recommendation. Specifically, we propose to formulate an item as a "sentence" (word sequence) by flattening item key-value attributes described by text so that an item sequence for a user becomes a sequence of sentences. For recommendation, Recformer is trained to understand the "sentence" sequence and retrieve the next "sentence". To encode item sequences, we design a bi-directional Transformer similar to the model Longformer but with different embedding layers for sequential recommendation. For effective representation learning, we propose novel pretraining and finetuning methods which combine language understanding and recommendation tasks. Therefore, Recformer can effectively recommend the next item based on language representations. Extensive experiments conducted on six datasets demonstrate the effectiveness of Recformer for sequential recommendation, especially in low-resource and cold-start settings.|顺序推荐旨在从历史交互中建立动态用户行为模型。现有的方法依赖于显式的项 ID 或一般的文本特性来进行序列建模,以理解用户的首选项。尽管这些方法很有前途,但它们仍然难以对冷启动项目进行建模或将知识转移到新的数据集中。本文提出将用户偏好和项目特征建模为语言表示,并将其推广到新的项目和数据集。为此,我们提出了一个新的框架,称为 Recformer,它有效地学习用于顺序推荐的语言表示。具体来说,我们建议通过将文本所描述的项目键值属性扁平化,将项目表述为“句子”(单词序列) ,从而使用户的项目序列成为一个句子序列。为了便于推荐,Recformer 接受了理解“句子”序列和检索下一个“句子”的训练。为了对项目序列进行编码,我们设计了一个类似于 Longformer 但具有不同嵌入层的双向 Transformer,用于顺序推荐。为了有效地进行表征学习,我们提出了一种新的预训练和微调方法,将语言理解和推荐任务结合起来。因此,Recformer 可以根据语言表示有效地推荐下一个项目。在六个数据集上进行的大量实验证明了 Recformer 对于顺序推荐的有效性,特别是在低资源和冷启动环境下。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Text+Is+All+You+Need:+Learning+Language+Representations+for+Sequential+Recommendation)|0| |[MAP: A Model-agnostic Pretraining Framework for Click-through Rate Prediction](https://doi.org/10.1145/3580305.3599422)|Jianghao Lin, Yanru Qu, Wei Guo, Xinyi Dai, Ruiming Tang, Yong Yu, Weinan Zhang|Shanghai Jiao Tong University; Huawei Noah's Ark Lab|With the widespread application of personalized online services, click-through rate (CTR) prediction has received more and more attention and research. The most prominent features of CTR prediction are its multi-field categorical data format, and vast and daily-growing data volume. The large capacity of neural models helps digest such massive amounts of data under the supervised learning paradigm, yet they fail to utilize the substantial data to its full potential, since the 1-bit click signal is not sufficient to guide the model to learn capable representations of features and instances. The self-supervised learning paradigm provides a more promising pretrain-finetune solution to better exploit the large amount of user click logs, and learn more generalized and effective representations. However, self-supervised learning for CTR prediction is still an open question, since current works on this line are only preliminary and rudimentary.
To this end, we propose a Model-agnostic pretraining (MAP) framework that applies feature corruption and recovery on multi-field categorical data, and more specifically, we derive two practical algorithms: masked feature prediction (MFP) and replaced feature detection (RFD). MFP digs into feature interactions within each instance through masking and predicting a small portion of input features, and introduces noise contrastive estimation (NCE) to handle large feature spaces. RFD further turns MFP into a binary classification mode through replacing and detecting changes in input features, making it even simpler and more effective for CTR pretraining. Our extensive experiments on two real-world large-scale datasets (i.e., Avazu, Criteo) demonstrate the advantages of these two methods on several strong backbones (e.g., DCNv2, DeepFM), achieving new state-of-the-art performance in terms of both effectiveness and efficiency for CTR prediction.|随着个性化网上服务的广泛应用,点进率预测越来越受到重视和研究。CTR 预测最突出的特点是它的多领域分类数据格式,以及海量和日益增长的数据量。神经模型的巨大容量有助于在监督式学习范式下消化如此大量的数据,但它们未能充分利用大量的数据,因为1位点击信号不足以指导模型学习特征和实例的能力表示。自监督学习范式为更好地利用大量的用户点击日志,学习更广泛和有效的表示提供了一种更有前途的预训练-微调解决方案。然而,自我监督学习的 CTR 预测仍然是一个悬而未决的问题,因为目前在这方面的工作只是初步和基础。为此,我们提出了一个模型无关预训练(model-agnostic pretraining, MAP)框架,该框架将特征损坏和恢复应用于多领域分类数据,更具体地说,我们推导出两种实用算法: 掩码特征预测(MFP)和替换特征检测(RFD)。MFP 通过屏蔽和预测一小部分输入特征,深入挖掘每个实例中的特征交互,并引入噪声对比估计(NCE)来处理较大的特征空间。RFD 通过替换和检测输入特征的变化,进一步将 MFP 转化为二分类模式,使 CTR 预训练更加简单有效。我们在两个真实世界的大规模数据集(即 Avazu 和 Criteo)上的广泛实验证明了这两种方法在几个强骨干(例如 DCNv2、DeepFM)上的优势,并在有效性和效率方面实现了新的最先进的 CTR 预测性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=MAP:+A+Model-agnostic+Pretraining+Framework+for+Click-through+Rate+Prediction)|0| -|[Learning to Relate to Previous Turns in Conversational Search](https://doi.org/10.1145/3580305.3599411)|Fengran Mo, JianYun Nie, Kaiyu Huang, Kelong Mao, Yutao Zhu, Peng Li, Yang Liu|University of Montreal; Tsinghua University; Renmin University of China|Conversational search allows a user to interact with a search system in multiple turns. A query is strongly dependent on the conversation context. An effective way to improve retrieval effectiveness is to expand the current query with historical queries. However, not all the previous queries are related to, and useful for expanding the current query. In this paper, we propose a new method to select relevant historical queries that are useful for the current query. To cope with the lack of labeled training data, we use a pseudo-labeling approach to annotate useful historical queries based on their impact on the retrieval results. The pseudo-labeled data are used to train a selection model. We further propose a multi-task learning framework to jointly train the selector and the retriever during fine-tuning, allowing us to mitigate the possible inconsistency between the pseudo labels and the changed retriever.
Extensive experiments on four conversational search datasets demonstrate the effectiveness and broad applicability of our method compared with several strong baselines.|会话搜索允许用户多次与搜索系统交互。查询强烈依赖于会话上下文。提高检索效率的一个有效方法是使用历史查询扩展当前查询。但是,并非所有以前的查询都与之相关,并且对于展开当前查询非常有用。本文提出了一种新的方法来选择对当前查询有用的相关历史查询。为了解决缺乏标记训练数据的问题,我们使用伪标记方法根据有用的历史查询对检索结果的影响来注释它们。利用伪标记数据训练选择模型。我们进一步提出了一个多任务学习框架,在微调过程中联合训练选择器和检索器,使我们能够减轻伪标签和更改后的检索器之间可能的不一致性。通过对四个会话搜索数据集的大量实验,证明了该方法的有效性和广泛的适用性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+to+Relate+to+Previous+Turns+in+Conversational+Search)|0| +|[Learning to Relate to Previous Turns in Conversational Search](https://doi.org/10.1145/3580305.3599411)|Fengran Mo, JianYun Nie, Kaiyu Huang, Kelong Mao, Yutao Zhu, Peng Li, Yang Liu|Tsinghua University; University of Montreal; Renmin University of China|Conversational search allows a user to interact with a search system in multiple turns. A query is strongly dependent on the conversation context. An effective way to improve retrieval effectiveness is to expand the current query with historical queries. However, not all the previous queries are related to, and useful for expanding the current query. In this paper, we propose a new method to select relevant historical queries that are useful for the current query. To cope with the lack of labeled training data, we use a pseudo-labeling approach to annotate useful historical queries based on their impact on the retrieval results. The pseudo-labeled data are used to train a selection model. We further propose a multi-task learning framework to jointly train the selector and the retriever during fine-tuning, allowing us to mitigate the possible inconsistency between the pseudo labels and the changed retriever. Extensive experiments on four conversational search datasets demonstrate the effectiveness and broad applicability of our method compared with several strong baselines.|会话搜索允许用户多次与搜索系统交互。查询强烈依赖于会话上下文。提高检索效率的一个有效方法是使用历史查询扩展当前查询。但是,并非所有以前的查询都与之相关,并且对于展开当前查询非常有用。本文提出了一种新的方法来选择对当前查询有用的相关历史查询。为了解决缺乏标记训练数据的问题,我们使用伪标记方法根据有用的历史查询对检索结果的影响来注释它们。利用伪标记数据训练选择模型。我们进一步提出了一个多任务学习框架,在微调过程中联合训练选择器和检索器,使我们能够减轻伪标签和更改后的检索器之间可能的不一致性。通过对四个会话搜索数据集的大量实验,证明了该方法的有效性和广泛的适用性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+to+Relate+to+Previous+Turns+in+Conversational+Search)|0| |[PSLOG: Pretraining with Search Logs for Document Ranking](https://doi.org/10.1145/3580305.3599477)|Zhan Su, Zhicheng Dou, Yujia Zhou, Ziyuan Zhao, JiRong Wen||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=PSLOG:+Pretraining+with+Search+Logs+for+Document+Ranking)|0| |[Improving Conversational Recommendation Systems via Counterfactual Data Simulation](https://doi.org/10.1145/3580305.3599387)|Xiaolei Wang, Kun Zhou, Xinyu Tang, Wayne Xin Zhao, Fan Pan, Zhao Cao, JiRong Wen|Huawei Poisson Lab; Renmin University of China|Conversational recommender systems (CRSs) aim to provide recommendation services via natural language conversations. Although a number of approaches have been proposed for developing capable CRSs, they typically rely on sufficient training data for training. Since it is difficult to annotate recommendation-oriented dialogue datasets, existing CRS approaches often suffer from the issue of insufficient training due to the scarcity of training data. To address this issue, in this paper, we propose a CounterFactual data simulation approach for CRS, named CFCRS, to alleviate the issue of data scarcity in CRSs. 
Our approach is developed based on the framework of counterfactual data augmentation, which gradually incorporates the rewriting to the user preference from a real dialogue without interfering with the entire conversation flow. To develop our approach, we characterize user preference and organize the conversation flow by the entities involved in the dialogue, and design a multi-stage recommendation dialogue simulator based on a conversation flow language model. Under the guidance of the learned user preference and dialogue schema, the flow language model can produce reasonable, coherent conversation flows, which can be further realized into complete dialogues. Based on the simulator, we perform the intervention at the representations of the interacted entities of target users, and design an adversarial training method with a curriculum schedule that can gradually optimize the data augmentation strategy. Extensive experiments show that our approach can consistently boost the performance of several competitive CRSs, and outperform other data augmentation methods, especially when the training data is limited. Our code is publicly available at https://github.com/RUCAIBox/CFCRS.|会话推荐系统(CRS)旨在通过自然语言对话提供推荐服务。虽然已经提出了一些开发有能力的 CRS 的方法,但它们通常依赖于足够的培训数据进行培训。由于很难对面向建议的对话数据集进行注释,现有的 CRS 方法往往因缺乏培训数据而面临培训不足的问题。为了解决这一问题,本文提出了一种 CRS 的 CounterFact 数据模拟方法 CFCRS,以缓解 CRS 中的数据稀缺问题。我们的方法是在反事实数据增强框架的基础上发展起来的,该框架在不干扰整个会话流程的情况下,逐渐将真实对话中的用户偏好重写纳入其中。为了开发这种方法,我们描述了用户偏好的特征,并根据对话所涉及的实体组织了对话流程,设计了一个基于对话流程语言模型的多阶段推荐对话模拟器。在用户偏好和对话模式的指导下,流语言模型可以产生合理、连贯的会话流,进一步实现完整的对话。在该模拟器的基础上,对目标用户交互实体的表示进行干预,设计了一种基于课程表的对抗性训练方法,可以逐步优化数据增强策略。大量实验表明,该方法可以持续提高多个竞争性 CRS 的性能,并且优于其他数据增强方法,特别是在训练数据有限的情况下。我们的代码可以在 https://github.com/rucaibox/cfcrs 上公开获取。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Improving+Conversational+Recommendation+Systems+via+Counterfactual+Data+Simulation)|0|
|[Efficient and Joint Hyperparameter and Architecture Search for Collaborative Filtering](https://doi.org/10.1145/3580305.3599322)|Yan Wen, Chen Gao, Lingling Yi, Liwei Qiu, Yaqing Wang, Yong Li|Tsinghua University; Tencent Inc.; Baidu Inc.|Automated Machine Learning (AutoML) techniques have recently been introduced to design Collaborative Filtering (CF) models in a data-specific manner. However, existing works either search architectures or hyperparameters while ignoring the fact that they are intrinsically related and should be considered together. This motivates us to consider a joint hyperparameter and architecture search method to design CF models. However, this is not easy because of the large search space and high evaluation cost. To solve these challenges, we reduce the space by screening out useless hyperparameter choices through a comprehensive understanding of individual hyperparameters. Next, we propose a two-stage search algorithm to find proper configurations from the reduced space. In the first stage, we leverage knowledge from subsampled datasets to reduce evaluation costs; in the second stage, we efficiently fine-tune top candidate models on the whole dataset. Extensive experiments on real-world datasets show better performance can be achieved compared with both hand-designed and previously searched models. 
Besides, ablation and case studies demonstrate the effectiveness of our search framework.|自动机器学习(AutoML)技术最近被引入到设计特定数据的协同过滤模型(CF)中。然而,现有的工作要么搜索体系结构或超参数,而忽略了这些内在联系的事实,应该一起考虑。这促使我们考虑联合使用超参数和体系结构搜索方法来设计 CF 模型。然而,这并不容易,因为大搜索空间和高评价成本。为了解决这些挑战,我们通过全面理解各个超参数筛选出有用的超参数选择来减少空间。接下来,我们提出了一个两阶段的搜索算法,以找到适当的配置从缩减的空间。在第一阶段,我们利用次采样数据集的知识来降低评估成本; 在第二阶段,我们有效地微调整整个数据集上的顶级候选模型。在真实世界数据集上的大量实验表明,与手工设计和以前的搜索模型相比,该算法可以获得更好的性能。此外,消融和案例研究证明了我们的搜索框架的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Efficient+and+Joint+Hyperparameter+and+Architecture+Search+for+Collaborative+Filtering)|0| |[Efficient Single-Source SimRank Query by Path Aggregation](https://doi.org/10.1145/3580305.3599328)|Mingxi Zhang, Yanghua Xiao, Wei Wang||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Efficient+Single-Source+SimRank+Query+by+Path+Aggregation)|0| |[Adaptive Disentangled Transformer for Sequential Recommendation](https://doi.org/10.1145/3580305.3599253)|Yipeng Zhang, Xin Wang, Hong Chen, Wenwu Zhu||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Adaptive+Disentangled+Transformer+for+Sequential+Recommendation)|0| |[CADENCE: Offline Category Constrained and Diverse Query Generation for E-commerce Autosuggest](https://doi.org/10.1145/3580305.3599787)|Abhinav Anand, Surender Kumar, Nandeesh Kumar, Samir Shah||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=CADENCE:+Offline+Category+Constrained+and+Diverse+Query+Generation+for+E-commerce+Autosuggest)|0| -|[PEPNet: Parameter and Embedding Personalized Network for Infusing with Personalized Prior Information](https://doi.org/10.1145/3580305.3599884)|Jianxin Chang, Chenbin Zhang, Yiqun Hui, Dewei Leng, Yanan Niu, Yang Song, Kun Gai|Kuaishou Technology; Unaffiliated|With the increase of content pages and display styles in online services such as online-shopping and video-watching websites, industrial-scale recommender systems face challenges in multi-domain and multi-task recommendations. The core of multi-task and multi-domain recommendation is to accurately capture user interests in different domains given different user behaviors. In this paper, we propose a plug-and-play \textit{\textbf{P}arameter and \textbf{E}mbedding \textbf{P}ersonalized \textbf{Net}work (\textbf{PEPNet})} for multi-task recommendation in the multi-domain setting. PEPNet takes features with strong biases as input and dynamically scales the bottom-layer embeddings and the top-layer DNN hidden units in the model through a gate mechanism. By mapping personalized priors to scaling weights ranging from 0 to 2, PEPNet introduces both parameter personalization and embedding personalization. Embedding Personalized Network (EPNet) selects and aligns embeddings with different semantics under multiple domains. Parameter Personalized Network (PPNet) influences DNN parameters to balance interdependent targets in multiple tasks. We have made a series of special engineering optimizations combining the Kuaishou training framework and the online deployment environment. We have deployed the model in Kuaishou apps, serving over 300 million daily users. Both online and offline experiments have demonstrated substantial improvements in multiple metrics. 
In particular, we have seen a more than 1\% online increase in three major scenarios.|随着在线购物和视频观看网站等在线服务内容页面和显示方式的增加,工业规模的推荐系统面临着多领域、多任务推荐的挑战。多任务、多领域推荐的核心是根据不同的用户行为准确捕获不同领域的用户兴趣。本文针对多领域环境下的多任务推荐问题,提出了一种即插即用的文本参数{ textbf { P }参数和 textbf { E }嵌入式 textbf { P }个性化 textbf { Net } work (textbf { PEPNet })}。PEPNet 以具有强偏差的特征作为输入,通过门机制动态扩展模型中的底层嵌入和顶层 DNN 隐藏单元。通过将个性化前期映射到0到2之间的权重,PEPNet 引入了参数个性化和嵌入个性化。嵌入式个性化网络(EPNet)在多个域下选择和对齐具有不同语义的嵌入式。参数个性化网络(PPNet)影响 DNN 参数以平衡多任务中相互依赖的目标。我们结合快手培训框架和在线部署环境,进行了一系列特殊的工程优化。我们在 Kuaishou 的应用程序中采用了这种模式,每天为超过3亿用户提供服务。这两个在线和离线的实验都显示了在多个指标方面的重大改进。特别是,我们已经看到在三种主要情况下在线增长超过1% 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=PEPNet:+Parameter+and+Embedding+Personalized+Network+for+Infusing+with+Personalized+Prior+Information)|0| +|[PEPNet: Parameter and Embedding Personalized Network for Infusing with Personalized Prior Information](https://doi.org/10.1145/3580305.3599884)|Jianxin Chang, Chenbin Zhang, Yiqun Hui, Dewei Leng, Yanan Niu, Yang Song, Kun Gai|Unaffiliated; Kuaishou Technology|With the increase of content pages and display styles in online services such as online-shopping and video-watching websites, industrial-scale recommender systems face challenges in multi-domain and multi-task recommendations. The core of multi-task and multi-domain recommendation is to accurately capture user interests in different domains given different user behaviors. In this paper, we propose a plug-and-play \textit{\textbf{P}arameter and \textbf{E}mbedding \textbf{P}ersonalized \textbf{Net}work (\textbf{PEPNet})} for multi-task recommendation in the multi-domain setting. PEPNet takes features with strong biases as input and dynamically scales the bottom-layer embeddings and the top-layer DNN hidden units in the model through a gate mechanism. By mapping personalized priors to scaling weights ranging from 0 to 2, PEPNet introduces both parameter personalization and embedding personalization. Embedding Personalized Network (EPNet) selects and aligns embeddings with different semantics under multiple domains. Parameter Personalized Network (PPNet) influences DNN parameters to balance interdependent targets in multiple tasks. We have made a series of special engineering optimizations combining the Kuaishou training framework and the online deployment environment. We have deployed the model in Kuaishou apps, serving over 300 million daily users. Both online and offline experiments have demonstrated substantial improvements in multiple metrics. 
In particular, we have seen a more than 1\% online increase in three major scenarios.|随着在线购物和视频观看网站等在线服务内容页面和显示方式的增加,工业规模的推荐系统面临着多领域、多任务推荐的挑战。多任务、多领域推荐的核心是根据不同的用户行为准确捕获不同领域的用户兴趣。本文针对多领域环境下的多任务推荐问题,提出了一种即插即用的文本参数{ textbf { P }参数和 textbf { E }嵌入式 textbf { P }个性化 textbf { Net } work (textbf { PEPNet })}。PEPNet 以具有强偏差的特征作为输入,通过门机制动态扩展模型中的底层嵌入和顶层 DNN 隐藏单元。通过将个性化前期映射到0到2之间的权重,PEPNet 引入了参数个性化和嵌入个性化。嵌入式个性化网络(EPNet)在多个域下选择和对齐具有不同语义的嵌入式。参数个性化网络(PPNet)影响 DNN 参数以平衡多任务中相互依赖的目标。我们结合快手培训框架和在线部署环境,进行了一系列特殊的工程优化。我们在 Kuaishou 的应用程序中采用了这种模式,每天为超过3亿用户提供服务。这两个在线和离线的实验都显示了在多个指标方面的重大改进。特别是,我们已经看到在三种主要情况下在线增长超过1% 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=PEPNet:+Parameter+and+Embedding+Personalized+Network+for+Infusing+with+Personalized+Prior+Information)|0| |[Controllable Multi-Objective Re-ranking with Policy Hypernetworks](https://doi.org/10.1145/3580305.3599796)|Sirui Chen, Yuan Wang, Zijing Wen, Zhiyu Li, Changshuo Zhang, Xiao Zhang, Quan Lin, Cheng Zhu, Jun Xu||Multi-stage ranking pipelines have become widely used strategies in modern recommender systems, where the final stage aims to return a ranked list of items that balances a number of requirements such as user preference, diversity, novelty etc. Linear scalarization is arguably the most widely used technique to merge multiple requirements into one optimization objective, by summing up the requirements with certain preference weights. Existing final-stage ranking methods often adopt a static model where the preference weights are determined during offline training and kept unchanged during online serving. Whenever a modification of the preference weights is needed, the model has to be re-trained, which is time and resources inefficient. Meanwhile, the most appropriate weights may vary greatly for different groups of targeting users or at different time periods (e.g., during holiday promotions). In this paper, we propose a framework called controllable multi-objective re-ranking (CMR) which incorporates a hypernetwork to generate parameters for a re-ranking model according to different preference weights. In this way, CMR is enabled to adapt the preference weights according to the environment changes in an online manner, without retraining the models. Moreover, we classify practical business-oriented tasks into four main categories and seamlessly incorporate them in a new proposed re-ranking model based on an Actor-Evaluator framework, which serves as a reliable real-world testbed for CMR. Offline experiments based on the dataset collected from Taobao App showed that CMR improved several popular re-ranking models by using them as underlying models. 
Online A/B tests also demonstrated the effectiveness and trustworthiness of CMR.|多阶段排名管道已经成为现代推荐系统中广泛使用的策略,最后阶段的目标是返回一个项目的排名列表,平衡用户偏好、多样性、新颖性等要求。线性标量可以说是最广泛使用的技术合并多个需求到一个优化目标,通过总结需求与一定的偏好权重。现有的最后阶段排序方法通常采用静态模型,在离线训练时确定偏好权重,在线服务时保持不变。当需要修改偏好权重时,模型必须重新训练,这是时间和资源效率低下的。与此同时,最合适的权重可能会因不同的目标用户群体或在不同的时间段(例如,在假日促销期间)而有很大差异。本文提出了一种可控的多目标重排序(CMR)框架,该框架结合了一个超网络,根据不同的偏好权重为重排序模型生成参数。通过这种方式,CMR 能够根据环境变化在线调整偏好权重,而不需要重新训练模型。此外,我们将面向业务的实际任务分为四个主要类别,并将它们无缝地纳入一个新提出的重新排序模型,该模型基于一个演员-评估者框架,作为一个可靠的现实世界的 CMR 测试平台。基于从淘宝应用收集的数据集的离线实验表明,CMR 通过使用它们作为基础模型改进了几个流行的重新排名模型。在线 A/B 测试也证明了 CMR 的有效性和可信性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Controllable+Multi-Objective+Re-ranking+with+Policy+Hypernetworks)|0| |[CT4Rec: Simple yet Effective Consistency Training for Sequential Recommendation](https://doi.org/10.1145/3580305.3599798)|Liu Chong, Xiaoyang Liu, Rongqin Zheng, Lixin Zhang, Xiaobo Liang, Juntao Li, Lijun Wu, Min Zhang, Leyu Lin||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=CT4Rec:+Simple+yet+Effective+Consistency+Training+for+Sequential+Recommendation)|0| |[S2phere: Semi-Supervised Pre-training for Web Search over Heterogeneous Learning to Rank Data](https://doi.org/10.1145/3580305.3599935)|Yuchen Li, Haoyi Xiong, Linghe Kong, Qingzhong Wang, Shuaiqiang Wang, Guihai Chen, Dawei Yin||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=S2phere:+Semi-Supervised+Pre-training+for+Web+Search+over+Heterogeneous+Learning+to+Rank+Data)|0| @@ -41,10 +41,10 @@ |[PASS: Personalized Advertiser-aware Sponsored Search](https://doi.org/10.1145/3580305.3599882)|Zhoujin Tian, Chaozhuo Li, Zhiqiang Zuo, Zengxuan Wen, Lichao Sun, Xinyue Hu, Wen Zhang, Haizhen Huang, Senzhang Wang, Weiwei Deng, Xing Xie, Qi Zhang||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=PASS:+Personalized+Advertiser-aware+Sponsored+Search)|0| |[Towards Fairness in Personalized Ads Using Impression Variance Aware Reinforcement Learning](https://doi.org/10.1145/3580305.3599916)|Aditya Srinivas Timmaraju, Mehdi Mashayekhi, Mingliang Chen, Qi Zeng, Quintin Fettes, Wesley Cheung, Yihan Xiao, Manojkumar Rangasamy Kannadasan, Pushkar Tripathi, Sean Gahagan, Miranda Bogen, Rob Roudani|Meta|Variances in ad impression outcomes across demographic groups are increasingly considered to be potentially indicative of algorithmic bias in personalized ads systems. While there are many definitions of fairness that could be applicable in the context of personalized systems, we present a framework which we call the Variance Reduction System (VRS) for achieving more equitable outcomes in Meta's ads systems. VRS seeks to achieve a distribution of impressions with respect to selected protected class (PC) attributes that more closely aligns the demographics of an ad's eligible audience (a function of advertiser targeting criteria) with the audience who sees that ad, in a privacy-preserving manner. We first define metrics to quantify fairness gaps in terms of ad impression variances with respect to PC attributes including gender and estimated race. We then present the VRS for re-ranking ads in an impression variance-aware manner. We evaluate VRS via extensive simulations over different parameter choices and study the effect of the VRS on the chosen fairness metric. We finally present online A/B testing results from applying VRS to Meta's ads systems, concluding with a discussion of future work. We have deployed the VRS to all users in the US for housing ads, resulting in significant improvement in our fairness metric. 
VRS is the first large-scale deployed framework for pursuing fairness for multiple PC attributes in online advertising.|不同人群的广告印象结果的差异越来越被认为是个性化广告系统中算法偏差的潜在指示。虽然有许多公平的定义,可以适用于个性化系统的背景下,我们提出了一个框架,我们称之为方差减少系统(VRS) ,以实现更公平的结果在元数据的广告系统。VRS 试图通过选定的受保护类别(PC)属性来实现印象的分布,从而以保护隐私的方式将广告合格受众的人口统计数据(广告客户定位标准的功能)与看到该广告的受众的人口统计数据更紧密地联系起来。我们首先定义指标来量化广告印象差异的公平性差距方面的个人电脑属性,包括性别和估计的种族。然后,我们提出了一个印象方差感知的方式重新排名广告的 VRS。我们通过对不同参数选择的大量仿真来评估 VRS,并研究 VRS 对选择的公平性度量的影响。最后给出了 VRS 应用于 Meta 广告系统的在线 A/B 测试结果,并对今后的工作进行了讨论。我们已在美国所有用户的住房广告中部署了 VRS,从而显著改善了我们的公平性指标。VRS 是第一个大规模部署的框架,以追求公平的多个个人电脑属性在网上广告。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Towards+Fairness+in+Personalized+Ads+Using+Impression+Variance+Aware+Reinforcement+Learning)|0| |[PlanRanker: Towards Personalized Ranking of Train Transfer Plans](https://doi.org/10.1145/3580305.3599887)|Jia Xu, Wanjie Tao, Zulong Chen, Jin Huang, Huihui Liu, Hong Wen, Shenghua Ni, Qun Dai, Yu Gu||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=PlanRanker:+Towards+Personalized+Ranking+of+Train+Transfer+Plans)|0| -|[Multi-factor Sequential Re-ranking with Perception-Aware Diversification](https://doi.org/10.1145/3580305.3599869)|Yue Xu, Hao Chen, Zefan Wang, Jianwen Yin, Qijie Shen, Dimin Wang, Feiran Huang, Lixiang Lai, Tao Zhuang, Junfeng Ge, Xia Hu|Alibaba Group; Jinan University; The Hong Kong Polytechnic University; Rice University|Feed recommendation systems, which recommend a sequence of items for users to browse and interact with, have gained significant popularity in practical applications. In feed products, users tend to browse a large number of items in succession, so the previously viewed items have a significant impact on users' behavior towards the following items. Therefore, traditional methods that mainly focus on improving the accuracy of recommended items are suboptimal for feed recommendations because they may recommend highly similar items. For feed recommendation, it is crucial to consider both the accuracy and diversity of the recommended item sequences in order to satisfy users' evolving interest when consecutively viewing items. To this end, this work proposes a general re-ranking framework named Multi-factor Sequential Re-ranking with Perception-Aware Diversification (MPAD) to jointly optimize accuracy and diversity for feed recommendation in a sequential manner. Specifically, MPAD first extracts users' different scales of interests from their behavior sequences through graph clustering-based aggregations. Then, MPAD proposes two sub-models to respectively evaluate the accuracy and diversity of a given item by capturing users' evolving interest due to the ever-changing context and users' personal perception of diversity from an item sequence perspective. This is consistent with the browsing nature of the feed scenario. Finally, MPAD generates the return list by sequentially selecting optimal items from the candidate set to maximize the joint benefits of accuracy and diversity of the entire list. 
MPAD has been implemented in Taobao's homepage feed to serve the main traffic and provide services to recommend billions of items to hundreds of millions of users every day.|提要推荐系统为用户推荐了一系列可供浏览和交互的条目,在实际应用中得到了广泛的应用。在提要产品中,用户倾向于连续浏览大量条目,因此以前查看的条目对用户对下列条目的行为有显著影响。因此,主要侧重于提高推荐项目准确性的传统方法对于饲料推荐是次优的,因为它们可能推荐高度相似的项目。为了满足用户在连续查看条目时不断变化的兴趣,对推荐条目序列的准确性和多样性进行考虑是至关重要的。为此,本文提出了一种基于感知多样化的多因素序贯推荐(MPAD)的通用推荐框架,该框架以序贯方式对推荐的准确性和多样性进行联合优化。具体来说,MPAD 首先通过基于图聚类的聚合从用户的行为序列中提取出用户不同尺度的兴趣。然后,MPAD 提出了两个子模型,分别从项目序列的角度通过捕获不断变化的用户兴趣和用户个人对多样性的感知来评价项目的准确性和多样性。这与提要场景的浏览特性一致。最后,MPAD 通过从候选集中依次选择最优项目来生成返回列表,以最大限度地提高整个列表的准确性和多样性。MPAD 已经在淘宝网的主页 feed 中实现,为主流流量提供服务,每天向数亿用户推荐数十亿个项目。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Multi-factor+Sequential+Re-ranking+with+Perception-Aware+Diversification)|0| -|[TWIN: TWo-stage Interest Network for Lifelong User Behavior Modeling in CTR Prediction at Kuaishou](https://doi.org/10.1145/3580305.3599922)|Jianxin Chang, Chenbin Zhang, Zhiyi Fu, Xiaoxue Zang, Lin Guan, Jing Lu, Yiqun Hui, Dewei Leng, Yanan Niu, Yang Song, Kun Gai|Kuaishou Technology; Unaffiliated|Life-long user behavior modeling, i.e., extracting a user's hidden interests from rich historical behaviors in months or even years, plays a central role in modern CTR prediction systems. Conventional algorithms mostly follow two cascading stages: a simple General Search Unit (GSU) for fast and coarse search over tens of thousands of long-term behaviors and an Exact Search Unit (ESU) for effective Target Attention (TA) over the small number of finalists from GSU. Although efficient, existing algorithms mostly suffer from a crucial limitation: the \textit{inconsistent} target-behavior relevance metrics between GSU and ESU. As a result, their GSU usually misses highly relevant behaviors but retrieves ones considered irrelevant by ESU. In such case, the TA in ESU, no matter how attention is allocated, mostly deviates from the real user interests and thus degrades the overall CTR prediction accuracy. To address such inconsistency, we propose \textbf{TWo-stage Interest Network (TWIN)}, where our Consistency-Preserved GSU (CP-GSU) adopts the identical target-behavior relevance metric as the TA in ESU, making the two stages twins. Specifically, to break TA's computational bottleneck and extend it from ESU to GSU, or namely from behavior length $10^2$ to length $10^4-10^5$, we build a novel attention mechanism by behavior feature splitting. For the video inherent features of a behavior, we calculate their linear projection by efficient pre-computing \& caching strategies. And for the user-item cross features, we compress each into a one-dimentional bias term in the attention score calculation to save the computational cost. 
The consistency between two stages, together with the effective TA-based relevance metric in CP-GSU, contributes to significant performance gain in CTR prediction.|终身用户行为建模,即在数月甚至数年内从丰富的历史行为中提取用户隐藏的兴趣,在现代 CTR 预测系统中起着核心作用。传统的算法大多遵循两个级联阶段: 一个简单的通用搜索单元(GSU)用于快速和粗略搜索成千上万的长期行为和一个精确搜索单元(ESU)用于有效的目标注意(TA)在少数决赛选手从 GSU。虽然有效,但现有的算法大多受到一个关键的限制: 文本{不一致}目标行为相关度量 GSU 和 ESU 之间。因此,他们的 GSU 通常会错过高度相关的行为,但检索被 ESU 认为无关的行为。在这种情况下,ESU 中的 TA,无论如何分配注意力,大多偏离了真实用户的兴趣,从而降低了整体 CTR 预测的准确性。为了解决这种不一致性,我们提出 textbf { TWo-stage Interest Network (TWIN)} ,其中我们的保持一致性的 GSU (CP-GSU)采用与 ESU 中的 TA 相同的目标行为相关度量,使两个阶段成为孪生的。具体来说,为了打破 TA 的计算瓶颈,将其从 ESU 扩展到 GSU,或者说从行为长度 $10 ^ 2 $扩展到长度 $10 ^ 4-10 ^ 5 $,我们通过行为特征分裂构建了一种新的注意机制。对于视频行为的固有特征,我们通过有效的预计算和缓存策略来计算它们的线性投影。对于用户-项目交叉特征,在注意得分计算中将每个特征压缩为一维偏差项,以节省计算成本。两个阶段之间的一致性,加上 CP-GSU 中有效的基于 TA 的相关度量,有助于提高 CTR 预测的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=TWIN:+TWo-stage+Interest+Network+for+Lifelong+User+Behavior+Modeling+in+CTR+Prediction+at+Kuaishou)|0| +|[Multi-factor Sequential Re-ranking with Perception-Aware Diversification](https://doi.org/10.1145/3580305.3599869)|Yue Xu, Hao Chen, Zefan Wang, Jianwen Yin, Qijie Shen, Dimin Wang, Feiran Huang, Lixiang Lai, Tao Zhuang, Junfeng Ge, Xia Hu|Jinan University; The Hong Kong Polytechnic University; Rice University; Alibaba Group|Feed recommendation systems, which recommend a sequence of items for users to browse and interact with, have gained significant popularity in practical applications. In feed products, users tend to browse a large number of items in succession, so the previously viewed items have a significant impact on users' behavior towards the following items. Therefore, traditional methods that mainly focus on improving the accuracy of recommended items are suboptimal for feed recommendations because they may recommend highly similar items. For feed recommendation, it is crucial to consider both the accuracy and diversity of the recommended item sequences in order to satisfy users' evolving interest when consecutively viewing items. To this end, this work proposes a general re-ranking framework named Multi-factor Sequential Re-ranking with Perception-Aware Diversification (MPAD) to jointly optimize accuracy and diversity for feed recommendation in a sequential manner. Specifically, MPAD first extracts users' different scales of interests from their behavior sequences through graph clustering-based aggregations. Then, MPAD proposes two sub-models to respectively evaluate the accuracy and diversity of a given item by capturing users' evolving interest due to the ever-changing context and users' personal perception of diversity from an item sequence perspective. This is consistent with the browsing nature of the feed scenario. Finally, MPAD generates the return list by sequentially selecting optimal items from the candidate set to maximize the joint benefits of accuracy and diversity of the entire list. 
MPAD has been implemented in Taobao's homepage feed to serve the main traffic and provide services to recommend billions of items to hundreds of millions of users every day.|提要推荐系统为用户推荐了一系列可供浏览和交互的条目,在实际应用中得到了广泛的应用。在提要产品中,用户倾向于连续浏览大量条目,因此以前查看的条目对用户对下列条目的行为有显著影响。因此,主要侧重于提高推荐项目准确性的传统方法对于饲料推荐是次优的,因为它们可能推荐高度相似的项目。为了满足用户在连续查看条目时不断变化的兴趣,对推荐条目序列的准确性和多样性进行考虑是至关重要的。为此,本文提出了一种基于感知多样化的多因素序贯推荐(MPAD)的通用推荐框架,该框架以序贯方式对推荐的准确性和多样性进行联合优化。具体来说,MPAD 首先通过基于图聚类的聚合从用户的行为序列中提取出用户不同尺度的兴趣。然后,MPAD 提出了两个子模型,分别从项目序列的角度通过捕获不断变化的用户兴趣和用户个人对多样性的感知来评价项目的准确性和多样性。这与提要场景的浏览特性一致。最后,MPAD 通过从候选集中依次选择最优项目来生成返回列表,以最大限度地提高整个列表的准确性和多样性。MPAD 已经在淘宝网的主页 feed 中实现,为主流流量提供服务,每天向数亿用户推荐数十亿个项目。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Multi-factor+Sequential+Re-ranking+with+Perception-Aware+Diversification)|0|
+|[TWIN: TWo-stage Interest Network for Lifelong User Behavior Modeling in CTR Prediction at Kuaishou](https://doi.org/10.1145/3580305.3599922)|Jianxin Chang, Chenbin Zhang, Zhiyi Fu, Xiaoxue Zang, Lin Guan, Jing Lu, Yiqun Hui, Dewei Leng, Yanan Niu, Yang Song, Kun Gai|Unaffiliated; Kuaishou Technology|Life-long user behavior modeling, i.e., extracting a user's hidden interests from rich historical behaviors in months or even years, plays a central role in modern CTR prediction systems. Conventional algorithms mostly follow two cascading stages: a simple General Search Unit (GSU) for fast and coarse search over tens of thousands of long-term behaviors and an Exact Search Unit (ESU) for effective Target Attention (TA) over the small number of finalists from GSU. Although efficient, existing algorithms mostly suffer from a crucial limitation: the \textit{inconsistent} target-behavior relevance metrics between GSU and ESU. As a result, their GSU usually misses highly relevant behaviors but retrieves ones considered irrelevant by ESU. In such cases, the TA in ESU, no matter how attention is allocated, mostly deviates from the real user interests and thus degrades the overall CTR prediction accuracy. To address such inconsistency, we propose \textbf{TWo-stage Interest Network (TWIN)}, where our Consistency-Preserved GSU (CP-GSU) adopts the identical target-behavior relevance metric as the TA in ESU, making the two stages twins. Specifically, to break TA's computational bottleneck and extend it from ESU to GSU, or namely from behavior length $10^2$ to length $10^4-10^5$, we build a novel attention mechanism by behavior feature splitting. For the video inherent features of a behavior, we calculate their linear projection by efficient pre-computing \& caching strategies. And for the user-item cross features, we compress each into a one-dimensional bias term in the attention score calculation to save the computational cost. 
The consistency between two stages, together with the effective TA-based relevance metric in CP-GSU, contributes to significant performance gain in CTR prediction.|终身用户行为建模,即在数月甚至数年内从丰富的历史行为中提取用户隐藏的兴趣,在现代 CTR 预测系统中起着核心作用。传统的算法大多遵循两个级联阶段: 一个简单的通用搜索单元(GSU)用于快速和粗略搜索成千上万的长期行为和一个精确搜索单元(ESU)用于有效的目标注意(TA)在少数决赛选手从 GSU。虽然有效,但现有的算法大多受到一个关键的限制: 文本{不一致}目标行为相关度量 GSU 和 ESU 之间。因此,他们的 GSU 通常会错过高度相关的行为,但检索被 ESU 认为无关的行为。在这种情况下,ESU 中的 TA,无论如何分配注意力,大多偏离了真实用户的兴趣,从而降低了整体 CTR 预测的准确性。为了解决这种不一致性,我们提出 textbf { TWo-stage Interest Network (TWIN)} ,其中我们的保持一致性的 GSU (CP-GSU)采用与 ESU 中的 TA 相同的目标行为相关度量,使两个阶段成为孪生的。具体来说,为了打破 TA 的计算瓶颈,将其从 ESU 扩展到 GSU,或者说从行为长度 $10 ^ 2 $扩展到长度 $10 ^ 4-10 ^ 5 $,我们通过行为特征分裂构建了一种新的注意机制。对于视频行为的固有特征,我们通过有效的预计算和缓存策略来计算它们的线性投影。对于用户-项目交叉特征,在注意得分计算中将每个特征压缩为一维偏差项,以节省计算成本。两个阶段之间的一致性,加上 CP-GSU 中有效的基于 TA 的相关度量,有助于提高 CTR 预测的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=TWIN:+TWo-stage+Interest+Network+for+Lifelong+User+Behavior+Modeling+in+CTR+Prediction+at+Kuaishou)|0|
|[On Manipulating Signals of User-Item Graph: A Jacobi Polynomial-based Graph Collaborative Filtering](https://doi.org/10.1145/3580305.3599450)|Jiayan Guo, Lun Du, Xu Chen, Xiaojun Ma, Qiang Fu, Shi Han, Dongmei Zhang, Yan Zhang|Microsoft; Peking University|Collaborative filtering (CF) is an important research direction in recommender systems that aims to make recommendations given the information on user-item interactions. Graph CF has attracted more and more attention in recent years due to its effectiveness in leveraging high-order information in the user-item bipartite graph for better recommendations. Specifically, recent studies show the success of graph neural networks (GNN) for CF is attributed to its low-pass filtering effects. However, current research lacks a study of how different signal components contribute to recommendations, and how to design strategies to use them properly. To this end, from the view of spectral transformation, we analyze the important factors that a graph filter should consider to achieve better performance. Based on the discoveries, we design JGCF, an efficient and effective method for CF based on Jacobi polynomial bases and frequency decomposition strategies. Extensive experiments on four widely used public datasets show the effectiveness and efficiency of the proposed methods, which brings up to 27.06% performance gain on Alibaba-iFashion. Besides, the experimental results also show that JGCF is better at handling sparse datasets, which shows potential in making recommendations for cold-start users.|协同过滤(CF)是推荐系统的一个重要研究方向,其目的是根据用户项目交互的信息提供推荐。近年来,Graph CF 由于能够有效地利用用户-项目双向图中的高阶信息来获得更好的建议而引起了越来越多的关注。具体来说,最近的研究表明,图神经网络(GNN)对 CF 的成功归功于其低通滤波效果。然而,目前的研究缺乏研究不同的信号成分如何有助于推荐,以及如何设计策略,以适当地使用它们。为此,本文从谱变换的角度出发,分析了图形滤波器要获得更好的性能所应考虑的重要因素。基于这些发现,我们设计了一种基于 Jacobi 多项式基和频率分解策略的高效率和有效的协同过滤方法。在四个广泛使用的公共数据集上进行的大量实验表明了该方法的有效性和效率,在阿里巴巴-iFashion 平台上最多获得27.06% 的性能增益。此外,实验结果还表明,JGCF 在处理稀疏数据集方面有较好的表现,可以为冷启动用户提供建议。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=On+Manipulating+Signals+of+User-Item+Graph:+A+Jacobi+Polynomial-based+Graph+Collaborative+Filtering)|0|
-|[Off-Policy Evaluation of Ranking Policies under Diverse User Behavior](https://doi.org/10.1145/3580305.3599447)|Haruka Kiyohara, Masatoshi Uehara, Yusuke Narita, Nobuyuki Shimizu, Yasuo Yamamoto, Yuta Saito|Hanjuku-Kaso Co., Ltd.; Yahoo Japan Corporation; Cornell University; Yale University|Ranking interfaces are everywhere in online platforms. 
There is thus an ever growing interest in their Off-Policy Evaluation (OPE), aiming towards an accurate performance evaluation of ranking policies using logged data. A de-facto approach for OPE is Inverse Propensity Scoring (IPS), which provides an unbiased and consistent value estimate. However, it becomes extremely inaccurate in the ranking setup due to its high variance under large action spaces. To deal with this problem, previous studies assume either independent or cascade user behavior, resulting in some ranking versions of IPS. While these estimators are somewhat effective in reducing the variance, all existing estimators apply a single universal assumption to every user, causing excessive bias and variance. Therefore, this work explores a far more general formulation where user behavior is diverse and can vary depending on the user context. We show that the resulting estimator, which we call Adaptive IPS (AIPS), can be unbiased under any complex user behavior. Moreover, AIPS achieves the minimum variance among all unbiased estimators based on IPS. We further develop a procedure to identify the appropriate user behavior model to minimize the mean squared error (MSE) of AIPS in a data-driven fashion. Extensive experiments demonstrate that the empirical accuracy improvement can be significant, enabling effective OPE of ranking systems even under diverse user behavior.|在线平台中,排序界面无处不在。因此,人们对非策略评估(OPE)越来越感兴趣,其目标是使用日志数据对策略进行准确的性能评估。OPE 的一个事实上的方法是反倾向评分(IPS) ,它提供了一个无偏和一致的价值估计。然而,它变得非常不准确的排名设置,由于其高方差下的大行动空间。为了解决这个问题,以前的研究假设独立或级联用户行为,导致一些排名版本的 IPS。虽然这些估计量在减少方差方面有一定的效果,但是所有现有的估计量都对每个用户适用一个统一的假设,从而导致过度的偏差和方差。因此,这项工作探索了一个更一般的公式,其中用户行为是多样的,可以根据用户上下文而变化。我们证明了所得到的估计量,我们称之为自适应 IPS (AIPS) ,在任何复杂的用户行为下都是无偏的。此外,AIPS 在所有基于 IPS 的无偏估计量之间实现了最小方差。我们进一步开发了一个程序,以确定适当的用户行为模型,从而以数据驱动的方式最大限度地减少 AIPS 的均方差。大量的实验表明,经验的准确性改善可以是显着的,使有效的排名系统的 OPE 即使在不同的用户行为。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Off-Policy+Evaluation+of+Ranking+Policies+under+Diverse+User+Behavior)|0| +|[Off-Policy Evaluation of Ranking Policies under Diverse User Behavior](https://doi.org/10.1145/3580305.3599447)|Haruka Kiyohara, Masatoshi Uehara, Yusuke Narita, Nobuyuki Shimizu, Yasuo Yamamoto, Yuta Saito|Yahoo Japan Corporation; Yale University; Hanjuku-Kaso Co., Ltd.; Cornell University|Ranking interfaces are everywhere in online platforms. There is thus an ever growing interest in their Off-Policy Evaluation (OPE), aiming towards an accurate performance evaluation of ranking policies using logged data. A de-facto approach for OPE is Inverse Propensity Scoring (IPS), which provides an unbiased and consistent value estimate. However, it becomes extremely inaccurate in the ranking setup due to its high variance under large action spaces. To deal with this problem, previous studies assume either independent or cascade user behavior, resulting in some ranking versions of IPS. While these estimators are somewhat effective in reducing the variance, all existing estimators apply a single universal assumption to every user, causing excessive bias and variance. Therefore, this work explores a far more general formulation where user behavior is diverse and can vary depending on the user context. We show that the resulting estimator, which we call Adaptive IPS (AIPS), can be unbiased under any complex user behavior. Moreover, AIPS achieves the minimum variance among all unbiased estimators based on IPS. 
We further develop a procedure to identify the appropriate user behavior model to minimize the mean squared error (MSE) of AIPS in a data-driven fashion. Extensive experiments demonstrate that the empirical accuracy improvement can be significant, enabling effective OPE of ranking systems even under diverse user behavior.|在线平台中,排序界面无处不在。因此,人们对非策略评估(OPE)越来越感兴趣,其目标是使用日志数据对策略进行准确的性能评估。OPE 的一个事实上的方法是反倾向评分(IPS) ,它提供了一个无偏和一致的价值估计。然而,它变得非常不准确的排名设置,由于其高方差下的大行动空间。为了解决这个问题,以前的研究假设独立或级联用户行为,导致一些排名版本的 IPS。虽然这些估计量在减少方差方面有一定的效果,但是所有现有的估计量都对每个用户适用一个统一的假设,从而导致过度的偏差和方差。因此,这项工作探索了一个更一般的公式,其中用户行为是多样的,可以根据用户上下文而变化。我们证明了所得到的估计量,我们称之为自适应 IPS (AIPS) ,在任何复杂的用户行为下都是无偏的。此外,AIPS 在所有基于 IPS 的无偏估计量之间实现了最小方差。我们进一步开发了一个程序,以确定适当的用户行为模型,从而以数据驱动的方式最大限度地减少 AIPS 的均方差。大量的实验表明,经验的准确性改善可以是显着的,使有效的排名系统的 OPE 即使在不同的用户行为。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Off-Policy+Evaluation+of+Ranking+Policies+under+Diverse+User+Behavior)|0| |[Impatient Bandits: Optimizing Recommendations for the Long-Term Without Delay](https://doi.org/10.1145/3580305.3599386)|Thomas M. McDonald, Lucas Maystre, Mounia Lalmas, Daniel Russo, Kamil Ciosek||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Impatient+Bandits:+Optimizing+Recommendations+for+the+Long-Term+Without+Delay)|0| |[Unbiased Delayed Feedback Label Correction for Conversion Rate Prediction](https://doi.org/10.1145/3580305.3599536)|Yifan Wang, Peijie Sun, Min Zhang, Qinglin Jia, Jingjie Li, Shaoping Ma|Tsinghua University; Noah’s Ark Lab, Huawei|Conversion rate prediction is critical to many online applications such as digital display advertising. To capture dynamic data distribution, industrial systems often require retraining models on recent data daily or weekly. However, the delay of conversion behavior usually leads to incorrect labeling, which is called delayed feedback problem. Existing work may fail to introduce the correct information about false negative samples due to data sparsity and dynamic data distribution. To directly introduce the correct feedback label information, we propose an Unbiased delayed feedback Label Correction framework (ULC), which uses an auxiliary model to correct labels for observed negative feedback samples. Firstly, we theoretically prove that the label-corrected loss is an unbiased estimate of the oracle loss using true labels. Then, as there are no ready training data for label correction, counterfactual labeling is used to construct artificial training data. Furthermore, since counterfactual labeling utilizes only partial training data, we design an embedding-based alternative training method to enhance performance. 
Comparative experiments on both public and private datasets and detailed analyses show that our proposed approach effectively alleviates the delayed feedback problem and consistently outperforms the previous state-of-the-art methods.|转化率预测是许多在线应用程序,如数字显示广告的关键。为了捕获动态数据分布,工业系统通常需要每天或每周对最近的数据进行再训练。然而,转换行为的延迟通常会导致不正确的标记,这就是所谓的延迟反馈问题。由于数据稀疏和数据分布的动态性,现有的工作可能无法引入正确的假阴性样本信息。为了直接引入正确的反馈标签信息,我们提出了一种无偏的延迟反馈标签校正框架(ULC) ,它使用一个辅助模型对观测到的负反馈样本进行标签校正。首先,我们从理论上证明了标签校正损失是使用真实标签对甲骨文损失进行的无偏估计。然后,由于没有现成的训练数据用于标签校正,采用反事实标注来构造人工训练数据。此外,由于反事实标注只利用部分训练数据,我们设计了一个基于嵌入的替代训练方法来提高性能。对公共和私人数据集的比较实验和详细的分析表明,我们提出的方法有效地缓解了延迟反馈问题,并始终优于以前的最先进的方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Unbiased+Delayed+Feedback+Label+Correction+for+Conversion+Rate+Prediction)|0| |[PrefRec: Recommender Systems with Human Preferences for Reinforcing Long-term User Engagement](https://doi.org/10.1145/3580305.3599473)|Wanqi Xue, Qingpeng Cai, Zhenghai Xue, Shuo Sun, Shuchang Liu, Dong Zheng, Peng Jiang, Kun Gai, Bo An||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=PrefRec:+Recommender+Systems+with+Human+Preferences+for+Reinforcing+Long-term+User+Engagement)|0| @@ -53,22 +53,22 @@ |[A Personalized Automated Bidding Framework for Fairness-aware Online Advertising](https://doi.org/10.1145/3580305.3599765)|Haoqi Zhang, Lvyin Niu, Zhenzhe Zheng, Zhilin Zhang, Shan Gu, Fan Wu, Chuan Yu, Jian Xu, Guihai Chen, Bo Zheng||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Personalized+Automated+Bidding+Framework+for+Fairness-aware+Online+Advertising)|0| |[Privacy Matters: Vertical Federated Linear Contextual Bandits for Privacy Protected Recommendation](https://doi.org/10.1145/3580305.3599475)|Zeyu Cao, Zhipeng Liang, Bingzhe Wu, Shu Zhang, Hangyu Li, Ouyang Wen, Yu Rong, Peilin Zhao||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Privacy+Matters:+Vertical+Federated+Linear+Contextual+Bandits+for+Privacy+Protected+Recommendation)|0| |[Approximation Algorithms for Size-Constrained Non-Monotone Submodular Maximization in Deterministic Linear Time](https://doi.org/10.1145/3580305.3599259)|Yixin Chen, Alan Kuhnle||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Approximation+Algorithms+for+Size-Constrained+Non-Monotone+Submodular+Maximization+in+Deterministic+Linear+Time)|0| -|[Constraint-aware and Ranking-distilled Token Pruning for Efficient Transformer Inference](https://doi.org/10.1145/3580305.3599284)|Junyan Li, Li Lyna Zhang, Jiahang Xu, Yujing Wang, Shaoguang Yan, Yunqing Xia, Yuqing Yang, Ting Cao, Hao Sun, Weiwei Deng, Qi Zhang, Mao Yang|Microsoft; Zhejiang University; Microsoft Research|Deploying pre-trained transformer models like BERT on downstream tasks in resource-constrained scenarios is challenging due to their high inference cost, which grows rapidly with input sequence length. In this work, we propose a constraint-aware and ranking-distilled token pruning method ToP, which selectively removes unnecessary tokens as input sequence passes through layers, allowing the model to improve online inference speed while preserving accuracy. ToP overcomes the limitation of inaccurate token importance ranking in the conventional self-attention mechanism through a ranking-distilled token distillation technique, which distills effective token rankings from the final layer of unpruned models to early layers of pruned models. 
Then, ToP introduces a coarse-to-fine pruning approach that automatically selects the optimal subset of transformer layers and optimizes token pruning decisions within these layers through improved $L_0$ regularization. Extensive experiments on GLUE benchmark and SQuAD tasks demonstrate that ToP outperforms state-of-the-art token pruning and model compression methods with improved accuracy and speedups. ToP reduces the average FLOPs of BERT by 8.1x while achieving competitive accuracy on GLUE, and provides a real latency speedup of up to 7.4x on an Intel CPU.|在资源受限的情况下,在下游任务中部署像 BERT 这样的预先训练的变压器模型是具有挑战性的,因为它们的推理成本很高,并且随着输入序列长度的增长而迅速增长。本文提出了一种基于约束和排序的令牌剪枝方法 TOP,该方法在输入序列通过层的同时选择性地去除不必要的令牌,使模型在保持精度的同时提高了在线推理速度。TOP 通过排序-提取令牌精馏技术克服了传统自注意机制中不准确的令牌重要性排序的局限性,该技术将有效的令牌排序从未修剪模型的最后一层提取到修剪模型的早期层。然后,TOP 引入了一种从粗到精的剪枝方法,该方法自动选择变压器层的最优子集,并通过改进的 $L _ 0 $正则化来优化这些变压器层内的令牌剪枝决策。在 GLUE 基准测试和 SQuAD 任务上的大量实验表明,ToP 优于最先进的令牌剪枝和模型压缩方法,具有更高的准确性和加速性。在 GLUE 上,TOP 减少了 BERT 的平均 FLOP 8.1 x,同时实现了具有竞争力的准确性,并且在 Intel CPU 上提供了高达7.4 x 的实际延迟加速。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Constraint-aware+and+Ranking-distilled+Token+Pruning+for+Efficient+Transformer+Inference)|0| +|[Constraint-aware and Ranking-distilled Token Pruning for Efficient Transformer Inference](https://doi.org/10.1145/3580305.3599284)|Junyan Li, Li Lyna Zhang, Jiahang Xu, Yujing Wang, Shaoguang Yan, Yunqing Xia, Yuqing Yang, Ting Cao, Hao Sun, Weiwei Deng, Qi Zhang, Mao Yang|Zhejiang University; Microsoft; Microsoft Research|Deploying pre-trained transformer models like BERT on downstream tasks in resource-constrained scenarios is challenging due to their high inference cost, which grows rapidly with input sequence length. In this work, we propose a constraint-aware and ranking-distilled token pruning method ToP, which selectively removes unnecessary tokens as input sequence passes through layers, allowing the model to improve online inference speed while preserving accuracy. ToP overcomes the limitation of inaccurate token importance ranking in the conventional self-attention mechanism through a ranking-distilled token distillation technique, which distills effective token rankings from the final layer of unpruned models to early layers of pruned models. Then, ToP introduces a coarse-to-fine pruning approach that automatically selects the optimal subset of transformer layers and optimizes token pruning decisions within these layers through improved $L_0$ regularization. Extensive experiments on GLUE benchmark and SQuAD tasks demonstrate that ToP outperforms state-of-the-art token pruning and model compression methods with improved accuracy and speedups. 
ToP reduces the average FLOPs of BERT by 8.1x while achieving competitive accuracy on GLUE, and provides a real latency speedup of up to 7.4x on an Intel CPU.|在资源受限的情况下,在下游任务中部署像 BERT 这样的预先训练的变压器模型是具有挑战性的,因为它们的推理成本很高,并且随着输入序列长度的增长而迅速增长。本文提出了一种基于约束和排序的令牌剪枝方法 TOP,该方法在输入序列通过层的同时选择性地去除不必要的令牌,使模型在保持精度的同时提高了在线推理速度。TOP 通过排序-提取令牌精馏技术克服了传统自注意机制中不准确的令牌重要性排序的局限性,该技术将有效的令牌排序从未修剪模型的最后一层提取到修剪模型的早期层。然后,TOP 引入了一种从粗到精的剪枝方法,该方法自动选择变压器层的最优子集,并通过改进的 $L _ 0 $正则化来优化这些变压器层内的令牌剪枝决策。在 GLUE 基准测试和 SQuAD 任务上的大量实验表明,ToP 优于最先进的令牌剪枝和模型压缩方法,具有更高的准确性和加速性。在 GLUE 上,TOP 减少了 BERT 的平均 FLOP 8.1 x,同时实现了具有竞争力的准确性,并且在 Intel CPU 上提供了高达7.4 x 的实际延迟加速。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Constraint-aware+and+Ranking-distilled+Token+Pruning+for+Efficient+Transformer+Inference)|0| |[Learning Balanced Tree Indexes for Large-Scale Vector Retrieval](https://doi.org/10.1145/3580305.3599406)|Wuchao Li, Chao Feng, Defu Lian, Yuxin Xie, Haifeng Liu, Yong Ge, Enhong Chen||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+Balanced+Tree+Indexes+for+Large-Scale+Vector+Retrieval)|0| |[Generative Flow Network for Listwise Recommendation](https://doi.org/10.1145/3580305.3599364)|Shuchang Liu, Qingpeng Cai, Zhankui He, Bowen Sun, Julian J. McAuley, Dong Zheng, Peng Jiang, Kun Gai||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Generative+Flow+Network+for+Listwise+Recommendation)|0| |[Hyper-USS: Answering Subset Query Over Multi-Attribute Data Stream](https://doi.org/10.1145/3580305.3599383)|Ruijie Miao, Yiyao Zhang, Guanyu Qu, Kaicheng Yang, Tong Yang, Bin Cui||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Hyper-USS:+Answering+Subset+Query+Over+Multi-Attribute+Data+Stream)|0| |[Reconsidering Learning Objectives in Unbiased Recommendation: A Distribution Shift Perspective](https://doi.org/10.1145/3580305.3599487)|Teng Xiao, Zhengyu Chen, Suhang Wang||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Reconsidering+Learning+Objectives+in+Unbiased+Recommendation:+A+Distribution+Shift+Perspective)|0| |[VQNE: Variational Quantum Network Embedding with Application to Network Alignment](https://doi.org/10.1145/3580305.3599542)|Xinyu Ye, Ge Yan, Junchi Yan||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=VQNE:+Variational+Quantum+Network+Embedding+with+Application+to+Network+Alignment)|0| -|[Debiasing Recommendation by Learning Identifiable Latent Confounders](https://doi.org/10.1145/3580305.3599296)|Qing Zhang, Xiaoying Zhang, Yang Liu, Hongning Wang, Min Gao, Jiheng Zhang, Ruocheng Guo|Chongqing University; University of Virginia; ByteDance Research; Hong Kong University of Science and Technology|Recommendation systems aim to predict users' feedback on items not exposed to them. Confounding bias arises due to the presence of unmeasured variables (e.g., the socio-economic status of a user) that can affect both a user's exposure and feedback. Existing methods either (1) make untenable assumptions about these unmeasured variables or (2) directly infer latent confounders from users' exposure. However, they cannot guarantee the identification of counterfactual feedback, which can lead to biased predictions. In this work, we propose a novel method, i.e., identifiable deconfounder (iDCF), which leverages a set of proxy variables (e.g., observed user features) to resolve the aforementioned non-identification issue. 
The proposed iDCF is a general deconfounded recommendation framework that applies proximal causal inference to infer the unmeasured confounders and identify the counterfactual feedback with theoretical guarantees. Extensive experiments on various real-world and synthetic datasets verify the proposed method's effectiveness and robustness.|推荐系统旨在预测用户对未接触到的项目的反馈。由于存在不可测量的变量(例如,用户的社会经济地位) ,可以影响用户的曝光和反馈,混淆偏见就会产生。现有的方法要么(1)对这些未测量的变量做出不可靠的假设,要么(2)直接从用户的暴露中推断出潜在的混杂因素。然而,他们不能保证识别反事实反馈,这可能导致偏见的预测。在这项工作中,我们提出了一种新的方法,即可识别的解构者(iDCF) ,它利用一组代理变量(例如,观察到的用户特征)来解决上述非识别问题。提出的 iDCF 是一个通用的解构推荐框架,它应用近因推理来推断不可测量的混杂因素,并用理论保证来识别反事实反馈。在各种真实世界和合成数据集上的大量实验验证了该方法的有效性和鲁棒性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Debiasing+Recommendation+by+Learning+Identifiable+Latent+Confounders)|0| +|[Debiasing Recommendation by Learning Identifiable Latent Confounders](https://doi.org/10.1145/3580305.3599296)|Qing Zhang, Xiaoying Zhang, Yang Liu, Hongning Wang, Min Gao, Jiheng Zhang, Ruocheng Guo|ByteDance Research; Chongqing University; Hong Kong University of Science and Technology; University of Virginia|Recommendation systems aim to predict users' feedback on items not exposed to them. Confounding bias arises due to the presence of unmeasured variables (e.g., the socio-economic status of a user) that can affect both a user's exposure and feedback. Existing methods either (1) make untenable assumptions about these unmeasured variables or (2) directly infer latent confounders from users' exposure. However, they cannot guarantee the identification of counterfactual feedback, which can lead to biased predictions. In this work, we propose a novel method, i.e., identifiable deconfounder (iDCF), which leverages a set of proxy variables (e.g., observed user features) to resolve the aforementioned non-identification issue. The proposed iDCF is a general deconfounded recommendation framework that applies proximal causal inference to infer the unmeasured confounders and identify the counterfactual feedback with theoretical guarantees. 
Extensive experiments on various real-world and synthetic datasets verify the proposed method's effectiveness and robustness.|推荐系统旨在预测用户对未接触到的项目的反馈。由于存在不可测量的变量(例如,用户的社会经济地位) ,可以影响用户的曝光和反馈,混淆偏见就会产生。现有的方法要么(1)对这些未测量的变量做出不可靠的假设,要么(2)直接从用户的暴露中推断出潜在的混杂因素。然而,他们不能保证识别反事实反馈,这可能导致偏见的预测。在这项工作中,我们提出了一种新的方法,即可识别的解构者(iDCF) ,它利用一组代理变量(例如,观察到的用户特征)来解决上述非识别问题。提出的 iDCF 是一个通用的解构推荐框架,它应用近因推理来推断不可测量的混杂因素,并用理论保证来识别反事实反馈。在各种真实世界和合成数据集上的大量实验验证了该方法的有效性和鲁棒性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Debiasing+Recommendation+by+Learning+Identifiable+Latent+Confounders)|0| |[Hierarchical Invariant Learning for Domain Generalization Recommendation](https://doi.org/10.1145/3580305.3599377)|Zeyu Zhang, Heyang Gao, Hao Yang, Xu Chen||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Hierarchical+Invariant+Learning+for+Domain+Generalization+Recommendation)|0| |[Narrow the Input Mismatch in Deep Graph Neural Network Distillation](https://doi.org/10.1145/3580305.3599442)|Qiqi Zhou, Yanyan Shen, Lei Chen||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Narrow+the+Input+Mismatch+in+Deep+Graph+Neural+Network+Distillation)|0| -|[Robust Positive-Unlabeled Learning via Noise Negative Sample Self-correction](https://doi.org/10.1145/3580305.3599491)|Zhangchi Zhu, Lu Wang, Pu Zhao, Chao Du, Wei Zhang, Hang Dong, Bo Qiao, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang|Microsoft Research; East China Normal University; Microsoft 365|Learning from positive and unlabeled data is known as positive-unlabeled (PU) learning in literature and has attracted much attention in recent years. One common approach in PU learning is to sample a set of pseudo-negatives from the unlabeled data using ad-hoc thresholds so that conventional supervised methods can be applied with both positive and negative samples. Owing to the label uncertainty among the unlabeled data, errors of misclassifying unlabeled positive samples as negative samples inevitably appear and may even accumulate during the training processes. Those errors often lead to performance degradation and model instability. To mitigate the impact of label uncertainty and improve the robustness of learning with positive and unlabeled data, we propose a new robust PU learning method with a training strategy motivated by the nature of human learning: easy cases should be learned first. Similar intuition has been utilized in curriculum learning to only use easier cases in the early stage of training before introducing more complex cases. Specifically, we utilize a novel ``hardness'' measure to distinguish unlabeled samples with a high chance of being negative from unlabeled samples with large label noise. An iterative training strategy is then implemented to fine-tune the selection of negative samples during the training process in an iterative manner to include more ``easy'' samples in the early stage of training. Extensive experimental validations over a wide range of learning tasks show that this approach can effectively improve the accuracy and stability of learning with positive and unlabeled data. 
Our code is available at https://github.com/woriazzc/Robust-PU|从阳性和未标记数据中学习被称为阳性-未标记(PU)学习,近年来引起了人们的广泛关注。PU 学习中常用的一种方法是使用自组织阈值从未标记的数据中抽取一组伪阴性样本,这样传统的监督方法就可以同时应用于正样本和负样本。由于未标记数据之间存在标记不确定性,训练过程中不可避免地会出现将未标记阳性样本错误分类为阴性样本的错误,甚至可能累积。这些错误经常导致性能下降和模型不稳定。为了减轻标签不确定性的影响,提高正数和未标签数据学习的鲁棒性,我们提出了一种新的鲁棒性 PU 学习方法,其训练策略受人类学习的本质驱动: 应该首先学习简单的情况。在课程学习中也使用了类似的直觉,即在培训的早期阶段只使用较容易的案例,然后再引入更复杂的案例。具体来说,我们利用一种新的“硬度”测量方法来区分未标记样品与具有较大标记噪声的未标记样品。然后采用迭代训练策略,以迭代的方式对训练过程中的负样本选择进行微调,以便在训练的早期阶段包含更多的“简单”样本。通过对大量学习任务的大量实验验证表明,该方法能够有效地提高正数和未标记数据学习的准确性和稳定性。我们的代码可以在 https://github.com/woriazzc/robust-pu 找到|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Robust+Positive-Unlabeled+Learning+via+Noise+Negative+Sample+Self-correction)|0| +|[Robust Positive-Unlabeled Learning via Noise Negative Sample Self-correction](https://doi.org/10.1145/3580305.3599491)|Zhangchi Zhu, Lu Wang, Pu Zhao, Chao Du, Wei Zhang, Hang Dong, Bo Qiao, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang|Microsoft 365; East China Normal University; Microsoft Research|Learning from positive and unlabeled data is known as positive-unlabeled (PU) learning in literature and has attracted much attention in recent years. One common approach in PU learning is to sample a set of pseudo-negatives from the unlabeled data using ad-hoc thresholds so that conventional supervised methods can be applied with both positive and negative samples. Owing to the label uncertainty among the unlabeled data, errors of misclassifying unlabeled positive samples as negative samples inevitably appear and may even accumulate during the training processes. Those errors often lead to performance degradation and model instability. To mitigate the impact of label uncertainty and improve the robustness of learning with positive and unlabeled data, we propose a new robust PU learning method with a training strategy motivated by the nature of human learning: easy cases should be learned first. Similar intuition has been utilized in curriculum learning to only use easier cases in the early stage of training before introducing more complex cases. Specifically, we utilize a novel ``hardness'' measure to distinguish unlabeled samples with a high chance of being negative from unlabeled samples with large label noise. An iterative training strategy is then implemented to fine-tune the selection of negative samples during the training process in an iterative manner to include more ``easy'' samples in the early stage of training. Extensive experimental validations over a wide range of learning tasks show that this approach can effectively improve the accuracy and stability of learning with positive and unlabeled data. 
Our code is available at https://github.com/woriazzc/Robust-PU|从阳性和未标记数据中学习被称为阳性-未标记(PU)学习,近年来引起了人们的广泛关注。PU 学习中常用的一种方法是使用自组织阈值从未标记的数据中抽取一组伪阴性样本,这样传统的监督方法就可以同时应用于正样本和负样本。由于未标记数据之间存在标记不确定性,训练过程中不可避免地会出现将未标记阳性样本错误分类为阴性样本的错误,甚至可能累积。这些错误经常导致性能下降和模型不稳定。为了减轻标签不确定性的影响,提高正数和未标签数据学习的鲁棒性,我们提出了一种新的鲁棒性 PU 学习方法,其训练策略受人类学习的本质驱动: 应该首先学习简单的情况。在课程学习中也使用了类似的直觉,即在培训的早期阶段只使用较容易的案例,然后再引入更复杂的案例。具体来说,我们利用一种新的“硬度”测量方法来区分未标记样品与具有较大标记噪声的未标记样品。然后采用迭代训练策略,以迭代的方式对训练过程中的负样本选择进行微调,以便在训练的早期阶段包含更多的“简单”样本。通过对大量学习任务的大量实验验证表明,该方法能够有效地提高正数和未标记数据学习的准确性和稳定性。我们的代码可以在 https://github.com/woriazzc/robust-pu 找到|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Robust+Positive-Unlabeled+Learning+via+Noise+Negative+Sample+Self-correction)|0| |[RankFormer: Listwise Learning-to-Rank Using Listwide Labels](https://doi.org/10.1145/3580305.3599892)|Maarten Buyl, Paul Missault, PierreAntoine Sondag|Amazon|Web applications where users are presented with a limited selection of items have long employed ranking models to put the most relevant results first. Any feedback received from users is typically assumed to reflect a relative judgement on the utility of items, e.g. a user clicking on an item only implies it is better than items not clicked in the same ranked list. Hence, the objectives optimized in Learning-to-Rank (LTR) tend to be pairwise or listwise. Yet, by only viewing feedback as relative, we neglect the user's absolute feedback on the list's overall quality, e.g. when no items in the selection are clicked. We thus reconsider the standard LTR paradigm and argue the benefits of learning from this listwide signal. To this end, we propose the RankFormer as an architecture that, with a Transformer at its core, can jointly optimize a novel listwide assessment objective and a traditional listwise LTR objective. We simulate implicit feedback on public datasets and observe that the RankFormer succeeds in benefitting from listwide signals. Additionally, we conduct experiments in e-commerce on Amazon Search data and find the RankFormer to be superior to all baselines offline. An online experiment shows that knowledge distillation can be used to find immediate practical use for the RankFormer.|在 Web 应用程序中,用户只能看到有限的条目,这种情况长期以来一直采用排名模型,将最相关的结果放在第一位。从用户收到的任何反馈通常被认为反映了对项目效用的相对判断,例如,用户点击一个项目只意味着它比没有在同一排名列表中点击的项目要好。因此,学习排名(Learning-to-Rank,LTR)中优化的目标往往是成对的或列表的。然而,由于只把反馈看作是相对的,我们忽略了用户对列表总体质量的绝对反馈,例如,当选择中没有项被点击的时候。因此,我们重新考虑标准的 LTR 范式,并讨论从这个列表范围的信号中学习的好处。为此,我们提出 RankForm 作为一种体系结构,其核心是一个 Transformer,可以联合优化一个新的列表范围评估目标和一个传统的列表式 LTR 目标。我们模拟公共数据集上的隐式反馈,并观察到 RankForm 成功地从列表宽信号中受益。此外,我们在亚马逊搜索数据上进行电子商务实验,发现排名前优于所有离线基线。一个在线实验表明,知识提取可以用来找到直接的实际应用的秩次前。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=RankFormer:+Listwise+Learning-to-Rank+Using+Listwide+Labels)|0| -|[Graph-Based Model-Agnostic Data Subsampling for Recommendation Systems](https://doi.org/10.1145/3580305.3599834)|Xiaohui Chen, Jiankai Sun, Taiqing Wang, Ruocheng Guo, LiPing Liu, Aonan Zhang|Tufts University; Apple Inc.; ByteDance Research; ByteDance Inc.|Data subsampling is widely used to speed up the training of large-scale recommendation systems. Most subsampling methods are model-based and often require a pre-trained pilot model to measure data importance via e.g. sample hardness. However, when the pilot model is misspecified, model-based subsampling methods deteriorate. Since model misspecification is persistent in real recommendation systems, we instead propose model-agnostic data subsampling methods by only exploring input data structure represented by graphs. 
Specifically, we study the topology of the user-item graph to estimate the importance of each user-item interaction (an edge in the user-item graph) via graph conductance, followed by a propagation step on the network to smooth out the estimated importance value. Since our proposed method is model-agnostic, we can marry the merits of both model-agnostic and model-based subsampling methods. Empirically, we show that combing the two consistently improves over any single method on the used datasets. Experimental results on KuaiRec and MIND datasets demonstrate that our proposed methods achieve superior results compared to baseline approaches.|数据子采样被广泛用于加速大规模推荐系统的训练。大多数次抽样方法是基于模型的,通常需要一个预先训练的试点模型来通过例如样本硬度来测量数据的重要性。然而,当导频模型被错误指定时,基于模型的子抽样方法就会变质。由于模型错误说明在实际推荐系统中一直存在,因此我们提出了模型无关的数据子抽样方法,只是探讨了用图表示的输入数据结构。具体来说,我们研究了用户项目图的拓扑结构,通过图电导来估计每个用户项目交互(用户项目图中的一条边)的重要性,然后通过网络上的传播步骤来平滑估计的重要性值。由于我们提出的方法是模型不可知的,我们可以结合模型不可知和基于模型的子抽样方法的优点。经验上,我们表明,结合使用这两种方法比使用的数据集上的任何单一方法都要好。在 KuaiRec 和 MIND 数据集上的实验结果表明,与基线方法相比,我们提出的方法取得了更好的结果。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Graph-Based+Model-Agnostic+Data+Subsampling+for+Recommendation+Systems)|0|
+|[Graph-Based Model-Agnostic Data Subsampling for Recommendation Systems](https://doi.org/10.1145/3580305.3599834)|Xiaohui Chen, Jiankai Sun, Taiqing Wang, Ruocheng Guo, LiPing Liu, Aonan Zhang|ByteDance Research; Tufts University; Apple Inc.; ByteDance Inc.|Data subsampling is widely used to speed up the training of large-scale recommendation systems. Most subsampling methods are model-based and often require a pre-trained pilot model to measure data importance via e.g. sample hardness. However, when the pilot model is misspecified, model-based subsampling methods deteriorate. Since model misspecification is persistent in real recommendation systems, we instead propose model-agnostic data subsampling methods by only exploring input data structure represented by graphs. Specifically, we study the topology of the user-item graph to estimate the importance of each user-item interaction (an edge in the user-item graph) via graph conductance, followed by a propagation step on the network to smooth out the estimated importance value. Since our proposed method is model-agnostic, we can marry the merits of both model-agnostic and model-based subsampling methods. Empirically, we show that combining the two consistently improves over any single method on the used datasets. 
Experimental results on KuaiRec and MIND datasets demonstrate that our proposed methods achieve superior results compared to baseline approaches.|数据子采样被广泛用于加速大规模推荐系统的训练。大多数次抽样方法是基于模型的,通常需要一个预先训练的试点模型来通过例如样本硬度来测量数据的重要性。然而,当导频模型被错误指定时,基于模型的子抽样方法就会变质。由于模型错误说明在实际推荐系统中一直存在,因此我们提出了模型无关的数据子抽样方法,只是探讨了用图表示的输入数据结构。具体来说,我们研究了用户项目图的拓扑结构,通过图电导来估计每个用户项目交互(用户项目图中的一条边)的重要性,然后通过网络上的传播步骤来平滑估计的重要性值。由于我们提出的方法是模型不可知的,我们可以结合模型不可知和基于模型的子抽样方法的优点。经验上,我们表明,结合使用这两种方法比使用的数据集上的任何单一方法都要好。在 KuaiRec 和 MIND 数据集上的实验结果表明,与基线方法相比,我们提出的方法取得了更好的结果。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Graph-Based+Model-Agnostic+Data+Subsampling+for+Recommendation+Systems)|0| |[BOSS: A Bilateral Occupational-Suitability-Aware Recommender System for Online Recruitment](https://doi.org/10.1145/3580305.3599783)|Xiao Hu, Yuan Cheng, Zhi Zheng, Yue Wang, Xinxin Chi, Hengshu Zhu||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=BOSS:+A+Bilateral+Occupational-Suitability-Aware+Recommender+System+for+Online+Recruitment)|0| |[Real Time Index and Search Across Large Quantities of GNN Experts for Low Latency Online Learning](https://doi.org/10.1145/3580305.3599893)|Johan Kok Zhi Kang, Sien Yi Tan, Bingsheng He, Zhen Zhang||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Real+Time+Index+and+Search+Across+Large+Quantities+of+GNN+Experts+for+Low+Latency+Online+Learning)|0| |[A Preference-aware Meta-optimization Framework for Personalized Vehicle Energy Consumption Estimation](https://doi.org/10.1145/3580305.3599767)|Siqi Lai, Weijia Zhang, Hao Liu|The Hong Kong University of Science and Technology (Guangzhou)|Vehicle Energy Consumption (VEC) estimation aims to predict the total energy required for a given trip before it starts, which is of great importance to trip planning and transportation sustainability. Existing approaches mainly focus on extracting statistically significant factors from typical trips to improve the VEC estimation. However, the energy consumption of each vehicle may diverge widely due to the personalized driving behavior under varying travel contexts. To this end, this paper proposes a preference-aware meta-optimization framework Meta-Pec for personalized vehicle energy consumption estimation. Specifically, we first propose a spatiotemporal behavior learning module to capture the latent driver preference hidden in historical trips. Moreover, based on the memorization of driver preference, we devise a selection-based driving behavior prediction module to infer driver-specific driving patterns on a given route, which provides additional basis and supervision signals for VEC estimation. Besides, a driver-specific meta-optimization scheme is proposed to enable fast model adaption by learning and sharing transferable knowledge globally. Extensive experiments on two real-world datasets show the superiority of our proposed framework against ten numerical and data-driven machine learning baselines. 
|[A Preference-aware Meta-optimization Framework for Personalized Vehicle Energy Consumption Estimation](https://doi.org/10.1145/3580305.3599767)|Siqi Lai, Weijia Zhang, Hao Liu|The Hong Kong University of Science and Technology (Guangzhou)|Vehicle Energy Consumption (VEC) estimation aims to predict the total energy required for a given trip before it starts, which is of great importance to trip planning and transportation sustainability. Existing approaches mainly focus on extracting statistically significant factors from typical trips to improve the VEC estimation. However, the energy consumption of each vehicle may diverge widely due to the personalized driving behavior under varying travel contexts. To this end, this paper proposes a preference-aware meta-optimization framework Meta-Pec for personalized vehicle energy consumption estimation. Specifically, we first propose a spatiotemporal behavior learning module to capture the latent driver preference hidden in historical trips. Moreover, based on the memorization of driver preference, we devise a selection-based driving behavior prediction module to infer driver-specific driving patterns on a given route, which provides additional basis and supervision signals for VEC estimation. Besides, a driver-specific meta-optimization scheme is proposed to enable fast model adaption by learning and sharing transferable knowledge globally. Extensive experiments on two real-world datasets show the superiority of our proposed framework against ten numerical and data-driven machine learning baselines. The source code is available at https://github.com/usail-hkust/Meta-Pec.|车辆能耗(VEC)估算的目的是在出行前预测出行所需的总能量,这对出行规划和交通可持续性有重要意义。现有的方法主要集中在从典型行程中提取统计学显著因子,以改善 VEC 估计。然而,在不同的出行环境下,由于个性化驾驶行为的影响,每辆车的能源消耗可能会有很大的差异。为此,本文提出了一个基于偏好感知的元优化框架 Meta-Pec,用于个性化车辆能耗估算。具体来说,我们首先提出了一个时空行为学习模块来捕捉隐藏在历史行程中的潜在驱动偏好。此外,基于驾驶员偏好的记忆,我们设计了一个基于选择的驾驶行为预测模块,以推断特定路线上驾驶员的驾驶模式,为 VEC 估计提供额外的依据和监控信号。此外,提出了一种特定于驱动程序的元优化方案,通过全局学习和共享可转移知识来实现模型的快速自适应。在两个实际数据集上的大量实验表明,我们提出的框架对于十个数字和数据驱动的机器学习基线具有优越性。源代码可在 https://github.com/usail-hkust/meta-pec 下载。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Preference-aware+Meta-optimization+Framework+for+Personalized+Vehicle+Energy+Consumption+Estimation)|0|
-|[MUSER: A MUlti-Step Evidence Retrieval Enhancement Framework for Fake News Detection](https://doi.org/10.1145/3580305.3599873)|Hao Liao, Jiahao Peng, Zhanyi Huang, Wei Zhang, Guanghua Li, Kai Shu, Xing Xie|Shenzhen University; Microsoft Research Asia; Illinois Institute of Technology|The ease of spreading false information online enables individuals with malicious intent to manipulate public opinion and destabilize social stability. Recently, fake news detection based on evidence retrieval has gained popularity in an effort to identify fake news reliably and reduce its impact. Evidence retrieval-based methods can improve the reliability of fake news detection by computing the textual consistency between the evidence and the claim in the news. In this paper, we propose a framework for fake news detection based on MUlti-Step Evidence Retrieval enhancement (MUSER), which simulates the steps of human beings in the process of reading news, summarizing, consulting materials, and inferring whether the news is true or fake. Our model can explicitly model dependencies among multiple pieces of evidence, and perform multi-step associations for the evidence required for news verification through multi-step retrieval. In addition, our model is able to automatically collect existing evidence through paragraph retrieval and key evidence selection, which can save the tedious process of manual evidence collection. We conducted extensive experiments on real-world datasets in different languages, and the results demonstrate that our proposed model outperforms state-of-the-art baseline methods for detecting fake news by at least 3% in F1-Macro and 4% in F1-Micro. Furthermore, it provides interpretable evidence for end users.|在网上传播虚假信息的便利使得有恶意的个人能够操纵公众舆论,破坏社会稳定。近年来,基于证据检索的假新闻检测技术在可靠识别假新闻、减少假新闻影响等方面得到了广泛的应用。基于证据检索的方法通过计算新闻中证据与索赔之间的文本一致性,提高了假新闻检测的可靠性。本文提出了一种基于多步证据检索增强(MUSER)的假新闻检测框架,该框架模拟了人类在阅读新闻、总结新闻、查阅资料、推断新闻是真是假的过程中的步骤。该模型可以显式地对多个证据之间的依赖关系进行建模,并通过多步检索对新闻验证所需的证据进行多步关联。此外,该模型通过段落检索和关键证据选择,能够自动收集现有证据,节省了繁琐的人工证据收集过程。我们在不同语言的真实世界数据集上进行了广泛的实验,结果表明,我们提出的模型比最先进的基线方法在 F1-Macro 中检测假新闻的性能至少高出3% ,在 F1-Micro 中高出4% 。此外,它还为最终用户提供了可解释的证据。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=MUSER:+A+MUlti-Step+Evidence+Retrieval+Enhancement+Framework+for+Fake+News+Detection)|0|
+|[MUSER: A MUlti-Step Evidence Retrieval Enhancement Framework for Fake News Detection](https://doi.org/10.1145/3580305.3599873)|Hao Liao, Jiahao Peng, Zhanyi Huang, Wei Zhang, Guanghua Li, Kai Shu, Xing Xie|Illinois Institute of Technology; Microsoft Research Asia; Shenzhen University|The ease of spreading false information online enables individuals with malicious intent to manipulate public opinion and destabilize social stability. Recently, fake news detection based on evidence retrieval has gained popularity in an effort to identify fake news reliably and reduce its impact. Evidence retrieval-based methods can improve the reliability of fake news detection by computing the textual consistency between the evidence and the claim in the news. In this paper, we propose a framework for fake news detection based on MUlti-Step Evidence Retrieval enhancement (MUSER), which simulates the steps of human beings in the process of reading news, summarizing, consulting materials, and inferring whether the news is true or fake. Our model can explicitly model dependencies among multiple pieces of evidence, and perform multi-step associations for the evidence required for news verification through multi-step retrieval. In addition, our model is able to automatically collect existing evidence through paragraph retrieval and key evidence selection, which can save the tedious process of manual evidence collection. We conducted extensive experiments on real-world datasets in different languages, and the results demonstrate that our proposed model outperforms state-of-the-art baseline methods for detecting fake news by at least 3% in F1-Macro and 4% in F1-Micro. Furthermore, it provides interpretable evidence for end users.|在网上传播虚假信息的便利使得有恶意的个人能够操纵公众舆论,破坏社会稳定。近年来,基于证据检索的假新闻检测技术在可靠识别假新闻、减少假新闻影响等方面得到了广泛的应用。基于证据检索的方法通过计算新闻中证据与索赔之间的文本一致性,提高了假新闻检测的可靠性。本文提出了一种基于多步证据检索增强(MUSER)的假新闻检测框架,该框架模拟了人类在阅读新闻、总结新闻、查阅资料、推断新闻是真是假的过程中的步骤。该模型可以显式地对多个证据之间的依赖关系进行建模,并通过多步检索对新闻验证所需的证据进行多步关联。此外,该模型通过段落检索和关键证据选择,能够自动收集现有证据,节省了繁琐的人工证据收集过程。我们在不同语言的真实世界数据集上进行了广泛的实验,结果表明,我们提出的模型比最先进的基线方法在 F1-Macro 中检测假新闻的性能至少高出3% ,在 F1-Micro 中高出4% 。此外,它还为最终用户提供了可解释的证据。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=MUSER:+A+MUlti-Step+Evidence+Retrieval+Enhancement+Framework+for+Fake+News+Detection)|0|
|[PrivateRec: Differentially Private Model Training and Online Serving for Federated News Recommendation](https://doi.org/10.1145/3580305.3599889)|Ruixuan Liu, Yang Cao, Yanlin Wang, Lingjuan Lyu, Yun Chen, Hong Chen||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=PrivateRec:+Differentially+Private+Model+Training+and+Online+Serving+for+Federated+News+Recommendation)|0|
|[Hierarchical Projection Enhanced Multi-behavior Recommendation](https://doi.org/10.1145/3580305.3599838)|Chang Meng, Hengyu Zhang, Wei Guo, Huifeng Guo, Haotian Liu, Yingxue Zhang, Hongkun Zheng, Ruiming Tang, Xiu Li, Rui Zhang||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Hierarchical+Projection+Enhanced+Multi-behavior+Recommendation)|0|
|[End-to-End Query Term Weighting](https://doi.org/10.1145/3580305.3599815)|Karan Samel, Cheng Li, Weize Kong, Tao Chen, Mingyang Zhang, Shaleen Kumar Gupta, Swaraj Khadanga, Wensong Xu, Xingyu Wang, Kashyap Kolipaka, Michael Bendersky, Marc Najork||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=End-to-End+Query+Term+Weighting)|0|
@@ -77,61 +77,61 @@
|[Semantic-Enhanced Differentiable Search Index Inspired by Learning Strategies](https://doi.org/10.1145/3580305.3599903)|Yubao Tang, Ruqing Zhang, Jiafeng Guo, Jiangui Chen, Zuowei Zhu, Shuaiqiang Wang, Dawei Yin, Xueqi Cheng||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Semantic-Enhanced+Differentiable+Search+Index+Inspired+by+Learning+Strategies)|0|
|[Doctor Specific Tag Recommendation for Online Medical Record Management](https://doi.org/10.1145/3580305.3599810)|Yejing Wang, Shen Ge, Xiangyu Zhao, Xian Wu, Tong Xu, Chen Ma, Zhi Zheng||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Doctor+Specific+Tag+Recommendation+for+Online+Medical+Record+Management)|0|
|[On-device Integrated Re-ranking with Heterogeneous Behavior Modeling](https://doi.org/10.1145/3580305.3599878)|Yunjia Xi, Weiwen Liu, Yang Wang, Ruiming Tang, Weinan Zhang, Yue Zhu, Rui Zhang, Yong Yu||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=On-device+Integrated+Re-ranking+with+Heterogeneous+Behavior+Modeling)|0|
-|[Empowering Long-tail Item Recommendation through Cross Decoupling Network (CDN)](https://doi.org/10.1145/3580305.3599814)|Yin Zhang, Ruoxi Wang, Derek Zhiyuan Cheng, Tiansheng Yao, Xinyang Yi, Lichan Hong, James Caverlee, Ed H. Chi|Google Research, Brain Team; Texas AM University|Recommenders provide personalized content recommendations to users. They often suffer from highly skewed long-tail item distributions, with a small fraction of the items receiving most of the user feedback. This hurts model quality especially for the slices without much supervision. Existing work in both academia and industry mainly focuses on re-balancing strategies (e.g., up-sampling and up-weighting), leveraging content features, and transfer learning. However, there still lacks of a deeper understanding of how the long-tail distribution influences the recommendation performance. In this work, we theoretically demonstrate that the prediction of user preference is biased under the long-tail distributions. This bias comes from the discrepancy of both the prior and conditional probabilities between training data and test data. Most existing methods mainly attempt to reduce the bias from the prior perspective, which ignores the discrepancy in the conditional probability. This leads to a severe forgetting issue and results in suboptimal performance. To address the problem, we design a novel Cross Decoupling Network (CDN) to reduce the differences in both prior and conditional probabilities. Specifically, CDN (i) decouples the learning process of memorization and generalization on the item side through a mixture-of-expert structure; (ii) decouples the user samples from different distributions through a regularized bilateral branch network. Finally, a novel adapter is introduced to aggregate the decoupled vectors, and softly shift the training attention to tail items. Extensive experimental results show that CDN significantly outperforms state-of-the-art approaches on popular benchmark datasets, leading to an improvement in HR@50 (hit ratio) of 8.7\% for overall recommendation and 12.4\% for tail items.|推荐程序向用户提供个性化内容推荐。他们经常受到高度扭曲的长尾条目分布的影响,其中一小部分条目接受了大部分用户反馈。这会损害模型的质量,特别是对于没有很多监督的切片。学术界和业界现有的工作主要集中在重新平衡策略(例如,上调样本和上调权重)、利用内容特性和转移学习。然而,对于长尾分布是如何影响推荐性能的,目前还缺乏更深入的理解。本文从理论上证明了在长尾分布下,用户偏好的预测是有偏差的。这种偏差来自于训练数据和测试数据之间先验概率和条件概率的差异。大多数现有的方法主要试图从先验的角度减少偏差,而忽略了条件概率的差异。这会导致严重的遗忘问题,并导致次优性能。为了解决这一问题,我们设计了一种新的交叉解耦网络(CDN) ,以减少先验概率和条件概率的差异。具体来说,CDN (i)通过混合专家结构解耦项目侧记忆和概括的学习过程; (ii)通过正则化的双边分支网络解耦来自不同分布的用户样本。最后,引入一种新的适配器对解耦后的向量进行聚合,并将训练注意力柔和地转移到尾项上。广泛的实验结果表明,CDN 在流行的基准数据集上显着优于最先进的方法,导致总体推荐的 HR@50(命中率)改善为8.7% ,尾部项目的改善为12.4% 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Empowering+Long-tail+Item+Recommendation+through+Cross+Decoupling+Network+(CDN))|0|
+|[Empowering Long-tail Item Recommendation through Cross Decoupling Network (CDN)](https://doi.org/10.1145/3580305.3599814)|Yin Zhang, Ruoxi Wang, Derek Zhiyuan Cheng, Tiansheng Yao, Xinyang Yi, Lichan Hong, James Caverlee, Ed H. Chi|Texas A&M University; Google Research, Brain Team|Recommenders provide personalized content recommendations to users. They often suffer from highly skewed long-tail item distributions, with a small fraction of the items receiving most of the user feedback. This hurts model quality especially for the slices without much supervision. Existing work in both academia and industry mainly focuses on re-balancing strategies (e.g., up-sampling and up-weighting), leveraging content features, and transfer learning. However, there is still a lack of deeper understanding of how the long-tail distribution influences recommendation performance. In this work, we theoretically demonstrate that the prediction of user preference is biased under the long-tail distributions. This bias comes from the discrepancy of both the prior and conditional probabilities between training data and test data. Most existing methods mainly attempt to reduce the bias from the prior perspective, which ignores the discrepancy in the conditional probability. This leads to a severe forgetting issue and results in suboptimal performance. To address the problem, we design a novel Cross Decoupling Network (CDN) to reduce the differences in both prior and conditional probabilities. Specifically, CDN (i) decouples the learning process of memorization and generalization on the item side through a mixture-of-expert structure; (ii) decouples the user samples from different distributions through a regularized bilateral branch network. Finally, a novel adapter is introduced to aggregate the decoupled vectors, and softly shift the training attention to tail items. Extensive experimental results show that CDN significantly outperforms state-of-the-art approaches on popular benchmark datasets, leading to an improvement in HR@50 (hit ratio) of 8.7% for overall recommendation and 12.4% for tail items.|推荐程序向用户提供个性化内容推荐。他们经常受到高度扭曲的长尾条目分布的影响,其中一小部分条目接受了大部分用户反馈。这会损害模型的质量,特别是对于没有很多监督的切片。学术界和业界现有的工作主要集中在重新平衡策略(例如,上调样本和上调权重)、利用内容特性和转移学习。然而,对于长尾分布是如何影响推荐性能的,目前还缺乏更深入的理解。本文从理论上证明了在长尾分布下,用户偏好的预测是有偏差的。这种偏差来自于训练数据和测试数据之间先验概率和条件概率的差异。大多数现有的方法主要试图从先验的角度减少偏差,而忽略了条件概率的差异。这会导致严重的遗忘问题,并导致次优性能。为了解决这一问题,我们设计了一种新的交叉解耦网络(CDN) ,以减少先验概率和条件概率的差异。具体来说,CDN (i)通过混合专家结构解耦项目侧记忆和概括的学习过程; (ii)通过正则化的双边分支网络解耦来自不同分布的用户样本。最后,引入一种新的适配器对解耦后的向量进行聚合,并将训练注意力柔和地转移到尾项上。广泛的实验结果表明,CDN 在流行的基准数据集上显着优于最先进的方法,导致总体推荐的 HR@50(命中率)改善为8.7% ,尾部项目的改善为12.4% 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Empowering+Long-tail+Item+Recommendation+through+Cross+Decoupling+Network+(CDN))|0|
|[PDAS: A Practical Distributed ADMM System for Large-Scale Linear Programming Problems at Alipay](https://doi.org/10.1145/3580305.3599883)|Jun Zhou, Yang Bao, Daohong Jian, Hua Wu||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=PDAS:+A+Practical+Distributed+ADMM+System+for+Large-Scale+Linear+Programming+Problems+at+Alipay)|0|
-|[Practical Design of Performant Recommender Systems using Large-scale Linear Programming-based Global Inference](https://doi.org/10.1145/3580305.3599183)|Aman Gupta, S. Sathiya Keerthi, Ayan Acharya, Miao Cheng, Borja Ocejo Elizondo, Rohan Ramanath, Rahul Mazumder, Kinjal Basu, J. Kenneth Tay, Rupesh Gupta||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Practical+Design+of+Performant+Recommender+Systems+using+Large-scale+Linear+Programming-based+Global+Inference)|0|
The term 'concussion' is often used interchangeably with the term 'mild TBI' (mTBI), a potentially misleading practice considering the possible extent of brain damage and potential for chronic neuropsychological dysfunction following concussion. We should stress, however, that most concussions resolve without sequelae. The American Congress of Rehabilitative Medicine defines mTBI as a Glasgow Coma Scale3 score of 13–15, with loss of consciousness for <30 min and post-traumatic amnesia lasting <24 h.4 Concussion describes a heterogeneous mixture of injury phenotypes that depends on many factors, including the magnitude, location and direction of head impact. Despite a lack of macroscopic structural findings, concussive brain injury involves primary neuronal injury caused by linear and rotational shear forces that disrupt axonal and membrane function (diffuse axonal injury,5 ionic flux and glutamate excitotoxicity), followed by secondary pathophysiological effects including mitochondrial oxidative stress, disruption of cerebral blood flow, compromised blood–brain barrier (BBB) integrity, synaptic dysfunction, and neuroinflammation.6, 7 Lasting neuropsychological post-concussion symptoms (post-concussion syndrome) comprise mood disorders (for example, depression), difficulty concentrating, and memory problems (Box 2).8 Both physical and physiological components of concussive injury can damage the developing brain, putting youths engaged in impact sports at particular risk. The necks and torsos of young athletes are weaker than those of older individuals and, consequently, less force is required to cause brain injury. The developing brain might also be particularly vulnerable to axonal damage caused by the shearing forces of head trauma, which, in youth American football, can exceed linear acceleration forces of 100 g.9 However, the average forces sustained in youth sports will generally be smaller than at higher levels of sport. Proper synaptic development is critical to cognitive and behavioural health.10, 11, 12, 13, 14, 15 Processes such as neurogenesis, competitive synaptic elimination ('pruning'), myelination, and axonal and dendritic arborization continue from prenatal development throughout the lifespan.14 The frontal and temporal lobes are the last areas to mature, and humans experience pruning in these regions into their early 20s,16 so damage to these still-developing areas may have pathophysiological effects on the brain that increase the potential for neuropsychological problems later in life.17 Axonal myelination continues through adolescence into the early 20s, and is susceptible to disruption by injury.10, 18, 19, 20, 21, 22 Early results from the Professional Fighters Brain Health Study, a 5-year longitudinal study of boxers and mixed martial arts fighters, who experienced repetitive subconcussive injuries as well as concussions, indicate that earlier age of first exposure to competitive boxing correlates with greater loss of caudate volume and greater axonal damage in the frontal lobe.23, 24 The young brain also has features that contribute to its resilience. Increased neuroplasticity in this age group has been shown to contribute to better outcomes after focal injuries.25 In addition, developing animals display a shorter window of glucose metabolic impairment in response to repeat TBI than do adult animals.26 Overall, the developing brain shows both vulnerability and resilience after TBI. 
These interwoven factors are likely to account for differences in the effects of concussion and repeat mTBI on young versus adult brains. A conservative approach to concussion risk and greater efforts to investigate these developmental differences should be given high priority. Most people—both young and old—recover fully from concussions. In children, factors potentially influencing recovery include age and history of concussions.27, 28 In one study, approximately 90% of young adult male athletes experienced symptomatic recovery within 21 days.29 However, in an emergency department study of patients aged 11–22 years (including all causes of concussion, not just sports-related), 15% of the sample still exhibited post-concussion symptoms, including headache, dizziness, 'mental fogginess' and depression, 90 days after injury.30 Several studies suggest that high school American football players are slower to recover from concussion than are college31, 32 and professional players.33 No direct comparisons with adolescents below high school age have yet been published, although a recent study that included a pre-adolescent age group (11–12 years) suggested that post-concussion recovery duration may not exhibit a linear relationship with age,30 as adolescents in this sample took longer to recover than did the pre-adolescent children. These findings, taken together, imply a unique risk of lengthier recovery in the male adolescent age group. Further studies of younger children and females would add greatly to our ability to assess and mitigate risk across the full paediatric and adolescent age span. Youths who sustained one or more concussions within 1 year prior to a new concussion reported more-prolonged symptoms,30 suggesting a possible 'window of vulnerability', and placing previously injured youths at higher risk of protracted recovery. Adolescents aged 11–18 years were nearly 80% more likely to develop post-concussion syndrome after presenting in emergency rooms than were children aged 5–10 years; similarly, presentation with headache doubled the risk of post-concussion syndrome in both children and adolescents.34 Among children treated in an emergency room after mTBI, those aged >6 years reported higher rates of persistent symptoms 3 months post injury than did those aged <6 years.35 Of course, the ability to acquire accurate information about concussion symptoms in children <6 years of age may be limited by a lack of self-awareness of symptoms and the necessary verbal skills to effectively communicate those symptoms. Also, direct comparison of injury severity is not possible from these reports; in fact, the physical heterogeneity of various injuries, taken together with the individual's innate capacity to recover from concussion, makes such comparisons highly challenging. 
'Smart helmets' are being used in some speciality research centres to standardize the physical force and angular acceleration that accompanies head hits, and the utility of these helmets to measure and predict impacts that may result in concussion is currently under investigation.36, 37 Young people recovering from concussion can experience important challenges, including altered social and academic development,38, 39, 40 lower scores on general intelligence tests, and decreased school performance (measured by grade-point average).39 Lower levels of parental education and child academic achievement both correlate with poorer concussion recovery.41 Personality traits also play a part; for example, pre-injury anxiety is a risk factor for prolonged recovery periods after sports-related concussion.42 Young athletes of both sexes are at risk of concussion, but girls report higher concussion rates than boys, particularly in high school and college soccer, basketball, and baseball or softball.28, 43, 44, 45 The factors that account for these differences remain uncertain, but might include quality of protective gear, recognition and reporting of concussion symptoms, and neck length and neck muscle strength.46 Differences in recovery trajectories between males and females are also poorly understood. However, one recent study suggested that progesterone levels in females influence post-concussion recovery.47 Hormonal changes during puberty that contribute to migraine headaches might also contribute to sex differences in concussion recovery. Migraine headaches are up to fourfold more common in females than in males after puberty,48, 49 and some evidence suggests that migraineurs recover more slowly after concussion.50, 51 Research is warranted to further delineate sex differences in concussion risk and recovery. In general, adult concussive brain injury is much better understood than its counterpart in children and adolescents. Several points are important to note. First, concussion has multiple, non-harmonized definitions. Second, concussion diagnosis is an imperfect art. Last, in the absence of rapid and inexpensive objective diagnostic measures, concussion remains a clinical diagnosis that is subject to variability—including different thresholds for diagnosis across various subspecialities and across individual physicians, neuropsychologists and athletic trainers—and under-reporting by coaches, parents and young athletes. Without validated diagnostics, concussion will remain a nebulous and under-reported entity, and the accuracy of incidence estimates will continue to be tainted by the differential application of inexact criteria. Repetitive subconcussive trauma can result in structural and functional brain changes.52 White matter abnormalities detected by diffusion tensor imaging (DTI) have been reported in professional soccer players even in the absence of any obvious history of concussions. 
Compared with swimmers, male professional soccer players showed DTI signal changes suggestive of decreased white matter integrity in several brain regions, which might indicate loss of axonal myelination, similar to changes seen in individuals with mTBI.53 Collegiate ice hockey players exhibited similar white matter changes over the course of a season.54, 55, 56, 57 In addition, repetitive subconcussive head impacts in collegiate American football players have been linked, in a dose-dependent manner, to deficits in BBB integrity, potential loss of white matter integrity, and cognitive dysfunction.58 These findings probably reflect some level of risk for youths who sustain repetitive subconcussive head impacts, although little research has been devoted specifically to this topic. A metric to track head impacts—that is, a 'hit count'—has been proposed,59 and could serve as one factor to determine cumulative risk exposure. One challenge of this approach is to accurately define the parameters of a 'hit', but improved biosensors show some promise in this regard. Similar to a 'pitch count' in baseball, this concept has also recently been proposed for boxers.24 No evidence is currently available to show a causal link between repetitive subconcussive head impacts in youth and dementia later in life, and such metrics could prove invaluable if validated by future studies correlating head impacts with subsequent neuropsychological dysfunction. In adults, TBI, including concussion,60, 61, 62 might increase an individual's risk of developing neurodegenerative disease,63, 64 including AD and chronic traumatic encephalopathy (CTE), a disease associated exclusively with repetitive head trauma.65, 66 TBI may also increase the risk of developing Parkinson disease (PD),67 although the relationship between mTBI and PD risk remains uncertain.68 In paediatric populations, particularly young athletes, the effects of single or repetitive concussions on the risk of later-life neurodegeneration and dementia are unknown. CTE was first described symptomatically in the late 1920s as 'punch-drunk' dementia in boxers,69 was later described as 'dementia pugilistica',70 and was first described pathologically in 1973.71 Since the identification of CTE in a former professional American football player in 2005,72 and additional intensive pathological studies, this condition has gained widespread public attention, and has now been identified in brains of former ice hockey, baseball, rugby and soccer players,73 wrestlers,74 and military veterans.75, 76 The prevalence and incidence of CTE in amateur and professional athletes is still unknown, adding to difficulties in discussing its epidemiology and population risks for athletes. Although CTE is primarily considered to be a neurodegenerative disease that sometimes results from a career of either collegiate or professional contact sports, cases of CTE have been reported in high school athletes.77 This finding suggests that long sporting careers are not required for CTE development, and that youth athletes represent an at-risk population. 
Emerging evidence suggests that clinical CTE symptoms can be grouped into two common presentations: cognitive and mood–behavioural.78, 79 Subjective memory complaints such as anterograde amnesia are common, as are mood disorders including anxiety or depression,79 and reduced executive function, which can result in disinhibition and impaired decision-making skills.80 These clinical symptoms define disease severity.81 The neurodegenerative pathophysiology of CTE is complex, and the neurological sequelae are poorly understood. In severe cases, the cerebral cortex and medial temporal lobes seem most profoundly affected,81, 82 with pathology characterized by neurofibrillary tangles composed of phosphorylated tau79 and, in some cases, TAR DNA-binding protein 43 pathology.83 CTE is also associated with marked atrophy, notably in the frontal cortex and medial temporal lobe, as well as in the mammillary bodies, thalamus and hypothalamus.79 Confirmed clinical diagnosis of CTE remains autopsy-based.84 Given the uncertainty over whether the tauopathy described in CTE is causative of the clinical phenotype, and the fact that most professional and collegiate athletes do not develop CTE, it is vital to understand whether early exposure to concussion is associated with other forms of neurodegeneration and cognitive dysfunction, including chronic neurocognitive impairment (CNI). Important clinical distinctions exist between CTE and CNI,28, 51 some of which make direct comparisons difficult. CTE is an emerging clinical and pathological condition that involves progressive deterioration of neurological and cognitive function in multiple domains, and is diagnosed primarily at autopsy. Conversely, the CNI phenotype is not necessarily progressive, and is characterized by functional decline from group averages or baseline functioning established before TBI. CNI can be diagnosed clinically through neuropsychological testing. No causal link between CNI and head trauma has yet been confirmed, but a dose-dependent risk has consistently been found in professional athletes.28 In addition, almost half of the studies conducted in amateur athletes have found an elevated risk of CNI.28 Whether similar risk associations are present in younger populations remains to be determined. One hypothesis is that CNI represents a prodromal—but not inevitable—step toward CTE, analogous to the relationship between mild cognitive impairment (MCI) and AD.85, 86 Alternatively, CNI may represent static impairment without degeneration. Our current lack of understanding of the basic biological underpinnings of CNI and CTE underscores the need for more research. Increased knowledge of the biology of both conditions, as well as early detection of CNI in athletes (in particular, youth athletes), may drive interventions to stem the development of further cognitive impairment, and could also aid validation of putative biomarkers. Assessment of CNI via tau imaging may help determine the likelihood of progression to CTE. The field of concussion genetics, especially in paediatric populations, is still in its infancy. Although repetitive head impacts seem necessary for the development of CTE, other factors, including genetics, are likely to have an important role, as most concussed athletes do not develop CTE.87 The genetic risk factors for CTE probably overlap with those that influence susceptibility to and recovery from concussion, and genetic risk factors for AD are providing important clues to the identity of these factors. 
The ε4 allele of apolipoprotein E (APOE ε4), the most important genetic risk factor for AD identified to date,88 critically affects the CNS injury response,89 in particular, amyloid-β (Aβ) clearance from the brain. The three alleles of APOE confer varying degrees of AD risk: APOE ε2 reduces the risk, APOE ε3, the most common allele, represents baseline risk with which other variants are compared, and APOE ε4 increases the risk.90, 91 Studies suggest an interaction between APOE ε4 and sex, such that APOE ε4-related risk of AD is more prominent in women than in men.92, 93 The APOE genotype acts synergistically with TBI in increasing the risk of AD,94 although its hypothesized risk association with CTE as an outcome of repetitive mTBI requires more study.95 No consensus has yet been reached on the effects of APOE isotype on the outcome of paediatric TBI, but data from adults suggest that APOE ε4 negatively influences concussion outcomes. Several studies indicate that possession of at least one APOE ε4 allele is associated with poorer cognition and lasting neuropsychological impairment after concussion in professional American football players,96 boxers95 and other adults,97, 98, 99, 100 although other studies found no such association.101, 102 Some evidence points to polymorphisms in both the APOE gene and its promoter as contributory factors to concussion risk in college athletes.103, 104 Another study did not identify a role for APOE ε4 in concussion risk,105 although this allele might increase the risk of dementia following midlife or late-life mTBI.106 Drawing conclusions from these conflicting studies is difficult, owing to small sample sizes and differing methodologies. In children, little is known about the relationship between APOE ε4 and neuropsychological outcomes after concussion, and APOE ε4 testing is not routine in paediatric TBI studies. In 2012, Kurowski reviewed the few existing studies and combined the results of three studies107, 108, 109 that used the Glasgow Outcome Scale.110 In the combined sample (252 children), the risk of poor clinical outcomes after 6–12 months was over twofold higher in APOE ε4 carriers than in noncarriers (19% versus 9%). However, these studies included a broad developmental range of children with heterogeneous injuries, and did not account for a possible interaction between age and genotype. In addition, the interaction between APOE and sex has not been studied in the context of concussion. Improved prospective studies are warranted to clarify these connections. Incorporation of genetics into paediatric concussion research is fraught with complicated challenges, including acquisition of parental consent and informed consent for a child, perceived stigmatization of clinical study participants, the actionability of the genetic knowledge obtained, and potential concerns regarding insurability (particularly long-term care insurance). 
Studies of adults who learn of their APOE ε4+ status demonstrate that many are willing to make lifestyle modifications, including increased exercise and improved medication management,111 as well as increased purchases of health and long-term care insurance.112, 113 Education about new genetic knowledge and corresponding disease risk is essential, as demonstrated by the substantial discordance between an individual's personal feelings about the implications of the acquired knowledge and the actual consequences of increased dementia risk.114 The effects of APOE genetic knowledge on children, their families and decision-making processes regarding participation in impact sports remain unclear. The influence of APOE genotype on concussion risk and recovery in this age group also needs further elucidation. If future studies find that, for any particular level of impact, children with APOE ε4+ status are at greater risk of concussion or poor recovery than are their APOE ε4− peers, consideration should be given to genetic testing of school-age athletes before participation in impact sports. Careful studies of high school and younger athletes are required to fully understand the nuances of genetic influences. Future research into youth concussion outcomes, including cognitive outcomes and risk of dementia, should include APOE genotyping wherever possible. New APOE studies should standardize research methodologies and reporting measures, including the collection of 'common data elements', to ensure valid comparison across studies.110, 115 The APOE genotype is not necessarily a non-modifiable risk factor for concussion recovery: therapies being developed for AD include drugs that modify the interaction between the ApoE4 protein and Aβ, which might also be applicable to paediatric concussion.116, 117 The Val66Met polymorphism in the gene encoding brain-derived neurotrophic factor has been linked to better outcomes after mTBI,118 but worse outcomes after focal penetrating brain injury.119 Polymorphisms in genes involved in dopaminergic signalling may also help to account for the wide range of TBI outcomes.120 In addition, the Rep1 polymorphism in the promoter region of the α-synuclein gene might increase the risk of PD after head injury.121 To advance our understanding of concussion risk and management, large, prospective, population-based genome-wide association studies (GWAS) and whole-genome sequencing studies should be conducted to identify other genetic variants—possibly of low frequency or low penetrance—that modify the risk of prolonged recovery, poor cognitive outcomes or dementia.122 Such studies will require large-scale data sharing, and must address issues of ethics, privacy, and potential implications for insurability and employability. Despite progress in identifying possible cerebrospinal fluid (CSF) and blood-based biomarkers that might be applied to adult TBI management, no clinically validated biomarkers are available for either the adult or the paediatric population. Paediatric concussions present with even greater clinical variability than do adult concussions; therefore, biomarkers have special potential for improving concussion diagnosis in children. Of note, most TBI biomarkers have been studied in the context of moderate to severe TBI, leaving us with obvious gaps in our knowledge of mTBI biomarkers, especially in children. Biomarker development has been critical to the advancement of AD therapeutics. 
CSF-based biomarkers are already being employed to identify at-risk patients and to improve the design of both epidemiological studies and clinical trials.123 New PET radioligands, such as amyloid-labelling agents (three of which are now FDA-approved), can be used both diagnostically and to improve neuropathology-based patient stratification for clinical trials. Several tau imaging agents are also in human trials, and their utility in tauopathies, including CTE, is rapidly being established. As with fluid-based biomarkers, there are currently no neuroimaging biomarkers sensitive or specific enough to diagnose concussion or CTE in either adults or children. No TBI diagnostic or therapeutic agents have yet been approved by the FDA, and validation of concussion biomarkers could accelerate the development of such agents. Efforts must be made, however, to ensure the cost-effectiveness and wide availability of clinical biomarker testing. Also, given the risks associated with lumbar puncture, ethical concerns regarding sampling of CSF from concussed youths for biomarker research should be addressed. Promising findings in adult fluid-based biomarker research must be explored in paediatric populations. Putative concussion biomarkers have emerged sporadically in the scientific literature over the past few decades, the most prominent being S100 calcium-binding protein B (S100B), a nonspecific marker of astrocyte activation. The presence of S100B in serum may indicate loss of BBB integrity. Elevated serum and CSF levels of S100B have been observed in adult boxers after matches, and correlate positively with the number and severity of head impacts.124, 125 Increased serum S100B levels have also been observed in concussed professional ice hockey players,126 with levels measured 1 h post-concussion predicting symptomatic recovery time. 
However, S100B levels were also raised after controlled play where no concussions occurred, indicating that this marker is not injury-specific.126 Indeed, S100B serum levels are elevated in adult trauma patients without head injury.127, 128, 129 Other research suggests that initial post-concussion S100B levels are poor predictors of recovery.130 As with all biomarkers, the role of S100B in TBI management in children is even less clear,131 with some arguing that this marker has little diagnostic or prognostic utility in paediatric populations.132 In a study of children with TBI aged ≤15 years, those <5 years or >9 years of age had higher serum levels of S100B than did those aged 5–9 years.133 S100B may, therefore, be an inadequate marker to distinguish between symptomatic and asymptomatic children with concussion,133 and the utility of S100B in diagnostics and outcome prognosis is questionable.134, 135, 136 Neuron-specific enolase (NSE) is a marker of neuronal injury, but its usefulness as a serum or CSF biomarker remains uncertain.133, 134, 135, 136, 137 Elevated serum NSE levels have been observed after head impacts in boxers,124 but were also seen in ice hockey players after a match where no concussions occurred.126 Serum NSE levels failed to predict recovery time after concussion,126 and might not correlate with injury severity in children.133 In children aged ≤15 years, serum NSE levels correlate inversely with age.133 Once released into the blood, NSE has slow elimination kinetics, making it difficult to distinguish primary from secondary neuronal injuries on the basis of NSE levels.138, 139 Neurofilament light chain and glial fibrillary acidic protein (GFAP) are CSF neuron-specific and glial-specific damage markers, respectively, and are both elevated in CSF in adult boxers after fights.125, 137, 140 Little is known about either marker in the context of paediatric concussion, but a preliminary study in children and young adults suggested that serum GFAP levels within 72 h after concussion correlate with symptom burden up to 1 month post injury.141 The neuron-specific protein UCH-L1 (ubiquitin carboxyl-terminal hydrolase isozyme L1) was first linked to neurodegenerative pathology through its involvement in PD,142 and its presence in serum was later identified as a biomarker for severe TBI.143, 144, 145 Serum levels of UCH-L1 may have diagnostic utility in concussion,146 but recent evidence suggests a lack of correlation between elevated serum levels and subconcussive hits.147 The clinical utility of UCH-L1 in paediatric populations warrants further study. Perhaps the most promising advances in adult fluid-based TBI biomarkers concern tau protein. Serum or CSF tau levels are thought to indicate axonal damage, as tau normally resides in axons, where it stabilizes microtubules. Serum tau is proteolytically cleaved,148 and in patients with AD, levels of cleaved tau in CSF might correlate with cognitive function.149 Tau levels in CSF and blood are elevated in boxers after a match, and CSF tau levels correlate with the quality and quantity of head impacts.125, 150 Recent evidence suggests that tau levels are elevated in the blood of ice hockey players after concussion, and may be useful in predicting recovery time.126 Questions remain, however, with several studies reporting little or no value of serum cleaved tau for predicting post-concussion syndrome or long-term outcomes.130, 151 The potential of tau as a biomarker in children remains unclear, with no studies conducted to date. 
In fact, the reliability of serum tau as a biomarker has not yet been established for any indication. The likelihood is that no single biomarker will suffice to diagnose paediatric concussion or predict outcomes. In addition, few studies have examined the interactions between genetic make-up and putative biomarkers. As our understanding of the relationships of biomarkers to injury severity and to each other increases, development of biomarker panels, perhaps incorporating inflammatory and oxidative markers,152 should be considered. Future studies should attempt to further define these relationships and establish the clinical value of biomarker panels, factoring in commercial cost and practical feasibility. Recent advances in metabolomics, lipidomics and proteomics—in particular, the search for metabolomic and lipidomic markers for AD—might inform future research into biomarkers for concussion and subconcussive injuries. Several recent studies propose altered metabolite and lipid profiles associated with MCI and AD.153, 154, 155, 156 Data from animal models suggest that lipid and metabolite changes accompany both acute and chronic post-concussion periods, and could be useful for predicting recovery trajectory,157, 158 but these findings have yet to be validated in humans. Expanding the biomarker search beyond blood and CSF to saliva and urine159 might improve the ability to obtain measurements rapidly and noninvasively, particularly from children. Sampling of CSF from children, particularly when rapid assessment is desirable, is largely impractical. Mondello et al. proposed a set of useful criteria for evaluating TBI biomarkers that should allow more-streamlined development and validation.137 Any validated biomarker panel must, inevitably, be a component of a larger, multimodal diagnostic suite that may include structural and functional imaging and neuropsychological testing. When designing future biomarker studies, the potential for FDA approval should be considered, in order to expedite approval for clinical use. Although concussion remains a clinical diagnosis, neuroimaging techniques are improving our understanding of the structural and functional consequences in adults. Neuroimaging in paediatric populations may be limited by several factors; for example, measurements of longitudinal changes after concussion are complicated by the background of a dynamic, immature brain. No imaging techniques have been validated as diagnostic tools for concussion, and the correlation between imaging findings and clinically measurable cognitive or behavioural functions is variable. Tools such as volumetric imaging, DTI and functional MRI (fMRI)—in particular, arterial spin labelling—are currently being explored.160, 161 Fractional anisotropy (FA), as measured by DTI, allows inference of the structural integrity of white matter tracts, which are commonly disrupted after TBI. The clinical implications of FA change remain controversial, as both increased and decreased FA has been observed in concussion studies.162, 163, 164, 165, 166 These discrepancies may be due, in part, to the considerable spatial heterogeneity in the brain areas examined,167 as well as differences in the post-injury interval. FA may still have prognostic value, with evidence suggesting that the direction and magnitude of change correlates with clinical outcomes;166, 168 however, this idea awaits validation in both paediatric and adult populations. 
FA might lack the necessary sensitivity to fully appreciate changes in white matter tract integrity following brain injury, and measures of diffusivity may be more appropriate.169 The DTI field would benefit greatly from the development of normative data sets against which to gauge observed changes. Pre-game versus post-game and season-long studies of young athletes could employ serial DTI imaging to establish normative data for a particular individual, but the utility of the data when pooled is unclear. The scarcity of normative paediatric data severely limits the clinical usefulness of neuroimaging techniques, including DTI. Studies of 'return-to-baseline' neuroimaging after paediatric concussion are also needed, as they could greatly improve prediction of recovery. Although automation has increased reproducibility, DTI measurements remain sensitive to the hardware and software specifics, acquisition parameters and analysis software, which limit reproducibility, standardization and comparison between centres and across studies. Efforts to standardize DTI across imaging centres are underway.170 MRI has been particularly successful in mapping the brain's 'connectome'—the collection of structural and functional neural connectivity networks and their respective focal nodes—and for studying how concussion affects these networks. Focal or diffuse TBI can disrupt the brain's functional connectivity, resulting in dysfunction of multiple networks including the default mode and salience networks, which have been implicated in memory, emotion and mood.171 Network dysfunction might have a stronger influence on recovery than does lesion location,171, 172, 173 but the long-term implications for brain development and cognitive function remain unclear.26, 174 Further studies of network connectivity dysfunction in children after concussion will be critical to improve injury prognostication and management. Radiotracers for PET imaging have the potential to advance the diagnosis and treatment of concussion and CTE, but their use in paediatric populations is purely investigational at present. Three FDA-approved radiolabelled imaging agents are currently available for detecting brain amyloid in patients with suspected AD.175 In adults, some cases of concussion are associated with acute Aβ pathology. PET scanning could enable paediatric patients to be monitored for the presence and persistence of acute post-concussion amyloid, and to determine whether scan positivity and negativity predict different outcomes.176, 177 Other PET imaging agents with potential utility in paediatric populations include new tracers that bind neurofibrillary tangles composed of tau. Early imaging results with 18F-T807, 18F-T808 and 18F-THK5105 are proving to be useful in confirming the presence of tauopathy in various clinical situations, including AD.178, 179, 180 In a recent AD study, the magnitude of tau tracer signal correlated positively with the stage of disease and severity of cognitive impairment.180 A third tau PET tracer, 11C-PBB3, has been tested in healthy individuals and patients with AD, and may be able to detect non-AD conformations of tau.181 In addition, a recent report contains the first description of tauopathy imaging in a living person with suspected sports-associated CTE.177 Given the extent of chronic tau pathology in concussion, repetitive subconcussive injury and CTE, tau tracers may be useful as diagnostic and prognostic biomarkers (for example, to distinguish CNI from CTE). 
Studies with these tracers in adults with CTE are underway, but their use in paediatric populations will depend on future research to determine whether tau pathology is present in young patients after TBI or concussion. A PET tracer for the microglial cholesterol transporter protein might be useful for imaging of neuroinflammation associated with TBI.182 New PET ligands to image brain microglia, which are being developed with potential utility in neurodegenerative diseases, may also prove useful in concussion and CTE management. Exploration of these PET ligands in paediatric populations with concussion and TBI would be informative, but risk–benefit analyses must be performed before embarking on studies involving radiotracers in this age group. The ultimate utility of any PET imaging agent will depend on its diagnostic and prognostic value as part of a multimodal panel of biomarkers and neuroimaging techniques. Noninvasive techniques such as transcranial magnetic stimulation (TMS) have also uncovered changes in synaptic plasticity following TBI and concussion,183 particularly in asymptomatic individuals.184, 185, 186 Several small TMS studies of young athletes in their early 20s with a history of concussion suggest imbalances in γ-aminobutyric acid and/or glutamate neurotransmission in the motor cortex that are associated with deficits in synaptic long-term potentiation and depression.184, 185, 187, 188 TMS has also revealed that concussion-related impairments in synaptic plasticity can impair aspects of motor learning,188 and that these deficits are detectable decades after an individual's last concussion.189 Another crucial noninvasive tool for detecting neurochemical dysfunction associated with concussion is proton magnetic resonance spectroscopy (MRS). Reports specifically addressing the use of spectroscopy following sports-related concussion suggest various abnormalities consistent with neurochemical alterations.190 In younger (high school) athletes, increased glutamate and glutamine levels were detected by MRS at post-season versus pre-season evaluation, even in players who had not experienced clinically significant concussion during the season.191 Such findings suggest that even subconcussive head impacts can result in the activation of glutamate pathways, implying cellular injury or neuronal death, despite the absence of symptoms. Levels of creatinine and myoinositol (an organic osmolyte located in astrocytes192, 193) were also significantly altered in a subset of the participants in the aforementioned study. In a rare longitudinal study utilizing MRS,194 individuals who sustained a single sports-related concussion exhibited significantly reduced levels of N-acetylaspartate (NAA, a marker of neuronal and axonal health, integrity and functioning195) in the brain 3 days after injury. Levels were increased at 15 days post injury, and reverted to control values at 30 days post injury. By contrast, participants who sustained a second concussion 10–13 days after their initial concussion displayed a prolonged reduction in NAA levels, which had not normalized even 45 days post injury. These results suggest that repeated injury within a short time frame increases the likelihood of protracted or incomplete recovery. 
In addition to the acute and subacute alterations detected by MRS, other studies of the long-term effects of concussion have disclosed increased myoinositol (associated with glial proliferation) and decreased choline (associated with membrane turnover195) levels in the medial temporal lobe in otherwise healthy former athletes who sustained their last concussion more than three decades prior to testing.196 Another recent study examined a cohort of symptomatic retired National Football League players, using an advanced MRS method called correlated spectroscopy (COSY), which can measure additional metabolites.197 The authors identified increased choline and glutamate–glutamine levels (indicative of diffuse axonal injury and excitotoxicity, respectively), consistent with previous mTBI MRS studies, as well as additional cerebral metabolites that were indicative of neuroinflammatory changes. These metabolic changes may provide insight into mechanisms of injury, such as excitotoxicity and/or inflammation, which could underlie the reported structural changes. Overall, the available data support the use of MRS as a research tool to identify altered neurophysiology and monitor recovery in adult athletes, even following resolution of post-concussive symptoms. At present, MRS-detected biochemical alterations may enhance our understanding of the underlying pathophysiology, but do not yet provide specific diagnostic information. Larger cross-sectional, prospective and longitudinal studies are needed to determine the sensitivity and prognostic value of MRS within the field of sports-related concussion.190 Because the interpretation of MRS in the immature brain requires certain developmental considerations, appropriate comparison samples will be needed for future work in children. MRS techniques with greater spectral resolution, including COSY, might provide additional biochemical specificity.197 Other advances in spatial resolution, such as 3D chemical shift imaging, may also provide greater specificity by allowing the investigation of metabolic alterations throughout the brain rather than in specific regions of interest. Finally, MRS could have a role in measurement of treatment effects, such as those induced by transcranial direct current stimulation198 and TMS.199 The mechanisms and surveillance infrastructure for sports-related injury measurement, reporting, tracking and data sharing are insufficient for current needs and objectives. Concussion research and clinical efforts are hindered by a lack of concussion data across sports and playing levels. A 2014 Institute of Medicine report identified only three national sports injury surveillance systems: the National Electronic Injury Surveillance System—All Injury Program (NEISS-AIP), the National Collegiate Athletic Association Injury Surveillance System (NCAA ISS), and the High School Reporting Information Online (RIO™) system.1 These systems can be supplemented with clinical data (for example, from emergency departments, hospitalized inpatients and sports clinics), but these data are biased toward more-severe injuries and patients of higher socioeconomic status. Indeed, schools in rural areas or communities with lower socioeconomic status often have limited access to sports medicine care professionals and facilities. Several emerging programmes may improve surveillance.
Regional efforts such as Clinical Outcomes Research Education for Athletic Trainers (CORE-AT) and national efforts such as the National Athletic Trainers' Association National Athletic Treatment, Injury and Outcomes Network (NATA NATION™) attempt to integrate injury tracking with treatment and outcomes data at the high school and collegiate levels. However, none of these systems specifically captures injuries to younger athletes, those participating in non-school-sponsored sports, or those at schools without athletic trainers. Sports injury databases also rarely account for demographic factors including socioeconomic status, race or ethnicity, and health-care coverage. Currently, no effective mechanisms exist to consistently and inexpensively link various surveillance data sets, or to follow up individual athletes across sports, tracking systems or the age continuum. There is a considerable need for a system that tracks individual athletes through their playing careers and beyond. Each individual should be tracked for several decades to establish if, when and how a given burden of TBI evolves into CTE, and to assess all the possible negative health outcomes associated with concussion. Such a system would also provide more-accurate descriptions of concussion history and exposure to risk factors, and could capture both short-term and long-term outcomes, including measures of physical and mental health, academic and career success, quality of life and social connectivity, and evolving socioeconomic status. Such efforts are challenged by a variety of issues, including a lack of mandatory reporting of concussion at any level. Mandatory concussion reporting, funding for surveillance efforts, and provision of training to data reporters (for example, coaches and athletic trainers) would greatly improve epidemiological research. However, mandatory reporting will not provide meaningful results without validated, consensus definitions of concussion, and development of a universal data repository and a globally unique identifier (GUID) system. Data sets from standardized surveillance efforts could then be linked, thereby improving data sharing for research and clinical care. Coupling of surveillance data with standardized collection, storage and curation infrastructures for biobanking of tissue and fluid samples could dramatically improve injury and outcomes research.200 These efforts might be catalyzed by funding from public–private partnerships, and made actionable by setting realistic short-term and long-term goals to create a multi-year plan. However, in the USA at least, such efforts are currently hampered by misunderstanding of Health Insurance Portability and Accountability Act (HIPAA) regulations and general concerns about athlete confidentiality. Wider use of computerized neurocognitive testing (CNT) for athletes could improve concussion surveillance, as well as diagnosis and management. However, several important challenges must be overcome before CNT becomes routine.
These challenges include a lack of standardized administration protocols, the potential for technological errors arising from different computer hardware, limits in the types of cognitive functions assessed, and a lack of qualified test administrators and data interpreters.201 Despite these shortcomings, CNT is already used by approximately 40% of US high schools that employ athletic trainers.202 Though not affordable for all schools, CNT could enhance ground-level data collection and aid risk-exposure estimation and post-concussion recovery tracking, as well as increase the quality of data reported to sports injury surveillance networks. CNT may also be useful in evaluating and tracking post-concussion cognitive improvement or decline, and could have utility in predicting outcomes.203, 204 Whether CNT data collected in the school setting will reach the validation and reproducibility standards achieved by CNT conducted by a clinical research team remains to be seen. Importantly, CNT needs standardization and guidelines for determining 'return to play' and 'return to learn' for athletes who show recovery in one domain but are still symptomatic in others. More research is required on the utility of CNT, both in the clinic and for concussion surveillance and management of youth athletes. In several critical areas, incomplete knowledge hampers meaningful advances in the field of paediatric concussion. At the molecular and cellular levels, research that focuses on axonal damage after concussion and repetitive subconcussive injury is urgently needed to elucidate changes in axonal trafficking and repair, and to better define the role of transient Aβ accumulation as a potential driver of downstream and/or future pathology. Concussion researchers may need to identify more-suitable animal models to study molecular pathology, including tau and its contribution to post-concussion and CTE pathologies, as the structure and organization of the brain differ dramatically between rodents and humans. Without a clearer understanding of how TBI changes the young, still-developing brain, and what pathological events happen in the weeks, months and years following injury, we are left to speculate about the underlying biological bases of such changes. Head impact data collection and risk assessment in youth sports might be improved through use of sensor technologies that record linear and rotational forces. Such commercially available devices, if validated, could determine levels of cumulative head impact forces during games and across seasons of play, and the findings could be linked to neuroimaging data and functional outcome assessments. Combined with 'hit-count' metrics, sensor data may improve knowledge of short-term and long-term neuropsychological outcomes of repetitive subconcussive impacts. (A minimal computational sketch of such a metric appears at the end of this passage.) Our knowledge of CTE might be improved by understanding baseline rates in the general population, in injured athletes, among uninjured athletes matched by sport and playing positions, and in 'control' athletes in low-risk sports. Improved knowledge of risk exposures could lead to prevention efforts, including practice and competition rule changes. A decades-long, prospective, longitudinal study, following youth athletes through their sporting careers and beyond, would provide more-definitive knowledge of cumulative head impacts and risks of long-term neuropsychological dysfunction and dementia.
Such a study is underway in NCAA alumni, who were first studied in 2003 and were re-assessed in 2013.29, 205 Studies in other populations, especially if NIH-funded, would probably begin as 5-year studies that could be renewed in further 5-year increments. Public–private partnerships are likely to be required to secure enough funding to involve multiple study centres. The NCAA has provided partial sponsorship for the 10-year re-assessment of over 100 athletes, but further funding from the NIH, the US Department of Defense (DoD), and private philanthropic sources will be required to extend the range of assessment from neuropsychology, through MRI, to molecular imaging for amyloid, tau and/or inflammation. Ideally, the longitudinal study design should combine epidemiological and interventional trial methodologies and utilize multiple control groups, including non-contact athletes and uninjured impact sport athletes. A longitudinal study would also shed light on the role of cognitive reserve. A precedent for such studies has been established by the late-life dementia research community, using NIH funds and public–private partnerships involving pharmaceutical companies and foundations. For such studies to be successful, additional surveillance systems and data repositories must first be established. Efforts would be accelerated if athletes participating in impact sports had universal access to athletic trainers, who could act as reliable data reporters while promoting safety and providing basic care. In addition, any longitudinal studies must include postmortem analyses to better understand the influence of childhood and young-adult concussions on the development of neurodegenerative pathology and dementia in later life. 'Return-to-play' guidelines are currently hampered by a lack of rigorous epidemiological evidence, and could be greatly improved by long-term safety data from longitudinal studies.206 Longitudinal research could also include studies to determine whether those athletes who fail to follow guidelines experience any negative health effects, such as lingering symptoms or altered risk of incurring a second concussion. The infrastructure for a long-term prospective study might be created through the formation of a research consortium modelled after the Alzheimer's Disease Neuroimaging Initiative (ADNI). ADNI has set standards for data collection, dissemination agreements, testing methodologies, and biomarker collection and analysis. A version of ADNI currently underway with participation of the DoD (ADNI-DoD) is focused on blast-related TBI research in military populations.207 In May 2014, in addition to the NCAA Concussion Study, the NCAA and the DoD announced the launch of the largest prospective sports-related concussion study to date, which will monitor approximately 37,000 NCAA athletes over 3 years. One can envision how this study's infrastructure might eventually be extended to follow younger athletes over a longer longitudinal range. Many gaps remain in our knowledge of the biology of TBI, which limit our ability to develop effective drugs. These gaps must be filled if we are to tackle the underlying disease pathology and move beyond treating the symptoms. However, much can be accomplished while research into fundamental TBI biology continues. Drug repurposing involves testing of existing FDA-approved drugs for new indications, and can reduce expense and shorten the path to drug approval.
Current repurposing trials include methylphenidate for pain and mental fatigue,208 the dopamine receptor agonist bromocriptine for working memory,209 and the antidepressant sertraline for mood and anxiety, the most frequent neuropsychological complications that influence long-term outcomes after concussion.210 Larger randomized clinical trials should be conducted before these drugs can be introduced into clinical practice for these new indications. In addition, the recent failure of the PROTECT phase III trial of progesterone to improve outcomes after acute TBI211 may serve as a reminder of the need for more research to better understand the fundamental biology underlying TBI. Although many drug repurposing efforts are designed primarily to address concussion symptoms, the drugs may also influence injury pathology and progression. Research on established drugs can also lead to new drug discovery efforts and, potentially, new preventive or management therapeutics. New drugs are urgently needed for TBI and concussions that do not resolve. Drug discovery efforts in the areas of neuroprotection and anti-inflammation are especially relevant because of their potential cross-applicability to neurodegenerative diseases such as AD. Similarly, drugs currently in development for other neurodegenerative diseases might be repositioned for testing in patients with TBI or nonresolving concussion symptoms. As is often the case in medical research, recent advances in concussion research raise as many questions as they answer. Evidence exists for long-term neuropsychological dysfunction and later-life dementia after concussions or repetitive subconcussive head impacts, and more work is needed to better understand the implications and outcomes of youth participation in impact sports. As outlined in this Expert Consensus Document, there is a path forward, but achieving the goals outlined here will require public and private sector cooperation. While recommendations can be improved with increased knowledge, the available evidence can still inform individual decision-making when considering youth sport participation, as well as practice policies and competition rules. With an ageing population and a looming epidemic of dementia, we must learn more about potential early-life risk factors, including sports-related concussion. The choices made by parents, coaches, school boards and children will be better informed when the critical gaps in scientific knowledge of concussion are filled. 
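To make the 'hit count' and cumulative-exposure ideas discussed earlier concrete, here is a minimal sketch in Python; the event fields, the 10 g counting threshold and the aggregation rule are illustrative assumptions (only the 100 g linear-acceleration figure echoes the youth American football number cited above), not a validated exposure metric.

```python
from dataclasses import dataclass

@dataclass
class ImpactEvent:
    """One helmet-sensor record: peak linear acceleration (g) and peak
    rotational acceleration (rad/s^2). Field names are assumed."""
    peak_linear_g: float
    peak_rotational_rad_s2: float

def exposure_summary(events, hit_threshold_g=10.0):
    """Count impacts above an (illustrative) threshold and aggregate
    linear acceleration; accurately defining what counts as a 'hit'
    is the open challenge noted in the text."""
    hits = [e for e in events if e.peak_linear_g >= hit_threshold_g]
    return {
        "hit_count": len(hits),
        "cumulative_linear_g": sum(e.peak_linear_g for e in hits),
        "max_linear_g": max((e.peak_linear_g for e in hits), default=0.0),
    }

# Example session: three impacts, one exceeding the 100 g level
# reported for youth American football.
session = [ImpactEvent(8.5, 400.0), ImpactEvent(25.0, 1200.0),
           ImpactEvent(103.0, 5600.0)]
print(exposure_summary(session))
# {'hit_count': 2, 'cumulative_linear_g': 128.0, 'max_linear_g': 103.0}
```

Season-level exposure would simply accumulate such summaries across sessions, which is the kind of quantity a longitudinal study could relate to neuroimaging and outcome data.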
Download references|与运动相关的脑震荡和重复性亚震荡暴露越来越被认为是儿科人群的潜在危险,但是对于这些事件的短期和长期后果,包括潜在的认知障碍和晚年痴呆的风险,仍然知之甚少。这份专家共识文件是由全球安全儿童、阿尔茨海默氏症药物发现基金会和安德鲁斯矫形外科和运动医学研究所召集的为期一天的会议的结果。目标是强调在脑震荡科学、痴呆症、遗传学、诊断和预后生物标志物、神经影像学、运动损伤监测和信息共享等领域的知识差距和亟需研究的领域。针对这些领域,我们提出了明确和可实现的途径,以提高对青少年体育相关脑震荡的理解、治疗和预防。2009年,美国年龄 < 19岁的个体中记录了约250,000例非致命性创伤性脑损伤(TBI)。1疾病控制和预防中心估计,5-18岁的年轻人维持着所有运动相关脑震荡的65% 。2尽管最近在诊断性脑成像方面取得了进展,并且在我们对脑震荡物理学的理解方面,长期的认知结果仍然知之甚少。由于脑震荡的身体、认知和情感后果引起了公众的广泛关注,我们对如何预防、诊断和治疗这种伤害的不完整知识危及我们儿童的总体健康,特别是他们的大脑健康。这份专家共识文件是儿科和成人创伤性脑损伤、阿兹海默病(AD)研究、遗传学、流行病学、生物伦理学和运动医学领域专家为期一天的会议的结果(专栏1) ,该会议于2013年11月由全球安全儿童、阿尔茨海默氏症药物发现基金会和安德鲁斯矫形外科和运动医学研究所召集。我们的主要目标是强调我们在儿童和青少年脑震荡知识方面的重大差距。我们强调需要进行研究的领域,如开发诊断和预测性生物标志物,阐明遗传风险因素,以及预测短期和长期结果。在我们的结论中,我们提出了提高我们对与运动相关的儿童脑震荡的长期后果的理解的途径。术语“脑震荡”经常与术语“轻度 TBI”(mTBI)交替使用,考虑到脑震荡后可能的脑损伤程度和慢性神经心理功能障碍的潜在可能性,这是一种潜在的误导性做法。然而,我们应该强调的是,大多数脑震荡不会产生后遗症。美国康复医学会将 mTBI 定义为格拉斯哥昏迷量表3评分为13-15分,意识丧失 < 30分钟,创伤后遗忘持续时间 < 24小时。脑震荡描述了损伤表型的异质混合物,取决于许多因素,包括头部撞击的大小,位置和方向。尽管缺乏宏观结构发现,脑震荡损伤涉及由线性和旋转剪切力破坏轴突和膜功能(弥漫性轴突损伤,5离子通量和谷氨酸兴奋毒性)引起的原发性神经元损伤,随后是继发性病理生理效应,包括线粒体氧化应激,脑血流中断,血脑屏障(BBB)完整性受损,突触功能障碍和神经炎症。持续的神经心理学脑震荡后症状(脑震盪症候群)包括情绪障碍(例如抑郁症) ,难以集中和记忆问题(方框2)。年轻运动员的脖子和躯干比老年人的脖子和躯干更弱,因此,造成脑损伤所需的力量更少。发育中的大脑也可能特别容易受到由头部创伤的剪切力引起的轴突损伤,这在美国青年足球中可以超过100g 的线性加速力。然而,青年运动中持续的平均力量通常会小于较高水平的运动。正确的突触发育对认知和行为健康至关重要。神经发生、竞争性突触消除(“修剪”)、髓鞘形成、轴突和树突树枝化等过程在产前发育的整个生命周期中持续进行。额叶和颞叶是最后成熟的区域,人类在20岁出头的时候经历了这些区域的修剪[16] ,因此这些仍在发育的区域的损伤可能对大脑产生病理生理效应,增加了以后生活中出现神经心理问题的可能性。轴突髓鞘形成在青春期持续到20岁出头,易受损伤的影响。职业拳击手大脑健康研究的早期结果表明,第一次接触拳击比赛的年龄越早,尾状核体积损失越大,额叶轴突损伤越严重。这项研究对拳击手和追踪研究综合格斗拳击手进行了5年的研究,他们都经历过重复性的脑震荡和脑震荡。23,24年轻的大脑也有一些有助于恢复的特征。已经显示,这个年龄组的神经可塑性增加有助于局灶性损伤后更好的结果[25]。此外,发育中的动物对重复 TBI 的葡萄糖代谢障碍的窗口比成年动物更短[26]。总的来说,发育中的大脑在 TBI 后显示出脆弱性和恢复力。这些相互交织的因素可能解释了脑震荡和重复 mTBI 对年轻人和成年人大脑影响的差异。应高度重视对脑震荡风险采取保守的方法,并加大努力调查这些发育差异。大多数人ーー无论老少ーー从脑震荡中完全恢复过来。在儿童中,可能影响康复的因素包括年龄和脑震荡史。27,28在一项研究中,大约90% 的年轻成年男运动员在21天内经历了症状恢复。然而,在一项针对11-22岁患者(包括所有脑震荡原因,而不仅仅是运动相关)的急诊科研究中,15% 的样本在受伤后90天仍然表现出脑震荡后症状,包括头痛,头晕,“精神模糊”和抑郁。一些研究表明,美国高中橄榄球运动员从脑震荡中恢复的速度比大学运动员和职业运动员要慢。尽管最近一项包括青春期前年龄组(11-12岁)的研究表明,脑震荡后恢复持续时间可能与年龄没有线性关系,但与高中以下青少年的直接比较尚未发表[30] ,因为这个样本中的青少年恢复时间比青春期前的儿童更长。这些发现加在一起,意味着男性青春期年龄组的恢复时间较长的独特风险。对年幼儿童和女性的进一步研究将大大提高我们评估和减轻整个儿科和青少年年龄段风险的能力。在新的脑震荡发生前1年内遭受一次或多次脑震荡的青少年报告出现更长时间的症状,30表明可能存在“脆弱性窗口”,并将先前受伤的青少年置于更高的长期恢复风险中。11-18岁的青少年在急诊室出现脑震荡后发生脑震盪症候群的可能性比5-10岁的儿童高出近80% ,同样,伴有头痛的儿童和青少年出现脑震盪症候群的风险增加了一倍。在 mtBI 后在急诊室接受治疗的儿童中,6岁以上的儿童在受伤后3个月报告持续症状的发生率高于6岁以下的儿童。当然,获得 < 6岁儿童脑震荡症状的准确信息的能力可能受到缺乏症状自我意识和有效沟通这些症状的必要语言技能的限制。此外,从这些报告中不可能直接比较损伤的严重程度; 事实上,各种损伤的身体异质性,加上个体从脑震荡中恢复的先天能力,使得这种比较具有高度挑战性。一些专业研究中心正在使用“智能头盔”来标准化头部撞击产生的体力和角加速度,目前正在研究这些头盔用于测量和预测可能导致脑震荡的影响。36,37从脑震荡中恢复的年轻人可能会经历重大挑战,包括社会和学术发展的改变,38,39,40在一般智力测试中得分较低,以及学校表现下降(以年级平均分衡量)。39较低的父母教育水平和儿童学业成绩都与较差的脑震荡恢复相关。人格特质也起到了一定的作用,例如,伤前焦虑是运动性脑震荡后长时间恢复的一个危险因素。42年轻的男女运动员都有脑震荡的危险,但是女孩的脑震荡发生率高于男孩,特别是在高中和大学的足球、篮球、棒球或垒球比赛中。28,43,44,45解释这些差异的因素仍然不确定,但可能包括保护装备的质量,脑震荡症状的识别和报告,以及颈部长度和颈部肌肉力量。46男女之间在恢复轨迹方面的差异也知之甚少。然而,最近的一项研究表明,女性黄体酮水平影响脑震荡后的恢复。47青春期激素变化导致偏头痛,也可能导致脑震荡后恢复的性别差异。在青春期后,女性偏头痛的发病率是男性的四倍[48,49] ,一些证据表明,偏头痛患者在脑震荡后恢复较慢[50,51]。有必要进一步研究脑震荡风险和恢复的性别差异。一般来说,成人脑震荡比儿童和青少年脑震荡更容易理解。有几点值得注意。首先,脑震荡有多种非协调的定义。其次,脑震荡诊断是一门不完善的艺术。最后,在缺乏快速和廉价的客观诊断措施的情况下,脑震荡仍然是一种临床诊断,受到变异性的影响,包括不同亚专业和个体医生、神经心理学家和运动训练员的诊断阈值不同,以及教练、家长和年轻运动员报告不足。如果没有经过验证的诊断,脑震荡将仍然是一个模糊和报告不足的实体,发病率估计的准确性将继续受到不确切标准的差别应用的影响。重复性次级脑震荡可导致大脑结构和功能的改变。52弥散张量成像(DTI)检测到的白质异常在职业足球运动员中已有报道,即使没有任何明显的脑震荡史。与游泳运动员相比,男性职业足球运动员表现出 DTI 信号改变,提示几个大脑区域的白质完整性降低,这可能表明轴突髓鞘形成的丧失,类似于 mTBI 患者的改变。53名大学冰球运动员在一个赛季中表现出类似的白质变化。54,55,56,57此外,美国大学生橄榄球运动员重复性亚震荡性头部撞击已经以剂量依赖性方式与 BBB 
完整性缺陷,白质完整性潜在丧失和认知功能障碍有关。58这些研究结果可能反映了持续遭受重复性次生脑震荡撞击的青少年的某种程度的风险,尽管很少有专门针对这一主题的研究。一个跟踪头部影响的指标ーー即“命中次数”ーー已经提出,59可以作为确定累积风险敞口的一个因素。这种方法的一个挑战是准确定义“命中”的参数,但改进的生物传感器在这方面显示出一些希望。与棒球中的“投球次数”类似,这个概念最近也被提出用于拳击运动员。24目前没有证据表明青少年重复性脑震荡冲击与晚年痴呆之间的因果关系,如果未来的研究将头部冲击与随后的神经心理功能障碍相关联,这些指标可能被证明是无价的。在成年人中,包括脑震荡在内的脑外伤可能会增加个体发生神经退行性疾病的风险,包括 AD 和 CTE (CTE) ,这是一种仅与重复性头部创伤相关的疾病[65,66]。尽管 mTBI 和 PD 风险之间的关系仍然不确定,但 TBI 也可能增加发生帕金森氏症的风险[67]。在儿科人群,特别是年轻运动员中,单次或重复性脑震荡对晚年神经退行性疾病和痴呆风险的影响是未知的。CTE 在20世纪20年代后期首次被症状性描述为拳击运动员的“拳击醉”痴呆,69后来被描述为“痴呆拳击”[70] ,并在1973年首次被病理学描述[71]。自2005年在一名前职业美式足球运动员身上发现 CTE 以来,这种病症已经引起了公众的广泛关注,目前已经在前冰球、棒球、橄榄球和足球运动员、73名摔跤运动员、74名退伍军人的大脑中发现。75,76业余和职业运动员慢性创伤性脑病的患病率和发病率仍然是未知的,这增加了讨论其流行病学和运动员的人口风险的困难。虽然慢性创伤性脑病主要被认为是一种神经退行性疾病,有时是由大学或专业接触性运动的职业生涯造成的,但在高中运动员中也有慢性创伤性脑病的报道。这一发现表明,慢性创伤性脑病的发展并不需要长期的运动生涯,青年运动员代表着高危人群。新出现的证据表明,临床的慢性创伤性脑病症状可以分为认知和情绪行为两种常见表现[78,79]。主观记忆症状如顺行性遗忘症是常见的,包括焦虑或抑郁在内的情绪障碍也是常见的[79] ,并且执行功能降低,这可能导致去抑制和决策技能受损[80]。这些临床症状定义了疾病的严重程度[81]。慢性创伤性脑病的神经退行性病理生理学是复杂的,对神经系统后遗症的了解很少。在严重的情况下,大脑皮层和内侧颞叶似乎受到最深刻的影响,81,82与病理学拥有属性由磷酸化 tau79组成的神经原纤维缠结,在某些情况下,TAR DNA 结合蛋白43病理学。CTE 也与明显的萎缩有关,特别是在额叶皮层和内侧颞叶,以及在乳头体,丘脑和下丘脑。79确诊的 CTE 临床诊断仍以尸检为基础。鉴于慢性脑震荡中描述的重复病变是否引起临床表型的不确定性,以及大多数专业和大学运动员不发展慢性脑震荡的事实,了解早期暴露于脑震荡是否与其他形式的神经退行性疾病和认知功能障碍(包括慢性神经认知障碍(CNI))相关至关重要。CTE 和 CNI 之间存在重要的临床区别,其中一些使得直接比较困难。CTE 是一种新出现的临床和病理状况,涉及多个领域的神经和认知功能的进行性恶化,主要在尸检中诊断。相反,CNI 表型并不一定是进行性的,而是拥有属性功能从组平均值或基线功能下降到创伤性脑损伤之前的水平。CNI 可以通过神经心理测试进行临床诊断。CNI 与头部创伤之间的因果关系尚未得到证实,但在专业运动员中一直发现剂量依赖性风险。此外,在业余运动员中进行的几乎一半的研究发现 CNI 的风险升高。年轻人群中是否存在类似的风险关联仍有待确定。一个假设是 CNI 代表了慢性创伤性脑病的前驱症状,但并非不可避免,类似于轻微认知障碍和 AD 之间的关系。另外,CNI 可能代表静态损伤而不退化。我们目前对 CNI 和 CTE 的基本生物学基础缺乏了解,这强调了进一步研究的必要性。对这两种情况的生物学知识的增加以及运动员(特别是青年运动员) CNI 的早期检测可能会推动干预措施以阻止进一步认知障碍的发展,并且还可能有助于验证推定的生物标志物。通过 tau 成像评估 CNI 可能有助于确定进展为 CTE 的可能性。脑震荡遗传学领域,特别是在儿科人群中,仍然处于起步阶段。尽管重复的头部撞击似乎对于 CTE 的发展是必要的,但是包括遗传学在内的其他因素可能具有重要作用,因为大多数脑震荡运动员不发展 CTE.87 CTE 的遗传危险因素可能与影响脑震荡易感性和恢复的因素重叠,AD 的遗传危险因素为这些因素的身份提供了重要的线索。E型载脂蛋白质的 ε4等位基因(APOEε4)是迄今为止发现的 AD 最重要的遗传危险因素,它严重影响中枢神经系统的损伤反应,特别是从大脑中清除淀粉样蛋白 -β (Aβ)。APOE 的三个等位基因赋予不同程度的 AD 风险: APOEε2降低风险,APOEε3是最常见的等位基因,代表与其他变体进行比较的基线风险,APOEε4增加风险。90,91研究表明 APOEε4与性别之间存在相互作用,因此 APOEε4相关的 AD 风险在女性中比在男性中更为突出。92,93 APOE 基因型与 TBI 协同作用增加 AD 的风险[94] ,尽管其与 CTE 作为重复 mTBI 的结果的假设风险相关性需要更多的研究。关于 APOE 同种型对儿童 TBI 结果的影响尚未达成共识,但来自成年人的数据表明 APOEε4对脑震荡结果有负面影响。一些研究表明,拥有至少一个 APOEε4等位基因与美国职业橄榄球运动员,96名拳击运动员95和其他成年人97,98,99,100的脑震荡后认知较差和持续的神经心理障碍有关,尽管其他研究没有发现这种关联。101,102一些证据表明 APOE 基因及其启动子的多态性是大学生运动员脑震荡危险的促成因素。另一项研究没有确定 APOEε4在脑震荡风险中的作用[105] ,尽管这个等位基因可能增加中年或晚年 mTBI 后痴呆的风险。106由于样本量小,方法不同,很难从这些相互矛盾的研究中得出结论。在儿童中,对于 APOEε4与脑震荡后神经心理学结果之间的关系知之甚少,而且 APOEε4测试在儿科 TBI 研究中并不常规。2012年,Kurowski 回顾了少数现有的研究,并结合了使用格拉斯哥结果量表的三项研究的结果[107,108,109]。在合并样本(252名儿童)中,6-12个月后不良临床结果的风险在 APOEε4携带者中高于非携带者(19% 比9%)。然而,这些研究包括了广泛的异质性损伤儿童的发育范围,并没有考虑到年龄和基因型之间可能的相互作用。此外,APOE 与性别之间的相互作用尚未在脑震荡的背景下进行研究。改进的前瞻性研究有助于澄清这些联系。将遗传学纳入儿科脑震荡研究充满了复杂的挑战,包括获得父母同意和儿童的知情同意,临床研究参与者的感知耻辱,获得的遗传知识的可行性以及关于可保性(特别是长期护理保险)的潜在担忧。对了解 APOEε4 + 状态的成年人的研究表明,许多人愿意改变生活方式,包括增加运动和改善药物管理[111] ,以及增加购买健康和长期护理保险[112,113]。关于新的遗传知识和相应的疾病风险的教育是必不可少的,正如个人对获得的知识的影响的个人感觉与痴呆风险增加的实际后果之间的实质性不一致所证明的那样.114 APOE 遗传知识对儿童,其家庭和参与影响性体育的决策过程的影响尚不清楚。APOE 基因型对该年龄组脑震荡风险和恢复的影响也需要进一步阐明。如果未来的研究发现,对于任何特定水平的影响,具有 APOEε4 + 状态的儿童比其 APOEε4同龄人具有更大的脑震荡或恢复不良的风险,则应考虑在参加影响性运动之前对学龄运动员进行基因检测。要充分理解基因影响的细微差别,就需要对高中和年轻运动员进行仔细研究。未来对青少年脑震荡结果(包括认知结果和痴呆风险)的研究应尽可能包括 APOE 基因分型。新的 APOE 研究应标准化研究方法和报告措施,包括收集“共同数据元素”,以确保有效的比较研究。110,115 APOE 基因型不一定是脑震荡恢复的不可改变的危险因素: 正在开发的 AD 治疗包括改变 ApoE4蛋白和 Aβ 之间相互作用的药物,这也可能适用于儿科脑震荡。编码脑源性神经营养因子的基因中的 Val66Met 多态性与 mtBI 后更好的结果有关,但与局灶性穿透性脑损伤后更差的结果有关。参与多巴胺能信号传导的基因多态性也可能有助于解释广泛的 TBI 结果。120此外,α-synuclein 基因启动子区的 
Rep1多态性可能增加头部损伤后帕金森病的风险。为了提高我们对脑震荡风险和管理的理解,应该进行大型的前瞻性基于人群的全基因组关联研究(GWAS)和全基因组测序研究,以确定其他遗传变异(可能是低频率或低外显率) ,这些变异可以改变长期恢复,认知结果差或痴呆的风险。122这样的研究将需要大规模的数据共享,并且必须解决道德、隐私以及对可保性和可雇佣性的潜在影响等问题。尽管在确定可能应用于成人创伤性脑损伤治疗的可能的脑嵴液(CSF)和血液生物标志物方面取得了进展,但成人或儿科人群都没有经过临床验证的生物标志物。与成人脑震荡相比,儿童脑震荡的临床变异性更大; 因此,生物标志物在改善儿童脑震荡诊断方面具有特殊的潜力。值得注意的是,大多数 TBI 生物标志物已经在中度至重度 TBI 的背景下进行了研究,这使我们在 mTBI 生物标志物的知识方面存在明显的差距,特别是在儿童中。生物标志物的发展对 AD 治疗的进步至关重要。基于脑脊液的生物标志物已经被用于识别高危患者,并改善流行病学研究和临床试验的设计。123新的 PET 放射性配体,如淀粉样蛋白标记剂(其中三种现在是 FDA 批准的) ,可以用于诊断和改善基于神经病理学的患者临床试验分层。一些 tau 成像剂也在人体试验中,它们在包括 CTE 在内的 tau 病中的应用正在迅速建立。与基于液体的生物标志物一样,目前还没有足够敏感或特异的神经影像生物标志物来诊断成人或儿童的脑震荡或 CTE。目前 FDA 尚未批准任何创伤性脑损伤的诊断或治疗药物,而脑震荡生物标志物的验证可以加速这类药物的开发。然而,必须努力确保临床生物标志物检测的成本效益和广泛可用性。此外,考虑到与腰椎穿刺相关的风险,对脑震荡青少年脑脊液取样用于生物标志物研究的伦理问题应该得到解决。在成人体液为基础的生物标志物研究中有希望的发现必须在儿科人群中探索。过去数十年,推定脑震荡的生物标志物在科学文献中零星出现,其中最突出的是星形胶质细胞活化的非特异性标志物 S100钙结合蛋白 B (S100B)。血清中 S100B 的存在可能提示血脑屏障完整性的丧失。在成年拳击手比赛后观察到血清和脑脊液 S100B 水平升高,并且与头部撞击的数量和严重程度呈正相关。在脑震荡的职业冰球运动员中也观察到血清 S100B 水平升高,126在脑震荡后1小时测量的水平预测症状恢复时间。然而,S100B 的水平也提高后,控制发挥,没有发生脑震荡,表明这一标志物是不伤害特异性。事实上,没有头部损伤的成年创伤患者血清 S100B 水平升高。127,128,129其他研究表明,脑震荡后最初的 S100B 水平对于恢复不能很好地预测。与所有生物标志物一样,S100B 在儿童 TBI 管理中的作用甚至更不清楚[131] ,一些人认为这种标志物在儿科人群中几乎没有诊断或预后效用。132在一项关于≤15岁 TBI 患儿的研究中,5岁以下或9岁以上儿童的血清 S100B 水平高于5-9岁儿童。因此,S100B 可能不足以区分有症状和无症状的脑震荡儿童[133] ,S100B 在诊断和预后预后方面的效用是值得怀疑的。134,135,136神经元特异性烯醇化酶(NSE)是神经元损伤的标志物,但其作为血清或脑脊液生物标志物的用途仍不确定。拳击手头部撞击后观察到血清 NSE 水平升高[133,134,135,136,137] ,但在没有发生脑震荡的比赛后,冰球运动员也观察到 NSE 水平升高。血清 NSE 水平无法预测脑震荡后的恢复时间,可能与儿童损伤严重程度无关。133在≤15岁的儿童中,血清 NSE 水平与年龄呈负相关。一旦释放到血液中,NSE 具有缓慢的消除动力学,使得难以根据 NSE 水平区分原发性和继发性神经元损伤。神经丝轻链和胶质纤维酸性蛋白(GFAP)分别是 CSF 神经元特异性和胶质特异性损伤标志物,并且在成年拳击手打斗后 CSF 均升高。125,137,140在儿科脑震荡的情况下,对任何一种标志物都知之甚少,但对儿童和年轻成年人的初步研究表明,脑震荡后72小时内的血清 GFAP 水平与损伤后1个月的症状负担相关。神经元特异性蛋白 UCH-L1(泛素羧基末端水解酶同工酶 L1)首先通过参与 PD 与神经退行性病理学相关[142] ,其在血清中的存在后来被确定为严重 TBI 的生物标志物。血清 UCH-L1水平可能对脑震荡有诊断价值[146] ,但最近的证据表明血清水平升高与脑震荡次数之间缺乏相关性。UCH-L1在儿科人群中的临床应用值得进一步研究。也许最有希望的进展成人液基 TBI 生物标志物涉及 tau 蛋白。血清或脑脊液 tau 蛋白水平被认为表明轴突损伤,因为 tau 蛋白通常存在于轴突中,稳定微管。在 AD 患者中,脑脊液中切割的 tau 蛋白水解水平可能与认知功能相关。拳击手在比赛后脑脊液和血液中的 Tau 水平升高,脑脊液 Tau 水平与头部撞击的质量和数量相关。125,150最近的证据表明,脑震荡后冰球运动员血液中的 tau 水平升高,可能有助于预测恢复时间。然而,问题依然存在,一些研究报道血清切割 tau 对预测脑震盪症候群或长期结果的价值很小或没有价值。130,151 tau 作为儿童生物标志物的潜力尚不清楚,至今没有进行研究。事实上,血清 tau 作为一种生物标志物的可靠性尚未被确定为任何适应症。这种可能性是没有单一的生物标志物将足以诊断儿童脑震荡或预测结果。此外,很少有研究调查遗传组成和推定的生物标志物之间的相互作用。随着我们对生物标志物与损伤严重程度及其相互关系的理解的增加,生物标志物小组的发展,可能包括炎症和氧化标志物,152应该被考虑。未来的研究应试图进一步确定这些关系,建立生物标志物小组的临床价值,考虑到商业成本和实际可行性。代谢组学、脂质组学和蛋白质组学的最新进展ーー特别是寻找 AD 的代谢组学和脂质组学标志物ーー可能为今后研究脑震荡和脑震荡下损伤的生物标志物提供参考。最近的一些研究提出了与 MCI 和 AD 相关的代谢物和脂质谱的改变.153,154,155,156来自动物模型的数据表明,脂质和代谢物变化伴随着急性和慢性脑震荡后期,并且可能有助于预测恢复轨迹,157,158但是这些发现尚未在人类中得到验证。将生物标志物的搜索范围从血液和脑脊液扩展到唾液和尿液159,可能会提高快速和非侵入性测量的能力,特别是从儿童身上。从儿童抽取脑脊液样本,特别是在需要快速评估的情况下,在很大程度上是不切实际的。Mondello 等人提出了一套评估 TBI 生物标志物的有用标准,这些标准应该允许更精简的开发和验证.137任何经过验证的生物标志物小组必然是更大的多模式诊断套件的组成部分,其中可能包括结构和功能成像以及神经心理学测试。在设计未来的生物标志物研究时,应考虑 FDA 批准的可能性,以加快批准临床使用。虽然脑震荡仍然是一种临床诊断,但神经影像学技术正在提高我们对成人脑结构和功能后果的认识。儿科人群的神经影像学可能受到几个因素的限制,例如,脑震荡后纵向变化的测量由于动态的、未成熟的大脑的背景而变得复杂。没有成像技术被证实为脑震荡的诊断工具,成像结果与临床可测量的认知或行为功能之间的相关性是可变的。目前正在研究容积成像、 DTI 和功能磁共振成像(fMRI)等工具,特别是动脉自旋标记。通过 DTI 测量的分数各向异性(FA)可以推断白质束的结构完整性,TBI 后白质束通常被破坏。FA 变化的临床意义仍然存在争议,因为在脑震荡研究中观察到 FA 增加和减少[162,163,164,165,166]。这些差异可能部分是由于所检查的脑区域的相当大的空间异质性[167]以及损伤后间隔的差异。FA 可能仍然具有预后价值,有证据表明变化的方向和幅度与临床结果相关; 然而,这个想法等待在儿科和成人人群中验证。FA 可能缺乏必要的敏感性来充分评估脑损伤后白质束完整性的变化,扩散率的测量可能更合适。169 DTI 领域将大大受益于规范数据集的开发,以衡量观察到的变化。年轻运动员的赛前、赛后和赛季研究可以采用连续 DTI 成像技术为特定个体建立规范的数据,但数据汇总后的效用尚不清楚。儿科标准数据的缺乏严重限制了包括 DTI 在内的神经影像技术的临床应用。儿童脑震荡后的“回归基线”神经影像学研究也是必要的,因为它们可以极大地改善恢复的预测。尽管自动化提高了重复性,但 DTI 测量仍然对硬件和软件特异性,采集参数和分析软件敏感,这限制了重复性,标准化和中心之间以及跨研究之间的比较。标准化 DTI 
成像中心的努力正在进行中。170 MRI 在绘制大脑的“连接体”(结构和功能神经连接网络及其各自的焦点节点的集合)以及研究脑震荡如何影响这些网络方面特别成功。局灶性或弥漫性 TBI 可以破坏大脑的功能连接,导致多个网络的功能障碍,包括默认模式和显着网络,这与记忆,情绪和情绪有关[171]。网络功能障碍对恢复的影响可能比病变部位更强[171,172,173] ,但对大脑发育和认知功能的长期影响尚不清楚[26,174]。脑震荡后儿童网络连接功能障碍的进一步研究对于改善损伤预后和管理至关重要。用于 PET 成像的放射性示踪剂有可能推进脑震荡和 CTE 的诊断和治疗,但目前它们在儿科人群中的应用纯粹是研究性的。三种 FDA 批准的放射性标记成像剂目前可用于检测疑似 AD 患者的脑淀粉样蛋白。175在成年人中,一些脑震荡病例与急性 Aβ 病理有关。PET 扫描可以使儿科患者监测急性脑震荡后淀粉样蛋白的存在和持续性,并确定扫描阳性和阴性是否预测不同的结果.176,177在儿科人群中具有潜在用途的其他 PET 成像剂包括结合由 tau 组成的神经原纤维缠结的新示踪剂。用18F-T807,18F-T808和18F-THK5105进行的早期成像结果证明对于确认包括 AD 在内的各种临床情况下存在共病是有用的。178,179,180在最近的一项 AD 研究中,tau 示踪信号的大小与疾病的分期和认知障碍的严重程度呈正相关。第三种 tau PET 示踪剂11C-PBB3已经在健康个体和 AD 患者中进行了测试,并且可能能够检测 tau 的非 AD 构象。181此外,最近的一份报告首次描述了疑似与运动相关的慢性创伤性脑病(CTE)在活人中的重病影像学表现。鉴于脑震荡,重复性亚震荡损伤和 CTE 中慢性 tau 病理学的程度,tau 示踪剂可用作诊断和预后生物标志物(例如,区分 CNI 和 CTE)。目前正在对 CTE 成人进行这些示踪剂的研究,但它们在儿科人群中的应用将取决于未来的研究,以确定 TBI 或脑震荡后年轻患者是否存在 tau 病理学。小胶质细胞胆固醇转运蛋白的 PET 示踪剂可能有助于成像与创伤性脑损伤相关的神经炎症。182正在开发的新型 PET 配体可以成像脑小胶质细胞,对神经退行性疾病具有潜在的应用价值,也可能证明对脑震荡和慢性创伤性脑病的治疗有用。在脑震荡和 TBI 的儿科人群中探索这些 PET 配体将是有益的,但是在开始进行涉及该年龄组放射性示踪剂的研究之前必须进行风险-效益分析。任何 PET 成像剂的最终效用将取决于其作为多模式生物标志物和神经影像技术小组的一部分的诊断和预后价值。非侵入性技术如经颅磁力刺激(tMS)也发现了创伤性脑损伤和脑震荡后突触可塑性的变化,特别是在无症状的个体中。对20多岁有脑震荡史的年轻运动员进行的几项小型 TMS 研究表明,运动皮层中 γ-氨基丁酸和/或谷氨酸神经传导的不平衡与突触长时程增强作用和抑郁症的缺陷有关。184,185,187,188经颅磁刺激还显示,脑震荡相关的突触可塑性损伤可以损害运动学习的各个方面,这些缺陷在个体最后一次脑震荡几十年后仍然可以检测到。另一个检测与脑震荡相关的神经化学功能障碍的关键非侵入性工具是质子磁共振谱(MRS)。专门针对运动相关脑震荡后使用光谱学的报告表明,与神经化学改变一致的各种异常。在年轻(高中)运动员中,MRS 在赛季后与赛季前评估中检测到谷氨酸和谷氨酰胺水平增加,即使在赛季期间没有经历临床显着脑震荡的运动员中也是如此。这些发现表明,即使是次震荡性头部撞击也可能导致谷氨酸途径的激活,意味着细胞损伤或神经元死亡,尽管没有症状。在上述研究中,一部分参与者的肌酐和肌醇水平(位于星形胶质细胞中的有机渗透液192,193)也发生了显著变化。在一项使用 MRS 的罕见追踪研究中,194名持续单次运动相关脑震荡的个体在受伤后3天在大脑中表现出显着降低的 N- 乙酰天冬氨酸(NAA,神经元和轴突健康,完整性和功能的标志物195)水平。损伤后15天水平升高,损伤后30天恢复到对照值。相比之下,在第一次脑震荡后10-13天再次受到脑震荡的参与者表现出 NAA 水平的长时间下降,即使在受伤后45天也没有恢复正常。这些结果表明,在短时间内反复受伤增加了延长或不完全恢复的可能性。除了 MRS 检测到的急性和亚急性改变之外,其他关于脑震荡长期影响的研究已经揭示了在其他健康的前运动员中,内侧颞叶中肌醇(与胶质增殖相关)增加和胆碱(与膜转换相关195)水平降低在测试之前持续最后一次脑震荡超过三十年。196最近的另一项研究使用一种叫做相关光谱学(COSY)的先进的 MRS 方法,检测了一组有症状的退役国家橄榄球联盟球员,这种方法可以测量额外的代谢物。作者发现胆碱和谷氨酸-谷氨酰胺水平升高(分别表明弥漫性轴突损伤和兴奋性毒性) ,与之前的 mtBI MRS 研究一致,以及额外的大脑代谢物表明神经炎症的变化。这些新陈代谢的变化可能提供了损伤机制的洞察力,如兴奋性毒性和/或炎症,这可能是所报道的结构变化的基础。总的来说,现有的数据支持使用 MRS 作为一种研究工具,以确定改变的神经生理学和监测恢复成年运动员,即使在解决后脑震荡症状。目前,MRS 检测到的生化改变可以增强我们对潜在病理生理学的理解,但尚不能提供具体的诊断信息。需要更大的横断面,前瞻性和纵向研究来确定 MRS 在运动相关脑震荡领域内的敏感性和预后价值.190由于未成熟大脑中 MRS 的解释需要某些发育方面的考虑,因此将来在儿童中的工作将需要适当的比较样本。具有更高光谱分辨率的 MRS 技术,包括 COSY,可能提供额外的生化特异性。空间分辨率的其他进展,如3D 化学位移成像,也可以通过允许调查整个大脑的代谢改变而不是在特定的感兴趣的区域,提供更大的特异性。最后,MRS 可以在测量治疗效果方面发挥作用,例如经颅直流电刺激198和 TMS.199。体育相关伤害测量,报告,跟踪和数据共享的机制和监测基础设施不足以满足目前的需求和目标。脑震荡的研究和临床工作受到缺乏运动和运动水平的脑震荡数据的阻碍。2014年美国医学研究所的一份报告只确定了三个国家运动伤害监测系统: 国家电子伤害监测系统ーー所有伤害项目(NEISS-AIP)、全美大学体育协会伤害监测系统(NCAA ISS)和高中伤害在线报告系统(rIOTM)。1这些系统可以补充临床数据(例如,来自急诊科、住院病人和体育诊所) ,但这些数据偏向于更严重的伤害和社会经济地位更高的病人。事实上,农村地区或社会经济地位较低的社区的学校往往很难获得运动医疗专业人员和设施。一些新出现的项目可能会改善监督。区域性的努力,如运动训练员临床结果研究教育(CORE-AT)和全国性的努力,如全国运动训练员协会全国运动治疗,伤害和结果网络(NATA NATIONTM)试图将伤害跟踪与高中和大学水平的治疗和结果数据结合起来。然而,这些系统中没有一个专门针对年轻运动员、那些参加非学校赞助体育项目的运动员或那些在没有运动教练的学校的运动员。运动损伤数据库也很少考虑人口统计因素,包括社会经济地位、种族或民族以及医疗保健覆盖率。目前,还没有有效的机制来连贯和廉价地将各种监测数据集联系起来,或者跨越体育、跟踪系统或年龄连续体跟踪个别运动员。现在相当需要一个系统来追踪个人运动员的运动生涯和其他方面。应该对每个人进行数十年的跟踪,以确定 TBI 的负担是否、何时以及如何演变为 
CTE,并评估与脑震荡相关的所有可能的负面健康结果。这种系统还可以更准确地描述脑震荡病史和风险因素,并可以捕捉短期和长期的结果,包括身体和心理健康、学业和职业成功、生活质量和社会联系以及不断变化的社会经济地位。这种努力受到各种问题的挑战,包括缺乏任何级别的脑震荡强制性报告。强制性脑震荡报告、为监测工作提供资金以及为数据记者(例如教练和运动员培训员)提供培训将极大地改善流行病学研究。然而,如果没有经过验证的、对脑震荡的共识定义,以及通用数据库和全球唯一标识符(GUID)系统的开发,强制性报告将无法提供有意义的结果。然后可以将标准化监测工作的数据集联系起来,从而改善研究和临床护理的数据共享。将监测数据与组织和液体样本生物库的标准化收集、储存和管理基础设施耦合起来,可以大大改善损伤和结果研究。200这些努力可以通过公私伙伴关系的资金来催化,并通过制定现实的短期和长期目标来实现,以创建一个多年计划。然而,至少在美国,这些努力目前受到对健康保险便利和责任法案(HIPAA)规定的误解和对运动员保密的普遍关注的阻碍。运动员更广泛地使用计算机神经认知测试(CNT)可以改善脑震荡的监测,以及诊断和管理。然而,在 CNT 成为常规手术之前,必须克服几个重要的挑战。这些挑战包括缺乏标准化的管理协议,不同计算机硬件引起的技术错误的可能性,评估的认知功能类型的限制,以及缺乏合格的测试管理员和数据解释员.201尽管存在这些缺陷,但是,CNT 已经被大约40% 的美国高中雇用运动教练员.202虽然不是所有学校都负担得起,但是 CNT 可以加强地面数据收集,帮助风险暴露估计和脑震荡后恢复跟踪,以及提高向运动损伤监测网络报告的数据质量。CNT 也可能有助于评估和跟踪脑震荡后认知改善或下降,并可能有助于预测结果.203,204在学校环境中收集的 CNT 数据是否将达到由临床研究小组进行的 CNT 所达到的验证和重复性标准仍有待观察。重要的是,CNT 需要标准化和指导方针,以确定“返回运动”和“返回学习”的运动员在一个领域表现出恢复,但在其他领域仍然有症状。在临床和青少年运动员脑震荡监测和管理方面,需要对 CNT 的应用进行更多的研究。在一些关键领域,不完整的知识阻碍了儿科脑震荡领域有意义的进展。在分子和细胞水平上,迫切需要重点研究脑震荡和重复性亚震荡损伤后的轴突损伤,以阐明轴突运输和修复的变化,并更好地定义瞬时 Aβ 积累作为下游和/或未来病理学的潜在驱动因素的作用。脑震荡研究人员可能需要确定更合适的动物模型来研究分子病理学,包括 tau 蛋白及其对脑震荡后和慢性创伤脑炎病理学的贡献,因为啮齿动物和人类的大脑结构和组织大不相同。如果不能更清楚地了解创伤性脑损伤如何改变年轻、仍在发育中的大脑,以及在损伤后的数周、数月和数年内会发生什么样的病理事件,我们就只能推测这种改变的潜在生物学基础。通过使用记录线性和旋转力的传感器技术,可以改进青年体育运动中头部影响数据的收集和风险评估。这种商业上可用的设备,如果经过验证,可以确定在比赛期间和整个比赛季节中头部累积冲击力的水平,并且研究结果可以与神经影像学数据和功能结果评估联系起来。结合“击中计数”指标,传感器数据可以提高对重复性次生震荡影响的短期和长期神经心理学结果的认识。我们对慢性创伤性脑病的认识可以通过了解一般人群、受伤运动员、运动和运动位置匹配的未受伤运动员以及低风险运动中的“控制”运动员的基线率来改善。提高对风险暴露的认识可导致预防努力,包括改变做法和竞争规则。一项长达数十年的前瞻性追踪研究,追踪青年运动员的运动生涯及以后的发展,将提供有关累积性头部撞击以及长期神经心理功能障碍和痴呆风险的更确切知识。这样的研究正在 NCAA 校友中进行,他们于2003年首次接受研究,并于2013年重新评估。其他人群的研究,特别是如果 NIH 资助的话,可能会从5年的研究开始,可以进一步延长5年的增量。可能需要建立公私伙伴关系,以获得足够的资金,使多个研究中心参与进来。NCAA 已经为100多名运动员的10年重新评估提供了部分赞助,但需要来自 NIH,美国国防部(DoD)和私人慈善来源的进一步资助,以扩大评估范围,从神经心理学,通过 MRI,淀粉样蛋白,tau 和/或炎症的分子成像。理想情况下,追踪研究设计应结合流行病学和介入试验方法,并利用多个对照组,包括非接触运动员和未受伤的撞击运动员。追踪研究还将阐明认知储备的作用。老年痴呆症研究团体利用国家卫生研究院的资金以及涉及制药公司和基金会的公私伙伴关系,开创了这类研究的先例。为了使这类研究取得成功,必须首先建立更多的监测系统和数据库。如果参加影响力体育运动的运动员能够普遍获得运动员训练员的帮助,这些训练员能够在促进安全和提供基本护理的同时充当可靠的数据报告员,那么将加快努力。此外,任何纵向研究都必须包括死后分析,以便更好地了解儿童和青少年脑震荡对今后生活中神经退行性病理和痴呆发展的影响。由于缺乏严格的流行病学证据,“重返赛场”的指导方针目前受到阻碍,纵向研究的长期安全数据可能会大大改善这一点。纵向研究还可以包括确定那些未能遵循指导方针的运动员是否会经历任何负面健康影响的研究,例如持续的症状或改变发生第二次脑震荡的风险。长期前瞻性研究的基础设施可以通过建立一个以阿尔茨海默氏病神经影像学倡议(ADNI)为模型的研究联盟来创建。ADNI 为数据收集、传播协议、测试方法和生物标志物收集和分析制定了标准。目前正在国防部参与的一个版本的 ADNI (ADNI-DoD)专注于军事人群中与爆炸相关的 TBI 研究。2072014年5月,除了 NCAA 脑震荡研究,NCAA 和国防部宣布启动迄今为止最大的前瞻性运动相关脑震荡研究,该研究将在3年内监测大约37,000名 NCAA 运动员。我们可以想象,这项研究的基础设施可能最终扩展到研究年轻运动员在一个延长的纵向范围。我们对创伤性脑损伤的生物学知识仍然存在许多差距,这限制了我们开发有效药物的能力。如果我们要解决潜在的疾病病理,并超越治疗症状,就必须填补这些空白。然而,当基础创伤性脑损伤生物学的研究继续进行时,许多工作可以完成。药物再利用包括测试现有 FDA 批准的新适应症药物,可以减少费用和缩短药物批准的路径。目前的再利用试验包括哌醋甲酯治疗疼痛和精神疲劳,多巴胺受体激动剂溴隐亭治疗工作记忆,舍曲林治疗情绪和焦虑,这是最常见的影响脑震荡后长期结果的神经心理并发症。此外,黄体酮的 PROTECT III 期临床试验最近未能改善急性 TBI211后的结局,这可能提醒人们需要更多的研究来更好地理解 TBI 的基础生物学。虽然许多药物重新利用的努力主要是为了解决脑震荡症状,药物也可能影响损伤病理学和进展。对现有药物的研究也可能导致新的药物发现努力,并可能导致新的预防或管理治疗。急需新的药物治疗创伤性脑损伤和无法消除的脑震荡。在神经保护和抗炎领域的药物发现努力是特别相关的,因为它们潜在的交叉适用于神经退行性疾病,如 AD。同样,目前正在开发的治疗其他神经退行性疾病的药物可能会被重新定位,用于 TBI 或无脑震荡症状患者的检测。正如医学研究中经常出现的情况一样,脑震荡研究的最新进展提出的问题和回答的问题一样多。有证据表明脑震荡或重复性次生脑震荡后长期神经心理功能障碍和晚年痴呆,需要更多的工作来更好地理解青年参与影响性运动的含义和结果。正如本专家共识文件所概述的那样,有一条前进的道路,但实现这里概述的目标将需要公共和私营部门的合作。虽然可以通过增加知识来改进建议,但现有证据仍然可以在考虑青年参与体育运动以及实践政策和竞赛规则时为个人决策提供信息。随着人口老龄化和痴呆症的流行,我们必须更多地了解潜在的早期生活风险因素,包括与运动有关的脑震荡。家长、教练、学校董事会和孩子们做出的选择将在脑震荡科学知识的关键差距得到填补时得到更好的信息。下载参考资料|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Practical+Design+of+Performant+Recommender+Systems+using+Large-scale+Linear+Programming-based+Global+Inference)|0| -|[Rank-heterogeneous Preference Models for School Choice](https://doi.org/10.1145/3580305.3599484)|Amel 
Awadelkarim, Arjun Seshadri, Itai Ashlagi, Irene Lo, Johan Ugander|Stanford University; Amazon|School choice mechanism designers use discrete choice models to understand and predict families' preferences. The most widely-used choice model, the multinomial logit (MNL), is linear in school and/or household attributes. While the model is simple and interpretable, it assumes the ranked preference lists arise from a choice process that is uniform throughout the ranking, from top to bottom. In this work, we introduce two strategies for rank-heterogeneous choice modeling tailored for school choice. First, we adapt a context-dependent random utility model (CDM), considering down-rank choices as occurring in the context of earlier up-rank choices. Second, we consider stratifying the choice modeling by rank, regularizing rank-adjacent models towards one another when appropriate. Using data on household preferences from the San Francisco Unified School District (SFUSD) across multiple years, we show that the contextual models considerably improve our out-of-sample evaluation metrics across all rank positions over the non-contextual models in the literature. Meanwhile, stratifying the model by rank can yield more accurate first-choice predictions while down-rank predictions are relatively unimproved. These models provide performance upgrades that school choice researchers can adopt to improve predictions and counterfactual analyses.|学校选择机制的设计者使用离散选择模型来理解和预测家庭的偏好。最广泛使用的选择模型,多项式 logit (MNL) ,在学校和/或家庭属性中是线性的。虽然这个模型是简单和可解释的,但是它假设排名的偏好列表来自于一个从上到下在整个排名过程中是统一的选择过程。本文介绍了两种适用于学校选择的秩异质选择模型的建模策略。首先,我们采用了一个上下文相关的随机效用模型(CDM) ,考虑了在早期上层选择的情况下发生的下层选择。其次,我们考虑根据等级对选择模型进行分层,在适当的时候将相邻等级的模型相互调整。使用来自旧金山联合校区多年的家庭偏好数据,我们发现相对于文献中的非上下文模型,上下文模型大大提高了我们在所有排名位置的外部评估指标。同时,按等级对模型进行分层可以得到更准确的第一选择预测,而低等级预测相对来说没有改进。这些模型提供了学校选择研究人员可以用来改进预测和反事实分析的绩效提升。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Rank-heterogeneous+Preference+Models+for+School+Choice)|0| +|[Practical Design of Performant Recommender Systems using Large-scale Linear Programming-based Global Inference](https://doi.org/10.1145/3580305.3599183)|Aman Gupta, S. Sathiya Keerthi, Ayan Acharya, Miao Cheng, Borja Ocejo Elizondo, Rohan Ramanath, Rahul Mazumder, Kinjal Basu, J. Kenneth Tay, Rupesh Gupta|Metro Orthopedics & Sports Therapy, USA.; Alzheimer's Association, USA.; University of Colorado at Denver, USA.; University of Virginia School of Medicine, USA.; Banyan Biomarkers, USA.; Safe Kids Worldwide, Inc., USA.; National Collegiate Athletic Association, USA.; CrowdOptic, Inc., USA.; Stanford Center on Longevity, USA.; Burke Rehabilitation Hospital, USA.; Alzheimer's Drug Discovery Foundation, 57 West 57th Street, Suite 904, New York, NY 10019, USA.; Icahn School of Medicine at Mount Sinai, USA.; Children's Hospital, Harvard Medical School, USA.; Norton Healthcare, University of Kentucky, USA.; Baylor College of Medicine, USA.; Novant Health Sports Medicine, USA.; Boston University Medical Center, USA.; Andrews Institute for Orthopaedics and Sports Medicine, USA.; The Hastings Center, USA.; George Washington School of Medicine, USA.; University of California, USA.|Sports-related concussions and repetitive subconcussive exposure are increasingly recognized as potential dangers to paediatric populations, but much remains unknown about the short-term and long-term consequences of these events, including potential cognitive impairment and risk of later-life dementia. 
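An aside on the Rank-heterogeneous Preference Models entry above: the baseline it builds on is the multinomial logit, whose choice probabilities take a simple closed form. The sketch below, with made-up utility scores, is a minimal illustration of that baseline, not code from the paper; roughly speaking, the paper's CDM and rank-stratified variants change how the utilities are formed rather than this final softmax step.

```python
import math

def mnl_probabilities(utilities):
    """Multinomial logit: P(i) = exp(u_i) / sum_j exp(u_j) over the
    choice set; `utilities` maps each school to a linear score of
    school/household attributes (made-up numbers below)."""
    m = max(utilities.values())  # subtract the max for numerical stability
    exp_u = {s: math.exp(u - m) for s, u in utilities.items()}
    z = sum(exp_u.values())
    return {s: e / z for s, e in exp_u.items()}

# A plain MNL applies the same scores at every rank position; a
# rank-stratified variant would fit one such model per rank instead.
print(mnl_probabilities({"school_A": 1.2, "school_B": 0.4, "school_C": -0.3}))
```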
This Expert Consensus Document is the result of a 1-day meeting convened by Safe Kids Worldwide, the Alzheimer's Drug Discovery Foundation, and the Andrews Institute for Orthopaedics and Sports Medicine. The goal is to highlight knowledge gaps and critically needed research in the areas of concussion science, dementia, genetics, diagnostic and prognostic biomarkers, neuroimaging, sports injury surveillance, and information sharing. For each of these areas, we propose clear and achievable paths to improve the understanding, treatment and prevention of youth sports-related concussions. In 2009, around 250,000 nonfatal traumatic brain injuries (TBIs) were recorded among individuals aged <19 years in the USA.1 The Centers for Disease Control and Prevention estimate that young people aged 5–18 years sustain 65% of all sports-related concussions.2 Despite recent advances in diagnostic brain imaging and in our understanding of the physics of concussion, long-term cognitive outcomes remain poorly understood. As the physical, cognitive and emotional consequences of concussion gain wider public attention, our incomplete knowledge of how to prevent, diagnose and treat such injuries endangers the health of our children in general and the health of their brains in particular. This Expert Consensus Document is the result of a 1-day meeting of experts in the fields of paediatric and adult TBI, Alzheimer disease (AD) research, genetics, epidemiology, bioethics and sports medicine (Box 1), which was convened in November 2013 by Safe Kids Worldwide, the Alzheimer's Drug Discovery Foundation and the Andrews Institute for Orthopaedics and Sports Medicine. Our primary goal is to highlight critical gaps in our knowledge of child and adolescent concussion. We emphasize areas where research is needed, such as development of diagnostic and predictive biomarkers, elucidation of genetic risk factors, and prediction of short-term and long-term outcomes. In our conclusions, we suggest paths toward improving our understanding of the long-term consequences of sports-related paediatric concussion. The term 'concussion' is often used interchangeably with the term 'mild TBI' (mTBI), a potentially misleading practice considering the possible extent of brain damage and potential for chronic neuropsychological dysfunction following concussion. We should stress, however, that most concussions resolve without sequelae. The American Congress of Rehabilitation Medicine defines mTBI as a Glasgow Coma Scale3 score of 13–15, with loss of consciousness for <30 min and post-traumatic amnesia lasting <24 h.4 (These criteria are encoded in the sketch that follows this paragraph.) Concussion describes a heterogeneous mixture of injury phenotypes that depends on many factors, including the magnitude, location and direction of head impact.
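The definition just quoted is a conjunction of three cut-offs, which can be stated as a simple rule; the sketch below encodes exactly those thresholds, with argument names of our own choosing, and is illustrative rather than a clinical tool.

```python
def meets_acrm_mtbi_definition(gcs_score: int,
                               loss_of_consciousness_min: float,
                               post_traumatic_amnesia_h: float) -> bool:
    """mTBI per the definition quoted above: Glasgow Coma Scale score
    of 13-15, loss of consciousness <30 min, and post-traumatic
    amnesia <24 h. Argument names are assumptions, not a published
    schema."""
    return (13 <= gcs_score <= 15
            and loss_of_consciousness_min < 30
            and post_traumatic_amnesia_h < 24)

# Example: GCS 14, 2 min loss of consciousness, 6 h amnesia -> True
print(meets_acrm_mtbi_definition(14, 2.0, 6.0))
```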
Despite a lack of macroscopic structural findings, concussive brain injury involves primary neuronal injury caused by linear and rotational shear forces that disrupt axonal and membrane function (diffuse axonal injury,5 ionic flux and glutamate excitotoxicity), followed by secondary pathophysiological effects including mitochondrial oxidative stress, disruption of cerebral blood flow, compromised blood–brain barrier (BBB) integrity, synaptic dysfunction, and neuroinflammation.6, 7 Lasting neuropsychological post-concussion symptoms (post-concussion syndrome) comprise mood disorders (for example, depression), difficulty concentrating, and memory problems (Box 2).8 Both physical and physiological components of concussive injury can damage the developing brain, putting youths engaged in impact sports at particular risk. The necks and torsos of young athletes are weaker than those of older individuals and, consequently, less force is required to cause brain injury. The developing brain might also be particularly vulnerable to axonal damage caused by the shearing forces of head trauma, which, in youth American football, can exceed linear acceleration forces of 100 g.9 However, the average forces sustained in youth sports will generally be smaller than at higher levels of sport. Proper synaptic development is critical to cognitive and behavioural health.10, 11, 12, 13, 14, 15 Processes such as neurogenesis, competitive synaptic elimination ('pruning'), myelination, and axonal and dendritic arborization continue from prenatal development throughout the lifespan.14 The frontal and temporal lobes are the last areas to mature, and humans experience pruning in these regions into their early 20s,16 so damage to these still-developing areas may have pathophysiological effects on the brain that increase the potential for neuropsychological problems later in life.17 Axonal myelination continues through adolescence into the early 20s, and is susceptible to disruption by injury.10, 18, 19, 20, 21, 22 Early results from the Professional Fighters Brain Health Study, a 5-year longitudinal study of boxers and mixed martial arts fighters, who experienced repetitive subconcussive injuries as well as concussions, indicate that earlier age of first exposure to competitive boxing correlates with greater loss of caudate volume and greater axonal damage in the frontal lobe.23, 24 The young brain also has features that contribute to its resilience. Increased neuroplasticity in this age group has been shown to contribute to better outcomes after focal injuries.25 In addition, developing animals display a shorter window of glucose metabolic impairment in response to repeat TBI than do adult animals.26 Overall, the developing brain shows both vulnerability and resilience after TBI. These interwoven factors are likely to account for differences in the effects of concussion and repeat mTBI on young versus adult brains. A conservative approach to concussion risk and greater efforts to investigate these developmental differences should be given high priority. Most people—both young and old—recover fully from concussions. 
In children, factors potentially influencing recovery include age and history of concussions.27, 28 In one study, approximately 90% of young adult male athletes experienced symptomatic recovery within 21 days.29 However, in an emergency department study of patients aged 11–22 years (including all causes of concussion, not just sports-related), 15% of the sample still exhibited post-concussion symptoms, including headache, dizziness, 'mental fogginess' and depression, 90 days after injury.30 Several studies suggest that high school American football players are slower to recover from concussion than are college31, 32 and professional players.33 No direct comparisons with adolescents below high school age have yet been published, although a recent study that included a pre-adolescent age group (11–12 years) suggested that post-concussion recovery duration may not exhibit a linear relationship with age,30 as adolescents in this sample took longer to recover than did the pre-adolescent children. These findings, taken together, imply a unique risk of lengthier recovery in the male adolescent age group. Further studies of younger children and females would add greatly to our ability to assess and mitigate risk across the full paediatric and adolescent age span. Youths who sustained one or more concussions within 1 year prior to a new concussion reported more-prolonged symptoms,30 suggesting a possible 'window of vulnerability', and placing previously injured youths at higher risk of protracted recovery. Adolescents aged 11–18 years were nearly 80% more likely to develop post-concussion syndrome after presenting in emergency rooms than were children aged 5–10 years; similarly, presentation with headache doubled the risk of post-concussion syndrome in both children and adolescents.34 Among children treated in an emergency room after mTBI, those aged >6 years reported higher rates of persistent symptoms 3 months post injury than did those aged <6 years.35 Of course, the ability to acquire accurate information about concussion symptoms in children <6 years of age may be limited by a lack of self-awareness of symptoms and the necessary verbal skills to effectively communicate those symptoms. Also, direct comparison of injury severity is not possible from these reports; in fact, the physical heterogeneity of various injuries, taken together with the individual's innate capacity to recover from concussion, makes such comparisons highly challenging. 
'Smart helmets' are being used in some speciality research centres to standardize the physical force and angular acceleration that accompanies head hits, and the utility of these helmets to measure and predict impacts that may result in concussion is currently under investigation.36, 37 Young people recovering from concussion can experience important challenges, including altered social and academic development,38, 39, 40 lower scores on general intelligence tests, and decreased school performance (measured by grade-point average).39 Lower levels of parental education and child academic achievement both correlate with poorer concussion recovery.41 Personality traits also play a part; for example, pre-injury anxiety is a risk factor for prolonged recovery periods after sports-related concussion.42 Young athletes of both sexes are at risk of concussion, but girls report higher concussion rates than boys, particularly in high school and college soccer, basketball, and baseball or softball.28, 43, 44, 45 The factors that account for these differences remain uncertain, but might include quality of protective gear, recognition and reporting of concussion symptoms, and neck length and neck muscle strength.46 Differences in recovery trajectories between males and females are also poorly understood. However, one recent study suggested that progesterone levels in females influence post-concussion recovery.47 Hormonal changes during puberty that contribute to migraine headaches might also contribute to sex differences in concussion recovery. Migraine headaches are up to fourfold more common in females than in males after puberty,48, 49 and some evidence suggests that migraineurs recover more slowly after concussion.50, 51 Research is warranted to further delineate sex differences in concussion risk and recovery. In general, adult concussive brain injury is much better understood than its counterpart in children and adolescents. Several points are important to note. First, concussion has multiple, non-harmonized definitions. Second, concussion diagnosis is an imperfect art. Last, in the absence of rapid and inexpensive objective diagnostic measures, concussion remains a clinical diagnosis that is subject to variability—including different thresholds for diagnosis across various subspecialities and across individual physicians, neuropsychologists and athletic trainers—and under-reporting by coaches, parents and young athletes. Without validated diagnostics, concussion will remain a nebulous and under-reported entity, and the accuracy of incidence estimates will continue to be tainted by the differential application of inexact criteria. Repetitive subconcussive trauma can result in structural and functional brain changes.52 White matter abnormalities detected by diffusion tensor imaging (DTI) have been reported in professional soccer players even in the absence of any obvious history of concussions. 
Compared with swimmers, male professional soccer players showed DTI signal changes suggestive of decreased white matter integrity in several brain regions, which might indicate loss of axonal myelination, similar to changes seen in individuals with mTBI.53 Collegiate ice hockey players exhibited similar white matter changes over the course of a season.54, 55, 56, 57 In addition, repetitive subconcussive head impacts in collegiate American football players have been linked, in a dose-dependent manner, to deficits in BBB integrity, potential loss of white matter integrity, and cognitive dysfunction.58 These findings probably reflect some level of risk for youths who sustain repetitive subconcussive head impacts, although little research has been devoted specifically to this topic. A metric to track head impacts—that is, a 'hit count'—has been proposed,59 and could serve as one factor to determine cumulative risk exposure. One challenge of this approach is to accurately define the parameters of a 'hit', but improved biosensors show some promise in this regard. Similar to a 'pitch count' in baseball, this concept has also recently been proposed for boxers.24 No evidence is currently available to show a causal link between repetitive subconcussive head impacts in youth and dementia later in life, and such metrics could prove invaluable if validated by future studies correlating head impacts with subsequent neuropsychological dysfunction. In adults, TBI, including concussion,60, 61, 62 might increase an individual's risk of developing neurodegenerative disease,63, 64 including AD and chronic traumatic encephalopathy (CTE), a disease associated exclusively with repetitive head trauma.65, 66 TBI may also increase the risk of developing Parkinson disease (PD),67 although the relationship between mTBI and PD risk remains uncertain.68 In paediatric populations, particularly young athletes, the effects of single or repetitive concussions on the risk of later-life neurodegeneration and dementia are unknown. CTE was first described symptomatically in the late 1920s as 'punch-drunk' dementia in boxers,69 was later described as 'dementia pugilistica',70 and was first described pathologically in 1973.71 Since the identification of CTE in a former professional American football player in 2005,72 and additional intensive pathological studies, this condition has gained widespread public attention, and has now been identified in brains of former ice hockey, baseball, rugby and soccer players,73 wrestlers,74 and military veterans.75, 76 The prevalence and incidence of CTE in amateur and professional athletes is still unknown, adding to difficulties in discussing its epidemiology and population risks for athletes. Although CTE is primarily considered to be a neurodegenerative disease that sometimes results from a career of either collegiate or professional contact sports, cases of CTE have been reported in high school athletes.77 This finding suggests that long sporting careers are not required for CTE development, and that youth athletes represent an at-risk population. 
Emerging evidence suggests that clinical CTE symptoms can be grouped into two common presentations: cognitive and mood–behavioural.78, 79 Subjective memory complaints such as anterograde amnesia are common, as are mood disorders including anxiety or depression,79 and reduced executive function, which can result in disinhibition and impaired decision-making skills.80 These clinical symptoms define disease severity.81 The neurodegenerative pathophysiology of CTE is complex, and the neurological sequelae are poorly understood. In severe cases, the cerebral cortex and medial temporal lobes seem most profoundly affected,81, 82 with pathology characterized by neurofibrillary tangles composed of phosphorylated tau79 and, in some cases, TAR DNA-binding protein 43 pathology.83 CTE is also associated with marked atrophy, notably in the frontal cortex and medial temporal lobe, as well as in the mammillary bodies, thalamus and hypothalamus.79 Confirmed clinical diagnosis of CTE remains autopsy-based.84 Given the uncertainty over whether the tauopathy described in CTE is causative of the clinical phenotype, and the fact that most professional and collegiate athletes do not develop CTE, it is vital to understand whether early exposure to concussion is associated with other forms of neurodegeneration and cognitive dysfunction, including chronic neurocognitive impairment (CNI). Important clinical distinctions exist between CTE and CNI,28, 51 some of which make direct comparisons difficult. CTE is an emerging clinical and pathological condition that involves progressive deterioration of neurological and cognitive function in multiple domains, and is diagnosed primarily at autopsy. Conversely, the CNI phenotype is not necessarily progressive, and is characterized by functional decline from group averages or baseline functioning established before TBI. CNI can be diagnosed clinically through neuropsychological testing. No causal link between CNI and head trauma has yet been confirmed, but a dose-dependent risk has consistently been found in professional athletes.28 In addition, almost half of the studies conducted in amateur athletes have found an elevated risk of CNI.28 Whether similar risk associations are present in younger populations remains to be determined. One hypothesis is that CNI represents a prodromal—but not inevitable—step toward CTE, analogous to the relationship between mild cognitive impairment (MCI) and AD.85, 86 Alternatively, CNI may represent static impairment without degeneration. Our current lack of understanding of the basic biological underpinnings of CNI and CTE underscores the need for more research. Increased knowledge of the biology of both conditions, as well as early detection of CNI in athletes (in particular, youth athletes), may drive interventions to stem the development of further cognitive impairment, and could also aid validation of putative biomarkers. Assessment of CNI via tau imaging may help determine the likelihood of progression to CTE. The field of concussion genetics, especially in paediatric populations, is still in its infancy. Although repetitive head impacts seem necessary for the development of CTE, other factors, including genetics, are likely to have an important role, as most concussed athletes do not develop CTE.87 The genetic risk factors for CTE probably overlap with those that influence susceptibility to and recovery from concussion, and genetic risk factors for AD are providing important clues to the identity of these factors. 
The ε4 allele of apolipoprotein E (APOE ε4), the most important genetic risk factor for AD identified to date,88 critically affects the CNS injury response,89 in particular, amyloid-β (Aβ) clearance from the brain. The three alleles of APOE confer varying degrees of AD risk: APOE ε2 reduces the risk, APOE ε3, the most common allele, represents baseline risk with which other variants are compared, and APOE ε4 increases the risk.90, 91 Studies suggest an interaction between APOE ε4 and sex, such that APOE ε4-related risk of AD is more prominent in women than in men.92, 93 The APOE genotype acts synergistically with TBI in increasing the risk of AD,94 although its hypothesized risk association with CTE as an outcome of repetitive mTBI requires more study.95 No consensus has yet been reached on the effects of APOE isotype on the outcome of paediatric TBI, but data from adults suggest that APOE ε4 negatively influences concussion outcomes. Several studies indicate that possession of at least one APOE ε4 allele is associated with poorer cognition and lasting neuropsychological impairment after concussion in professional American football players,96 boxers95 and other adults,97, 98, 99, 100 although other studies found no such association.101, 102 Some evidence points to polymorphisms in both the APOE gene and its promoter as contributory factors to concussion risk in college athletes.103, 104 Another study did not identify a role for APOE ε4 in concussion risk,105 although this allele might increase the risk of dementia following midlife or late-life mTBI.106 Drawing conclusions from these conflicting studies is difficult, owing to small sample sizes and differing methodologies. In children, little is known about the relationship between APOE ε4 and neuropsychological outcomes after concussion, and APOE ε4 testing is not routine in paediatric TBI studies. In 2012, Kurowski reviewed the few existing studies and combined the results of three studies107, 108, 109 that used the Glasgow Outcome Scale.110 In the combined sample (252 children), the risk of poor clinical outcomes after 6–12 months was over twofold higher in APOE ε4 carriers than in noncarriers (19% versus 9%). However, these studies included a broad developmental range of children with heterogeneous injuries, and did not account for a possible interaction between age and genotype. In addition, the interaction between APOE and sex has not been studied in the context of concussion. Improved prospective studies are warranted to clarify these connections. Incorporation of genetics into paediatric concussion research is fraught with complicated challenges, including acquisition of parental consent and informed consent for a child, perceived stigmatization of clinical study participants, the actionability of the genetic knowledge obtained, and potential concerns regarding insurability (particularly long-term care insurance). 
Studies of adults who learn of their APOE ε4+ status demonstrate that many are willing to make lifestyle modifications, including increased exercise and improved medication management,111 as well as increased purchases of health and long-term care insurance.112, 113 Education about new genetic knowledge and corresponding disease risk is essential, as demonstrated by the substantial discordance between an individual's personal feelings about the implications of the acquired knowledge and the actual consequences of increased dementia risk.114 The effects of APOE genetic knowledge on children, their families and decision-making processes regarding participation in impact sports remain unclear. The influence of APOE genotype on concussion risk and recovery in this age group also needs further elucidation. If future studies find that, for any particular level of impact, children with APOE ε4+ status are at greater risk of concussion or poor recovery than are their APOE ε4− peers, consideration should be given to genetic testing of school-age athletes before participation in impact sports. Careful studies of high school and younger athletes are required to fully understand the nuances of genetic influences. Future research into youth concussion outcomes, including cognitive outcomes and risk of dementia, should include APOE genotyping wherever possible. New APOE studies should standardize research methodologies and reporting measures, including the collection of 'common data elements', to ensure valid comparison across studies.110, 115 The APOE genotype is not necessarily a non-modifiable risk factor for concussion recovery: therapies being developed for AD include drugs that modify the interaction between the ApoE4 protein and Aβ, which might also be applicable to paediatric concussion.116, 117 The Val66Met polymorphism in the gene encoding brain-derived neurotrophic factor has been linked to better outcomes after mTBI,118 but worse outcomes after focal penetrating brain injury.119 Polymorphisms in genes involved in dopaminergic signalling may also help to account for the wide range of TBI outcomes.120 In addition, the Rep1 polymorphism in the promoter region of the α-synuclein gene might increase the risk of PD after head injury.121 To advance our understanding of concussion risk and management, large, prospective, population-based genome-wide association studies (GWAS) and whole-genome sequencing studies should be conducted to identify other genetic variants—possibly of low frequency or low penetrance—that modify the risk of prolonged recovery, poor cognitive outcomes or dementia.122 Such studies will require large-scale data sharing, and must address issues of ethics, privacy, and potential implications for insurability and employability. Despite progress in identifying possible cerebrospinal fluid (CSF) and blood-based biomarkers that might be applied to adult TBI management, no clinically validated biomarkers are available for either the adult or the paediatric population. Paediatric concussions present with even greater clinical variability than do adult concussions; therefore, biomarkers have special potential for improving concussion diagnosis in children. Of note, most TBI biomarkers have been studied in the context of moderate to severe TBI, leaving us with obvious gaps in our knowledge of mTBI biomarkers, especially in children. Biomarker development has been critical to the advancement of AD therapeutics. 
CSF-based biomarkers are already being employed to identify at-risk patients and to improve the design of both epidemiological studies and clinical trials.123 New PET radioligands, such as amyloid-labelling agents (three of which are now FDA-approved), can be used both diagnostically and to improve neuropathology-based patient stratification for clinical trials. Several tau imaging agents are also in human trials, and their utility in tauopathies, including CTE, is rapidly being established. As with fluid-based biomarkers, there are currently no neuroimaging biomarkers sensitive or specific enough to diagnose concussion or CTE in either adults or children. No TBI diagnostic or therapeutic agents have yet been approved by the FDA, and validation of concussion biomarkers could accelerate the development of such agents. Efforts must be made, however, to ensure the cost-effectiveness and wide availability of clinical biomarker testing. Also, given the risks associated with lumbar puncture, ethical concerns regarding sampling of CSF from concussed youths for biomarker research should be addressed. Promising findings in adult fluid-based biomarker research must be explored in paediatric populations. Putative concussion biomarkers have emerged sporadically in the scientific literature over the past few decades, the most prominent being S100 calcium-binding protein B (S100B), a nonspecific marker of astrocyte activation. The presence of S100B in serum may indicate loss of BBB integrity. Elevated serum and CSF levels of S100B have been observed in adult boxers after matches, and correlate positively with the number and severity of head impacts.124, 125 Increased serum S100B levels have also been observed in concussed professional ice hockey players,126 with levels measured 1 h post-concussion predicting symptomatic recovery time. 
However, S100B levels were also raised after controlled play where no concussions occurred, indicating that this marker is not injury-specific.126 Indeed, S100B serum levels are elevated in adult trauma patients without head injury.127, 128, 129 Other research suggests that initial post-concussion S100B levels are poor predictors of recovery.130 As with all biomarkers, the role of S100B in TBI management in children is even less clear,131 with some arguing that this marker has little diagnostic or prognostic utility in paediatric populations.132 In a study of children with TBI aged ≤15 years, those <5 years or >9 years of age had higher serum levels of S100B than did those aged 5–9 years.133 S100B may, therefore, be an inadequate marker to distinguish between symptomatic and asymptomatic children with concussion,133 and the utility of S100B for diagnosis and outcome prediction is questionable.134, 135, 136 Neuron-specific enolase (NSE) is a marker of neuronal injury, but its usefulness as a serum or CSF biomarker remains uncertain.133, 134, 135, 136, 137 Elevated serum NSE levels have been observed after head impacts in boxers,124 but were also seen in ice hockey players after a match where no concussions occurred.126 Serum NSE levels failed to predict recovery time after concussion,126 and might not correlate with injury severity in children.133 In children aged ≤15 years, serum NSE levels correlate inversely with age.133 Once released into the blood, NSE has slow elimination kinetics, making it difficult to distinguish primary from secondary neuronal injuries on the basis of NSE levels.138, 139 Neurofilament light chain and glial fibrillary acidic protein (GFAP) are neuron-specific and glial-specific damage markers, respectively, and both are elevated in the CSF of adult boxers after fights.125, 137, 140 Little is known about either marker in the context of paediatric concussion, but a preliminary study in children and young adults suggested that serum GFAP levels within 72 h after concussion correlate with symptom burden up to 1 month post injury.141 The neuron-specific protein UCH-L1 (ubiquitin carboxyl-terminal hydrolase isozyme L1) was first linked to neurodegenerative pathology through its involvement in PD,142 and its presence in serum was later identified as a biomarker for severe TBI.143, 144, 145 Serum levels of UCH-L1 may have diagnostic utility in concussion,146 but recent evidence suggests a lack of correlation between elevated serum levels and subconcussive hits.147 The clinical utility of UCH-L1 in paediatric populations warrants further study. Perhaps the most promising advances in adult fluid-based TBI biomarkers concern tau protein. Serum or CSF tau levels are thought to indicate axonal damage, as tau normally resides in axons, where it stabilizes microtubules. Serum tau is proteolytically cleaved,148 and in patients with AD, levels of cleaved tau in CSF might correlate with cognitive function.149 Tau levels in CSF and blood are elevated in boxers after a match, and CSF tau levels correlate with the quality and quantity of head impacts.125, 150 Recent evidence suggests that tau levels are elevated in the blood of ice hockey players after concussion, and may be useful in predicting recovery time.126 Questions remain, however, with several studies reporting little or no value of serum cleaved tau for predicting post-concussion syndrome or long-term outcomes.130, 151 The potential of tau as a biomarker in children remains unclear, with no studies conducted to date. 
In fact, the reliability of serum tau as a biomarker has not yet been established for any indication. No single biomarker is likely to suffice to diagnose paediatric concussion or predict outcomes. In addition, few studies have examined the interactions between genetic make-up and putative biomarkers. As our understanding of the relationships of biomarkers to injury severity and to each other increases, development of biomarker panels, perhaps incorporating inflammatory and oxidative markers,152 should be considered. Future studies should attempt to further define these relationships and establish the clinical value of biomarker panels, factoring in commercial cost and practical feasibility. Recent advances in metabolomics, lipidomics and proteomics—in particular, the search for metabolomic and lipidomic markers for AD—might inform future research into biomarkers for concussion and subconcussive injuries. Several recent studies propose altered metabolite and lipid profiles associated with MCI and AD.153, 154, 155, 156 Data from animal models suggest that lipid and metabolite changes accompany both acute and chronic post-concussion periods, and could be useful for predicting recovery trajectory,157, 158 but these findings have yet to be validated in humans. Expanding the biomarker search beyond blood and CSF to saliva and urine159 might improve the ability to obtain measurements rapidly and noninvasively, particularly from children. Sampling of CSF from children, particularly when rapid assessment is desirable, is largely impractical. Mondello et al. proposed a set of useful criteria for evaluating TBI biomarkers that should allow more-streamlined development and validation.137 Any validated biomarker panel must, inevitably, be a component of a larger, multimodal diagnostic suite that may include structural and functional imaging and neuropsychological testing. When designing future biomarker studies, the potential for FDA approval should be considered, in order to expedite approval for clinical use. Although concussion remains a clinical diagnosis, neuroimaging techniques are improving our understanding of the structural and functional consequences in adults. Neuroimaging in paediatric populations may be limited by several factors; for example, measurements of longitudinal changes after concussion are complicated by the background of a dynamic, immature brain. No imaging techniques have been validated as diagnostic tools for concussion, and the correlation between imaging findings and clinically measurable cognitive or behavioural functions is variable. Tools such as volumetric imaging, DTI and functional MRI (fMRI)—in particular, arterial spin labelling—are currently being explored.160, 161 Fractional anisotropy (FA), as measured by DTI, allows inference of the structural integrity of white matter tracts, which are commonly disrupted after TBI. The clinical implications of FA change remain controversial, as both increased and decreased FA have been observed in concussion studies.162, 163, 164, 165, 166 These discrepancies may be due, in part, to the considerable spatial heterogeneity in the brain areas examined,167 as well as differences in the post-injury interval. FA may still have prognostic value, with evidence suggesting that the direction and magnitude of change correlate with clinical outcomes;166, 168 however, this idea awaits validation in both paediatric and adult populations. 
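Because FA is central to the DTI findings discussed here, a minimal sketch of how it is computed from the three diffusion-tensor eigenvalues may be helpful. This is an illustrative sketch only: the function name and example eigenvalues are assumptions for demonstration, not taken from any particular DTI toolkit or study.

```python
import numpy as np

def fractional_anisotropy(eigenvalues) -> float:
    """Standard FA formula from the three eigenvalues of a diffusion tensor:
    FA = sqrt(3/2) * ||lambda - mean(lambda)|| / ||lambda||.
    FA is 0 for perfectly isotropic diffusion and approaches 1 when diffusion
    is strongly directional, as along a coherent white matter tract."""
    lam = np.asarray(eigenvalues, dtype=float)
    md = lam.mean()  # mean diffusivity
    return float(np.sqrt(1.5) * np.linalg.norm(lam - md) / np.linalg.norm(lam))

# Illustrative eigenvalues (units of 10^-3 mm^2/s), chosen only to show the range:
print(fractional_anisotropy([1.7, 0.3, 0.3]))    # ~0.80: highly anisotropic voxel
print(fractional_anisotropy([0.9, 0.8, 0.85]))   # ~0.06: near-isotropic voxel
```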
FA might lack the necessary sensitivity to fully appreciate changes in white matter tract integrity following brain injury, and measures of diffusivity may be more appropriate.169 The DTI field would benefit greatly from the development of normative data sets against which to gauge observed changes. Pre-game versus post-game and season-long studies of young athletes could employ serial DTI to establish normative data for a particular individual, but the utility of the data when pooled is unclear. The scarcity of normative paediatric data severely limits the clinical usefulness of neuroimaging techniques, including DTI. Studies of 'return-to-baseline' neuroimaging after paediatric concussion are also needed, as they could greatly improve prediction of recovery. Although automation has increased reproducibility, DTI measurements remain sensitive to hardware, acquisition parameters and analysis software, limiting standardization and comparison between centres and across studies. Efforts to standardize DTI across imaging centres are underway.170 MRI has been particularly successful in mapping the brain's 'connectome'—the collection of structural and functional neural connectivity networks and their respective focal nodes—and for studying how concussion affects these networks. Focal or diffuse TBI can disrupt the brain's functional connectivity, resulting in dysfunction of multiple networks including the default mode and salience networks, which have been implicated in memory, emotion and mood.171 Network dysfunction might have a stronger influence on recovery than does lesion location,171, 172, 173 but the long-term implications for brain development and cognitive function remain unclear.26, 174 Further studies of network connectivity dysfunction in children after concussion will be critical to improve injury prognostication and management. Radiotracers for PET imaging have the potential to advance the diagnosis and treatment of concussion and CTE, but their use in paediatric populations is purely investigational at present. Three FDA-approved radiolabelled imaging agents are currently available for detecting brain amyloid in patients with suspected AD.175 In adults, some cases of concussion are associated with acute Aβ pathology. PET scanning could enable paediatric patients to be monitored for the presence and persistence of acute post-concussion amyloid, and could help to determine whether scan positivity and negativity predict different outcomes.176, 177 Other PET imaging agents with potential utility in paediatric populations include new tracers that bind neurofibrillary tangles composed of tau. Early imaging results with 18F-T807, 18F-T808 and 18F-THK5105 are proving useful in confirming the presence of tauopathy in various clinical situations, including AD.178, 179, 180 In a recent AD study, the magnitude of tau tracer signal correlated positively with the stage of disease and severity of cognitive impairment.180 A third tau PET tracer, 11C-PBB3, has been tested in healthy individuals and patients with AD, and may be able to detect non-AD conformations of tau.181 In addition, a recent report contains the first description of tauopathy imaging in a living person with suspected sports-associated CTE.177 Given the extent of chronic tau pathology in concussion, repetitive subconcussive injury and CTE, tau tracers may be useful as diagnostic and prognostic biomarkers (for example, to distinguish CNI from CTE). 
Studies with these tracers in adults with CTE are underway, but their use in paediatric populations will depend on future research to determine whether tau pathology is present in young patients after TBI or concussion. A PET tracer for the microglial cholesterol transporter protein might be useful for imaging of neuroinflammation associated with TBI.182 New PET ligands to image brain microglia, which are being developed with potential utility in neurodegenerative diseases, may also prove useful in concussion and CTE management. Exploration of these PET ligands in paediatric populations with concussion and TBI would be informative, but risk–benefit analyses must be performed before embarking on studies involving radiotracers in this age group. The ultimate utility of any PET imaging agent will depend on its diagnostic and prognostic value as part of a multimodal panel of biomarkers and neuroimaging techniques. Noninvasive techniques such as transcranial magnetic stimulation (TMS) have also uncovered changes in synaptic plasticity following TBI and concussion,183 particularly in asymptomatic individuals.184, 185, 186 Several small TMS studies of young athletes in their early 20s with a history of concussion suggest imbalances in γ-aminobutyric acid and/or glutamate neurotransmission in the motor cortex that are associated with deficits in synaptic long-term potentiation and long-term depression.184, 185, 187, 188 TMS has also revealed that concussion-related impairments in synaptic plasticity can impair aspects of motor learning,188 and that these deficits are detectable decades after an individual's last concussion.189 Another crucial noninvasive tool for detecting neurochemical dysfunction associated with concussion is proton magnetic resonance spectroscopy (MRS). Reports specifically addressing the use of spectroscopy following sports-related concussion suggest various abnormalities consistent with neurochemical alterations.190 In younger (high school) athletes, increased glutamate and glutamine levels were detected by MRS at post-season versus pre-season evaluation, even in players who had not experienced clinically significant concussion during the season.191 Such findings suggest that even subconcussive head impacts can result in the activation of glutamate pathways, implying cellular injury or neuronal death, despite the absence of symptoms. Levels of creatine and myoinositol (an organic osmolyte located in astrocytes192, 193) were also significantly altered in a subset of the participants in the aforementioned study. In a rare longitudinal study utilizing MRS,194 individuals who sustained a single sports-related concussion exhibited significantly reduced levels of N-acetylaspartate (NAA, a marker of neuronal and axonal health, integrity and functioning195) in the brain 3 days after injury. Levels were increased at 15 days post injury, and reverted to control values at 30 days post injury. By contrast, participants who sustained a second concussion 10–13 days after their initial concussion displayed a prolonged reduction in NAA levels, which had not normalized even 45 days post injury. These results suggest that repeated injury within a short time frame increases the likelihood of protracted or incomplete recovery. 
In addition to the acute and subacute alterations detected by MRS, other studies of the long-term effects of concussion have disclosed increased myoinositol (associated with glial proliferation) and decreased choline (associated with membrane turnover195) levels in the medial temporal lobe in otherwise healthy former athletes who sustained their last concussion more than three decades prior to testing.196 Another recent study examined a cohort of symptomatic retired National Football League players, using an advanced MRS method called correlated spectroscopy (COSY), which can measure additional metabolites.197 The authors identified increased choline and glutamate–glutamine levels (indicative of diffuse axonal injury and excitotoxicity, respectively), consistent with previous mTBI MRS studies, as well as additional cerebral metabolites that were indicative of neuroinflammatory changes. These metabolic changes may provide insight into mechanisms of injury, such as excitotoxicity and/or inflammation, which could underlie the reported structural changes. Overall, the available data support the use of MRS as a research tool to identify altered neurophysiology and monitor recovery in adult athletes, even following resolution of post-concussive symptoms. At present, MRS-detected biochemical alterations may enhance our understanding of the underlying pathophysiology, but do not yet provide specific diagnostic information. Larger cross-sectional, prospective and longitudinal studies are needed to determine the sensitivity and prognostic value of MRS within the field of sports-related concussion.190 Because the interpretation of MRS in the immature brain requires certain developmental considerations, appropriate comparison samples will be needed for future work in children. MRS techniques with greater spectral resolution, including COSY, might provide additional biochemical specificity.197 Other advances in spatial resolution, such as 3D chemical shift imaging, may also provide greater specificity by allowing the investigation of metabolic alterations throughout the brain rather than in specific regions of interest. Finally, MRS could have a role in measurement of treatment effects, such as those induced by transcranial direct current stimulation198 and TMS.199 The mechanisms and surveillance infrastructure for sports-related injury measurement, reporting, tracking and data sharing are insufficient for current needs and objectives. Concussion research and clinical efforts are hindered by a lack of concussion data across sports and playing levels. A 2014 Institute of Medicine report identified only three national sports injury surveillance systems: the National Electronic Injury Surveillance System—All Injury Program (NEISS-AIP), the National Collegiate Athletic Association Injury Surveillance System (NCAA ISS), and the High School Reporting Injury Online (RIO™).1 These systems can be supplemented with clinical data (for example, from emergency departments, hospitalized inpatients and sports clinics), but these data are biased toward more-severe injuries and patients of higher socioeconomic status. Indeed, schools in rural areas or communities with lower socioeconomic status often have limited access to sports medicine care professionals and facilities. Several emerging programmes may improve surveillance. 
Regional efforts such as Clinical Outcomes Research Education for Athletic Trainers (CORE-AT) and national efforts such as the National Athletic Trainers' Association National Athletic Treatment, Injury and Outcomes Network (NATA NATION™) attempt to integrate injury tracking with treatment and outcomes data at the high school and collegiate levels. However, none of these systems specifically capture injuries to younger athletes, those participating in non-school sponsored sports, or those at schools without athletic trainers. Sports injury databases also rarely account for demographic factors including socioeconomic status, race or ethnicity, and health-care coverage. Currently, no effective mechanisms exist to consistently and inexpensively link various surveillance data sets, or to follow up individual athletes across sports, tracking systems or the age continuum. There is a considerable need for a system that tracks individual athletes through their playing careers and beyond. Each individual should be tracked for several decades to establish if, when and how a given burden of TBI evolves into CTE, and to assess all the possible negative health outcomes associated with concussion. Such a system would also provide more-accurate descriptions of concussion history and exposure to risk factors, and could capture both short-term and long-term outcomes, including measures of physical and mental health, academic and career success, quality of life and social connectivity, and evolving socioeconomic status. Such efforts are challenged by a variety of issues, including a lack of mandatory reporting of concussion at any level. Mandatory concussion reporting, funding for surveillance efforts, and provision of training to data reporters (for example, coaches and athletic trainers) would greatly improve epidemiological research. However, mandatory reporting will not provide meaningful results without validated, consensus definitions for concussions, and development of a universal data repository and a global unique identifier (GUID) system. Data sets from standardized surveillance efforts could then be linked, thereby improving data sharing for research and clinical care. Coupling of surveillance data with standardized collection, storage and curation infrastructures for biobanking of tissue and fluid samples could dramatically improve injury and outcomes research.200 These efforts might be catalyzed by funding from public–private partnerships, and made actionable by setting realistic short-term and long-term goals to create a multi-year plan. However, in the USA at least, such efforts are currently hampered by misunderstanding of Health Insurance Portability and Accountability Act (HIPAA) regulations and general concerns for athlete confidentiality. Wider use of computerized neurocognitive testing (CNT) for athletes could improve concussion surveillance, as well as diagnosis and management. However, several important challenges must be overcome before CNT becomes routine. 
These challenges include a lack of standardized administration protocols, the potential for technological errors arising from different computer hardware, limits in the types of cognitive functions assessed, and a lack of qualified test administrators and data interpreters.201 Despite these shortcomings, however, CNT is already used by approximately 40% of US high schools that employ athletic trainers.202 Though not affordable for all schools, CNT could enhance ground-level data collection, aid risk-exposure estimation and post-concussion recovery tracking, and increase the quality of data reported to sports injury surveillance networks. CNT may also be useful in evaluating and tracking post-concussion cognitive improvement or decline, and could have utility in predicting outcomes.203, 204 Whether CNT data collected in the school setting will reach the validation and reproducibility standards achieved by CNT conducted by a clinical research team remains to be seen. Importantly, CNT needs standardization and guidelines for determining 'return to play' and 'return to learn' for athletes who show recovery in one domain but are still symptomatic in others. More research is required on the utility of CNT, both in the clinic and for concussion surveillance and management of youth athletes. In several critical areas, incomplete knowledge hampers meaningful advances in the field of paediatric concussion. At the molecular and cellular levels, research that focuses on axonal damage after concussion and repetitive subconcussive injury is urgently needed to elucidate changes in axonal trafficking and repair, and to better define the role of transient Aβ accumulation as a potential driver of downstream and/or future pathology. Concussion researchers may need to identify more-suitable animal models to study molecular pathology, including tau and its contribution to post-concussion and CTE pathologies, as the structure and organization of the brain differ dramatically between rodents and humans. Without a clearer understanding of how TBI changes the young, still-developing brain, and what pathological events happen in the weeks, months and years following injury, we are left to speculate about the underlying biological bases of such changes. Head impact data collection and risk assessment in youth sports might be improved through use of sensor technologies that record linear and rotational forces. Such commercially available devices, if validated, could determine levels of cumulative head impact forces during games and across seasons of play, and the findings could be linked to neuroimaging data and functional outcome assessments. Combined with 'hit-count' metrics (a minimal sketch of such a metric follows this passage), sensor data may improve knowledge of short-term and long-term neuropsychological outcomes of repetitive subconcussive impacts. Our knowledge of CTE might be improved by understanding baseline rates in the general population, in injured athletes, among uninjured athletes matched by sport and playing positions, and in 'control' athletes in low-risk sports. Improved knowledge of risk exposures could lead to prevention efforts, including practice and competition rule changes. A decades-long, prospective, longitudinal study, following youth athletes through their sporting careers and beyond, would provide more-definitive knowledge of cumulative head impacts and risks of long-term neuropsychological dysfunction and dementia. 
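To make the 'hit-count' idea above concrete, here is a minimal sketch assuming a validated stream of per-impact peak accelerations from helmet sensors. The thresholds, field names and session values are illustrative assumptions only; there is as yet no consensus definition of what constitutes a qualifying 'hit'.

```python
from dataclasses import dataclass

@dataclass
class Impact:
    linear_g: float          # peak linear acceleration (g)
    rotational_rads2: float  # peak rotational acceleration (rad/s^2)

def hit_count(impacts, linear_threshold_g=20.0, rotational_threshold_rads2=1500.0):
    """Count impacts exceeding either placeholder threshold, and return a
    crude cumulative linear load over the counted hits. Both thresholds are
    assumptions for illustration, not validated injury criteria."""
    hits = [i for i in impacts
            if i.linear_g >= linear_threshold_g
            or i.rotational_rads2 >= rotational_threshold_rads2]
    return len(hits), sum(i.linear_g for i in hits)

# Illustrative (made-up) practice session recorded by a helmet sensor:
session = [Impact(12.0, 900.0), Impact(35.0, 2100.0), Impact(24.0, 1100.0)]
count, load = hit_count(session)
print(count, load)  # -> 2 hits above threshold, cumulative linear load 59.0 g
```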
Such a study is underway in NCAA alumni, who were first studied in 2003 and were re-assessed in 2013.29, 205 Studies in other populations, especially if NIH-funded, would probably begin with a 5-year study that could be renewed in further 5-year increments. Public–private partnerships are likely to be required to secure enough funding to involve multiple study centres. The NCAA has provided partial sponsorship for the 10-year re-assessment of over 100 athletes, but further funding from the NIH, the US Department of Defense (DoD), and private philanthropic sources will be required to extend the range of assessment from neuropsychology, through MRI, to molecular imaging for amyloid, tau and/or inflammation. Ideally, the longitudinal study design should combine epidemiological and interventional trial methodologies and utilize multiple control groups, including non-contact athletes and uninjured impact sport athletes. A longitudinal study would also shed light on the role of cognitive reserve. A precedent for such studies has been established by the late-life dementia research community, using NIH funds and public–private partnerships involving pharmaceutical companies and foundations. For such studies to be successful, additional surveillance systems and data repositories must first be established. Efforts would be accelerated if athletes participating in impact sports had universal access to athletic trainers, who could act as reliable data reporters while promoting safety and providing basic care. In addition, any longitudinal studies must include postmortem analyses to better understand the influence of childhood and young-adult concussions on the development of neurodegenerative pathology and dementia in later life. 'Return-to-play' guidelines are currently hampered by a lack of rigorous epidemiological evidence, and could be greatly improved by long-term safety data from longitudinal studies.206 Longitudinal research could also include studies to determine whether those athletes who fail to follow guidelines experience any negative health effects, such as lingering symptoms or altered risk of incurring a second concussion. The infrastructure for a long-term prospective study might be created through the formation of a research consortium modelled after the Alzheimer's Disease Neuroimaging Initiative (ADNI). ADNI has set standards for data collection, dissemination agreements, testing methodologies, and biomarker collection and analysis. A version of ADNI currently underway with participation of the DoD (ADNI-DoD) is focused on blast-related TBI research in military populations.207 In May 2014, in addition to the NCAA Concussion Study, the NCAA and the DoD announced the launch of the largest prospective sports-related concussion study to date, which will monitor approximately 37,000 NCAA athletes over 3 years. One can envision how this study's infrastructure may eventually be extended to study younger athletes over an extended longitudinal range. Many gaps remain in our knowledge of the biology of TBI, which limit our ability to develop effective drugs. These gaps must be filled if we are to tackle the underlying disease pathology and move beyond treating the symptoms. However, much can be accomplished while research into fundamental TBI biology continues. Drug repurposing involves testing of existing FDA-approved drugs for new indications, and can reduce expense and shorten the path for drug approval. 
Current repurposing trials include methylphenidate for pain and mental fatigue,208 the dopamine receptor agonist bromocriptine for working memory,209 and the antidepressant sertraline for mood and anxiety, the most frequent neuropsychological complications that influence long-term outcomes after concussion.210 Larger randomized clinical trials should be conducted before these drugs can be introduced into clinical practice for these new indications. In addition, the recent failure of the PROTECT phase III trial of progesterone to improve outcomes after acute TBI211 may serve as a reminder of the need for more research to better understand the fundamental biology underlying TBI. Although many drug repurposing efforts are designed primarily to address concussion symptoms, the drugs may also influence injury pathology and progression. Research on established drugs can also lead to new drug discovery efforts and, potentially, new preventive or management therapeutics. New drugs are urgently needed for TBI and concussions that do not resolve. Drug discovery efforts in the areas of neuroprotection and anti-inflammation are especially relevant because of their potential cross-applicability to neurodegenerative diseases such as AD. Similarly, drugs currently in development for other neurodegenerative diseases might be repositioned for testing in patients with TBI or nonresolving concussion symptoms. As is often the case in medical research, recent advances in concussion research raise as many questions as they answer. Evidence exists for long-term neuropsychological dysfunction and later-life dementia after concussions or repetitive subconcussive head impacts, and more work is needed to better understand the implications and outcomes of youth participation in impact sports. As outlined in this Expert Consensus Document, there is a path forward, but achieving the goals outlined here will require public and private sector cooperation. While recommendations can be improved with increased knowledge, the available evidence can still inform individual decision-making when considering youth sport participation, as well as practice policies and competition rules. With an ageing population and a looming epidemic of dementia, we must learn more about potential early-life risk factors, including sports-related concussion. The choices made by parents, coaches, school boards and children will be better informed when the critical gaps in scientific knowledge of concussion are filled. 
Download references|与运动相关的脑震荡和重复性亚震荡暴露越来越被认为是儿科人群的潜在危险,但是对于这些事件的短期和长期后果,包括潜在的认知障碍和晚年痴呆的风险,仍然知之甚少。这份专家共识文件是由全球安全儿童、阿尔茨海默氏症药物发现基金会和安德鲁斯矫形外科和运动医学研究所召集的为期一天的会议的结果。目标是强调在脑震荡科学、痴呆症、遗传学、诊断和预后生物标志物、神经影像学、运动损伤监测和信息共享等领域的知识差距和亟需研究的领域。针对这些领域,我们提出了明确和可实现的途径,以提高对青少年体育相关脑震荡的理解、治疗和预防。2009年,美国年龄 < 19岁的个体中记录了约250,000例非致命性创伤性脑损伤(TBI)。1疾病控制和预防中心估计,5-18岁的年轻人维持着所有运动相关脑震荡的65% 。2尽管最近在诊断性脑成像方面取得了进展,并且在我们对脑震荡物理学的理解方面,长期的认知结果仍然知之甚少。由于脑震荡的身体、认知和情感后果引起了公众的广泛关注,我们对如何预防、诊断和治疗这种伤害的不完整知识危及我们儿童的总体健康,特别是他们的大脑健康。这份专家共识文件是儿科和成人创伤性脑损伤、阿兹海默病(AD)研究、遗传学、流行病学、生物伦理学和运动医学领域专家为期一天的会议的结果(专栏1) ,该会议于2013年11月由全球安全儿童、阿尔茨海默氏症药物发现基金会和安德鲁斯矫形外科和运动医学研究所召集。我们的主要目标是强调我们在儿童和青少年脑震荡知识方面的重大差距。我们强调需要进行研究的领域,如开发诊断和预测性生物标志物,阐明遗传风险因素,以及预测短期和长期结果。在我们的结论中,我们提出了提高我们对与运动相关的儿童脑震荡的长期后果的理解的途径。术语“脑震荡”经常与术语“轻度 TBI”(mTBI)交替使用,考虑到脑震荡后可能的脑损伤程度和慢性神经心理功能障碍的潜在可能性,这是一种潜在的误导性做法。然而,我们应该强调的是,大多数脑震荡不会产生后遗症。美国康复医学会将 mTBI 定义为格拉斯哥昏迷量表3评分为13-15分,意识丧失 < 30分钟,创伤后遗忘持续时间 < 24小时。脑震荡描述了损伤表型的异质混合物,取决于许多因素,包括头部撞击的大小,位置和方向。尽管缺乏宏观结构发现,脑震荡损伤涉及由线性和旋转剪切力破坏轴突和膜功能(弥漫性轴突损伤,5离子通量和谷氨酸兴奋毒性)引起的原发性神经元损伤,随后是继发性病理生理效应,包括线粒体氧化应激,脑血流中断,血脑屏障(BBB)完整性受损,突触功能障碍和神经炎症。持续的神经心理学脑震荡后症状(脑震盪症候群)包括情绪障碍(例如抑郁症) ,难以集中和记忆问题(方框2)。年轻运动员的脖子和躯干比老年人的脖子和躯干更弱,因此,造成脑损伤所需的力量更少。发育中的大脑也可能特别容易受到由头部创伤的剪切力引起的轴突损伤,这在美国青年足球中可以超过100g 的线性加速力。然而,青年运动中持续的平均力量通常会小于较高水平的运动。正确的突触发育对认知和行为健康至关重要。神经发生、竞争性突触消除(“修剪”)、髓鞘形成、轴突和树突树枝化等过程在产前发育的整个生命周期中持续进行。额叶和颞叶是最后成熟的区域,人类在20岁出头的时候经历了这些区域的修剪[16] ,因此这些仍在发育的区域的损伤可能对大脑产生病理生理效应,增加了以后生活中出现神经心理问题的可能性。轴突髓鞘形成在青春期持续到20岁出头,易受损伤的影响。职业拳击手大脑健康研究的早期结果表明,第一次接触拳击比赛的年龄越早,尾状核体积损失越大,额叶轴突损伤越严重。这项研究对拳击手和追踪研究综合格斗拳击手进行了5年的研究,他们都经历过重复性的脑震荡和脑震荡。23,24年轻的大脑也有一些有助于恢复的特征。已经显示,这个年龄组的神经可塑性增加有助于局灶性损伤后更好的结果[25]。此外,发育中的动物对重复 TBI 的葡萄糖代谢障碍的窗口比成年动物更短[26]。总的来说,发育中的大脑在 TBI 后显示出脆弱性和恢复力。这些相互交织的因素可能解释了脑震荡和重复 mTBI 对年轻人和成年人大脑影响的差异。应高度重视对脑震荡风险采取保守的方法,并加大努力调查这些发育差异。大多数人ーー无论老少ーー从脑震荡中完全恢复过来。在儿童中,可能影响康复的因素包括年龄和脑震荡史。27,28在一项研究中,大约90% 的年轻成年男运动员在21天内经历了症状恢复。然而,在一项针对11-22岁患者(包括所有脑震荡原因,而不仅仅是运动相关)的急诊科研究中,15% 的样本在受伤后90天仍然表现出脑震荡后症状,包括头痛,头晕,“精神模糊”和抑郁。一些研究表明,美国高中橄榄球运动员从脑震荡中恢复的速度比大学运动员和职业运动员要慢。尽管最近一项包括青春期前年龄组(11-12岁)的研究表明,脑震荡后恢复持续时间可能与年龄没有线性关系,但与高中以下青少年的直接比较尚未发表[30] ,因为这个样本中的青少年恢复时间比青春期前的儿童更长。这些发现加在一起,意味着男性青春期年龄组的恢复时间较长的独特风险。对年幼儿童和女性的进一步研究将大大提高我们评估和减轻整个儿科和青少年年龄段风险的能力。在新的脑震荡发生前1年内遭受一次或多次脑震荡的青少年报告出现更长时间的症状,30表明可能存在“脆弱性窗口”,并将先前受伤的青少年置于更高的长期恢复风险中。11-18岁的青少年在急诊室出现脑震荡后发生脑震盪症候群的可能性比5-10岁的儿童高出近80% ,同样,伴有头痛的儿童和青少年出现脑震盪症候群的风险增加了一倍。在 mtBI 后在急诊室接受治疗的儿童中,6岁以上的儿童在受伤后3个月报告持续症状的发生率高于6岁以下的儿童。当然,获得 < 6岁儿童脑震荡症状的准确信息的能力可能受到缺乏症状自我意识和有效沟通这些症状的必要语言技能的限制。此外,从这些报告中不可能直接比较损伤的严重程度; 事实上,各种损伤的身体异质性,加上个体从脑震荡中恢复的先天能力,使得这种比较具有高度挑战性。一些专业研究中心正在使用“智能头盔”来标准化头部撞击产生的体力和角加速度,目前正在研究这些头盔用于测量和预测可能导致脑震荡的影响。36,37从脑震荡中恢复的年轻人可能会经历重大挑战,包括社会和学术发展的改变,38,39,40在一般智力测试中得分较低,以及学校表现下降(以年级平均分衡量)。39较低的父母教育水平和儿童学业成绩都与较差的脑震荡恢复相关。人格特质也起到了一定的作用,例如,伤前焦虑是运动性脑震荡后长时间恢复的一个危险因素。42年轻的男女运动员都有脑震荡的危险,但是女孩的脑震荡发生率高于男孩,特别是在高中和大学的足球、篮球、棒球或垒球比赛中。28,43,44,45解释这些差异的因素仍然不确定,但可能包括保护装备的质量,脑震荡症状的识别和报告,以及颈部长度和颈部肌肉力量。46男女之间在恢复轨迹方面的差异也知之甚少。然而,最近的一项研究表明,女性黄体酮水平影响脑震荡后的恢复。47青春期激素变化导致偏头痛,也可能导致脑震荡后恢复的性别差异。在青春期后,女性偏头痛的发病率是男性的四倍[48,49] ,一些证据表明,偏头痛患者在脑震荡后恢复较慢[50,51]。有必要进一步研究脑震荡风险和恢复的性别差异。一般来说,成人脑震荡比儿童和青少年脑震荡更容易理解。有几点值得注意。首先,脑震荡有多种非协调的定义。其次,脑震荡诊断是一门不完善的艺术。最后,在缺乏快速和廉价的客观诊断措施的情况下,脑震荡仍然是一种临床诊断,受到变异性的影响,包括不同亚专业和个体医生、神经心理学家和运动训练员的诊断阈值不同,以及教练、家长和年轻运动员报告不足。如果没有经过验证的诊断,脑震荡将仍然是一个模糊和报告不足的实体,发病率估计的准确性将继续受到不确切标准的差别应用的影响。重复性次级脑震荡可导致大脑结构和功能的改变。52弥散张量成像(DTI)检测到的白质异常在职业足球运动员中已有报道,即使没有任何明显的脑震荡史。与游泳运动员相比,男性职业足球运动员表现出 DTI 信号改变,提示几个大脑区域的白质完整性降低,这可能表明轴突髓鞘形成的丧失,类似于 mTBI 患者的改变。53名大学冰球运动员在一个赛季中表现出类似的白质变化。54,55,56,57此外,美国大学生橄榄球运动员重复性亚震荡性头部撞击已经以剂量依赖性方式与 BBB 
完整性缺陷,白质完整性潜在丧失和认知功能障碍有关。58这些研究结果可能反映了持续遭受重复性次生脑震荡撞击的青少年的某种程度的风险,尽管很少有专门针对这一主题的研究。一个跟踪头部影响的指标ーー即“命中次数”ーー已经提出,59可以作为确定累积风险敞口的一个因素。这种方法的一个挑战是准确定义“命中”的参数,但改进的生物传感器在这方面显示出一些希望。与棒球中的“投球次数”类似,这个概念最近也被提出用于拳击运动员。24目前没有证据表明青少年重复性脑震荡冲击与晚年痴呆之间的因果关系,如果未来的研究将头部冲击与随后的神经心理功能障碍相关联,这些指标可能被证明是无价的。在成年人中,包括脑震荡在内的脑外伤可能会增加个体发生神经退行性疾病的风险,包括 AD 和 CTE (CTE) ,这是一种仅与重复性头部创伤相关的疾病[65,66]。尽管 mTBI 和 PD 风险之间的关系仍然不确定,但 TBI 也可能增加发生帕金森氏症的风险[67]。在儿科人群,特别是年轻运动员中,单次或重复性脑震荡对晚年神经退行性疾病和痴呆风险的影响是未知的。CTE 在20世纪20年代后期首次被症状性描述为拳击运动员的“拳击醉”痴呆,69后来被描述为“痴呆拳击”[70] ,并在1973年首次被病理学描述[71]。自2005年在一名前职业美式足球运动员身上发现 CTE 以来,这种病症已经引起了公众的广泛关注,目前已经在前冰球、棒球、橄榄球和足球运动员、73名摔跤运动员、74名退伍军人的大脑中发现。75,76业余和职业运动员慢性创伤性脑病的患病率和发病率仍然是未知的,这增加了讨论其流行病学和运动员的人口风险的困难。虽然慢性创伤性脑病主要被认为是一种神经退行性疾病,有时是由大学或专业接触性运动的职业生涯造成的,但在高中运动员中也有慢性创伤性脑病的报道。这一发现表明,慢性创伤性脑病的发展并不需要长期的运动生涯,青年运动员代表着高危人群。新出现的证据表明,临床的慢性创伤性脑病症状可以分为认知和情绪行为两种常见表现[78,79]。主观记忆症状如顺行性遗忘症是常见的,包括焦虑或抑郁在内的情绪障碍也是常见的[79] ,并且执行功能降低,这可能导致去抑制和决策技能受损[80]。这些临床症状定义了疾病的严重程度[81]。慢性创伤性脑病的神经退行性病理生理学是复杂的,对神经系统后遗症的了解很少。在严重的情况下,大脑皮层和内侧颞叶似乎受到最深刻的影响,81,82与病理学拥有属性由磷酸化 tau79组成的神经原纤维缠结,在某些情况下,TAR DNA 结合蛋白43病理学。CTE 也与明显的萎缩有关,特别是在额叶皮层和内侧颞叶,以及在乳头体,丘脑和下丘脑。79确诊的 CTE 临床诊断仍以尸检为基础。鉴于慢性脑震荡中描述的重复病变是否引起临床表型的不确定性,以及大多数专业和大学运动员不发展慢性脑震荡的事实,了解早期暴露于脑震荡是否与其他形式的神经退行性疾病和认知功能障碍(包括慢性神经认知障碍(CNI))相关至关重要。CTE 和 CNI 之间存在重要的临床区别,其中一些使得直接比较困难。CTE 是一种新出现的临床和病理状况,涉及多个领域的神经和认知功能的进行性恶化,主要在尸检中诊断。相反,CNI 表型并不一定是进行性的,而是拥有属性功能从组平均值或基线功能下降到创伤性脑损伤之前的水平。CNI 可以通过神经心理测试进行临床诊断。CNI 与头部创伤之间的因果关系尚未得到证实,但在专业运动员中一直发现剂量依赖性风险。此外,在业余运动员中进行的几乎一半的研究发现 CNI 的风险升高。年轻人群中是否存在类似的风险关联仍有待确定。一个假设是 CNI 代表了慢性创伤性脑病的前驱症状,但并非不可避免,类似于轻微认知障碍和 AD 之间的关系。另外,CNI 可能代表静态损伤而不退化。我们目前对 CNI 和 CTE 的基本生物学基础缺乏了解,这强调了进一步研究的必要性。对这两种情况的生物学知识的增加以及运动员(特别是青年运动员) CNI 的早期检测可能会推动干预措施以阻止进一步认知障碍的发展,并且还可能有助于验证推定的生物标志物。通过 tau 成像评估 CNI 可能有助于确定进展为 CTE 的可能性。脑震荡遗传学领域,特别是在儿科人群中,仍然处于起步阶段。尽管重复的头部撞击似乎对于 CTE 的发展是必要的,但是包括遗传学在内的其他因素可能具有重要作用,因为大多数脑震荡运动员不发展 CTE.87 CTE 的遗传危险因素可能与影响脑震荡易感性和恢复的因素重叠,AD 的遗传危险因素为这些因素的身份提供了重要的线索。E型载脂蛋白质的 ε4等位基因(APOEε4)是迄今为止发现的 AD 最重要的遗传危险因素,它严重影响中枢神经系统的损伤反应,特别是从大脑中清除淀粉样蛋白 -β (Aβ)。APOE 的三个等位基因赋予不同程度的 AD 风险: APOEε2降低风险,APOEε3是最常见的等位基因,代表与其他变体进行比较的基线风险,APOEε4增加风险。90,91研究表明 APOEε4与性别之间存在相互作用,因此 APOEε4相关的 AD 风险在女性中比在男性中更为突出。92,93 APOE 基因型与 TBI 协同作用增加 AD 的风险[94] ,尽管其与 CTE 作为重复 mTBI 的结果的假设风险相关性需要更多的研究。关于 APOE 同种型对儿童 TBI 结果的影响尚未达成共识,但来自成年人的数据表明 APOEε4对脑震荡结果有负面影响。一些研究表明,拥有至少一个 APOEε4等位基因与美国职业橄榄球运动员,96名拳击运动员95和其他成年人97,98,99,100的脑震荡后认知较差和持续的神经心理障碍有关,尽管其他研究没有发现这种关联。101,102一些证据表明 APOE 基因及其启动子的多态性是大学生运动员脑震荡危险的促成因素。另一项研究没有确定 APOEε4在脑震荡风险中的作用[105] ,尽管这个等位基因可能增加中年或晚年 mTBI 后痴呆的风险。106由于样本量小,方法不同,很难从这些相互矛盾的研究中得出结论。在儿童中,对于 APOEε4与脑震荡后神经心理学结果之间的关系知之甚少,而且 APOEε4测试在儿科 TBI 研究中并不常规。2012年,Kurowski 回顾了少数现有的研究,并结合了使用格拉斯哥结果量表的三项研究的结果[107,108,109]。在合并样本(252名儿童)中,6-12个月后不良临床结果的风险在 APOEε4携带者中高于非携带者(19% 比9%)。然而,这些研究包括了广泛的异质性损伤儿童的发育范围,并没有考虑到年龄和基因型之间可能的相互作用。此外,APOE 与性别之间的相互作用尚未在脑震荡的背景下进行研究。改进的前瞻性研究有助于澄清这些联系。将遗传学纳入儿科脑震荡研究充满了复杂的挑战,包括获得父母同意和儿童的知情同意,临床研究参与者的感知耻辱,获得的遗传知识的可行性以及关于可保性(特别是长期护理保险)的潜在担忧。对了解 APOEε4 + 状态的成年人的研究表明,许多人愿意改变生活方式,包括增加运动和改善药物管理[111] ,以及增加购买健康和长期护理保险[112,113]。关于新的遗传知识和相应的疾病风险的教育是必不可少的,正如个人对获得的知识的影响的个人感觉与痴呆风险增加的实际后果之间的实质性不一致所证明的那样.114 APOE 遗传知识对儿童,其家庭和参与影响性体育的决策过程的影响尚不清楚。APOE 基因型对该年龄组脑震荡风险和恢复的影响也需要进一步阐明。如果未来的研究发现,对于任何特定水平的影响,具有 APOEε4 + 状态的儿童比其 APOEε4同龄人具有更大的脑震荡或恢复不良的风险,则应考虑在参加影响性运动之前对学龄运动员进行基因检测。要充分理解基因影响的细微差别,就需要对高中和年轻运动员进行仔细研究。未来对青少年脑震荡结果(包括认知结果和痴呆风险)的研究应尽可能包括 APOE 基因分型。新的 APOE 研究应标准化研究方法和报告措施,包括收集“共同数据元素”,以确保有效的比较研究。110,115 APOE 基因型不一定是脑震荡恢复的不可改变的危险因素: 正在开发的 AD 治疗包括改变 ApoE4蛋白和 Aβ 之间相互作用的药物,这也可能适用于儿科脑震荡。编码脑源性神经营养因子的基因中的 Val66Met 多态性与 mtBI 后更好的结果有关,但与局灶性穿透性脑损伤后更差的结果有关。参与多巴胺能信号传导的基因多态性也可能有助于解释广泛的 TBI 结果。120此外,α-synuclein 基因启动子区的 
Rep1多态性可能增加头部损伤后帕金森病的风险。为了提高我们对脑震荡风险和管理的理解,应该进行大型的前瞻性基于人群的全基因组关联研究(GWAS)和全基因组测序研究,以确定其他遗传变异(可能是低频率或低外显率) ,这些变异可以改变长期恢复,认知结果差或痴呆的风险。122这样的研究将需要大规模的数据共享,并且必须解决道德、隐私以及对可保性和可雇佣性的潜在影响等问题。尽管在确定可能应用于成人创伤性脑损伤治疗的可能的脑嵴液(CSF)和血液生物标志物方面取得了进展,但成人或儿科人群都没有经过临床验证的生物标志物。与成人脑震荡相比,儿童脑震荡的临床变异性更大; 因此,生物标志物在改善儿童脑震荡诊断方面具有特殊的潜力。值得注意的是,大多数 TBI 生物标志物已经在中度至重度 TBI 的背景下进行了研究,这使我们在 mTBI 生物标志物的知识方面存在明显的差距,特别是在儿童中。生物标志物的发展对 AD 治疗的进步至关重要。基于脑脊液的生物标志物已经被用于识别高危患者,并改善流行病学研究和临床试验的设计。123新的 PET 放射性配体,如淀粉样蛋白标记剂(其中三种现在是 FDA 批准的) ,可以用于诊断和改善基于神经病理学的患者临床试验分层。一些 tau 成像剂也在人体试验中,它们在包括 CTE 在内的 tau 病中的应用正在迅速建立。与基于液体的生物标志物一样,目前还没有足够敏感或特异的神经影像生物标志物来诊断成人或儿童的脑震荡或 CTE。目前 FDA 尚未批准任何创伤性脑损伤的诊断或治疗药物,而脑震荡生物标志物的验证可以加速这类药物的开发。然而,必须努力确保临床生物标志物检测的成本效益和广泛可用性。此外,考虑到与腰椎穿刺相关的风险,对脑震荡青少年脑脊液取样用于生物标志物研究的伦理问题应该得到解决。在成人体液为基础的生物标志物研究中有希望的发现必须在儿科人群中探索。过去数十年,推定脑震荡的生物标志物在科学文献中零星出现,其中最突出的是星形胶质细胞活化的非特异性标志物 S100钙结合蛋白 B (S100B)。血清中 S100B 的存在可能提示血脑屏障完整性的丧失。在成年拳击手比赛后观察到血清和脑脊液 S100B 水平升高,并且与头部撞击的数量和严重程度呈正相关。在脑震荡的职业冰球运动员中也观察到血清 S100B 水平升高,126在脑震荡后1小时测量的水平预测症状恢复时间。然而,S100B 的水平也提高后,控制发挥,没有发生脑震荡,表明这一标志物是不伤害特异性。事实上,没有头部损伤的成年创伤患者血清 S100B 水平升高。127,128,129其他研究表明,脑震荡后最初的 S100B 水平对于恢复不能很好地预测。与所有生物标志物一样,S100B 在儿童 TBI 管理中的作用甚至更不清楚[131] ,一些人认为这种标志物在儿科人群中几乎没有诊断或预后效用。132在一项关于≤15岁 TBI 患儿的研究中,5岁以下或9岁以上儿童的血清 S100B 水平高于5-9岁儿童。因此,S100B 可能不足以区分有症状和无症状的脑震荡儿童[133] ,S100B 在诊断和预后预后方面的效用是值得怀疑的。134,135,136神经元特异性烯醇化酶(NSE)是神经元损伤的标志物,但其作为血清或脑脊液生物标志物的用途仍不确定。拳击手头部撞击后观察到血清 NSE 水平升高[133,134,135,136,137] ,但在没有发生脑震荡的比赛后,冰球运动员也观察到 NSE 水平升高。血清 NSE 水平无法预测脑震荡后的恢复时间,可能与儿童损伤严重程度无关。133在≤15岁的儿童中,血清 NSE 水平与年龄呈负相关。一旦释放到血液中,NSE 具有缓慢的消除动力学,使得难以根据 NSE 水平区分原发性和继发性神经元损伤。神经丝轻链和胶质纤维酸性蛋白(GFAP)分别是 CSF 神经元特异性和胶质特异性损伤标志物,并且在成年拳击手打斗后 CSF 均升高。125,137,140在儿科脑震荡的情况下,对任何一种标志物都知之甚少,但对儿童和年轻成年人的初步研究表明,脑震荡后72小时内的血清 GFAP 水平与损伤后1个月的症状负担相关。神经元特异性蛋白 UCH-L1(泛素羧基末端水解酶同工酶 L1)首先通过参与 PD 与神经退行性病理学相关[142] ,其在血清中的存在后来被确定为严重 TBI 的生物标志物。血清 UCH-L1水平可能对脑震荡有诊断价值[146] ,但最近的证据表明血清水平升高与脑震荡次数之间缺乏相关性。UCH-L1在儿科人群中的临床应用值得进一步研究。也许最有希望的进展成人液基 TBI 生物标志物涉及 tau 蛋白。血清或脑脊液 tau 蛋白水平被认为表明轴突损伤,因为 tau 蛋白通常存在于轴突中,稳定微管。在 AD 患者中,脑脊液中切割的 tau 蛋白水解水平可能与认知功能相关。拳击手在比赛后脑脊液和血液中的 Tau 水平升高,脑脊液 Tau 水平与头部撞击的质量和数量相关。125,150最近的证据表明,脑震荡后冰球运动员血液中的 tau 水平升高,可能有助于预测恢复时间。然而,问题依然存在,一些研究报道血清切割 tau 对预测脑震盪症候群或长期结果的价值很小或没有价值。130,151 tau 作为儿童生物标志物的潜力尚不清楚,至今没有进行研究。事实上,血清 tau 作为一种生物标志物的可靠性尚未被确定为任何适应症。这种可能性是没有单一的生物标志物将足以诊断儿童脑震荡或预测结果。此外,很少有研究调查遗传组成和推定的生物标志物之间的相互作用。随着我们对生物标志物与损伤严重程度及其相互关系的理解的增加,生物标志物小组的发展,可能包括炎症和氧化标志物,152应该被考虑。未来的研究应试图进一步确定这些关系,建立生物标志物小组的临床价值,考虑到商业成本和实际可行性。代谢组学、脂质组学和蛋白质组学的最新进展ーー特别是寻找 AD 的代谢组学和脂质组学标志物ーー可能为今后研究脑震荡和脑震荡下损伤的生物标志物提供参考。最近的一些研究提出了与 MCI 和 AD 相关的代谢物和脂质谱的改变.153,154,155,156来自动物模型的数据表明,脂质和代谢物变化伴随着急性和慢性脑震荡后期,并且可能有助于预测恢复轨迹,157,158但是这些发现尚未在人类中得到验证。将生物标志物的搜索范围从血液和脑脊液扩展到唾液和尿液159,可能会提高快速和非侵入性测量的能力,特别是从儿童身上。从儿童抽取脑脊液样本,特别是在需要快速评估的情况下,在很大程度上是不切实际的。Mondello 等人提出了一套评估 TBI 生物标志物的有用标准,这些标准应该允许更精简的开发和验证.137任何经过验证的生物标志物小组必然是更大的多模式诊断套件的组成部分,其中可能包括结构和功能成像以及神经心理学测试。在设计未来的生物标志物研究时,应考虑 FDA 批准的可能性,以加快批准临床使用。虽然脑震荡仍然是一种临床诊断,但神经影像学技术正在提高我们对成人脑结构和功能后果的认识。儿科人群的神经影像学可能受到几个因素的限制,例如,脑震荡后纵向变化的测量由于动态的、未成熟的大脑的背景而变得复杂。没有成像技术被证实为脑震荡的诊断工具,成像结果与临床可测量的认知或行为功能之间的相关性是可变的。目前正在研究容积成像、 DTI 和功能磁共振成像(fMRI)等工具,特别是动脉自旋标记。通过 DTI 测量的分数各向异性(FA)可以推断白质束的结构完整性,TBI 后白质束通常被破坏。FA 变化的临床意义仍然存在争议,因为在脑震荡研究中观察到 FA 增加和减少[162,163,164,165,166]。这些差异可能部分是由于所检查的脑区域的相当大的空间异质性[167]以及损伤后间隔的差异。FA 可能仍然具有预后价值,有证据表明变化的方向和幅度与临床结果相关; 然而,这个想法等待在儿科和成人人群中验证。FA 可能缺乏必要的敏感性来充分评估脑损伤后白质束完整性的变化,扩散率的测量可能更合适。169 DTI 领域将大大受益于规范数据集的开发,以衡量观察到的变化。年轻运动员的赛前、赛后和赛季研究可以采用连续 DTI 成像技术为特定个体建立规范的数据,但数据汇总后的效用尚不清楚。儿科标准数据的缺乏严重限制了包括 DTI 在内的神经影像技术的临床应用。儿童脑震荡后的“回归基线”神经影像学研究也是必要的,因为它们可以极大地改善恢复的预测。尽管自动化提高了重复性,但 DTI 测量仍然对硬件和软件特异性,采集参数和分析软件敏感,这限制了重复性,标准化和中心之间以及跨研究之间的比较。标准化 DTI 
成像中心的努力正在进行中。170 MRI 在绘制大脑的“连接体”(结构和功能神经连接网络及其各自的焦点节点的集合)以及研究脑震荡如何影响这些网络方面特别成功。局灶性或弥漫性 TBI 可以破坏大脑的功能连接,导致多个网络的功能障碍,包括默认模式和显着网络,这与记忆,情绪和情绪有关[171]。网络功能障碍对恢复的影响可能比病变部位更强[171,172,173] ,但对大脑发育和认知功能的长期影响尚不清楚[26,174]。脑震荡后儿童网络连接功能障碍的进一步研究对于改善损伤预后和管理至关重要。用于 PET 成像的放射性示踪剂有可能推进脑震荡和 CTE 的诊断和治疗,但目前它们在儿科人群中的应用纯粹是研究性的。三种 FDA 批准的放射性标记成像剂目前可用于检测疑似 AD 患者的脑淀粉样蛋白。175在成年人中,一些脑震荡病例与急性 Aβ 病理有关。PET 扫描可以使儿科患者监测急性脑震荡后淀粉样蛋白的存在和持续性,并确定扫描阳性和阴性是否预测不同的结果.176,177在儿科人群中具有潜在用途的其他 PET 成像剂包括结合由 tau 组成的神经原纤维缠结的新示踪剂。用18F-T807,18F-T808和18F-THK5105进行的早期成像结果证明对于确认包括 AD 在内的各种临床情况下存在共病是有用的。178,179,180在最近的一项 AD 研究中,tau 示踪信号的大小与疾病的分期和认知障碍的严重程度呈正相关。第三种 tau PET 示踪剂11C-PBB3已经在健康个体和 AD 患者中进行了测试,并且可能能够检测 tau 的非 AD 构象。181此外,最近的一份报告首次描述了疑似与运动相关的慢性创伤性脑病(CTE)在活人中的重病影像学表现。鉴于脑震荡,重复性亚震荡损伤和 CTE 中慢性 tau 病理学的程度,tau 示踪剂可用作诊断和预后生物标志物(例如,区分 CNI 和 CTE)。目前正在对 CTE 成人进行这些示踪剂的研究,但它们在儿科人群中的应用将取决于未来的研究,以确定 TBI 或脑震荡后年轻患者是否存在 tau 病理学。小胶质细胞胆固醇转运蛋白的 PET 示踪剂可能有助于成像与创伤性脑损伤相关的神经炎症。182正在开发的新型 PET 配体可以成像脑小胶质细胞,对神经退行性疾病具有潜在的应用价值,也可能证明对脑震荡和慢性创伤性脑病的治疗有用。在脑震荡和 TBI 的儿科人群中探索这些 PET 配体将是有益的,但是在开始进行涉及该年龄组放射性示踪剂的研究之前必须进行风险-效益分析。任何 PET 成像剂的最终效用将取决于其作为多模式生物标志物和神经影像技术小组的一部分的诊断和预后价值。非侵入性技术如经颅磁力刺激(tMS)也发现了创伤性脑损伤和脑震荡后突触可塑性的变化,特别是在无症状的个体中。对20多岁有脑震荡史的年轻运动员进行的几项小型 TMS 研究表明,运动皮层中 γ-氨基丁酸和/或谷氨酸神经传导的不平衡与突触长时程增强作用和抑郁症的缺陷有关。184,185,187,188经颅磁刺激还显示,脑震荡相关的突触可塑性损伤可以损害运动学习的各个方面,这些缺陷在个体最后一次脑震荡几十年后仍然可以检测到。另一个检测与脑震荡相关的神经化学功能障碍的关键非侵入性工具是质子磁共振谱(MRS)。专门针对运动相关脑震荡后使用光谱学的报告表明,与神经化学改变一致的各种异常。在年轻(高中)运动员中,MRS 在赛季后与赛季前评估中检测到谷氨酸和谷氨酰胺水平增加,即使在赛季期间没有经历临床显着脑震荡的运动员中也是如此。这些发现表明,即使是次震荡性头部撞击也可能导致谷氨酸途径的激活,意味着细胞损伤或神经元死亡,尽管没有症状。在上述研究中,一部分参与者的肌酐和肌醇水平(位于星形胶质细胞中的有机渗透液192,193)也发生了显著变化。在一项使用 MRS 的罕见追踪研究中,194名持续单次运动相关脑震荡的个体在受伤后3天在大脑中表现出显着降低的 N- 乙酰天冬氨酸(NAA,神经元和轴突健康,完整性和功能的标志物195)水平。损伤后15天水平升高,损伤后30天恢复到对照值。相比之下,在第一次脑震荡后10-13天再次受到脑震荡的参与者表现出 NAA 水平的长时间下降,即使在受伤后45天也没有恢复正常。这些结果表明,在短时间内反复受伤增加了延长或不完全恢复的可能性。除了 MRS 检测到的急性和亚急性改变之外,其他关于脑震荡长期影响的研究已经揭示了在其他健康的前运动员中,内侧颞叶中肌醇(与胶质增殖相关)增加和胆碱(与膜转换相关195)水平降低在测试之前持续最后一次脑震荡超过三十年。196最近的另一项研究使用一种叫做相关光谱学(COSY)的先进的 MRS 方法,检测了一组有症状的退役国家橄榄球联盟球员,这种方法可以测量额外的代谢物。作者发现胆碱和谷氨酸-谷氨酰胺水平升高(分别表明弥漫性轴突损伤和兴奋性毒性) ,与之前的 mtBI MRS 研究一致,以及额外的大脑代谢物表明神经炎症的变化。这些新陈代谢的变化可能提供了损伤机制的洞察力,如兴奋性毒性和/或炎症,这可能是所报道的结构变化的基础。总的来说,现有的数据支持使用 MRS 作为一种研究工具,以确定改变的神经生理学和监测恢复成年运动员,即使在解决后脑震荡症状。目前,MRS 检测到的生化改变可以增强我们对潜在病理生理学的理解,但尚不能提供具体的诊断信息。需要更大的横断面,前瞻性和纵向研究来确定 MRS 在运动相关脑震荡领域内的敏感性和预后价值.190由于未成熟大脑中 MRS 的解释需要某些发育方面的考虑,因此将来在儿童中的工作将需要适当的比较样本。具有更高光谱分辨率的 MRS 技术,包括 COSY,可能提供额外的生化特异性。空间分辨率的其他进展,如3D 化学位移成像,也可以通过允许调查整个大脑的代谢改变而不是在特定的感兴趣的区域,提供更大的特异性。最后,MRS 可以在测量治疗效果方面发挥作用,例如经颅直流电刺激198和 TMS.199。体育相关伤害测量,报告,跟踪和数据共享的机制和监测基础设施不足以满足目前的需求和目标。脑震荡的研究和临床工作受到缺乏运动和运动水平的脑震荡数据的阻碍。2014年美国医学研究所的一份报告只确定了三个国家运动伤害监测系统: 国家电子伤害监测系统ーー所有伤害项目(NEISS-AIP)、全美大学体育协会伤害监测系统(NCAA ISS)和高中伤害在线报告系统(rIOTM)。1这些系统可以补充临床数据(例如,来自急诊科、住院病人和体育诊所) ,但这些数据偏向于更严重的伤害和社会经济地位更高的病人。事实上,农村地区或社会经济地位较低的社区的学校往往很难获得运动医疗专业人员和设施。一些新出现的项目可能会改善监督。区域性的努力,如运动训练员临床结果研究教育(CORE-AT)和全国性的努力,如全国运动训练员协会全国运动治疗,伤害和结果网络(NATA NATIONTM)试图将伤害跟踪与高中和大学水平的治疗和结果数据结合起来。然而,这些系统中没有一个专门针对年轻运动员、那些参加非学校赞助体育项目的运动员或那些在没有运动教练的学校的运动员。运动损伤数据库也很少考虑人口统计因素,包括社会经济地位、种族或民族以及医疗保健覆盖率。目前,还没有有效的机制来连贯和廉价地将各种监测数据集联系起来,或者跨越体育、跟踪系统或年龄连续体跟踪个别运动员。现在相当需要一个系统来追踪个人运动员的运动生涯和其他方面。应该对每个人进行数十年的跟踪,以确定 TBI 的负担是否、何时以及如何演变为 
CTE,并评估与脑震荡相关的所有可能的负面健康结果。这种系统还可以更准确地描述脑震荡病史和风险因素,并可以捕捉短期和长期的结果,包括身体和心理健康、学业和职业成功、生活质量和社会联系以及不断变化的社会经济地位。这种努力受到各种问题的挑战,包括缺乏任何级别的脑震荡强制性报告。强制性脑震荡报告、为监测工作提供资金以及为数据记者(例如教练和运动员培训员)提供培训将极大地改善流行病学研究。然而,如果没有经过验证的、对脑震荡的共识定义,以及通用数据库和全球唯一标识符(GUID)系统的开发,强制性报告将无法提供有意义的结果。然后可以将标准化监测工作的数据集联系起来,从而改善研究和临床护理的数据共享。将监测数据与组织和液体样本生物库的标准化收集、储存和管理基础设施耦合起来,可以大大改善损伤和结果研究。200这些努力可以通过公私伙伴关系的资金来催化,并通过制定现实的短期和长期目标来实现,以创建一个多年计划。然而,至少在美国,这些努力目前受到对健康保险便利和责任法案(HIPAA)规定的误解和对运动员保密的普遍关注的阻碍。运动员更广泛地使用计算机神经认知测试(CNT)可以改善脑震荡的监测,以及诊断和管理。然而,在 CNT 成为常规手术之前,必须克服几个重要的挑战。这些挑战包括缺乏标准化的管理协议,不同计算机硬件引起的技术错误的可能性,评估的认知功能类型的限制,以及缺乏合格的测试管理员和数据解释员.201尽管存在这些缺陷,但是,CNT 已经被大约40% 的美国高中雇用运动教练员.202虽然不是所有学校都负担得起,但是 CNT 可以加强地面数据收集,帮助风险暴露估计和脑震荡后恢复跟踪,以及提高向运动损伤监测网络报告的数据质量。CNT 也可能有助于评估和跟踪脑震荡后认知改善或下降,并可能有助于预测结果.203,204在学校环境中收集的 CNT 数据是否将达到由临床研究小组进行的 CNT 所达到的验证和重复性标准仍有待观察。重要的是,CNT 需要标准化和指导方针,以确定“返回运动”和“返回学习”的运动员在一个领域表现出恢复,但在其他领域仍然有症状。在临床和青少年运动员脑震荡监测和管理方面,需要对 CNT 的应用进行更多的研究。在一些关键领域,不完整的知识阻碍了儿科脑震荡领域有意义的进展。在分子和细胞水平上,迫切需要重点研究脑震荡和重复性亚震荡损伤后的轴突损伤,以阐明轴突运输和修复的变化,并更好地定义瞬时 Aβ 积累作为下游和/或未来病理学的潜在驱动因素的作用。脑震荡研究人员可能需要确定更合适的动物模型来研究分子病理学,包括 tau 蛋白及其对脑震荡后和慢性创伤脑炎病理学的贡献,因为啮齿动物和人类的大脑结构和组织大不相同。如果不能更清楚地了解创伤性脑损伤如何改变年轻、仍在发育中的大脑,以及在损伤后的数周、数月和数年内会发生什么样的病理事件,我们就只能推测这种改变的潜在生物学基础。通过使用记录线性和旋转力的传感器技术,可以改进青年体育运动中头部影响数据的收集和风险评估。这种商业上可用的设备,如果经过验证,可以确定在比赛期间和整个比赛季节中头部累积冲击力的水平,并且研究结果可以与神经影像学数据和功能结果评估联系起来。结合“击中计数”指标,传感器数据可以提高对重复性次生震荡影响的短期和长期神经心理学结果的认识。我们对慢性创伤性脑病的认识可以通过了解一般人群、受伤运动员、运动和运动位置匹配的未受伤运动员以及低风险运动中的“控制”运动员的基线率来改善。提高对风险暴露的认识可导致预防努力,包括改变做法和竞争规则。一项长达数十年的前瞻性追踪研究,追踪青年运动员的运动生涯及以后的发展,将提供有关累积性头部撞击以及长期神经心理功能障碍和痴呆风险的更确切知识。这样的研究正在 NCAA 校友中进行,他们于2003年首次接受研究,并于2013年重新评估。其他人群的研究,特别是如果 NIH 资助的话,可能会从5年的研究开始,可以进一步延长5年的增量。可能需要建立公私伙伴关系,以获得足够的资金,使多个研究中心参与进来。NCAA 已经为100多名运动员的10年重新评估提供了部分赞助,但需要来自 NIH,美国国防部(DoD)和私人慈善来源的进一步资助,以扩大评估范围,从神经心理学,通过 MRI,淀粉样蛋白,tau 和/或炎症的分子成像。理想情况下,追踪研究设计应结合流行病学和介入试验方法,并利用多个对照组,包括非接触运动员和未受伤的撞击运动员。追踪研究还将阐明认知储备的作用。老年痴呆症研究团体利用国家卫生研究院的资金以及涉及制药公司和基金会的公私伙伴关系,开创了这类研究的先例。为了使这类研究取得成功,必须首先建立更多的监测系统和数据库。如果参加影响力体育运动的运动员能够普遍获得运动员训练员的帮助,这些训练员能够在促进安全和提供基本护理的同时充当可靠的数据报告员,那么将加快努力。此外,任何纵向研究都必须包括死后分析,以便更好地了解儿童和青少年脑震荡对今后生活中神经退行性病理和痴呆发展的影响。由于缺乏严格的流行病学证据,“重返赛场”的指导方针目前受到阻碍,纵向研究的长期安全数据可能会大大改善这一点。纵向研究还可以包括确定那些未能遵循指导方针的运动员是否会经历任何负面健康影响的研究,例如持续的症状或改变发生第二次脑震荡的风险。长期前瞻性研究的基础设施可以通过建立一个以阿尔茨海默氏病神经影像学倡议(ADNI)为模型的研究联盟来创建。ADNI 为数据收集、传播协议、测试方法和生物标志物收集和分析制定了标准。目前正在国防部参与的一个版本的 ADNI (ADNI-DoD)专注于军事人群中与爆炸相关的 TBI 研究。2072014年5月,除了 NCAA 脑震荡研究,NCAA 和国防部宣布启动迄今为止最大的前瞻性运动相关脑震荡研究,该研究将在3年内监测大约37,000名 NCAA 运动员。我们可以想象,这项研究的基础设施可能最终扩展到研究年轻运动员在一个延长的纵向范围。我们对创伤性脑损伤的生物学知识仍然存在许多差距,这限制了我们开发有效药物的能力。如果我们要解决潜在的疾病病理,并超越治疗症状,就必须填补这些空白。然而,当基础创伤性脑损伤生物学的研究继续进行时,许多工作可以完成。药物再利用包括测试现有 FDA 批准的新适应症药物,可以减少费用和缩短药物批准的路径。目前的再利用试验包括哌醋甲酯治疗疼痛和精神疲劳,多巴胺受体激动剂溴隐亭治疗工作记忆,舍曲林治疗情绪和焦虑,这是最常见的影响脑震荡后长期结果的神经心理并发症。此外,黄体酮的 PROTECT III 期临床试验最近未能改善急性 TBI211后的结局,这可能提醒人们需要更多的研究来更好地理解 TBI 的基础生物学。虽然许多药物重新利用的努力主要是为了解决脑震荡症状,药物也可能影响损伤病理学和进展。对现有药物的研究也可能导致新的药物发现努力,并可能导致新的预防或管理治疗。急需新的药物治疗创伤性脑损伤和无法消除的脑震荡。在神经保护和抗炎领域的药物发现努力是特别相关的,因为它们潜在的交叉适用于神经退行性疾病,如 AD。同样,目前正在开发的治疗其他神经退行性疾病的药物可能会被重新定位,用于 TBI 或无脑震荡症状患者的检测。正如医学研究中经常出现的情况一样,脑震荡研究的最新进展提出的问题和回答的问题一样多。有证据表明脑震荡或重复性次生脑震荡后长期神经心理功能障碍和晚年痴呆,需要更多的工作来更好地理解青年参与影响性运动的含义和结果。正如本专家共识文件所概述的那样,有一条前进的道路,但实现这里概述的目标将需要公共和私营部门的合作。虽然可以通过增加知识来改进建议,但现有证据仍然可以在考虑青年参与体育运动以及实践政策和竞赛规则时为个人决策提供信息。随着人口老龄化和痴呆症的流行,我们必须更多地了解潜在的早期生活风险因素,包括与运动有关的脑震荡。家长、教练、学校董事会和孩子们做出的选择将在脑震荡科学知识的关键差距得到填补时得到更好的信息。下载参考资料|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Practical+Design+of+Performant+Recommender+Systems+using+Large-scale+Linear+Programming-based+Global+Inference)|0| +|[Rank-heterogeneous Preference Models for School Choice](https://doi.org/10.1145/3580305.3599484)|Amel 
Awadelkarim, Arjun Seshadri, Itai Ashlagi, Irene Lo, Johan Ugander|Amazon; Stanford University|School choice mechanism designers use discrete choice models to understand and predict families' preferences. The most widely-used choice model, the multinomial logit (MNL), is linear in school and/or household attributes. While the model is simple and interpretable, it assumes the ranked preference lists arise from a choice process that is uniform throughout the ranking, from top to bottom. In this work, we introduce two strategies for rank-heterogeneous choice modeling tailored for school choice. First, we adapt a context-dependent random utility model (CDM), considering down-rank choices as occurring in the context of earlier up-rank choices. Second, we consider stratifying the choice modeling by rank, regularizing rank-adjacent models towards one another when appropriate. Using data on household preferences from the San Francisco Unified School District (SFUSD) across multiple years, we show that the contextual models considerably improve our out-of-sample evaluation metrics across all rank positions over the non-contextual models in the literature. Meanwhile, stratifying the model by rank can yield more accurate first-choice predictions while down-rank predictions are relatively unimproved. These models provide performance upgrades that school choice researchers can adopt to improve predictions and counterfactual analyses.|学校选择机制的设计者使用离散选择模型来理解和预测家庭的偏好。最广泛使用的选择模型,多项式 logit (MNL) ,在学校和/或家庭属性中是线性的。虽然这个模型是简单和可解释的,但是它假设排名的偏好列表来自于一个从上到下在整个排名过程中是统一的选择过程。本文介绍了两种适用于学校选择的秩异质选择模型的建模策略。首先,我们采用了一个上下文相关的随机效用模型(CDM) ,考虑了在早期上层选择的情况下发生的下层选择。其次,我们考虑根据等级对选择模型进行分层,在适当的时候将相邻等级的模型相互调整。使用来自旧金山联合校区多年的家庭偏好数据,我们发现相对于文献中的非上下文模型,上下文模型大大提高了我们在所有排名位置的外部评估指标。同时,按等级对模型进行分层可以得到更准确的第一选择预测,而低等级预测相对来说没有改进。这些模型提供了学校选择研究人员可以用来改进预测和反事实分析的绩效提升。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Rank-heterogeneous+Preference+Models+for+School+Choice)|0| |[Connecting the Dots - Density-Connectivity Distance unifies DBSCAN, k-Center and Spectral Clustering](https://doi.org/10.1145/3580305.3599283)|Anna Beer, Andrew Draganov, Ellen Hohma, Philipp Jahn, Christian M. M. Frey, Ira Assent||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Connecting+the+Dots+-+Density-Connectivity+Distance+unifies+DBSCAN,+k-Center+and+Spectral+Clustering)|0| |[Shilling Black-box Review-based Recommender Systems through Fake Review Generation](https://doi.org/10.1145/3580305.3599502)|HungYun Chiang, YiSyuan Chen, YunZhu Song, HongHan Shuai, Jason S. Chang|National Tsing Hua University; National Yang Ming Chiao Tung University|Review-Based Recommender Systems (RBRS) have attracted increasing research interest due to their ability to alleviate well-known cold-start problems. RBRS utilizes reviews to construct the user and items representations. However, in this paper, we argue that such a reliance on reviews may instead expose systems to the risk of being shilled. To explore this possibility, in this paper, we propose the first generation-based model for shilling attacks against RBRSs. Specifically, we learn a fake review generator through reinforcement learning, which maliciously promotes items by forcing prediction shifts after adding generated reviews to the system. By introducing the auxiliary rewards to increase text fluency and diversity with the aid of pre-trained language models and aspect predictors, the generated reviews can be effective for shilling with high fidelity. 
Experimental results demonstrate that the proposed framework can successfully attack three different kinds of RBRSs on the Amazon corpus with three domains and Yelp corpus. Furthermore, human studies also show that the generated reviews are fluent and informative. Finally, equipped with Attack Review Generators (ARGs), RBRSs with adversarial training are much more robust to malicious reviews.|基于评论的推荐系统(RBRS)由于其缓解众所周知的冷启动问题的能力而引起了越来越多的研究兴趣。RBRS 利用评论来构建用户和项目表示。然而,在本文中,我们认为,这种对审查的依赖反而可能使系统面临被托儿的风险。为了探索这种可能性,本文提出了第一代基于先令攻击的 RBRS 模型。具体来说,我们通过强化学习学习一个虚假的评论生成器,它在向系统添加生成的评论之后,通过强制预测变化来恶意推销项目。通过引入辅助奖励,以提高文本流畅性和多样性的帮助下,预先训练的语言模型和方面预测,生成的评论可以有效的先令与高保真度。实验结果表明,该框架能够成功地利用三个域和 Yelp 语料库对亚马逊语料库中的三种不同类型的 RBRS 进行攻击。此外,人类研究也表明,生成的评论是流畅和信息。最后,配备了攻击评论生成器(ARGs) ,具有对抗性训练的 RBRS 对恶意评论更加有力。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Shilling+Black-box+Review-based+Recommender+Systems+through+Fake+Review+Generation)|0| |[Below the Surface: Summarizing Event Sequences with Generalized Sequential Patterns](https://doi.org/10.1145/3580305.3599264)|Joscha Cüppers, Jilles Vreeken||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Below+the+Surface:+Summarizing+Event+Sequences+with+Generalized+Sequential+Patterns)|0| -|[Generalized Matrix Local Low Rank Representation by Random Projection and Submatrix Propagation](https://doi.org/10.1145/3580305.3599361)|Pengtao Dang, Haiqi Zhu, Tingbo Guo, Changlin Wan, Tong Zhao, Paul Salama, Yijie Wang, Sha Cao, Chi Zhang|; Purdue University; Indiana University; Genentech; Amazon; Indiana University, Bloomington; Indiana University, School of Medicine|Detecting distinct submatrices of low rank property is a highly desirable matrix representation learning technique for the ease of data interpretation, called the matrix local low rank representation (MLLRR). Based on different mathematical assumptions of the local pattern, the MLLRR problem could be categorized into two sub-problems, namely local constant variation (LCV) and local linear low rank (LLR). Existing solutions on MLLRR only focused on the LCV problem, which misses a substantial amount of true and interesting patterns. In this work, we develop a novel matrix computational framework called RPSP (Random Probing based submatrix Propagation) that provides an effective solution for both of the LCV and LLR problems. RPSP detects local low rank patterns that grow from small submatrices of low rank property, which are determined by a random projection approach. RPSP is supported by theories of random projection. Experiments on synthetic data demonstrate that RPSP outperforms all state-of-the-art methods, with the capacity to robustly and correctly identify the low rank matrices under both LCV and LLR settings. 
On real-world datasets, RPSP also demonstrates its effectiveness in identifying interpretable local low rank matrices.|矩阵局部低秩表示(MLLRR)是一种非常理想的矩阵表示学习技术,它可以检测出具有低秩性质的不同子矩阵。根据对局部模式的不同数学假设,MLLRR 问题可以分为局部常变(LCV)和局部线性低秩(LLR)两个子问题。MLLRR 上的现有解决方案只关注 LCV 问题,而这个问题忽略了大量真实而有趣的模式。在这项工作中,我们开发了一个新的矩阵计算框架称为 RPSP (随机探测为基础的子矩阵传播) ,提供了一个有效的解决方案,这两个 LCV 和 LLR 问题。RPSP 检测由低秩性质的小子矩阵生成的局部低秩模式,这些小子矩阵由随机投影方法确定。RPSP 得到了随机投影理论的支持。对合成数据的实验表明,RPSP 算法优于所有的最新方法,在 LCV 和 LLR 设置下都具有鲁棒性和正确识别低秩矩阵的能力。在实际数据集上,RPSP 也证明了其识别可解释的局部低秩矩阵的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Generalized+Matrix+Local+Low+Rank+Representation+by+Random+Projection+and+Submatrix+Propagation)|0| +|[Generalized Matrix Local Low Rank Representation by Random Projection and Submatrix Propagation](https://doi.org/10.1145/3580305.3599361)|Pengtao Dang, Haiqi Zhu, Tingbo Guo, Changlin Wan, Tong Zhao, Paul Salama, Yijie Wang, Sha Cao, Chi Zhang|; Purdue University; Indiana University, Bloomington; Amazon; Indiana University; Genentech; Indiana University, School of Medicine|Detecting distinct submatrices of low rank property is a highly desirable matrix representation learning technique for the ease of data interpretation, called the matrix local low rank representation (MLLRR). Based on different mathematical assumptions of the local pattern, the MLLRR problem could be categorized into two sub-problems, namely local constant variation (LCV) and local linear low rank (LLR). Existing solutions on MLLRR only focused on the LCV problem, which misses a substantial amount of true and interesting patterns. In this work, we develop a novel matrix computational framework called RPSP (Random Probing based submatrix Propagation) that provides an effective solution for both of the LCV and LLR problems. RPSP detects local low rank patterns that grow from small submatrices of low rank property, which are determined by a random projection approach. RPSP is supported by theories of random projection. Experiments on synthetic data demonstrate that RPSP outperforms all state-of-the-art methods, with the capacity to robustly and correctly identify the low rank matrices under both LCV and LLR settings. 
On real-world datasets, RPSP also demonstrates its effectiveness in identifying interpretable local low rank matrices.|矩阵局部低秩表示(MLLRR)是一种非常理想的矩阵表示学习技术,它可以检测出具有低秩性质的不同子矩阵。根据对局部模式的不同数学假设,MLLRR 问题可以分为局部常变(LCV)和局部线性低秩(LLR)两个子问题。MLLRR 上的现有解决方案只关注 LCV 问题,而这个问题忽略了大量真实而有趣的模式。在这项工作中,我们开发了一个新的矩阵计算框架称为 RPSP (随机探测为基础的子矩阵传播) ,提供了一个有效的解决方案,这两个 LCV 和 LLR 问题。RPSP 检测由低秩性质的小子矩阵生成的局部低秩模式,这些小子矩阵由随机投影方法确定。RPSP 得到了随机投影理论的支持。对合成数据的实验表明,RPSP 算法优于所有的最新方法,在 LCV 和 LLR 设置下都具有鲁棒性和正确识别低秩矩阵的能力。在实际数据集上,RPSP 也证明了其识别可解释的局部低秩矩阵的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Generalized+Matrix+Local+Low+Rank+Representation+by+Random+Projection+and+Submatrix+Propagation)|0| |[TWIN: Personalized Clinical Trial Digital Twin Generation](https://doi.org/10.1145/3580305.3599534)|Trisha Das, Zifeng Wang, Jimeng Sun||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=TWIN:+Personalized+Clinical+Trial+Digital+Twin+Generation)|0| |[Accelerating Dynamic Network Embedding with Billions of Parameter Updates to Milliseconds](https://doi.org/10.1145/3580305.3599250)|Haoran Deng, Yang Yang, Jiahe Li, Haoyang Cai, Shiliang Pu, Weihao Jiang|Carnegie Mellon University; Zhejiang University; Hikvision Research Institute|Network embedding, a graph representation learning method illustrating network topology by mapping nodes into lower-dimension vectors, struggles to accommodate the ever-changing dynamic graphs encountered in practice. Existing research is mainly based on node-by-node embedding modifications, which face a dilemma between computational efficiency and accuracy. Observing that the embedding dimensions are usually much smaller than the number of nodes, we break this dilemma with a novel dynamic network embedding paradigm that rotates and scales the axes of the embedding space instead of performing a node-by-node update. Specifically, we propose the Dynamic Adjacency Matrix Factorization (DAMF) algorithm, which achieves an efficient and accurate dynamic network embedding by rotating and scaling the coordinate system where the network embedding resides, with no more changes to node embeddings than the number of edge modifications. Moreover, a dynamic Personalized PageRank is applied to the obtained network embeddings to enhance node embeddings and capture higher-order neighbor information dynamically. Experiments on node classification, link prediction, and graph reconstruction on different-sized dynamic graphs suggest that DAMF advances dynamic network embedding. Further, we expand dynamic network embedding experiments to billion-edge graphs for the first time, where DAMF updates billion-level parameters in less than 10ms.|网络嵌入是一种通过将节点映射为低维向量来表示网络拓扑的图形表示学习方法,在实际应用中很难适应不断变化的动态图形。现有的研究主要是基于逐个节点的嵌入修改,这种方法陷入了计算效率和精度的两难境地。针对嵌入维数通常远小于节点数的问题,提出了一种新的动态网络嵌入方法,该方法不需要逐个节点更新,而是通过对嵌入空间的轴线进行旋转和缩放来解决这一问题。具体来说,我们提出了动态邻接矩阵分解(DAMF)算法,该算法通过旋转和缩放网络嵌入所在的坐标系,在不超过节点嵌入的边修改变化量的情况下,实现了一个高效、准确的动态网络嵌入。此外,将动态个性化 PageRank 应用于所获得的网络嵌入,以增强节点的嵌入,并动态捕获高阶邻居信息。对不同大小的动态图进行节点分类、链路预测和图重构的实验表明,DAMF 推进了动态网络嵌入。进一步,我们前所未有地将动态网络嵌入实验扩展到十亿边图,其中 DAMF 在不到10ms 的时间内更新十亿级参数。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Accelerating+Dynamic+Network+Embedding+with+Billions+of+Parameter+Updates+to+Milliseconds)|0| |[MetricPrompt: Prompting Model as a Relevance Metric for Few-shot Text Classification](https://doi.org/10.1145/3580305.3599430)|Hongyuan Dong, Weinan Zhang, Wanxiang Che|Harbin Institute of Technology|Prompting methods have shown impressive performance in a variety of text mining tasks and applications, especially few-shot ones.
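
For the DAMF entry above: the (static) Personalized PageRank that it applies on top of the embeddings is a simple fixed-point iteration; a minimal dense-matrix sketch of the standard computation (our illustration, not the paper's dynamic variant):

```python
import numpy as np

def personalized_pagerank(A, s, alpha=0.15, iters=100):
    # Fixed-point iteration for pi = alpha * s + (1 - alpha) * P^T @ pi,
    # where P is the row-stochastic transition matrix of adjacency A and
    # s is the personalization (restart) distribution over nodes.
    A = np.asarray(A, dtype=float)
    s = np.asarray(s, dtype=float)
    P = A / np.maximum(A.sum(axis=1, keepdims=True), 1e-12)
    pi = s.copy()
    for _ in range(iters):
        pi = alpha * s + (1 - alpha) * (P.T @ pi)
    return pi

# Restart from node 0 of a 3-node path graph 0-1-2.
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
print(personalized_pagerank(A, np.array([1.0, 0.0, 0.0])))
```
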
Despite the promising prospects, the performance of a prompting model largely depends on the design of the prompt template and verbalizer. In this work, we propose MetricPrompt, which eases verbalizer design difficulty by reformulating the few-shot text classification task into a text-pair relevance estimation task. MetricPrompt adopts the prompting model as the relevance metric, further bridging the gap between the Pre-trained Language Model's (PLM) pre-training objective and the text classification task, enabling the PLM's smooth adaptation. Taking a training sample and a query sample simultaneously, MetricPrompt captures cross-sample relevance information for accurate relevance estimation. We conduct experiments on three widely used text classification datasets across four few-shot settings. Results show that MetricPrompt outperforms manual verbalizers and other automatic verbalizer design methods across all few-shot settings, achieving new state-of-the-art (SOTA) performance.|提示方法已经在各种文本挖掘任务和应用程序中显示出了令人印象深刻的性能,特别是那些很少使用的方法。尽管激励模式前景广阔,但其性能在很大程度上取决于激励模板和语言表达器的设计。在这项工作中,我们提出了 MetricPrompt,它通过将少镜头文本分类任务重构为文本对相关性估计任务,从而减轻了语言表达器的设计难度。MetricPrompt 采用提示模型作为相关度量,进一步缩小了预训练语言模型(Pre-training Language Model,PLM)的预训练目标与文本分类任务之间的差距,使得 PLM 的顺利适应成为可能。同时采用训练样本和查询样本,MetricPrompt 捕获跨样本的相关性信息以进行准确的相关性估计。我们在三个广泛使用的文本分类数据集上通过四个少镜头设置进行实验。结果表明,MetricPrompt 在所有短镜头设置中都优于手动语音表达器和其他自动语音表达器设计方法,实现了新的最新(SOTA)性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=MetricPrompt:+Prompting+Model+as+a+Relevance+Metric+for+Few-shot+Text+Classification)|0| |[Delving into Global Dialogue Structures: Structure Planning Augmented Response Selection for Multi-turn Conversations](https://doi.org/10.1145/3580305.3599304)|Tingchen Fu, Xueliang Zhao, Rui Yan||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Delving+into+Global+Dialogue+Structures:+Structure+Planning+Augmented+Response+Selection+for+Multi-turn+Conversations)|0| -|[Partial-label Learning with Mixed Closed-set and Open-set Out-of-candidate Examples](https://doi.org/10.1145/3580305.3599460)|Shuo He, Lei Feng, Guowu Yang|University of Electronic Science and Technology of China; Nanyang Technological University|Partial-label learning (PLL) relies on a key assumption that the true label of each training example must be in the candidate label set. This restrictive assumption may be violated in complex real-world scenarios, and thus the true label of some collected examples could be unexpectedly outside the assigned candidate label set. In this paper, we term the examples whose true label is outside the candidate label set OOC (out-of-candidate) examples, and pioneer a new PLL study to learn with OOC examples. We consider two types of OOC examples in reality, i.e., the closed-set/open-set OOC examples whose true label is inside/outside the known label space. To solve this new PLL problem, we first calculate the wooden cross-entropy loss from candidate and non-candidate labels respectively, and dynamically differentiate the two types of OOC examples based on specially designed criteria. Then, for closed-set OOC examples, we conduct reversed label disambiguation in the non-candidate label set; for open-set OOC examples, we leverage them for training by utilizing an effective regularization strategy that dynamically assigns random candidate labels from the candidate label set. In this way, the two types of OOC examples can be differentiated and further leveraged for model training.
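
To make the MetricPrompt entry above concrete: once a prompting model supplies a pairwise relevance score, few-shot classification reduces to scoring (training sample, query) pairs and aggregating scores per label. A schematic sketch under that assumption, with `relevance` standing in for the prompting model (the names are ours, not the paper's code):

```python
import numpy as np

def classify(relevance, query, support):
    # `support` is a list of (text, label) pairs; the query receives the
    # label whose support samples are, on average, most relevant to it.
    labels = sorted({label for _, label in support})
    mean_score = {
        label: np.mean([relevance(text, query) for text, l in support if l == label])
        for label in labels
    }
    return max(mean_score, key=mean_score.get)

# Toy stand-in relevance: token overlap (a real system would prompt a PLM).
relevance = lambda a, b: len(set(a.split()) & set(b.split()))
print(classify(relevance, "great plot", [("great movie", "pos"), ("boring movie", "neg")]))
```
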
Extensive experiments demonstrate that our proposed method outperforms state-of-the-art PLL methods.|部分标签学习(PLL)依赖于一个关键的假设,即每个训练样本的真实标签必须在候选标签集中。在复杂的现实场景中,这种限制性假设可能会被违反,因此,一些收集的示例的真实标签可能意外地位于分配的候选标签集之外。在本文中,我们将真实标签在候选标签集外的例子称为候选标签集外的例子,并且开创了一种新的 PLL 研究方法来学习候选标签集外的例子。我们在实际中考虑两种类型的 OOC 示例,即闭集/开集 OOC 示例,它们的真实标签位于已知标签空间的内部或外部。为了解决这个新的锁相环问题,我们首先分别计算候选标签和非候选标签的木质交叉熵损失,并根据特定的准则动态区分两种类型的 OOC 实例。然后,对于闭集 OOC 例子,我们在非候选标签集中进行反向标签消歧; 对于开集 OOC 例子,我们利用它们进行训练,利用一种有效的正则化策略,从候选标签集中动态分配随机候选标签。通过这种方式,两种类型的 OOC 示例可以区分并进一步用于模型培训。大量的实验表明,我们提出的方法优于最先进的锁相环方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Partial-label+Learning+with+Mixed+Closed-set+and+Open-set+Out-of-candidate+Examples)|0| +|[Partial-label Learning with Mixed Closed-set and Open-set Out-of-candidate Examples](https://doi.org/10.1145/3580305.3599460)|Shuo He, Lei Feng, Guowu Yang|Nanyang Technological University; University of Electronic Science and Technology of China|Partial-label learning (PLL) relies on a key assumption that the true label of each training example must be in the candidate label set. This restrictive assumption may be violated in complex real-world scenarios, and thus the true label of some collected examples could be unexpectedly outside the assigned candidate label set. In this paper, we term the examples whose true label is outside the candidate label set OOC (out-of-candidate) examples, and pioneer a new PLL study to learn with OOC examples. We consider two types of OOC examples in reality, i.e., the closed-set/open-set OOC examples whose true label is inside/outside the known label space. To solve this new PLL problem, we first calculate the wooden cross-entropy loss from candidate and non-candidate labels respectively, and dynamically differentiate the two types of OOC examples based on specially designed criteria. Then, for closed-set OOC examples, we conduct reversed label disambiguation in the non-candidate label set; for open-set OOC examples, we leverage them for training by utilizing an effective regularization strategy that dynamically assigns random candidate labels from the candidate label set. In this way, the two types of OOC examples can be differentiated and further leveraged for model training. Extensive experiments demonstrate that our proposed method outperforms state-of-the-art PLL methods.|部分标签学习(PLL)依赖于一个关键的假设,即每个训练样本的真实标签必须在候选标签集中。在复杂的现实场景中,这种限制性假设可能会被违反,因此,一些收集的示例的真实标签可能意外地位于分配的候选标签集之外。在本文中,我们将真实标签在候选标签集外的例子称为候选标签集外的例子,并且开创了一种新的 PLL 研究方法来学习候选标签集外的例子。我们在实际中考虑两种类型的 OOC 示例,即闭集/开集 OOC 示例,它们的真实标签位于已知标签空间的内部或外部。为了解决这个新的锁相环问题,我们首先分别计算候选标签和非候选标签的木质交叉熵损失,并根据特定的准则动态区分两种类型的 OOC 实例。然后,对于闭集 OOC 例子,我们在非候选标签集中进行反向标签消歧; 对于开集 OOC 例子,我们利用它们进行训练,利用一种有效的正则化策略,从候选标签集中动态分配随机候选标签。通过这种方式,两种类型的 OOC 示例可以区分并进一步用于模型培训。大量的实验表明,我们提出的方法优于最先进的锁相环方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Partial-label+Learning+with+Mixed+Closed-set+and+Open-set+Out-of-candidate+Examples)|0| |[COMET: Learning Cardinality Constrained Mixture of Experts with Trees and Local Search](https://doi.org/10.1145/3580305.3599278)|Shibal Ibrahim, Wenyu Chen, Hussein Hazimeh, Natalia Ponomareva, Zhe Zhao, Rahul Mazumder|Google DeepMind; Google Research; Massachusetts Institute of Technology|The sparse Mixture-of-Experts (Sparse-MoE) framework efficiently scales up model capacity in various domains, such as natural language processing and vision. Sparse-MoEs select a subset of the "experts" (thus, only a portion of the overall network) for each input sample using a sparse, trainable gate. 
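
A minimal illustration of the sparse, trainable gate just described: the standard Top-k softmax gate that COMET is benchmarked against (not COMET's tree-based mechanism):

```python
import numpy as np

def topk_gate(logits, k=2):
    # Route each sample to the k highest-scoring experts and renormalize
    # a softmax over the selected logits, so the sparse weights sum to one.
    logits = np.asarray(logits, dtype=float)
    top = np.argsort(logits)[-k:]
    w = np.exp(logits[top] - logits[top].max())
    gates = np.zeros_like(logits)
    gates[top] = w / w.sum()
    return gates

print(topk_gate([0.1, 2.0, -1.0, 1.5], k=2))  # only experts 1 and 3 are active
```
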
Existing sparse gates are prone to convergence and performance issues when training with first-order optimization methods. In this paper, we introduce two improvements to current MoE approaches. First, we propose a new sparse gate: COMET, which relies on a novel tree-based mechanism. COMET is differentiable, can exploit sparsity to speed up computation, and outperforms state-of-the-art gates. Second, due to the challenging combinatorial nature of sparse expert selection, first-order methods are typically prone to low-quality solutions. To deal with this challenge, we propose a novel, permutation-based local search method that can complement first-order methods in training any sparse gate, e.g., Hash routing, Top-k, DSelect-k, and COMET. We show that local search can help networks escape bad initializations or solutions. We performed large-scale experiments on various domains, including recommender systems, vision, and natural language processing. On standard vision and recommender systems benchmarks, COMET+ (COMET with local search) achieves up to 13% improvement in ROC AUC over popular gates, e.g., Hash routing and Top-k, and up to 9% over prior differentiable gates e.g., DSelect-k. When Top-k and Hash gates are combined with local search, we see up to $100\times$ reduction in the budget needed for hyperparameter tuning. Moreover, for language modeling, our approach improves over the state-of-the-art MoEBERT model for distilling BERT on 5/7 GLUE benchmarks as well as the SQuAD dataset.|稀疏混合专家(Sparse-MoE)框架有效地扩展了各种领域的模型容量,例如自然语言处理和视觉。稀疏-MoEs 使用稀疏的、可训练的门为每个输入样本选择一个“专家”子集(因此,只是整个网络的一部分)。现有的稀疏门在用一阶优化方法进行训练时容易出现收敛和性能问题。在本文中,我们对现有的 MoE 方法提出了两项改进。首先,我们提出了一种新的稀疏门: COMET,它依赖于一种新的基于树的机制。COMET 是可微的,可以利用稀疏性来加速计算,并且性能优于最先进的门。其次,由于稀疏专家选择具有挑战性的组合性质,一阶方法通常倾向于低质量的解决方案。为了应对这一挑战,我们提出了一种新颖的基于置换的局部搜索方法,可以补充一阶方法训练任何稀疏门,例如,散列路由,Top-k,DSelect-k 和 COMET。我们展示了本地搜索可以帮助网络逃避糟糕的初始化或解决方案。我们在不同的领域进行了大规模的实验,包括推荐系统、视觉和自然语言处理。在标准愿景和推荐系统基准上,COMET + (本地搜索的 COMET)在 ROC AUC 比流行的门(如散列路由和 Top-k)提高了13% ,比以前的可微分门(如 DSelect-k)提高了9% 。当 Top-k 和 Hash 门与本地搜索相结合时,我们看到超参数调优所需的预算减少了100倍。此外,对于语言建模,我们的方法改进了最先进的 MoEBERT 模型,用于提取5/7 GLUE 基准测试和 SQuAD 数据集上的 BERT。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=COMET:+Learning+Cardinality+Constrained+Mixture+of+Experts+with+Trees+and+Local+Search)|0| |[Exploiting Relation-aware Attribute Representation Learning in Knowledge Graph Embedding for Numerical Reasoning](https://doi.org/10.1145/3580305.3599338)|Gayeong Kim, Sookyung Kim, Ko Keun Kim, Suchan Park, Heesoo Jung, Hogun Park||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Exploiting+Relation-aware+Attribute+Representation+Learning+in+Knowledge+Graph+Embedding+for+Numerical+Reasoning)|0| |[Efficient Distributed Approximate k-Nearest Neighbor Graph Construction by Multiway Random Division Forest](https://doi.org/10.1145/3580305.3599327)|SangHong Kim, HaMyung Park||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Efficient+Distributed+Approximate+k-Nearest+Neighbor+Graph+Construction+by+Multiway+Random+Division+Forest)|0| -|[MM-DAG: Multi-task DAG Learning for Multi-modal Data - with Application for Traffic Congestion Analysis](https://doi.org/10.1145/3580305.3599436)|Tian Lan, Ziyue Li, Zhishuai Li, Lei Bai, Man Li, Fugee Tsung, Wolfgang Ketter, Rui Zhao, Chen Zhang|The Hong Kong University of Science and Technology; Tsinghua University; Shanghai AI Laboratory; SenseTime Research; The Hong Kong University of Science and Technology (Guangzhou); University of Cologne|This paper proposes to learn Multi-task,
Multi-modal Directed Acyclic Graphs (MM-DAGs), which are commonly observed in complex systems, e.g., traffic, manufacturing, and weather systems, whose variables are multi-modal with scalars, vectors, and functions. This paper takes the traffic congestion analysis as a concrete case, where a traffic intersection is usually regarded as a DAG. In a road network of multiple intersections, different intersections can only have some overlapping and distinct variables observed. For example, a signalized intersection has traffic light-related variables, whereas unsignalized ones do not. This encourages the multi-task design: with each DAG as a task, the MM-DAG tries to learn the multiple DAGs jointly so that their consensus and consistency are maximized. To this end, we innovatively propose a multi-modal regression for linear causal relationship description of different variables. Then we develop a novel Causality Difference (CD) measure and its differentiable approximator. Compared with existing SOTA measures, CD can penalize the causal structural difference among DAGs with distinct nodes and can better consider the uncertainty of causal orders. We rigorously prove our design's topological interpretation and consistency properties. We conduct thorough simulations and one case study to show the effectiveness of our MM-DAG. The code is available under https://github.com/Lantian72/MM-DAG|本文提出学习多任务、多模态有向无环图(MM-DAGs) ,这是在交通、制造、天气等复杂系统中常见的图形,其变量是多模态的,包括标量、向量和函数。本文以交通堵塞分析作为一个具体案例,其中交通十字路口通常被视为一个 DAG。在一个多交叉口的道路网络中,不同的交叉口只能观察到一些重叠的、不同的变量。例如,信号交叉口有与交通灯相关的变量,而无信号交叉口没有。这鼓励了多任务设计: 将每个 DAG 作为一个任务,MM-DAG 试图联合学习多个 DAG,以便最大化它们的共识和一致性。为此,我们创新性地提出了一种多模态回归方法来描述不同变量之间的线性因果关系。然后我们发展了一个新的因果差分(CD)测度及其可微逼近器。与现有的 SOTA 方法相比,CD 方法能够更好地考虑因果顺序的不确定性,并且能够惩罚具有不同节点的 DAGs 之间的因果结构差异。我们严格证明了我们的设计的拓扑解释和一致性性质。我们进行了彻底的模拟和一个案例研究,以显示我们的 MM-DAG 的有效性。代码可在 https://github.com/lantian72/mm-dag 下查阅|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=MM-DAG:+Multi-task+DAG+Learning+for+Multi-modal+Data+-+with+Application+for+Traffic+Congestion+Analysis)|0| +|[MM-DAG: Multi-task DAG Learning for Multi-modal Data - with Application for Traffic Congestion Analysis](https://doi.org/10.1145/3580305.3599436)|Tian Lan, Ziyue Li, Zhishuai Li, Lei Bai, Man Li, Fugee Tsung, Wolfgang Ketter, Rui Zhao, Chen Zhang|University of Cologne; The Hong Kong University of Science and Technology; Tsinghua University; The Hong Kong University of Science and Technology (Guangzhou); Shanghai AI Laboratory; SenseTime Research|This paper proposes to learn Multi-task, Multi-modal Directed Acyclic Graphs (MM-DAGs), which are commonly observed in complex systems, e.g., traffic, manufacturing, and weather systems, whose variables are multi-modal with scalars, vectors, and functions. This paper takes the traffic congestion analysis as a concrete case, where a traffic intersection is usually regarded as a DAG. In a road network of multiple intersections, different intersections can only have some overlapping and distinct variables observed. For example, a signalized intersection has traffic light-related variables, whereas unsignalized ones do not. This encourages the multi-task design: with each DAG as a task, the MM-DAG tries to learn the multiple DAGs jointly so that their consensus and consistency are maximized. To this end, we innovatively propose a multi-modal regression for linear causal relationship description of different variables. Then we develop a novel Causality Difference (CD) measure and its differentiable approximator.
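
The CD measure and its approximator are the paper's own contribution, but the differentiable characterization of DAG-ness that continuous structure-learning methods build on is compact enough to state; a sketch of the standard NOTEARS acyclicity function (an aside for intuition, not the authors' CD measure):

```python
import numpy as np
from scipy.linalg import expm

def notears_acyclicity(W):
    # h(W) = tr(exp(W * W)) - d is zero iff the weighted adjacency W has
    # no directed cycles; being differentiable, it can serve as a
    # constraint when learning DAG structure by continuous optimization.
    W = np.asarray(W, dtype=float)
    return float(np.trace(expm(W * W)) - W.shape[0])

print(notears_acyclicity(np.array([[0.0, 0.8], [0.0, 0.0]])))  # ~0: a DAG
print(notears_acyclicity(np.array([[0.0, 0.8], [0.5, 0.0]])))  # > 0: a cycle
```
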
Compared with existing SOTA measures, CD can penalize the causal structural difference among DAGs with distinct nodes and can better consider the uncertainty of causal orders. We rigorously prove our design's topological interpretation and consistency properties. We conduct thorough simulations and one case study to show the effectiveness of our MM-DAG. The code is available under https://github.com/Lantian72/MM-DAG|本文提出学习多任务、多模态有向无环图(MM-DAGs) ,这是在交通、制造、天气等复杂系统中常见的图形,其变量是多模态的,包括标量、向量和函数。本文以交通堵塞分析作为一个具体案例,其中交通十字路口通常被视为一个 DAG。在一个多交叉口的道路网络中,不同的交叉口只能观察到一些重叠的、不同的变量。例如,信号交叉口有与交通灯相关的变量,而无信号交叉口没有。这鼓励了多任务设计: 将每个 DAG 作为一个任务,MM-DAG 试图联合学习多个 DAG,以便最大化它们的共识和一致性。为此,我们创新性地提出了一种多模态回归方法来描述不同变量之间的线性因果关系。然后我们发展了一个新的因果差分(CD)测度及其可微逼近器。与现有的 SOTA 方法相比,CD 方法能够更好地考虑因果顺序的不确定性,并且能够惩罚具有不同节点的 DAGs 之间的因果结构差异。我们严格证明了我们的设计的拓扑解释和一致性性质。我们进行了彻底的模拟和一个案例研究,以显示我们的 MM-DAG 的有效性。代码可在 https://github.com/lantian72/mm-dag 下查阅|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=MM-DAG:+Multi-task+DAG+Learning+for+Multi-modal+Data+-+with+Application+for+Traffic+Congestion+Analysis)|0| |[Who Should Be Given Incentives? Counterfactual Optimal Treatment Regimes Learning for Recommendation](https://doi.org/10.1145/3580305.3599550)|Haoxuan Li, Chunyuan Zheng, Peng Wu, Kun Kuang, Yue Liu, Peng Cui||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Who+Should+Be+Given+Incentives?+Counterfactual+Optimal+Treatment+Regimes+Learning+for+Recommendation)|0| |[UCEpic: Unifying Aspect Planning and Lexical Constraints for Generating Explanations in Recommendation](https://doi.org/10.1145/3580305.3599535)|Jiacheng Li, Zhankui He, Jingbo Shang, Julian J. McAuley||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=UCEpic:+Unifying+Aspect+Planning+and+Lexical+Constraints+for+Generating+Explanations+in+Recommendation)|0| |[Learning-Based Ad Auction Design with Externalities: The Framework and A Matching-Based Approach](https://doi.org/10.1145/3580305.3599403)|Ningyuan Li, Yunxuan Ma, Yang Zhao, Zhijian Duan, Yurong Chen, Zhilin Zhang, Jian Xu, Bo Zheng, Xiaotie Deng||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning-Based+Ad+Auction+Design+with+Externalities:+The+Framework+and+A+Matching-Based+Approach)|0| -|[Communication Efficient Distributed Newton Method with Fast Convergence Rates](https://doi.org/10.1145/3580305.3599280)|Chengchang Liu, Lesi Chen, Luo Luo, John C. S. Lui|Fudan University; The Chinese University of Hong Kong; Chinese University of Hong Kong|We propose a communication and computation efficient second-order method for distributed optimization. For each iteration, our method only requires $\mathcal{O}(d)$ communication complexity, where $d$ is the problem dimension. We also provide theoretical analysis to show the proposed method has a similar convergence rate to the classical second-order optimization algorithms. Concretely, our method can find~$\big(\epsilon, \sqrt{dL\epsilon}\,\big)$-second-order stationary points for nonconvex problems by $\mathcal{O}\big(\sqrt{dL}\,\epsilon^{-3/2}\big)$ iterations, where $L$ is the Lipschitz constant of the Hessian. Moreover, it enjoys local superlinear convergence under the strongly-convex assumption.
Experiments on both convex and nonconvex problems show that our proposed method performs significantly better than baselines.|提出了一种分布式优化的通信和计算有效的二阶方法。对于每个迭代,我们的方法只需要 $\mathcal{O}(d)$ 的通信复杂性,其中 $d$ 是问题维度。理论分析表明,该方法与经典的二阶优化算法具有相似的收敛速度。具体地说,我们的方法可以通过 $\mathcal{O}\big(\sqrt{dL}\,\epsilon^{-3/2}\big)$ 次迭代找到非凸问题的 $\big(\epsilon, \sqrt{dL\epsilon}\,\big)$-二阶驻点,其中 $L$ 是 Hessian 的 Lipschitz 常数。在强凸假设下,该算法具有局部超线性收敛性。对凸问题和非凸问题的实验表明,该方法的性能明显优于基线方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Communication+Efficient+Distributed+Newton+Method+with+Fast+Convergence+Rates)|0| +|[Communication Efficient Distributed Newton Method with Fast Convergence Rates](https://doi.org/10.1145/3580305.3599280)|Chengchang Liu, Lesi Chen, Luo Luo, John C. S. Lui|The Chinese University of Hong Kong; Chinese University of Hong Kong; Fudan University|We propose a communication and computation efficient second-order method for distributed optimization. For each iteration, our method only requires $\mathcal{O}(d)$ communication complexity, where $d$ is the problem dimension. We also provide theoretical analysis to show the proposed method has a similar convergence rate to the classical second-order optimization algorithms. Concretely, our method can find~$\big(\epsilon, \sqrt{dL\epsilon}\,\big)$-second-order stationary points for nonconvex problems by $\mathcal{O}\big(\sqrt{dL}\,\epsilon^{-3/2}\big)$ iterations, where $L$ is the Lipschitz constant of the Hessian. Moreover, it enjoys local superlinear convergence under the strongly-convex assumption. Experiments on both convex and nonconvex problems show that our proposed method performs significantly better than baselines.|提出了一种分布式优化的通信和计算有效的二阶方法。对于每个迭代,我们的方法只需要 $\mathcal{O}(d)$ 的通信复杂性,其中 $d$ 是问题维度。理论分析表明,该方法与经典的二阶优化算法具有相似的收敛速度。具体地说,我们的方法可以通过 $\mathcal{O}\big(\sqrt{dL}\,\epsilon^{-3/2}\big)$ 次迭代找到非凸问题的 $\big(\epsilon, \sqrt{dL\epsilon}\,\big)$-二阶驻点,其中 $L$ 是 Hessian 的 Lipschitz 常数。在强凸假设下,该算法具有局部超线性收敛性。对凸问题和非凸问题的实验表明,该方法的性能明显优于基线方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Communication+Efficient+Distributed+Newton+Method+with+Fast+Convergence+Rates)|0| |[Meta Multi-agent Exercise Recommendation: A Game Application Perspective](https://doi.org/10.1145/3580305.3599429)|Fei Liu, Xuegang Hu, Shuochen Liu, Chenyang Bu, Le Wu||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Meta+Multi-agent+Exercise+Recommendation:+A+Game+Application+Perspective)|0| |[Criteria Tell You More than Ratings: Criteria Preference-Aware Light Graph Convolution for Effective Multi-Criteria Recommendation](https://doi.org/10.1145/3580305.3599292)|JinDuk Park, Siqing Li, Xin Cao, WonYong Shin|The University of New South Wales; Yonsei University|The multi-criteria (MC) recommender system, which leverages MC rating information in a wide range of e-commerce areas, is ubiquitous nowadays. Surprisingly, although graph neural networks (GNNs) have been widely applied to develop various recommender systems due to GNN's high expressive capability in learning graph representations, it remains unexplored how to design MC recommender systems with GNNs. In light of this, we make the first attempt towards designing a GNN-aided MC recommender system. Specifically, rather than straightforwardly adopting existing GNN-based recommendation methods, we devise a novel criteria preference-aware light graph convolution (CPA-LGC) method, which is capable of precisely capturing the criteria preference of users as well as the collaborative signal in complex high-order connectivities.
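
For orientation on the CPA-LGC entry above: the underlying "light graph convolution" drops feature transforms and nonlinearities entirely; a minimal dense sketch of standard LightGCN-style propagation (CPA-LGC's criteria-aware embeddings are an addition not shown here):

```python
import numpy as np

def light_graph_convolution(A, E, num_layers=3):
    # Symmetric-normalized neighborhood averaging with no weight matrices
    # and no activation; the final embedding is the mean over all layers.
    d = np.maximum(A.sum(axis=1), 1e-12)
    A_hat = A / np.sqrt(np.outer(d, d))  # D^{-1/2} A D^{-1/2}
    layers = [E]
    for _ in range(num_layers):
        layers.append(A_hat @ layers[-1])
    return np.mean(layers, axis=0)
```
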
To this end, we first construct an MC expansion graph that transforms user--item MC ratings into an expanded bipartite graph to potentially learn from the collaborative signal in MC ratings. Next, to strengthen the capability of criteria preference awareness, CPA-LGC incorporates newly characterized embeddings, including user-specific criteria-preference embeddings and item-specific criterion embeddings, into our graph convolution model. Through comprehensive evaluations using four real-world datasets, we demonstrate (a) the superiority over benchmark MC recommendation methods and benchmark recommendation methods using GNNs with tremendous gains, (b) the effectiveness of core components in CPA-LGC, and (c) the computational efficiency.|多准则推荐系统在电子商贸领域广泛应用,充分利用多准则评级信息。令人惊讶的是,尽管图神经网络(GNN)由于其在学习图表示方面的高度表达能力而被广泛应用于开发各种推荐系统,但是如何利用 GNN 设计 MC 推荐系统仍然是一个未知数。有鉴于此,我们首次尝试设计一个 GNN 辅助的 MC 推荐系统。具体而言,我们不直接采用现有的基于 GNN 的推荐方法,而是设计了一种新的标准偏好感知光图卷积 CPA-LGC 方法,该方法能够精确地捕获用户的标准偏好以及复杂高阶连接中的协作信号。为此,我们首先构建一个 MC 扩展图,将用户-项目 MC 评分转换为一个扩展的二分图,以便潜在地学习 MC 评分中的协作信号。接下来,为了加强标准偏好意识的能力,CPA-LGC 将新的特征嵌入,包括用户特定的标准偏好嵌入和项目特定的标准嵌入,纳入我们的图卷积模型。通过使用四个实际数据集的综合评估,我们证明了(a)使用 GNN 的基准 MC 推荐方法和基准推荐方法的优越性,(b) CPA-LGC 中核心组件的有效性,以及(c)计算效率。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Criteria+Tell+You+More+than+Ratings:+Criteria+Preference-Aware+Light+Graph+Convolution+for+Effective+Multi-Criteria+Recommendation)|0| |[Locality Sensitive Hashing for Optimizing Subgraph Query Processing in Parallel Computing Systems](https://doi.org/10.1145/3580305.3599419)|Peng Peng, Shengyi Ji, Zhen Tian, Hongbo Jiang, Weiguo Zheng, Xuecang Zhang||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Locality+Sensitive+Hashing+for+Optimizing+Subgraph+Query+Processing+in+Parallel+Computing+Systems)|0| |[Deep Pipeline Embeddings for AutoML](https://doi.org/10.1145/3580305.3599303)|Sebastian PinedaArango, Josif Grabocka|University of Freiburg|Automated Machine Learning (AutoML) is a promising direction for democratizing AI by automatically deploying Machine Learning systems with minimal human expertise. The core technical challenge behind AutoML is optimizing the pipelines of Machine Learning systems (e.g. the choice of preprocessing, augmentations, models, optimizers, etc.). Existing Pipeline Optimization techniques fail to explore deep interactions between pipeline stages/components. As a remedy, this paper proposes a novel neural architecture that captures the deep interaction between the components of a Machine Learning pipeline. We propose embedding pipelines into a latent representation through a novel per-component encoder mechanism. To search for optimal pipelines, such pipeline embeddings are used within deep-kernel Gaussian Process surrogates inside a Bayesian Optimization setup. Furthermore, we meta-learn the parameters of the pipeline embedding network using existing evaluations of pipelines on diverse collections of related datasets (a.k.a. meta-datasets). 
Through extensive experiments on three large-scale meta-datasets, we demonstrate that pipeline embeddings yield state-of-the-art results in Pipeline Optimization.|自动机器学习(AutoML)是通过自动部署具有最少人类专业知识的机器学习系统来实现人工智能大众化的一个有前途的方向。AutoML 背后的核心技术挑战是优化机器学习系统的管道(例如,预处理、扩展、模型、优化器等的选择)。现有的流水线优化技术无法探索流水线阶段/组件之间的深层交互。作为补救措施,本文提出了一种新颖的神经网络结构,该结构能够捕捉机器学习流水线各组件之间的深层交互。我们提出了一种新的每组件编码机制,将管道嵌入到潜在表示中。为了寻找最佳管道,这种管道嵌入在贝叶斯优化设置内的深核高斯过程代理中使用。此外,我们使用现有的对不同相关数据集(也称为元数据集)上的管道的评估来元学习管道嵌入网络的参数。通过在三个大规模元数据集上的大量实验,我们证明了流水线嵌入在流水线优化中产生了最先进的结果。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Deep+Pipeline+Embeddings+for+AutoML)|0| |[FedAPEN: Personalized Cross-silo Federated Learning with Adaptability to Statistical Heterogeneity](https://doi.org/10.1145/3580305.3599344)|Zhen Qin, Shuiguang Deng, Mingyu Zhao, Xueqiang Yan||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=FedAPEN:+Personalized+Cross-silo+Federated+Learning+with+Adaptability+to+Statistical+Heterogeneity)|0| -|[All in One: Multi-Task Prompting for Graph Neural Networks](https://doi.org/10.1145/3580305.3599256)|Xiangguo Sun, Hong Cheng, Jia Li, Bo Liu, Jihong Guan|The Chinese University of Hong Kong; The Hong Kong University of Science and Technology (Guangzhou); Tongji University; Southeast University|Recently, ''pre-training and fine-tuning'' has been adopted as a standard workflow for many graph tasks since it can take general graph knowledge to relieve the lack of graph annotations from each application. However, graph tasks with node level, edge level, and graph level are far diversified, making the pre-training pretext often incompatible with these multiple tasks. This gap may even cause a ''negative transfer'' to the specific application, leading to poor results. Inspired by the prompt learning in natural language processing (NLP), which has presented significant effectiveness in leveraging prior knowledge for various NLP tasks, we study the prompting topic for graphs with the motivation of filling the gap between pre-trained models and various graph tasks. In this paper, we propose a novel multi-task prompting method for graph models. Specifically, we first unify the format of graph prompts and language prompts with the prompt token, token structure, and inserting pattern. In this way, the prompting idea from NLP can be seamlessly introduced to the graph area. Then, to further narrow the gap between various graph tasks and state-of-the-art pre-training strategies, we further study the task space of various graph applications and reformulate downstream problems to the graph-level task. Afterward, we introduce meta-learning to efficiently learn a better initialization for the multi-task prompt of graphs so that our prompting framework can be more reliable and general for different tasks. 
We conduct extensive experiments, results from which demonstrate the superiority of our method.|近年来,“预训练和微调”已经成为许多图形任务的标准工作流,因为它需要一般的图形知识来解决每个应用程序缺乏图形注释的问题。然而,具有节点级、边级和图级的图形任务种类繁多,使得预训练的借口往往与这些多任务不相容。这种差距甚至可能导致特定应用程序的“负转移”,从而导致较差的结果。自然语言处理中的快速学习在利用先验知识完成各种自然语言处理任务方面表现出了显著的效果,受此启发,我们研究了图形的提示主题,以填补预先训练的模型和各种图形任务之间的空白。本文提出了一种新的图模型多任务提示方法。具体来说,我们首先将图形提示符和语言提示符的格式与提示符标记、标记结构和插入模式统一起来。通过这种方式,可以将自然语言处理中的提示思想无缝地引入到图区域中。然后,为了进一步缩小各种图形任务与最先进的预训练策略之间的差距,我们进一步研究了各种图形应用的任务空间,并将下游问题重新表述为图形级任务。在此基础上,引入元学习,有效地学习图形的多任务提示的初始化,使得提示框架对于不同的任务具有更高的可靠性和通用性。我们进行了广泛的实验,实验结果证明了我们方法的优越性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=All+in+One:+Multi-Task+Prompting+for+Graph+Neural+Networks)|0| +|[All in One: Multi-Task Prompting for Graph Neural Networks](https://doi.org/10.1145/3580305.3599256)|Xiangguo Sun, Hong Cheng, Jia Li, Bo Liu, Jihong Guan|The Hong Kong University of Science and Technology (Guangzhou); The Chinese University of Hong Kong; Southeast University; Tongji University|Recently, ''pre-training and fine-tuning'' has been adopted as a standard workflow for many graph tasks since it can take general graph knowledge to relieve the lack of graph annotations from each application. However, graph tasks with node level, edge level, and graph level are far diversified, making the pre-training pretext often incompatible with these multiple tasks. This gap may even cause a ''negative transfer'' to the specific application, leading to poor results. Inspired by the prompt learning in natural language processing (NLP), which has presented significant effectiveness in leveraging prior knowledge for various NLP tasks, we study the prompting topic for graphs with the motivation of filling the gap between pre-trained models and various graph tasks. In this paper, we propose a novel multi-task prompting method for graph models. Specifically, we first unify the format of graph prompts and language prompts with the prompt token, token structure, and inserting pattern. In this way, the prompting idea from NLP can be seamlessly introduced to the graph area. Then, to further narrow the gap between various graph tasks and state-of-the-art pre-training strategies, we further study the task space of various graph applications and reformulate downstream problems to the graph-level task. Afterward, we introduce meta-learning to efficiently learn a better initialization for the multi-task prompt of graphs so that our prompting framework can be more reliable and general for different tasks. 
We conduct extensive experiments, results from which demonstrate the superiority of our method.|近年来,“预训练和微调”已经成为许多图形任务的标准工作流,因为它需要一般的图形知识来解决每个应用程序缺乏图形注释的问题。然而,具有节点级、边级和图级的图形任务种类繁多,使得预训练的借口往往与这些多任务不相容。这种差距甚至可能导致特定应用程序的“负转移”,从而导致较差的结果。自然语言处理中的快速学习在利用先验知识完成各种自然语言处理任务方面表现出了显著的效果,受此启发,我们研究了图形的提示主题,以填补预先训练的模型和各种图形任务之间的空白。本文提出了一种新的图模型多任务提示方法。具体来说,我们首先将图形提示符和语言提示符的格式与提示符标记、标记结构和插入模式统一起来。通过这种方式,可以将自然语言处理中的提示思想无缝地引入到图区域中。然后,为了进一步缩小各种图形任务与最先进的预训练策略之间的差距,我们进一步研究了各种图形应用的任务空间,并将下游问题重新表述为图形级任务。在此基础上,引入元学习,有效地学习图形的多任务提示的初始化,使得提示框架对于不同的任务具有更高的可靠性和通用性。我们进行了广泛的实验,实验结果证明了我们方法的优越性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=All+in+One:+Multi-Task+Prompting+for+Graph+Neural+Networks)|0| |[GMOCAT: A Graph-Enhanced Multi-Objective Method for Computerized Adaptive Testing](https://doi.org/10.1145/3580305.3599367)|Hangyu Wang, Ting Long, Liang Yin, Weinan Zhang, Wei Xia, Qichen Hong, Dingyin Xia, Ruiming Tang, Yong Yu||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=GMOCAT:+A+Graph-Enhanced+Multi-Objective+Method+for+Computerized+Adaptive+Testing)|0| |[Theoretical Convergence Guaranteed Resource-Adaptive Federated Learning with Mixed Heterogeneity](https://doi.org/10.1145/3580305.3599521)|Yangyang Wang, Xiao Zhang, Mingyi Li, Tian Lan, Huashan Chen, Hui Xiong, Xiuzhen Cheng, Dongxiao Yu||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Theoretical+Convergence+Guaranteed+Resource-Adaptive+Federated+Learning+with+Mixed+Heterogeneity)|0| |[Efficient Bi-Level Optimization for Recommendation Denoising](https://doi.org/10.1145/3580305.3599324)|Zongwei Wang, Min Gao, Wentao Li, Junliang Yu, Linxin Guo, Hongzhi Yin||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Efficient+Bi-Level+Optimization+for+Recommendation+Denoising)|0| |[Meta Graph Learning for Long-tail Recommendation](https://doi.org/10.1145/3580305.3599428)|Chunyu Wei, Jian Liang, Di Liu, Zehui Dai, Mang Li, Fei Wang||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Meta+Graph+Learning+for+Long-tail+Recommendation)|0| |[Personalized Federated Learning with Parameter Propagation](https://doi.org/10.1145/3580305.3599464)|Jun Wu, Wenxuan Bao, Elizabeth A. Ainsworth, Jingrui He||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Personalized+Federated+Learning+with+Parameter+Propagation)|0| -|[Serverless Federated AUPRC Optimization for Multi-Party Collaborative Imbalanced Data Mining](https://doi.org/10.1145/3580305.3599499)|Xidong Wu, Zhengmian Hu, Jian Pei, Heng Huang|Duke University; University of Pittsburgh|Multi-party collaborative training, such as distributed learning and federated learning, is used to address the big data challenges. However, traditional multi-party collaborative training algorithms were mainly designed for balanced data mining tasks and are intended to optimize accuracy (\emph{e.g.}, cross-entropy). The data distribution in many real-world applications is skewed and classifiers, which are trained to improve accuracy, perform poorly when applied to imbalanced data tasks since models could be significantly biased toward the primary class. Therefore, the Area Under Precision-Recall Curve (AUPRC) was introduced as an effective metric. Although single-machine AUPRC maximization methods have been designed, multi-party collaborative algorithm has never been studied. The change from the single-machine to the multi-party setting poses critical challenges. 
To address the above challenge, we study the serverless multi-party collaborative AUPRC maximization problem, since serverless multi-party collaborative training can cut down the communication cost by avoiding the server-node bottleneck. We reformulate it as a conditional stochastic optimization problem in a serverless multi-party collaborative learning setting and propose a new ServerLess biAsed sTochastic gradiEnt (SLATE) algorithm to directly optimize the AUPRC. After that, we use the variance reduction technique and propose the ServerLess biAsed sTochastic gradiEnt with Momentum-based variance reduction (SLATE-M) algorithm to improve the convergence rate, which matches the best theoretical convergence result reached by the single-machine online method. To the best of our knowledge, this is the first work to solve the multi-party collaborative AUPRC maximization problem.|多方协作培训,如分布式学习和联合学习,被用来解决大数据的挑战。然而,传统的多方协同训练算法主要是针对平衡的数据挖掘任务而设计的,其目的是优化精度(例如: 交叉熵)。许多实际应用中的数据分布是倾斜的,分类器经过训练以提高准确性,但在应用于不平衡的数据任务时表现不佳,因为模型可能明显偏向于主类。因此,引入精确回忆曲线下面积(AUPRC)作为一个有效的度量指标。虽然单机 AUPRC 最大化方法已经设计出来,但是多方协作算法还没有得到研究。从单一机器设置到多方设置的变化提出了关键的挑战。为了解决上述问题,我们研究了无服务器多方协作的 AUPRC 最大化问题,因为无服务器多方协作培训可以通过避免服务器节点瓶颈来降低通信成本,并将其重新表述为无服务器多方协作环境中的条件随机最佳化问题,提出了一种新的无服务器有偏随机梯度(SLATE)算法来直接优化 AUPRC。在此基础上,利用方差减少技术,提出了基于动量方差减少(SLATE-M)的无服务器有偏随机梯度算法,提高了算法的收敛速度,达到了单机在线算法的最佳理论收敛效果。据我们所知,这是第一个解决多方协作 AUPRC 最大化问题的工作。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Serverless+Federated+AUPRC+Optimization+for+Multi-Party+Collaborative+Imbalanced+Data+Mining)|0| +|[Serverless Federated AUPRC Optimization for Multi-Party Collaborative Imbalanced Data Mining](https://doi.org/10.1145/3580305.3599499)|Xidong Wu, Zhengmian Hu, Jian Pei, Heng Huang|University of Pittsburgh; Duke University|Multi-party collaborative training, such as distributed learning and federated learning, is used to address big data challenges. However, traditional multi-party collaborative training algorithms were mainly designed for balanced data mining tasks and are intended to optimize accuracy (\emph{e.g.}, cross-entropy). The data distribution in many real-world applications is skewed and classifiers, which are trained to improve accuracy, perform poorly when applied to imbalanced data tasks since models could be significantly biased toward the primary class. Therefore, the Area Under Precision-Recall Curve (AUPRC) was introduced as an effective metric. Although single-machine AUPRC maximization methods have been designed, multi-party collaborative algorithms have never been studied. The change from the single-machine to the multi-party setting poses critical challenges. To address the above challenge, we study the serverless multi-party collaborative AUPRC maximization problem, since serverless multi-party collaborative training can cut down the communication cost by avoiding the server-node bottleneck. We reformulate it as a conditional stochastic optimization problem in a serverless multi-party collaborative learning setting and propose a new ServerLess biAsed sTochastic gradiEnt (SLATE) algorithm to directly optimize the AUPRC. After that, we use the variance reduction technique and propose the ServerLess biAsed sTochastic gradiEnt with Momentum-based variance reduction (SLATE-M) algorithm to improve the convergence rate, which matches the best theoretical convergence result reached by the single-machine online method.
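
Since SLATE directly optimizes (a surrogate of) the AUPRC, it helps to pin down the metric itself, which on a finite sample equals average precision; a reference implementation of the metric (of the metric only, not of SLATE):

```python
import numpy as np

def average_precision(scores, labels):
    # AP = mean over positives of precision@k at each positive's rank;
    # a standard finite-sample estimate of the area under the PR curve.
    order = np.argsort(-np.asarray(scores, dtype=float))
    y = np.asarray(labels)[order]
    precision_at_k = np.cumsum(y) / (np.arange(len(y)) + 1)
    return float((precision_at_k * y).sum() / max(y.sum(), 1))

print(average_precision([0.9, 0.2, 0.7, 0.4], [1, 0, 0, 1]))  # 0.8333...
```
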
To the best of our knowledge, this is the first work to solve the multi-party collaborative AUPRC maximization problem.|多方协作培训,如分布式学习和联合学习,被用来解决大数据的挑战。然而,传统的多方协同训练算法主要是针对平衡的数据挖掘任务而设计的,其目的是优化精度(例如: 交叉熵)。许多实际应用中的数据分布是倾斜的,分类器经过训练以提高准确性,但在应用于不平衡的数据任务时表现不佳,因为模型可能明显偏向于主类。因此,引入精确回忆曲线下面积(AUPRC)作为一个有效的度量指标。虽然单机 AUPRC 最大化方法已经设计出来,但是多方协作算法还没有得到研究。从单一机器设置到多方设置的变化提出了关键的挑战。为了解决上述问题,我们研究了无服务器多方协作的 AUPRC 最大化问题,因为无服务器多方协作培训可以通过避免服务器节点瓶颈来降低通信成本,并将其重新表述为无服务器多方协作环境中的条件随机最佳化问题,提出了一种新的无服务器有偏随机梯度(SLATE)算法来直接优化 AUPRC。在此基础上,利用方差减少技术,提出了基于动量方差减少(SLATE-M)的无服务器有偏随机梯度算法,提高了算法的收敛速度,达到了单机在线算法的最佳理论收敛效果。据我们所知,这是第一个解决多方协作 AUPRC 最大化问题的工作。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Serverless+Federated+AUPRC+Optimization+for+Multi-Party+Collaborative+Imbalanced+Data+Mining)|0| |[MSSRNet: Manipulating Sequential Style Representation for Unsupervised Text Style Transfer](https://doi.org/10.1145/3580305.3599438)|Yazheng Yang, Zhou Zhao, Qi Liu|Zhejiang University; The University of Hong Kong|The unsupervised text style transfer task aims to rewrite a text into a target style while preserving its main content. Traditional methods rely on a fixed-size vector to regulate the text style, which struggles to accurately convey the style strength for each individual token. In fact, each token of a text carries a different style intensity and makes a different contribution to the overall style. Our proposed method addresses this issue by assigning an individual style vector to each token in a text, allowing for fine-grained control and manipulation of the style strength. Additionally, an adversarial training framework integrated with teacher-student learning is introduced to enhance training stability and reduce the complexity of high-dimensional optimization. The results of our experiments demonstrate the efficacy of our method in terms of clearly improved style transfer accuracy and content preservation in both two-style transfer and multi-style transfer settings.|无监督文本样式转换任务的目标是在保留文本主要内容的同时将文本重写成目标样式。传统的方法依赖于使用固定大小的向量来调整文本样式,这很难准确地表达每个单独标记的样式强度。事实上,文本的每一个标记都包含着不同的风格强度,并对整体风格做出不同的贡献。我们提出的方法通过为文本中的每个标记分配单独的样式向量来解决这个问题,从而允许对样式强度进行细粒度控制和操作。此外,为了提高训练的稳定性,降低高维优化的复杂性,提出了一种结合师生学习的对抗性训练框架。实验结果表明,该方法在两种类型和多种类型的转移设置下,均能明显提高文体转移的准确性和内容保存率。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=MSSRNet:+Manipulating+Sequential+Style+Representation+for+Unsupervised+Text+Style+Transfer)|0| -|[Knowledge Graph Self-Supervised Rationalization for Recommendation](https://doi.org/10.1145/3580305.3599400)|Yuhao Yang, Chao Huang, Lianghao Xia, Chunzhen Huang|Tencent; The University of Hong Kong|In this paper, we introduce a new self-supervised rationalization method, called KGRec, for knowledge-aware recommender systems. To effectively identify informative knowledge connections, we propose an attentive knowledge rationalization mechanism that generates rational scores for knowledge triplets. With these scores, KGRec integrates generative and contrastive self-supervised tasks for recommendation through rational masking. To highlight rationales in the knowledge graph, we design a novel generative task in the form of masking-reconstructing. By masking important knowledge with high rational scores, KGRec is trained to rebuild and highlight useful knowledge connections that serve as rationales. To further rationalize the effect of collaborative interactions on knowledge graph learning, we introduce a contrastive learning task that aligns signals from knowledge and user-item interaction views.
To ensure noise-resistant contrasting, potential noisy edges in both graphs judged by the rational scores are masked. Extensive experiments on three real-world datasets demonstrate that KGRec outperforms state-of-the-art methods. We also provide the implementation codes for our approach at https://github.com/HKUDS/KGRec.|本文针对知识感知推荐系统,提出了一种新的自监督合理化方法 KGRec。为了有效地识别信息知识连接,我们提出了一种注意的知识合理化机制,为知识三元组生成合理的分数。根据这些分数,KGRec 通过合理的掩蔽将生成性和对比性自我监督任务集成到推荐系统中。为了突出知识图中的基本原理,我们设计了一个新的生成任务,即掩蔽-重构。通过用高理性分数掩盖重要的知识,KGRec 被训练重建和突出作为基本原理的有用的知识联系。为了进一步合理化协作交互对知识图学习的影响,我们引入了一个对比学习任务,该任务从知识和用户项目交互视图中调整信号。为了确保抗噪声的对比,在两个图的潜在噪声边缘判断有理分数被掩盖。在三个真实世界数据集上的大量实验表明,KGRec 的性能优于最先进的方法。我们亦会在 https://github.com/hkuds/kgrec 为我们的方法提供实施守则。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Knowledge+Graph+Self-Supervised+Rationalization+for+Recommendation)|0| -|[FedCP: Separating Feature Information for Personalized Federated Learning via Conditional Policy](https://doi.org/10.1145/3580305.3599345)|Jianqing Zhang, Yang Hua, Hao Wang, Tao Song, Zhengui Xue, Ruhui Ma, Haibing Guan|Queen’s University Belfast; Shanghai Jiao Tong University; Louisiana State University|Recently, personalized federated learning (pFL) has attracted increasing attention in privacy protection, collaborative learning, and tackling statistical heterogeneity among clients, e.g., hospitals, mobile smartphones, etc. Most existing pFL methods focus on exploiting the global information and personalized information in the client-level model parameters while neglecting that data is the source of these two kinds of information. To address this, we propose the Federated Conditional Policy (FedCP) method, which generates a conditional policy for each sample to separate the global information and personalized information in its features and then processes them by a global head and a personalized head, respectively. FedCP is more fine-grained to consider personalization in a sample-specific manner than existing pFL methods. Extensive experiments in computer vision and natural language processing domains show that FedCP outperforms eleven state-of-the-art methods by up to 6.69%. Furthermore, FedCP maintains its superiority when some clients accidentally drop out, which frequently happens in mobile settings. Our code is public at https://github.com/TsingZ0/FedCP.|近年来,个性化联邦学习(pFL)在保护个人隐私、合作学习以及处理客户之间的统计异质性等方面受到越来越多的关注,例如医院、移动智能手机等。现有的 pFL 方法大多侧重于利用客户端模型参数中的全局信息和个性化信息,而忽视了数据是这两类信息的来源。为了解决这个问题,我们提出了联邦条件策略(FedCP)方法,该方法为每个样本生成一个条件策略来分离其特征中的全局信息和个性化信息,然后分别通过一个全局头和一个个性化头来处理它们。与现有的 pFL 方法相比,FedCP 更加细粒度地以特定于样本的方式考虑个性化。在计算机视觉和自然语言处理领域的大量实验表明,FedCP 比11种最先进的方法的性能提高了6.69% 。此外,当一些客户端意外退出时,FedCP 仍然保持其优势,这种情况在移动设置中经常发生。我们的代码在 https://github.com/tsingz0/fedcp 是公开的。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=FedCP:+Separating+Feature+Information+for+Personalized+Federated+Learning+via+Conditional+Policy)|0| +|[Knowledge Graph Self-Supervised Rationalization for Recommendation](https://doi.org/10.1145/3580305.3599400)|Yuhao Yang, Chao Huang, Lianghao Xia, Chunzhen Huang|The University of Hong Kong; Tencent|In this paper, we introduce a new self-supervised rationalization method, called KGRec, for knowledge-aware recommender systems. To effectively identify informative knowledge connections, we propose an attentive knowledge rationalization mechanism that generates rational scores for knowledge triplets. With these scores, KGRec integrates generative and contrastive self-supervised tasks for recommendation through rational masking. 
To highlight rationales in the knowledge graph, we design a novel generative task in the form of masking-reconstructing. By masking important knowledge with high rational scores, KGRec is trained to rebuild and highlight useful knowledge connections that serve as rationales. To further rationalize the effect of collaborative interactions on knowledge graph learning, we introduce a contrastive learning task that aligns signals from knowledge and user-item interaction views. To ensure noise-resistant contrasting, potentially noisy edges in both graphs, as judged by the rational scores, are masked. Extensive experiments on three real-world datasets demonstrate that KGRec outperforms state-of-the-art methods. We also provide the implementation code for our approach at https://github.com/HKUDS/KGRec.|本文针对知识感知推荐系统,提出了一种新的自监督合理化方法 KGRec。为了有效地识别信息知识连接,我们提出了一种注意的知识合理化机制,为知识三元组生成合理的分数。根据这些分数,KGRec 通过合理的掩蔽将生成性和对比性自我监督任务集成到推荐系统中。为了突出知识图中的基本原理,我们设计了一个新的生成任务,即掩蔽-重构。通过用高理性分数掩盖重要的知识,KGRec 被训练重建和突出作为基本原理的有用的知识联系。为了进一步合理化协作交互对知识图学习的影响,我们引入了一个对比学习任务,该任务从知识和用户项目交互视图中调整信号。为了确保抗噪声的对比,根据有理分数判断出的两个图中的潜在噪声边将被掩蔽。在三个真实世界数据集上的大量实验表明,KGRec 的性能优于最先进的方法。我们亦会在 https://github.com/hkuds/kgrec 为我们的方法提供实施守则。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Knowledge+Graph+Self-Supervised+Rationalization+for+Recommendation)|0| +|[FedCP: Separating Feature Information for Personalized Federated Learning via Conditional Policy](https://doi.org/10.1145/3580305.3599345)|Jianqing Zhang, Yang Hua, Hao Wang, Tao Song, Zhengui Xue, Ruhui Ma, Haibing Guan|Shanghai Jiao Tong University; Louisiana State University; Queen’s University Belfast|Recently, personalized federated learning (pFL) has attracted increasing attention in privacy protection, collaborative learning, and tackling statistical heterogeneity among clients, e.g., hospitals, mobile smartphones, etc. Most existing pFL methods focus on exploiting the global information and personalized information in the client-level model parameters while neglecting that data is the source of these two kinds of information. To address this, we propose the Federated Conditional Policy (FedCP) method, which generates a conditional policy for each sample to separate the global information and personalized information in its features and then processes them by a global head and a personalized head, respectively. FedCP considers personalization in a more fine-grained, sample-specific manner than existing pFL methods. Extensive experiments in computer vision and natural language processing domains show that FedCP outperforms eleven state-of-the-art methods by up to 6.69%. Furthermore, FedCP maintains its superiority when some clients accidentally drop out, which frequently happens in mobile settings.
Our code is public at https://github.com/TsingZ0/FedCP.|近年来,个性化联邦学习(pFL)在保护个人隐私、合作学习以及处理客户之间的统计异质性等方面受到越来越多的关注,例如医院、移动智能手机等。现有的 pFL 方法大多侧重于利用客户端模型参数中的全局信息和个性化信息,而忽视了数据是这两类信息的来源。为了解决这个问题,我们提出了联邦条件策略(FedCP)方法,该方法为每个样本生成一个条件策略来分离其特征中的全局信息和个性化信息,然后分别通过一个全局头和一个个性化头来处理它们。与现有的 pFL 方法相比,FedCP 更加细粒度地以特定于样本的方式考虑个性化。在计算机视觉和自然语言处理领域的大量实验表明,FedCP 比11种最先进的方法的性能提高了6.69% 。此外,当一些客户端意外退出时,FedCP 仍然保持其优势,这种情况在移动设置中经常发生。我们的代码在 https://github.com/tsingz0/fedcp 是公开的。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=FedCP:+Separating+Feature+Information+for+Personalized+Federated+Learning+via+Conditional+Policy)|0| |[CFGL-LCR: A Counterfactual Graph Learning Framework for Legal Case Retrieval](https://doi.org/10.1145/3580305.3599273)|Kun Zhang, Chong Chen, Yuanzhuo Wang, Qi Tian, Long Bai||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=CFGL-LCR:+A+Counterfactual+Graph+Learning+Framework+for+Legal+Case+Retrieval)|0| |[DM-PFL: Hitchhiking Generic Federated Learning for Efficient Shift-Robust Personalization](https://doi.org/10.1145/3580305.3599311)|Wenhao Zhang, Zimu Zhou, Yansheng Wang, Yongxin Tong||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=DM-PFL:+Hitchhiking+Generic+Federated+Learning+for+Efficient+Shift-Robust+Personalization)|0| -|[Efficient Approximation Algorithms for Spanning Centrality](https://doi.org/10.1145/3580305.3599323)|Shiqi Zhang, Renchi Yang, Jing Tang, Xiaokui Xiao, Bo Tang|The Hong Kong University of Science and Technology (Guangzhou); National University of Singapore; Hong Kong Baptist University; Southern University of Science and Technology|Given a graph $\mathcal{G}$, the spanning centrality (SC) of an edge $e$ measures the importance of $e$ for $\mathcal{G}$ to be connected. In practice, SC has seen extensive applications in computational biology, electrical networks, and combinatorial optimization. However, it is highly challenging to compute the SC of all edges (AESC) on large graphs. Existing techniques fail to deal with such graphs, as they either suffer from expensive matrix operations or require sampling numerous long random walks. To circumvent these issues, this paper proposes TGT and its enhanced version TGT+, two algorithms for AESC computation that offers rigorous theoretical approximation guarantees. In particular, TGT remedies the deficiencies of previous solutions by conducting deterministic graph traversals with carefully-crafted truncated lengths. TGT+ further advances TGT in terms of both empirical efficiency and asymptotic performance while retaining result quality, based on the combination of TGT with random walks and several additional heuristic optimizations. We experimentally evaluate TGT+ against recent competitors for AESC using a variety of real datasets. 
The experimental outcomes confirm that TGT+ often outperforms the state of the art by over an order of magnitude of speedup without degrading accuracy.|给定一个图 $\mathcal{G}$,边 $e$ 的生成中心性(SC)度量 $e$ 对于 $\mathcal{G}$ 保持连通的重要性。在实践中,SC 已经在计算生物学、电力网络和组合优化等领域得到了广泛的应用。然而,计算大图上所有边的 SC (AESC)是一个非常具有挑战性的问题。现有的技术无法处理这样的图,因为它们要么需要进行昂贵的矩阵运算,要么需要对大量的长随机游动进行采样。为了解决这些问题,本文提出了 TGT 及其改进版本 TGT + ,这两种 AESC 计算算法提供了严格的理论近似保证。特别是,TGT 通过使用精心设计的截断长度进行确定性图遍历,弥补了以前解决方案的缺陷。TGT + 基于 TGT 与随机游动的结合以及几个附加的启发式优化,在保持结果质量的同时,进一步提高了 TGT 的经验有效性和渐近性能。我们使用各种实际数据集对 AESC 的最近竞争对手进行了 TGT + 的实验评估。实验结果表明,在不降低准确性的情况下,TGT + 的性能通常比现有技术水平高出一个数量级。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Efficient+Approximation+Algorithms+for+Spanning+Centrality)|0| +|[Efficient Approximation Algorithms for Spanning Centrality](https://doi.org/10.1145/3580305.3599323)|Shiqi Zhang, Renchi Yang, Jing Tang, Xiaokui Xiao, Bo Tang|Southern University of Science and Technology; Hong Kong Baptist University; The Hong Kong University of Science and Technology (Guangzhou); National University of Singapore|Given a graph $\mathcal{G}$, the spanning centrality (SC) of an edge $e$ measures the importance of $e$ for $\mathcal{G}$ to be connected. In practice, SC has seen extensive applications in computational biology, electrical networks, and combinatorial optimization. However, it is highly challenging to compute the SC of all edges (AESC) on large graphs. Existing techniques fail to deal with such graphs, as they either suffer from expensive matrix operations or require sampling numerous long random walks. To circumvent these issues, this paper proposes TGT and its enhanced version TGT+, two algorithms for AESC computation that offer rigorous theoretical approximation guarantees. In particular, TGT remedies the deficiencies of previous solutions by conducting deterministic graph traversals with carefully-crafted truncated lengths. TGT+ further advances TGT in terms of both empirical efficiency and asymptotic performance while retaining result quality, based on the combination of TGT with random walks and several additional heuristic optimizations. We experimentally evaluate TGT+ against recent competitors for AESC using a variety of real datasets.
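
For intuition about the quantity that TGT/TGT+ approximate: on an unweighted connected graph, the spanning centrality of edge (u, v) equals the effective resistance between u and v, so it can be computed exactly (at cubic cost) from the Laplacian pseudoinverse. A small exact baseline useful for cross-checking, not the paper's algorithm:

```python
import numpy as np
import networkx as nx

def exact_spanning_centrality(G):
    # SC(u, v) = fraction of spanning trees containing (u, v)
    #          = effective resistance between u and v (unweighted G).
    nodes = list(G.nodes())
    index = {n: i for i, n in enumerate(nodes)}
    L = nx.laplacian_matrix(G, nodelist=nodes).toarray().astype(float)
    Lp = np.linalg.pinv(L)  # pseudoinverse of the graph Laplacian
    return {(u, v): Lp[index[u], index[u]] + Lp[index[v], index[v]]
                    - 2 * Lp[index[u], index[v]]
            for u, v in G.edges()}

print(exact_spanning_centrality(nx.cycle_graph(4)))  # every edge: 0.75
```
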
The experimental outcomes confirm that TGT+ often outperforms the state of the art by over an order of magnitude of speedup without degrading accuracy.|给定一个图 $\mathcal{G}$,边 $e$ 的生成中心性(SC)度量 $e$ 对于 $\mathcal{G}$ 保持连通的重要性。在实践中,SC 已经在计算生物学、电力网络和组合优化等领域得到了广泛的应用。然而,计算大图上所有边的 SC (AESC)是一个非常具有挑战性的问题。现有的技术无法处理这样的图,因为它们要么需要进行昂贵的矩阵运算,要么需要对大量的长随机游动进行采样。为了解决这些问题,本文提出了 TGT 及其改进版本 TGT + ,这两种 AESC 计算算法提供了严格的理论近似保证。特别是,TGT 通过使用精心设计的截断长度进行确定性图遍历,弥补了以前解决方案的缺陷。TGT + 基于 TGT 与随机游动的结合以及几个附加的启发式优化,在保持结果质量的同时,进一步提高了 TGT 的经验有效性和渐近性能。我们使用各种实际数据集对 AESC 的最近竞争对手进行了 TGT + 的实验评估。实验结果表明,在不降低准确性的情况下,TGT + 的性能通常比现有技术水平高出一个数量级。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Efficient+Approximation+Algorithms+for+Spanning+Centrality)|0| |[Improving Search Clarification with Structured Information Extracted from Search Results](https://doi.org/10.1145/3580305.3599389)|Ziliang Zhao, Zhicheng Dou, Yu Guo, Zhao Cao, Xiaohua Cheng||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Improving+Search+Clarification+with+Structured+Information+Extracted+from+Search+Results)|0| |[Dense Representation Learning and Retrieval for Tabular Data Prediction](https://doi.org/10.1145/3580305.3599305)|Lei Zheng, Ning Li, Xianyu Chen, Quan Gan, Weinan Zhang||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Dense+Representation+Learning+and+Retrieval+for+Tabular+Data+Prediction)|0| |[A Sublinear Time Algorithm for Opinion Optimization in Directed Social Networks via Edge Recommendation](https://doi.org/10.1145/3580305.3599247)|Xiaotian Zhou, Liwang Zhu, Wei Li, Zhongzhi Zhang||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Sublinear+Time+Algorithm+for+Opinion+Optimization+in+Directed+Social+Networks+via+Edge+Recommendation)|0| -|[Path-Specific Counterfactual Fairness for Recommender Systems](https://doi.org/10.1145/3580305.3599462)|Yaochen Zhu, Jing Ma, Liang Wu, Qi Guo, Liangjie Hong, Jundong Li|University of Virginia; LinkedIn Inc.|Recommender systems (RSs) have become an indispensable part of online platforms. With the growing concerns of algorithmic fairness, RSs are not only expected to deliver high-quality personalized content, but are also demanded not to discriminate against users based on their demographic information. However, existing RSs could capture undesirable correlations between sensitive features and observed user behaviors, leading to biased recommendations. Most fair RSs tackle this problem by completely blocking the influences of sensitive features on recommendations. But since sensitive features may also affect user interests in a fair manner (e.g., race on culture-based preferences), indiscriminately eliminating all the influences of sensitive features inevitably degenerates the recommendation quality and necessary diversity. To address this challenge, we propose a path-specific fair RS (PSF-RS) for recommendations. Specifically, we summarize all fair and unfair correlations between sensitive features and observed ratings into two latent proxy mediators, where the concept of path-specific bias (PS-Bias) is defined based on path-specific counterfactual inference. Inspired by Pearl's minimal change principle, we address the PS-Bias by minimally transforming the biased factual world into a hypothetically fair world, where a fair RS model can be learned accordingly by solving a constrained optimization problem.
|[Improving Search Clarification with Structured Information Extracted from Search Results](https://doi.org/10.1145/3580305.3599389)|Ziliang Zhao, Zhicheng Dou, Yu Guo, Zhao Cao, Xiaohua Cheng||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Improving+Search+Clarification+with+Structured+Information+Extracted+from+Search+Results)|0|
|[Dense Representation Learning and Retrieval for Tabular Data Prediction](https://doi.org/10.1145/3580305.3599305)|Lei Zheng, Ning Li, Xianyu Chen, Quan Gan, Weinan Zhang||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Dense+Representation+Learning+and+Retrieval+for+Tabular+Data+Prediction)|0|
|[A Sublinear Time Algorithm for Opinion Optimization in Directed Social Networks via Edge Recommendation](https://doi.org/10.1145/3580305.3599247)|Xiaotian Zhou, Liwang Zhu, Wei Li, Zhongzhi Zhang||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Sublinear+Time+Algorithm+for+Opinion+Optimization+in+Directed+Social+Networks+via+Edge+Recommendation)|0|
-|[Path-Specific Counterfactual Fairness for Recommender Systems](https://doi.org/10.1145/3580305.3599462)|Yaochen Zhu, Jing Ma, Liang Wu, Qi Guo, Liangjie Hong, Jundong Li|University of Virginia; LinkedIn Inc.|Recommender systems (RSs) have become an indispensable part of online platforms. With growing concerns about algorithmic fairness, RSs are not only expected to deliver high-quality personalized content, but are also demanded not to discriminate against users based on their demographic information. However, existing RSs could capture undesirable correlations between sensitive features and observed user behaviors, leading to biased recommendations. Most fair RSs tackle this problem by completely blocking the influences of sensitive features on recommendations. But since sensitive features may also affect user interests in a fair manner (e.g., race influencing culture-based preferences), indiscriminately eliminating all the influences of sensitive features inevitably degrades recommendation quality and necessary diversity. To address this challenge, we propose a path-specific fair RS (PSF-RS) for recommendations. Specifically, we summarize all fair and unfair correlations between sensitive features and observed ratings into two latent proxy mediators, where the concept of path-specific bias (PS-Bias) is defined based on path-specific counterfactual inference. Inspired by Pearl's minimal change principle, we address the PS-Bias by minimally transforming the biased factual world into a hypothetically fair world, where a fair RS model can be learned accordingly by solving a constrained optimization problem. For the technical part, we propose a feasible implementation of PSF-RS, i.e., PSF-VAE, with weakly-supervised variational inference, which robustly infers the latent mediators such that unfairness can be mitigated while necessary recommendation diversities can be maximally preserved simultaneously. Experiments conducted on semi-simulated and real-world datasets demonstrate the effectiveness of PSF-RS.|推荐系统已经成为在线平台不可或缺的一部分。随着对算法公平性的日益关注,推荐系统不仅被期望提供高质量的个性化内容,而且被要求不因用户的人口统计信息而歧视用户。然而,现有的推荐系统可能捕获敏感特征和观察到的用户行为之间不希望出现的相关性,从而导致有偏见的推荐。大多数公平推荐系统通过完全屏蔽敏感特征对推荐的影响来解决这个问题。但是,由于敏感特征也可能以公平的方式影响用户的兴趣(例如,种族影响基于文化的偏好),不加区分地消除敏感特征的所有影响必然会降低推荐的质量和必要的多样性。为了应对这一挑战,我们提出了一种路径特定的公平推荐系统(PSF-RS)。具体而言,我们将敏感特征和观察评分之间的所有公平和不公平的相关性总结为两个潜在的代理中介,其中路径特定偏差(PS-Bias)的概念是基于路径特定反事实推断定义的。受珀尔的最小改变原则的启发,我们通过最小化地将有偏见的现实世界转化为一个假设的公平世界来解决 PS-Bias,在这个假设的公平世界中,可以通过求解一个约束优化问题来相应地学习一个公平的推荐模型。在技术部分,我们提出了一种可行的 PSF-RS 实现方法,即 PSF-VAE,该方法利用弱监督变分推理稳健地推断潜在的中介因子,从而在最大限度地保留必要的推荐多样性的同时减少不公平性。在半模拟和真实数据集上进行的实验证明了 PSF-RS 的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Path-Specific+Counterfactual+Fairness+for+Recommender+Systems)|0|
-|[Capturing Conversion Rate Fluctuation during Sales Promotions: A Novel Historical Data Reuse Approach](https://doi.org/10.1145/3580305.3599788)|Zhangming Chan, Yu Zhang, Shuguang Han, Yong Bai, XiangRong Sheng, Siyuan Lou, Jiacen Hu, Baolin Liu, Yuning Jiang, Jian Xu, Bo Zheng|Alibaba Group; Nanjing University; University of Science and Technology Beijing|Conversion rate (CVR) prediction is one of the core components in online recommender systems, and various approaches have been proposed to obtain accurate and well-calibrated CVR estimation. However, we observe that a well-trained CVR prediction model often performs sub-optimally during sales promotions. This can be largely ascribed to the problem of the data distribution shift, in which the conventional methods no longer work. To this end, we seek to develop alternative modeling techniques for CVR prediction. Observing similar purchase patterns across different promotions, we propose reusing the historical promotion data to capture the promotional conversion patterns. Herein, we propose a novel \textbf{H}istorical \textbf{D}ata \textbf{R}euse (\textbf{HDR}) approach that first retrieves historically similar promotion data and then fine-tunes the CVR prediction model with the acquired data for better adaptation to the promotion mode. HDR consists of three components: an automated data retrieval module that seeks similar data from historical promotions, a distribution shift correction module that re-weights the retrieved data for better aligning with the target promotion, and a TransBlock module that quickly fine-tunes the original model for better adaptation to the promotion mode. Experiments conducted with real-world data demonstrate the effectiveness of HDR, as it improves both ranking and calibration metrics to a large extent. HDR has also been deployed on the display advertising system in Alibaba, bringing a lift of $9\%$ RPM and $16\%$ CVR during Double 11 Sales in 2022.|转化率(CVR)预测是在线推荐系统的核心组成部分之一,为了获得准确、校准良好的 CVR 估计,人们提出了各种方法。然而,我们观察到,训练良好的 CVR 预测模型在促销期间往往表现欠佳。这在很大程度上归因于数据分布偏移问题,传统方法在这种情况下不再有效。为此,我们寻求发展替代的 CVR 预测建模技术。通过观察不同促销活动中相似的购买模式,我们建议重用历史促销数据来捕获促销转化模式。在此,我们提出了一种新的历史数据重用(Historical Data Reuse, HDR)方法:首先检索历史上相似的促销数据,然后用所获得的数据对 CVR 预测模型进行微调,以更好地适应促销模式。HDR 由三个组件组成:一个从历史促销活动中寻找相似数据的自动数据检索模块;一个重新加权检索数据、使其更好地与目标促销对齐的分布偏移校正模块;以及一个快速微调原始模型、使其更好地适应促销模式的 TransBlock 模块。利用真实数据进行的实验证明了 HDR 的有效性,它在很大程度上改善了排序和校准指标。HDR 也已经部署在阿里巴巴的展示广告系统上,在 2022 年双 11 销售期间带来了 9% 的 RPM 和 16% 的 CVR 提升。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Capturing+Conversion+Rate+Fluctuation+during+Sales+Promotions:+A+Novel+Historical+Data+Reuse+Approach)|0|
+|[Path-Specific Counterfactual Fairness for Recommender Systems](https://doi.org/10.1145/3580305.3599462)|Yaochen Zhu, Jing Ma, Liang Wu, Qi Guo, Liangjie Hong, Jundong Li|LinkedIn Inc.; University of Virginia|Recommender systems (RSs) have become an indispensable part of online platforms. With growing concerns about algorithmic fairness, RSs are not only expected to deliver high-quality personalized content, but are also demanded not to discriminate against users based on their demographic information. However, existing RSs could capture undesirable correlations between sensitive features and observed user behaviors, leading to biased recommendations. Most fair RSs tackle this problem by completely blocking the influences of sensitive features on recommendations. But since sensitive features may also affect user interests in a fair manner (e.g., race influencing culture-based preferences), indiscriminately eliminating all the influences of sensitive features inevitably degrades recommendation quality and necessary diversity. To address this challenge, we propose a path-specific fair RS (PSF-RS) for recommendations. Specifically, we summarize all fair and unfair correlations between sensitive features and observed ratings into two latent proxy mediators, where the concept of path-specific bias (PS-Bias) is defined based on path-specific counterfactual inference. Inspired by Pearl's minimal change principle, we address the PS-Bias by minimally transforming the biased factual world into a hypothetically fair world, where a fair RS model can be learned accordingly by solving a constrained optimization problem. For the technical part, we propose a feasible implementation of PSF-RS, i.e., PSF-VAE, with weakly-supervised variational inference, which robustly infers the latent mediators such that unfairness can be mitigated while necessary recommendation diversities can be maximally preserved simultaneously. Experiments conducted on semi-simulated and real-world datasets demonstrate the effectiveness of PSF-RS.|推荐系统已经成为在线平台不可或缺的一部分。随着对算法公平性的日益关注,推荐系统不仅被期望提供高质量的个性化内容,而且被要求不因用户的人口统计信息而歧视用户。然而,现有的推荐系统可能捕获敏感特征和观察到的用户行为之间不希望出现的相关性,从而导致有偏见的推荐。大多数公平推荐系统通过完全屏蔽敏感特征对推荐的影响来解决这个问题。但是,由于敏感特征也可能以公平的方式影响用户的兴趣(例如,种族影响基于文化的偏好),不加区分地消除敏感特征的所有影响必然会降低推荐的质量和必要的多样性。为了应对这一挑战,我们提出了一种路径特定的公平推荐系统(PSF-RS)。具体而言,我们将敏感特征和观察评分之间的所有公平和不公平的相关性总结为两个潜在的代理中介,其中路径特定偏差(PS-Bias)的概念是基于路径特定反事实推断定义的。受珀尔的最小改变原则的启发,我们通过最小化地将有偏见的现实世界转化为一个假设的公平世界来解决 PS-Bias,在这个假设的公平世界中,可以通过求解一个约束优化问题来相应地学习一个公平的推荐模型。在技术部分,我们提出了一种可行的 PSF-RS 实现方法,即 PSF-VAE,该方法利用弱监督变分推理稳健地推断潜在的中介因子,从而在最大限度地保留必要的推荐多样性的同时减少不公平性。在半模拟和真实数据集上进行的实验证明了 PSF-RS 的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Path-Specific+Counterfactual+Fairness+for+Recommender+Systems)|0|
+|[Capturing Conversion Rate Fluctuation during Sales Promotions: A Novel Historical Data Reuse Approach](https://doi.org/10.1145/3580305.3599788)|Zhangming Chan, Yu Zhang, Shuguang Han, Yong Bai, XiangRong Sheng, Siyuan Lou, Jiacen Hu, Baolin Liu, Yuning Jiang, Jian Xu, Bo Zheng|Nanjing University; University of Science and Technology Beijing; Alibaba Group|Conversion rate (CVR) prediction is one of the core components in online recommender systems, and various approaches have been proposed to obtain accurate and well-calibrated CVR estimation. However, we observe that a well-trained CVR prediction model often performs sub-optimally during sales promotions. This can be largely ascribed to the problem of the data distribution shift, in which the conventional methods no longer work. To this end, we seek to develop alternative modeling techniques for CVR prediction. Observing similar purchase patterns across different promotions, we propose reusing the historical promotion data to capture the promotional conversion patterns. Herein, we propose a novel \textbf{H}istorical \textbf{D}ata \textbf{R}euse (\textbf{HDR}) approach that first retrieves historically similar promotion data and then fine-tunes the CVR prediction model with the acquired data for better adaptation to the promotion mode. HDR consists of three components: an automated data retrieval module that seeks similar data from historical promotions, a distribution shift correction module that re-weights the retrieved data for better aligning with the target promotion, and a TransBlock module that quickly fine-tunes the original model for better adaptation to the promotion mode. Experiments conducted with real-world data demonstrate the effectiveness of HDR, as it improves both ranking and calibration metrics to a large extent. HDR has also been deployed on the display advertising system in Alibaba, bringing a lift of $9\%$ RPM and $16\%$ CVR during Double 11 Sales in 2022.|转化率(CVR)预测是在线推荐系统的核心组成部分之一,为了获得准确、校准良好的 CVR 估计,人们提出了各种方法。然而,我们观察到,训练良好的 CVR 预测模型在促销期间往往表现欠佳。这在很大程度上归因于数据分布偏移问题,传统方法在这种情况下不再有效。为此,我们寻求发展替代的 CVR 预测建模技术。通过观察不同促销活动中相似的购买模式,我们建议重用历史促销数据来捕获促销转化模式。在此,我们提出了一种新的历史数据重用(Historical Data Reuse, HDR)方法:首先检索历史上相似的促销数据,然后用所获得的数据对 CVR 预测模型进行微调,以更好地适应促销模式。HDR 由三个组件组成:一个从历史促销活动中寻找相似数据的自动数据检索模块;一个重新加权检索数据、使其更好地与目标促销对齐的分布偏移校正模块;以及一个快速微调原始模型、使其更好地适应促销模式的 TransBlock 模块。利用真实数据进行的实验证明了 HDR 的有效性,它在很大程度上改善了排序和校准指标。HDR 也已经部署在阿里巴巴的展示广告系统上,在 2022 年双 11 销售期间带来了 9% 的 RPM 和 16% 的 CVR 提升。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Capturing+Conversion+Rate+Fluctuation+during+Sales+Promotions:+A+Novel+Historical+Data+Reuse+Approach)|0|
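The HDR recipe (retrieve similar historical promotion samples, re-weight them toward the target promotion, then fine-tune) reduces to a weighted fine-tuning loop. A minimal PyTorch sketch, assuming the similarity weights have already been computed by a separate distribution-shift-correction step and a generic model; names and the weighted-loss form are illustrative, not the paper's exact modules:

```python
import torch
import torch.nn.functional as F

def fine_tune_on_history(model, hist_x, hist_y, weights, lr=1e-4, steps=100):
    """Fine-tune a CVR model on retrieved historical-promotion samples,
    re-weighted to align with the target promotion (weights assumed given,
    e.g. density-ratio style importance weights)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        logits = model(hist_x).squeeze(-1)
        # Per-sample BCE, then importance re-weighting before averaging.
        loss = (weights * F.binary_cross_entropy_with_logits(
            logits, hist_y, reduction="none")).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model
```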
|[SAMD: An Industrial Framework for Heterogeneous Multi-Scenario Recommendation](https://doi.org/10.1145/3580305.3599955)|Zhaoxin Huan, Ang Li, Xiaolu Zhang, Xu Min, Jieyu Yang, Yong He, Jun Zhou||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=SAMD:+An+Industrial+Framework+for+Heterogeneous+Multi-Scenario+Recommendation)|0|
|[Learning Discrete Document Representations in Web Search](https://doi.org/10.1145/3580305.3599854)|Rong Huang, Danfeng Zhang, Weixue Lu, Han Li, Meng Wang, Daiting Shi, Jun Fan, Zhicong Cheng, Simiu Gu, Dawei Yin||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+Discrete+Document+Representations+in+Web+Search)|0|
|[AdSEE: Investigating the Impact of Image Style Editing on Advertisement Attractiveness](https://doi.org/10.1145/3580305.3599770)|Liyao Jiang, Chenglin Li, Haolan Chen, Xiaodong Gao, Xinwang Zhong, Yang Qiu, Shani Ye, Di Niu||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=AdSEE:+Investigating+the+Impact+of+Image+Style+Editing+on+Advertisement+Attractiveness)|0|
|[Adaptive Graph Contrastive Learning for Recommendation](https://doi.org/10.1145/3580305.3599768)|Yangqin Jiang, Chao Huang, Lianghao Huang|University of Hong Kong|Recently, graph neural networks (GNNs) have been successfully applied to recommender systems as an effective collaborative filtering (CF) approach. The key idea of GNN-based recommender systems is to recursively perform message passing along user-item interaction edges to refine the encoded embeddings, relying on sufficient and high-quality training data. Since user behavior data in practical recommendation scenarios is often noisy and exhibits skewed distribution, some recommendation approaches, e.g., SGL and SimGCL, leverage self-supervised learning to improve user representations against the above issues. Despite their effectiveness, however, they conduct self-supervised learning through creating contrastive views, which depends on the exploration of data augmentations and suffers from tedious trial-and-error selection of augmentation methods. In this paper, we propose a novel Adaptive Graph Contrastive Learning (AdaptiveGCL) framework which conducts graph contrastive learning with two adaptive contrastive view generators to better empower the CF paradigm. Specifically, we use two trainable view generators, a graph generative model and a graph denoising model respectively, to create contrastive views. The two generators are able to create adaptive contrastive views, addressing the problem of model collapse and achieving adaptive contrastive learning. With two adaptive contrastive views, additional high-quality training signals are introduced into the CF paradigm, helping to alleviate the data sparsity and noise issues. Extensive experiments on three benchmark datasets demonstrate the superiority of our model over various state-of-the-art recommendation methods. Further visual analysis intuitively explains why our AdaptiveGCL outperforms existing contrastive learning approaches based on selected data augmentation methods.|最近,图神经网络(GNN)已成功应用于推荐系统,作为一种有效的协同过滤(CF)方法。基于 GNN 的推荐系统的关键思想是依靠充分和高质量的训练数据,递归地执行沿用户-项目交互边传递的消息,以完善编码的嵌入。由于实际推荐场景中的用户行为数据通常是有噪音的,并且呈现出偏态分布,因此一些推荐方法,如 SGL 和 SimGCL,利用自监督学习来改善用户表示以应对上述问题。然而,尽管这些方法有效,它们通过创建对比视图来进行自监督学习,依赖于对数据增强的探索,并面临繁琐的增强方法试错选择问题。本文提出了一种新的自适应图对比学习(AdaptiveGCL)框架,该框架使用两个自适应对比视图生成器进行图对比学习,以更好地支持 CF 范式。具体来说,我们使用两个可训练的视图生成器,分别是一个图生成模型和一个图去噪模型,来创建对比视图。两个生成器能够创建自适应对比视图,解决模型崩溃问题,实现自适应对比学习。通过两个自适应对比视图,在 CF 范式中引入更多高质量的训练信号,有助于缓解数据稀疏和噪声问题。在三个基准数据集上的大量实验证明了我们的模型优于各种最先进的推荐方法。进一步的可视化分析直观地解释了为什么我们的 AdaptiveGCL 优于基于所选数据增强方法的现有对比学习方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Adaptive+Graph+Contrastive+Learning+for+Recommendation)|0|
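Whatever generates the two views, the contrastive objective between them is typically a standard InfoNCE loss where matching node embeddings are positives and all other in-batch embeddings are negatives. A minimal PyTorch sketch (the adaptive view generators are the paper's contribution and are not reproduced here):

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.2):
    """InfoNCE between two views of the same nodes: row i of z1 and
    row i of z2 are a positive pair; every other row is a negative."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau                       # [N, N] similarities
    labels = torch.arange(z1.size(0), device=z1.device)  # diagonal positives
    return F.cross_entropy(logits, labels)

# Usage: embeddings from two contrastive views of the same user nodes.
print(info_nce(torch.randn(16, 64), torch.randn(16, 64)))
```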
|[PGLBox: Multi-GPU Graph Learning Framework for Web-Scale Recommendation](https://doi.org/10.1145/3580305.3599885)|Xuewu Jiao, Weibin Li, Xinxuan Wu, Wei Hu, Miao Li, Jiang Bian, Siming Dai, Xinsheng Luo, Mingqing Hu, Zhengjie Huang, Danlei Feng, Junchao Yang, Shikun Feng, Haoyi Xiong, Dianhai Yu, Shuanglong Li, Jingzhou He, Yanjun Ma, Lin Liu||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=PGLBox:+Multi-GPU+Graph+Learning+Framework+for+Web-Scale+Recommendation)|0|
-|[IGB: Addressing The Gaps In Labeling, Features, Heterogeneity, and Size of Public Graph Datasets for Deep Learning Research](https://doi.org/10.1145/3580305.3599843)|Arpandeep Khatua, Vikram Sharma Mailthody, Bhagyashree Taleka, Tengfei Ma, Xiang Song, WenMei Hwu|AWS AI; UIUC; NVIDIA; IBM Research|Graph neural networks (GNNs) have shown high potential for a variety of real-world, challenging applications, but one of the major obstacles in GNN research is the lack of large-scale flexible datasets. Most existing public datasets for GNNs are relatively small, which limits the ability of GNNs to generalize to unseen data. The few existing large-scale graph datasets provide very limited labeled data. This makes it difficult to determine if the GNN model's low accuracy for unseen data is inherently due to insufficient training data or if the model failed to generalize. Additionally, datasets used to train GNNs need to offer flexibility to enable a thorough study of the impact of various factors while training GNN models. In this work, we introduce the Illinois Graph Benchmark (IGB), a research dataset tool that developers can use to train, scrutinize and systematically evaluate GNN models with high fidelity. IGB includes both homogeneous and heterogeneous graphs of enormous sizes, with more than 40% of their nodes labeled. Compared to the largest graph datasets publicly available, IGB provides over 162X more labeled data for deep learning practitioners and developers to create and evaluate models with higher accuracy. The IGB dataset is designed to be flexible, enabling the study of various GNN architectures, embedding generation techniques, and analyzing system performance issues. IGB is open-sourced, supports DGL and PyG frameworks, and comes with releases of the raw text that we believe foster emerging language models and GNN research projects. An early public version of IGB is available at https://github.com/IllinoisGraphBenchmark/IGB-Datasets.|图神经网络(GNN)在现实世界中具有很大的应用潜力,但是缺乏大规模的灵活数据集是 GNN 研究的主要障碍之一。大多数现有的 GNN 公共数据集相对较小,这限制了 GNN 推广到未见数据的能力。少数现有的大规模图数据集提供非常有限的标记数据。这使得很难确定 GNN 模型对于未见数据的低精度本质上是由于训练数据不足,还是模型没有泛化。此外,用于训练 GNN 的数据集需要提供灵活性,以便在训练 GNN 模型时能够对各种因素的影响进行彻底的研究。在这项工作中,我们介绍了伊利诺伊图基准(IGB),这是一个研究数据集工具,开发人员可以用它来高保真地训练、审查和系统地评估 GNN 模型。IGB 包括大型的同质和异质图,其中超过40% 的节点被标记。与公开发布的最大的图数据集相比,IGB 为深度学习从业者和开发者提供了超过162倍的标记数据,以创建和评估更高精度的模型。IGB 数据集的设计是灵活的,能够研究各种 GNN 体系结构、嵌入生成技术和分析系统性能问题。IGB 是开源的,支持 DGL 和 PyG 框架,并且附带了原始文本的发布,我们相信这些原始文本可以促进新兴语言模型和 GNN 研究项目的发展。IGB 的早期公开版本可在 https://github.com/IllinoisGraphBenchmark/IGB-Datasets 获取。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=IGB:+Addressing+The+Gaps+In+Labeling,+Features,+Heterogeneity,+and+Size+of+Public+Graph+Datasets+for+Deep+Learning+Research)|0|
-|[AdaTT: Adaptive Task-to-Task Fusion Network for Multitask Learning in Recommendations](https://doi.org/10.1145/3580305.3599769)|Danwei Li, Zhengyu Zhang, Siyang Yuan, Mingze Gao, Weilin Zhang, Chaofei Yang, Xi Liu, Jiyan Yang|Meta Platforms, Inc.; Meta AI|Multi-task learning (MTL) aims at enhancing the performance and efficiency of machine learning models by training them on multiple tasks simultaneously. However, MTL research faces two challenges: 1) modeling the relationships between tasks to effectively share knowledge between them, and 2) jointly learning task-specific and shared knowledge. In this paper, we present a novel model Adaptive Task-to-Task Fusion Network (AdaTT) to address both challenges. AdaTT is a deep fusion network built with task specific and optional shared fusion units at multiple levels. By leveraging a residual mechanism and gating mechanism for task-to-task fusion, these units adaptively learn shared knowledge and task specific knowledge. To evaluate the performance of AdaTT, we conduct experiments on a public benchmark and an industrial recommendation dataset using various task groups. Results demonstrate AdaTT can significantly outperform existing state-of-the-art baselines.|多任务学习(MTL)旨在通过同时对多任务进行训练来提高机器学习模型的性能和效率。然而,MTL 研究面临着两个挑战: 1)建立任务之间的关系以有效地分享它们之间的知识,2)联合学习任务特定的和共享的知识。在本文中,我们提出了一个新的模型自适应任务到任务融合网络(AdaTT),以解决这两个挑战。AdaTT 是一个深度融合网络,由多个级别的任务特定的和可选的共享融合单元构成。通过利用残差机制和门控机制进行任务间融合,这些单元自适应地学习共享知识和任务特定知识。为了评估 AdaTT 的性能,我们使用不同的任务组在一个公共基准和一个工业推荐数据集上进行了实验。结果表明,AdaTT 可以显著优于现有的最先进的基线。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=AdaTT:+Adaptive+Task-to-Task+Fusion+Network+for+Multitask+Learning+in+Recommendations)|0|
+|[IGB: Addressing The Gaps In Labeling, Features, Heterogeneity, and Size of Public Graph Datasets for Deep Learning Research](https://doi.org/10.1145/3580305.3599843)|Arpandeep Khatua, Vikram Sharma Mailthody, Bhagyashree Taleka, Tengfei Ma, Xiang Song, WenMei Hwu|IBM Research; NVIDIA; AWS AI; UIUC|Graph neural networks (GNNs) have shown high potential for a variety of real-world, challenging applications, but one of the major obstacles in GNN research is the lack of large-scale flexible datasets. Most existing public datasets for GNNs are relatively small, which limits the ability of GNNs to generalize to unseen data. The few existing large-scale graph datasets provide very limited labeled data. This makes it difficult to determine if the GNN model's low accuracy for unseen data is inherently due to insufficient training data or if the model failed to generalize. Additionally, datasets used to train GNNs need to offer flexibility to enable a thorough study of the impact of various factors while training GNN models. In this work, we introduce the Illinois Graph Benchmark (IGB), a research dataset tool that developers can use to train, scrutinize and systematically evaluate GNN models with high fidelity. IGB includes both homogeneous and heterogeneous graphs of enormous sizes, with more than 40% of their nodes labeled. Compared to the largest graph datasets publicly available, IGB provides over 162X more labeled data for deep learning practitioners and developers to create and evaluate models with higher accuracy. The IGB dataset is designed to be flexible, enabling the study of various GNN architectures, embedding generation techniques, and analyzing system performance issues. IGB is open-sourced, supports DGL and PyG frameworks, and comes with releases of the raw text that we believe foster emerging language models and GNN research projects. An early public version of IGB is available at https://github.com/IllinoisGraphBenchmark/IGB-Datasets.|图神经网络(GNN)在现实世界中具有很大的应用潜力,但是缺乏大规模的灵活数据集是 GNN 研究的主要障碍之一。大多数现有的 GNN 公共数据集相对较小,这限制了 GNN 推广到未见数据的能力。少数现有的大规模图数据集提供非常有限的标记数据。这使得很难确定 GNN 模型对于未见数据的低精度本质上是由于训练数据不足,还是模型没有泛化。此外,用于训练 GNN 的数据集需要提供灵活性,以便在训练 GNN 模型时能够对各种因素的影响进行彻底的研究。在这项工作中,我们介绍了伊利诺伊图基准(IGB),这是一个研究数据集工具,开发人员可以用它来高保真地训练、审查和系统地评估 GNN 模型。IGB 包括大型的同质和异质图,其中超过40% 的节点被标记。与公开发布的最大的图数据集相比,IGB 为深度学习从业者和开发者提供了超过162倍的标记数据,以创建和评估更高精度的模型。IGB 数据集的设计是灵活的,能够研究各种 GNN 体系结构、嵌入生成技术和分析系统性能问题。IGB 是开源的,支持 DGL 和 PyG 框架,并且附带了原始文本的发布,我们相信这些原始文本可以促进新兴语言模型和 GNN 研究项目的发展。IGB 的早期公开版本可在 https://github.com/IllinoisGraphBenchmark/IGB-Datasets 获取。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=IGB:+Addressing+The+Gaps+In+Labeling,+Features,+Heterogeneity,+and+Size+of+Public+Graph+Datasets+for+Deep+Learning+Research)|0|
+|[AdaTT: Adaptive Task-to-Task Fusion Network for Multitask Learning in Recommendations](https://doi.org/10.1145/3580305.3599769)|Danwei Li, Zhengyu Zhang, Siyang Yuan, Mingze Gao, Weilin Zhang, Chaofei Yang, Xi Liu, Jiyan Yang|Meta AI; Meta Platforms, Inc.|Multi-task learning (MTL) aims at enhancing the performance and efficiency of machine learning models by training them on multiple tasks simultaneously. However, MTL research faces two challenges: 1) modeling the relationships between tasks to effectively share knowledge between them, and 2) jointly learning task-specific and shared knowledge. In this paper, we present a novel model Adaptive Task-to-Task Fusion Network (AdaTT) to address both challenges. AdaTT is a deep fusion network built with task specific and optional shared fusion units at multiple levels. By leveraging a residual mechanism and gating mechanism for task-to-task fusion, these units adaptively learn shared knowledge and task specific knowledge. To evaluate the performance of AdaTT, we conduct experiments on a public benchmark and an industrial recommendation dataset using various task groups. Results demonstrate AdaTT can significantly outperform existing state-of-the-art baselines.|多任务学习(MTL)旨在通过同时对多任务进行训练来提高机器学习模型的性能和效率。然而,MTL 研究面临着两个挑战: 1)建立任务之间的关系以有效地分享它们之间的知识,2)联合学习任务特定的和共享的知识。在本文中,我们提出了一个新的模型自适应任务到任务融合网络(AdaTT),以解决这两个挑战。AdaTT 是一个深度融合网络,由多个级别的任务特定的和可选的共享融合单元构成。通过利用残差机制和门控机制进行任务间融合,这些单元自适应地学习共享知识和任务特定知识。为了评估 AdaTT 的性能,我们使用不同的任务组在一个公共基准和一个工业推荐数据集上进行了实验。结果表明,AdaTT 可以显著优于现有的最先进的基线。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=AdaTT:+Adaptive+Task-to-Task+Fusion+Network+for+Multitask+Learning+in+Recommendations)|0|
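The gating-plus-residual idea behind AdaTT's fusion units can be sketched compactly: each task gates over all tasks' expert outputs and adds a residual from its own expert. A PyTorch sketch of that idea under our own simplifications (no shared experts, single fusion level), not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class FusionUnit(nn.Module):
    """One fusion level: each task softmax-gates over every task's expert
    output and keeps a residual connection to its own expert."""
    def __init__(self, n_tasks, dim):
        super().__init__()
        self.gates = nn.ModuleList(
            [nn.Linear(dim, n_tasks) for _ in range(n_tasks)])

    def forward(self, expert_outs):                 # list of [B, dim]
        stacked = torch.stack(expert_outs, dim=1)   # [B, T, dim]
        fused = []
        for t, gate in enumerate(self.gates):
            w = torch.softmax(gate(expert_outs[t]), dim=-1)  # [B, T]
            mix = (w.unsqueeze(-1) * stacked).sum(dim=1)     # [B, dim]
            fused.append(expert_outs[t] + mix)               # residual
        return fused

outs = FusionUnit(3, 16)([torch.randn(2, 16) for _ in range(3)])
print([o.shape for o in outs])
```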
|[Stationary Algorithmic Balancing For Dynamic Email Re-Ranking Problem](https://doi.org/10.1145/3580305.3599909)|Jiayi Liu, Jennifer Neville|Purdue University|Email platforms need to generate personalized rankings of emails that satisfy user preferences, which may vary over time. We approach this as a recommendation problem based on three criteria: closeness (how relevant the sender and topic are to the user), timeliness (how recent the email is), and conciseness (how brief the email is). We propose MOSR (Multi-Objective Stationary Recommender), a novel online algorithm that uses an adaptive control model to dynamically balance these criteria and adapt to preference changes. We evaluate MOSR on the Enron Email Dataset, a large collection of real emails, and compare it with other baselines. The results show that MOSR achieves better performance, especially under non-stationary preferences, where users value different criteria more or less over time. We also test MOSR's robustness on a smaller down-sampled dataset that exhibits high variance in email characteristics, and show that it maintains stable rankings across different samples. Our work offers novel insights into how to design email re-ranking systems that account for multiple objectives impacting user satisfaction.|电子邮件平台需要生成个性化的电子邮件排名,以满足用户的喜好,而这些喜好可能随着时间的推移而变化。我们基于三个标准将其作为推荐问题来处理: 亲密性(发送者和主题与用户的相关程度)、及时性(邮件发送时间有多近)和简洁性(邮件有多简短)。我们提出了一种新的在线算法——多目标平稳推荐(MOSR),它使用自适应控制模型来动态平衡这些标准,并适应偏好的变化。我们在安然电子邮件数据集(一个大型真实邮件集合)上评估 MOSR,并与其他基线进行比较。结果表明,MOSR 能获得更好的性能,尤其是在非平稳偏好下,即用户随时间对不同标准的重视程度发生变化时。我们还在一个电子邮件特征方差较高的较小下采样数据集上测试了 MOSR 的稳健性,并表明它在不同样本上保持稳定的排名。我们的工作为如何设计兼顾影响用户满意度的多个目标的电子邮件重排序系统提供了新颖的见解。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Stationary+Algorithmic+Balancing+For+Dynamic+Email+Re-Ranking+Problem)|0|
|[Tree based Progressive Regression Model for Watch-Time Prediction in Short-video Recommendation](https://doi.org/10.1145/3580305.3599919)|Xiao Lin, Xiaokai Chen, Linfeng Song, Jingwei Liu, Biao Li, Peng Jiang|Kuaishou Technology|An accurate prediction of watch time has been of vital importance to enhance user engagement in video recommender systems. To achieve this, there are four properties that a watch time prediction framework should satisfy: first, despite its continuous value, watch time is also an ordinal variable and the relative ordering between its values reflects the differences in user preferences. Therefore the ordinal relations should be reflected in watch time predictions. Second, the conditional dependence between the video-watching behaviors should be captured in the model. For instance, one has to watch half of the video before he/she finishes watching the whole video. Third, modeling watch time with a point estimation ignores the fact that models might give results with high uncertainty and this could cause bad cases in recommender systems. Therefore the framework should be aware of prediction uncertainty. Fourth, real-life recommender systems suffer from severe bias amplification, thus an estimation without bias amplification is expected. Therefore we propose TPM for watch time prediction. Specifically, the ordinal ranks of watch time are introduced into TPM and the problem is decomposed into a series of conditionally dependent classification tasks which are organized into a tree structure. The expectation of watch time can be generated by traversing the tree and the variance of watch time predictions is explicitly introduced into the objective function as a measurement for uncertainty. Moreover, we illustrate that backdoor adjustment can be seamlessly incorporated into TPM, which alleviates bias amplification. Extensive offline evaluations have been conducted on public datasets and TPM has been deployed in the real-world video app Kuaishou, with over 300 million DAUs. The results indicate that TPM outperforms state-of-the-art approaches and indeed improves video consumption significantly.|准确预测观看时间对于提高用户在视频推荐系统中的参与度至关重要。为了实现这一点,观看时间预测框架应该满足四个特性: 第一,尽管观看时间是连续值,但它也是一个有序变量,其值之间的相对排序反映了用户偏好的差异。因此,序数关系应反映在观看时间预测中。其次,模型应当捕捉视频观看行为之间的条件依赖关系。例如,一个人必须看完一半的视频才能看完整个视频。第三,使用点估计对观看时间进行建模忽略了模型可能给出高度不确定结果的事实,这可能会导致推荐系统出现问题。因此,框架应该意识到预测的不确定性。第四,现实生活中的推荐系统遭受严重的偏差放大,因此期望得到没有偏差放大的估计。因此,我们提出 TPM 来预测观看时间。具体而言,TPM 引入了观看时间的序数等级,并将问题分解为一系列条件相关的分类任务,这些任务被组织成一个树形结构。通过遍历该树可以产生观看时间的期望值,并且在目标函数中明确地引入观看时间预测的方差作为不确定性的度量。此外,我们说明后门调整可以无缝地纳入 TPM,从而减轻偏差放大。我们在公共数据集上进行了广泛的离线评估,并已将 TPM 部署在拥有超过 3 亿 DAU 的真实视频应用快手中。结果表明,TPM 优于最先进的方法,确实显著提高了视频消费。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Tree+based+Progressive+Regression+Model+for+Watch-Time+Prediction+in+Short-video+Recommendation)|0|
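TPM's tree decomposition is easy to see on a toy case: with four ordinal watch-time buckets arranged as a depth-2 binary tree, each internal node is a conditional classifier, and the expectation (and a variance term for uncertainty) falls out of a traversal. A sketch with assumed bucket values and hand-set node probabilities, not the deployed model:

```python
import numpy as np

# Four ordinal watch-time buckets (toy values, in seconds).
bucket_values = np.array([5.0, 15.0, 40.0, 90.0])

def expected_watch_time(p_root_right, p_left_right, p_right_right):
    """Each argument is p(go right | reached that node); multiplying the
    conditional probabilities along root-to-leaf paths gives the bucket
    distribution, from which mean and variance follow."""
    p = np.array([
        (1 - p_root_right) * (1 - p_left_right),   # leaf 0
        (1 - p_root_right) * p_left_right,         # leaf 1
        p_root_right * (1 - p_right_right),        # leaf 2
        p_root_right * p_right_right,              # leaf 3
    ])
    mean = (p * bucket_values).sum()
    var = (p * (bucket_values - mean) ** 2).sum()  # uncertainty measure
    return mean, var

print(expected_watch_time(0.7, 0.4, 0.6))
```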
|[HUGE: Huge Unsupervised Graph Embeddings with TPUs](https://doi.org/10.1145/3580305.3599840)|Brandon A. Mayer, Anton Tsitsulin, Hendrik Fichtenberger, Jonathan Halcrow, Bryan Perozzi|Google Research|Graphs are a representation of structured data that captures the relationships between sets of objects. With the ubiquity of available network data, there is increasing industrial and academic need to quickly analyze graphs with billions of nodes and trillions of edges. A common first step for network understanding is Graph Embedding, the process of creating a continuous representation of nodes in a graph. A continuous representation is often more amenable, especially at scale, for solving downstream machine learning tasks such as classification, link prediction, and clustering. A high-performance graph embedding architecture leveraging Tensor Processing Units (TPUs) with configurable amounts of high-bandwidth memory is presented that simplifies the graph embedding problem and can scale to graphs with billions of nodes and trillions of edges. We verify the embedding space quality on real and synthetic large-scale datasets.|图是结构化数据的一种表示,它捕捉对象集合之间的关系。随着可用网络数据的普及,工业界和学术界越来越需要快速分析具有数十亿个节点和数万亿条边的图。网络理解的一个常见的第一步是图嵌入,即在图中创建连续的节点表示的过程。连续表示通常更适合于解决下游机器学习任务,如分类、链接预测和聚类,尤其是在大规模场景下。本文提出了一种利用张量处理单元(TPU)和可配置的高带宽存储器构成的高性能图嵌入体系结构,简化了图嵌入问题,并且可以扩展到具有数十亿个节点和数万亿条边的图。我们在实际和合成的大规模数据集上验证了嵌入空间的质量。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=HUGE:+Huge+Unsupervised+Graph+Embeddings+with+TPUs)|0|
-|[Revisiting Personalized Federated Learning: Robustness Against Backdoor Attacks](https://doi.org/10.1145/3580305.3599898)|Zeyu Qin, Liuyi Yao, Daoyuan Chen, Yaliang Li, Bolin Ding, Minhao Cheng|Alibaba Group; Hong Kong University of Science and Technology|In this work, besides improving prediction accuracy, we study whether personalization could bring robustness benefits against backdoor attacks. We conduct the first study of backdoor attacks in the pFL framework, testing 4 widely used backdoor attacks against 6 pFL methods on benchmark datasets FEMNIST and CIFAR-10, a total of 600 experiments. The study shows that pFL methods with partial model-sharing can significantly boost robustness against backdoor attacks. In contrast, pFL methods with full model-sharing do not show robustness. To analyze the reasons for varying robustness performances, we provide comprehensive ablation studies on different pFL methods. Based on our findings, we further propose a lightweight defense method, Simple-Tuning, which empirically improves defense performance against backdoor attacks. We believe that our work could both provide guidance for pFL applications in terms of robustness and offer valuable insights for designing more robust FL methods in the future.|在这项工作中,除了提高预测的准确性,我们还研究个性化能否带来对后门攻击的鲁棒性收益。我们在 pFL 框架中进行了后门攻击的首次研究,在基准数据集 FEMNIST 和 CIFAR-10 上针对 6 种 pFL 方法测试了 4 种广泛使用的后门攻击,共计 600 个实验。研究表明,部分模型共享的 pFL 方法可以显著提高对后门攻击的鲁棒性。相比之下,完全模型共享的 pFL 方法不具有鲁棒性。为了分析鲁棒性表现差异的原因,我们对不同的 pFL 方法进行了全面的消融研究。在此基础上,我们进一步提出了一种轻量级的防御方法——简单调整(Simple-Tuning),该方法在实验上提高了对后门攻击的防御性能。我们相信,我们的工作既可以在鲁棒性方面为 pFL 的应用提供指导,也可以为未来设计更鲁棒的 FL 方法提供有价值的见解。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Revisiting+Personalized+Federated+Learning:+Robustness+Against+Backdoor+Attacks)|0|
+|[Revisiting Personalized Federated Learning: Robustness Against Backdoor Attacks](https://doi.org/10.1145/3580305.3599898)|Zeyu Qin, Liuyi Yao, Daoyuan Chen, Yaliang Li, Bolin Ding, Minhao Cheng|Hong Kong University of Science and Technology; Alibaba Group|In this work, besides improving prediction accuracy, we study whether personalization could bring robustness benefits against backdoor attacks. We conduct the first study of backdoor attacks in the pFL framework, testing 4 widely used backdoor attacks against 6 pFL methods on benchmark datasets FEMNIST and CIFAR-10, a total of 600 experiments. The study shows that pFL methods with partial model-sharing can significantly boost robustness against backdoor attacks. In contrast, pFL methods with full model-sharing do not show robustness. To analyze the reasons for varying robustness performances, we provide comprehensive ablation studies on different pFL methods. Based on our findings, we further propose a lightweight defense method, Simple-Tuning, which empirically improves defense performance against backdoor attacks. We believe that our work could both provide guidance for pFL applications in terms of robustness and offer valuable insights for designing more robust FL methods in the future.|在这项工作中,除了提高预测的准确性,我们还研究个性化能否带来对后门攻击的鲁棒性收益。我们在 pFL 框架中进行了后门攻击的首次研究,在基准数据集 FEMNIST 和 CIFAR-10 上针对 6 种 pFL 方法测试了 4 种广泛使用的后门攻击,共计 600 个实验。研究表明,部分模型共享的 pFL 方法可以显著提高对后门攻击的鲁棒性。相比之下,完全模型共享的 pFL 方法不具有鲁棒性。为了分析鲁棒性表现差异的原因,我们对不同的 pFL 方法进行了全面的消融研究。在此基础上,我们进一步提出了一种轻量级的防御方法——简单调整(Simple-Tuning),该方法在实验上提高了对后门攻击的防御性能。我们相信,我们的工作既可以在鲁棒性方面为 pFL 的应用提供指导,也可以为未来设计更鲁棒的 FL 方法提供有价值的见解。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Revisiting+Personalized+Federated+Learning:+Robustness+Against+Backdoor+Attacks)|0|
|[Joint Optimization of Ranking and Calibration with Contextualized Hybrid Model](https://doi.org/10.1145/3580305.3599851)|XiangRong Sheng, Jingyue Gao, Yueyao Cheng, Siran Yang, Shuguang Han, Hongbo Deng, Yuning Jiang, Jian Xu, Bo Zheng|Alibaba Group|Despite the development of ranking optimization techniques, the pointwise model remains the dominating approach for click-through rate (CTR) prediction. It can be attributed to the calibration ability of the pointwise model since the prediction can be viewed as the click probability. In practice, a CTR prediction model is also commonly assessed with the ranking ability, for which prediction models based on ranking losses (e.g., pairwise or listwise loss) usually achieve better performances than the pointwise loss. Previous studies have experimented with a direct combination of the two losses to obtain the benefit from both losses and observed an improved performance. However, previous studies break the interpretation of the output logit as the click-through rate, which may lead to sub-optimal solutions. To address this issue, we propose an approach that can Jointly optimize the Ranking and Calibration abilities (JRC for short). JRC improves the ranking ability by contrasting the logit value for the sample with different labels and constrains the predicted probability to be a function of the logit subtraction. We further show that JRC consolidates the interpretation of logits, where the logits model the joint distribution. With such an interpretation, we prove that JRC approximately optimizes the contextualized hybrid discriminative-generative objective. Experiments on public and industrial datasets and online A/B testing show that our approach improves both ranking and calibration abilities. Since May 2022, JRC has been deployed on the display advertising platform of Alibaba and has obtained significant performance improvements.|尽管排序优化技术不断发展,逐点模型仍然是点击率预测的主要方法。这可以归因于逐点模型的校准能力,因为预测可以被视为点击概率。在实践中,CTR 预测模型通常也用排序能力来评估,其中基于排序损失的预测模型(例如,成对损失或列表损失)通常比逐点损失的预测模型获得更好的性能。以前的研究已经试验了两种损失的直接组合,以获得两种损失的收益,并观察到改善的性能。然而,以前的研究打破了 logit 作为点击率的解释,这可能导致次优解。为了解决这个问题,我们提出了一种可以联合优化排序和校准能力的方法(简称 JRC)。JRC 通过对比不同标签样本的 logit 值来提高排序能力,并将预测概率约束为 logit 减法的函数。我们进一步表明,JRC 巩固了对 logit 的解释,即 logit 建模联合分布。通过这样的解释,我们证明了 JRC 近似地优化了上下文混合判别-生成目标。在公共和工业数据集上的实验和在线 A/B 测试表明,该方法提高了排序和校准能力。自2022年5月起,JRC 已被部署在阿里巴巴的展示广告平台上,并取得显著的性能改善。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Joint+Optimization+of+Ranking+and+Calibration+with+Contextualized+Hybrid+Model)|0|
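The general pattern of jointly optimizing ranking and calibration is a hybrid loss: a pointwise log-loss keeps the output interpretable as a probability, while a pairwise term contrasts positives against negatives. The sketch below is a generic stand-in for that pattern, not JRC's exact formulation (which constrains the probability to be a function of a logit subtraction):

```python
import torch
import torch.nn.functional as F

def hybrid_loss(logits, labels, alpha=0.5):
    """Calibration + ranking hybrid (assumes the batch contains at least
    one positive and one negative sample)."""
    # Calibration: standard pointwise BCE keeps sigmoid(logit) ~ CTR.
    cal = F.binary_cross_entropy_with_logits(logits, labels)
    # Ranking: -log sigmoid(s_pos - s_neg) over all pos/neg pairs.
    pos, neg = logits[labels == 1], logits[labels == 0]
    diff = pos.unsqueeze(1) - neg.unsqueeze(0)
    rank = F.softplus(-diff).mean()
    return alpha * cal + (1 - alpha) * rank

print(hybrid_loss(torch.randn(8), (torch.rand(8) > 0.5).float()))
```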
|[Workplace Recommendation with Temporal Network Objectives](https://doi.org/10.1145/3580305.3599932)|Kiran Tomlinson, Jennifer Neville, Longqi Yang, Mengting Wan, Cao Lu||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Workplace+Recommendation+with+Temporal+Network+Objectives)|0|
|[Experimentation Platforms Meet Reinforcement Learning: Bayesian Sequential Decision-Making for Continuous Monitoring](https://doi.org/10.1145/3580305.3599818)|Runzhe Wan, Yu Liu, James McQueen, Doug Hains, Rui Song|Amazon|With the growing needs of online A/B testing to support the innovation in industry, the opportunity cost of running an experiment becomes non-negligible. Therefore, there is an increasing demand for an efficient continuous monitoring service that allows early stopping when appropriate. Classic statistical methods focus on hypothesis testing and are mostly developed for traditional high-stake problems such as clinical trials, while experiments at online service companies typically have very different features and focuses. Motivated by the real needs, in this paper, we introduce a novel framework that we developed in Amazon to maximize customer experience and control opportunity cost. We formulate the problem as a Bayesian optimal sequential decision making problem that has a unified utility function. We discuss extensively practical design choices and considerations. We further introduce how to solve the optimal decision rule via Reinforcement Learning and scale the solution. We show the effectiveness of this novel approach compared with existing methods via a large-scale meta-analysis on experiments in Amazon.|随着支持行业创新的在线 A/B 测试需求的不断增长,运行一个实验的机会成本变得不可忽视。因此,人们越来越需要一种有效的连续监测服务,以便能够在适当的时候提早停止。经典的统计方法侧重于假设检验,主要针对传统的高风险问题,如临床试验,而在线服务公司的实验通常具有非常不同的特点和重点。本文从实际需求出发,介绍了我们在亚马逊开发的一个新的框架,以最大限度地提高客户体验和控制机会成本。我们将该问题表示为一个具有统一效用函数的贝叶斯最优序贯决策问题。我们广泛讨论实用的设计选择和考虑因素。我们进一步介绍了如何通过强化学习求解最优决策规则并对该解进行扩展。我们通过对亚马逊上的实验进行大规模的荟萃分析,证明了这种新方法与现有方法相比的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Experimentation+Platforms+Meet+Reinforcement+Learning:+Bayesian+Sequential+Decision-Making+for+Continuous+Monitoring)|0|
@@ -141,16 +141,16 @@
|[Fresh Content Needs More Attention: Multi-funnel Fresh Content Recommendation](https://doi.org/10.1145/3580305.3599826)|Jianling Wang, Haokai Lu, Sai Zhang, Bart N. Locanthi, Haoting Wang, Dylan Greaves, Benjamin Lipshitz, Sriraj Badam, Ed H. Chi, Cristos J. Goodrow, SuLin Wu, Lexi Baugher, Minmin Chen|Google|Recommendation systems serve as a conduit connecting users to an incredibly large, diverse and ever-growing collection of content. In practice, missing information on fresh (and tail) contents needs to be filled in order for them to be exposed and discovered by their audience. Here we share our success stories in building a dedicated fresh content recommendation stack on a large commercial platform. To nominate fresh contents, we built a multi-funnel nomination system that combines (i) a two-tower model with strong generalization power for coverage, and (ii) a sequence model with near real-time update on user feedback for relevance. The multi-funnel setup effectively balances between coverage and relevance. An in-depth study uncovers the relationship between user activity level and their proximity toward fresh contents, which further motivates a contextual multi-funnel setup. Nominated fresh candidates are then scored and ranked by systems considering prediction uncertainty to further bootstrap content with less exposure. We evaluate the benefits of the dedicated fresh content recommendation stack, and the multi-funnel nomination system in particular, through user-corpus co-diverted live experiments. We conduct multiple rounds of live experiments on a commercial platform serving billions of users demonstrating efficacy of our proposed methods.|推荐系统作为一个管道,将用户连接到一个极其庞大、多样化和不断增长的内容集合。在实践中,需要填补关于新鲜(和尾部)内容的缺失信息,以便它们能够被受众看到和发现。我们在这里分享我们在一个大型商业平台上建立专门的新鲜内容推荐堆栈的成功故事。为了提名新鲜内容,我们建立了一个多漏斗提名系统,该系统结合了(i)一个具有很强覆盖泛化能力的双塔模型和(ii)一个基于用户反馈近实时更新、保证相关性的序列模型。多漏斗设置有效地平衡了覆盖率和相关性。深入的研究揭示了用户活动水平与其接近新鲜内容之间的关系,进一步激发了上下文多漏斗设置。提名的新鲜候选内容随后由考虑预测不确定性的系统进行打分和排名,以进一步扶持曝光较少的内容。我们通过用户语料库共转向的现场实验,评估了专用新鲜内容推荐堆栈,特别是多漏斗提名系统的优点。我们在一个为数十亿用户服务的商业平台上进行多轮实验,证明我们提出的方法的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Fresh+Content+Needs+More+Attention:+Multi-funnel+Fresh+Content+Recommendation)|0|
|[Contrastive Learning of Stress-specific Word Embedding for Social Media based Stress Detection](https://doi.org/10.1145/3580305.3599795)|Xin Wang, Huijun Zhang, Lei Cao, Kaisheng Zeng, Qi Li, Ningyun Li, Ling Feng||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Contrastive+Learning+of+Stress-specific+Word+Embedding+for+Social+Media+based+Stress+Detection)|0|
|[RLTP: Reinforcement Learning to Pace for Delayed Impression Modeling in Preloaded Ads](https://doi.org/10.1145/3580305.3599900)|Penghui Wei, Yongqiang Chen, Shaoguo Liu, Liang Wang, Bo Zheng|Alibaba Group|To increase brand awareness, many advertisers conclude contracts with advertising platforms to purchase traffic and then deliver advertisements to target audiences. In a whole delivery period, advertisers usually desire a certain impression count for the ads, and they also expect that the delivery performance is as good as possible (e.g., obtaining high click-through rate). Advertising platforms employ pacing algorithms to satisfy the demands via adjusting the selection probabilities to traffic requests in real-time. However, the delivery procedure is also affected by the strategies from publishers, which cannot be controlled by advertising platforms. Preloading is a widely used strategy for many types of ads (e.g., video ads) to make sure that the response time for displaying after a traffic request is legitimate, which results in the delayed impression phenomenon. Traditional pacing algorithms cannot handle the preloading nature well because they rely on immediate feedback signals, and may fail to guarantee the demands from advertisers. In this paper, we focus on a new research problem of impression pacing for preloaded ads, and propose a Reinforcement Learning To Pace framework, RLTP. It learns a pacing agent that sequentially produces selection probabilities in the whole delivery period. To jointly optimize the two objectives of impression count and delivery performance, RLTP employs a tailored reward estimator to satisfy the guaranteed impression count, penalize over-delivery, and maximize the traffic value. Experiments on large-scale industrial datasets verify that RLTP outperforms baseline pacing algorithms by a large margin. We have deployed the RLTP framework online on our advertising platform, and results show that it achieves significant uplift to core metrics including delivery completion rate and click-through rate.|为了提高品牌知名度,许多广告商与广告平台签订合同,购买流量,然后向目标受众投放广告。在整个投放期间,广告商通常希望广告达到一定的展示量,而且他们也希望投放的效果尽可能好(例如,获得较高的点击率)。广告平台采用节奏算法,通过实时调整对流量请求的选择概率来满足需求。然而,投放过程也受到发布方策略的影响,而发布方策略不受广告平台的控制。预加载是许多类型广告(如视频广告)广泛使用的策略,用于确保流量请求之后的展示响应时间合规,但这导致了延迟展示现象。传统的节奏算法不能很好地处理预加载特性,因为它们依赖于即时反馈信号,可能无法保证来自广告商的需求。在这篇文章中,我们关注一个新的研究问题——预加载广告的展示节奏控制,并提出了一个强化学习节奏控制框架 RLTP。它学习一个节奏控制智能体,在整个投放期间顺序地产生选择概率。为了共同优化展示量和投放效果这两个目标,RLTP 使用定制的奖励估计器来满足保证的展示量,惩罚超额投放并最大化流量价值。在大规模工业数据集上的实验证明,RLTP 的性能大幅优于基线节奏算法。我们已经在我们的广告平台上在线部署了 RLTP 框架,结果显示它实现了包括投放完成率和点击率在内的核心指标的显著提升。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=RLTP:+Reinforcement+Learning+to+Pace+for+Delayed+Impression+Modeling+in+Preloaded+Ads)|0|
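The three ingredients of RLTP's reward (hit the guaranteed impression count, penalize over-delivery, credit traffic value) can be illustrated with a toy shaping function; the coefficients and functional form below are our own illustrative assumptions, not the paper's estimator:

```python
def pacing_reward(impressions, target, traffic_value, over_penalty=2.0):
    """Toy reward for a pacing agent: positive traffic value, minus the
    shortfall against the guaranteed impression count, minus a steeper
    penalty for over-delivery."""
    shortfall = max(target - impressions, 0)
    overshoot = max(impressions - target, 0)
    return traffic_value - shortfall - over_penalty * overshoot

# Under-delivery and over-delivery are both penalized, the latter more.
print(pacing_reward(90, 100, traffic_value=50.0))   # 40.0
print(pacing_reward(110, 100, traffic_value=50.0))  # 30.0
```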
-|[Multi-channel Integrated Recommendation with Exposure Constraints](https://doi.org/10.1145/3580305.3599868)|Yue Xu, Qijie Shen, Jianwen Yin, Zengde Deng, Dimin Wang, Hao Chen, Lixiang Lai, Tao Zhuang, Junfeng Ge|Alibaba Group.; Cainiao Network.; The Hong Kong Polytechnic University.|Integrated recommendation, which aims at jointly recommending heterogeneous items from different channels in a main feed, has been widely applied to various online platforms. Though attractive, integrated recommendation requires the ranking methods to migrate from conventional user-item models to the new user-channel-item paradigm in order to better capture users' preferences on both item and channel levels. Moreover, practical feed recommendation systems usually impose exposure constraints on different channels to ensure user experience. This leads to greater difficulty in the joint ranking of heterogeneous items. In this paper, we investigate the integrated recommendation task with exposure constraints in practical recommender systems. Our contribution is four-fold. First, we formulate this task as a binary online linear programming problem and propose a two-layer framework named Multi-channel Integrated Recommendation with Exposure Constraints (MIREC) to obtain the optimal solution. Second, we propose an efficient online allocation algorithm to determine the optimal exposure assignment of different channels from a global view of all user requests over the entire time horizon. We prove that this algorithm reaches the optimal point under a regret bound of $\mathcal{O}(\sqrt{T})$ with linear complexity. Third, we propose a series of collaborative models to determine the optimal layout of heterogeneous items at each user request. The joint modeling of user interests, cross-channel correlation, and page context in our models aligns more with the browsing nature of feed products than existing models. Finally, we conduct extensive experiments on both offline datasets and online A/B tests to verify the effectiveness of MIREC. The proposed framework has now been implemented on the homepage of Taobao to serve the main traffic.|综合推荐是指在一个主信息流中联合推荐来自不同渠道的异构项目,已广泛应用于各种在线平台。虽然综合推荐具有吸引力,但是它需要排序方法从传统的用户-项目模型迁移到新的用户-渠道-项目范式,以便更好地捕捉用户在项目和渠道级别上的偏好。此外,实际的信息流推荐系统通常对不同的渠道施加曝光约束,以确保用户体验。这导致了异构项目联合排序的更大困难。本文研究了实际推荐系统中具有曝光约束的集成推荐任务。我们的贡献有四个方面。首先,我们将这个任务表述为一个二元在线线性规划问题,并提出一个名为多渠道曝光约束综合推荐(MIREC)的两层框架来获得最优解。其次,我们提出了一个有效的在线分配算法,从所有用户请求在整个时间范围内的全局视角来确定不同渠道的最佳曝光分配。我们证明了该算法能以线性复杂度在 $\mathcal{O}(\sqrt{T})$ 的遗憾界下达到最优点。第三,我们提出了一系列的协作模型,以确定每个用户请求下异构项目的最佳布局。在我们的模型中,用户兴趣、跨渠道相关性和页面上下文的联合建模比现有模型更符合信息流产品的浏览特性。最后,我们对离线数据集和在线 A/B 测试进行了广泛的实验,以验证 MIREC 的有效性。该框架现已在淘宝网的主页上实施,以服务于主要流量。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Multi-channel+Integrated+Recommendation+with+Exposure+Constraints)|0|
+|[Multi-channel Integrated Recommendation with Exposure Constraints](https://doi.org/10.1145/3580305.3599868)|Yue Xu, Qijie Shen, Jianwen Yin, Zengde Deng, Dimin Wang, Hao Chen, Lixiang Lai, Tao Zhuang, Junfeng Ge|Cainiao Network.; The Hong Kong Polytechnic University.; Alibaba Group.|Integrated recommendation, which aims at jointly recommending heterogeneous items from different channels in a main feed, has been widely applied to various online platforms. Though attractive, integrated recommendation requires the ranking methods to migrate from conventional user-item models to the new user-channel-item paradigm in order to better capture users' preferences on both item and channel levels. Moreover, practical feed recommendation systems usually impose exposure constraints on different channels to ensure user experience. This leads to greater difficulty in the joint ranking of heterogeneous items. In this paper, we investigate the integrated recommendation task with exposure constraints in practical recommender systems. Our contribution is four-fold. First, we formulate this task as a binary online linear programming problem and propose a two-layer framework named Multi-channel Integrated Recommendation with Exposure Constraints (MIREC) to obtain the optimal solution. Second, we propose an efficient online allocation algorithm to determine the optimal exposure assignment of different channels from a global view of all user requests over the entire time horizon. We prove that this algorithm reaches the optimal point under a regret bound of $\mathcal{O}(\sqrt{T})$ with linear complexity. Third, we propose a series of collaborative models to determine the optimal layout of heterogeneous items at each user request. The joint modeling of user interests, cross-channel correlation, and page context in our models aligns more with the browsing nature of feed products than existing models. Finally, we conduct extensive experiments on both offline datasets and online A/B tests to verify the effectiveness of MIREC. The proposed framework has now been implemented on the homepage of Taobao to serve the main traffic.|综合推荐是指在一个主信息流中联合推荐来自不同渠道的异构项目,已广泛应用于各种在线平台。虽然综合推荐具有吸引力,但是它需要排序方法从传统的用户-项目模型迁移到新的用户-渠道-项目范式,以便更好地捕捉用户在项目和渠道级别上的偏好。此外,实际的信息流推荐系统通常对不同的渠道施加曝光约束,以确保用户体验。这导致了异构项目联合排序的更大困难。本文研究了实际推荐系统中具有曝光约束的集成推荐任务。我们的贡献有四个方面。首先,我们将这个任务表述为一个二元在线线性规划问题,并提出一个名为多渠道曝光约束综合推荐(MIREC)的两层框架来获得最优解。其次,我们提出了一个有效的在线分配算法,从所有用户请求在整个时间范围内的全局视角来确定不同渠道的最佳曝光分配。我们证明了该算法能以线性复杂度在 $\mathcal{O}(\sqrt{T})$ 的遗憾界下达到最优点。第三,我们提出了一系列的协作模型,以确定每个用户请求下异构项目的最佳布局。在我们的模型中,用户兴趣、跨渠道相关性和页面上下文的联合建模比现有模型更符合信息流产品的浏览特性。最后,我们对离线数据集和在线 A/B 测试进行了广泛的实验,以验证 MIREC 的有效性。该框架现已在淘宝网的主页上实施,以服务于主要流量。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Multi-channel+Integrated+Recommendation+with+Exposure+Constraints)|0|
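Exposure-constrained online allocation is commonly handled in a primal-dual fashion: each channel carries a dual price, a request goes to the channel maximizing value minus price, and prices rise for channels consuming faster than their pace. The sketch below illustrates that generic pattern, not MIREC's specific algorithm or its regret analysis:

```python
import numpy as np

def online_allocate(requests, capacity, lr=0.05):
    """Toy primal-dual allocation: `requests` is [T, K] per-channel values,
    `capacity` is the per-channel exposure budget over the horizon."""
    k = len(capacity)
    price = np.zeros(k)
    used = np.zeros(k)
    for values in requests:
        c = int(np.argmax(values - price))      # price-adjusted best channel
        used[c] += 1
        # Raise prices of channels running ahead of their budget share.
        price += lr * (used / max(used.sum(), 1) - capacity / capacity.sum())
        price = np.maximum(price, 0.0)
    return used

rng = np.random.default_rng(0)
print(online_allocate(rng.random((1000, 3)), np.array([500.0, 300.0, 200.0])))
```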
|[Interactive Generalized Additive Model and Its Applications in Electric Load Forecasting](https://doi.org/10.1145/3580305.3599848)|Linxiao Yang, Rui Ren, Xinyue Gu, Liang Sun||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Interactive+Generalized+Additive+Model+and+Its+Applications+in+Electric+Load+Forecasting)|0|
|[UA-FedRec: Untargeted Attack on Federated News Recommendation](https://doi.org/10.1145/3580305.3599923)|Jingwei Yi, Fangzhao Wu, Bin Zhu, Jing Yao, Zhulin Tao, Guangzhong Sun, Xing Xie|Microsoft Research Asia; University of Science and Technology of China|News recommendation is critical for personalized news distribution. Federated news recommendation enables collaborative model learning from many clients without sharing their raw data. It is promising for privacy-preserving news recommendation. However, the security of federated news recommendation is still unclear. In this paper, we study this problem by proposing an untargeted attack called UA-FedRec. By exploiting the prior knowledge of news recommendation and federated learning, UA-FedRec can effectively degrade the model performance with a small percentage of malicious clients. First, the effectiveness of news recommendation highly depends on user modeling and news modeling. We design a news similarity perturbation method to make representations of similar news farther and those of dissimilar news closer to interrupt news modeling, and propose a user model perturbation method to make malicious user updates in opposite directions of benign updates to interrupt user modeling. Second, updates from different clients are typically aggregated by weighted-averaging based on their sample sizes. We propose a quantity perturbation method to enlarge sample sizes of malicious clients in a reasonable range to amplify the impact of malicious updates. Extensive experiments on two real-world datasets show that UA-FedRec can effectively degrade the accuracy of existing federated news recommendation methods, even when defense is applied. Our study reveals a critical security issue in existing federated news recommendation systems and calls for research efforts to address the issue.|新闻推荐对个性化新闻发布至关重要。联邦新闻推荐使得许多客户端能够在不共享原始数据的情况下进行协作模型学习。它对于保护隐私的新闻推荐来说是很有前途的。然而,联邦新闻推荐的安全性仍不清楚。在本文中,我们通过提出一种称为 UA-FedRec 的非目标攻击来研究这个问题。通过利用新闻推荐和联邦学习的先验知识,UA-FedRec 能够以小比例的恶意客户端有效地降低模型性能。首先,新闻推荐的有效性很大程度上取决于用户建模和新闻建模。我们设计了一种新闻相似性扰动方法,使相似新闻的表示更远、不相似新闻的表示更近,以干扰新闻建模;并提出了一种用户模型扰动方法,使恶意用户更新与良性更新方向相反,以干扰用户建模。其次,来自不同客户端的更新通常根据样本大小进行加权平均。我们提出了一种数量扰动方法,在合理的范围内扩大恶意客户端的样本量,以放大恶意更新的影响。在两个实际数据集上的大量实验表明,即使在采用防御的情况下,UA-FedRec 也能有效降低现有联邦新闻推荐方法的准确性。我们的研究揭示了现有联邦新闻推荐系统中的一个关键安全问题,并呼吁研究人员努力解决这个问题。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=UA-FedRec:+Untargeted+Attack+on+Federated+News+Recommendation)|0|
|[Group-based Fraud Detection Network on e-Commerce Platforms](https://doi.org/10.1145/3580305.3599836)|Jianke Yu, Hanchen Wang, Xiaoyang Wang, Zhao Li, Lu Qin, Wenjie Zhang, Jian Liao, Ying Zhang||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Group-based+Fraud+Detection+Network+on+e-Commerce+Platforms)|0|
|[Commonsense Knowledge Graph towards Super APP and Its Applications in Alipay](https://doi.org/10.1145/3580305.3599791)|Xiaoling Zang, Binbin Hu, Jun Chu, Zhiqiang Zhang, Guannan Zhang, Jun Zhou, Wenliang Zhong||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Commonsense+Knowledge+Graph+towards+Super+APP+and+Its+Applications+in+Alipay)|0|
|[Revisiting Neural Retrieval on Accelerators](https://doi.org/10.1145/3580305.3599897)|Jiaqi Zhai, Zhaojie Gong, Yueming Wang, Xiao Sun, Zheng Yan, Fu Li, Xing Liu|Meta Platforms, Inc.|Retrieval finds a small number of relevant candidates from a large corpus for information retrieval and recommendation applications. A key component of retrieval is to model (user, item) similarity, which is commonly represented as the dot product of two learned embeddings. This formulation permits efficient inference, commonly known as Maximum Inner Product Search (MIPS). Despite its popularity, dot products cannot capture complex user-item interactions, which are multifaceted and likely high rank. We hence examine non-dot-product retrieval settings on accelerators, and propose \textit{mixture of logits} (MoL), which models (user, item) similarity as an adaptive composition of elementary similarity functions. This new formulation is expressive, capable of modeling high rank (user, item) interactions, and further generalizes to the long tail. When combined with a hierarchical retrieval strategy, \textit{h-indexer}, we are able to scale up MoL to 100M corpus on a single GPU with latency comparable to MIPS baselines. On public datasets, our approach leads to uplifts of up to 77.3\% in hit rate (HR). Experiments on a large recommendation surface at Meta showed strong metric gains and reduced popularity bias, validating the proposed approach's performance and improved generalization.|检索从一个大型语料库中为信息检索和推荐应用找到少量相关的候选项。检索的一个关键组成部分是对(用户,项目)相似度建模,它通常表示为两个学习到的嵌入的点积。这个公式允许高效的推理,通常称为最大内积搜索(MIPS)。尽管广受欢迎,点积无法捕捉复杂的用户-项目交互,这种交互是多方面的,且可能具有高秩。因此,我们研究了加速器上的非点积检索设置,并提出了 mixture of logits(MoL),它将(用户,项目)相似度建模为基本相似度函数的自适应组合。这个新的公式具有表现力,能够建模高秩的(用户,项目)交互,并进一步推广到长尾。当结合分层检索策略 h-indexer 时,我们能够在单个 GPU 上将 MoL 扩展到 1 亿规模的语料库,延迟与 MIPS 基线相当。在公共数据集上,我们的方法带来高达 77.3% 的命中率(HR)提升。在 Meta 的一个大型推荐面上进行的实验表明,该方法具有很强的指标增益和更小的流行度偏差,验证了该方法的性能和改进的泛化能力。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Revisiting+Neural+Retrieval+on+Accelerators)|0|
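The core of mixture of logits is that (user, item) similarity becomes a softmax-weighted combination of several elementary similarities instead of a single dot product. A minimal PyTorch sketch where the elementary similarities are dot products over embedding chunks and the gating network is a plain linear layer (both simplifying assumptions of ours):

```python
import torch
import torch.nn as nn

K, d = 4, 64
gate = nn.Linear(2 * d, K)  # gating network over [user; item] features

def mixture_of_logits(u, v):
    """Adaptive composition of K elementary dot-product similarities."""
    su = u.view(u.size(0), K, d // K)              # split into K chunks
    sv = v.view(v.size(0), K, d // K)
    comps = (su * sv).sum(-1)                      # [B, K] component logits
    w = torch.softmax(gate(torch.cat([u, v], -1)), dim=-1)  # adaptive weights
    return (w * comps).sum(-1)                     # [B] final similarity

print(mixture_of_logits(torch.randn(8, d), torch.randn(8, d)).shape)
```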
|[Constrained Social Community Recommendation](https://doi.org/10.1145/3580305.3599793)|Xingyi Zhang, Shuliang Xu, Wenqing Lin, Sibo Wang||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Constrained+Social+Community+Recommendation)|0|
-|[Modeling Dual Period-Varying Preferences for Takeaway Recommendation](https://doi.org/10.1145/3580305.3599866)|Yuting Zhang, Yiqing Wu, Ran Le, Yongchun Zhu, Fuzhen Zhuang, Ruidong Han, Xiang Li, Wei Lin, Zhulin An, Yongjun Xu|Institute of Artificial Intelligence, Beihang University; Institute of Computing Technology, Chinese Academy of Sciences; Meituan; Unaffiliated|Takeaway recommender systems, which aim to accurately provide stores that offer foods meeting users' interests, have served billions of users in our daily life. Different from traditional recommendation, takeaway recommendation faces two main challenges: (1) Dual Interaction-Aware Preference Modeling. Traditional recommendation commonly focuses on users' single preferences for items while takeaway recommendation needs to comprehensively consider users' dual preferences for stores and foods. (2) Period-Varying Preference Modeling. Conventional recommendation generally models continuous changes in users' preferences from a session-level or day-level perspective. However, in practical takeaway systems, users' preferences vary significantly during the morning, noon, night, and late night periods of the day. To address these challenges, we propose a Dual Period-Varying Preference modeling (DPVP) for takeaway recommendation. Specifically, we design a dual interaction-aware module, aiming to capture users' dual preferences based on their interactions with stores and foods. Moreover, to model various preferences in different time periods of the day, we propose a time-based decomposition module as well as a time-aware gating mechanism. Extensive offline and online experiments demonstrate that our model outperforms state-of-the-art methods on real-world datasets and it is capable of modeling the dual period-varying preferences. Moreover, our model has been deployed online on Meituan Takeaway platform, leading to an average improvement in GMV (Gross Merchandise Value) of 0.70%.|外卖推荐系统旨在准确地提供符合用户兴趣的食品商店,在我们的日常生活中已服务于数十亿用户。与传统的推荐不同,外卖推荐面临着两个主要挑战: (1)双交互感知偏好建模。传统的推荐方式通常侧重于用户对商品的单一偏好,而外卖推荐则需要全面考虑用户对商店和食品的双重偏好。(2)周期变化偏好建模。传统的推荐通常从会话级或日级的角度模拟用户偏好的持续变化。然而,在实际的外卖系统中,用户的偏好在一天中的早上、中午、晚上和深夜各不相同。为了应对这些挑战,我们提出了一个用于外卖推荐的双周期变化偏好模型(DPVP)。具体来说,我们设计了一个双交互感知模块,旨在根据用户与商店和食物的交互来捕捉他们的双重偏好。此外,为了模拟一天中不同时段的不同偏好,我们提出了一个基于时间的分解模块以及一个时间感知的门控机制。大量的离线和在线实验表明,我们的模型在现实世界数据集上优于最先进的方法,并能够建模双重的周期变化偏好。此外,我们的模型已经在美团外卖平台上进行了在线部署,带来平均 0.70% 的商品交易总额(GMV)提升。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Modeling+Dual+Period-Varying+Preferences+for+Takeaway+Recommendation)|0|
-|[JiuZhang 2.0: A Unified Chinese Pre-trained Language Model for Multi-task Mathematical Problem Solving](https://doi.org/10.1145/3580305.3599850)|Xin Zhao, Kun Zhou, Beichen Zhang, Zheng Gong, Zhipeng Chen, Yuanhang Zhou, JiRong Wen, Jing Sha, Shijin Wang, Cong Liu, Guoping Hu|iFLYTEK Research, State Key Laboratory of Cognitive Intelligence; iFLYTEK Research; Gaoling School of Artificial Intelligence, Renmin University of China; School of Information, Renmin University of China; iFLYTEK AI Research (Central China)|Although pre-trained language models (PLMs) have recently advanced the research progress in mathematical reasoning, they are not specially designed as a capable multi-task solver, suffering from high cost for multi-task deployment (e.g., a model copy for a task) and inferior performance on complex mathematical problems in practical applications. To address these issues, in this paper, we propose \textbf{JiuZhang 2.0}, a unified Chinese PLM specially for multi-task mathematical problem solving. Our idea is to maintain a moderate-sized model and employ the \emph{cross-task knowledge sharing} to improve the model capacity in a multi-task setting. Specially, we construct a Mixture-of-Experts (MoE) architecture for modeling mathematical text, so as to capture the common mathematical knowledge across tasks. For optimizing the MoE architecture, we design \emph{multi-task continual pre-training} and \emph{multi-task fine-tuning} strategies for multi-task adaptation. These training strategies can effectively decompose the knowledge from the task data and establish the cross-task sharing via expert networks. In order to further improve the general capacity of solving different complex tasks, we leverage large language models (LLMs) as complementary models to iteratively refine the generated solution by our PLM, via in-context learning. Extensive experiments have demonstrated the effectiveness of our model.|尽管预训练语言模型(PLM)最近推动了数学推理的研究进展,但是它们并没有被特别设计成一个有能力的多任务求解器,面临多任务部署的高成本(例如一个任务一份模型拷贝)以及在实际应用中复杂数学问题上的较差表现。为了解决这些问题,本文提出了一个专门用于多任务数学问题求解的统一中文 PLM——JiuZhang 2.0。我们的想法是维持一个中等规模的模型,并采用跨任务知识共享的方法来提高模型在多任务环境下的能力。特别地,我们构建了一个专家混合(MoE)体系结构,用于数学文本建模,以便跨任务获取常见的数学知识。为了优化 MoE 体系结构,我们设计了多任务连续预训练和多任务微调策略来实现多任务自适应。这些训练策略可以有效地分解任务数据中的知识,并通过专家网络建立跨任务共享。为了进一步提高解决不同复杂任务的能力,我们利用大语言模型(LLM)作为补充模型,通过上下文学习的方法,迭代地完善 PLM 生成的解决方案。大量的实验证明了我们模型的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=JiuZhang+2.0:+A+Unified+Chinese+Pre-trained+Language+Model+for+Multi-task+Mathematical+Problem+Solving)|0|
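The Mixture-of-Experts layer that underlies this kind of cross-task knowledge sharing can be sketched in a few lines: a router softly assigns each input to experts, and knowledge shared across tasks lives in the experts that many tasks route to. A generic PyTorch MoE, not JiuZhang's exact architecture:

```python
import torch
import torch.nn as nn

class MoELayer(nn.Module):
    """Minimal soft-routing Mixture-of-Experts layer."""
    def __init__(self, dim, n_experts=4):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
             for _ in range(n_experts)])

    def forward(self, x):                           # x: [B, dim]
        w = torch.softmax(self.router(x), dim=-1)   # [B, E] routing weights
        outs = torch.stack([e(x) for e in self.experts], dim=1)  # [B, E, dim]
        return (w.unsqueeze(-1) * outs).sum(dim=1)  # weighted expert mixture

print(MoELayer(32)(torch.randn(4, 32)).shape)
```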
Extensive experiments have demonstrated the effectiveness of our model.|尽管预先训练的语言模型 ~ (PLM)最近已经推动了数学推理的研究进展,但是它们并没有被特别设计成一个有能力的多任务解决者,因为多任务部署的高成本(例如一个任务的模型拷贝)和在实际应用中复杂数学问题的低表现。为了解决这些问题,本文提出了一个专门用于多任务数学问题求解的统一中文 PLM textbf {旧掌 ~ 2.0}。我们的想法是维持一个中等规模的模型,并采用跨任务知识共享的方法来提高模型在多任务环境下的能力。特别地,我们构建了一个专家混合模型,用于数学文本建模,以便跨任务获取常见的数学知识。为了优化教学体系结构,我们设计了多任务连续预训练和多任务微调策略来实现多任务自适应。这些训练策略可以有效地分解任务数据中的知识,并通过专家网络建立跨任务共享。为了进一步提高解决不同复杂任务的能力,我们利用大语言模型 ~ (LLM)作为补充模型,通过上下文学习的方法,迭代地完善 PLM 生成的解决方案。大量的实验证明了我们模型的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=JiuZhang+2.0:+A+Unified+Chinese+Pre-trained+Language+Model+for+Multi-task+Mathematical+Problem+Solving)|0| -|[ReLoop2: Building Self-Adaptive Recommendation Models via Responsive Error Compensation Loop](https://doi.org/10.1145/3580305.3599785)|Jieming Zhu, Guohao Cai, Junjie Huang, Zhenhua Dong, Ruiming Tang, Weinan Zhang|Huawei Noah’s Ark Lab; Shanghai Jiao Tong University|Industrial recommender systems face the challenge of operating in non-stationary environments, where data distribution shifts arise from evolving user behaviors over time. To tackle this challenge, a common approach is to periodically re-train or incrementally update deployed deep models with newly observed data, resulting in a continual training process. However, the conventional learning paradigm of neural networks relies on iterative gradient-based updates with a small learning rate, making it slow for large recommendation models to adapt. In this paper, we introduce ReLoop2, a self-correcting learning loop that facilitates fast model adaptation in online recommender systems through responsive error compensation. Inspired by the slow-fast complementary learning system observed in human brains, we propose an error memory module that directly stores error samples from incoming data streams. These stored samples are subsequently leveraged to compensate for model prediction errors during testing, particularly under distribution shifts. The error memory module is designed with fast access capabilities and undergoes continual refreshing with newly observed data samples during the model serving phase to support fast model adaptation. We evaluate the effectiveness of ReLoop2 on three open benchmark datasets as well as a real-world production dataset. 
The results demonstrate the potential of ReLoop2 in enhancing the responsiveness and adaptiveness of recommender systems operating in non-stationary environments.|工业推荐系统面临着在非平稳环境下运行的挑战,数据分布随着时间的推移而发生变化。为了应对这一挑战,一种常见的方法是定期用新观测数据重新训练或增量更新已部署的深度模型,从而形成持续的训练过程。然而,传统的神经网络学习范式依赖于迭代的基于梯度的更新,学习速度很小,使得大型推荐模型的适应速度变慢。本文介绍了 ReLoop2,一种通过响应误差补偿实现在线推荐系统中模型快速自适应的自校正学习循环。受到在人脑中观察到的慢-快互补学习系统的启发,我们提出了一个错误记忆模块,它直接存储来自输入数据流的错误样本。这些存储的样本随后被用来补偿测试期间的模型预测错误,特别是在分布变化的情况下。错误存储模块设计具有快速访问能力,并在模型服务阶段不断刷新新观察到的数据样本,以支持快速模型适应。我们评估了 ReLoop2在三个开放基准数据集和一个真实生产数据集上的有效性。结果表明,ReLoop2在提高非平稳环境中运行的推荐系统的响应能力和适应能力方面具有潜力。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ReLoop2:+Building+Self-Adaptive+Recommendation+Models+via+Responsive+Error+Compensation+Loop)|0| +|[Modeling Dual Period-Varying Preferences for Takeaway Recommendation](https://doi.org/10.1145/3580305.3599866)|Yuting Zhang, Yiqing Wu, Ran Le, Yongchun Zhu, Fuzhen Zhuang, Ruidong Han, Xiang Li, Wei Lin, Zhulin An, Yongjun Xu|Unaffiliated; Institute of Artificial Intelligence, Beihang University; Institute of Computing Technology, Chinese Academy of Sciences; Meituan|Takeaway recommender systems, which aim to accurately provide stores that offer foods meeting users' interests, have served billions of users in our daily life. Different from traditional recommendation, takeaway recommendation faces two main challenges: (1) Dual Interaction-Aware Preference Modeling. Traditional recommendation commonly focuses on users' single preferences for items while takeaway recommendation needs to comprehensively consider users' dual preferences for stores and foods. (2) Period-Varying Preference Modeling. Conventional recommendation generally models continuous changes in users' preferences from a session-level or day-level perspective. However, in practical takeaway systems, users' preferences vary significantly during the morning, noon, night, and late night periods of the day. To address these challenges, we propose a Dual Period-Varying Preference modeling (DPVP) for takeaway recommendation. Specifically, we design a dual interaction-aware module, aiming to capture users' dual preferences based on their interactions with stores and foods. Moreover, to model various preferences in different time periods of the day, we propose a time-based decomposition module as well as a time-aware gating mechanism. Extensive offline and online experiments demonstrate that our model outperforms state-of-the-art methods on real-world datasets and it is capable of modeling the dual period-varying preferences. 
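The time-aware gating that DPVP (the "Modeling Dual Period-Varying Preferences" entry above) uses to separate morning, noon, night, and late-night preferences can be sketched as a period-conditioned gate over a shared user representation. A toy NumPy version follows; the sigmoid gate, the linear map, and the four-way period table are illustrative assumptions rather than the paper's exact module.

```python
import numpy as np

PERIODS = {"morning": 0, "noon": 1, "night": 2, "late_night": 3}

def period_gated_preference(user_repr, period, period_emb, W_gate):
    """Scale a shared user representation by a gate conditioned on the day period."""
    z = W_gate @ period_emb[PERIODS[period]]   # (d,) gate logits for this period
    gate = 1.0 / (1.0 + np.exp(-z))            # element-wise sigmoid in (0, 1)
    return gate * user_repr                    # period-specific preference vector

d = 8
rng = np.random.default_rng(0)
emb = rng.normal(size=(4, d))                  # one embedding per period
print(period_gated_preference(rng.normal(size=d), "noon", emb, rng.normal(size=(d, d))))
```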
Moreover, our model has been deployed online on Meituan Takeaway platform, leading to an average improvement in GMV (Gross Merchandise Value) of 0.70%.|外卖推荐系统,旨在准确地提供商店,提供符合用户兴趣的食品,已服务于数十亿用户在我们的日常生活。与传统的推荐不同,外卖推荐面临着两个主要挑战: (1)双交互感知偏好建模。传统的推荐方式通常侧重于用户对商品的单一偏好,而外卖推荐方式则需要全面考虑用户对商店和食品的双重偏好。(变周期偏好模型。传统的推荐通常从会话级或日级的角度模拟用户偏好的持续变化。然而,在实际的外卖系统中,用户的偏好在白天的早上、中午、晚上和深夜各不相同。为了应对这些挑战,我们提出了一个外卖推荐的双周期变化偏好模型(DPVP)。具体来说,我们设计了一个双交互感知模块,旨在根据用户与商店和食物的交互来捕捉他们的双重偏好。此外,为了模拟一天中不同时段的不同偏好,我们提出了一个基于时间的分解模块以及一个时间感知的门控机制。大量的离线和在线实验表明,我们的模型优于现实世界数据集的最先进的方法,它能够建模的双周期变化的偏好。此外,我们的模型已经在美团外卖平台上进行了在线部署,导致平均商品总值(GMV)提高了0.70% 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Modeling+Dual+Period-Varying+Preferences+for+Takeaway+Recommendation)|0| +|[JiuZhang 2.0: A Unified Chinese Pre-trained Language Model for Multi-task Mathematical Problem Solving](https://doi.org/10.1145/3580305.3599850)|Xin Zhao, Kun Zhou, Beichen Zhang, Zheng Gong, Zhipeng Chen, Yuanhang Zhou, JiRong Wen, Jing Sha, Shijin Wang, Cong Liu, Guoping Hu|School of Information, Renmin University of China; iFLYTEK Research, State Key Laboratory of Cognitive Intelligence; iFLYTEK Research; Gaoling School of Artificial Intelligence, Renmin University of China; iFLYTEK AI Research (Central China|Although pre-trained language models~(PLMs) have recently advanced the research progress in mathematical reasoning, they are not specially designed as a capable multi-task solver, suffering from high cost for multi-task deployment (\eg a model copy for a task) and inferior performance on complex mathematical problems in practical applications. To address these issues, in this paper, we propose \textbf{JiuZhang~2.0}, a unified Chinese PLM specially for multi-task mathematical problem solving. Our idea is to maintain a moderate-sized model and employ the \emph{cross-task knowledge sharing} to improve the model capacity in a multi-task setting. Specially, we construct a Mixture-of-Experts~(MoE) architecture for modeling mathematical text, so as to capture the common mathematical knowledge across tasks. For optimizing the MoE architecture, we design \emph{multi-task continual pre-training} and \emph{multi-task fine-tuning} strategies for multi-task adaptation. These training strategies can effectively decompose the knowledge from the task data and establish the cross-task sharing via expert networks. In order to further improve the general capacity of solving different complex tasks, we leverage large language models~(LLMs) as complementary models to iteratively refine the generated solution by our PLM, via in-context learning. 
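JiuZhang 2.0's cross-task knowledge sharing (entry above) rests on a Mixture-of-Experts (MoE) layer. A minimal top-k router is sketched below; the expert count, top-k softmax routing, and linear experts are generic MoE assumptions, not the paper's exact architecture.

```python
import numpy as np

def moe_layer(x, experts, router, top_k=2):
    """Route input x (d,) to the top_k of E experts and mix their outputs with
    renormalized softmax gates, so tasks share only the experts they select."""
    scores = router @ x                        # (E,) routing logits
    top = np.argsort(scores)[-top_k:]
    w = np.exp(scores[top] - scores[top].max())
    w /= w.sum()
    return sum(wi * (experts[e] @ x) for wi, e in zip(w, top))

d, E = 16, 4
rng = np.random.default_rng(0)
print(moe_layer(rng.normal(size=d), rng.normal(size=(E, d, d)), rng.normal(size=(E, d))))
```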
Extensive experiments have demonstrated the effectiveness of our model.|尽管预训练语言模型(PLM)最近已经推动了数学推理的研究进展,但是它们并没有被特别设计成一个有能力的多任务解决者,原因在于多任务部署的高成本(例如一个任务一份模型拷贝)以及在实际应用中复杂数学问题上的较差表现。为了解决这些问题,本文提出了一个专门用于多任务数学问题求解的统一中文 PLM,即九章 2.0(JiuZhang 2.0)。我们的想法是维持一个中等规模的模型,并采用跨任务知识共享的方法来提高模型在多任务环境下的能力。特别地,我们构建了一个专家混合(MoE)体系结构,用于数学文本建模,以便跨任务获取常见的数学知识。为了优化 MoE 体系结构,我们设计了多任务连续预训练和多任务微调策略来实现多任务自适应。这些训练策略可以有效地分解任务数据中的知识,并通过专家网络建立跨任务共享。为了进一步提高解决不同复杂任务的能力,我们利用大语言模型(LLM)作为补充模型,通过上下文学习的方法,迭代地完善 PLM 生成的解决方案。大量的实验证明了我们模型的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=JiuZhang+2.0:+A+Unified+Chinese+Pre-trained+Language+Model+for+Multi-task+Mathematical+Problem+Solving)|0| +|[ReLoop2: Building Self-Adaptive Recommendation Models via Responsive Error Compensation Loop](https://doi.org/10.1145/3580305.3599785)|Jieming Zhu, Guohao Cai, Junjie Huang, Zhenhua Dong, Ruiming Tang, Weinan Zhang|Shanghai Jiao Tong University; Huawei Noah’s Ark Lab|Industrial recommender systems face the challenge of operating in non-stationary environments, where data distribution shifts arise from evolving user behaviors over time. To tackle this challenge, a common approach is to periodically re-train or incrementally update deployed deep models with newly observed data, resulting in a continual training process. However, the conventional learning paradigm of neural networks relies on iterative gradient-based updates with a small learning rate, making it slow for large recommendation models to adapt. In this paper, we introduce ReLoop2, a self-correcting learning loop that facilitates fast model adaptation in online recommender systems through responsive error compensation. Inspired by the slow-fast complementary learning system observed in human brains, we propose an error memory module that directly stores error samples from incoming data streams. These stored samples are subsequently leveraged to compensate for model prediction errors during testing, particularly under distribution shifts. The error memory module is designed with fast access capabilities and undergoes continual refreshing with newly observed data samples during the model serving phase to support fast model adaptation. We evaluate the effectiveness of ReLoop2 on three open benchmark datasets as well as a real-world production dataset.
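ReLoop2's responsive error compensation (entry above) can be sketched as a small key-value memory of recent residuals, queried by nearest neighbors at serving time. The FIFO refresh and brute-force k-NN lookup below are illustrative assumptions; a production system would use an approximate index.

```python
import numpy as np

class ErrorMemory:
    """Store recent (features, residual) pairs; compensate a new prediction with
    the mean residual of its nearest stored neighbors."""
    def __init__(self, capacity=10000):
        self.keys, self.vals, self.capacity = [], [], capacity

    def add(self, x, residual):                 # continual refresh (FIFO)
        self.keys.append(np.asarray(x)); self.vals.append(float(residual))
        if len(self.keys) > self.capacity:
            self.keys.pop(0); self.vals.pop(0)

    def compensate(self, x, y_pred, k=5):
        if not self.keys:
            return y_pred
        d = np.linalg.norm(np.stack(self.keys) - x, axis=1)
        nn = np.argsort(d)[:k]                  # exact k-NN here; ANN in practice
        return y_pred + float(np.mean([self.vals[i] for i in nn]))
```

The point of the design is that the memory adapts instantly (one append per observed error) while the slow gradient-trained model catches up in the background.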
The results demonstrate the potential of ReLoop2 in enhancing the responsiveness and adaptiveness of recommender systems operating in non-stationary environments.|工业推荐系统面临着在非平稳环境下运行的挑战,数据分布随着时间的推移而发生变化。为了应对这一挑战,一种常见的方法是定期用新观测数据重新训练或增量更新已部署的深度模型,从而形成持续的训练过程。然而,传统的神经网络学习范式依赖于以较小学习率进行的迭代梯度更新,使得大型推荐模型的适应速度变慢。本文介绍了 ReLoop2,一种通过响应误差补偿实现在线推荐系统中模型快速自适应的自校正学习循环。受到在人脑中观察到的慢-快互补学习系统的启发,我们提出了一个错误记忆模块,它直接存储来自输入数据流的错误样本。这些存储的样本随后被用来补偿测试期间的模型预测错误,特别是在分布变化的情况下。错误存储模块设计具有快速访问能力,并在模型服务阶段不断刷新新观察到的数据样本,以支持快速模型适应。我们评估了 ReLoop2在三个开放基准数据集和一个真实生产数据集上的有效性。结果表明,ReLoop2在提高非平稳环境中运行的推荐系统的响应能力和适应能力方面具有潜力。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ReLoop2:+Building+Self-Adaptive+Recommendation+Models+via+Responsive+Error+Compensation+Loop)|0| |[Trustworthy Recommender Systems: Foundations and Frontiers](https://doi.org/10.1145/3580305.3599575)|Wenqi Fan, Xiangyu Zhao, Lin Wang, Xiao Chen, Jingtong Gao, Qidong Liu, Shijie Wang||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Trustworthy+Recommender+Systems:+Foundations+and+Frontiers)|0| |[Mining Electronic Health Records for Real-World Evidence](https://doi.org/10.1145/3580305.3599566)|Chengxi Zang, Weishen Pan, Fei Wang|Department of Clinical Pharmacy and Toxicology, Leiden University Medical Center, Leiden, the Netherlands.|Real-world evidence can close the inferential gap between marketing authorization studies and clinical practice. However, the current standard for real-world data extraction from electronic health records (EHRs) for treatment evaluation is manual review (MR), which is time-consuming and laborious. Clinical Data Collector (CDC) is a novel natural language processing and text mining software tool for both structured and unstructured EHR data and only shows relevant EHR sections improving efficiency. We investigated CDC as a real-world data (RWD) collection method, through application of CDC queries for patient inclusion and information extraction on a cohort of patients with metastatic renal cell carcinoma (RCC) receiving systemic drug treatment. Baseline patient characteristics, disease characteristics, and treatment outcomes were extracted and these were compared with MR for validation. One hundred patients receiving 175 treatments were included using CDC, which corresponded to 99% with MR. Calculated median overall survival was 21.7 months (95% confidence interval (CI) 18.7-24.8) vs. 21.7 months (95% CI 18.6-24.8) and progression-free survival 8.9 months (95% CI 5.4-12.4) vs. 7.6 months (95% CI 5.7-9.4) for CDC vs. MR, respectively. Highest F1-score was found for cancer-related variables (88.1-100), followed by comorbidities (71.5-90.4) and adverse drug events (53.3-74.5), with most diverse scores on international metastatic RCC database criteria (51.4-100). Mean data collection time was 12 minutes (CDC) vs. 86 minutes (MR).
In conclusion, CDC is a promising tool for retrieving RWD from EHRs because the correct patient population can be identified as well as relevant outcome data, such as overall survival and progression-free survival.|真实世界的证据可以缩小上市许可研究和临床实践之间的推断差距。然而,目前从电子健康记录(EHRs)中提取真实数据用于治疗评估的标准是人工审查(MR),这是一项费时费力的工作。临床数据采集器(CDC)是一种新型的自然语言处理和文本挖掘软件工具,用于结构化和非结构化 EHR 数据,只显示相关的 EHR 部分以提高效率。我们研究了 CDC 作为一种现实世界数据(RWD)收集方法,通过应用 CDC 查询,对接受全身药物治疗的转移性肾细胞癌(RCC)患者队列进行患者纳入和信息抽取。提取基线患者特征、疾病特征和治疗结果,并与 MR 进行比较验证。使用 CDC 纳入了接受175次治疗的100名患者,与 MR 的一致率为99%。CDC 与 MR 计算的中位总生存期分别为21.7个月(95% 置信区间(CI)18.7-24.8)和21.7个月(95% CI 18.6-24.8),无进展生存期分别为8.9个月(95% CI 5.4-12.4)和7.6个月(95% CI 5.7-9.4)。癌症相关变量的 F1评分最高(88.1-100),其次是合并症(71.5-90.4)和不良药物事件(53.3-74.5),在国际转移性 RCC 数据库标准上得分差异最大(51.4-100)。平均数据收集时间为12分钟(CDC)比86分钟(MR)。总之,CDC 是从 EHR 中检索 RWD 的有希望的工具,因为可以确定正确的患者人群以及相关的结果数据,如总生存期和无进展生存期。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Mining+Electronic+Health+Records+for+Real-World+Evidence)|0| |[EvalRS 2023: Well-Rounded Recommender Systems for Real-World Deployments](https://doi.org/10.1145/3580305.3599222)|Federico Bianchi, Patrick John Chia, Jacopo Tagliabue, Ciro Greco, Gabriel de Souza P. Moreira, Davide Eynard, Fahd Husain, Claudio Pomo||EvalRS aims to bring together practitioners from industry and academia to foster a debate on rounded evaluation of recommender systems, with a focus on real-world impact across a multitude of deployment scenarios. Recommender systems are often evaluated only through accuracy metrics, which fall short of fully characterizing their generalization capabilities and miss important aspects, such as fairness, bias, usefulness, informativeness. This workshop builds on the success of last year's workshop at CIKM, but with a broader scope and an interactive format.|EvalRS 旨在汇集来自行业和学术界的从业人员,促进关于全面评估推荐系统的辩论,重点是在多种部署情景下的现实世界影响。推荐系统往往只能通过精度指标进行评估,这些指标不能充分表征推荐系统的泛化能力,而且忽略了公平性、偏差性、有用性、信息性等重要方面。这个研讨会建立在去年 CIKM 研讨会的成功基础之上,但是范围更广,而且采用了交互式的形式。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=EvalRS+2023:+Well-Rounded+Recommender+Systems+for+Real-World+Deployments)|0| @@ -163,10 +163,10 @@ |[B2-Sampling: Fusing Balanced and Biased Sampling for Graph Contrastive Learning](https://doi.org/10.1145/3580305.3599262)|Mengyue Liu, Yun Lin, Jun Liu, Bohao Liu, Qinghua Zheng, Jin Song Dong||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=B2-Sampling:+Fusing+Balanced+and+Biased+Sampling+for+Graph+Contrastive+Learning)|0| |[DotHash: Estimating Set Similarity Metrics for Link Prediction and Document Deduplication](https://doi.org/10.1145/3580305.3599314)|Igor Nunes, Mike Heddes, Pere Vergés, Danny Abraham, Alexander V. Veidenbaum, Alex Nicolau, Tony Givargis|University of California, Irvine|Metrics for set similarity are a core aspect of several data mining tasks. To remove duplicate results in a Web search, for example, a common approach looks at the Jaccard index between all pairs of pages. In social network analysis, a much-celebrated metric is the Adamic-Adar index, widely used to compare node neighborhood sets in the important problem of predicting links. However, with the increasing amount of data to be processed, calculating the exact similarity between all pairs can be intractable. The challenge of working at this scale has motivated research into efficient estimators for set similarity metrics. The two most popular estimators, MinHash and SimHash, are indeed used in applications such as document deduplication and recommender systems where large volumes of data need to be processed.
Given the importance of these tasks, the demand for advancing estimators is evident. We propose DotHash, an unbiased estimator for the intersection size of two sets. DotHash can be used to estimate the Jaccard index and, to the best of our knowledge, is the first method that can also estimate the Adamic-Adar index and a family of related metrics. We formally define this family of metrics, provide theoretical bounds on the probability of estimate errors, and analyze its empirical performance. Our experimental results indicate that DotHash is more accurate than the other estimators in link prediction and detecting duplicate documents with the same complexity and similar comparison time.|集合相似性度量是数据挖掘任务的一个核心方面。例如,为了删除 Web 搜索中的重复结果,通常的方法是查看所有页对之间的 Jaccard 索引。在社会网络分析中,一个著名的度量是阿达姆-阿达尔指数,广泛用于比较节点邻域集在预测链路的重要问题。然而,随着需要处理的数据量的增加,计算所有对之间的精确相似度是很困难的。在这种规模下工作的挑战促使人们研究集合相似度量的有效估计器。MinHash 和 SimHash 这两个最流行的估计器确实用于需要处理大量数据的应用程序,如文档删除重复数据和推荐系统。鉴于这些任务的重要性,提前估算的需求是显而易见的。我们提出了 DotHash,一个两个集合的交集大小的无偏估计。DotHash 可以用来估计 Jaccard 指数,据我们所知,DotHash 是第一种也可以估计 Adam-Adar 指数和一系列相关指标的方法。我们正式地定义了这个度量族,给出了估计误差概率的理论界限,并分析了它的经验性能。实验结果表明,在相同复杂度和相似比较时间的链路预测和重复文档检测方面,DotHash 比其他估计器具有更高的精度。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=DotHash:+Estimating+Set+Similarity+Metrics+for+Link+Prediction+and+Document+Deduplication)|0| |[Domain-Guided Spatio-Temporal Self-Attention for Egocentric 3D Pose Estimation](https://doi.org/10.1145/3580305.3599312)|Jinman Park, Kimathi Kaai, Saad Hossain, Norikatsu Sumi, Sirisha Rambhatla, Paul W. Fieguth||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Domain-Guided+Spatio-Temporal+Self-Attention+for+Egocentric+3D+Pose+Estimation)|0| -|[Quantitatively Measuring and Contrastively Exploring Heterogeneity for Domain Generalization](https://doi.org/10.1145/3580305.3599481)|Yunze Tong, Junkun Yuan, Min Zhang, Didi Zhu, Keli Zhang, Fei Wu, Kun Kuang|Zhejiang University; Noah’s Ark Lab, Huawei Technologies|Domain generalization (DG) is a prevalent problem in real-world applications, which aims to train well-generalized models for unseen target domains by utilizing several source domains. Since domain labels, i.e., which domain each data point is sampled from, naturally exist, most DG algorithms treat them as a kind of supervision information to improve the generalization performance. However, the original domain labels may not be the optimal supervision signal due to the lack of domain heterogeneity, i.e., the diversity among domains. For example, a sample in one domain may be closer to another domain, its original label thus can be the noise to disturb the generalization learning. Although some methods try to solve it by re-dividing domains and applying the newly generated dividing pattern, the pattern they choose may not be the most heterogeneous due to the lack of the metric for heterogeneity. In this paper, we point out that domain heterogeneity mainly lies in variant features under the invariant learning framework. With contrastive learning, we propose a learning potential-guided metric for domain heterogeneity by promoting learning variant features. Then we notice the differences between seeking variance-based heterogeneity and training invariance-based generalizable model. We thus propose a novel method called Heterogeneity-based Two-stage Contrastive Learning (HTCL) for the DG task. In the first stage, we generate the most heterogeneous dividing pattern with our contrastive metric. 
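DotHash's intersection estimator (entry above) admits a compact sketch: give every element a pseudo-random unit-norm vector, seeded by the element itself so both sets agree on it, sum the vectors per set, and take the dot product of the two sums. Matched terms contribute 1 each and cross terms vanish in expectation, so the result is unbiased for |A ∩ B|, with variance shrinking as the dimension grows. The element-seeded RNG below is an assumption for illustration (and relies on integer keys hashing deterministically).

```python
import numpy as np

def dot_sketch(items, dim=4096):
    """Sum of per-element pseudo-random +/- 1/sqrt(dim) vectors."""
    acc = np.zeros(dim)
    for it in items:
        rng = np.random.default_rng(abs(hash(it)) % (2**32))  # element-seeded
        acc += rng.choice([-1.0, 1.0], size=dim) / np.sqrt(dim)
    return acc

A, B = set(range(800)), set(range(600, 1400))
print(dot_sketch(A) @ dot_sketch(B), len(A & B))  # estimate vs. exact (200)
```

Because the Adamic-Adar index is a weighted sum of neighborhood intersections, the same sums-of-vectors trick extends to it, which is what sets DotHash apart from MinHash/SimHash.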
In the second stage, we employ an invariance-aimed contrastive learning by re-building pairs with the stable relation hinted by domains and classes, which better utilizes generated domain labels for generalization learning. Extensive experiments show HTCL better digs heterogeneity and yields great generalization performance.|领域广义化(DG)是现实应用中的一个普遍问题,其目的是利用多个源域来训练未知目标域的广义模型。由于领域标签(即每个数据点从哪个领域采样)的自然存在,大多数 DG 算法都将它们视为一种监督信息,以提高泛化性能。然而,由于缺乏领域异质性,即领域之间的差异性,原始的领域标签可能不是最佳的监督信号。例如,一个领域中的样本可能更接近另一个领域,其原始标签因此可能是噪声干扰推广学习。尽管有些方法试图通过重新划分域并应用新生成的划分模式来解决这个问题,但是由于缺乏对异构性的度量,所选择的模式可能不是最异构的。本文指出,在不变学习框架下,领域异质性主要表现在变异特征上。在对比学习的基础上,提出了一种基于学习势引导的领域异构度量方法。然后我们注意到基于方差的异质性寻求和基于训练不变性的可推广模型之间的区别。因此,我们提出了一种新的方法称为异质性为基础的两阶段对比学习(HTCL)的 DG 任务。在第一阶段,我们使用对比度量生成最不均匀的分割模式。在第二阶段,我们采用不变性对比学习方法,通过重新构建由领域和类提示的稳定关系的对,更好地利用生成的领域标签进行泛化学习。大量实验表明,HTCL 能够更好地挖掘异构性,并产生很好的泛化性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Quantitatively+Measuring+and+Contrastively+Exploring+Heterogeneity+for+Domain+Generalization)|0| +|[Quantitatively Measuring and Contrastively Exploring Heterogeneity for Domain Generalization](https://doi.org/10.1145/3580305.3599481)|Yunze Tong, Junkun Yuan, Min Zhang, Didi Zhu, Keli Zhang, Fei Wu, Kun Kuang|Noah’s Ark Lab, Huawei Technologies; Zhejiang University|Domain generalization (DG) is a prevalent problem in real-world applications, which aims to train well-generalized models for unseen target domains by utilizing several source domains. Since domain labels, i.e., which domain each data point is sampled from, naturally exist, most DG algorithms treat them as a kind of supervision information to improve the generalization performance. However, the original domain labels may not be the optimal supervision signal due to the lack of domain heterogeneity, i.e., the diversity among domains. For example, a sample in one domain may be closer to another domain, its original label thus can be the noise to disturb the generalization learning. Although some methods try to solve it by re-dividing domains and applying the newly generated dividing pattern, the pattern they choose may not be the most heterogeneous due to the lack of the metric for heterogeneity. In this paper, we point out that domain heterogeneity mainly lies in variant features under the invariant learning framework. With contrastive learning, we propose a learning potential-guided metric for domain heterogeneity by promoting learning variant features. Then we notice the differences between seeking variance-based heterogeneity and training invariance-based generalizable model. We thus propose a novel method called Heterogeneity-based Two-stage Contrastive Learning (HTCL) for the DG task. In the first stage, we generate the most heterogeneous dividing pattern with our contrastive metric. In the second stage, we employ an invariance-aimed contrastive learning by re-building pairs with the stable relation hinted by domains and classes, which better utilizes generated domain labels for generalization learning. 
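Both HTCL stages above optimize contrastive objectives over re-built pairs. For reference, this is the generic InfoNCE form that such objectives instantiate; treating it as HTCL's exact loss would be an assumption.

```python
import numpy as np

def info_nce(anchor, positive, negatives, tau=0.5):
    """Pull the (anchor, positive) pair together; push the negatives away."""
    cos = lambda a, b: a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    pos = np.exp(cos(anchor, positive) / tau)
    neg = sum(np.exp(cos(anchor, n) / tau) for n in negatives)
    return float(-np.log(pos / (pos + neg)))
```

In the first stage the pairing is chosen to maximize variance-based heterogeneity across the candidate domain split; in the second, pairs follow the stable domain/class relations, which is the invariance-aimed direction.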
Extensive experiments show HTCL better digs heterogeneity and yields great generalization performance.|领域广义化(DG)是现实应用中的一个普遍问题,其目的是利用多个源域来训练未知目标域的广义模型。由于领域标签(即每个数据点从哪个领域采样)的自然存在,大多数 DG 算法都将它们视为一种监督信息,以提高泛化性能。然而,由于缺乏领域异质性,即领域之间的差异性,原始的领域标签可能不是最佳的监督信号。例如,一个领域中的样本可能更接近另一个领域,其原始标签因此可能是噪声干扰推广学习。尽管有些方法试图通过重新划分域并应用新生成的划分模式来解决这个问题,但是由于缺乏对异构性的度量,所选择的模式可能不是最异构的。本文指出,在不变学习框架下,领域异质性主要表现在变异特征上。在对比学习的基础上,提出了一种基于学习势引导的领域异构度量方法。然后我们注意到基于方差的异质性寻求和基于训练不变性的可推广模型之间的区别。因此,我们提出了一种新的方法称为异质性为基础的两阶段对比学习(HTCL)的 DG 任务。在第一阶段,我们使用对比度量生成最不均匀的分割模式。在第二阶段,我们采用不变性对比学习方法,通过重新构建由领域和类提示的稳定关系的对,更好地利用生成的领域标签进行泛化学习。大量实验表明,HTCL 能够更好地挖掘异构性,并产生很好的泛化性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Quantitatively+Measuring+and+Contrastively+Exploring+Heterogeneity+for+Domain+Generalization)|0| |[Grace: Graph Self-Distillation and Completion to Mitigate Degree-Related Biases](https://doi.org/10.1145/3580305.3599368)|Hui Xu, Liyao Xiang, Femke Huang, Yuting Weng, Ruijie Xu, Xinbing Wang, Chenghu Zhou||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Grace:+Graph+Self-Distillation+and+Completion+to+Mitigate+Degree-Related+Biases)|0| |[DisasterNet: Causal Bayesian Networks with Normalizing Flows for Cascading Hazards Estimation from Satellite Imagery](https://doi.org/10.1145/3580305.3599807)|Xuechun Li, Paula M. Bürgi, Wei Ma, Hae Young Noh, David Jay Wald, Susu Xu||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=DisasterNet:+Causal+Bayesian+Networks+with+Normalizing+Flows+for+Cascading+Hazards+Estimation+from+Satellite+Imagery)|0| -|[Explicit Feature Interaction-aware Uplift Network for Online Marketing](https://doi.org/10.1145/3580305.3599820)|Dugang Liu, Xing Tang, Han Gao, Fuyuan Lyu, Xiuqiang He|FiT, Tencent; Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ; McGill University|As a key component in online marketing, uplift modeling aims to accurately capture the degree to which different treatments motivate different users, such as coupons or discounts, also known as the estimation of individual treatment effect (ITE). In an actual business scenario, the options for treatment may be numerous and complex, and there may be correlations between different treatments. In addition, each marketing instance may also have rich user and contextual features. However, existing methods still fall short in both fully exploiting treatment information and mining features that are sensitive to a particular treatment. In this paper, we propose an explicit feature interaction-aware uplift network (EFIN) to address these two problems. Our EFIN includes four customized modules: 1) a feature encoding module encodes not only the user and contextual features, but also the treatment features; 2) a self-interaction module aims to accurately model the user's natural response with all but the treatment features; 3) a treatment-aware interaction module accurately models the degree to which a particular treatment motivates a user through interactions between the treatment features and other features, i.e., ITE; and 4) an intervention constraint module is used to balance the ITE distribution of users between the control and treatment groups so that the model would still achieve a accurate uplift ranking on data collected from a non-random intervention marketing scenario. We conduct extensive experiments on two public datasets and one product dataset to verify the effectiveness of our EFIN. 
In addition, our EFIN has been deployed in a credit card bill payment scenario of a large online financial platform with a significant improvement.|作为在线营销的一个关键组成部分,提升模型旨在准确地捕捉不同的治疗激励不同的用户的程度,如优惠券或折扣,也被称为个体治疗效果(ITE)的估计。在实际的业务场景中,治疗的选择可能是多种多样和复杂的,不同治疗之间可能存在相关性。此外,每个营销实例还可能具有丰富的用户和上下文特性。然而,现有方法仍然不能充分利用对特定治疗敏感的治疗信息和挖掘特征。针对这两个问题,本文提出了一种显式的特征交互感知提升网络(EFIN)。我们的 EFIN 包括四个定制模块: 1)特征编码模块不仅编码用户和上下文特征,而且还编码治疗特征; 2)自我交互模块旨在准确地模拟除治疗特征以外的所有用户的自然反应; 3)治疗感知交互模块准确地模拟特定治疗通过治疗特征和其他特征之间的交互激励用户的程度,即 ITE; 和4)干预约束模块用于平衡控制和治疗组之间的用户 ITE 分布,以便该模型仍然能够实现从非随机干预营销场景收集的数据的准确提升排名。我们对两个公共数据集和一个产品数据集进行了广泛的实验,以验证我们的 EFIN 的有效性。此外,我们的 EFIN 已经部署在一个大型在线金融平台的信用卡账单支付场景中,并得到了显著的改进。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Explicit+Feature+Interaction-aware+Uplift+Network+for+Online+Marketing)|0| +|[Explicit Feature Interaction-aware Uplift Network for Online Marketing](https://doi.org/10.1145/3580305.3599820)|Dugang Liu, Xing Tang, Han Gao, Fuyuan Lyu, Xiuqiang He|McGill University; Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ; FiT, Tencent|As a key component in online marketing, uplift modeling aims to accurately capture the degree to which different treatments motivate different users, such as coupons or discounts, also known as the estimation of individual treatment effect (ITE). In an actual business scenario, the options for treatment may be numerous and complex, and there may be correlations between different treatments. In addition, each marketing instance may also have rich user and contextual features. However, existing methods still fall short in both fully exploiting treatment information and mining features that are sensitive to a particular treatment. In this paper, we propose an explicit feature interaction-aware uplift network (EFIN) to address these two problems. Our EFIN includes four customized modules: 1) a feature encoding module encodes not only the user and contextual features, but also the treatment features; 2) a self-interaction module aims to accurately model the user's natural response with all but the treatment features; 3) a treatment-aware interaction module accurately models the degree to which a particular treatment motivates a user through interactions between the treatment features and other features, i.e., ITE; and 4) an intervention constraint module is used to balance the ITE distribution of users between the control and treatment groups so that the model would still achieve a accurate uplift ranking on data collected from a non-random intervention marketing scenario. We conduct extensive experiments on two public datasets and one product dataset to verify the effectiveness of our EFIN. 
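EFIN's split into a self-interaction (natural response) head and a treatment-aware head (entry above) implies a simple inference rule: the uplift is the treatment-aware increment over the natural response. A schematic version with linear heads standing in for the paper's interaction modules; all names and shapes here are illustrative.

```python
import numpy as np

def efin_scores(x, t, w_nat, W_trt):
    """Natural response from non-treatment features; uplift (ITE) as the
    treatment-aware increment from treatment-feature interactions."""
    natural = w_nat @ x                  # self-interaction head, no treatment input
    lift = t @ (W_trt @ x)               # treatment x other-feature interaction
    return natural, natural + lift       # control response, treated response

d, dt = 8, 3
rng = np.random.default_rng(0)
y0, y1 = efin_scores(rng.normal(size=d), rng.normal(size=dt),
                     rng.normal(size=d), rng.normal(size=(dt, d)))
print(y1 - y0)                           # estimated uplift for this (user, treatment)
```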
In addition, our EFIN has been deployed in a credit card bill payment scenario of a large online financial platform with a significant improvement.|作为在线营销的一个关键组成部分,提升模型旨在准确地捕捉不同的治疗激励不同的用户的程度,如优惠券或折扣,也被称为个体治疗效果(ITE)的估计。在实际的业务场景中,治疗的选择可能是多种多样和复杂的,不同治疗之间可能存在相关性。此外,每个营销实例还可能具有丰富的用户和上下文特性。然而,现有方法仍然不能充分利用对特定治疗敏感的治疗信息和挖掘特征。针对这两个问题,本文提出了一种显式的特征交互感知提升网络(EFIN)。我们的 EFIN 包括四个定制模块: 1)特征编码模块不仅编码用户和上下文特征,而且还编码治疗特征; 2)自我交互模块旨在准确地模拟除治疗特征以外的所有用户的自然反应; 3)治疗感知交互模块准确地模拟特定治疗通过治疗特征和其他特征之间的交互激励用户的程度,即 ITE; 和4)干预约束模块用于平衡控制和治疗组之间的用户 ITE 分布,以便该模型仍然能够实现从非随机干预营销场景收集的数据的准确提升排名。我们对两个公共数据集和一个产品数据集进行了广泛的实验,以验证我们的 EFIN 的有效性。此外,我们的 EFIN 已经部署在一个大型在线金融平台的信用卡账单支付场景中,并得到了显著的改进。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Explicit+Feature+Interaction-aware+Uplift+Network+for+Online+Marketing)|0| |[Online Quality Prediction in Windshield Manufacturing using Data-Efficient Machine Learning](https://doi.org/10.1145/3580305.3599880)|Hasan Tercan, Tobias Meisen||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Online+Quality+Prediction+in+Windshield+Manufacturing+using+Data-Efficient+Machine+Learning)|0| |[C-AOI: Contour-based Instance Segmentation for High-Quality Areas-of-Interest in Online Food Delivery Platform](https://doi.org/10.1145/3580305.3599786)|Yida Zhu, Liying Chen, Daping Xiong, Shuiping Chen, Fangxiao Du, Jinghua Hao, Renqing He, Zhizhao Sun||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=C-AOI:+Contour-based+Instance+Segmentation+for+High-Quality+Areas-of-Interest+in+Online+Food+Delivery+Platform)|0| |[Addressing Bias and Fairness in Machine Learning: A Practical Guide and Hands-on Tutorial](https://doi.org/10.1145/3580305.3599180)|Rayid Ghani, Kit T. Rodolfa, Pedro Saleiro, Sérgio M. Jesus||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Addressing+Bias+and+Fairness+in+Machine+Learning:+A+Practical+Guide+and+Hands-on+Tutorial)|0| @@ -181,14 +181,14 @@ |[Similarity Preserving Adversarial Graph Contrastive Learning](https://doi.org/10.1145/3580305.3599503)|Yeonjun In, Kanghoon Yoon, Chanyoung Park|KAIST|Recent works demonstrate that GNN models are vulnerable to adversarial attacks, which refer to imperceptible perturbation on the graph structure and node features. Among various GNN models, graph contrastive learning (GCL) based methods specifically suffer from adversarial attacks due to their inherent design that highly depends on the self-supervision signals derived from the original graph, which however already contains noise when the graph is attacked. To achieve adversarial robustness against such attacks, existing methods adopt adversarial training (AT) to the GCL framework, which considers the attacked graph as an augmentation under the GCL framework. However, we find that existing adversarially trained GCL methods achieve robustness at the expense of not being able to preserve the node feature similarity. In this paper, we propose a similarity-preserving adversarial graph contrastive learning (SP-AGCL) framework that contrasts the clean graph with two auxiliary views of different properties (i.e., the node similarity-preserving view and the adversarial view). Extensive experiments demonstrate that SP-AGCL achieves a competitive performance on several downstream tasks, and shows its effectiveness in various scenarios, e.g., a network with adversarial attacks, noisy labels, and heterophilous neighbors. 
Our code is available at https://github.com/yeonjun-in/torch-SP-AGCL.|最近的工作表明,GNN 模型是脆弱的对手攻击,这是指不可察觉的扰动图结构和节点特征。在各种 GNN 模型中,基于图形对比学习(GCL)的方法由于其固有的设计,高度依赖于来自原始图形的自我监督信号,而这些信号在图形受到攻击时已经含有噪声,因此特别容易受到攻击。为了实现对抗这种攻击的鲁棒性,现有的方法对 GCL 框架采用了对抗训练(AT) ,该框架将被攻击图视为 GCL 框架下的一种增强。然而,我们发现现有的对抗训练的 GCL 方法在不能保持节点特征相似性的前提下达到了鲁棒性。本文提出了一个保持相似性的对抗图对比学习(SP-AGCL)框架,该框架将干净图与具有不同性质的两个辅助视图(即节点相似性保持视图和对抗视图)进行对比。广泛的实验表明,SP-AGCL 在几个下游任务上取得了有竞争力的性能,并且在多种情况下显示了其有效性,例如,一个具有对抗性攻击、噪声标签和异质邻居的网络。我们的代码可以在 https://github.com/yeonjun-in/torch-sp-agcl 找到。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Similarity+Preserving+Adversarial+Graph+Contrastive+Learning)|0| |[Fast and Accurate Dual-Way Streaming PARAFAC2 for Irregular Tensors - Algorithm and Application](https://doi.org/10.1145/3580305.3599342)|JunGi Jang, Jeongyoung Lee, Yongchan Park, U Kang|Seoul National University|How can we efficiently and accurately analyze an irregular tensor in a dual-way streaming setting where the sizes of two dimensions of the tensor increase over time? What types of anomalies are there in the dual-way streaming setting? An irregular tensor is a collection of matrices whose column lengths are the same while their row lengths are different. In a dual-way streaming setting, both new rows of existing matrices and new matrices arrive over time. PARAFAC2 decomposition is a crucial tool for analyzing irregular tensors. Although real-time analysis is necessary in the dual-way streaming, static PARAFAC2 decomposition methods fail to efficiently work in this setting since they perform PARAFAC2 decomposition for accumulated tensors whenever new data arrive. Existing streaming PARAFAC2 decomposition methods work in a limited setting and fail to handle new rows of matrices efficiently. In this paper, we propose Dash, an efficient and accurate PARAFAC2 decomposition method working in the dual-way streaming setting. When new data are given, Dash efficiently performs PARAFAC2 decomposition by carefully dividing the terms related to old and new data and avoiding naive computations involved with old data. Furthermore, applying a forgetting factor makes Dash follow recent movements. Extensive experiments show that Dash achieves up to 14.0x faster speed than existing PARAFAC2 decomposition methods for newly arrived data. We also provide discoveries for detecting anomalies in real-world datasets, including Subprime Mortgage Crisis and COVID-19.|我们如何才能有效和准确地分析一个不规则张量在双向流设置,其中张量的二维尺寸随着时间的推移增加?在双向流设置中有哪些类型的异常?不规则张量是列长相同而行长不同的矩阵集合。在双向流设置中,现有矩阵的新行和新矩阵都会随着时间的推移到达。PARAFAC2分解是分析不规则张量的重要工具。尽管实时分析在双向流中是必需的,但是静态 PARAFAC2分解方法在这种情况下无法有效工作,因为每当新数据到达时,它们都会对累积的张量执行 PARAFAC2分解。现有的流 PARAFAC2分解方法在有限的设置下工作,无法有效地处理新的矩阵行。本文提出了一种高效、准确的双向流设置 PARAFAC2分解方法 Dash。当给定新数据时,Dash 通过仔细划分与旧数据和新数据相关的术语并避免涉及旧数据的幼稚计算,有效地执行 PARAFAC2分解。此外,应用遗忘因子使达什跟随最近的动作。大量的实验表明,Dash 对新到达的数据的分解速度比现有的 PARAFAC2分解方法快14.0倍。我们还提供发现,以检测现实世界数据集中的异常,包括次贷危机和2019冠状病毒疾病。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Fast+and+Accurate+Dual-Way+Streaming+PARAFAC2+for+Irregular+Tensors+-+Algorithm+and+Application)|0| |[Predicting Information Pathways Across Online Communities](https://doi.org/10.1145/3580305.3599470)|Yiqiao Jin, YeonChang Lee, Kartik Sharma, Meng Ye, Karan Sikka, Ajay Divakaran, Srijan Kumar|SRI International; Georgia Institute of Technology|The problem of community-level information pathway prediction (CLIPP) aims at predicting the transmission trajectory of content across online communities. 
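The forgetting factor in Dash (the streaming PARAFAC2 entry above) has a recursive-least-squares flavor: sufficient statistics are decayed before the contribution of newly arrived rows is added, so the fitted factors follow recent movements. The decayed Gram update below is a generic sketch of that idea under assumed least-squares statistics, not Dash's actual cached terms.

```python
import numpy as np

def decayed_gram_update(G, c, X_new, Y_new, forget=0.98):
    """Decay old statistics, then add newly arrived rows X_new (targets Y_new);
    solving G w = c afterwards gives a forgetful least-squares factor update."""
    G = forget * G + X_new.T @ X_new
    c = forget * c + X_new.T @ Y_new
    return G, c
```

Setting `forget=1.0` recovers the accumulate-everything behavior that static PARAFAC2 methods effectively redo from scratch at every arrival.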
A successful solution to CLIPP holds significance as it facilitates the distribution of valuable information to a larger audience and prevents the proliferation of misinformation. Notably, solving CLIPP is non-trivial as inter-community relationships and influence are unknown, information spread is multi-modal, and new content and new communities appear over time. In this work, we address CLIPP by collecting large-scale, multi-modal datasets to examine the diffusion of online YouTube videos on Reddit. We analyze these datasets to construct community influence graphs (CIGs) and develop a novel dynamic graph framework, INPAC (Information Pathway Across Online Communities), which incorporates CIGs to capture the temporal variability and multi-modal nature of video propagation across communities. Experimental results in both warm-start and cold-start scenarios show that INPAC outperforms seven baselines in CLIPP.|社区层面的信息路径预测(CLIPP)问题旨在预测内容在网络社区之间的传播轨迹。CLIPP 的成功解决方案具有重要意义,因为它有助于向更多的受众传播有价值的信息,并防止错误信息的扩散。值得注意的是,解决 CLIPP 是非常重要的,因为社区之间的关系和影响是未知的,信息传播是多模式的,新的内容和新的社区随着时间的推移出现。在这项工作中,我们通过收集大规模的多模态数据集来检查在线 YouTube 视频在 Reddit 上的传播,从而解决 CLIPP 问题。我们分析这些数据集来构建社区影响图(CIGs) ,并开发一种新的动态图框架 INPAC (在线社区信息路径) ,其结合 CIGs 来捕获跨社区视频传播的时间变异性和多模态性质。在热启动和冷启动两种情况下的实验结果表明,INPAC 的性能优于 CLIPP 中的七个基线。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Predicting+Information+Pathways+Across+Online+Communities)|0| -|[Task Relation-aware Continual User Representation Learning](https://doi.org/10.1145/3580305.3599516)|Sein Kim, Namkyeong Lee, Donghyun Kim, MinChul Yang, Chanyoung Park|NAVER Corporation; KAIST|User modeling, which learns to represent users into a low-dimensional representation space based on their past behaviors, got a surge of interest from the industry for providing personalized services to users. Previous efforts in user modeling mainly focus on learning a task-specific user representation that is designed for a single task. However, since learning task-specific user representations for every task is infeasible, recent studies introduce the concept of universal user representation, which is a more generalized representation of a user that is relevant to a variety of tasks. Despite their effectiveness, existing approaches for learning universal user representations are impractical in real-world applications due to the data requirement, catastrophic forgetting and the limited learning capability for continually added tasks. In this paper, we propose a novel continual user representation learning method, called TERACON, whose learning capability is not limited as the number of learned tasks increases while capturing the relationship between the tasks. The main idea is to introduce an embedding for each task, i.e., task embedding, which is utilized to generate task-specific soft masks that not only allow the entire model parameters to be updated until the end of training sequence, but also facilitate the relationship between the tasks to be captured. Moreover, we introduce a novel knowledge retention module with pseudo-labeling strategy that successfully alleviates the long-standing problem of continual learning, i.e., catastrophic forgetting. Extensive experiments on public and proprietary real-world datasets demonstrate the superiority and practicality of TERACON. 
Our code is available at https://github.com/Sein-Kim/TERACON.|用户建模是根据用户过去的行为学习如何将用户表示成一个低维的表示空间,因此为用户提供个性化的服务引起了业界的极大兴趣。以前的用户建模工作主要集中在学习为单个任务设计的特定于任务的用户表示。然而,由于学习任务特定的每个任务的用户表示是不可行的,最近的研究引入了通用用户表示的概念,这是一个更广泛的用户表示相关的各种任务。尽管现有的学习通用用户表示的方法很有效,但是由于数据需求、灾难性遗忘以及对不断增加的任务的学习能力有限,这些方法在实际应用中是不切实际的。在本文中,我们提出了一种新的持续用户表征学习方法,称为 TERACON,它的学习能力不受任务数量增加的限制,同时捕捉任务之间的关系。其主要思想是为每个任务引入一个嵌入,即任务嵌入,用于生成任务特定的软掩码,不仅允许整个模型参数更新直到训练序列结束,而且有利于任务之间的关系被捕获。此外,我们还引入了一个新的知识保留模块,该模块采用伪标记策略,成功地解决了长期以来存在的连续学习问题,即灾难性遗忘问题。在公共和专有的真实世界数据集上的大量实验证明了 TERACON 的优越性和实用性。我们的代码可以在 https://github.com/sein-kim/teracon 找到。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Task+Relation-aware+Continual+User+Representation+Learning)|0| +|[Task Relation-aware Continual User Representation Learning](https://doi.org/10.1145/3580305.3599516)|Sein Kim, Namkyeong Lee, Donghyun Kim, MinChul Yang, Chanyoung Park|KAIST; NAVER Corporation|User modeling, which learns to represent users into a low-dimensional representation space based on their past behaviors, got a surge of interest from the industry for providing personalized services to users. Previous efforts in user modeling mainly focus on learning a task-specific user representation that is designed for a single task. However, since learning task-specific user representations for every task is infeasible, recent studies introduce the concept of universal user representation, which is a more generalized representation of a user that is relevant to a variety of tasks. Despite their effectiveness, existing approaches for learning universal user representations are impractical in real-world applications due to the data requirement, catastrophic forgetting and the limited learning capability for continually added tasks. In this paper, we propose a novel continual user representation learning method, called TERACON, whose learning capability is not limited as the number of learned tasks increases while capturing the relationship between the tasks. The main idea is to introduce an embedding for each task, i.e., task embedding, which is utilized to generate task-specific soft masks that not only allow the entire model parameters to be updated until the end of training sequence, but also facilitate the relationship between the tasks to be captured. Moreover, we introduce a novel knowledge retention module with pseudo-labeling strategy that successfully alleviates the long-standing problem of continual learning, i.e., catastrophic forgetting. Extensive experiments on public and proprietary real-world datasets demonstrate the superiority and practicality of TERACON. 
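TERACON's task embeddings (entry above) act on the shared backbone through task-specific soft masks. A minimal rendering follows; the sigmoid form, the scaling temperature, and the linear map from task embedding to mask are assumptions for illustration.

```python
import numpy as np

def task_soft_mask(task_emb, W_mask, temperature=10.0):
    """Near-binary but differentiable gate per hidden unit, one mask per task."""
    return 1.0 / (1.0 + np.exp(-temperature * (W_mask @ task_emb)))

def task_forward(h_shared, task_emb, W_mask):
    return task_soft_mask(task_emb, W_mask) * h_shared  # task-modulated features
```

Because the masks are soft rather than hard partitions, all backbone parameters stay trainable for every new task, while overlap between two tasks' masks is what lets their relationship be captured.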
Our code is available at https://github.com/Sein-Kim/TERACON.|用户建模是根据用户过去的行为学习如何将用户表示成一个低维的表示空间,因此为用户提供个性化的服务引起了业界的极大兴趣。以前的用户建模工作主要集中在学习为单个任务设计的特定于任务的用户表示。然而,由于学习任务特定的每个任务的用户表示是不可行的,最近的研究引入了通用用户表示的概念,这是一个更广泛的用户表示相关的各种任务。尽管现有的学习通用用户表示的方法很有效,但是由于数据需求、灾难性遗忘以及对不断增加的任务的学习能力有限,这些方法在实际应用中是不切实际的。在本文中,我们提出了一种新的持续用户表征学习方法,称为 TERACON,它的学习能力不受任务数量增加的限制,同时捕捉任务之间的关系。其主要思想是为每个任务引入一个嵌入,即任务嵌入,用于生成任务特定的软掩码,不仅允许整个模型参数更新直到训练序列结束,而且有利于任务之间的关系被捕获。此外,我们还引入了一个新的知识保留模块,该模块采用伪标记策略,成功地解决了长期以来存在的连续学习问题,即灾难性遗忘问题。在公共和专有的真实世界数据集上的大量实验证明了 TERACON 的优越性和实用性。我们的代码可以在 https://github.com/sein-kim/teracon 找到。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Task+Relation-aware+Continual+User+Representation+Learning)|0| |[GraphSHA: Synthesizing Harder Samples for Class-Imbalanced Node Classification](https://doi.org/10.1145/3580305.3599374)|WenZhi Li, ChangDong Wang, Hui Xiong, JianHuang Lai|The Hong Kong University of Science and Technology (Guangzhou); Sun Yat-sen University|Class imbalance is the phenomenon that some classes have much fewer instances than others, which is ubiquitous in real-world graph-structured scenarios. Recent studies find that off-the-shelf Graph Neural Networks (GNNs) would under-represent minor class samples. We investigate this phenomenon and discover that the subspaces of minor classes being squeezed by those of the major ones in the latent space is the main cause of this failure. We are naturally inspired to enlarge the decision boundaries of minor classes and propose a general framework GraphSHA by Synthesizing HArder minor samples. Furthermore, to avoid the enlarged minor boundary violating the subspaces of neighbor classes, we also propose a module called SemiMixup to transmit enlarged boundary information to the interior of the minor classes while blocking information propagation from minor classes to neighbor classes. Empirically, GraphSHA shows its effectiveness in enlarging the decision boundaries of minor classes, as it outperforms various baseline methods in class-imbalanced node classification with different GNN backbone encoders over seven public benchmark datasets. Code is avilable at https://github.com/wenzhilics/GraphSHA.|类不平衡是一些类的实例比其他类少得多的现象,这种现象在真实世界的图形结构场景中普遍存在。最近的研究发现,现成的图形神经网络(GNN)会低估次要类样本。我们研究了这一现象,发现潜空间中次类的子空间被主类的子空间挤压是导致这一失败的主要原因。我们很自然地受到了扩大次类决策边界的启发,并通过综合 HArder 次样本提出了一个通用的 GraphSHA 框架。此外,为了避免扩大的次边界侵犯邻居类的子空间,我们还提出了一个称为 SemiMixup 的模块来传输扩大的边界信息到次类的内部,同时阻止信息从次类传播到邻居类。实验表明,GraphSHA 在扩大次类的决策边界方面是有效的,因为它在七个公共基准数据集上使用不同的 GNN 骨干编码器进行类不平衡节点分类时优于各种基准方法。代码可在 https://github.com/wenzhilics/graphsha 下载。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=GraphSHA:+Synthesizing+Harder+Samples+for+Class-Imbalanced+Node+Classification)|0| |[Physics-Guided Discovery of Highly Nonlinear Parametric Partial Differential Equations](https://doi.org/10.1145/3580305.3599466)|Yingtao Luo, Qiang Liu, Yuntian Chen, Wenbo Hu, Tian Tian, Jun Zhu||Partial differential equations (PDEs) fitting scientific data can represent physical laws with explainable mechanisms for various mathematically-oriented subjects. The data-driven discovery of PDEs from scientific data thrives as a new attempt to model complex phenomena in nature, but the effectiveness of current practice is typically limited by the scarcity of data and the complexity of phenomena. Especially, the discovery of PDEs with highly nonlinear coefficients from low-quality data remains largely under-addressed. 
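GraphSHA (entry above) enlarges minor-class decision boundaries by synthesizing harder minor samples. Schematically, a synthetic minor node sits between a minor anchor and a nearby major-class sample, biased toward the boundary; the interpolation range below is an illustrative choice, not the paper's exact sampling rule, and the SemiMixup edge handling is omitted.

```python
import numpy as np

def synth_harder_minor(anchor_x, neighbor_major_x, rng, lam_range=(0.5, 1.0)):
    """Harder-than-mixup sample: still labeled minor, but near the boundary."""
    lam = rng.uniform(*lam_range)        # weight on the minor anchor
    return lam * anchor_x + (1.0 - lam) * neighbor_major_x
```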
To deal with this challenge, we propose a novel physics-guided learning method, which can not only encode observation knowledge such as initial and boundary conditions but also incorporate the basic physical principles and laws to guide the model optimization. We empirically demonstrate that the proposed method is more robust against data noise and sparsity, and can reduce the estimation error by a large margin; moreover, for the first time we are able to discover PDEs with highly nonlinear coefficients. With the promising performance, the proposed method pushes forward the boundary of the PDEs that can be found by machine learning models for scientific discovery.|拟合科学数据的偏微分方程(PDE)可以表示各种数学导向学科的具有可解释机制的物理规律。数据驱动发现偏微分方程的科学数据蓬勃发展,作为一种新的尝试模拟自然界中的复杂现象,但目前的做法的有效性通常受到数据稀缺和现象复杂性的限制。特别是,从低质量数据中发现具有高度非线性系数的偏微分方程仍然是一个很大的问题。为了解决这一问题,我们提出了一种新的物理导向学习方法,它不仅可以对初始条件和边界条件等观测知识进行编码,而且可以结合基本的物理原理和规律来指导模型的优化。实验结果表明,该方法对数据噪声和稀疏性具有较强的鲁棒性,能够大幅度地减小估计误差,并且首次发现了具有高度非线性系数的偏微分方程。该方法具有良好的性能,为科学发现推进了机器学习模型所能找到的偏微分方程的边界。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Physics-Guided+Discovery+of+Highly+Nonlinear+Parametric+Partial+Differential+Equations)|0| |[Online Fairness Auditing through Iterative Refinement](https://doi.org/10.1145/3580305.3599454)|Pranav Maneriker, Codi Burley, Srinivasan Parthasarathy||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Online+Fairness+Auditing+through+Iterative+Refinement)|0| |[Online Level-wise Hierarchical Clustering](https://doi.org/10.1145/3580305.3599455)|Nicholas Monath, Manzil Zaheer, Andrew McCallum||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Online+Level-wise+Hierarchical+Clustering)|0| |[Cracking White-box DNN Watermarks via Invariant Neuron Transforms](https://doi.org/10.1145/3580305.3599291)|Xudong Pan, Mi Zhang, Yifan Yan, Yining Wang, Min Yang|; Fudan University|Recently, how to protect the Intellectual Property (IP) of deep neural networks (DNN) becomes a major concern for the AI industry. To combat potential model piracy, recent works explore various watermarking strategies to embed secret identity messages into the prediction behaviors or the internals (e.g., weights and neuron activation) of the target model. Sacrificing less functionality and involving more knowledge about the target model, the latter branch of watermarking schemes (i.e., white-box model watermarking) is claimed to be accurate, credible and secure against most known watermark removal attacks, with emerging research efforts and applications in the industry. In this paper, we present the first effective removal attack which cracks almost all the existing white-box watermarking schemes with provably no performance overhead and no required prior knowledge. By analyzing these IP protection mechanisms at the granularity of neurons, we for the first time discover their common dependence on a set of fragile features of a local neuron group, all of which can be arbitrarily tampered by our proposed chain of invariant neuron transforms. On $9$ state-of-the-art white-box watermarking schemes and a broad set of industry-level DNN architectures, our attack for the first time reduces the embedded identity message in the protected models to be almost random. 
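The physics-guided learning method for PDE discovery above combines a data-fit term with penalties encoding initial/boundary conditions and the governing physical principles. A generic composite loss of that shape is sketched below; the weighting scheme and residual definitions are assumptions rather than the paper's exact objective.

```python
import numpy as np

def physics_guided_loss(u_pred, u_obs, pde_residual, ic_bc_residual,
                        w_data=1.0, w_pde=1.0, w_cond=1.0):
    """Observation fit + physical-law residual + initial/boundary-condition residual."""
    mse = lambda r: float(np.mean(np.square(r)))
    return (w_data * mse(u_pred - u_obs)
            + w_pde * mse(pde_residual)
            + w_cond * mse(ic_bc_residual))
```

The physics terms act as a regularizer when data are sparse or noisy, which is how the method keeps estimation error down for highly nonlinear coefficients.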
Meanwhile, unlike known removal attacks, our attack requires no prior knowledge on the training data distribution or the adopted watermark algorithms, and leaves model functionality intact.|近年来,如何保护深层神经网络(DNN)的知识产权成为人工智能产业关注的主要问题。为了打击潜在的盗版模型,最近的研究探索了各种水印策略,将秘密身份信息嵌入到目标模型的预测行为或内部(如权重和神经元激活)中。水印技术的后一个分支(即白盒模型水印)牺牲了较少的功能,涉及更多关于目标模型的知识,被认为是准确、可靠和安全的,能够抵御大多数已知的水印去除攻击,并且在业界中得到了新的研究和应用。在本文中,我们提出了第一个有效的移除攻击,这个攻击破坏了几乎所有现有的白盒水印方案,并且可以证明没有性能开销和不需要先验知识。通过分析神经元粒度上的这些 IP 保护机制,我们首次发现它们对局部神经元群的一组脆弱特征的共同依赖性,所有这些特征都可以被我们提出的不变神经元变换链任意篡改。在9种最先进的白盒水印方案和一系列行业级 DNN 架构上,我们的攻击首次将受保护模型中嵌入的身份信息还原为近乎随机。同时,与已知的移除攻击不同,我们的攻击不需要关于训练数据分布或所采用的水印算法的先验知识,并且保留了模型的功能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Cracking+White-box+DNN+Watermarks+via+Invariant+Neuron+Transforms)|0| |[Graph Neural Bandits](https://doi.org/10.1145/3580305.3599371)|Yunzhe Qi, Yikun Ban, Jingrui He|University of Illinois, Urbana Champaign; University of Illinois at Urbana-Champaign|Contextual bandits aim to choose the optimal arm with the highest reward out of a set of candidates based on their contextual information, and various bandit algorithms have been applied to personalized recommendation due to their ability of solving the exploitation-exploration dilemma. Motivated by online recommendation scenarios, in this paper, we propose a framework named Graph Neural Bandits (GNB) to leverage the collaborative nature among users empowered by graph neural networks (GNNs). Instead of estimating rigid user clusters, we model the "fine-grained'' collaborative effects through estimated user graphs in terms of exploitation and exploration individually. Then, to refine the recommendation strategy, we utilize separate GNN-based models on estimated user graphs for exploitation and adaptive exploration. Theoretical analysis and experimental results on multiple real data sets in comparison with state-of-the-art baselines are provided to demonstrate the effectiveness of our proposed framework.|上下文强盗(contextual bandits)的目标是根据上下文信息,从一组候选臂中选择报酬最高的最优臂;各种强盗算法因其解决开发-探索两难问题的能力而被应用于个性化推荐。受在线推荐场景的启发,本文提出了一种基于图形神经网络的用户协作框架 GNB。我们没有对刚性用户集群进行估计,而是根据开发和探索的不同,通过估计的用户图对“细粒度”的协作效果进行建模。然后,为了完善推荐策略,我们利用基于 GNN 的分离模型对估计用户图进行开发和自适应探索。通过对多个实际数据集的理论分析和实验结果与最新基线的比较,证明了我们提出的框架的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Graph+Neural+Bandits)|0| -|[Source-Free Domain Adaptation with Temporal Imputation for Time Series Data](https://doi.org/10.1145/3580305.3599507)|Mohamed Ragab, Emadeldeen Eldele, Min Wu, ChuanSheng Foo, Xiaoli Li, Zhenghua Chen|Center for Frontier AI Research, Agency for Science and Technology and Research (A*STAR); Institute for Infocomm Research, Agency for Science Technology and Research (A*STAR)|Source-free domain adaptation (SFDA) aims to adapt a pretrained model from a labeled source domain to an unlabeled target domain without access to the source domain data, preserving source domain privacy. Despite its prevalence in visual applications, SFDA is largely unexplored in time series applications. The existing SFDA methods that are mainly designed for visual applications may fail to handle the temporal dynamics in time series, leading to impaired adaptation performance. To address this challenge, this paper presents a simple yet effective approach for source-free domain adaptation on time series data, namely MAsk and imPUte (MAPU).
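The "Cracking White-box DNN Watermarks" attack above relies on function-preserving (invariant) neuron transforms. For a layer pair around a ReLU this is concrete: permute hidden neurons and rescale them by positive factors, undoing both in the next layer. The network's outputs are unchanged, while per-neuron weights and activations, exactly where white-box watermarks are read, are scrambled. A self-contained check on a toy two-layer network (the transform chain here is the standard permutation-plus-scaling instance, a sketch rather than the paper's full attack):

```python
import numpy as np

def invariant_transform(W1, b1, W2, rng):
    """Permute and positively rescale hidden units; compensate in the next layer.
    Valid for ReLU, which is positively homogeneous: relu(s*z) = s*relu(z), s > 0."""
    h = W1.shape[0]
    perm = rng.permutation(h)
    s = rng.uniform(0.5, 2.0, size=h)           # positive scales only
    W1p = s[:, None] * W1[perm]                 # diag(s) @ P @ W1
    b1p = s * b1[perm]
    W2p = W2[:, perm] / s[None, :]              # W2 @ P^T @ diag(1/s)
    return W1p, b1p, W2p

rng = np.random.default_rng(0)
W1, b1, W2 = rng.normal(size=(8, 4)), rng.normal(size=8), rng.normal(size=(3, 8))
x = rng.normal(size=4)
f = lambda A, c, B: B @ np.maximum(A @ x + c, 0.0)
W1p, b1p, W2p = invariant_transform(W1, b1, W2, rng)
assert np.allclose(f(W1, b1, W2), f(W1p, b1p, W2p))  # function preserved exactly
```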
First, to capture temporal information of the source domain, our method performs random masking on the time series signals while leveraging a novel temporal imputer to recover the original signal from a masked version in the embedding space. Second, in the adaptation step, the imputer network is leveraged to guide the target model to produce target features that are temporally consistent with the source features. To this end, our MAPU can explicitly account for temporal dependency during the adaptation while avoiding the imputation in the noisy input space. Our method is the first to handle temporal consistency in SFDA for time series data and can be seamlessly equipped with other existing SFDA methods. Extensive experiments conducted on three real-world time series datasets demonstrate that our MAPU achieves significant performance gain over existing methods. Our code is available at \url{https://github.com/mohamedr002/MAPU_SFDA_TS}.|无源域适应(SFDA)的目标是在不访问源域数据的情况下,将预先训练好的模型从标记的源域适应到未标记的目标域,从而保护源域的隐私。尽管 SFDA 在视觉应用方面很流行,但在时间序列应用方面却很大程度上没有得到探索。现有的 SFDA 方法主要是为视觉应用而设计的,可能无法处理时间序列中的时间动态,导致适应性能受损。为了解决这一问题,本文提出了一种简单而有效的时间序列数据无源域自适应方法,即 MAsk 和 imPUte (MAPU)。首先,为了获取源域的时间信息,该方法对时间序列信号进行随机掩蔽,同时利用一种新的时间计算机在嵌入空间中从掩蔽版本中恢复原始信号。其次,在适应步骤中,利用计算机网络引导目标模型产生与源特征在时间上一致的目标特征。为此,我们的 MAPU 可以明确地说明在适应期间的时间依赖性,同时避免了在噪声输入空间的插补。我们的方法是第一个处理时间序列数据时间一致性的 SFDA 方法,可以与其他现有的 SFDA 方法无缝配备。在三个实际时间序列数据集上进行的大量实验表明,我们的 MAPU 比现有的方法获得了显著的性能提高。我们的代码可以在 url { https://github.com/mohamedr002/mapu_sfda_ts }找到。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Source-Free+Domain+Adaptation+with+Temporal+Imputation+for+Time+Series+Data)|0| +|[Source-Free Domain Adaptation with Temporal Imputation for Time Series Data](https://doi.org/10.1145/3580305.3599507)|Mohamed Ragab, Emadeldeen Eldele, Min Wu, ChuanSheng Foo, Xiaoli Li, Zhenghua Chen|Institute for Infocomm Research, Agency for Science Technology and Research (A*STAR); Center for Frontier AI Research, Agency for Science and Technology and Research (A*STAR)|Source-free domain adaptation (SFDA) aims to adapt a pretrained model from a labeled source domain to an unlabeled target domain without access to the source domain data, preserving source domain privacy. Despite its prevalence in visual applications, SFDA is largely unexplored in time series applications. The existing SFDA methods that are mainly designed for visual applications may fail to handle the temporal dynamics in time series, leading to impaired adaptation performance. To address this challenge, this paper presents a simple yet effective approach for source-free domain adaptation on time series data, namely MAsk and imPUte (MAPU). First, to capture temporal information of the source domain, our method performs random masking on the time series signals while leveraging a novel temporal imputer to recover the original signal from a masked version in the embedding space. Second, in the adaptation step, the imputer network is leveraged to guide the target model to produce target features that are temporally consistent with the source features. To this end, our MAPU can explicitly account for temporal dependency during the adaptation while avoiding the imputation in the noisy input space. Our method is the first to handle temporal consistency in SFDA for time series data and can be seamlessly equipped with other existing SFDA methods. Extensive experiments conducted on three real-world time series datasets demonstrate that our MAPU achieves significant performance gain over existing methods. 
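MAPU's first step (entry above), random masking paired with a temporal imputer, can be sketched at the signal level. In the paper the reconstruction happens in the embedding space; masking raw time steps as below is the simplified, assumed part.

```python
import numpy as np

def random_time_mask(x, mask_ratio, rng):
    """Zero a random subset of time steps; return the masked series and the
    indices on which the imputer's reconstruction loss is computed."""
    T = x.shape[0]
    idx = rng.choice(T, size=max(1, int(mask_ratio * T)), replace=False)
    x_masked = x.copy()
    x_masked[idx] = 0.0
    return x_masked, idx
```

At adaptation time the frozen imputer then scores how temporally consistent the target features are with source-like dynamics, which is the guidance signal the abstract describes.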
Our code is available at https://github.com/mohamedr002/MAPU_SFDA_TS.|无源域适应(SFDA)的目标是在不访问源域数据的情况下,将预先训练好的模型从标记的源域适应到未标记的目标域,从而保护源域的隐私。尽管 SFDA 在视觉应用方面很流行,但在时间序列应用方面却很大程度上没有得到探索。现有的 SFDA 方法主要是为视觉应用而设计的,可能无法处理时间序列中的时间动态,导致适应性能受损。为了解决这一问题,本文提出了一种简单而有效的时间序列数据无源域自适应方法,即 MAsk 和 imPUte(MAPU)。首先,为了获取源域的时间信息,该方法对时间序列信号进行随机掩蔽,同时利用一种新的时间插补器(temporal imputer)在嵌入空间中从掩蔽版本中恢复原始信号。其次,在适应步骤中,利用插补器网络引导目标模型产生与源特征在时间上一致的目标特征。为此,我们的 MAPU 可以显式地考虑适应期间的时间依赖性,同时避免了在含噪输入空间中进行插补。我们的方法是第一个处理时间序列数据 SFDA 中时间一致性的方法,可以与其他现有的 SFDA 方法无缝结合。在三个实际时间序列数据集上进行的大量实验表明,我们的 MAPU 比现有的方法获得了显著的性能提高。我们的代码可在 https://github.com/mohamedr002/MAPU_SFDA_TS 找到。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Source-Free+Domain+Adaptation+with+Temporal+Imputation+for+Time+Series+Data)|0| |[Causal Effect Estimation on Hierarchical Spatial Graph Data](https://doi.org/10.1145/3580305.3599269)|Koh Takeuchi, Ryo Nishida, Hisashi Kashima, Masaki Onishi||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Causal+Effect+Estimation+on+Hierarchical+Spatial+Graph+Data)|0| |[Networked Time Series Imputation via Position-aware Graph Enhanced Variational Autoencoders](https://doi.org/10.1145/3580305.3599444)|Dingsu Wang, Yuchen Yan, Ruizhong Qiu, Yada Zhu, Kaiyu Guan, Andrew Margenot, Hanghang Tong|IBM Research; University of Illinois at Urbana-Champaign|Multivariate time series (MTS) imputation has been a widely studied problem in recent years. Existing methods can be divided into two main groups: (1) deep recurrent or generative models that primarily focus on time series features, and (2) graph neural network (GNN) based models that utilize the topological information from the inherent graph structure of MTS as relational inductive bias for imputation. Nevertheless, these methods either neglect topological information or assume the graph structure is fixed and accurately known. Thus, they fail to fully utilize the graph dynamics for precise imputation in more challenging MTS data such as networked time series (NTS), where the underlying graph is constantly changing and might have missing edges. In this paper, we propose a novel approach to overcome these limitations. First, we define the problem of imputation over NTS which contains missing values in both node time series features and graph structures. Then, we design a new model named PoGeVon which leverages variational autoencoder (VAE) to predict missing values over both node time series features and graph structures. In particular, we propose a new node position embedding based on random walk with restart (RWR) in the encoder with provable higher expressive power compared with message-passing based graph neural networks (GNNs). We further design a decoder with 3-stage predictions from the perspective of multi-task learning to impute missing values in both time series and graph structures reciprocally.
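PoGeVon's position embedding (entry above) builds on random walk with restart (RWR). The iteration below computes RWR visiting probabilities from one seed node; stacking such vectors for a set of anchor seeds gives every node a position signature. Dense matrices and a fixed iteration count are simplifications of what a real implementation would do on sparse, evolving graphs.

```python
import numpy as np

def rwr_scores(A, seed, restart=0.15, iters=50):
    """Random walk with restart from `seed`: with prob. (1 - restart) follow a
    random edge, otherwise jump back to the seed; returns visiting probabilities."""
    n = A.shape[0]
    P = A / np.clip(A.sum(axis=1, keepdims=True), 1e-12, None)  # row-stochastic
    e = np.zeros(n); e[seed] = 1.0
    r = e.copy()
    for _ in range(iters):
        r = (1.0 - restart) * P.T @ r + restart * e
    return r
```

Because RWR scores depend on a node's global location rather than only its local neighborhood, they can distinguish nodes that message-passing GNNs provably cannot, which is the expressiveness argument the abstract makes.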
Experiment results demonstrate the effectiveness of our model over baselines.|多变量时间序列(MTS)插补是近年来研究较多的一个问题。现有的方法可以分为两大类,包括(1)主要关注时间序列特征的深度递归或生成模型,和(2)基于图神经网络(GNN)的模型,这些模型利用 MTS 固有图结构的拓扑信息作为关系归纳偏差进行插补。然而,这些方法或者忽略了拓扑信息,或者假设图的结构是固定的并且已知的。因此,他们未能充分利用图动力学精确插补更具挑战性的 MTS 数据,如网络时间序列(NTS) ,其中底层图是不断变化的,可能有缺失的边。在本文中,我们提出了一种新的方法来克服这些限制。首先,我们定义了包含节点时间序列特征和图结构缺失值的 NTS 插补问题。然后,我们设计了一个新的模型 PoGeVon,它利用变分自动编码器(VAE)来预测节点时间序列特征和图结构上的缺失值。特别地,我们提出了一种新的基于重启随机游走(RWR)的节点位置嵌入编码器,与基于消息传递的图形神经网络(GNN)相比,具有可证明的更高的表达能力。我们进一步从多任务学习的角度设计了一个具有三阶段预测的解码器来相互推算时间序列和图结构中的缺失值。实验结果证明了该模型在基线上的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Networked+Time+Series+Imputation+via+Position-aware+Graph+Enhanced+Variational+Autoencoders)|0|
|[Incremental Causal Graph Learning for Online Root Cause Analysis](https://doi.org/10.1145/3580305.3599392)|Dongjie Wang, Zhengzhang Chen, Yanjie Fu, Yanchi Liu, Haifeng Chen||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Incremental+Causal+Graph+Learning+for+Online+Root+Cause+Analysis)|0|
@@ -204,27 +204,27 @@
|[Optimal Dynamic Subset Sampling: Theory and Applications](https://doi.org/10.1145/3580305.3599458)|Lu Yi, Hanzhi Wang, Zhewei Wei|Renmin University of China|We study the fundamental problem of sampling independent events, called subset sampling. Specifically, consider a set of $n$ events $S=\{x_1, \ldots, x_n\}$, where each event $x_i$ has an associated probability $p(x_i)$. The subset sampling problem aims to sample a subset $T \subseteq S$, such that every $x_i$ is independently included in $T$ with probability $p(x_i)$. A naive solution is to flip a coin for each event, which takes $O(n)$ time. However, the specific goal is to develop data structures that allow drawing a sample in time proportional to the expected output size $\mu=\sum_{i=1}^n p(x_i)$, which can be significantly smaller than $n$ in many applications. The subset sampling problem serves as an important building block in many tasks and has been the subject of various research for more than a decade. However, most of the existing subset sampling approaches are conducted in a static setting, where the events or their associated probability in set $S$ is not allowed to be changed over time. These algorithms incur either large query time or update time in a dynamic setting despite the ubiquitous time-evolving events with changing probability in real life. Therefore, it is a pressing need, but still, an open problem, to design efficient dynamic subset sampling algorithms. In this paper, we propose ODSS, the first optimal dynamic subset sampling algorithm. The expected query time and update time of ODSS are both optimal, matching the lower bounds of the subset sampling problem. We present a nontrivial theoretical analysis to demonstrate the superiority of ODSS. We also conduct comprehensive experiments to empirically evaluate the performance of ODSS. Moreover, we apply ODSS to a concrete application: influence maximization.
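To make the subset-sampling problem above concrete, the sketch below contrasts the naive coin-per-event sampler, which always costs O(n), with a simple skip-ahead sampler over probabilities sorted in decreasing order, whose expected work tracks the expected output size. This only illustrates the problem setting; ODSS itself is a different, optimal dynamic data structure.

```python
# Sketch of the subset-sampling problem (include event i independently with
# probability p[i]); not the ODSS data structure.
import math
import random

p = [0.9, 0.5, 0.01, 0.001, 0.0001]  # already sorted in decreasing order

def naive_subset(p):
    # One coin flip per event: O(n) work regardless of the output size.
    return [i for i, pi in enumerate(p) if random.random() < pi]

def skip_subset(p):
    # Bound the remaining probabilities by the current one (q), jump
    # geometrically to the next index that would pass Bernoulli(q), then thin
    # with p[i] / q. Expected work tracks mu = sum(p) much more closely than n
    # when most p[i] are tiny.
    out, i = [], 0
    while i < len(p):
        q = p[i]
        if q <= 0.0:
            break
        if q < 1.0:
            i += int(math.log(1.0 - random.random()) / math.log(1.0 - q))
        if i >= len(p):
            break
        if random.random() < p[i] / q:
            out.append(i)
        i += 1
    return out

print(naive_subset(p), skip_subset(p))
```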
We empirically show that our ODSS can improve the complexities of existing influence maximization algorithms on large real-world evolving social networks.|我们研究抽样独立事件的基本问题,称为子集抽样。具体来说,考虑一组 $n $事件 $S = { x _ 1,ldot,x _ n } $,其中每个事件 $x _ i $具有相关的概率 $p (x _ i) $。子集抽样问题的目标是抽样一个子集 $T 子集 S $,这样每个 $x _ i $都独立地包含在 $S $中,概率为 $p _ i $。一个天真的解决方案是为每个事件抛硬币,这需要花费 $O (n) $时间。然而,我们的具体目标是开发一种数据结构,它允许在与预期输出大小成正比的时间内绘制样本 $mu = sum _ { i = 1} ^ n p (x _ i) $,在许多应用程序中,它可以明显小于 $n $。子集抽样问题是许多工作中的一个重要组成部分,也是近十多年来各种研究的主题。但是,大多数现有的子集抽样方法都是在静态环境中进行的,其中不允许随着时间的推移更改集 $S $中的事件或其相关概率。尽管在现实生活中随时间演化的事件随概率的变化无处不在,但这些算法在动态环境中会产生大量的查询时间或更新时间。因此,设计高效的动态子集采样算法是一个迫切而又尚未解决的问题。本文提出了第一种最优动态子集抽样算法 ODSS。ODSS 的期望查询时间和更新时间均为最优,与子集抽样问题的下界相匹配。我们提出了一个非平凡的理论分析,以证明 ODSS 的优越性。我们还进行了综合性的实验,对 ODSS 的性能进行了实证评估。此外,我们将 ODSS 应用于一个具体的应用: 影响最大化。我们的实验表明,我们的 ODSS 可以改善现有的影响最大化算法在大型真实世界演化的社会网络上的复杂性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Optimal+Dynamic+Subset+Sampling:+Theory+and+Applications)|0| |[Sharpness-Aware Minimization Revisited: Weighted Sharpness as a Regularization Term](https://doi.org/10.1145/3580305.3599501)|Yun Yue, Jiadi Jiang, Zhiling Ye, Ning Gao, Yongchao Liu, Ke Zhang|Ant Group|Deep Neural Networks (DNNs) generalization is known to be closely related to the flatness of minima, leading to the development of Sharpness-Aware Minimization (SAM) for seeking flatter minima and better generalization. In this paper, we revisit the loss of SAM and propose a more general method, called WSAM, by incorporating sharpness as a regularization term. We prove its generalization bound through the combination of PAC and Bayes-PAC techniques, and evaluate its performance on various public datasets. The results demonstrate that WSAM achieves improved generalization, or is at least highly competitive, compared to the vanilla optimizer, SAM and its variants. The code is available at https://github.com/intelligent-machine-learning/dlrover/tree/master/atorch/atorch/optimizers.|深度神经网络(DNN)泛化与最小值的平坦性密切相关,导致锐度感知最小化(SAM)的发展,以寻求更平坦的最小值和更好的泛化。在本文中,我们重新审视了 SAM 的损失,并提出了一种更一般的方法,称为 WSAM,通过合并锐度作为一个正则项。通过结合 PAC 和 Bayes-PAC 技术证明了其泛化界,并对其在各种公共数据集上的性能进行了评估。结果表明,与普通的优化器 SAM 及其变体相比,WSAM 实现了改进的泛化,或者至少具有很强的竞争力。密码可在 https://github.com/intelligent-machine-learning/dlrover/tree/master/atorch/atorch/optimizers 查阅。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Sharpness-Aware+Minimization+Revisited:+Weighted+Sharpness+as+a+Regularization+Term)|0| |[Doubly Robust AUC Optimization against Noisy and Adversarial Samples](https://doi.org/10.1145/3580305.3599316)|Chenkang Zhang, Wanli Shi, Lei Luo, Bin Gu||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Doubly+Robust+AUC+Optimization+against+Noisy+and+Adversarial+Samples)|0| -|[Finding Favourite Tuples on Data Streams with Provably Few Comparisons](https://doi.org/10.1145/3580305.3599352)|Guangyi Zhang, Nikolaj Tatti, Aristides Gionis|HIIT, University of Helsinki; Shenzhen Institute of Computing Sciences; KTH Royal Institute of Technology|One of the most fundamental tasks in data science is to assist a user with unknown preferences in finding high-utility tuples within a large database. To accurately elicit the unknown user preferences, a widely-adopted way is by asking the user to compare pairs of tuples. In this paper, we study the problem of identifying one or more high-utility tuples by adaptively receiving user input on a minimum number of pairwise comparisons. 
We devise a single-pass streaming algorithm, which processes each tuple in the stream at most once, while ensuring that the memory size and the number of requested comparisons are in the worst case logarithmic in $n$, where $n$ is the number of all tuples. An important variant of the problem, which can help to reduce human error in comparisons, is to allow users to declare ties when confronted with pairs of tuples of nearly equal utility. We show that the theoretical guarantees of our method can be maintained for this important problem variant. In addition, we show how to enhance existing pruning techniques in the literature by leveraging powerful tools from mathematical programming. Finally, we systematically evaluate all proposed algorithms over both synthetic and real-life datasets, examine their scalability, and demonstrate their superior performance over existing methods.|数据科学中最基本的任务之一是帮助具有未知偏好的用户在大型数据库中查找高效用元组。为了准确地获得未知的用户首选项,一种被广泛采用的方法是要求用户比较元组对。在本文中,我们研究识别一个或多个高效用元组的问题,通过自适应接收用户输入的最小数目的成对比较。我们设计了一个单通道流式算法,它最多处理流中的每个元组一次,同时确保内存大小和请求比较的数量在最坏的情况下是以 $n $为对数的,其中 $n $是所有元组的数量。该问题的一个重要变体是允许用户在遇到效用几乎相等的元组对时声明关系,这有助于减少比较中的人为错误。我们表明,对于这个重要的问题变量,我们方法的理论保证是可以保持的。此外,我们还展示了如何通过利用数学编程中的强大工具来增强文献中现有的剪枝技术。最后,我们系统地评估了所有提出的算法在合成和实际数据集上的性能,检验了它们的可伸缩性,并证明了它们优于现有方法的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Finding+Favourite+Tuples+on+Data+Streams+with+Provably+Few+Comparisons)|0| +|[Finding Favourite Tuples on Data Streams with Provably Few Comparisons](https://doi.org/10.1145/3580305.3599352)|Guangyi Zhang, Nikolaj Tatti, Aristides Gionis|HIIT, University of Helsinki; KTH Royal Institute of Technology; Shenzhen Institute of Computing Sciences|One of the most fundamental tasks in data science is to assist a user with unknown preferences in finding high-utility tuples within a large database. To accurately elicit the unknown user preferences, a widely-adopted way is by asking the user to compare pairs of tuples. In this paper, we study the problem of identifying one or more high-utility tuples by adaptively receiving user input on a minimum number of pairwise comparisons. We devise a single-pass streaming algorithm, which processes each tuple in the stream at most once, while ensuring that the memory size and the number of requested comparisons are in the worst case logarithmic in $n$, where $n$ is the number of all tuples. An important variant of the problem, which can help to reduce human error in comparisons, is to allow users to declare ties when confronted with pairs of tuples of nearly equal utility. We show that the theoretical guarantees of our method can be maintained for this important problem variant. In addition, we show how to enhance existing pruning techniques in the literature by leveraging powerful tools from mathematical programming. 
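As a toy illustration of the streaming setup in this entry: a single pass that keeps one champion tuple and queries the comparison oracle once per arriving tuple already gives O(1) memory. The paper's contribution is driving the number of requested comparisons down to worst-case logarithmic in n, which this sketch does not attempt; the hidden linear utility standing in for the unknown user preference is our assumption.

```python
# Sketch: one-pass favourite-tuple selection via a pairwise-comparison oracle.
from typing import Callable, Iterable, Optional, Tuple

def stream_favourite(stream: Iterable[Tuple[float, ...]],
                     prefers: Callable[[tuple, tuple], bool]) -> Optional[tuple]:
    """Process each tuple at most once, comparing it with the current
    champion. Memory is O(1); comparisons are one per tuple here, far more
    than the logarithmic number the paper's algorithm requests."""
    champion = None
    for t in stream:
        if champion is None or prefers(t, champion):
            champion = t
    return champion

# Hidden linear utility standing in for the unknown user preference.
w = (0.7, 0.3)
oracle = lambda a, b: sum(x * y for x, y in zip(w, a)) > sum(x * y for x, y in zip(w, b))

tuples = [(0.2, 0.9), (0.8, 0.1), (0.6, 0.6), (0.9, 0.2)]
print(stream_favourite(tuples, oracle))  # (0.9, 0.2) under this utility
```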
Finally, we systematically evaluate all proposed algorithms over both synthetic and real-life datasets, examine their scalability, and demonstrate their superior performance over existing methods.|数据科学中最基本的任务之一是帮助具有未知偏好的用户在大型数据库中查找高效用元组。为了准确地获得未知的用户首选项,一种被广泛采用的方法是要求用户比较元组对。在本文中,我们研究识别一个或多个高效用元组的问题,通过自适应接收用户输入的最小数目的成对比较。我们设计了一个单通道流式算法,它最多处理流中的每个元组一次,同时确保内存大小和请求比较的数量在最坏的情况下是以 $n $为对数的,其中 $n $是所有元组的数量。该问题的一个重要变体是允许用户在遇到效用几乎相等的元组对时声明关系,这有助于减少比较中的人为错误。我们表明,对于这个重要的问题变量,我们方法的理论保证是可以保持的。此外,我们还展示了如何通过利用数学编程中的强大工具来增强文献中现有的剪枝技术。最后,我们系统地评估了所有提出的算法在合成和实际数据集上的性能,检验了它们的可伸缩性,并证明了它们优于现有方法的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Finding+Favourite+Tuples+on+Data+Streams+with+Provably+Few+Comparisons)|0| |[Domain-Specific Risk Minimization for Domain Generalization](https://doi.org/10.1145/3580305.3599313)|YiFan Zhang, Jindong Wang, Jian Liang, Zhang Zhang, Baosheng Yu, Liang Wang, Dacheng Tao, Xing Xie||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Domain-Specific+Risk+Minimization+for+Domain+Generalization)|0| -|[Towards Fair Disentangled Online Learning for Changing Environments](https://doi.org/10.1145/3580305.3599523)|Chen Zhao, Feng Mi, Xintao Wu, Kai Jiang, Latifur Khan, Christan Grant, Feng Chen|Baylor University; University of Florida; University of Arkansas; The University of Texas at Dallas; University of Texas at Dallas|In the problem of online learning for changing environments, data are sequentially received one after another over time, and their distribution assumptions may vary frequently. Although existing methods demonstrate the effectiveness of their learning algorithms by providing a tight bound on either dynamic regret or adaptive regret, most of them completely ignore learning with model fairness, defined as the statistical parity across different sub-population (e.g., race and gender). Another drawback is that when adapting to a new environment, an online learner needs to update model parameters with a global change, which is costly and inefficient. Inspired by the sparse mechanism shift hypothesis, we claim that changing environments in online learning can be attributed to partial changes in learned parameters that are specific to environments and the rest remain invariant to changing environments. To this end, in this paper, we propose a novel algorithm under the assumption that data collected at each time can be disentangled with two representations, an environment-invariant semantic factor and an environment-specific variation factor. The semantic factor is further used for fair prediction under a group fairness constraint. To evaluate the sequence of model parameters generated by the learner, a novel regret is proposed in which it takes a mixed form of dynamic and static regret metrics followed by a fairness-aware long-term constraint. The detailed analysis provides theoretical guarantees for loss regret and violation of cumulative fairness constraints. 
Empirical evaluations on real-world datasets demonstrate our proposed method sequentially outperforms baseline methods in model accuracy and fairness.|在变化环境下的在线学习问题中,数据随着时间的推移依次接收,其分布假设可能会频繁变化。尽管现有的方法通过提供动态后悔或适应性后悔的紧密界限来证明其学习算法的有效性,但大多数方法完全忽略了模型公平性的学习,这种模型公平性被定义为不同子群(例如种族和性别)的统计平价。另一个缺点是,在适应新环境时,在线学习者需要根据全局变化更新模型参数,这样做成本高,效率低。受稀疏机制转移假说的启发,我们认为在线学习中不断变化的环境可以归因于特定于环境的学习参数的部分变化,而其余的参数对不断变化的环境保持不变。为此,本文提出了一种新的算法,该算法假设每次采集的数据可以分解为两种表示: 环境不变的语义因子和环境特定的变化因子。语义因子进一步用于群体公平约束下的公平预测。为了评估学习者生成的模型参数序列,提出了一种新的遗憾度量方法,该方法采用动态和静态遗憾度量的混合形式,并且具有公平意识的长期约束。详细的分析为损失后悔和违反累积公平约束提供了理论保证。对实际数据集的实证分析表明,该方法在模型精度和公平性方面均优于基线方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Towards+Fair+Disentangled+Online+Learning+for+Changing+Environments)|0| +|[Towards Fair Disentangled Online Learning for Changing Environments](https://doi.org/10.1145/3580305.3599523)|Chen Zhao, Feng Mi, Xintao Wu, Kai Jiang, Latifur Khan, Christan Grant, Feng Chen|University of Texas at Dallas; Baylor University; University of Arkansas; University of Florida; The University of Texas at Dallas|In the problem of online learning for changing environments, data are sequentially received one after another over time, and their distribution assumptions may vary frequently. Although existing methods demonstrate the effectiveness of their learning algorithms by providing a tight bound on either dynamic regret or adaptive regret, most of them completely ignore learning with model fairness, defined as the statistical parity across different sub-population (e.g., race and gender). Another drawback is that when adapting to a new environment, an online learner needs to update model parameters with a global change, which is costly and inefficient. Inspired by the sparse mechanism shift hypothesis, we claim that changing environments in online learning can be attributed to partial changes in learned parameters that are specific to environments and the rest remain invariant to changing environments. To this end, in this paper, we propose a novel algorithm under the assumption that data collected at each time can be disentangled with two representations, an environment-invariant semantic factor and an environment-specific variation factor. The semantic factor is further used for fair prediction under a group fairness constraint. To evaluate the sequence of model parameters generated by the learner, a novel regret is proposed in which it takes a mixed form of dynamic and static regret metrics followed by a fairness-aware long-term constraint. The detailed analysis provides theoretical guarantees for loss regret and violation of cumulative fairness constraints. 
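The group fairness constraint referenced above is statistical parity across sub-populations. A minimal check of the parity gap on toy predictions (not the paper's regret machinery) can be written as:

```python
# Statistical (demographic) parity gap between two sub-populations.
import numpy as np

def parity_gap(y_pred: np.ndarray, group: np.ndarray) -> float:
    """|P(yhat=1 | g=0) - P(yhat=1 | g=1)|; 0 means perfect statistical parity."""
    rate0 = y_pred[group == 0].mean()
    rate1 = y_pred[group == 1].mean()
    return abs(float(rate0 - rate1))

rng = np.random.default_rng(0)
group = rng.integers(0, 2, size=1000)
# A biased predictor: its positive rate depends on group membership.
y_pred = (rng.random(1000) < np.where(group == 0, 0.6, 0.4)).astype(int)
print(round(parity_gap(y_pred, group), 3))  # roughly 0.2 for this toy predictor
```

A fairness-aware long-term constraint of the kind the paper analyzes would bound the cumulative sum of such gaps over the online rounds.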
Empirical evaluations on real-world datasets demonstrate our proposed method sequentially outperforms baseline methods in model accuracy and fairness.|在变化环境下的在线学习问题中,数据随着时间的推移依次接收,其分布假设可能会频繁变化。尽管现有的方法通过提供动态后悔或适应性后悔的紧密界限来证明其学习算法的有效性,但大多数方法完全忽略了模型公平性的学习,这种模型公平性被定义为不同子群(例如种族和性别)的统计平价。另一个缺点是,在适应新环境时,在线学习者需要根据全局变化更新模型参数,这样做成本高,效率低。受稀疏机制转移假说的启发,我们认为在线学习中不断变化的环境可以归因于特定于环境的学习参数的部分变化,而其余的参数对不断变化的环境保持不变。为此,本文提出了一种新的算法,该算法假设每次采集的数据可以分解为两种表示: 环境不变的语义因子和环境特定的变化因子。语义因子进一步用于群体公平约束下的公平预测。为了评估学习者生成的模型参数序列,提出了一种新的遗憾度量方法,该方法采用动态和静态遗憾度量的混合形式,并且具有公平意识的长期约束。详细的分析为损失后悔和违反累积公平约束提供了理论保证。对实际数据集的实证分析表明,该方法在模型精度和公平性方面均优于基线方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Towards+Fair+Disentangled+Online+Learning+for+Changing+Environments)|0| |[SMILE: Evaluation and Domain Adaptation for Social Media Language Understanding](https://doi.org/10.1145/3580305.3599907)|Vasilisa Bashlovkina, Riley Matthews, Zhaobin Kuang, Simon Baumgartner, Michael Bendersky|Google Research|We study the ability of transformer-based language models (LMs) to understand social media language. Social media (SM) language is distinct from standard written language, yet existing benchmarks fall short of capturing LM performance in this socially, economically, and politically important domain. We quantify the degree to which social media language differs from conventional language and conclude that the difference is significant both in terms of token distribution and rate of linguistic shift. Next, we introduce a new benchmark for Social MedIa Language Evaluation (SMILE) that covers four SM platforms and eleven tasks. Finally, we show that learning a tokenizer and pretraining on a mix of social media and conventional language yields an LM that outperforms the best similar-sized alternative by 4.2 points on the overall SMILE score.|我们研究了基于转换器的语言模型(LM)理解社交媒体语言的能力。社会媒体语言(SM)与标准的书面语言不同,然而现有的基准在这个社会、经济和政治重要的领域还不足以捕捉 LM 的表现。我们量化了社交媒体语言与传统语言的差异程度,并得出结论: 社交媒体语言与传统语言的差异在表征分布和语言转换率方面都是显著的。接下来,我们将为社会媒体语言评估(SMILE)引入一个新的基准,它涵盖了四个 SM 平台和十一个任务。最后,我们表明,学习一个标记器和预训练的社会媒体和传统语言的混合产生了一个 LM 的表现最好的类似大小的选择4.2分的总体 SMILE 得分。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=SMILE:+Evaluation+and+Domain+Adaptation+for+Social+Media+Language+Understanding)|0| -|[Augmenting Rule-based DNS Censorship Detection at Scale with Machine Learning](https://doi.org/10.1145/3580305.3599775)|Jacob Alexander Markson Brown, Xi Jiang, Van Tran, Arjun Nitin Bhagoji, Nguyen Phong Hoang, Nick Feamster, Prateek Mittal, Vinod Yegneswaran|Princeton University; University of Chicago; SRI International|The proliferation of global censorship has led to the development of a plethora of measurement platforms to monitor and expose it. Censorship of the domain name system (DNS) is a key mechanism used across different countries. It is currently detected by applying heuristics to samples of DNS queries and responses (probes) for specific destinations. These heuristics, however, are both platform-specific and have been found to be brittle when censors change their blocking behavior, necessitating a more reliable automated process for detecting censorship. In this paper, we explore how machine learning (ML) models can (1) help streamline the detection process, (2) improve the usability of large-scale datasets for censorship detection, and (3) discover new censorship instances and blocking signatures missed by existing heuristic methods. 
Our study shows that supervised models, trained using expert-derived labels on instances of known anomalies and possible censorship, can learn the detection heuristics employed by different measurement platforms. More crucially, we find that unsupervised models, trained solely on uncensored instances, can identify new instances and variations of censorship missed by existing heuristics. Moreover, both methods demonstrate the capability to uncover a substantial number of new DNS blocking signatures, i.e., injected fake IP addresses overlooked by existing heuristics. These results are underpinned by an important methodological finding: comparing the outputs of models trained using the same probes but with labels arising from independent processes allows us to more reliably detect cases of censorship in the absence of ground-truth labels of censorship.|全球审查制度的扩散导致了监测和揭露它的大量测量平台的发展。域名系统(DNS)的审查是各国使用的一个关键机制。目前,通过对特定目的地的 DNS 查询和响应(探测)样本应用启发式方法来检测它。然而,这些启发式方法都是针对特定平台的,当审查者改变他们的拦截行为时,这些方法被发现是脆弱的,这就需要一个更可靠的自动化过程来检测审查。本文探讨了机器学习(ML)模型在以下几个方面的作用: (1)简化检测过程; (2)提高大规模数据集在检测中的可用性; (3)发现新的检测实例和现有启发式方法遗漏的阻塞签名。我们的研究表明,监督模型,训练使用专家派生的标签对已知的异常和可能的检查的实例,可以学习检测启发采用不同的测量平台。更重要的是,我们发现,无监督模型,仅仅训练未经审查的实例,可以识别新的实例和变化的审查错过了现有的启发。此外,这两种方法都证明了能够发现大量新的 DNS 阻塞签名,即注入的假 IP 地址被现有的启发式方法忽略。这些结果得到了一个重要的方法论发现的支持: 比较使用相同探针训练的模型的输出,但是与独立过程产生的标签进行比较,使我们能够更可靠地检测在没有审查的地面真相标签的情况下的审查情况。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Augmenting+Rule-based+DNS+Censorship+Detection+at+Scale+with+Machine+Learning)|0| +|[Augmenting Rule-based DNS Censorship Detection at Scale with Machine Learning](https://doi.org/10.1145/3580305.3599775)|Jacob Alexander Markson Brown, Xi Jiang, Van Tran, Arjun Nitin Bhagoji, Nguyen Phong Hoang, Nick Feamster, Prateek Mittal, Vinod Yegneswaran|SRI International; University of Chicago; Princeton University|The proliferation of global censorship has led to the development of a plethora of measurement platforms to monitor and expose it. Censorship of the domain name system (DNS) is a key mechanism used across different countries. It is currently detected by applying heuristics to samples of DNS queries and responses (probes) for specific destinations. These heuristics, however, are both platform-specific and have been found to be brittle when censors change their blocking behavior, necessitating a more reliable automated process for detecting censorship. In this paper, we explore how machine learning (ML) models can (1) help streamline the detection process, (2) improve the usability of large-scale datasets for censorship detection, and (3) discover new censorship instances and blocking signatures missed by existing heuristic methods. Our study shows that supervised models, trained using expert-derived labels on instances of known anomalies and possible censorship, can learn the detection heuristics employed by different measurement platforms. More crucially, we find that unsupervised models, trained solely on uncensored instances, can identify new instances and variations of censorship missed by existing heuristics. Moreover, both methods demonstrate the capability to uncover a substantial number of new DNS blocking signatures, i.e., injected fake IP addresses overlooked by existing heuristics. 
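The unsupervised setting described above, training only on uncensored instances and flagging deviations, can be sketched with an off-the-shelf anomaly detector; the probe features and the choice of IsolationForest below are stand-ins of ours, not the paper's models.

```python
# Sketch: unsupervised detection trained only on "uncensored" probes,
# in the spirit of the setup above (features and model are stand-ins).
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)
# Toy DNS-probe features, e.g. [response time ms, answer count, TTL].
uncensored = rng.normal(loc=[50, 2, 300], scale=[10, 1, 50], size=(500, 3))
suspect = np.array([[5.0, 1.0, 10.0],      # injected-looking fast answer, tiny TTL
                    [52.0, 2.0, 310.0]])   # looks like a normal resolution

model = IsolationForest(random_state=0).fit(uncensored)
print(model.predict(suspect))  # -1 flags an anomaly, +1 looks benign
```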
These results are underpinned by an important methodological finding: comparing the outputs of models trained using the same probes but with labels arising from independent processes allows us to more reliably detect cases of censorship in the absence of ground-truth labels of censorship.|全球审查制度的扩散导致了监测和揭露它的大量测量平台的发展。域名系统(DNS)的审查是各国使用的一个关键机制。目前,通过对特定目的地的 DNS 查询和响应(探测)样本应用启发式方法来检测它。然而,这些启发式方法都是针对特定平台的,当审查者改变他们的拦截行为时,这些方法被发现是脆弱的,这就需要一个更可靠的自动化过程来检测审查。本文探讨了机器学习(ML)模型在以下几个方面的作用: (1)简化检测过程; (2)提高大规模数据集在检测中的可用性; (3)发现新的检测实例和现有启发式方法遗漏的阻塞签名。我们的研究表明,监督模型,训练使用专家派生的标签对已知的异常和可能的检查的实例,可以学习检测启发采用不同的测量平台。更重要的是,我们发现,无监督模型,仅仅训练未经审查的实例,可以识别新的实例和变化的审查错过了现有的启发。此外,这两种方法都证明了能够发现大量新的 DNS 阻塞签名,即注入的假 IP 地址被现有的启发式方法忽略。这些结果得到了一个重要的方法论发现的支持: 比较使用相同探针训练的模型的输出,但是与独立过程产生的标签进行比较,使我们能够更可靠地检测在没有审查的地面真相标签的情况下的审查情况。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Augmenting+Rule-based+DNS+Censorship+Detection+at+Scale+with+Machine+Learning)|0| |[Taming the Domain Shift in Multi-source Learning for Energy Disaggregation](https://doi.org/10.1145/3580305.3599910)|Xiaomin Chang, Wei Li, Yunchuan Shi, Albert Y. Zomaya||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Taming+the+Domain+Shift+in+Multi-source+Learning+for+Energy+Disaggregation)|0| |[Variance Reduction Using In-Experiment Data: Efficient and Targeted Online Measurement for Sparse and Delayed Outcomes](https://doi.org/10.1145/3580305.3599928)|Alex Deng, Michelle Du, Anna Matlin, Qing Zhang||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Variance+Reduction+Using+In-Experiment+Data:+Efficient+and+Targeted+Online+Measurement+for+Sparse+and+Delayed+Outcomes)|0| |[Modelling Delayed Redemption with Importance Sampling and Pre-Redemption Engagement](https://doi.org/10.1145/3580305.3599867)|Samik Datta, Anshuman Mourya, Anirban Majumder, Vineet Chaoji||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Modelling+Delayed+Redemption+with+Importance+Sampling+and+Pre-Redemption+Engagement)|0| |[From Human Days to Machine Seconds: Automatically Answering and Generating Machine Learning Final Exams](https://doi.org/10.1145/3580305.3599827)|Iddo Drori, Sarah J. Zhang, Reece Shuttleworth, Sarah Zhang, Keith Tyser, Zad Chin, Pedro Lantigua, Saisamrit Surbehera, Gregory Hunter, Derek Austin, Leonard Tang, Yann Hicke, Sage Simhon, Sathwik Karnik, Darnell Granberry, Madeleine Udell||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=From+Human+Days+to+Machine+Seconds:+Automatically+Answering+and+Generating+Machine+Learning+Final+Exams)|0| -|[Learning Multi-Agent Intention-Aware Communication for Optimal Multi-Order Execution in Finance](https://doi.org/10.1145/3580305.3599856)|Yuchen Fang, Zhenggang Tang, Kan Ren, Weiqing Liu, Li Zhao, Jiang Bian, Dongsheng Li, Weinan Zhang, Yong Yu, TieYan Liu|Microsoft; University of Illinois Urbana-Champaign; Shanghai Jiao Tong University; Microsoft Research Asia|Order execution is a fundamental task in quantitative finance, aiming at finishing acquisition or liquidation for a number of trading orders of the specific assets. Recent advance in model-free reinforcement learning (RL) provides a data-driven solution to the order execution problem. However, the existing works always optimize execution for an individual order, overlooking the practice that multiple orders are specified to execute simultaneously, resulting in suboptimality and bias. In this paper, we first present a multi-agent RL (MARL) method for multi-order execution considering practical constraints. 
Specifically, we treat every agent as an individual operator to trade one specific order, while keeping communicating with each other and collaborating for maximizing the overall profits. Nevertheless, the existing MARL algorithms often incorporate communication among agents by exchanging only the information of their partial observations, which is inefficient in complicated financial market. To improve collaboration, we then propose a learnable multi-round communication protocol, for the agents communicating the intended actions with each other and refining accordingly. It is optimized through a novel action value attribution method which is provably consistent with the original learning objective yet more efficient. The experiments on the data from two real-world markets have illustrated superior performance with significantly better collaboration effectiveness achieved by our method.|定单执行是定量金融的一项基本任务,其目的是完成对特定资产的多个交易定单的收购或清算。无模型强化学习的最新进展为订单执行问题提供了一个数据驱动的解决方案。然而,现有的工作总是优化单个订单的执行,忽视了多个订单指定同时执行的做法,导致次优性和偏差。本文首先提出了一种考虑实际约束的多代理 RL (MARL)多订单执行方法。具体来说,我们把每个代理当作一个独立的经营者来交易一个特定的订单,同时保持相互之间的沟通和合作,以实现总体利润的最大化。然而,现有的 MARL 算法往往只交换代理人的部分观测信息,而不考虑代理人之间的通信,在复杂的金融市场中效率低下。为了改进协作,我们提出了一个可学习的多轮通信协议,用于代理之间相互通信预期的操作并相应地进行细化。通过一种新的行为价值归因方法对其进行优化,该方法与原有的学习目标一致,但效率更高。对两个实际市场的数据进行的实验表明,该方法具有更好的协作效率和更好的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+Multi-Agent+Intention-Aware+Communication+for+Optimal+Multi-Order+Execution+in+Finance)|0| +|[Learning Multi-Agent Intention-Aware Communication for Optimal Multi-Order Execution in Finance](https://doi.org/10.1145/3580305.3599856)|Yuchen Fang, Zhenggang Tang, Kan Ren, Weiqing Liu, Li Zhao, Jiang Bian, Dongsheng Li, Weinan Zhang, Yong Yu, TieYan Liu|Shanghai Jiao Tong University; University of Illinois Urbana-Champaign; Microsoft Research Asia; Microsoft|Order execution is a fundamental task in quantitative finance, aiming at finishing acquisition or liquidation for a number of trading orders of the specific assets. Recent advance in model-free reinforcement learning (RL) provides a data-driven solution to the order execution problem. However, the existing works always optimize execution for an individual order, overlooking the practice that multiple orders are specified to execute simultaneously, resulting in suboptimality and bias. In this paper, we first present a multi-agent RL (MARL) method for multi-order execution considering practical constraints. Specifically, we treat every agent as an individual operator to trade one specific order, while keeping communicating with each other and collaborating for maximizing the overall profits. Nevertheless, the existing MARL algorithms often incorporate communication among agents by exchanging only the information of their partial observations, which is inefficient in complicated financial market. To improve collaboration, we then propose a learnable multi-round communication protocol, for the agents communicating the intended actions with each other and refining accordingly. It is optimized through a novel action value attribution method which is provably consistent with the original learning objective yet more efficient. 
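A toy version of the multi-round intention exchange described above: each agent announces an intended trade, observes the others' intentions, and refines its own so that the joint volume respects a shared cap. The cap-scaling rule is our stand-in for the learned communication protocol, not the paper's method.

```python
# Toy multi-round intention exchange for multi-order execution
# (a sketch of the communication pattern, not the paper's MARL algorithm).
def refine(intended_qty, others_total, cap=100.0):
    # Scale an agent's intended trade down when joint volume exceeds the cap.
    total = intended_qty + others_total
    return intended_qty if total <= cap else intended_qty * cap / total

intentions = {"agent_a": 60.0, "agent_b": 50.0, "agent_c": 40.0}
for round_ in range(3):  # a fixed number of communication rounds
    updated = {}
    for name, qty in intentions.items():
        others = sum(q for n, q in intentions.items() if n != name)
        updated[name] = refine(qty, others)
    intentions = updated
    print(round_, {k: round(v, 1) for k, v in intentions.items()})
# Intended volumes converge toward a joint allocation within the cap.
```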
The experiments on the data from two real-world markets have illustrated superior performance with significantly better collaboration effectiveness achieved by our method.|定单执行是定量金融的一项基本任务,其目的是完成对特定资产的多个交易定单的收购或清算。无模型强化学习的最新进展为订单执行问题提供了一个数据驱动的解决方案。然而,现有的工作总是优化单个订单的执行,忽视了多个订单指定同时执行的做法,导致次优性和偏差。本文首先提出了一种考虑实际约束的多代理 RL (MARL)多订单执行方法。具体来说,我们把每个代理当作一个独立的经营者来交易一个特定的订单,同时保持相互之间的沟通和合作,以实现总体利润的最大化。然而,现有的 MARL 算法往往只交换代理人的部分观测信息,而不考虑代理人之间的通信,在复杂的金融市场中效率低下。为了改进协作,我们提出了一个可学习的多轮通信协议,用于代理之间相互通信预期的操作并相应地进行细化。通过一种新的行为价值归因方法对其进行优化,该方法与原有的学习目标一致,但效率更高。对两个实际市场的数据进行的实验表明,该方法具有更好的协作效率和更好的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+Multi-Agent+Intention-Aware+Communication+for+Optimal+Multi-Order+Execution+in+Finance)|0| |[iETA: A Robust and Scalable Incremental Learning Framework for Time-of-Arrival Estimation](https://doi.org/10.1145/3580305.3599842)|Jindong Han, Hao Liu, Shui Liu, Xi Chen, Naiqiang Tan, Hua Chai, Hui Xiong||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=iETA:+A+Robust+and+Scalable+Incremental+Learning+Framework+for+Time-of-Arrival+Estimation)|0| |[Identifying Complicated Contagion Scenarios from Cascade Data](https://doi.org/10.1145/3580305.3599841)|Galen Harrison, Amro Alabsi Aljundi, Jiangzhuo Chen, S. S. Ravi, Anil Kumar S. Vullikanti, Madhav V. Marathe, Abhijin Adiga||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Identifying+Complicated+Contagion+Scenarios+from+Cascade+Data)|0| |[Large-scale Urban Cellular Traffic Generation via Knowledge-Enhanced GANs with Multi-Periodic Patterns](https://doi.org/10.1145/3580305.3599853)|Shuodi Hui, Huandong Wang, Tong Li, Xinghao Yang, Xing Wang, Junlan Feng, Lin Zhu, Chao Deng, Pan Hui, Depeng Jin, Yong Li||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Large-scale+Urban+Cellular+Traffic+Generation+via+Knowledge-Enhanced+GANs+with+Multi-Periodic+Patterns)|0| -|[SentiGOLD: A Large Bangla Gold Standard Multi-Domain Sentiment Analysis Dataset and Its Evaluation](https://doi.org/10.1145/3580305.3599904)|Md. Ekramul Islam, Labib Chowdhury, Faisal Ahamed Khan, Shazzad Hossain, Md. Sourave Hossain, Mohammad Mamun Or Rashid, Nabeel Mohammed, Mohammad Ruhul Amin|Bangladesh Computer Council; North South University; Giga Tech Limited; Fordham University|This study introduces SentiGOLD, a Bangla multi-domain sentiment analysis dataset. Comprising 70,000 samples, it was created from diverse sources and annotated by a gender-balanced team of linguists. SentiGOLD adheres to established linguistic conventions agreed upon by the Government of Bangladesh and a Bangla linguistics committee. Unlike English and other languages, Bangla lacks standard sentiment analysis datasets due to the absence of a national linguistics framework. The dataset incorporates data from online video comments, social media posts, blogs, news, and other sources while maintaining domain and class distribution rigorously. It spans 30 domains (e.g., politics, entertainment, sports) and includes 5 sentiment classes (strongly negative, weakly negative, neutral, and strongly positive). The annotation scheme, approved by the national linguistics committee, ensures a robust Inter Annotator Agreement (IAA) with a Fleiss' kappa score of 0.88. Intra- and cross-dataset evaluation protocols are applied to establish a standard classification system. Cross-dataset evaluation on the noisy SentNoB dataset presents a challenging test scenario. Additionally, zero-shot experiments demonstrate the generalizability of SentiGOLD. 
The top model achieves a macro f1 score of 0.62 (intra-dataset) across 5 classes, setting a benchmark, and 0.61 (cross-dataset from SentNoB) across 3 classes, comparable to the state-of-the-art. Fine-tuned sentiment analysis model can be accessed at https://sentiment.bangla.gov.bd.|本文介绍了孟加拉语多领域情感分析数据集 SentiGOLD。它包括70,000个样本,由不同的来源创建,并由一个性别平衡的语言学家团队进行注释。SentiGOLD 遵守孟加拉国政府和孟加拉语言学委员会商定的既定语言公约。与英语和其他语言不同,由于缺乏国家语言学框架,孟加拉语缺乏标准的情感分析数据集。该数据集合并了来自在线视频评论、社交媒体帖子、博客、新闻和其他来源的数据,同时严格维护了域和类的分布。它跨越30个领域(例如,政治,娱乐,体育) ,包括5个情绪类(强烈消极,弱消极,中立,和强烈积极)。由国家语言学委员会批准的注释方案确保了一个强有力的内部注释协议(IAA) ,Fleiss 的 kappa 得分为0.88。数据集内和数据集间的评估协议被用来建立一个标准的分类方案。对噪声 SentNoB 数据集进行跨数据集评估是一个具有挑战性的测试场景。此外,零拍实验证明了 SentiGOLD 的通用性。顶级模型在5个类别中实现了0.62(数据集内)的宏观 f1评分,设定了基准,并在3个类别中实现了0.61(来自 SentNoB 的跨数据集) ,与最先进的技术相当。微调的情绪分析模型可以在 https://sentiment.bangla.gov.bd 访问。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=SentiGOLD:+A+Large+Bangla+Gold+Standard+Multi-Domain+Sentiment+Analysis+Dataset+and+Its+Evaluation)|0|
+|[SentiGOLD: A Large Bangla Gold Standard Multi-Domain Sentiment Analysis Dataset and Its Evaluation](https://doi.org/10.1145/3580305.3599904)|Md. Ekramul Islam, Labib Chowdhury, Faisal Ahamed Khan, Shazzad Hossain, Md. Sourave Hossain, Mohammad Mamun Or Rashid, Nabeel Mohammed, Mohammad Ruhul Amin|North South University; Giga Tech Limited; Fordham University; Bangladesh Computer Council|This study introduces SentiGOLD, a Bangla multi-domain sentiment analysis dataset. Comprising 70,000 samples, it was created from diverse sources and annotated by a gender-balanced team of linguists. SentiGOLD adheres to established linguistic conventions agreed upon by the Government of Bangladesh and a Bangla linguistics committee. Unlike English and other languages, Bangla lacks standard sentiment analysis datasets due to the absence of a national linguistics framework. The dataset incorporates data from online video comments, social media posts, blogs, news, and other sources while maintaining domain and class distribution rigorously. It spans 30 domains (e.g., politics, entertainment, sports) and includes 5 sentiment classes (strongly negative, weakly negative, neutral, weakly positive, and strongly positive). The annotation scheme, approved by the national linguistics committee, ensures a robust Inter Annotator Agreement (IAA) with a Fleiss' kappa score of 0.88. Intra- and cross-dataset evaluation protocols are applied to establish a standard classification system. Cross-dataset evaluation on the noisy SentNoB dataset presents a challenging test scenario. Additionally, zero-shot experiments demonstrate the generalizability of SentiGOLD. The top model achieves a macro f1 score of 0.62 (intra-dataset) across 5 classes, setting a benchmark, and 0.61 (cross-dataset from SentNoB) across 3 classes, comparable to the state-of-the-art.
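For readers unfamiliar with the headline metric above: macro F1 is the unweighted mean of per-class F1, so minority sentiment classes weigh as much as the majority class.

```python
# Macro-F1 on a toy 5-class sentiment example.
from sklearn.metrics import f1_score

y_true = [0, 0, 1, 2, 3, 4, 4, 2, 1, 0]
y_pred = [0, 1, 1, 2, 3, 4, 3, 2, 1, 0]
print(round(f1_score(y_true, y_pred, average="macro"), 3))  # 0.787 here
```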
Fine-tuned sentiment analysis model can be accessed at https://sentiment.bangla.gov.bd.|本文介绍了孟加拉语多领域情感分析数据集 SentiGOLD。它包括70,000个样本,由不同的来源创建,并由一个性别平衡的语言学家团队进行注释。SentiGOLD 遵守孟加拉国政府和孟加拉语言学委员会商定的既定语言公约。与英语和其他语言不同,由于缺乏国家语言学框架,孟加拉语缺乏标准的情感分析数据集。该数据集合并了来自在线视频评论、社交媒体帖子、博客、新闻和其他来源的数据,同时严格维护了域和类的分布。它跨越30个领域(例如,政治,娱乐,体育) ,包括5个情绪类(强烈消极,弱消极,中立,和强烈积极)。由国家语言学委员会批准的注释方案确保了一个强有力的内部注释协议(IAA) ,Fleiss 的 kappa 得分为0.88。数据集内和数据集间的评估协议被用来建立一个标准的分类方案。对噪声 SentNoB 数据集进行跨数据集评估是一个具有挑战性的测试场景。此外,零拍实验证明了 SentiGOLD 的通用性。顶级模型在5个类别中实现了0.62(数据集内)的宏观 f1评分,设定了基准,并在3个类别中实现了0.61(来自 SentNoB 的跨数据集) ,与最先进的技术相当。微调的情绪分析模型可以在 https://sentiment.bangla.gov.bd 访问。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=SentiGOLD:+A+Large+Bangla+Gold+Standard+Multi-Domain+Sentiment+Analysis+Dataset+and+Its+Evaluation)|0| |[Off-Policy Learning-to-Bid with AuctionGym](https://doi.org/10.1145/3580305.3599877)|Olivier Jeunen, Sean Murphy, Ben Allison||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Off-Policy+Learning-to-Bid+with+AuctionGym)|0| |[FairCod: A Fairness-aware Concurrent Dispatch System for Large-scale Instant Delivery Services](https://doi.org/10.1145/3580305.3599824)|Lin Jiang, Shuai Wang, Baoshen Guo, Hai Wang, Desheng Zhang, Guang Wang||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=FairCod:+A+Fairness-aware+Concurrent+Dispatch+System+for+Large-scale+Instant+Delivery+Services)|0| |[CBLab: Supporting the Training of Large-scale Traffic Control Policies with Scalable Traffic Simulation](https://doi.org/10.1145/3580305.3599789)|Chumeng Liang, Zherui Huang, Yicheng Liu, Zhanyu Liu, Guanjie Zheng, Hanyuan Shi, Kan Wu, Yuhao Du, Fuliang Li, Zhenhui Jessie Li||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=CBLab:+Supporting+the+Training+of+Large-scale+Traffic+Control+Policies+with+Scalable+Traffic+Simulation)|0| |[Practical Synthetic Human Trajectories Generation Based on Variational Point Processes](https://doi.org/10.1145/3580305.3599888)|Qingyue Long, Huandong Wang, Tong Li, Lisi Huang, Kun Wang, Qiong Wu, Guangyu Li, Yanping Liang, Li Yu, Yong Li||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Practical+Synthetic+Human+Trajectories+Generation+Based+on+Variational+Point+Processes)|0| |[Deep Landscape Forecasting in Multi-Slot Real-Time Bidding](https://doi.org/10.1145/3580305.3599799)|Weitong Ou, Bo Chen, Yingxuan Yang, Xinyi Dai, Weiwen Liu, Weinan Zhang, Ruiming Tang, Yong Yu||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Deep+Landscape+Forecasting+in+Multi-Slot+Real-Time+Bidding)|0| |[NFT-Based Data Marketplace with Digital Watermarking](https://doi.org/10.1145/3580305.3599876)|Saeed Ranjbar Alvar, Mohammad Akbari, David (Ming Xuan) Yue, Yong Zhang||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=NFT-Based+Data+Marketplace+with+Digital+Watermarking)|0| -|[Rover: An Online Spark SQL Tuning Service via Generalized Transfer Learning](https://doi.org/10.1145/3580305.3599953)|Yu Shen, Xinyuyang Ren, Yupeng Lu, Huaijun Jiang, Huanyong Xu, Di Peng, Yang Li, Wentao Zhang, Bin Cui|Mila – Québec AI Institute; Peking University; ByteDance Inc.|Distributed data analytic engines like Spark are common choices to process massive data in industry. However, the performance of Spark SQL highly depends on the choice of configurations, where the optimal ones vary with the executed workloads. 
Among various alternatives for Spark SQL tuning, Bayesian optimization (BO) is a popular framework that finds near-optimal configurations given sufficient budget, but it suffers from the re-optimization issue and is not practical in real production. When applying transfer learning to accelerate the tuning process, we notice two domain-specific challenges: 1) most previous work focus on transferring tuning history, while expert knowledge from Spark engineers is of great potential to improve the tuning performance but is not well studied so far; 2) history tasks should be carefully utilized, where using dissimilar ones lead to a deteriorated performance in production. In this paper, we present Rover, a deployed online Spark SQL tuning service for efficient and safe search on industrial workloads. To address the challenges, we propose generalized transfer learning to boost the tuning performance based on external knowledge, including expert-assisted Bayesian optimization and controlled history transfer. Experiments on public benchmarks and real-world tasks show the superiority of Rover over competitive baselines. Notably, Rover saves an average of 50.1% of the memory cost on 12k real-world Spark SQL tasks in 20 iterations, among which 76.2% of the tasks achieve a significant memory reduction of over 60%.|像 Spark 这样的分布式数据分析引擎是工业中处理海量数据的常见选择。然而,Spark SQL 的性能在很大程度上取决于配置的选择,其中最佳配置随执行的工作负载而变化。在各种 Spark SQL 调优方案中,贝叶斯优化(BO)是一种流行的框架,它能够在预算充足的情况下找到接近最优的配置,但是它存在重新优化的问题,在实际生产中并不实用。当应用转移学习来加速调优过程时,我们注意到两个领域特有的挑战: 1)大多数以前的工作集中在转移调优历史,而来自 Spark 工程师的专家知识对于提高调优性能具有巨大的潜力,但是目前还没有得到很好的研究; 2)历史任务应该被仔细地利用,在使用不同的任务导致生产性能恶化的情况下。在本文中,我们介绍了 Rover,一个已部署的在线 Spark SQL 调优服务,用于在工业工作负载上进行高效和安全的搜索。针对这一挑战,我们提出了基于外部知识的广义迁移学习来提高调优性能,包括专家辅助的贝叶斯优化和受控历史迁移。在公共基准测试和实际任务上的实验表明,Rover 优于竞争基准测试。值得注意的是,在20次迭代中,Rover 为12k 实际 Spark SQL 任务平均节省了50.1% 的内存成本,其中76.2% 的任务实现了超过60% 的显著内存减少。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Rover:+An+Online+Spark+SQL+Tuning+Service+via+Generalized+Transfer+Learning)|0| +|[Rover: An Online Spark SQL Tuning Service via Generalized Transfer Learning](https://doi.org/10.1145/3580305.3599953)|Yu Shen, Xinyuyang Ren, Yupeng Lu, Huaijun Jiang, Huanyong Xu, Di Peng, Yang Li, Wentao Zhang, Bin Cui|Mila – Québec AI Institute; ByteDance Inc.; Peking University|Distributed data analytic engines like Spark are common choices to process massive data in industry. However, the performance of Spark SQL highly depends on the choice of configurations, where the optimal ones vary with the executed workloads. Among various alternatives for Spark SQL tuning, Bayesian optimization (BO) is a popular framework that finds near-optimal configurations given sufficient budget, but it suffers from the re-optimization issue and is not practical in real production. When applying transfer learning to accelerate the tuning process, we notice two domain-specific challenges: 1) most previous work focus on transferring tuning history, while expert knowledge from Spark engineers is of great potential to improve the tuning performance but is not well studied so far; 2) history tasks should be carefully utilized, where using dissimilar ones lead to a deteriorated performance in production. In this paper, we present Rover, a deployed online Spark SQL tuning service for efficient and safe search on industrial workloads. To address the challenges, we propose generalized transfer learning to boost the tuning performance based on external knowledge, including expert-assisted Bayesian optimization and controlled history transfer. 
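As background for the tuning loop described above, here is a skeleton of iterative configuration search over two Spark knobs against a synthetic cost. A Bayesian optimizer would replace the random proposals with surrogate-guided ones, and Rover additionally injects expert knowledge and transferred history; the objective below is made up, and the gigabyte-valued memory knob is our numeric simplification of the real spark.executor.memory setting.

```python
# Config-tuning sketch (random-search stand-in for a BO tuning service).
import random

SPACE = {
    "spark.executor.memory_gb": (1, 16),        # simplified numeric knob
    "spark.sql.shuffle.partitions": (50, 1000),  # real Spark SQL knob
}

def synthetic_cost(cfg):
    # Pretend the workload likes ~8 GB executors and ~200 partitions.
    mem_pen = (cfg["spark.executor.memory_gb"] - 8) ** 2
    part_pen = ((cfg["spark.sql.shuffle.partitions"] - 200) / 100) ** 2
    return mem_pen + part_pen + random.gauss(0, 0.1)  # noisy measurement

def propose():
    return {k: random.randint(lo, hi) for k, (lo, hi) in SPACE.items()}

best_cfg, best_cost = None, float("inf")
for trial in range(50):          # a fixed tuning budget, as in iterative tuning
    cfg = propose()
    cost = synthetic_cost(cfg)
    if cost < best_cost:
        best_cfg, best_cost = cfg, cost
print(best_cfg, round(best_cost, 2))
```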
Experiments on public benchmarks and real-world tasks show the superiority of Rover over competitive baselines. Notably, Rover saves an average of 50.1% of the memory cost on 12k real-world Spark SQL tasks in 20 iterations, among which 76.2% of the tasks achieve a significant memory reduction of over 60%.|像 Spark 这样的分布式数据分析引擎是工业中处理海量数据的常见选择。然而,Spark SQL 的性能在很大程度上取决于配置的选择,其中最佳配置随执行的工作负载而变化。在各种 Spark SQL 调优方案中,贝叶斯优化(BO)是一种流行的框架,它能够在预算充足的情况下找到接近最优的配置,但是它存在重新优化的问题,在实际生产中并不实用。当应用转移学习来加速调优过程时,我们注意到两个领域特有的挑战: 1)大多数以前的工作集中在转移调优历史,而来自 Spark 工程师的专家知识对于提高调优性能具有巨大的潜力,但是目前还没有得到很好的研究; 2)历史任务应该被仔细地利用,在使用不同的任务导致生产性能恶化的情况下。在本文中,我们介绍了 Rover,一个已部署的在线 Spark SQL 调优服务,用于在工业工作负载上进行高效和安全的搜索。针对这一挑战,我们提出了基于外部知识的广义迁移学习来提高调优性能,包括专家辅助的贝叶斯优化和受控历史迁移。在公共基准测试和实际任务上的实验表明,Rover 优于竞争基准测试。值得注意的是,在20次迭代中,Rover 为12k 实际 Spark SQL 任务平均节省了50.1% 的内存成本,其中76.2% 的任务实现了超过60% 的显著内存减少。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Rover:+An+Online+Spark+SQL+Tuning+Service+via+Generalized+Transfer+Learning)|0|
|[Root Cause Analysis for Microservice Systems via Hierarchical Reinforcement Learning from Human Feedback](https://doi.org/10.1145/3580305.3599934)|Lu Wang, Chaoyun Zhang, Ruomeng Ding, Yong Xu, Qihang Chen, Wentao Zou, Qingjun Chen, Meng Zhang, Xuedong Gao, Hao Fan, Saravan Rajmohan, Qingwei Lin, Dongmei Zhang||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Root+Cause+Analysis+for+Microservice+Systems+via+Hierarchical+Reinforcement+Learning+from+Human+Feedback)|0|
|[Knowledge Based Prohibited Item Detection on Heterogeneous Risk Graphs](https://doi.org/10.1145/3580305.3599852)|Tingyan Xiang, Ao Li, Yugang Ji, Dong Li||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Knowledge+Based+Prohibited+Item+Detection+on+Heterogeneous+Risk+Graphs)|0|
|[A Data-Driven Decision Support Framework for Player Churn Analysis in Online Games](https://doi.org/10.1145/3580305.3599759)|Yu Xiong, Runze Wu, Shiwei Zhao, Jianrong Tao, Xudong Shen, Tangjie Lyu, Changjie Fan, Peng Cui||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Data-Driven+Decision+Support+Framework+for+Player+Churn+Analysis+in+Online+Games)|0|
@@ -233,37 +233,37 @@
|[TwHIN-BERT: A Socially-Enriched Pre-trained Language Model for Multilingual Tweet Representations at Twitter](https://doi.org/10.1145/3580305.3599921)|Xinyang Zhang, Yury Malkov, Omar Florez, Serim Park, Brian McWilliams, Jiawei Han, Ahmed ElKishky||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=TwHIN-BERT:+A+Socially-Enriched+Pre-trained+Language+Model+for+Multilingual+Tweet+Representations+at+Twitter)|0|
|[Online Few-Shot Time Series Classification for Aftershock Detection](https://doi.org/10.1145/3580305.3599879)|Sheng Zhong, Vinicius M. A. Souza, Glenn Eli Baker, Abdullah Mueen||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Online+Few-Shot+Time+Series+Classification+for+Aftershock+Detection)|0|
|[A Feature-Based Coalition Game Framework with Privileged Knowledge Transfer for User-tag Profile Modeling](https://doi.org/10.1145/3580305.3599761)|Xianghui Zhu, Peng Du, Shuo Shao, Chenxu Zhu, Weinan Zhang, Yang Wang, Yang Cao||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Feature-Based+Coalition+Game+Framework+with+Privileged+Knowledge+Transfer+for+User-tag+Profile+Modeling)|0|
|[Fairness in Graph Machine Learning: Recent Advances and Future Prospectives](https://doi.org/10.1145/3580305.3599555)|Yushun Dong, Oyku Deniz Kose, Yanning Shen, Jundong Li|; university of grenoble; potsdam institute for climate impact research; vienna university of economics and business; university of cambridge; polish academy of sciences; vu university amsterdam; netherlands environmental assessment agency; university of reading|Scenarios are used to explore the consequences of different adaptation and mitigation strategies under uncertainty. In this paper, two scenarios are used to explore developments with (1) no mitigation leading to an increase of global mean temperature of 4 °C by 2100 and (2) an ambitious mitigation strategy leading to 2 °C increase by 2100. For the second scenario, uncertainties in the climate system imply that a global mean temperature increase of 3 °C or more cannot be ruled out. Our analysis shows that, in many cases, adaptation and mitigation are not trade-offs but supplements. For example, the number of people exposed to increased water resource stress due to climate change can be substantially reduced in the mitigation scenario, but adaptation will still be required for the remaining large numbers of people exposed to increased stress. Another example is sea level rise, for which, from a global and purely monetary perspective, adaptation (up to 2100) seems more effective than mitigation. From the perspective of poorer and small island countries, however, stringent mitigation is necessary to keep risks at manageable levels. For agriculture, only a scenario based on a combination of adaptation and mitigation is able to avoid serious climate change impacts. Keywords Scenarios Integrated assessment Climate change Mitigation Adaptation Climate impacts 1 Introduction Scenario analysis forms a very important tool in the assessment of climate change and climate change policy, allowing analysts to explore the complex and uncertain future interactions between factors like economic development, greenhouse gas (GHG) emissions, climate and ecosystems. Together these factors determine the need and the possibilities for mitigation and adaptation policy. Scenarios can also act as a means to harmonize assumptions across very different research communities that are involved in the fields of climate research, allowing a better comparison of their results. As such, scenarios have been used extensively in both mitigation and adaptation studies (see Metz et al., 2007; Parry et al., 2007 ) (especially the scenarios from Special Report on Emission Scenarios (SRES) ( Nakicenovic et al., 2000 )). Moss et al. (2010) point out that since the SRES, information requirements from scenario analysis are changing. First, there is an increasing interest in exploring the relationships between adaptation and mitigation. As indicated by Moss et al.
(2010) , this would require a further integration of information across the different analytical traditions involved in climate research. Secondly, there is also an increased interest in scenarios that explicitly explore the impact of climate policies in addition to the climate policy-free scenarios explored so far. Specifically, there is a strong interest in being able to evaluate the “costs” and “benefits” of long-term climate goals vis-à-vis the situation without climate policy. In this paper, we follow this line of thought and explore how scenario analysis can contribute to a joint assessment of future adaptation and mitigation strategies. Such a joint assessment can be useful for several reasons: (1) the preferred mitigation strategy depends on expected climate impacts and adaptation costs, (2) it takes account of the limitations of adaptation to climate change, (3) some adaptation and mitigation strategies may interact and (4) finally, impacts of climate change may have important feedbacks that need to be taken into account. Such analysis is most useful at a strategic level, and not for individual adaptation (or mitigation) decisions. Given this purpose, we discuss in the paper two main scenarios that include elements of adaptation and mitigation strategies (see further in this paper), resulting in an increase of global mean temperature of 4 °C and 2 °C by the end of this century. These two temperature levels have started to become iconic numbers, representing a potential outcome in the situation without mitigation policy (4 °C) and the temperature target of international climate negotiations (2 °C) ( Copenhagen Accord, 2009 ). Arguably, understanding the implications of these two temperature levels is essential if political leaders are to make informed choices about the balance between mitigation, adaptation and climate impacts ( Environmental Change Institute, 2009 ). Integrated assessment of mitigation and adaptation strategies is hampered by methodological differences. Integrated assessment models have difficulties describing adaptation processes given the importance of local circumstances ( Patt et al., 2010 ). A practical problem is that to date a considerable part of the impact literature has concentrated on impacts under no-policy scenarios (exceptions include Arnell et al., 2002; Bakkenes et al., 2006; Hayashi et al., 2010; Krol et al., 1997; Nicholls and Lowe, 2004 ). This paper therefore presents a generalised scenario assessment based on coupled pieces of information – but without pretending to be complete or to be fully integrated. As a learning-by-doing exercise, the paper intends to show important differences between a 4 °C and a 2 °C world, but also to identify some of the practical issues involved in performing integrated scenario analysis. This implies that the most important advancement compared to existing literature is that we present a multi-sector analysis based on consistent scenarios. Given the state-of-the-art of current integrated assessment models, the experiments have been done using several loosely coupled models. As a result, several important linkages could not be addressed such as between the adaptation responses for agriculture, which may involve irrigation (see Section 5.3 ) and water demand (Section 5.4 ). In fact, an important question raised in the paper is whether a fully integrated analysis is needed or whether partial integration is sufficient. 
The paper is organized as follows: we first discuss some of the methodological complications in developing scenarios that can provide information for both adaptation and mitigation policy decisions. Next, we discuss the differences between the two main scenarios in terms of socio-economic drivers (Sections 3 and 4 ). In Section 5 we explore the potential consequences of adaptation and mitigation strategies on various impacts of climate change. 2 Assessment of climate strategies and scenario development (theory and methods) 2.1 Different strategies in response to climate change Climate change and the responses to it can lead to three forms of costs (not necessarily monetary): (1) the (residual) costs of climate impacts, (2) the costs of adaptation and (3) the costs of mitigation. At least theoretically, this corresponds to three different strategies: (1) “laissez faire” (accept climate change), (2) focus on adaptation and (3) focus on mitigation as illustrated conceptually in Fig. 1 (see also Klein et al., 2007 ). While Fig. 1 suggests that the costs and benefits of mitigation, adaptation and residual damages can be traded-off against each other, there are conceptual and analytical problems that complicate such an approach. These relate to spatial and temporal scales, and risks and uncertainty ( Swart and Raes, 2007 ). Mitigation and adaptation are processes that take place at different spatial s cales. While mitigation action is often taken at the national or local scale, the benefits are shared globally. As a result, a critical factor in the success and costs of climate policy is the degree of international cooperation ( Barker et al., 2009; Clarke et al., 2010; van Vliet et al., 2009; van Vuuren et al., 2009 ). For adaptation, in contrast, both costs and benefits occur on multiple scales from local to national and even international. An enabling environment at a larger scale can still enhance adaptation at a smaller scale (e.g. local capacity-building funded by international financing mechanisms). For these kinds of reasons, assessment of mitigation tend to concentrate on the global level, while by contrast, adaptation research is mostly focusing at the local scale. The dynamics over time of mitigation and adaptation is also an important factor. Stringent mitigation scenarios typically require strong, early reduction of emissions. Climate change impacts of these scenarios, however, will in the short-term (first decades) hardly differ from those in scenarios without climate change policy due to the large inertia within the climate system. In contrast, some associated impacts (e.g. co-benefits in reduced local air pollution) may be realized at a much faster pace. Adaptation measures are likely to yield private and social benefits over the near-term. For instance, simple adaptation measures such as air conditioning can bring clear short-term benefits. Some important exceptions exist which may require decades to implement, such as changes in spatial planning or large-scale engineering works for flood protection (see Hallegatte, 2009 ). Other important factors are risk and uncertainty . Our understanding of climate change faces many uncertainties. Key uncertainties to be identified comprise epistemic, data, model, and ontic uncertainties ( Schneider and Kuntz-Duriseti, 2002; van Vuuren et al., 2008a ). Examples of factors that involve uncertainty are (i) future emissions, (ii) the climate system, (iii) future vulnerability and exposure to climate risks and (iv) mitigation costs. 
Taking mitigative action reduces some uncertainties, since it reduces the originating sources of climate change and reveals the actual mitigation costs ( Barker, 2003; Piani et al., 2005 ). Mitigation may, however, also add to risks. For example, bio-energy, if implemented unsustainably, may offset one set of risks (climate change) while creating another set of different risks (biodiversity loss and reduced food security). One way of dealing with risks is to include assessments of probabilities. This is often done using past evidence, extrapolated to cover specific future circumstances. Other uncertainties (for instance unknowable shocks and surprises) are more difficult to deal with in quantitative sense, but justify acknowledgement of ignorance. Scenarios can be used to explore the potential for extreme events and the robustness of various policy portfolios but this is not often done ( Berkhout et al., 2002 ). Traditionally, the disciplines involved in mitigation research and adaptation research have different ways of describing uncertainty. While mitigation research often uses quantitative methods and concentrates on mean estimates, adaptation research often focuses more on qualitative descriptions of uncertainty and concentrates on the risks of hazardous events even if these have a low probability of occurrence. These different perceptions of uncertainty may complicate an integrated assessment of different strategies ( Swart et al., 2009 ). 2.2 Types of scenarios We can characterize scenarios into different classes based on the considerations about mitigation and adaptation. First, we define a baseline scenario, as a trajectory of events assuming no major feedbacks from climate change and no specific policy efforts on either mitigation or adaptation (such a scenario may still include many actions that indirectly influence the ability to mitigate or adapt to climate change; for instance, increasing income levels can be expected to coincide with greater investment in health services reducing the risks of climate-related diseases such as malaria). The main purpose of this type of scenario is analytical, serving as a point of reference for other scenarios. Second, adaptation scenarios describe a world in which societies are responding to climate change impacts. Their purpose is to explore the type of technologies and policies required to adapt to climate change, the avoided damage and the associated costs. Adaptation includes so-called autonomous adaptation (i.e. actions that occur without specific government action) and planned adaptation. Third, mitigation scenarios describe a world including policies aiming to limit climate change. Their purpose is to explore the type of technologies and policies required to minimize climate change and the associated costs. As there will always be remaining impacts, the fourth set, adaptation and mitigation scenarios combine both types of responses to climate change. Possibly, this fourth category of scenarios could re-order policy options according to the synergies that might exists between adaptation and mitigation options, e.g. for some re-afforestation options. Each of these scenarios is connected to a broader social, political and cultural context in which they are assumed to arise. 
In exploring a preferred mix of mitigation, adaptation and residual damage, two main approaches exist: (i) the impact- and risk-based approach, which describes potential impacts as a function of global mean temperature increase (and thus mitigation), and (ii) cost–benefit analysis, which identifies monetary costs and benefits in order to maximize welfare (see for instance Nordhaus, 2008; Tol, 2002c). In both cases, we believe it to be more useful, and more reflective of the issue, to describe the relationships between different response strategies than to seek to determine an optimum. Given the complexities and uncertainties laid out in Section 2.1, we believe no optimal mitigation, adaptation or combined strategy can be pursued in reality.

2.3 Integrated analysis

An integrated analysis of mitigation and adaptation can be achieved in different ways: e.g., by using one single, so-called integrated assessment model, or by exchanging information between different models and disciplines, assessing available literature and making results comparable. Both methods are organized around the cause–effect chain of climate change, i.e. describing the relationship between economic activities (income, energy use, agriculture, etc.), emissions, climate change and impacts – and the related feedbacks (Fig. 2). This scheme in fact also forms the backbone of information flows around scenarios for the IPCC reports (Moss et al., 2010). Scenarios are developed first by integrated assessment and emission modelers, focusing on economic driving forces, energy and land use, and GHG emissions (IPCC "Working Group III"). Subsequently, the emission trajectories are used in climate models to assess the impacts of climate change (IPCC "Working Group I"). Finally, the scenarios are used for impact, adaptation and vulnerability analyses (IPCC "Working Group II"). The involvement of different research disciplines and working groups implies that it is difficult to account for feedbacks between the different areas. Integrated assessment models capture only a limited number of the possible feedbacks (frequently omitted feedbacks include the impact of food and water security on population and economic drivers, relationships between water scarcity and food production, the impact of climate change on energy use, etc.). Ignoring (some of) these feedbacks may be reasonable if they are not substantial enough to significantly influence the system.

For analytical reasons, there are major advantages to organizing scenario development within disciplinary fields and considering a limited number of feedbacks. It allows researchers to focus on elements of the chain that they understand well and to add the required amount of detail, without being confronted with the complications of interlinkages. However, this may change in a situation of increased focus on integrated analysis of mitigation and adaptation strategies. Some examples of why an integrated approach may be necessary are:

i. Climate impacts, such as those triggered by extreme events, may be so severe that they undermine the economic assumptions of the original scenario;
ii. Climate impacts could be substantial in agriculture, so that estimates of land-use related emissions that do not take impacts into account might be wrong, and the mitigation potential of bio-energy may be affected; and
iii. There may be competing claims for land areas attractive for both mitigation and adaptation purposes.
Thus, an interesting question is whether the need for more integrated analysis is so urgent that more complex modes of integration are needed (interactive coupling of models; one complex model), or whether the impacts can be handled separately, simplifying the analysis framework. The time horizon and the decision focus may also be important here, e.g. whether potential tipping points are taken into account (Lenton et al., 2008). The few available studies that have looked into this question seem to suggest that in most sectors the adaptation implications of any mitigation project are small, as are the emissions generated by most adaptation activities (Klein et al., 2007). The most integrated analyses to date come from the cost–benefit oriented integrated assessment models like FUND, DICE and MERGE (Manne and Richels, 2005; Nordhaus, 2008; Tol, 2002c) – but these models typically aggregate climate impacts into a limited number of rather abstract damage functions. We believe that over time, with growing intensity of both mitigation and adaptation measures across many sectors, the need for joint assessment with sufficient detail will intensify. The scenarios presented here, based on the current state of the art in modeling and scenario development, take a first step. The same scenarios are used in one assessment for mitigation and impact assessment, and we explicitly address mitigation and adaptation strategies (either as part of the scenarios or within the models used for the different impacts). However, many feedbacks are not accounted for. We return at the end of the paper to the role of more integrated (but also more complex) scenarios.

2.4 Methods used in this paper

As described above, several types of scenarios can be identified: baseline, mitigation, adaptation and adaptation–mitigation scenarios. These scenario types are also presented in this paper. For the baseline/adaptation scenario, we assume intermediate assumptions for most socio-economic drivers. Scenario assumptions are described in Sections 3 and 4. The scenarios do not include mitigation, leading to a global mean temperature increase of 4 °C above pre-industrial levels by 2100. While we describe possible impacts and adaptation in these scenarios, we do not include feedbacks on the original drivers. In the mitigation scenarios, stringent mitigation efforts are included, leading to a global mean temperature increase of 2 °C. Using the median value for climate sensitivity given by IPCC of 3 °C (Meehl et al., 2007), this translates into a stabilization level of around 450 ppm CO2-equivalent (CO2-equiv.). The impacts of climate policy on economic drivers are not accounted for – but several other relationships are coupled (e.g. land use). In most of the paper, we thus ignore potential impacts of climate change and climate policy on the economic assumptions. In Section 5.8, however, we discuss their impacts within a simple economic model (FAIR) to provide some insight into the possible size of the economic consequences at the global scale.

Several model tools are used. The scenarios are mainly developed using the IMAGE integrated assessment model (Bouwman et al., 2006). The IMAGE model describes developments in energy and land use in the 21st century based on assumptions for population and the world economy, combined with assumptions for technology development and consumption patterns.
The model projects climate change (as indexed by global mean temperature change and sea level rise) at the global scale, and constructs spatial scenarios for changes in monthly temperature and rainfall on a 0.5° × 0.5° grid by pattern-scaling downscaled climate model patterns. The output of IMAGE is used in the DIVA model to describe sea-level rise; in the global hydrology model Mac-PDM to estimate consequences for water stress; in the TIMER energy model to estimate implications for heating and cooling demand; in the MARA/ARMA malaria suitability model for impacts on malaria; and in the FAIR model for a monetary cost–benefit analysis. Moreover, we discuss more generally the implications for agriculture (based on IPCC AR4) and extreme events. Appendix A provides a brief description of all models used. In our descriptions, we focus on the global level (in view of the limited space). Clearly, this leads to limitations in our discussion of adaptation. The experiments depend on the design of each model, and thus the number of scenarios that can be presented differs between different impacts. This implies that the study should be interpreted as a first illustration of an integrated assessment, and not as a holistic study on adaptation and its limits.
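To make the pattern-scaling step concrete, the minimal sketch below multiplies a normalized climate-model anomaly pattern (degrees of local change per degree of global mean change) by a global mean temperature trajectory. The grid, pattern values and years are invented for illustration; the actual set-up uses downscaled GCM patterns on the 0.5° × 0.5° grid and MAGICC temperature trajectories.

```python
import numpy as np

# Illustrative only: a tiny 3x3 "grid" standing in for the 0.5 x 0.5 degree
# field of local warming per degree of global mean warming (from a GCM run).
pattern = np.array([[1.8, 1.5, 1.2],    # high latitudes warm faster...
                    [1.1, 1.0, 0.9],
                    [0.8, 0.7, 0.6]])   # ...than low latitudes

# Global mean temperature change (deg C above pre-industrial); the values
# here are the baseline-scenario numbers quoted later in the text.
global_dT = {2050: 2.1, 2100: 3.7}

# Pattern scaling: local change = global mean change x normalized local pattern.
for year, dT in global_dT.items():
    local_dT = dT * pattern
    print(year)
    print(np.round(local_dT, 2))
```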
3 Results: socio-economic trends in the baseline scenario

3.1 Population development and economic growth

We assume that population follows the medium-fertility variant of the 2004 revision of the World Population Projections (UN, 2005) up to 2050, and the UN's long-range medium projections up to 2100 (Fig. 3). This implies that the global population steadily increases to almost 9.1 billion people by 2050 and stabilizes at about 9.2 billion people over the subsequent 50 years up to 2100. The scenario takes a middle ground within the range of population forecasts (see Fig. 3). For economic growth up to 2050, the scenario follows projections linked to the Cambridge model E3MG (Barker and Scrieciu, 2010; Barker et al., 2008). The scenario was extended beyond 2050 using the economic growth projections of the SRES-based B2 scenario (IMAGE-team, 2001). Quantitatively, the scenario is a medium to high economic growth scenario, which is mainly the result of optimistic growth assumptions for China and India. The OECD economies are projected to remain the richest in the world in per capita terms, but in terms of total economic activity the importance of developing regions grows rapidly. The growth of GDP per capita is between 0 and 2% per annum in Africa, the Middle East and Latin America. In Asia, it falls from the current high levels to 3% per annum in 2050.

3.2 Energy use and greenhouse gas emissions for the baseline scenario

Energy use in the baseline scenario is made consistent with a baseline published by the European Commission (EC, 2006). Despite a further decrease of energy intensity, world energy consumption more than doubles in the 2000–2050 period and increases by another 25% in the 2050–2100 period (Fig. 4). Over the whole century, energy supply remains dominated by fossil fuels. While oil and natural gas production peak and decline during the century, the use of coal increases during the whole scenario period. Non-fossil energy production also increases rapidly. Nuclear energy use increases by a factor of two to three to 76 EJ over the period until 2100, the use of biomass increases strongly, and hydro-electricity production increases by about 60–80%. The largest relative increase is that of wind and solar energy; this rises from less than 1% of all non-fossil energy to between 10 and 14% in 2050. Total renewable energy use in 2050 is 120–140 EJ, and 190 EJ in 2100. The trends described above imply that emissions of CO2 from energy activities more than double in the period to 2050, and rise by another third between 2050 and 2100 (see Fig. 3). As such, the scenario forms an intermediate baseline scenario within the literature range (Fisher et al., 2007). Non-CO2 GHGs (in particular methane) increase steadily in the period 2000–2050, but at a slower rate than CO2 (as their driver, agriculture, is expected to grow more slowly than the energy sector). CO2 emissions from land use fall back to zero during the first half of the century. The area of agricultural land lies within the range of similar scenarios that have recently been published, although at the low end of the range (Rose et al., 2007).

4 Results for the mitigation scenario and climate scenarios

4.1 Energy use and greenhouse gas emissions

The mitigation scenario aims at stabilising GHGs at around 450 ppm CO2-equiv. (see also van Vuuren et al., 2007, 2010). The scenario allows for an initial overshoot of concentration to about 510 ppm CO2-equiv. Den Elzen and van Vuuren (2007) have shown earlier that a limited overshoot of concentration allows for meeting similar climate targets at lower costs. Emission reductions are achieved in various ways. One element is increased energy efficiency, which reduces the total amount of energy use (a 20% reduction in 2050 compared to baseline) (see Fig. 4). The scenario also shows an increasing use of energy from non-fossil sources, which account for most of the growth in total energy use. Non-fossil energy use increases from about 15% of total primary energy use in 2010 to more than 30% in 2050, and is over 40% of the total by the end of the century. Most of this growth is due to an increase in bio-energy use. Carbon capture and storage is applied in most remaining stationary uses of fossil fuels. Finally, non-CO2 greenhouse gas emissions are also reduced. As a result, global emissions peak around 2020 and decline further with time. Emissions are reduced by more than 70% compared to the baseline in 2050 and more than 80% by 2100. The consequences of mitigation policies affect not only the energy sector, but also land use. Substantial additional land areas are used for afforestation and bio-energy (see Fig. 5). Model comparison studies show that the mitigation scenarios presented here are consistent with the current literature, although models show significant differences in the contribution of various reduction measures (Clarke et al., 2010; Edenhofer et al., 2010). According to the IMAGE model calculations, the abatement costs of the emission reductions are in the order of 1–2% of GDP (i.e. the annual additional expenditures, which can be compared to the current expenditure of around 1.5% of GDP on environmental policy in OECD countries) (Fig. 6). The literature range for comparable scenarios is in the order of 0.5–5.5% in 2100. Most studies agree that these additional expenditures would lead to a reduction of GDP. We discuss this further in Section 5.8.

4.2 Climate change under the baseline and mitigation scenario

The atmospheric GHG concentration and associated global mean temperature change resulting from the emissions of the two scenarios are shown in Fig. 7 (solid lines indicate best-guess values), based on the IMAGE model calculations.
The IMAGE model uses the MAGICC model to calculate changes in global mean temperature. The MAGICC model was used earlier for similar IMAGE scenarios by van Vuuren et al. (2008b) to calculate trajectories for greenhouse gas concentration and temperature, including uncertainty ranges. Here, the uncertainty ranges used for the MAGICC calculations were based on existing runs of more complex carbon cycle and climate models. We have used the implications for ranges in greenhouse gas concentration and temperature outcomes to depict the uncertainty ranges here as well, as indicated by the shaded areas in this graph. For temperature, the wider shaded area indicates the uncertainty resulting from uncertainty in the carbon cycle and climate sensitivity. For the baseline scenario, global mean temperature increases almost linearly to 2.1 °C above pre-industrial levels in 2050 and to 3.7 °C in 2100 (uncertainty range 3–5 °C). In the mitigation scenario, the global mean temperature increase by 2100 is limited to 1.9 °C. Again, there is considerable uncertainty. Fig. 7 indicates that by the end of the century the mitigation case could also lead to a temperature increase of 2.6 °C compared to pre-industrial levels. As the mitigation scenario presented here is among the most stringent in the scientific literature (cf. Clarke et al., 2010; Edenhofer et al., 2010; Fisher et al., 2007), two important conclusions can be drawn. First, the analysis indicates that global warming can be moderated but not halted. Second, the observation that a stringent scenario could also lead to considerably greater climate change than 2 °C may imply that hedging adaptation policies against more warming might have considerable value. For example, such policies may be to '… aim for 2 °C, but prepare for 3 °C'. In the assessment of impacts below, we focus on the central climate change projections. Changes in mean monthly temperature and precipitation across the globe at the 0.5° × 0.5° scale, associated with the global average temperature changes, have been constructed by rescaling patterns derived from the HadCM2 climate model (Fig. 8). These patterns show that the change in annual mean temperature is larger at high latitudes than at low latitudes, and they show considerable spatial variation in the change in rainfall. Considerable disagreement exists about the expected patterns of climate change, especially for precipitation: the impact results presented in this paper therefore represent only one possible outcome.

5 Results: impacts and adaptation in the different scenarios

5.1 Introduction

IPCC's Fourth Assessment Report (IPCC, 2007) gives an overview of climate impacts. Some of these impacts result from changes in average climate, but other impacts may result from changes in extreme events. Table 1 summarizes some of the impacts, for health, agriculture, water availability, coastal flooding, urban areas and the energy system, and large-scale disruptions of the climate system (in contrast, biodiversity and ecosystem services have not been included). As noted earlier, most of the literature has treated climate change as "a gradual phenomenon" (Agrawala and Fankhauser, 2008). This is problematic for impacts characterized by low probabilities coupled with high impacts (see below). In this exploratory analysis, we sketch some of the impacts and adaptation requirements.
We aimed to cover several key impacts mentioned in Table 1, but the assessment was limited by the availability of models that could easily be coupled. Therefore, rather than intending to be exhaustive, the descriptions provide some indication of the magnitude of some impacts and key adaptation challenges. In presenting our results, we have used several new model runs based on the scenarios discussed above (e.g. for malaria, water resources, sea-level rise, and heating and cooling demand). We have, however, also assessed existing information from the IPCC Fourth Assessment Report in the context of the two scenarios presented here (temperature-related mortality, agriculture and extreme events).

5.2 Human health: temperature-related mortality and malaria

Health impacts of climate change need to be seen in the context of other, more important drivers of human health, including lifestyle-related factors (Hilderink et al., 2008). We focus here on temperature-related mortality and malaria.

5.2.1 Temperature-related mortality

Temperature-related mortality impacts may occur via changes in extreme temperatures, changes in average temperatures, or changes in the seasonal variation of temperatures, with the literature showing varying results. McMichael et al. (1996) estimated temperature-related mortality using relative risk ratios, showing that there is an optimum temperature at which the death rate is lowest (also known as the U-shaped dose–response relation). If temperature increases, heat stress-related mortality increases, but cold-related mortality decreases. Tol (2002a) concluded that in monetary terms the reduction in cold-related mortality due to climate change outweighs the increase in heat-related mortality. This conclusion is, however, influenced by the approach used to value a life, and is also subject to the large uncertainties in the relationships between average and regional temperatures, and between temperature and health. Adaptation may occur through the adjustment of human physiology to higher temperatures (McMichael et al., 1996), changes in behavior, and increased use of air conditioning (Kinney et al., 2008). Given the complexities in using dose–response relationships between temperature and mortality, we have not attempted to quantify these here.

5.2.2 Malaria

Considerable attention has been paid to the relationship between malaria and climate change. In this paper, we also focus on climate-induced changes in malaria risks. Annually, more than one million people, mostly African children, die from malaria, a vector-borne infectious disease. The anopheles mosquitoes (the vector which spreads the malaria infection) can only survive in climates with high average temperatures, no frost and sufficient precipitation. The MARA/ARMA malaria suitability model (Craig et al., 1999) incorporates these factors to determine climatically suitable areas. Mortality due to malaria is, however, also heavily influenced by factors such as access to preventative measures (including indoor spraying and insecticide-treated bed nets) and access to health care. In the MARA/ARMA model these factors are linked to income and urbanization. Fig. 9 shows the results of this model for the scenarios of this paper. The impact of autonomous adaptation (as a function of rising income) reduces malaria deaths by around 50%, especially in Africa (mainly due to better provision of health care).
In contrast, the impacts of climate – and especially the difference between the mitigation scenario and the baseline case – are much smaller. Mitigation reduces malaria health risks by about 2% (2050). Adaptation, therefore, has a much more decisive influence on malaria control than mitigation (this finding seems to be robust across the available literature).

5.3 Agriculture: impacts on yields

Easterling et al. (2007) have synthesized a large amount of research on the impacts of climate change on crop growth, with and without adaptation. The results were summarized as a function of global mean temperature increase, although in reality changes in temperature and precipitation patterns and CO2 fertilisation all play a role. For instance, the impacts of CO2 fertilisation partly offset the impact of climate change. The results can be used to assess the climate impacts for our scenarios by using the best-fit polynomials from Easterling et al. (2007), which indicate the impact on yield as a function of mean temperature change.[1]

[1] We have in each case taken the global mean temperature change for a scenario and used that as an indication of the average local temperature change to be expected. This means that our impact estimates are likely to be conservative, as the temperature increase is likely to be stronger than the global average over many land areas.
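As an illustration of how such fitted response functions are applied, the sketch below evaluates a hypothetical quadratic yield-change polynomial at the global mean temperature changes of the two scenarios. The coefficients are invented stand-ins, not the fitted values per crop and zone reported by Easterling et al. (2007).

```python
# Hypothetical quadratic response: percent yield change as a function of
# mean temperature change dT (deg C). Coefficients are illustrative only,
# NOT the best-fit values from Easterling et al. (2007).
def yield_change_pct(dT, a=-1.5, b=2.0, c=0.0):
    """Return percent yield change for warming dT, using y = a*dT^2 + b*dT + c."""
    return a * dT**2 + b * dT + c

for label, dT in [("mitigation (2 deg C)", 2.0), ("baseline (4 deg C)", 4.0)]:
    print(f"{label}: {yield_change_pct(dT):+.1f}% yield change")
```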
We looked at the impacts for the baseline (4 °C) and mitigation (2 °C) scenarios, with and without adaptation, for maize, wheat and rice (see Fig. 10; results are presented for tropical and temperate zones in 2100; these impacts are additional to the yield increases resulting from factors other than climate change). Although the results are very uncertain, some conclusions seem possible. First, the baseline scenario (no adaptation) causes a very substantial decrease in yields (relative to the situation without climate change) for all cases shown: climate change impacts may reduce yields for the aggregated regions shown by 10–35% for the crops studied (2050). Second, engaging in either mitigation or adaptation limits the decrease in yields. In the tropics, however, impacts remain negative and typically in the order of a 10% loss. Third, the combination of mitigation and adaptation may result in an improvement on today's situation. Agricultural impacts may be more positive for temperate regions, but only if the advantages of higher temperature are not offset by impacts of extreme weather. These results underline the need to look at both mitigation and adaptation. The results presented are based on the IPCC assessment and represent a wide range of models. The results can also be illustrated by individual studies. Tubiello and Fischer (2007), for instance, found that a mitigation scenario could reduce the global costs of climate change in agriculture significantly. Similarly, Fischer et al. (2007) illustrated the importance of adaptation for water irrigation requirements. They found that mitigation reduced agricultural water requirements by about 40%, leaving 60% of the impacts requiring adaptation.

When dealing with impacts on agriculture, both drought and heat wave stress play important roles. Fig. 11 shows, for Europe, the impact of drought and heat wave stress on crop yields for a 2 °C warming scenario, assuming various forms of adaptation (Mechler et al., 2010; Moriondo et al., 2010).[2] Winter and summer crop yields were simulated for spring wheat with today's and future crop management practices. Adaptation options considered comprised shifting the sowing date by a few days and using cultivars with a longer/shorter growth cycle. Results show that Southern Europe and parts of France are already particularly exposed to drought and heat stress today, and this situation is expected to worsen even under the 2 °C (mitigation) scenario (Fig. 11, panel A). When considering the two adaptation strategies in combination with mitigation (Fig. 11, panels B and C), many regions in Europe may actually benefit. Northern Europe, in particular, could exploit the advantage of higher precipitation by using crop varieties with a longer growing cycle. In contrast, in Southern Europe the same adaptation options would result in an added negative impact, since crop development would shift towards summer, when longer dry spells and heat waves may significantly affect crop growth. Also, the results show that while there are some region-specific limits to adaptation, overall adaptation would effectively reduce impacts on the agricultural sector in Europe.

[2] Calculations were done using the Cropsyst model on the basis of the HadCM3 climate model for the 2030–2060 time slice.

5.4 Water resources: potential water availability

The effects of the two scenarios on exposure to changes in water resources stress are assessed using a global-scale water resources impact model (Arnell, 2003). Fig. 12 shows the percentage change in average annual runoff by 2100 (relative to the 1961–1990 mean) under the baseline scenario and the mitigation scenario (with the HadCM2 climate model pattern). We define watersheds to be in a water-stressed condition if average annual runoff is less than 1000 m3/capita/year (other definitions are also used in the literature). The effect of climate change is indexed by summing (i) the populations living in water-stressed watersheds where runoff decreases (increases) significantly (typically by more than 5–10%) and (ii) the population living in watersheds that become water-stressed (cease to be water-stressed) due to climate change. The numbers of people exposed to an increase or a decrease in water stress due to climate change have not been summed, for two reasons: (i) the adverse effects of having less water are greater than the beneficial effects of having more water in a water-stressed catchment, and (ii) the regions with an increase and decrease in exposure to water resources stress are widely separated, and "surpluses" in one area do not offset "deficits" in another.
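The sketch below illustrates this exposure bookkeeping for a handful of invented watersheds: a watershed counts toward increased exposure if it is water-stressed and runoff falls significantly, or if climate change pushes it below the 1000 m3/capita/year threshold; the opposite trends count toward an apparent decrease in exposure. The populations, runoff values and 5% significance cut-off are illustrative.

```python
STRESS_THRESHOLD = 1000.0  # m3 per capita per year
SIGNIFICANT = 0.05         # treat runoff changes beyond 5% as significant (illustrative)

# Invented watersheds: (population in millions,
#                       per-capita runoff today, per-capita runoff with climate change)
watersheds = [
    (120, 800, 700),    # stressed, significant decrease -> increased exposure
    (60, 1100, 950),    # becomes stressed -> increased exposure
    (200, 900, 1002),   # ceases to be stressed -> apparent decrease in exposure
    (45, 700, 705),     # stressed, change not significant -> not counted
]

more_stress = less_stress = 0.0
for pop, before, after in watersheds:
    change = (after - before) / before
    stressed_before = before < STRESS_THRESHOLD
    stressed_after = after < STRESS_THRESHOLD
    if (stressed_before and change < -SIGNIFICANT) or (stressed_after and not stressed_before):
        more_stress += pop
    elif (stressed_before and change > SIGNIFICANT) or (stressed_before and not stressed_after):
        less_stress += pop

# As in the text, the two totals are reported separately, not netted out.
print(f"exposed to increased stress: {more_stress:.0f} million")
print(f"apparent decrease in exposure: {less_stress:.0f} million")
```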
The results show substantial differences in exposure to increased water resources stress in 2050, 2080 and 2100 between the mitigation and baseline scenarios. In 2020, there is little difference in runoff between the two scenarios. Fig. 13 shows the numbers of people exposed to an increase or decrease in water resources stress due to climate change under the two scenarios. In both the baseline and the mitigation scenario, the number of people living in water-stressed watersheds who apparently benefit from increased water availability is larger than the number exposed to a reduction in runoff, but – as outlined above – we do not focus on the net effect. The numbers of people exposed to changes in water resources stress are sensitive to the assumed pattern of climate change. Compared to the baseline, the mitigation scenario reduces the numbers exposed to an increase in water resources stress by 135 million (reducing impacts by 12%), 281 million (a 20% reduction) and 457 million (a 30% reduction) in 2050, 2080 and 2100, respectively. At the same time, however, there are also people benefiting from climate change. The relative size of the groups with positive and negative impacts depends on the climate model used (here only the Hadley pattern has been used). Clearly, mitigation also decreases the number of people benefiting from climate change. It is also clear that mitigation does not eliminate the water supply impacts of climate change, and adaptation will be required for the remaining billion people exposed to increased water resources stress due to climate change. Adaptation may include measures to increase water storage, transport of water, or reduction of water demand by increasing efficiency. Underlying results show that the effects of mitigation vary significantly by region. In fact, in some regions mitigation may even increase the number of people exposed to increased stress. Specific uncertainty analysis shows that results are highly dependent on the uncertainty in the changes in the precipitation pattern due to climate change.

5.5 Sea level rise

Another important impact of climate change is rising sea levels. Global mean sea-level rise has been projected for both scenarios using the MAGICC component of the IMAGE model. Due to the delayed response of sea level to global warming, the projections mainly diverge in the second part of the century: sea level rise is 35 and 31 cm in 2050 in the 4 °C and 2 °C scenarios, respectively, and 71 and 49 cm in 2100. These projections do not include a potential accelerated contribution of the ice sheets of Greenland and Antarctica, which could lead to higher sea-level rise, but the underlying processes are insufficiently understood and are currently not included in climate models (Meehl et al., 2007; Nicholls et al., 2010; Vermeer and Rahmstorf, 2009). We use the DIVA model to assess both damage and adaptation costs of sea-level rise, associated storm surges and socio-economic development under the two scenarios, taking into account coastal erosion (both direct and indirect), forced migration, coastal flooding (including rivers) and salinity intrusion into deltas and estuaries. For each scenario the model is run first without and then with adaptation, in terms of raising dikes and nourishing beaches (DINAS-COAST Consortium, 2006; Hinkel and Klein, 2009). Further impacts, such as salinity intrusion in coastal aquifers and loss of coastal wetlands and biodiversity, as well as further adaptation options, such as salinity intrusion barriers, port upgrades, set-back zones and ecosystem-based protection, could not be included due to the unavailability of global data and general models of these processes. Fig. 14 shows that, independent of the level of mitigation, adaptation reduces global overall costs rather effectively, which illustrates the necessity of engaging in adaptation even under ambitious mitigation. At the aggregated scale, more damages can be avoided through an adaptation-only strategy than through a mitigation-only strategy, although a combination of the two has the strongest positive impact. From the perspective of poorer and small island countries, however, stringent mitigation is necessary to keep risks at manageable levels. Even without sea-level rise, adaptation would be cost-effective in order to protect the assets situated in the floodplain, which increase due to socio-economic development alone.
While this would involve substantial investment flows (tens of billions of US$ worldwide), they are a relatively small fraction of global GDP, even for sea level rise at the level of the baseline scenario. However, for individual countries or regions (particularly small island states) these costs can be a much larger fraction of GDP, including the risk of a complete loss.

5.6 Heating and cooling demand (settlements and society)

Climate change is likely to influence the demand for space cooling and heating. Therefore, we have developed a set of simple relationships to describe heating and air conditioning demand in the residential sector, and explored the impacts of climate change on this simulated energy demand (Isaac and van Vuuren, 2009). Clearly, changes in population and income are projected to lead to a considerable growth in the energy demand for heating and air conditioning in the coming century (see Fig. 15, no climate change case). Driven by climate, changes in cooling and heating practices are examples of autonomous adaptation (i.e. without policy intervention). Adaptation is not universal, however, since the population will not always be able to respond. Unfulfilled demand for heating and cooling can lead to health impacts (as described in Section 5.2) and to loss of labour productivity. In addition to these effects, there is reduced comfort when indoor temperatures rise above a given level. Fig. 15 shows that, globally, the autonomous increase in energy demand due to increasing income and wealth, without taking climate change into account, is much larger than the difference between the energy demand in the baseline scenario and the mitigation scenario (Isaac and van Vuuren (2009) show this is a robust result also for other baselines). The effect of climate change on combined energy demand is also smaller than the effect on heating and air conditioning separately, since increases in air conditioning compensate for decreases in heating. At the regional and country level, impacts can be far more significant: for example, in India we project a large increase in energy demand due to increased cooling, while in Western Europe and the USA we project a substantial decrease due to reduced heating.
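A minimal sketch of the degree-day logic behind such heating and cooling demand estimates: heating demand scales with heating degree days (how far daily mean temperature falls below a base temperature) and cooling demand with cooling degree days. The 18 °C base and the temperature series are illustrative assumptions, not the calibrated TIMER values.

```python
# Illustrative degree-day calculation; 18 deg C is a commonly used base
# temperature, not necessarily the value calibrated in TIMER.
BASE = 18.0

def degree_days(daily_mean_temps, base=BASE):
    """Return (heating degree days, cooling degree days) for a list of daily means."""
    hdd = sum(max(base - t, 0.0) for t in daily_mean_temps)
    cdd = sum(max(t - base, 0.0) for t in daily_mean_temps)
    return hdd, cdd

# Invented daily means for the same week, without and with 2 deg C of warming:
# warming lowers HDD (less heating) and raises CDD (more cooling).
week = [12, 14, 16, 19, 22, 25, 17]
warmer = [t + 2 for t in week]

for label, temps in [("current climate", week), ("+2 deg C", warmer)]:
    hdd, cdd = degree_days(temps)
    print(f"{label}: HDD={hdd:.0f}, CDD={cdd:.0f}")
```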
5.7 Extreme events

Climate change is expected to lead to changes in the frequency and intensity of some weather-related extreme events (Parry et al., 2007). Extremes like floods, droughts, heat waves and storm surges could become more frequent and intense, while cold extremes, such as cold spells, are likely to become less frequent and weaker. Assessing risks of climate change based on changes in average conditions only runs the risk that changes in extreme event risks are averaged out. A more risk-based, geographically explicit method is therefore preferable. However, knowledge on disaster impacts is complex and contested. To date, there are only a limited number of national-level studies taking a probabilistic approach to projecting future risk in the presence of climate change, mostly focusing on flood risk (Mechler et al., 2010). One such study on the pan-European scale, by Feyen et al. (2009), computed that expected annual damages would triple under a baseline scenario. A key constraint on quantitative risk approaches is the uncertainty in the climate projections. For precipitation, for instance, models often disagree on the sign of changes at the local scale. This is especially important for studies looking, for instance, at flood risk. While the Mechler et al. (2010) study aimed to project future risk, the authors found future projections to be so uncertain that they refrained from projecting future flood risk based on an estimate of today's flood impacts. Current models and data, however, seem to be sufficient to assess the combined risk of drought and heat wave stress on agriculture (slower phenomena) with a relatively high level of certainty. Some examples of work in the context of the 2 °C and 4 °C scenarios are provided here. Several studies have looked into flood-affected people at the global scale (Hirabayashi and Kanae, 2009; Kundzewicz et al., 2010). Regression of samples shows that the average global number of people affected by 100-year floods per year is projected to be 211 million for the mitigation scenario (2 °C), compared to 544 million for the baseline (4 °C). Mirza et al. (2003) showed that for Bangladesh, a flood-vulnerable country, even the 2 °C scenario is expected to increase the projected flooded area by at least 23–29%. It should be noted, however, that the uncertainties about exposure, vulnerability and adaptation still lead to a wide range of estimates for the costs of future flood damage. With respect to drought, the projections made by Burke et al. (2006) show that the number of extreme drought events per 100 years and the mean drought duration are likely to increase by factors of two and six, respectively, for the baseline scenario by the 2090s. Evidence suggests that damages from weather- and climate-related impacts have already increased in the present day, but these are mainly due to wealth and population increases (Bouwer, 2010). However, climate change is expected to intensify over time, and is likely to become a more significant contributor to rising damages in the future. The most recent IPCC report indicates that the costs of major events are expected to range from several percent of annual regional GDP and income in very large regions with very strong economies, to more than 25% in smaller areas (Parry et al., 2007). Disaster losses for highly exposed small island states have in fact exceeded annual GDP in the past (Cummins and Mahul, 2009).

5.8 Economic evaluation of impacts

Cost–benefit analysis (CBA) is used to express the costs and benefits of different climate change strategies in terms of a common monetary unit. We use the CBA module of the FAIR model (see Appendix A) here to obtain some idea of impacts at a more aggregated scale. For mitigation costs, the FAIR model uses the information from the IMAGE model presented earlier. The climate damage and adaptation cost functions used in FAIR are derived from the AD-DICE model (De Bruin et al., 2009a; Hof et al., 2009a). In short, AD-DICE estimates adaptation costs based on the damage function of the DICE model (Nordhaus and Boyer, 2000). AD-DICE separates this function into an adaptation cost function and a residual damage function, based on an assessment of each impact category described in the DICE model: agriculture, coastal zones, health, settlements, non-market time use, other vulnerable markets and catastrophic impacts. For this study, we assumed an optimal adaptation response to climate change (i.e. given a level of temperature change, the model minimizes the sum of adaptation costs and residual impacts).
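The sketch below illustrates this optimization in its simplest form: for a given level of gross damages, the adaptation level is chosen to minimize the sum of adaptation costs and residual damages. The functional forms and parameters are invented stand-ins, not the calibrated AD-DICE functions.

```python
# Toy version of an "optimal adaptation response": pick the adaptation level
# a in [0, 1] that minimizes adaptation cost + residual damage. Functional
# forms and parameters are illustrative stand-ins for AD-DICE.

def total_cost(a, gross_damage, cost_coeff=2.0):
    adaptation_cost = cost_coeff * a**2          # convex: more effort costs disproportionately more
    residual_damage = gross_damage * (1.0 - a)   # adaptation offsets part of the damage
    return adaptation_cost + residual_damage

def optimal_adaptation(gross_damage, steps=1000):
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda a: total_cost(a, gross_damage))

for gross in [1.0, 3.0]:   # gross damages in % of GDP, illustrative
    a = optimal_adaptation(gross)
    print(f"gross damage {gross}% of GDP -> adaptation level {a:.2f}, "
          f"total cost {total_cost(a, gross):.2f}% of GDP")
```

As expected under these assumptions, larger gross damages justify a higher adaptation level, but some residual damage always remains.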
The impact estimates used in DICE (and thus FAIR) include: (i) real, measurable, economic costs (so-called market costs); and (ii) other, intangible losses (non-market losses), which are monetized using the willingness-to-pay concept. The damage functions are not directly related to the physical or economic damages described earlier in this section, as they are derived from a separate source. It has been shown earlier that the FAIR results for adaptation costs are consistent with the range of values reported in the literature (Hof et al., 2009a). Under default settings of the FAIR model and a discount rate of 2.5%, the discounted costs as a share of global GDP due to climate change impacts for the period 2005–2200 amount to nearly 4.5% in the baseline (Fig. 16). These costs may seem higher than suggested by the limited set of sectoral analyses presented above, but they include more sectors and also the impacts of possible catastrophic events (Nordhaus and Boyer, 2000). Annual costs rise sharply over time, reaching 17% in 2200 (note that impact estimates are very uncertain, and both higher and lower values can be found in the literature (Parry et al., 2007; Stern, 2006; Tol, 2002b)). Scenarios with only adaptation or only mitigation reduce discounted costs substantially, to around 2.5% (Fig. 16). Hof et al. (2008) have shown that the results of CBA of climate change are very sensitive to model assumptions, with the discount rate playing the most important role. The discount rate is especially important because the adaptation-only and mitigation-only scenarios have different cost profiles over time.[3] With our discount rate of 2.5%, the combination of mitigation and adaptation leads to the lowest discounted costs, namely 2% of GDP. Consistent with the literature, the adaptation investments are assessed to be smaller than mitigation investments and residual damages. However, they are very important in limiting residual damages.

[3] A discount rate of 5% leads to discounted costs of 0.8% and 1.9% for the adaptation-only scenario and mitigation-only scenario, respectively. If a discount rate of 1.4% is used (equal to the discount rate used by Stern (2006)), the discounted costs are 3.2% and 2.5% for the adaptation-only scenario and mitigation-only scenario, respectively.

Some important caveats need to be mentioned. First, calculations cannot be regarded as reliable for the extreme tails of risks (i.e. low-probability, high-impact events). As a subjective assessment of how to handle such risks is involved, Weitzman (2008) questioned the usefulness of CBA for policymakers. Secondly, the value of the discount rate used to account for time preference and risk is currently heavily debated, with arguments relating to subjective time preference and risk perception (Nordhaus, 2008; Price, 2005; Stern, 2006). As mentioned above, the value of the discount rate can have a large effect on the results. Finally, non-market impacts need subjective quantification of damages; while it is difficult to monetize these impacts in general, it is even more difficult for irreversible changes, for example a warming of the oceans leading to the loss of coral reefs (Ackerman and Heinzerling, 2004).
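Because the discount-rate sensitivity is central to these CBA results, the sketch below shows the basic mechanics: an (invented) rising stream of annual damages, expressed as a share of GDP, is converted to a discounted total under different discount rates. The damage path and GDP growth rate are toy assumptions, far simpler than the FAIR calculations; only the 17% end point and the three discount rates come from the text.

```python
# Toy discounting exercise: damages rise linearly from 0% of GDP in 2005 to
# 17% in 2200 (the end point quoted above); the path in between and the 2%
# GDP growth rate are invented for illustration.
YEARS = range(2005, 2201)

def damage_share(year):
    return 0.17 * (year - 2005) / (2200 - 2005)   # linear ramp, illustrative

def discounted_cost_share(rate, gdp_growth=0.02):
    disc_damages = disc_gdp = 0.0
    for i, year in enumerate(YEARS):
        gdp = (1 + gdp_growth) ** i               # GDP index, base year = 1
        df = 1.0 / (1 + rate) ** i                # discount factor
        disc_damages += damage_share(year) * gdp * df
        disc_gdp += gdp * df
    return disc_damages / disc_gdp

for rate in [0.014, 0.025, 0.05]:                 # Stern-like, central, high
    print(f"discount rate {rate:.1%}: discounted damages = "
          f"{discounted_cost_share(rate):.1%} of discounted GDP")
```

Even in this toy setting, a higher discount rate sharply shrinks the weight of the large late-century damages, which is the mechanism behind the sensitivity reported in footnote 3.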
5.9 Uncertainties in climate change, impacts and adaptation

There are many sources of uncertainty in projections of future climate change and its impacts. Uncertainties are associated with every step in the causal chain: emissions, climatic drivers (e.g. the carbon cycle), climate (mainly climate sensitivity and the pattern of climate change), and impacts (including adaptive capacity). As a result, different studies might give very different results for the same emission scenario. In fact, these differences are often larger than those arising in a particular model under different emission scenarios. For example, for precipitation changes at the end of the century, the multi-model ensemble mean exceeds the inter-model standard deviation only at high latitudes (Kundzewicz et al., 2007). Uncertainties in climate change projections increase with the length of the time horizon. In the near term (e.g. the 2020s), climate model uncertainties play the most important role, while over longer time horizons (e.g. the 2090s), uncertainties due to the selection of the emissions scenario become increasingly significant (Jenkins and Lowe, 2003). The impact of future climate change on extreme events is particularly uncertain. This is partly due to a mismatch between the larger spatial and temporal scale of coarse-resolution climate models and the local occurrence and short life of some weather extremes (e.g. cloudburst precipitation and flash floods). As impacts and adaptation take place at the local scale, detailed information is needed – which implies an increase in uncertainty. The large uncertainty ranges suggest that planning for adaptation should not be based on a single scenario, but that a large range of projections needs to be accounted for.

6 Conclusions

In this paper, we have discussed how scenario analysis may contribute to the assessment of mitigation and adaptation strategies. We have also presented two integrated scenarios as a starting point for analysis. The scenarios explicitly treat mitigation and adaptation action for several indicators, and cover several important linkages and feedbacks between socio-economic development and impacts (e.g. the impacts of climate change on land use and mitigation are accounted for). We specified impacts in those scenarios for a selected number of indicators, focusing mainly on mean climate changes. Based on our work, we draw the following conclusions:

• By describing two contrasting sets of possible climate change trajectories for the world, we have created the basis for a more integrated analysis of the interaction between mitigation, adaptation and climate impacts. The first scenario (no mitigation) is expected to lead to a global mean temperature increase by the end of the century of around 4 °C (for the most likely values for climate parameters, and current economic trends). This scenario has high adaptation needs, as has been shown in some of our analyses. The second scenario assumes stringent mitigation and limits global mean temperature change to 2 °C, with a probability of 50%. Even under this scenario, substantial adaptation measures will be needed.

• Integrated scenario analysis as presented here can form a good basis for exploring the different consequences of policy choices (including uncertainties); it is not feasible, given the uncertainties, to determine an optimal mix between mitigation, adaptation and residual damages. As discussed in this paper, the weighing of the consequences of climate change and the various policy responses is complicated by large differences in scale, space and time; large uncertainties; and clear differences in interest between actors (whether they are perpetrators or victims of climate change, for instance).
As a result, subjective interpretation of risks will always play an important role. Still, scenario analysis can provide a description of possible consequences and risks. At this stage, the monetary assessment of costs and benefits (Section 5.8) could not be linked to the description of physical change in the preceding sections.

• Effective climate policy includes both adaptation and mitigation. Model calculations show that mitigation scenarios can be designed that limit the increase of global mean temperature to 2 °C for a best-guess climate sensitivity. However, even these stringent scenarios can still result in a global mean temperature increase of more than 2.5 °C (and at best a temperature increase of 1.5 °C), and regional temperature changes that are far greater. The need for a combination of mitigation and adaptation has been shown for most of the impacts explored in this paper. For example, adaptation can be more effective than mitigation in dealing with sea-level rise (at least during the 21st century), but mitigation still has a role to play in reducing damages and the costs of adaptation. Agriculture presents an example where adaptation and mitigation are both clearly necessary. Crop yields are projected to suffer negative impacts in many regions due to climate change in the absence of both adaptation and mitigation action. Without stringent mitigation, adaptation could limit negative impacts, but not remove them. An advantage of mitigation is that it affects all impact categories, while adaptation needs to be tailored to impacts and contexts.

• While impacts of climate change can be severe and, depending on subjective choices, may warrant stringent climate policy, the impacts assessed in this study (given the state of the art) are likely to remain secondary to population change and economic growth as influences at the global scale. Yet important caveats apply (see below). While climate change may have an impact on millions of people, other challenges are likely to influence people and governance more significantly. It should be noted, however, that we have covered only a limited set of impacts and focused mostly on mean estimates of gradual climate change and, for instance, not on catastrophic, very high-impact, extremely low-probability events (Weitzman, 2008). Such events may in fact be so severe that the conclusion above no longer holds. If costs at the global scale remain relatively low, there is less need for global analyses to include all feedbacks on the main drivers in order to keep the storylines consistent. Clearly, at the local scale the situation is likely to be very different; impacts for individual countries can be far more substantial than at the global scale. For example, sea level rise is very important for some low-lying island states and countries, which could be significantly affected by large adaptation costs and/or damages (up to complete destruction). For agriculture, positive and negative impacts are projected to occur in different places and at different times – with low-income countries often experiencing relatively more negative impacts. Agriculture in temperate regions, where it is currently temperature-limited, could benefit. All in all, we believe it is useful to pursue further the development of integrated scenarios, specifying these further at the regional scale. While this paper presents a useful first step, it leaves many feedbacks unaccounted for.
• The overall mitigation costs in this study are estimated to be in the order of 1–2% of GDP for the 2 °C scenario. The mitigation scenario reduces the risks of climate change. There are several types of benefits of investments in mitigation. First, climate-related damages and the costs of adaptation are reduced. Second, uncertainty is also reduced, which is important given the risks involved. While we argue there can be no optimal trade-off between mitigation and adaptation at a global level, we have shown that over the longer run the costs and benefits of mitigation and adaptation are of an equivalent magnitude.

• Important foci for further analysis include the linkages between the assessment of physical changes and monetary impact analysis, variability and changes in extreme events, the potential role of large-scale disruptions, and governance. In our and other assessments, the focus has mostly been on changes in mean values, yet there is considerable concern about extreme events (resulting in natural disasters) associated with climate variability, and also about large-scale disruptions (such as the disintegration of the West Antarctic Ice Sheet), which are not accurately described by average values. Projections of changes in climate variability have been highly uncertain, which to date often hinders analyses from robustly predicting future extreme event risk. The role of different actors is another issue; some forms of adaptation require active governmental involvement, while other forms are likely to be implemented by private investors, such as the installation of space cooling systems. The differences between these two types of adaptation actors are relevant for future scenario development.

Acknowledgements

The research presented in this paper was performed as part of the EU-funded ADAM research project. An earlier version of this paper was published as part of the book "Making Climate Change Work for Us", edited by Hulme and Neufeldt and published by Cambridge University Press in 2010.

Appendix A Model descriptions

A.1 IMAGE 2.4

The IMAGE 2.4 integrated assessment model (Bouwman et al., 2006) consists of a set of linked and integrated models that together describe important elements of the long-term dynamics of global environmental change, such as air pollution, climate change, and land-use change. As part of IMAGE, the global energy model TIMER (van Vuuren et al., 2006) describes the long-term dynamics of the demand and production of primary and secondary energy and the related emissions of greenhouse gases and regional air pollutants. The model behavior is mainly determined by substitution processes between various technologies on the basis of long-term prices and fuel preferences. The agricultural model of IMAGE models the productivity of 7 crop groups and 5 animal categories (Leemans and Born, 1994). The regional production of agricultural goods is distributed spatially (at 0.5° × 0.5°) on the basis of a set of allocation rules (Alcamo et al., 1998). Both the land-use change maps and the agricultural activity data are used to model emissions from land use (change). The emissions of GHGs are used by the MAGICC model to calculate global mean temperature change (Wigley and Raper, 2001). Patterns of temperature change are obtained by making a link to climate change patterns generated by general circulation models (GCMs).

Limitations: IMAGE provides a physically oriented description of human activities (use of tons of oil, production of tons of cereals, etc.).
A fuller macro-economic description only emerges from cooperation with other models. The broad coverage of IMAGE as an integrated assessment model implies that many critical uncertainties influence the model outcomes. In this context, the use of a single baseline (as in the ADAM project) does not do full justice to the fundamental uncertainties involved.

A.2 FAIR

The climate policy model FAIR (Den Elzen et al., 2008) is used in conjunction with the IMAGE model to determine the reduction rates across different emission sources. Global climate calculations make use of the simple climate model MAGICC 4.1 (Wigley, 2003; Wigley and Raper, 2001). Required global emission reductions are derived by taking the difference between the baseline and a global emission pathway. The FAIR cost model distributes these reductions between the regions following a least-cost approach, using regional marginal abatement cost curves (MACs) for the different emission sources. Recently, the FAIR model has been extended with damage and adaptation cost curves (based on the AD-DICE model (De Bruin et al., 2009b)) and the ability to estimate macro-economic impacts on GDP growth (Hof et al., 2008). This allows the model to explore the economic impacts of combined mitigation and adaptation strategies.

Limitations: In its aim to be flexible, the FAIR model does not include a sectoral macro-economic model or an energy model. The model thus works from a partial equilibrium approach – and the wider consequences of climate policy can only be studied by forwarding the FAIR results to other (linked) models.
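To illustrate the least-cost allocation step, the sketch below spreads a required global reduction across regions with invented linear marginal abatement cost curves. A least-cost solution equalizes marginal costs across regions (a common carbon price); the slopes and reduction target here are illustrative, not FAIR's empirically derived MAC curves.

```python
# Least-cost distribution of a global reduction across regions, assuming
# linear marginal abatement cost curves MC_r(q) = slope_r * q. The slopes
# and the global target are invented; FAIR uses empirical regional MACs.

regions = {"A": 2.0, "B": 4.0, "C": 8.0}   # MAC slope per region ($/tCO2 per GtCO2)
required_reduction = 7.0                    # global reduction target (GtCO2)

# At the least-cost solution all regions face the same price p, so each
# region abates q_r = p / slope_r. Summing over regions and solving for p:
inv_slope_sum = sum(1.0 / s for s in regions.values())
price = required_reduction / inv_slope_sum

for name, slope in regions.items():
    q = price / slope
    print(f"region {name}: reduces {q:.2f} GtCO2 at marginal cost {price:.1f} $/tCO2")
print(f"common carbon price: {price:.1f} $/tCO2")
```

Regions with cheaper abatement (flatter MAC curves) take on larger shares of the reduction, which is the essence of the least-cost approach described above.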
A.3 DIVA

DIVA (Dynamic and Interactive Vulnerability Assessment) is an integrated model of coastal systems that was developed, together with its dedicated coastal database, within the EU-funded project DINAS-COAST[4] (DINAS-COAST Consortium, 2006; Hinkel and Klein, 2009). DIVA produces quantitative information on a range of ecological, social and economic coastal vulnerability indicators from sub-national to global scales, covering all coastal nations. The model consists of a number of modules developed by experts from various engineering, natural and social science disciplines. Based on climatic and socio-economic scenarios, the model assesses coastal erosion (both direct and indirect), coastal flooding (including rivers), wetland change and salinity intrusion into deltas and estuaries. DIVA also considers coastal adaptation in terms of raising dikes and nourishing beaches, and includes several predefined adaptation strategies such as no protection, full protection or optimal protection.

[4] Dynamic and Interactive Assessment of National, Regional and Global Vulnerability of Coastal Zones to Sea-Level Rise; http://www.pik-potsdam.de/dinas-coast/.

Limitations: DIVA excludes the following processes that are likely to affect coastal impacts, but cannot currently be modeled with confidence: changes in storm frequency and intensity, local distribution of GDP and population growth due to rapid coastal development and urbanization, and salinity intrusion into coastal aquifers. Further important uncertainties arise due to the coarse resolution and accuracy of elevation data.

A.4 TIMER cooling/heating energy demand

The TIMER cooling/heating energy demand model (Isaac and van Vuuren, 2009) describes the energy use for cooling and heating as a function of several factors, including population levels, changing income levels and climate. For both heating and cooling, empirical data are used to calibrate a set of system-dynamic demand functions. Climate (cooling and heating degree days) plays an important role. The model is able to account for the impacts of climate change.

Limitations: The empirical basis on which the model is calibrated is relatively poor for developing countries. The model does not contain a description of the different ways cooling and heating demand can be supplied, or of the costs involved in substituting one technology for another.

A.5 Water resources impact model

The water resources impact model (Arnell, 2003, 2004) has two components. The first simulates river runoff across the entire global land surface (at 0.5° × 0.5°) using the macro-scale hydrological model Mac-PDM, and the second determines indicators of water resources stress at the watershed level by calculating per capita water resource availability. A watershed is assumed to be exposed to water resources stress if it has an annual average runoff equivalent to less than 1000 m3/capita/year, a semi-arbitrary threshold widely used to identify water-stressed regions. Climate change leads to an increase in exposure to water resources stress if it causes runoff in a water-stressed watershed to decrease significantly, or causes the watershed to fall below the threshold. Climate change leads to an apparent reduction in exposure for the opposite trends. These changes cannot be directly compared; whilst a reduction in runoff (and an increase in exposure) is highly likely to be adverse, an increase in runoff (and an apparent decrease in exposure) may not be beneficial if the additional water cannot be stored, or if it occurs during high-flow seasons as increased flooding. The number of people living in watersheds exposed to an increase in water resources stress can be used as an indicator of exposure to climate change. The actual impacts (in terms of real water shortages) will depend on the water management structures in place.

Limitations: The hydrological model does not simulate the volume of river runoff perfectly, and in particular tends to overestimate runoff in semi-arid regions. The water resources indicator is a measure of exposure to impact, not of actual impact; it can be seen as a surrogate for the demand for adaptation.

A.6 Malaria risks

Malaria vectors, the mosquitoes spreading the infection, can only survive in suitable climates with high average temperatures, no frost and enough precipitation. The MARA/ARMA malaria suitability model (Craig et al., 1999) incorporates these climatic factors to determine climatically suitable areas. The climatic levels required for the maximum suitability of 1, and for the minimum suitability of 0, are shown in Table A.1. For indicators with levels between those required for 0 or 1 suitability, a value is calculated using a simple function (Craig et al., 1999). All these factors are calculated at the 0.5° × 0.5° grid level, making use of the output from the IMAGE model (Bouwman et al., 2006). Total climatic malaria suitability for each grid cell is determined by the lowest of these three indices.

Limitations: The MARA/ARMA model describes suitability for malaria vectors. It does not provide a process description of the spread of mosquitoes, nor does it explicitly describe how people may react to increased risk levels.
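A sketch of the suitability logic just described: each climatic factor is mapped to a 0–1 index via a simple ramp between its "unsuitable" and "fully suitable" levels, and the cell's overall suitability is the minimum of the factor indices. The threshold values below are invented placeholders, not the Table A.1 values from Craig et al. (1999).

```python
# Fuzzy suitability per climatic factor: 0 below lo, 1 above hi, and a
# linear ramp in between. Threshold values are invented placeholders; see
# Table A.1 / Craig et al. (1999) for the calibrated levels.
def ramp(value, lo, hi):
    if value <= lo:
        return 0.0
    if value >= hi:
        return 1.0
    return (value - lo) / (hi - lo)

def malaria_suitability(mean_temp, min_temp, precip):
    indices = [
        ramp(mean_temp, 18.0, 22.0),   # average temperature (deg C)
        ramp(min_temp, 4.0, 6.0),      # winter minimum: frost excludes the vector
        ramp(precip, 60.0, 80.0),      # precipitation (mm/month)
    ]
    return min(indices)                # overall suitability = weakest factor

# One illustrative grid cell: limited by temperature, not by precipitation.
print(malaria_suitability(mean_temp=20.0, min_temp=5.0, precip=90.0))  # -> 0.5
```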
Heinzerling Priceless: On Knowing the Price of Everything and the Value of Nothing 2004 The New Press New York Agrawala and Fankhauser, 2008 S. Agrawala S. Fankhauser Economic Aspects of Adaptation to Climate Change. Costs, Benefits and Policy Instruments 2008 OECD Paris Alcamo et al., 1998 J. Alcamo E. Kreileman M. Krol R. Leemans J. Bollen J.V. Minnen M. Schaeffer S. Toet B. de Vries Global modelling of environmental change: an overview of IMAGE 2.1 J. Alcamo R. Leemans E. Kreileman Global Change Scenarios of the 21st Century. Results from the IMAGE 2.1 Model 1998 Elsevier Science Ltd. Oxford 3 94 Arnell, 2003 N. Arnell Effects of IPCC SRES emissions scenarios on river runoff: a global perspective Hydrology and Earth System Sciences 7 5 2003 619 641 Arnell, 2004 N. Arnell Climate change and global water resources: SRES emissions and socio-economic scenarios Global Environmental Change 14 1 2004 31 52 Arnell et al., 2002 N.W. Arnell M.G.R. Cannell M. Hulme R.S. Kovats J.F.B. Mitchell R.J. Nicholls M.L. Parry M.J.L. Livermore A. White The consequences of CO 2 stabilisation for the impacts of climate change Climatic Change 53 4 2002 413 446 Bakkenes et al., 2006 M. Bakkenes B. Eickhout R. Alkemade Impacts of different climate stabilisation scenarios on plant species in Europe Global Environmental Change 16 1 2006 19 28 Barker, 2003 T. Barker Representing global climate change, adaptation and mitigation Global Environmental Change 13 2003 1 6 Barker et al., 2009 Barker, T., Kenber, M., Scrieciu, S., Ryan, D., 2009. Breaking the Climate Deadlock. Cutting the Cost: The Economic Benefits of Collaborative Climate Action. The Climate Group, The Office of Tony Blair, 4CMR – University of Cambridge and Cambridge Econometrics. Barker and Scrieciu, 2010 T. Barker S.S. Scrieciu Modelling low stabilisation with E3MG: towards a ‘New Economics’ approach to simulating energy-environment-economy system dynamics The Energy Journal 31 Special issue 1 2010 137 164 Barker et al., 2008 T. Barker S.S. Scrieciu T. Foxon Achieving the G8 50% target: modelling induced and accelerated technological change using the macro-econometric model E3MG Climate Policy 8 2008 S30 S45 Berkhout et al., 2002 F. Berkhout J. Hertin A. Jordan Socio-economic futures in climate change impact assessment: using scenarios as ‘learning machines’ Global Environmental Change 12 2 2002 83 95 Bouwer, 2010 L.M. Bouwer Have disaster losses increased due to anthropogenic climate change? Bulletin of the American Meteorological Society 2010 10.1175/2010BAMS3092.1 Bouwman et al., 2006 A.F. Bouwman T. Kram K. Klein Goldewijk Integrated Modelling of Global Environmental Change. An Overview of IMAGE 2.4 2006 Netherlands Environmental Assessment Agency Bilthoven 228 pp. (Publication 500110002/2006) Burke et al., 2006 E.J. Burke S.J. Brown N. Christidis Modelling the recent evolution of global drought and projections for the 21st century with the Hadley Centre climate model Journal of Hydrometeorology 7 2006 1113 1125 Clarke et al., 2010 L. Clarke J. Edmonds V. Krey R. Richels S. Rose M. Tavoni International climate policy architectures: overview of the EMF 22 international scenarios Energy Economics 31 Suppl. 2 2010 S64 S81 Copenhagen Accord, 2009 Copenhagen Accord, 2009. (Copenhagen Accord of 18 December 2009). United Nations Climate Change Conference 2009, Copenhagen. Craig et al., 1999 M.H. Craig R.W. Snow D. 
le Sueur A climate-based distribution model of malaria transmission in Africa Parasitology Today 15 3 1999 105 111 Cummins and Mahul, 2009 J.D. Cummins O. Mahul Catastrophe Risk Financing in Developing Countries: Principles for Public Intervention 2009 The World Bank Washington, DC De Bruin et al., 2009a K.C. De Bruin R.B. Dellink S. Agrawala Economic Aspects of Adaptation to Climate Change: Integrated Assessment Modelling of Adaptation Costs and Benefits 2009 OECD Paris De Bruin et al., 2009b K.C. De Bruin R.B. Dellink R.S.J. Tol AD-DICE: an implementation of adaptation in the DICE model Climatic Change 95 1–2 2009 63 81 Den Elzen et al., 2008 M.G.J. Den Elzen P.L. Lucas D.P. van Vuuren Regional abatement action and costs under allocation schemes for emission allowances for achieving low CO 2 -equivalent concentrations Climatic Change 90 3 2008 243 268 Den Elzen and van Vuuren, 2007 M.G.J. Den Elzen D.P. van Vuuren Peaking profiles for achieving long-term temperature targets with more likelihood at lower costs Proceedings of the National Academy of Sciences of the United States of America 104 46 2007 17931 17936 DINAS-COAST Consortium, 2006 DINAS-COAST Consortium, 2006. DIVA 1.5.5 CD-ROM, Potsdam Institute for Climate Impact Research, Potsdam, Germany. Easterling et al., 2007 W. Easterling P. Aggarwal P. Batima K. Brander L. Erda M. Howden A. Kirilenko J. Morton J.-F. Soussana J. Schmidhuber F.N. Tubiello Food, fibre and forest products M.L. Parry O.F. Canziani J.P. Palutikof P.J. van der Linden C.E. Hanson Climate Change 2007: Impacts, Adaptation and Vulnerability. Contribution of Working Group II to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change 2007 Cambridge University Press Cambridge, UK EC, 2006 EC World Energy Technology Outlook 2050 (WETO H2) 2006 European Commission Brussels Edenhofer et al., 2010 O. Edenhofer B. Knopf T. Barker L. Baumstark E. Bellevrat B. Chateau P. Criqui M. Isaac A. Kitous S. Kypreos M. Leimbach K. Lessmann B. Magné S. Scrieciu H. Turton D.P. van Vuuren The economics of low stabilization: model comparison of mitigation strategies and costs The Energy Journal 31 SI-1 2010 11 48 Environmental Change Institute, 2009 Environmental Change Institute, 2009. International Climate Conference – 4 Degrees and Beyond. Environmental Change Institute, Oxford University, 28–30 September, Oxford, UK. Feyen et al., 2009 L. Feyen J.I. Barredo R. Dankers Implications of global warming and urban land use change on flooding in Europe J. Feyen K. Shannon M. Neville Water and Urban Development Paradigms. Towards an Integration of Engineering, Design and Management Approaches 2009 Taylor and Francis Group London Fischer et al., 2007 G. Fischer F.N. Tubiello H. van Velthuizen D.A. Wiberg Climate change impacts on irrigation water requirements: effects of mitigation, 1990–2080 Technological Forecasting and Social Change 74 7 2007 1083 1107 Fisher et al., 2007 B. Fisher N. Nakicenovic K. Alfsen J. Corfee Morlot F. de la Chesnaye J.-C. Hourcade K. Jiang M. Kainuma E. La Rovere A. Matysek A. Rana K. Riahi R. Richels S. Rose D. van Vuuren R. Warren P. Ambrosi F. Birol D. Bouille C. Clapp B. Eickhout T. Hanaoka M.D. Mastrandrea Y. Matsuoko B. O’Neill H. Pitcher S. Rao F. Toth Issues related to mitigation in the long-term context B. Metz O. Davidson P. Bosch R. Dave L. Meyer Climate Change 2007. Mitigation of Climate Change. 
Contribution of Working Group III to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change 2007 Cambridge University Press New York 169 250 Hallegatte, 2009 S. Hallegatte Strategies to adapt to an uncertain climate change Global Environmental Change 19 2009 240 247 Hayashi et al., 2010 A. Hayashi K. Akimoto F. Sano S. Mori T. Tomoda Evaluation of global warming impacts for different levels of stabilization as a step toward determination of the long-term stabilization target Climatic Change 98 2010 87 112 Hilderink et al., 2008 H. Hilderink P.L. Lucas A. ten Hove M. Kok M. de Vos P. Janssen J. Meijer A. Faber A. Ignaciuk A. Petersen H.J.M. de Vries Towards a Global Integrated Sustainability Model 2008 Netherlands Environmental Assessment Agency Bilthoven Hinkel and Klein, 2009 J. Hinkel R.J.T. Klein Integrating knowledge to assess coastal vulnerability to sea-level rise: the development of the DIVA tool Global Environmental Change 19 3 2009 384 395 Hirabayashi and Kanae, 2009 Y. Hirabayashi S. Kanae First estimate of the future global population at risk of flooding Hydrological Research Letters 3 2009 6 9 Hof et al., 2009a A.F. Hof K.C. de Bruin R.B. Dellink M.G.J. den Elzen D.P. van Vuuren The effect of different mitigation strategies on international financing of adaptation Environmental Science and Policy 12 7 2009 832 843 Hof et al., 2009b A.F. Hof K. de Bruin R. Dellink M.G.J. den Elzen D.P. van Vuuren Costs, benefits and inter-linkages between adaptation and mitigation F. Biermann P. Pattberg F. Zelli Global Climate Governance After 2012: Architecture, Agency and Adaptation 2009 Cambridge University Press Cambridge Hof et al., 2008 A.F. Hof M.G.J. den Elzen D.P. van Vuuren Analysing the costs and benefits of climate policy: value judgements and scientific uncertainties Global Environmental Change 18 3 2008 412 424 IMAGE-team, 2001 IMAGE-team, 2001. The IMAGE 2.2 implementation of the IPCC SRES scenarios. A comprehensive analysis of emissions, climate change and impacts in the 21st century. RIVM CD-ROM publication 481508018, National Institute for Public Health and the Environment, Bilthoven, the Netherlands. IPCC, 2007 IPCC (Ed.), 2007. Climate Change 2007: Synthesis Report. Contribution of Working Groups I, II and III to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change. IPCC, Geneva, 104 pp. Isaac and van Vuuren, 2009 M. Isaac D.P. van Vuuren Modeling global residential sector energy demand for heating and air conditioning in the context of climate change Energy Policy 37 2 2009 507 521 Jenkins and Lowe, 2003 Jenkins, G., Lowe, J., 2003. Handling uncertainties in the UKCIP02 scenarios of climate change. Hadley Centre Technical Note 44, Met Office, Exeter. Kinney et al., 2008 P.L. Kinney M.S. O’Neill M.L. Bell J. Schwartz Approaches for estimating effects of climate change on heat-related deaths: challenges and opportunities Environmental Science and Policy 11 87 2008 Klein et al., 2007 R.J.T. Klein S. Huq F. Denton T.E. Downing R.G. Richels J.B. Robinson F.L. Toth Inter-relationships between Adaptation and Mitigation. Climate Change 2007. Impacts, Adaptation and Vulnerability. Contribution of Working Group II. Report of the Intergovernmental Panel on Climate Change 2007 Cambridge University Press Cambridge 745 777 Krol et al., 1997 M. Krol J. Alcamo R. Leemans Global and regional impacts of stabilizing atmospheric CO 2 Mitigation and Adaptation Strategies for Global Change 1 1997 341 361 Kundzewicz et al., 2010 Z.W. Kundzewicz Y. 
Hirabayashi S. Kanae River floods in the changing climate – observations and projections Water Resources Management 2010 10.1007/s11269-009-9571-6 Kundzewicz et al., 2007 Z.W. Kundzewicz L.J. Mata N. Arnell P. Döll P. Kabat B. Jiménez K. Miller T. Oki Z. Şen I. Shiklomanov Freshwater resources and their management M.L. Parry O.F. Canziani J.P. Palutikof C.E. Hanson P.J. van der Linden Climate Change 2007: Impacts, Adaptation and Vulnerability. Contribution of Working Group II to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change 2007 Cambridge University Press Cambridge, UK Leemans and Born, 1994 R. Leemans G.J.v.d. Born Determining the potential global distribution of natural vegetation, crops and agricultural productivity Water, Air and Soil Pollution 76 1994 133 161 Lenton et al., 2008 T.M. Lenton H. Held E. Kriegler J.W. Hall W. Lucht S. Rahmstorf H.J. Schellnhuber Tipping elements in the Earth's climate system Proceedings of the National Academy of Sciences of the United States of America 105 6 2008 1786 1793 Manne and Richels, 2005 A.S. Manne R.G. Richels Merge: an integrated assessment model for global climate change R. Loulou J.-P. Waaub G. Zaccour Energy and Environment 2005 Springer USA McMichael et al., 1996 A. McMichael A. Haines R. Slooff S. Kovats Climate Change and Human Health 1996 World Health Organization Geneva Mechler et al., 2010 R. Mechler S. Hochrainer A. Aaheim Z. Kundzewicz N. Lugeri M. Moriondo H. Salen M. Bindi I. Banaszak A. Chorynski E. Genovese H. Kalirai J. Linnerooth-Bayer C. Lavalle D. McEvoy P. Matczak M. Radziejewski D. Rübbelke M.-J. Schelhaas M. Szwed A. Wreford Risk management approach for assessing adaptation to changing flood and drought risks in Europe M. Hulme H. Neufeldt Making Climate Change Work for Us: European Perspectives on Adaptation and Mitigation Strategies 2010 Cambridge University Cambridge, UK Meehl et al., 2007 G.A. Meehl T.F. Stocker W.D. Collins P. Friedlingstein A.T. Gaye J.M. Gregory A. Kitoh R. Knutti J.M. Murphy A. Noda S.C.B. Raper I.G. Watterson A.J. Weaver Z.-C. Zhao Global climate projections S. Solomon Climate Change 2007: The Physical Science Basis. Contribution of Working Group I to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change 2007 Cambridge University Press Cambridge Metz et al., 2007 B. Metz O.R. Davidson P.R. Bosch R. Dave L.A. Meyer Climate Change. Mitigation of Climate Change. Contribution of Working Group III to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change 2007 Cambridge University Press Cambridge, United Kingdom Mirza et al., 2003 M.M.Q. Mirza R.A. Warrick N.J. Ericksen The implications of climate change on floods of the Ganges, Brahmaputra and Meghna Rivers in Bangladesh Climatic Change 57 2003 287 318 Moriondo et al., 2010 M. Moriondo M. Bindi Z.W. Kundzewicz M. Szwed A. Chorynski P. Matczak M. Radziejewski D. McEvoy A. Wreford Impact and adaptation opportunities for European agriculture in response to climatic change and variability Mitigation and Adaptation Strategies for Global Change 15 7 2010 657 679 Moss et al., 2010 R.H. Moss J.A. Edmonds K.A. Hibbard M.R. Manning S.K. Rose D.P. van Vuuren T.R. Carter S. Emori M. Kainuma T. Kram G.A. Meehl J.F.B. Mitchell N. Nakicenovic K. Riahi S.J. Smith R.J. Stouffer A.M. Thomson J.P. Weyant T.J. Wilbanks The next generation of scenarios for climate change research and assessment Nature 2010 10.1038/nature08823 Nakicenovic et al., 2000 N.
Nakicenovic Special Report on Emissions Scenarios (SRES) 2000 Cambridge University Press Cambridge, UK Nakicenovic et al., 2006 N. Nakicenovic P. Kolp K. Riahi M. Kainuma T. Hanaoka Assessment of Emissions Scenarios Revisited Environmental Economics and Policy Studies 7 3 2006 137 173 Nicholls and Lowe, 2004 R.J. Nicholls J.A. Lowe Benefits of mitigation of climate change for coastal areas Global Environmental Change 14 3 2004 229 244 Nicholls et al., 2010 R.J. Nicholls N. Marinova J.A. Lowe S. Brown P. Vellinga D. de Gusmao J. Hinkel R.S.J. Tol Sea-level rise and its possible impacts given a “beyond 4 degree world” in the 21st century Philosophical Transactions of the Royal Society 2010 10.1098/rsta.2010.029 Nordhaus and Boyer, 2000 W.D. Nordhaus J. Boyer Warming the World: Economic Models for Global Warming 2000 MIT Press Cambridge, MA pp. 315–328 Nordhaus, 2008 W.D. Nordhaus A Question of Balance Weighing the Options on Global Warming Policies 2008 Yale University Press New Haven and London Parry et al., 2007 M.L. Parry O.F. Canziani J.P. Palutikof P.J. van der Linden C.E. Hanson Climate Change 2007: Impacts, Adaptation and Vulnerability. Contribution of Working Group II to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change 2007 Cambridge University Press Cambridge Patt et al., 2010 A.G. Patt D.P. van Vuuren F. Berkhout A. Aaheim A.F. Hof M. Isaac R. Mechler Adaptation in integrated assessment modeling: Where do we stand? Climatic Change 99 3 2010 383 402 Piani et al., 2005 C. Piani D.J. Frame D.A. Stainforth M.R. Allen Constraints on climate change from a multi-thousand member ensemble of simulations Geophysical Research Letters 32 2005 L23825 Price, 2005 C. Price An intergenerational perspective on effects of environmental changes: discounting the future's viewpoint J.L. Innes G.M. Hickey H.F. Hoen Forestry and Environmental Change: Socioeconomic and Political Dimensions 2005 International Union on Forestry Research Organisations (IUFRO) Vienna Rose et al., 2007 Rose, S., Ahammad, H., Eickhout, B., Fisher, B., Kurosawa, A., Rao, S., Riahi, K., van Vuuren, D. 2007. Land in climate stabilization modeling: initial observations Energy Modeling Forum Report. Stanford University. Schneider and Kuntz-Duriseti, 2002 S.H. Schneider K. Kuntz-Duriseti Uncertainty and climate change policy S.H. Schneider A. Rosencranz O. Niles Climate Change Policy: A Survey 2002 Island Press Washington, DC Stern, 2006 N. Stern Stern Review on the Economics of Climate Change 2006 Cambridge University Press Cambridge Swart et al., 2009 R. Swart L. Bernstein M. Ha-Duong A. Petersen Agreeing to disagree: uncertainty management in assessing climate change, impacts and responses by the IPCC Climatic Change 92 2009 1 29 Swart and Raes, 2007 R. Swart F. Raes Making integration of adaptation and mitigation work: mainstreaming into sustainable development policies? Climate Policy 7 4 2007 288 303 Tol, 2002a R. Tol Estimates of the damage costs of climate change. Part II. Dynamic estimates Environmental and Resource Economics 21 2 2002 135 160 Tol, 2002b R.S.J. Tol Estimates of the damage costs of climate change. Part 1. Benchmark estimates Environmental and Resource Economics 21 1 2002 47 73 Tol, 2002c R.S.J. Tol Welfare specifications and optimal control of climate change: an application of fund Energy Economics 24 4 2002 367 376 Tubiello and Fischer, 2007 F.N. Tubiello G. 
Fischer Reducing climate change impacts on agriculture: global and regional effects of mitigation, 2000–2080 Technological Forecasting and Social Change 74 7 2007 1030 1056 UN, 2005 UN, 2005. World Population Prospects: The 2004 Revision. CD-ROM Edition – Extended Dataset. United Nations publications, Sales No. E.05.XIII.12, United Nations, Department of Economic and Social Affairs, Population Division. van Vliet et al., 2009 J. van Vliet M.G.J. den Elzen D.P. van Vuuren Meeting radiative forcing targets under delayed participation Energy Economics 31 Suppl. 2 2009 S152 S162 van Vuuren et al., 2006 D.P. van Vuuren B. van Ruijven M. Hoogwijk M. Isaac B. De Vries TIMER 2: model description and application L. Bouwman T. Kram K. Klein-Goldewijk Integrated Modelling of Global Environmental Change. An overview of IMAGE 2.4 2006 MNP - Netherlands Environmental Assessment Agency Bilthoven van Vuuren et al., 2007 D.P. van Vuuren M.G.J. Den Elzen P.L. Lucas B. Eickhout B.J. Strengers B. Van Ruijven S. Wonink R. Van Houdt Stabilizing greenhouse gas concentrations at low levels: an assessment of reduction strategies and costs Climatic Change 81 2 2007 119 159 van Vuuren et al., 2008a D.P. van Vuuren B. De Vries A. Beusen P.S.C. Heuberger Conditional probabilistic estimates of 21st century greenhouse gas emissions based on the storylines of the IPCC-SRES scenarios Global Environmental Change 18 4 2008 635 654 van Vuuren et al., 2008b D.P. van Vuuren M. Meinshausen G.K. Plattner F. Joos K.M. Strassmann S.J. Smith T.M.L. Wigley S.C.B. Raper K. Riahi F. De La Chesnaye M.G.J. Den Elzen J. Fujino K. Jiang N. Nakicenovic S. Paltsev J.M. Reilly Temperature increase of 21st century mitigation scenarios Proceedings of the National Academy of Sciences of the United States of America 105 40 2008 15258 15262 van Vuuren et al., 2009 D.P. van Vuuren M.G.J. den Elzen J. van Vliet T. Kram P. Lucas M. Isaac Comparison of different climate regimes: the impact of broadening participation Energy Policy 37 12 2009 5351 5362 van Vuuren et al., 2010 D.P. van Vuuren E. Stehfest M.G.J. den Elzen J. Van Vliet M. Isaac Exploring scenarios that keep greenhouse gas radiative forcing below 3 W/m 2 in 2100 Energy Economics 31 Special Issue 1 2010 165 192 Vermeer and Rahmstorf, 2009 M. Vermeer S. Rahmstorf Global sea level linked to global temperature Proceedings of the National Academy of Sciences of the United States of America 106 2009 21527 21532 Weitzman, 2008 Weitzman, M.L., 2008. On modeling and interpreting the economics of catastrophic climate change. Wigley, 2003 T.M.L. Wigley MAGICC/SCENGEN 4.1: Technical Manual 2003 UCAR - Climate and Global Dynamics Division Boulder, CO Wigley and Raper, 2001 T.M.L. Wigley S.C.B. 
Raper Interpretation of high projections for global-mean warming Science 293 2001 451 454|设想情景用于探讨不确定情况下不同适应和缓解战略的后果。在本文中,我们使用了两种情景来探讨发展: (1)没有缓解措施导致全球平均气温到2100年上升4摄氏度; (2)一个雄心勃勃的缓解策略导致到2100年上升2摄氏度。就第二种情况而言,气候系统的不确定性意味着不能排除全球平均气温上升3摄氏度或更多的可能性。我们的分析表明,在许多情况下,适应和减缓不是权衡,而是补充。例如,在缓解设想方案中,因气候变化而面临更大水资源压力的人数可以大幅度减少,但仍然需要对面临更大压力的其余大量人口进行适应。另一个例子是海平面上升,从全球和纯货币的角度来看,适应(直到2100年)似乎比缓解更有效。然而,从较贫穷和小岛屿国家的角度来看,严格的缓解措施对于将风险保持在可控水平是必要的。就农业而言,只有基于适应和缓解相结合的设想方案才能避免严重的气候变化影响。关键词情景综合评估气候变化缓解适应气候影响1引言情景分析是评估气候变化和气候变化政策的一个非常重要的工具,它使分析人员能够探索经济发展、温室气体排放、气候和生态系统等因素之间复杂而不确定的未来相互作用。这些因素共同决定了缓解和适应政策的必要性和可能性。设想情景还可以作为一种手段,协调参与气候研究领域的各种不同研究群体的假设,从而更好地比较其结果。因此,情景在缓解和适应研究中得到了广泛的应用(参见 Metz 等,2007; Parry 等,2007)(特别是来自排放情景特别报告(SRES)的情景(Nakicenovic 等,2000))。Moss 等人(2010)指出,由于 SRES 对场景分析的信息需求正在发生变化。首先,人们对探索适应与缓解之间的关系越来越感兴趣。正如 Moss 等人(2010)所指出的,这将需要进一步整合气候研究中涉及的不同分析传统的信息。第二,除了迄今为止探讨的无气候政策情景之外,人们对明确探讨气候政策影响的情景也越来越感兴趣。具体而言,在没有气候政策的情况下,能够评估长期气候目标的“成本”和“收益”是非常有意义的。在本文中,我们遵循这一思路,探讨情景分析如何能够促进对未来适应和缓解战略的联合评估。这样的联合评估有以下几个原因: (1)首选的缓解策略取决于预期的气候影响和适应成本; (2)考虑到适应气候变化的局限性; (3)一些适应和缓解策略可能相互作用; (4)最后,气候变化的影响可能有需要考虑的重要反馈。这种分析在战略层面上是最有用的,而不是针对个人的适应(或缓解)决策。鉴于这一目的,我们在本文中讨论了两个主要的情景,其中包括适应和减缓战略的要素(见本文的进一步内容) ,导致本世纪末全球平均气温上升4摄氏度和2摄氏度。这两个温度水平已经开始成为标志性的数字,代表着在没有减缓政策(4摄氏度)和国际气候谈判的温度目标(2摄氏度)(2009年哥本哈根协议)的情况下的潜在结果。可以说,如果政治领导人要在减缓、适应和气候影响之间做出明智的选择,了解这两个温度水平的影响是至关重要的(环境变化研究所,2009)。缓解和适应战略的综合评估由于方法上的差异而受到阻碍。考虑到当地环境的重要性,综合评估模型很难描述适应过程(Patt et al。 ,2010)。一个实际问题是,迄今为止,影响文献的相当一部分集中在非政策情景下的影响(例外包括 Arnell 等,2002; Bakkenes 等,2006; Hayashi 等,2010; Krol 等,1997; Nicholls 和 Lowe,2004)。因此,本文提出了一个基于耦合信息的广义情景评估——但没有假装是完整的或完全集成的。作为一项边做边学的活动,本文件打算说明4摄氏度和2摄氏度世界之间的重要区别,但也要确定进行综合情景分析所涉及的一些实际问题。这意味着,与现有文献相比,最重要的进步是我们提出了一个基于一致情景的多部门分析。鉴于目前综合评估模型的先进水平,已经使用几个松散耦合模型进行了试验。因此,一些重要的联系无法得到解决,如农业的适应性反应,这可能涉及灌溉(见第5.3节)和水需求(第5.4节)。事实上,本文提出的一个重要问题是,是否需要进行全面综合分析,或者部分综合是否足够。本文的内容安排如下: 我们首先讨论在开发能够为适应和缓解政策决策提供信息的设想方案时所遇到的一些方法上的复杂问题。接下来,我们讨论两种主要情景在社会经济驱动因素方面的差异(第3和第4部分)。在第5节中,我们探讨了适应和减缓战略对气候变化各种影响的潜在后果。2评估气候战略和情景发展(理论和方法)2.1应对气候变化的不同战略气候变化及其响应可能导致三种形式的成本(不一定是货币) : (1)气候影响的(剩余)成本,(2)适应的成本和(3)缓解的成本。至少在理论上,这对应于三种不同的策略: (1)“自由放任”(接受气候变化) ,(2)关注适应,(3)关注缓解,如图1所示(另见 Klein 等,2007)。虽然图1表明,缓解、适应和剩余损害的成本和收益可以相互交换,但存在一些概念和分析问题,使这种办法复杂化。这些与空间和时间尺度、风险和不确定性有关(SwartandRaes,2007)。缓解和适应是在不同空间尺度上发生的过程。虽然缓解行动通常是在国家或地方范围内采取的,但好处是全球共享的。因此,气候政策成功和成本的关键因素是国际合作的程度(Barker 等,2009; Clarke 等,2010; van Vliet 等,2009; van Vuuren 等,2009)。相比之下,对于适应而言,成本和收益在从地方到国家乃至国际的多个尺度上都存在。较大规模的扶持性环境仍然可以在较小规模上加强适应(例如,由国际融资机制资助的地方能力建设)。由于这些原因,缓解评估往往集中在全球一级,而相比之下,适应研究大多集中在地方一级。随着时间的推移,缓解和适应的动态也是一个重要因素。严格的缓解方案通常需要强有力的早期减排。然而,由于气候系统内部的巨大惯性,这些假设情景的气候变化影响在短期(前几十年)与没有气候变化政策的假设情景几乎没有差别。相比之下,一些相关的影响(例如减少当地空气污染的共同利益)可以以更快的速度实现。适应措施可能在短期内产生私人和社会效益。例如,空气调节等简单的适应措施可以带来明显的短期效益。一些重要的例外存在,可能需要几十年的实施,如空间规划的变化或大规模的工程工程防洪(见哈勒盖特,2009年)。其他重要因素是风险和不确定性。我们对气候变化的理解面临许多不确定性。要确定的关键不确定性包括认知、数据、模型和实体不确定性(施奈德和 Kuntz-Duriseti,2002; van Vuuren 等,2008a)。涉及不确定因素的例子有: (i)未来的排放量,(ii)气候系统,(iii)未来的脆弱性和对气候风险的暴露,以及(iv)缓解成本。采取缓解行动减少了一些不确定性,因为它减少了气候变化的源头,并揭示了实际的缓解成本(Barker,2003; Piani 等,2005)。然而,缓解措施也可能增加风险。例如,如果以不可持续的方式实施生物能源,可能会抵消一组风险(气候变化) ,同时产生另一组不同的风险(生物多样性丧失和粮食安全下降)。处理风险的一种方法是包括概率评估。这通常是使用过去的证据,推断以涵盖特定的未来情况。其他不确定性(例如不可知的冲击和意外)在量化意义上更难处理,但它们证明了承认无知的合理性。情景可以用来探索极端事件的可能性和各种政策组合的稳健性,但这并不常见(Berkhout et al。 ,2002)。传统上,涉及缓解研究和适应研究的学科对不确定性有不同的描述方式。虽然缓解研究往往使用定量方法并侧重于平均估计,但适应研究往往更侧重于对不确定性的定性描述,并侧重于危险事件的风险,即使这些事件发生的概率很低。这些不同的不确定性感知可能会使不同策略的综合评估复杂化(Swartet al。 ,2009)。2.2场景的类型我们可以根据缓解和适应的考虑将场景分为不同的类别。首先,我们将基线情景定义为一个事件的轨迹,假设没有来自气候变化的重大反馈,也没有关于缓解或适应的具体政策努力(这种情景可能仍然包括许多间接影响缓解或适应气候变化能力的行动; 
例如,可以预期收入水平的增加与对减少疟疾等气候相关疾病风险的卫生服务的更大投资相一致)。这种类型的场景的主要目的是进行分析,作为其他场景的参考点。其次,适应情景描述了一个社会正在应对气候变化影响的世界。其目的是探讨适应气候变化所需的技术和政策类型、避免的损害和相关费用。适应包括所谓的自主适应(即在没有特定政府行动的情况下发生的行动)和有计划的适应。第三,缓解方案描述了一个包括旨在限制气候变化的政策的世界。其目的是探讨最大限度地减少气候变化及相关成本所需的技术和政策类型。由于总是存在剩余的影响,第四组,适应和缓解情景综合了两种类型的气候变化应对措施。可能的话,这第四类情景可以根据适应和缓解备选办法之间可能存在的协同作用,例如对于一些重新造林备选办法,重新排列政策备选办法。每一种情况都与更广泛的社会、政治和文化背景有关,在这种背景下,它们被认为会出现。在探索缓解、适应和残余损害的优选组合时,存在两种主要方法: (i)将潜在影响描述为全球平均气温上升(从而缓解)的功能的影响和基于风险的方法,以及(ii)成本效益分析,其中确定货币成本和收益,以最大限度地提高福利(例如,参见 Nordhaus,2008; Tol,2002c)。在这两种情况下,我们认为,描述不同应对战略之间的关系比寻求确定最佳办法更有用,也更能反映问题。鉴于第2.1节所列出的复杂性和不确定性,我们认为在现实中不可能采取任何最佳的缓解、适应或联合策略。2.3综合分析缓解和适应的综合分析可以通过不同方式实现: 例如,使用单一的所谓综合评估模型,或在不同模型和学科之间交流信息,评估现有文献并使结果具有可比性。这两种方法都是围绕气候变化的因果链进行组织的,即描述经济活动(收入、能源使用、农业等)、排放、气候变化和影响之间的关系——以及相关的反馈(图2)。实际上,该方案也构成了 IPCC 报告情景信息流的主干(Moss et al。 ,2010)。情景首先由综合评估和排放模型制定(侧重于经济驱动力、能源和土地使用以及温室气体排放(IPCC“工作组 III”))。随后,排放轨迹在气候模型中被用来评估气候变化的影响(IPCC“工作组 I”)。最后,这些情景被用于影响、适应和脆弱性分析(IPCC“工作组 II”)。不同研究学科和工作组的参与意味着很难说明不同领域之间的反馈意见。综合评估模型只能获取有限数量的可能反馈(经常被忽略的反馈包括粮食和水安全对人口和经济驱动因素的影响; 水资源短缺与粮食生产之间的关系; 气候变化对能源使用的影响等)。如果这些反馈不足以对系统产生重大影响,忽略(其中一些)可能是合理的。出于分析原因,在学科领域内组织场景开发并考虑有限数量的反馈有很大的优势。它使研究人员能够专注于他们很好地理解的链条要素,并增加所需的细节数量,而不必面对相互联系的复杂性。然而,在更加注重对缓解和适应战略进行综合分析的情况下,这种情况可能会改变。关于为什么需要采取综合办法的一些例子是: 一、气候影响,例如极端事件引发的影响,可能非常严重,破坏了原先设想的经济假设; 二。气候影响可能对农业产生重大影响,因此,对未考虑影响的土地使用相关排放量的估计可能是错误的,生物能源的缓解潜力可能受到影响;。对于对缓解和适应都有吸引力的土地面积,可能存在相互竞争的权利主张。因此,一个有趣的问题是,是否需要更加集成的分析是如此迫切,以至于需要更加复杂的集成模式(模型的交互耦合; 一个复杂的模型) ,或者是否可以单独处理影响,简化分析框架。时间范围和决策焦点在这里可能也很重要,例如是否考虑到潜在的临界点(Lenton et al。 ,2008)。研究这个问题的少数现有研究似乎表明,在大多数部门,任何缓解项目的适应影响都很小,大多数适应活动产生的排放量也很小(Klein et al。 ,2007)。迄今为止,最完整的分析来自以成本效益为导向的综合评估模型,如基金、 DICE 和 MERGE (Manne and Richels,2005; Nordhaus,2008; Tol,2002c) ,但这些模型通常将气候影响聚合为有限数量的相当抽象的损害函数。我们认为,随着时间的推移,随着许多部门减缓和适应措施的力度不断加大,将更加需要进行足够详细的联合评估。这里提供的场景基于建模和场景开发的当前技术状态,迈出了第一步。缓解和影响评估的一项评估使用了相同的设想方案,我们明确提出了缓解和适应战略(作为设想方案的一部分或在用于不同影响的模型中)。然而,许多反馈没有被考虑进去。在本文的最后,我们回到了更加集成(但也更加复杂)的场景的作用。2.4本文件使用的方法如上所述,可以确定几种情景: 基线情景、缓解情景、适应情景和适应-缓解情景。本文还介绍了这些场景类型。对于基准/适应情景,我们假设大多数社会经济驱动因素的中间假设。情景假设在第3和第4节中描述。这些设想方案不包括缓解措施,导致到2100年全球平均气温比工业化前水平上升4摄氏度。虽然我们描述了在这些情况下可能产生的影响和适应,但我们没有包括对原始驱动因素的反馈。在缓解设想方案中,包括严格的缓解努力,导致全球平均气温上升2摄氏度。使用 IPCC 给出的3 ° C 的气候敏感性中值(Meehl 等,2007) ,这意味着稳定水平约为450 ppm CO2当量(CO2当量).气候政策对经济驱动因素的影响没有被考虑在内——但是其他几个关系是耦合的(例如土地使用)。因此,在大多数文章中,我们忽略了气候变化和气候政策对经济假设的潜在影响。然而,在第5.8节中,我们用一个简单的经济模型(FAIR)来讨论它们的影响,以提供对全球范围内经济后果的可能规模的一些见解。使用了几个模型工具。这些情景主要是使用 IMAGE 综合评估模型开发的(Bouwman 等,2006)。IMAGE 模型根据对人口和世界经济的假设,结合对技术发展和消费模式的假设,描述了21世纪能源和土地使用的发展情况。该模型预测了全球范围内的气候变化(以全球平均温度变化和海平面上升为指标) ,并通过模式缩放缩小的气候模式模式构建了0.5 ° × 0.5 ° 网格上月气温和降雨量变化的空间场景。IMAGE 的产出用于描述海平面上升的 DIVA 模型; 用于估计水资源紧张后果的 Mac-PDM 全球水文模型; 用于估计加热和降温需求影响的 TIMER 能源模型; 用于疟疾对疟疾影响的 MARA/ARMA 适宜性模型和用于货币成本效益分析的 FAIR 模型。此外,我们更一般地讨论对农业的影响(基于 IPCC AR4)和极端事件。附录 A 提供了所有使用模型的简要描述。在我们的描述中,我们关注于全局级别(考虑到空间有限)。显然,这导致了我们讨论适应的局限性。实验取决于每个模型的设计,因此,不同影响之间可以提出的假设情景的数量是不同的。这意味着研究报告应被解释为综合评估的第一个例证,而不是关于适应及其限制的全面研究。3结果: 基线情景中的社会经济趋势3.1人口发展和经济增长我们假设人口遵循2004年世界人口预测修订版(UN,2005)到2050年的中等生育率变量,以及联合国到2100年的长期中等预测(图3)。这意味着,到2050年,全球人口将稳步增至近91亿,并在随后的50年中稳定在约92亿人,直至2100年。这种情况在人口预测范围内采取中间立场(见图3)。对于直到2050年的经济增长,这种情况遵循与剑桥模型 E3MG 相关的预测(巴克和 Scrieciu,2010; 巴克等人,2008)。利用基于 SRES 的 B2情景(IMAGE-team,2001)的经济增长预测,该情景延伸到2050年以后。从数量上看,这是一个中高速经济增长的情景,主要是对中国和印度经济增长的乐观假设的结果。按人均计算,经合组织经济体预计仍将是世界上最富有的经济体,但就经济活动总量而言,发展中区域的重要性迅速增加。在非洲、中东和拉丁美洲,人均国内生产总值的年增长率在0% 到2% 之间。在亚洲,到2050年,这一比例将从目前的高水平降至每年3% 。3.2基线情景下的能源使用和温室气体排放基线情景下的能源使用与欧盟委员会(EC,2006)公布的基线保持一致。尽管能源强度进一步降低,世界能源消费在2000-2050年期间增加了一倍以上,在2050-2100年期间又增加了25% (图4)。整个世纪以来,能源供应仍然以化石燃料为主。当石油和天然气的生产在本世纪达到高峰和下降时,煤炭的使用在整个情景期间增加。此外,非化石能源的产量也在迅速增长。在截至2100年的这段时间里,核能使用量增加了2到3倍,达到76 
EJ,生物质使用量强劲增加,而水力发电产量增加了大约60% 到80% 。相对增长最大的是风能和太阳能; 到2050年,风能和太阳能在所有非化石能源中的比例将从不到1% 上升到10% 至14% 。2050年可再生能源总使用量为120-140 EJ,2100年为190 EJ。上述趋势表明,能源活动的二氧化碳排放量在2050年之前增加了一倍以上,在2050年至2100年之间又增加了三分之一(见图3)。因此,该方案在文献范围内形成了一个中间基线方案(Fisher et al。 ,2007)。非二氧化碳温室气体(尤其是甲烷)在2000-2050年期间稳步增长,但增长速度低于二氧化碳(作为其驱动因素,农业的增长速度预计将低于能源部门)。本世纪上半叶,土地利用产生的二氧化碳排放量回落至零。农业用地面积位于最近公布的类似情景的范围内,尽管处于该范围的低端(Rose et al。 ,2007)。4缓解方案和气候方案的结果4.1能源使用和温室气体排放缓解方案旨在将温室气体稳定在大约450 ppm CO2-当量。(参见 van Vuuren 等人,2007,2010)。这种情况允许初始浓度超调至大约510 ppm 二氧化碳当量。DenElzen 和 van Vuuren (2007)早些时候已经表明,有限的过度集中可以以较低的成本达到类似的气候目标。减少排放的方法有很多种。一个要素是提高能源效率,从而减少能源使用总量(2050年与基准相比减少20%)(见图4)。该设想方案还显示,非化石能源的使用日益增加,占能源使用总量增长的大部分。非化石能源的使用从2010年占一次能源使用总量的15% 增加到2050年的30% 以上,到本世纪末占总量的40% 以上。这种增长主要是由于生物能源使用的增加。碳捕获和储存应用于化石燃料的大多数固定用途。最后,还减少了非二氧化碳温室气体的排放。其结果是,全球排放量在2020年左右达到峰值,并随着时间的推移进一步减少。与2050年的基准相比,排放量减少了70% 以上,到2100年减少了80% 以上。减缓政策的后果不仅影响能源部门,而且影响土地使用。大量额外的土地用于造林和生物能源(见图5)。模型比较研究表明,这里提出的缓解方案与目前的文献是一致的,尽管模型显示各种减排措施的贡献有显着差异(Clarke 等,2010; Edenhofer 等,2010)。根据 IMAGE 模型计算,减排成本约为 GDP 的1-2% (即每年的额外支出,可与经合组织国家目前约占 GDP 1.5% 的环境政策支出相比较)(图6)。可比情景的文献范围在2100年的0.5.5% 左右。大多数研究都认为,这些额外的支出将导致国内生产总值的减少。我们将在第5.8节进一步讨论这个问题。4.2基线和缓解情景下的气候变化基于 IMAGE 模型计算,大气温室气体浓度和由这两种情景排放引起的相关平均全球温度变化如图7所示(实线表示最佳猜测值)。IMAGE 模型使用 MAGICC 模型来计算全球平均温度的变化。早期,van Vuuren 等人(2008b)将 MAGICC 模型用于类似的 IMAGE 场景,以计算温室气体浓度和温度(包括不确定范围)的轨迹。在这里,用于 MAGICC 计算的不确定性范围是基于现有的更复杂的碳循环和气候模型。我们使用了温室气体浓度范围和温度结果的含义来描述这里的不确定性范围,如图中阴影区域所示。对于温度,较宽的阴影区域表示由于碳循环和气候敏感性的不确定性而产生的不确定性。对于基线情景,全球平均气温在2050年几乎与工业化前水平线性上升至2.1摄氏度,在2100年上升至3.7摄氏度(不确定性范围为3-5摄氏度)。在缓解方案中,到2100年全球平均气温上升幅度限制在1.9摄氏度。同样,存在相当大的不确定性。图7表明,到本世纪末,与工业化前水平相比,减缓气候变化的情况也可能导致气温上升2.6摄氏度。由于这里提出的缓解方案是科学文献中最严格的方案之一(参考 Clarke 等,2010; Edenhofer 等,2010; Fisher 等,2007) ,可以得出两个重要的结论。首先,分析表明,全球变暖可以得到缓解,但不能停止。第二,严格的设想也可能导致气候变化大大超过2摄氏度这一观察结果可能意味着,对冲适应政策以防止气候变暖加剧可能具有相当大的价值。例如,这些政策可能是“ ... ... 
目标是2摄氏度,但准备3摄氏度”。在下面的影响评估中,我们关注的是中央气候变化预测。通过源自 HadCM2气候模型的重新尺度模式(图8)构建了与全球平均温度变化相关的全球0.5 ° × 0.5 ° 尺度上的月平均温度和降水量的变化。结果表明,高纬度地区年平均气温变化大于低纬度地区,降水量变化具有明显的空间变化特征。关于气候变化的预期模式,特别是降水的模式,存在着相当大的分歧: 因此,本文件提出的影响结果只代表一种可能的结果。5结果: 不同情景下的影响和适应5.1导言 IPCC 的第四次评估报告(IPCC,2007)概述了气候影响。其中一些影响来自平均气候的变化,但其他影响可能来自极端事件的变化。表1总结了一些影响,包括健康、农业、水资源供应、沿海洪水、城市地区和能源系统,以及气候系统的大规模破坏(相比之下,生物多样性和生态系统服务没有包括在内)。如前所述,大多数文献都将气候变化视为“渐进现象”(Agrawala and Fankhauser,2008)。这对于低概率和高拥有属性的影响是有问题的(见下文)。在这个探索性的分析中,我们描绘了一些影响和适应需求。我们的目标是涵盖表1中提到的几个关键影响,但是评估受到可以很容易耦合的模型的限制。因此,这些描述并不打算详尽无遗,而是在一定程度上说明了一些影响的严重程度和关键的适应挑战。在介绍我们的结果时,我们使用了几个基于上述情景的新模型运行(例如疟疾、水资源、海平面上升、加热和降温需求)。然而,我们也根据这里提出的两种情景(与温度相关的死亡率、农业和极端事件)评估了 IPCC 第四次评估报告中的现有信息。5.2人类健康: 与温度相关的死亡率和疟疾气候变化的健康影响需要在其他更重要的人类健康驱动因素的背景下看待,包括与生活方式相关的因素(Hilderink 等,2008)。我们在这里关注与温度有关的死亡率和疟疾。5.2.1与温度有关的死亡率与温度有关的死亡率影响可能通过极端温度的变化、平均温度的变化或温度的季节性变化而发生,文献显示的结果各不相同。McMichael 等(1996)使用相对风险比对温度相关死亡率进行了估计,表明存在死亡率最低的最佳温度(也称为 U 形剂量-反应关系)。如果温度升高,热应激相关的死亡率增加,但寒冷相关的死亡率下降。Tol (2002a)的结论是,从货币角度来看,由于气候变化导致的与寒冷有关的死亡率的下降超过与热有关的死亡率的增加。然而,这一结论受到用于评估生命价值的方法的影响,也受到平均和区域温度以及温度和健康之间关系的巨大不确定性的影响。适应可能发生在人体生理学适应更高的温度(麦克迈克尔等人,1996) ,行为的改变和空气调节使用的增加(Kinney 等人,2008)。考虑到使用温度和死亡率之间的剂量-反应关系的复杂性,我们还没有尝试在这里量化这些。5.2.2疟疾疟疾与气候变化之间的关系引起了人们的极大关注。在本文中,我们还重点讨论了气候引起的疟疾风险变化。每年有100多万人死于疟疾,其中大多数是非洲儿童。疟疾是一种由病媒传播的传染病。按蚊(传播疟疾感染的媒介)只能在平均温度高、没有霜冻和降水充足的气候中生存。MARA/ARMA 疟疾适宜性模型(Craig et al。 ,1999)综合了这些因素来确定气候适宜的地区。然而,疟疾造成的死亡率也受到诸如获得预防措施(包括室内喷洒和经杀虫剂处理的蚊帐)和获得医疗保健等因素的严重影响。在 MARA/ARMA 模型中,这些因素与收入和城市化有关。图9显示了这个模型在本文情景下的结果。自主适应的影响(作为收入增加的功能)减少了约50% 的疟疾死亡,特别是在非洲(主要是由于更好地提供卫生保健)。相比之下,气候的影响——特别是缓解设想方案与基准设想方案之间的差异要小得多。减缓措施将疟疾健康风险降低约2% (2050年)。因此,适应对疟疾控制的影响比缓解更具决定性(这一发现在现有文献中似乎很有说服力)。5.3农业: 对产量的影响伊斯特林等人(2007)综合了大量关于气候变化对作物生长的影响的研究,包括适应和不适应。结果总结为全球平均气温升高的函数,尽管实际上气温和降水模式的变化以及 CO2施肥都起作用。例如,二氧化碳施肥的影响部分抵消了气候变化的影响。结果可以用来评估我们的情景的气候影响使用最佳拟合多项式从伊斯特林等人(2007年) ,这表明产量的影响作为平均温度变化的函数。我们在每种情况下都采用全球平均气温变化作为一种假设,并将其作为预期的局部平均气温变化的指示。这意味着我们的影响估计可能是保守的,因为在许多陆地地区,气温上升可能比全球平均水平更强。我们研究了基线(4摄氏度)和缓解(2摄氏度)情景对玉米、小麦和水稻的影响,包括适应和不适应(见图10; 2100年热带和温带地区的结果; 这些影响是由于气候变化以外的其他因素导致的产量增加的额外影响)。虽然结果很不确定,但有些结论似乎是可能的。首先,基准情景(无适应性)导致所有情况下的产量(相对于没有气候变化的情况)大幅度下降: 气候变化影响可能会使所研究作物(2050年)的总体产量下降10-35% 。其次,无论是采取缓解还是适应措施,都会限制产量的下降。然而,在热带地区,影响仍然是负面的,通常在10% 左右的损失。第三,缓解和适应相结合可能会导致从今天的情况得到改善。对温带地区的农业影响可能更为积极,但前提是气温较高的优势不被极端天气的影响所抵消。这些结果突出表明,需要同时考虑缓解和适应问题。所提出的结果以气专委的评估为基础,代表了范围广泛的模型。结果也可以通过个别研究来说明。比如,Tubiello 和 Fischer (2007)发现,缓解方案可以显著降低全球农业气候变化的成本。同样,Fischer 等人(2007)阐述了适应水灌溉需求的重要性。他们发现,缓解措施减少了约40% 的农业用水需求,剩下60% 的影响需要适应。在处理对农业的影响时,干旱和热浪胁迫都起着重要作用。图11显示了在欧洲,假设各种形式的适应(Mechler 等,2010; Moriondo 等,2010) ,干旱和热浪胁迫对2 °C 升温情景下作物产量的影响。22在 HADCM3气候模式的基础上,利用 Cropsyst 模式对2030-2060年的时间片进行了计算。利用当前和未来的作物管理措施,模拟了春小麦的冬夏季作物产量。所考虑的适应选择包括将播种日期推迟几天和使用生长周期较长/较短的品种。结果显示,南欧和法国部分地区如今已经特别容易受到干旱和热应激的影响,即使在2摄氏度(缓解)的情况下,这种情况预计也会恶化(图11图 A)。当考虑两种适应策略与缓解措施相结合时(图11图 B 和 C) ,欧洲的许多地区实际上可能会受益。尤其是北欧,可以利用降水量较高的优势,使用生长周期较长的作物品种。相比之下,在南欧,同样的适应办法将产生额外的负面影响,因为作物发展将转向夏季,而夏季较长的干旱期和热浪可能会严重影响作物生长。此外,结果表明,虽然适应存在一些区域特有的限制,但整体适应将有效减少对欧洲农业部门的影响。5.4水资源: 潜在的水资源可利用性使用全球范围的水资源影响模型(Arnell,2003)评估了这两种情景对水资源压力变化的影响。图12显示了在基线情景和缓解情景(使用 HadCM2气候模型模式)下,到2100年(相对于1961-1990年的平均值)平均年径流量的百分比变化。如果年平均径流量小于1000 m3/人均/年,我们将流域定义为处于水资源紧张状态(文献中也使用了其他定义)。气候变化的影响是通过总结(i)生活在径流显着减少(增加)的水资源紧张流域的人口(通常超过5-10%)和(ii)生活在由于气候变化而变得水资源紧张(不再水资源紧张)的流域的人口。因气候变化而面对水资源压力上升或下降的人数没有计算在内,原因有二: (i)水资源压力下降的负面影响大于水资源压力下降的有利影响; 
(ii)面对水资源压力上升或下降的地区分布广泛,一个地区的“盈余”不能抵消另一个地区的“亏损”。结果显示,在2050年、2080年和2100年,缓解和基线假设情景之间,水资源紧张程度增加的风险暴露存在显著差异。到2020年,这两种情况的径流量几乎没有差别。图13显示了在这两种情景下,由于气候变化而面临水资源压力增加或减少的人数。在基线和缓解方案中,生活在缺水流域的人们显然从增加的可用水量中受益,这个数字大于暴露于径流减少的人数,但是,如上所述,我们并不关注净效应。面临水资源压力变化的人数对假定的气候变化模式很敏感。与基线相比,缓解方案在2050年、2080年和2100年分别减少了1.35亿(减少12% 的影响)、2.81亿(减少20%)和4.57亿(减少30%)面临水资源压力增加的人口。然而,与此同时,也有人从气候变化中受益。具有积极和消极影响的群体的相对规模取决于所使用的气候模型(这里只使用了哈德利模式)。显然,减缓气候变化也减少了从气候变化中受益的人数。同样显而易见的是,缓解并不能消除气候变化对供水的影响,因此需要对气候变化而面临更大水资源压力的其余10亿人进行适应。适应措施可以包括增加水的储存、水的运输或通过提高效率来减少水的需求。基本结果表明,缓解效果因地区而异。事实上,在一些地区,缓解甚至可能增加暴露于压力增加的人数。具体的不确定性分析表明,结果高度依赖于气候变化引起的降水模式变化的不确定性。5.5海平面上升气候变化的另一个重要影响是海平面上升。利用 IMAGE 模型的 MAGICC 组成部分,预测了这两种情况下的全球平均海平面上升。由于海平面对全球变暖的反应迟缓,预测主要在本世纪下半叶出现分歧: 在4摄氏度和2摄氏度的情况下,2050年海平面上升分别为35厘米和31厘米,在2100年分别为71厘米和49厘米。这些预测不包括格陵兰岛和南极洲冰盖的潜在加速贡献,这可能导致更高的海平面上升,但潜在的过程没有得到充分的理解,目前不包括在气候模型中(Meehl 等,2007; Nicholls 等,2010; Vermeer 和 Rahmstorf,2009)。我们使用 DIVA 模型来评估海平面上升、相关风暴潮和社会经济发展在这两种情况下的损害和适应成本,同时考虑到海岸侵蚀(直接和间接)、强迫迁移、沿海洪水(包括河流)以及盐度入侵三角洲和河口。对于每种情况,模型首先在没有堤坝的情况下运行,然后根据提高堤坝和滋养海滩的情况进行适应(DINAS-COAST Consortium,2006; Hinkel and Klein,2009)。由于缺乏全球数据和这些过程的一般模型,无法列入诸如沿海含水层盐度入侵、沿海湿地和生物多样性丧失等进一步影响,以及诸如盐度入侵屏障、港口升级、倒退区和基于生态系统的保护等进一步适应办法。图14显示,与缓解水平无关的是,适应相当有效地降低了全球总体成本,这说明即使在目标远大的缓解情况下也必须进行适应。在总体规模上,仅通过适应战略比仅通过缓解战略可以避免更多的损害,尽管两者结合起来产生的积极影响最大。然而,从较贫穷和小岛屿国家的角度来看,严格的缓解措施对于将风险保持在可控水平是必要的。即使没有海平面上升,为了保护仅由于社会经济发展而增加的泛滥平原资产,适应措施也是具有成本效益的。虽然这将涉及大量的投资流动(全球数百亿美元) ,但它们在全球 GDP 中所占的比例相对较小,即使对于基线情景下的海平面上升而言也是如此。然而,对于个别国家或地区(特别是小岛屿国家)来说,这些成本可能占 GDP 的比例要大得多,包括完全丧失的风险。5.6供热和供冷需求(居住区和社会)气候变化可能会影响空间供冷和供热的需求。因此,我们建立了一套简单的关系来描述住宅部门的供暖和空气调节需求,并探索了气候变化对这种模拟能源需求的影响(Isaac and van Vuuren,2009)。显然,人口和收入的变化预计将导致下个世纪取暖和空气调节的能源需求大幅增长(见图15,没有气候变化的例子)。在气候的驱动下,制冷和制热实践的变化是自主适应的例子(即没有政策干预)。然而,适应并不是普遍的,因为人们并不总是能够做出反应。供暖和制冷需求得不到满足可能导致健康影响(如第5.2节所述)和劳动生产率的损失。除了这些影响,当室内温度超过一定水平时,舒适度也会降低。图15显示,在全球范围内,由于收入和财富的增加而不考虑气候变化的能源需求的自主增长远远大于基线情景和缓解情景中的能源需求之间的差异(Isaac 和 van Vuuren (2009)显示,这对其他基线也是一个强有力的结果)。气候变化对综合能源需求的影响也小于单独对供暖和空气调节的影响,因为空气调节的增加弥补了供暖的减少。在区域和国家层面,影响可能更为显著: 例如,在印度,我们预计由于冷却增加,能源需求将大幅增加,而在西欧和美国,我们预计由于供暖减少,能源需求将大幅减少。5.7极端事件气候变化预计将导致一些与天气有关的极端事件的频率和强度发生变化(Parry et al。 ,2007)。像洪水、干旱、热浪和风暴潮这样的极端天气可能会变得更加频繁和强烈,而寒冷的极端天气,如寒潮,可能会变得不那么频繁和弱化。根据平均条件的变化来评估气候变化的风险——只有极端事件风险的变化被平均化的风险。因此,一种更基于风险、更明确地理位置的方法更可取。然而,关于灾害影响的知识是复杂和有争议的。迄今为止,只有有限的国家级研究采用概率方法预测气候变化存在的未来风险,主要集中在洪水风险(Mechler et al。 ,2010)。Feyen 等人(2009)在泛欧范围内进行的一项研究计算出,在基线情景下,预期的年度损失将增加三倍。定量风险方法的一个关键制约因素是气候预测的不确定性。对于降水,例如,模型往往不同意在局部尺度的变化迹象。这对于寻找例如洪水风险的研究尤其重要。虽然 Mechler 等人(2010)的研究旨在预测未来的风险,但他们发现未来的预测是如此的不确定,以至于作者没有根据对当今洪水影响的估计来预测未来的洪水风险。然而,目前的模型和数据似乎足以以相对较高的确定性(较慢的现象)评估干旱和热浪对农业造成的综合风险。这里提供了在2 °C 和4 °C 情景下的一些工作实例。一些研究调查了全球范围内受洪水影响的人们(Hirabayashi 和 Kanae,2009; Kundzewicz 等,2010)。样本回归分析显示,在缓解方案(2摄氏度)中,全球每年受100年洪灾影响的人口平均为2.11亿人,而基线(4摄氏度)为5.44亿人。Mirza 等人(2003)指出,对于孟加拉国这样一个易受水灾影响的国家来说,即使是2摄氏度的情况,预计也会使预计的洪水泛滥面积至少增加23-29% 。然而,应当指出的是,由于暴露、脆弱性和适应方面的不确定性,对未来洪灾损失的估计范围仍然很广。关于干旱,Burke 等人(2006)对2090年代的预测表明,对于2090年代的基线情景,每100年的极端干旱事件数量和平均干旱持续时间可能分别增加2倍和6倍。有证据表明,天气和气候相关影响造成的损害在当今已经增加,但这主要是由于财富和人口的增加(Bouwer,2010)。然而,气候变化预计将随着时间的推移而增加,并可能成为未来损害增加的一个更重要的因素。IPCC 最近的报告指出,重大事件的成本预计从占地区年 GDP 和收入的几个百分点到经济强劲的大地区的25% 不等(Parry et al。 ,2007)。过去受灾严重的小岛屿国家的灾害损失实际上已经超过了年度 GDP (Cummins and Mahul,2009)。5.8影响的经济评估成本-收益分析(CBA)是用一个共同的货币单位来表示不同战略的气候变化的成本和收益。我们在这里使用 FAIR 模型的 CBA 模块(参见模型附录 A)来获得一些关于更加聚合规模的影响的概念。对于缓解成本,FAIR 模型使用了前面介绍的 IMAGE 模型的信息。FAIR 中使用的气候损害和适应成本函数是从 AD-DICE 模型中推导出来的(De Bbu 等,2009a; Hof 等,2009a)。简而言之,AD-DICE 根据 DICE 模型的损伤函数估计适应成本(Nordhaus and Boyer,2000)。AD-DICE 将这些功能分为损害成本函数和残余损害函数,基于 DICE 
模型中描述的每个影响类别的评估-农业,沿海地区,健康,住区,非市场时间使用,其他脆弱市场和灾难性影响。在这项研究中,我们假设了对气候变化的最佳适应响应(即给定一个温度变化水平,模型将适应成本和剩余影响的总和最小化)。DICE (因此 FAIR)中使用的影响估计包括: (i)实际的,可测量的经济成本(所谓的市场成本) ; 和(ii)其他的,无形的损失(非市场损失) ,使用支付意愿概念货币化。损害函数与本节前面所述的物理或经济损害没有直接关系,因为它们来自单独的来源。早先已经表明,适应成本的 FAIR 结果与文献中报道的值范围一致(Hof et al。 ,2009a)。在 FAIR 模型的默认设置和2.5% 的贴现率下,2005-2200年期间气候变化影响造成的贴现成本占全球 GDP 的比例在基线水平上接近4.5% (图16)。这些成本可能看起来高于上面提到的有限的部门分析,但是包括更多的部门和可能的灾难性事件的影响(Nordhaus and Boyer,2000)。年度成本随着时间的推移急剧上升,到2200年达到17% (注意,影响估计非常不确定,文献中可以找到更高和更低的值(Parry 等,2007; Stern,2006; Tol,2002b))。只有适应或缓解的情况下,折扣成本大幅降低到2.5% 左右(图16)。Hof 等人(2008)的研究表明,气候变化的 CBA 结果对模型假设非常敏感,其中折现率起着最重要的作用。贴现率特别重要,因为随着时间的推移,与仅适应和仅缓解设想相关的成本函数不同。3.5% 的贴现率导致仅适应情景和仅缓解情景的贴现成本分别为0.8% 和1.9% 。如果使用1.4% 的贴现率(相当于 Stern (2006)使用的贴现率) ,仅适应情景和仅缓解情景的贴现成本分别为3.2% 和2.5% 。在贴现率为2.5% 的情况下,缓解与适应相结合,贴现成本最低,即占 GDP 的2% 。与文献资料一致,适应性投资被评估为小于缓解投资和剩余损害。然而,它们在限制残余损伤方面是非常重要的。需要提及一些重要的警告。首先,对于风险的极端尾部(即低概率、高影响事件) ,计算不能被视为可靠。作为对如何处理这些风险的主观评估,Weitzman (2008)质疑 CBA 对决策者的有用性。其次,考虑到时间偏好和风险的贴现率的价值目前正在激烈争论,争论涉及主观时间偏好和风险感知(Nordhaus,2008; Price,2005; Stern,2006)。如上所述,贴现率的价值可以对结果产生很大的影响。最后,非市场影响需要对损害进行主观量化; 虽然这些影响很难货币化,但一般来说,不可逆转的变化更加困难,例如导致珊瑚礁丧失的海洋变暖(Ackerman and Heinzerling,2004)。5.9气候变化、影响和适应方面的不确定性对未来气候变化及其影响的预测有许多不确定性的来源。不确定性与因果链中的每一步都有关联: 排放、气候驱动因素(如碳循环)、气候(主要是气候敏感性和气候变化模式)以及影响(包括适应能力)。因此,对于同样的排放情景,不同的研究可能会给出非常不同的结果。事实上,这些差异往往大于不同排放情景下某一特定模型产生的差异。例如,对于本世纪末的降水变化,多模式集合平均值仅在高纬度地区超过模式间的标准差(Kundzewicz et al。 ,2007)。气候变化预测的不确定性随着时间的推移而增加。在近期(例如2020年代) ,气候模型的不确定性起着最重要的作用; 而在较长的时期(例如2090年代) ,由于排放情景的选择而产生的不确定性变得越来越重要(Jenkins and Lowe,2003)。未来气候变化对极端事件的影响尤其不确定。这部分是由于粗分辨率气候模型的较大空间和时间尺度与某些极端天气(如暴雨降水和山洪)的局部发生和短期生命之间的不匹配。由于影响和适应是在当地范围内发生的,因此需要详细的信息——这意味着不确定性的增加。较大的不确定性范围表明,适应规划不应基于单一的假设情景,而是需要考虑到大范围的预测。6结论在本文中,我们讨论了情景分析如何有助于评估缓解和适应战略。我们还提出了两个集成的场景作为分析的起点。这些设想方案明确地将缓解和适应行动纳入了几个指标,并涵盖了社会经济发展与影响之间的几个重要联系和反馈(例如,考虑了气候变化对土地利用和缓解的影响)。我们为选定的一些指标确定了这些设想方案的影响,主要侧重于平均气候变化。基于我们的工作,我们得出以下结论: •通过描述两套对比鲜明的世界可能的气候变化轨迹,我们为对减缓、适应和气候影响之间的相互作用进行更加综合的分析奠定了基础。第一种情况(不采取缓解措施)预计将导致全球平均气温在本世纪末上升4摄氏度左右(气候参数和当前经济趋势的最有可能值)。正如我们的一些分析所显示的,这种情况有很高的适应需求。第二种设想假设有严格的缓解措施,并将全球平均气温变化限制在2 °C,概率为50% 。即使在这种情况下,也需要大量的适应措施。•这里提出的综合情景分析可以为探讨政策选择的不同后果(包括不确定性)奠定良好基础; 鉴于不确定性,确定缓解、适应和剩余损害之间的最佳组合是不可行的。正如本文所讨论的那样,衡量气候变化的后果和各种政策回应的复杂性在于规模、空间和时间上的巨大差异; 巨大的不确定性; 以及行为者之间利益的明显差异(例如,他们是气候变化的肇事者还是受害者)。因此,对风险的主观解释将始终发挥重要作用。尽管如此,情景分析可以提供对可能结果和风险的描述。在这个阶段,成本和收益的货币评估(第5.8节)不能与前面章节中对物理变化的描述联系起来。有效的气候政策包括适应和减缓。模型计算表明,可以设计缓解设想方案,使全球平均气温上升2摄氏度,从而达到对气候敏感性的最佳猜测。然而,即使是这些严格的情况也可能导致全球平均气温上升超过2.5摄氏度(最多上升1.5摄氏度)和区域气温变化更大。本文件探讨的大多数影响都表明需要将缓解和适应结合起来。例如,在应对海平面上升(至少在21世纪)方面,适应措施可能比缓解措施更有效,但缓解措施在减少损害和降低适应成本方面仍然可以发挥作用。农业提供了一个明显需要适应和缓解的例子。如果不采取适应和减缓行动,预计许多区域的农作物产量将因气候变化而受到不利影响。如果没有严格的缓解措施,适应只能限制负面影响,而不能消除它们。缓解的一个好处是,它影响到所有影响类别,而适应需要根据影响和环境进行调整。•尽管气候变化的影响可能很严重,而且根据主观选择,可能需要制定严格的气候政策,但本研究评估的影响(鉴于目前的技术水平)可能仍然是全球范围内人口变化和经济增长的次要影响。然而,需要注意的是(见下文)。虽然气候变化可能对数百万人产生影响,但其他挑战可能对人民和治理产生更大的影响。然而,应该指出的是,我们只涉及了有限的一组影响,并且主要集中在对逐渐变化的气候的平均估计上,例如,没有涉及灾难性的、影响非常大的、极低概率的事件(Weitzman,2008)。这些事件实际上可能非常严重,以至于上述结论不再成立。如果全球范围的成本仍然相对较低,就不太需要进行全球分析,以便根据故事情节的一致性纳入对主要驱动因素的所有反馈。显然,在地方一级,情况可能大不相同; 对个别国家的影响可能远远大于全球一级的影响。例如,海平面上升对一些低洼岛国和国家非常重要,这些国家可能会受到巨大的适应成本和/或损害(直至完全毁灭)的显著影响。对农业而言,预计积极和消极影响将在不同地方和不同时间发生,低收入国家往往受到相对较为负面的影响。目前温度受到限制的温带地区的农业可能会受益。总之,我们认为,进一步制定在区域范围内进一步具体说明这些情况的综合设想是有益的。虽然本文提出了一个有用的第一步,它也留下了许多反馈意见仍然没有说明。本研究中的总体缓解成本估计为2 °C 情景下国内生产总值的1-2% 左右。缓解方案降低了气候变化的风险。在缓解方面的投资有几种类型的好处。首先,与气候相关的损害和适应成本得到降低。其次,不确定性也会减少,考虑到所涉及的风险,这一点很重要。虽然我们认为,在全球一级,缓解和适应之间不可能存在最佳的平衡,但我们已经表明,从长远来看,缓解和适应的成本和收益是相当的。进一步分析的重点包括评估实际变化与货币影响分析之间的联系、极端事件的可变性和变化、大规模干扰和治理的潜在作用。在我们和其他评估中,主要关注的是平均值的变化,然而,人们对与气候变化有关的极端事件(导致自然灾害) 
,以及大规模的破坏(如西南极冰盾的解体)有相当大的关注,这些破坏并没有被平均值准确地描述。对气候变异性变化的预测高度不确定,迄今为止常常妨碍分析有力地预测未来的极端事件风险。不同行为者的作用是另一个问题; 某些形式的适应需要政府的积极参与; 其他形式的适应可能由私人投资者实施,例如安装空间冷却系统。这两个适应主体之间的差异与未来的情景发展有关。本文中提出的研究是作为欧盟资助的 ADAM 研究项目的一部分进行的。这篇论文的早期版本是作为《让气候变化为我们服务》一书的一部分出版的,该书由休姆和纽菲尔德编辑,剑桥大学出版社于2010年出版。附录 A.模型描述 A.1 IMAGE 2.4 IMAGE 2.4综合评估模型(Bouwman et al。 ,2006)由一系列相互关联的综合模型组成,这些模型共同描述了全球环境变化长期动态的重要因素,如空气污染、气候变化和土地使用变化。作为 IMAGE 的一部分,全球能源模型 TIMER (van Vuuren et al。 ,2006)描述了一次能源和二次能源需求和生产的长期动态以及温室气体和区域空气污染物的相关排放。模型行为主要是由各种技术的替代过程决定的,基于长期价格和燃料偏好。IMAGE 的农业模型模拟了7个作物类别和5个动物类别的生产力(Leemans 和 Born,1994)。根据一套分配规则,农产品的区域生产在空间上(0.5 ° × 0.5 °)分布(Alcamo et al。土地利用变化图和农业活动数据都被用来模拟土地利用(变化)产生的排放。MAGICC 模型利用温室气体排放量来计算全球平均气温变化(Wigley and Raper,2001)。温度变化模式是通过与大气环流模式(GCM)产生的气候变化模式联系而获得的。局限性: IMAGE 提供了对人类活动的物理描述(使用成吨的石油,生产成吨的谷物等)。更全面的宏观经济描述只能通过与其他模型的合作得到。IMAGE 作为综合评估模型的广泛覆盖范围意味着许多关键的不确定性影响模型的结果。在这种情况下,使用单一的基线(如在 ADAM 项目中)并不能完全满足所涉及的基本不确定性。A. 2 FAIR 气候政策模型 FAIR (Den Elzen 等,2008)与 IMAGE 模型一起用于确定不同排放源的减排率。全球气候计算利用了简单的气候模型 MAGICC 4.1(Wigley,2003; Wigley and Raper,2001)。所要求的全球减排量是通过计算基准和全球排放路径之间的差额得出的。FAIR 成本模型使用不同排放源的区域边际减排成本曲线(MAC) ,采用最小成本方法,在各区域之间进行分配。最近,FAIR 模型已经扩展到损害和适应成本曲线(基于 AD-DICE 模型(De Bbu 等,2009b)和估计宏观经济对 GDP 增长的影响的能力(Hof 等,2008))。这使模型能够探讨减缓和适应综合战略的经济影响。限制: 为了灵活起见,公平竞争模式不包括部门宏观经济模式或能源模式。因此,该模型从一个局部均衡的方法工作-和更深层次的后果气候政策只能通过转发公平的结果到其他(相关)模型研究。A.3 DIVA DIVA (动态和互动脆弱性评估)是在欧盟资助的项目 DINAS-COAST 44中开发的一个沿海系统的综合模型,连同其适当的沿海数据库,对沿海地区对海平面上升的国家、区域和全球脆弱性进行动态和互动评估; http://www.pik-potsdam.de/DINAS-COAST/。(DINAS-COAST Consortium,2006; Hinkel and Klein,2009).《综合发展战略》提供了一系列生态、社会和经济沿海脆弱性指标的定量信息,从国家以下各级到全球各级,涵盖所有沿海国家。该模型由来自各种工程、自然和社会科学学科的专家开发的若干模块组成。根据气候和社会经济情景,该模型评估了海岸侵蚀(直接和间接)、海岸洪水(包括河流)、湿地变化和盐度入侵三角洲和河口。家庭影响评估还从提高堤坝和滋养海滩的角度考虑沿海适应问题,并包括一些预先确定的适应战略,例如不予保护、充分保护或最佳保护。限制因素: 《综合可持续发展战略》排除了可能影响沿海影响的下列进程,但目前无法有把握地建立模型: 风暴频率和强度的变化、沿海快速发展和城市化导致的国内生产总值的地方分布和人口增长,以及盐度侵入沿海含水层。由于高程数据的粗分辨率和精度,还会产生更多的重要不确定性。A. 4 TIMER ——冷却/加热能源需求 TIMER 冷却/加热能源需求模型(Isaac and van Vuuren,2009)描述了冷却和加热能源的使用是几个因素的函数,包括人口水平、不断变化的收入水平和气候。对于加热和冷却,经验数据被用来校准一组系统-动态需求函数。气候(降温和升温度日)起着重要作用。该模型能够解释气候变化的影响。局限性: 对发展中国家而言,校准模型的经验基础相对较差。该模型没有描述可以提供冷却和加热需求的不同方式以及用一种技术替代另一种技术所涉及的成本。水资源影响模型水资源影响模型(Arnell,2003,2004)有两个组成部分。第一阶段采用宏观尺度水文模型 Mac-PDM 模拟全球地表(0.5 ° × 0.5 °)的径流,第二阶段通过计算人均水资源可利用率确定流域水资源压力指标。如果一个流域的年平均径流量低于每人每年1000立方米,则假定该流域面临水资源压力,这是一个半任意的阈值,广泛用于确定水资源压力区域。如果气候变化导致缺水流域的径流量显著减少,或者导致流域降低到阈值以下,那么气候变化将导致水资源压力暴露的增加。气候变化导致对相反趋势的暴露明显减少。这些变化不能直接比较; 虽然径流量的减少(和暴露量的增加)极有可能是不利的,但如果额外的水不能储存,或者在洪水增加的高流量季节发生,则径流量的增加(和暴露量的明显减少)可能不是有益的。生活在水资源压力增加的流域的人口数量可以作为暴露于气候变化的一个指标。实际影响(就真正的水资源短缺而言)将取决于现有的水资源管理结构。局限性: 水文模型不能完全模拟河流径流量,特别是在半干旱地区倾向于高估径流量。水资源指标是衡量受影响程度的指标,而不是实际影响; 它可以被视为适应需求的替代指标。答.6疟疾的危险疟疾媒介,蚊子传播感染,只能生存在适当的气候与高平均温度,没有霜冻和足够的降水。MARA/ARMA 疟疾适宜性模型(Craig et al。 ,1999)综合了这些气候因素来确定气候适宜区域。最大适合度为1和最小适合度为0所需的气候水平见表 A.1。对于水平介于0或1适合性所需水平之间的指标,使用简单函数计算水平(Craig et al。 ,1999)。利用 IMAGE 模型的输出结果,所有这些因子都是在半乘以半度的网格水平上计算的(Bouwman et al。 ,2006)。每个栅格细胞的总气候疟疾适应性是由这三个指数中的最低值决定的。局限性: MARA/ARMA 模型描述了对疟疾病媒的适用性。它没有提供蚊子传播的过程说明,也没有明确说明人们可能对增加的风险水平作出何种反应。参考文献 Ackerman and Heinzerling,2004 F。 Ackerman L。 Heinzerling 无价之宝: 关于了解一切事物的价格和一无所有的价值2004 The New Press 纽约 Agrawala and Fankhauser,2008 S。 Agrawala S。 Fankhauser 适应气候变化的经济方面。成本、收益和政策工具2008经合组织巴黎 Alcamo 等人,1998年 J。 Alcamo E。 Krol R。 Leemans J。 Bolen J。 Minnen M。 Schaeffer S。 Toet B。 Vries 全球环境变化模型: IMAGE 2.1 J。IMAGE 2.1型号1998爱思唯尔科技有限公司测试结果。Arnell,2004 N.Arnell 气候变化与全球水资源: 气候变化与社会经济情景全球环境变化14120043152 Arnell 等,2002 N.W。Arnell M.G.R.Cannell M. Hulme R.S.Kovats J.F.B.Mitchell R.J.Nicholls M.L.Parry M.J.L.利弗莫尔 · A · 怀特二氧化碳稳定化对气候变化影响的后果气候变化5342002413446 Bakkenes 等,2006 M。 Bakkenes B. Eickhout R. 
Alkemade 不同气候稳定化情景对欧洲植物物种的影响全球环境变化16120061928,2003年代表全球气候变化、适应和减缓全球环境变化132003年16巴克等人,2009年巴克,t。 ,肯伯,M。 ,Scrieciu,S。打破气候僵局。降低成本: 合作气候行动的经济效益。气候小组,托尼 · 布莱尔办公室,4CMR-剑桥大学和 Cambridge Econometrics。Barker and Scrieciu,2010 T Barker SS.Scrieciu 用 E3MG 模拟低稳定性: 走向模拟能源-环境-经济系统动力学的“新经济学”方法能源期刊31特刊12010137164 Barker 等,2008 T。Scrieciu T. Foxon 实现八国集团50% 的目标: 使用宏观经济计量模型 E3MG 气候政策82008 S30 S45 Berkhout 等,2002 F. Berkhout J. Hertin A. Jordan 气候变化影响评估中的社会经济未来: 使用情景作为“学习机器”全球环境变化12220028395 Bouwer,2010 L.M。人为气候变化造成的灾害损失增加了吗?2010年美国气象学会简报10.1175/2010BAMS 3092.1 Bouwman 等人,2006年 A.F. Bouwman T. Kram K. Klein Goldewijk 全球环境变化综合模拟。2006年荷兰环境评估机构 Bilthoven 228页。(出版物500110002/2006) Burke 等人,2006 E.J。 Burke S.J。 Brown N. Christidis 利用哈德利中心气候模型对21世纪全球干旱的近期演变和预测进行建模。《水文气象学杂志》2006年7月1113日1125 Clarke 等人,2010年 L。 Clarke J. Edmonds V. Krey R. Richels S. Rose M. Tavoni 国际气候政策架构: EMF 22国际情景的概述能源经济学31号补编。2010年 S64/s81哥本哈根协议,2009年哥本哈根协议,2009年。(2009年12月18日哥本哈根协议)。2009年联合国气候变化会议,哥本哈根。Craig 等人,1999 M.H。Craig R.W.基于气候的疟疾在非洲传播的分布模型寄生虫学今天1531999105111康明斯和马胡尔,2009 J.D。发展中国家巨灾风险融资: 公共干预原则,2009年世界银行华盛顿,DC De Bbu 等,2009a K.C。德布鲁姆 R.B。Dellink S. Agrawala 适应气候变化的经济方面: 适应成本和效益综合评估模型,2009年经合组织巴黎德布鲁恩等,2009 b。德布鲁姆 R.B。Dellink R.S.J.Tol AD-DICE: DICE 模型气候变化951-220096381 Den Elzen 等,2008 M.G。J.登伊尔森私人侦探社。Lucas D.P.为实现低二氧化碳当量浓度而采取的区域减排行动和分配办法下的排放限额费用气候变化9032008243268 Den Elzen 和 van Vuuren,2007 M.G。J.Den Elzen D.P.104-46/2007/17931/17936 DINAS-COAST 财团,2006年 DINAS-COAST 财团,2006年,更有可能以更低的成本实现长期温度目标的范维伦峰值美国国家科学院院刊。DIVA 1.5.5光盘,德国波茨坦气候影响研究所。伊斯特林等,2007 W。伊斯特林 P。阿加瓦尔 P。巴蒂玛 K。布兰德 L。埃尔达 M。霍登 A。基里连科 J。莫顿 J。Soussana J. Schmidhuber F.N. Tubiello 食品、纤维和森林产品 M.L. Parry O.F. Canziani J.P. Palutikof P.J. van der Linden C.E. Hanson 气候变化2007: 影响、适应和脆弱性。第二工作组对政府间气候变化专门委员会2007年剑桥大学出版社第四次评估报告的贡献剑桥,英国欧共体,2006年欧共体2050年世界能源技术展望(WETO H2)2006年欧盟委员会布鲁塞尔埃登霍费尔等人,2010 O。 Edenhofer B。 Knopf T。 Barker L。 Baumstart E。 Bellevrat B。 Chateau P。 Criqui M。 Isaac A。 Kitous S。 Kypreos M。 Leimbach K。 Lessmann B。 Magné S。 Scrieciu H。 Turton D。Van Vuuren 低稳定性的经济学: 减缓战略和成本的模型比较能源杂志31 SI-120101148环境变化研究所,2009环境变化研究所,2009。国际气候会议-4度及以上。英国牛津大学环境变化研究所,9月28日至30日。Feyen J.I. Barredo R. Dankers 全球变暖和城市土地利用变化对欧洲洪水的影响。迈向工程、设计和管理方法的一体化2009 Taylor and Francis Group London Fischer et al。 ,2007 G. Fischer F.N. Tubiello H. van Velthuizen D.Wiberg 气候变化对灌溉用水需求的影响: 缓解的影响,1990-2080技术预测和社会变化747200710831107 Fisher et al。时间序列: K.Jiang M.Kainuma E. La Rovere A. Matysek A. Rana K. Riahi R. Richels S. Rose D. van Vuuren R. Warren P. Ambrosi F. Birol D. Bouille C. Clapp B. Eickhout T. Hanaoka M.D. Mastrandrea Y.Matsuoko B.O’Neill H. Pitcher S. Rao F. Toth 与长期减缓有关的问题:。减缓气候变化。第三工作组对2007年政府间气候变化专门委员会剑桥大学出版社第四次评估报告的贡献,2010年 A. Hayashi K. Akimoto F. Sano S. Mori T. Tomoda 评估全球变暖对不同稳定水平的影响,作为确定长期稳定目标气候变化的一个步骤98201087112 Hilderink et al。 ,2008 H. Hilderink P.L。卢卡斯 · A · 滕霍夫 · 科克 · M · 德沃斯 · P · 詹森 · J · 梅耶尔 · A · 费伯尔 · A · 伊格纳修克 · A · 彼得森 · H · J。M.2008年荷兰环境评估机构 Bilthoven Hinkel 和 Klein,2009年 J。T.Klein 整合知识以评估海平面上升对沿海脆弱性的影响: DIVA 工具全球环境变化的发展1932009384395 Hirabayashi and Kanae,2009 Y. Hirabayashi S. 
Kanae 第一次估计未来全球人口面临洪水风险水文研究快报3200969 Hof 等,2009a A.F。霍夫 K.C。德布鲁姆 R.B。Dellink M.G.J.Elzen D.P.不同减缓战略对国际适应融资的影响环境科学和政策1272009832843 Hof 等,2009b A.F。霍夫・德・布鲁姆・ R ・德林克・ M.G。J.Elzen D.P.2012年后的全球气候治理: 架构、机构和适应2009年剑桥大学出版社剑桥霍夫等人,2008年 A.F。Hof M.G.J.Elzen D.P.范维伦分析气候政策的成本和收益: 价值判断和科学不确定性全球环境变化1832008412424 IMAGE-team,2001 IMAGE-team,2001。IPCC SRES 情景的图像2.2实现。全面分析21世纪的排放、气候变化和影响。RIVM 光盘出版物481508018,比尔特霍芬国家公共卫生与环境研究所。政府间气候变化专门委员会,2007年,2007年。2007年气候变化: 综合报告。第一、第二和第三工作组对政府间气候变化专门委员会第四次评估报告的贡献。政府间气候变化专门委员会,日内瓦,104页。能源政策背景下的全球住宅部门供暖和空气调节的能源需求建模。处理 UKCIP02气候变化情景中的不确定性。埃克塞特气象局哈德利中心技术说明44。Kinney et al。 ,2008 P.L。 Kinney M.S. O’Neill M.L。 Bell J. Schwartz 用于估计气候变化对与热有关的死亡的影响的方法: 挑战和机会环境科学和政策11872008 Klein et al。 ,2007 R.J.T. Klein S. Huq F. Denton T.E. Downing R.G. Richels J.B. Robinson F.L. 适应和缓解之间的相互关系。2007年气候变化。影响、适应和脆弱性。第二工作组的贡献。2007年政府间气候变化专门委员会报告剑桥大学出版社剑桥745777 Krol 等人,1997年 M. Krol J. Alcamo R. Leemans 稳定大气中二氧化碳的全球和区域影响全球变化的缓解和适应策略11997年341361 Kundzewicz 等人,2010 Z.W。Kundzewicz y. Hirabayashi 气候变化中的 S. Kanae River 洪水——水资源管理的观察和预测2010年10.1007/s11269-009-9571-6 Kundzewicz 等,2007 z.W。Kundzewicz L.J.Mata N. Arnell P. Döll P. Kabat B. Jiménez K. Miller T. Oki Z. en I. Shiklomanov 淡水资源及其管理。招架。坎齐亚尼 J.P。普鲁提克。Hanson P.J.2007年范德林登气候变化: 影响、适应和脆弱性。第二工作组对2007年政府间气候变化专门委员会剑桥大学出版社第四次评估报告的贡献。翻译。D.确定自然植被、作物和农业生产力的潜在全球分布水、空气和土壤污染761994133161 Lenton 等人,2008 T.M。Lenton H 拘留了 E. Kriegler J.W。鲁赫特 · S · 拉姆斯托夫 · H · J 大厅。Schellnhuber 地球气候系统中的倾斜元素美国国家科学院院刊1056200817861793。Manne R.G.Richels Merge: 全球气候变化综合评估模型。Zaccour Energy and Environment 2005 Springer USA McMichael et al。 ,1996 A. McMichael A. Haines R. Sloff S. Kovats Climate Change and Human Health 1996 World Health Organization Geneva Mechler et al。 ,2010 R. Mechler S. Hochrainer A. Aaheim Z. Kundzewicz N. Lugeri M. Moriondo H. Salen M. Bindi I. Banaszak A. Chorynski E. Genovese H. Kalirai J. Linnerooth-Bayer C. Lavalle D. McEvoy P. Matczak M. RadzieJewski D. Rübbelke M.-J。评估欧洲适应不断变化的洪水和干旱风险的风险管理方法 M.Hulme H. Neufeldt 让气候变化为我们服务: 关于适应和减缓战略的欧洲观点2010年剑桥大学,英国 Meehl 等人,2007年 G.A. Meehl T.F. Stocker W.D. Collins Friedlingstein A.T. Gaye J.M. Gregory A. Kitoh R. Knutti J.M. Murphy A. Noda S.C.B. Raper I.G. Watterson A.J. Weaver Z- C。赵全球气候预测2007所罗门气候变化: 物理科学基础。第一工作组对2007年剑桥大学出版社第四次评估报告的政府间气候变化专门委员会。减缓气候变化。第三工作组对2007年政府间气候变化专门委员会剑桥大学出版社第四次评估报告的贡献剑桥,United Kingdom Mirza 等,2003 M.M。问:。Mirza R.A.Warrick N.J.气候变化对恒河、 Brahmaputra 和梅格纳河洪水的影响孟加拉国气候变化572003287318 Moriondo et al。欧洲农业应对气候变化和可变性的影响和适应机会全球变化的缓解和适应战略1572010657679 Moss 等,2010 R.H。Moss J.A.Edmonds K.A.Hibbard M.R.Manning S.K.Rose D.P.Van Vuuren T.R.Kainuma T. Kram G.A.Meehl J.F.B.Mitchell N. Nakicenovic K. Riahi S.J.史密斯 R.J。早上吃饱。汤姆森 J.P。Weyant T.J.Nakicenovic 等,2000 N。 Nakicenovic 排放情景特别报告(SRES)2000剑桥大学出版社剑桥,英国 Nakicenovic 等,2000,nakicenovic P. Kolp K. Riahi M. Kainuma T. Hanaoka 排放情景评估重新审视环境经济学和政策研究732006137173 Nicholls and Lowe,2004 R.J。Nicholls J.A.全球环境变化1432004229244 Nicholls 等,2010 R.J。Nicholls N. Marinova J.A.劳 · S · 布朗 · P · 维林加 · D · 德 · 古斯芒 · J · 欣克尔。J.21世纪英国皇家学会哲学汇刊2010年10月10日。2010.029 Nordhaus and Boyer,2000 W.D.诺德豪斯 · J · 博耶(Nordhaus J. Boyer)《全球变暖: 2000年全球变暖的经济模型》麻省理工学院出版社,剑桥,马萨诸塞州,第10页。315-328 Nordhaus,2008 W.D.平衡的问题衡量全球变暖政策的选择2008年纽黑文和伦敦帕里等人的耶鲁大学出版社,2007年。招架。坎齐亚尼 J.P。PJ 的 Palutikof。Van der Linden C.E.2007年汉森气候变化: 影响、适应和脆弱性。返回文章页面第二工作组对2007年剑桥大学出版社第四次评估报告的贡献政府间气候变化专门委员会:?气候变化9932010383402 Piani 等人,2005 C. Piani D.J。陷害地方检察官。Stainforth M.R.艾伦对气候变化的约束来自数千名成员的模拟地球物理研究通讯322005 L23825 Price,2005 c 价格关于环境变化影响的代际视角: 折现未来的观点 J.L。Innes 通用汽车。Hickey H.F.2005年国际林业研究组织联合会(国际林研组织)维也纳罗斯等人,2007年,Ahammad,h,a,Rao,S. ,Riahi,K. 
,van Vuuren,D. 2007.土地在气候稳定模拟中的作用: 初步观测能源模拟论坛报告。斯坦福大学。Schneider 和 Kuntz-Duriseti,2002 S.H。不确定性与气候变化政策。奈尔斯气候变化政策: 一项调查2002年岛屿出版社华盛顿特区斯特恩,2006年 N。斯特恩气候变化经济学评论2006年剑桥大学出版社剑桥斯沃特等人,斯瓦特 · L · 伯恩斯坦 · M · Ha-Duong A. Petersen 同意不同意: 政府间气候变化专门委员会评估气候变化、影响和应对措施的不确定性管理922009129斯瓦特和雷斯,2007 R · 斯瓦特 · F · 雷斯将适应和缓解工作结合起来: 纳入可持续发展政策的主流?气候政策742007288303 Tol,2002a。第二部分。环境和资源经济学2122002135160托尔,2002年 b。第一部分。基准估计环境和资源经济学21120024773 Tol,2002 c R.S.J. Tol 福利规格和气候变化的最佳控制: 基金的应用能源经济学2442002367376 Tubiello and Fischer,2007 F.N. Tubiello G. Fischer 减少气候变化对农业的影响: 减缓的全球和区域影响,2000-2080年技术预测和社会变化747200710301056联合国,2005年。世界人口前景: 2004年修订本。光盘版-扩展数据集。联合国出版物。E.05.十三.12,联合国,经济和社会事务部,人口司。Van Vliet et al。 ,2009 J.van Vliet M.G.J.den Elzen D.P. van Vuuren 会议根据延迟参与的能源经济31号补充辐射效应制定了目标。22009 S152 S162 van Vuuren et al。 ,2006 D.P. van Vuuren B。 van Ruijven M。 Hoogwijk M。 Isaac B。 De Vries TIMER 2: 模型描述和应用 L。 Bouwman T。 Kram K。 Klein-Goldewijk 全球环境变化综合模型。IMAGE 2.42006 MNP 概述-荷兰环境评估机构 Bilthoven van Vuuren 等,2007 D.P。Van Vuuren M.G.J.登伊尔森私人侦探社。Lucas B. Eickhout B.J.稳定低水平的温室气体浓度: 减少战略和成本的评估气候变化8122007119159 Van Vuuren 等,2008 a D.P。Van Vuuren B. De Vries A. Beusen P.S.C.21世纪温室气体排放的有条件概率估计基于 IPCC-SRES 设想的全球环境变化1842008635654 van Vuuren 等,2008b D.P。Van Vuuren M Meinshausen G.K.普拉特纳 · F · 乔斯 · K.M。斯特拉斯曼 S.J。Smith T.M.L.Wigley S.C.B.Raper K. Riahi F. De La Chesnaye M.G.J.登藤野 K。江 N。 Nakicenovic S。 Paltsev J.M。21世纪减缓美国国家科学院院刊的温度上升。Van Vuuren M.G.J.艾萨克不同气候制度的比较: 扩大参与的影响能源政策3712200953515362 van Vuuren 等,2010 D.P。Van Vuuren E. Stehfest M.G.J.艾萨克探索将温室气体辐射效应控制在3瓦/平方米以下的情景能源经济学31特刊12010165192维梅尔和拉姆斯托夫,2009年 M. Vermeer S. Rahmstorf 与全球气温美国国家科学院院刊相关的全球海平面10620092152721532 Weitzman,2008 Weitzman,M.L。,2008年。灾难性气候变化的经济学模型与解释。Wigley,2003 T.M.L. Wigley MAGICC/SCENGEN 4.1: 技术手册2003 UCAR-气候和全球动力学分部博尔德,CO Wigley and Raper,2001 T.M.L. Wigley S.C.B. Raper 全球平均变暖科学高预测的解释2932001451454|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Fairness+in+Graph+Machine+Learning:+Recent+Advances+and+Future+Prospectives)|0| -|[Socially Responsible Machine Learning: A Causal Perspective](https://doi.org/10.1145/3580305.3599571)|Raha Moraffah, AmirHossein Karimi, Adrienne Raglin, Huan Liu|Shanghai Lixin Univ Accounting & Finance, Shanghai, Peoples R China; Pukyong Natl Univ, Grad Sch Management Technol, Busan, South Korea|The underlying assumption of using investor sentiment to predict stock prices, stock market returns, and liquidity is that of synergy between stock prices and investor sentiment. However, this synergistic relationship has received little attention in the literature. This paper investigates the synergistic pattern between stock prices and investor sentiment using social media messages from stock market investors and natural language processing techniques. At the macro level, we reveal extremely significant positive synergy between investor sentiment and stock prices. That is, when a stock price rises, investor sentiment rises, and when a stock price falls, investor sentiment falls. However, this synergy may be reversed or even disappear over a specific time period. Through a segmented measurement of the synergy between stock prices and investor sentiment over the course of a day, we also find that investor sentiment on social media is forward looking. This provides theoretical support for using investor sentiment in stock price prediction. We also examine the effect of lockdowns, the most draconian response to COVID-19, on synergy between stock prices and investor sentiment through causal inference machine learning. 
Our analysis shows that external anxiety can significantly affect synergy between stock prices and investor sentiment, but this effect can promote either positive or negative synergy. This paper offers a new perspective on stock price forecasting, investor sentiment, behavioral finance, and the impact of COVID-19 on the stock markets. Copyright (c) 2022 Borsa Istanbul Anonim Şirketi. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).|利用投资者情绪来预测股价、股市回报和流动性的基本假设是股价和投资者情绪之间的协同作用。然而,这种协同关系在文献中很少受到关注。本文利用来自股市投资者的社交媒体信息和自然语言处理技术,研究了股票价格与投资者情绪之间的协同关系。在宏观层面,我们发现投资者情绪与股价之间存在极其显著的正协同效应。也就是说,当股价上涨时,投资者情绪上升,当股价下跌时,投资者情绪下降。然而,这种协同作用可能会逆转,甚至在特定的时间段内消失。通过对股票价格和投资者情绪在一天内的协同效应进行分段测量,我们还发现,社交媒体上的投资者情绪具有前瞻性。这为利用投资者情绪进行股价预测提供了理论支持。我们亦会透过因果推理机器学习,研究封锁对股价与投资者情绪之间的协同效应的影响。封锁是对2019冠状病毒疾病最严厉的回应。我们的分析表明,外部焦虑可以显著影响股票价格和投资者情绪之间的协同效应,但这种效应可以促进正面或负面的协同效应。本文提供了一个新的视角股票价格预测,投资者情绪,行为金融学,以及2019冠状病毒疾病对股票市场的影响。版权所有(c)2022伊斯坦布尔博尔萨 Anonim Şirketi。由 Elsevier B.V 出版。这是 CC BY-NC-ND 许可证下的一篇开放存取文章( http://creativecommons.org/licenses/BY-NC-ND/4.0/)。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Socially+Responsible+Machine+Learning:+A+Causal+Perspective)|0| +|[Fairness in Graph Machine Learning: Recent Advances and Future Prospectives](https://doi.org/10.1145/3580305.3599555)|Yushun Dong, Oyku Deniz Kose, Yanning Shen, Jundong Li|Polish Academy of Sciences; University of Cambridge; University of Grenoble; Potsdam Institute for Climate Impact Research; Netherlands Environmental Assessment Agency; Vienna University of Economics and Business; VU University Amsterdam; University of Reading|Scenarios are used to explore the consequences of different adaptation and mitigation strategies under uncertainty. In this paper, two scenarios are used to explore developments with (1) no mitigation, leading to an increase of global mean temperature of 4 °C by 2100, and (2) an ambitious mitigation strategy, leading to a 2 °C increase by 2100. For the second scenario, uncertainties in the climate system imply that a global mean temperature increase of 3 °C or more cannot be ruled out. Our analysis shows that, in many cases, adaptation and mitigation are not trade-offs but supplements. For example, the number of people exposed to increased water resource stress due to climate change can be substantially reduced in the mitigation scenario, but adaptation will still be required for the remaining large numbers of people exposed to increased stress. Another example is sea level rise, for which, from a global and purely monetary perspective, adaptation (up to 2100) seems more effective than mitigation. From the perspective of poorer and small island countries, however, stringent mitigation is necessary to keep risks at manageable levels. For agriculture, only a scenario based on a combination of adaptation and mitigation is able to avoid serious climate change impacts. Keywords: Scenarios; Integrated assessment; Climate change; Mitigation; Adaptation; Climate impacts. 1 Introduction Scenario analysis forms a very important tool in the assessment of climate change and climate change policy, allowing analysts to explore the complex and uncertain future interactions between factors like economic development, greenhouse gas (GHG) emissions, climate and ecosystems. Together these factors determine the need and the possibilities for mitigation and adaptation policy.
Scenarios can also act as a means to harmonize assumptions across the very different research communities involved in the fields of climate research, allowing a better comparison of their results. As such, scenarios have been used extensively in both mitigation and adaptation studies (see Metz et al., 2007; Parry et al., 2007 ), especially the scenarios from the Special Report on Emissions Scenarios (SRES) ( Nakicenovic et al., 2000 ). Moss et al. (2010) point out that, since SRES, the information requirements placed on scenario analysis have been changing. First, there is an increasing interest in exploring the relationships between adaptation and mitigation. As indicated by Moss et al. (2010) , this would require a further integration of information across the different analytical traditions involved in climate research. Secondly, there is also an increased interest in scenarios that explicitly explore the impact of climate policies in addition to the climate policy-free scenarios explored so far. Specifically, there is a strong interest in being able to evaluate the “costs” and “benefits” of long-term climate goals vis-à-vis the situation without climate policy. In this paper, we follow this line of thought and explore how scenario analysis can contribute to a joint assessment of future adaptation and mitigation strategies. Such a joint assessment can be useful for several reasons: (1) the preferred mitigation strategy depends on expected climate impacts and adaptation costs, (2) it takes account of the limitations of adaptation to climate change, (3) some adaptation and mitigation strategies may interact and (4) finally, impacts of climate change may have important feedbacks that need to be taken into account. Such analysis is most useful at a strategic level, and not for individual adaptation (or mitigation) decisions. Given this purpose, we discuss in the paper two main scenarios that include elements of adaptation and mitigation strategies (see further in this paper), resulting in an increase of global mean temperature of 4 °C and 2 °C by the end of this century. These two temperature levels have started to become iconic numbers, representing a potential outcome in the situation without mitigation policy (4 °C) and the temperature target of international climate negotiations (2 °C) ( Copenhagen Accord, 2009 ). Arguably, understanding the implications of these two temperature levels is essential if political leaders are to make informed choices about the balance between mitigation, adaptation and climate impacts ( Environmental Change Institute, 2009 ). Integrated assessment of mitigation and adaptation strategies is hampered by methodological differences. Integrated assessment models have difficulties describing adaptation processes given the importance of local circumstances ( Patt et al., 2010 ). A practical problem is that to date a considerable part of the impact literature has concentrated on impacts under no-policy scenarios (exceptions include Arnell et al., 2002; Bakkenes et al., 2006; Hayashi et al., 2010; Krol et al., 1997; Nicholls and Lowe, 2004 ). This paper therefore presents a generalised scenario assessment based on coupled pieces of information, but without pretending to be complete or fully integrated. As a learning-by-doing exercise, the paper intends to show important differences between a 4 °C and a 2 °C world, but also to identify some of the practical issues involved in performing integrated scenario analysis.
This implies that the most important advancement compared to the existing literature is that we present a multi-sector analysis based on consistent scenarios. Given the state of the art of current integrated assessment models, the experiments have been done using several loosely coupled models. As a result, several important linkages could not be addressed, such as that between the adaptation responses for agriculture, which may involve irrigation (see Section 5.3), and water demand (Section 5.4). In fact, an important question raised in the paper is whether a fully integrated analysis is needed or whether partial integration is sufficient. The paper is organized as follows: we first discuss some of the methodological complications in developing scenarios that can provide information for both adaptation and mitigation policy decisions. Next, we discuss the differences between the two main scenarios in terms of socio-economic drivers (Sections 3 and 4). In Section 5 we explore the potential consequences of adaptation and mitigation strategies on various impacts of climate change. 2 Assessment of climate strategies and scenario development (theory and methods) 2.1 Different strategies in response to climate change Climate change and the responses to it can lead to three forms of costs (not necessarily monetary): (1) the (residual) costs of climate impacts, (2) the costs of adaptation and (3) the costs of mitigation. At least theoretically, this corresponds to three different strategies: (1) "laissez faire" (accept climate change), (2) focus on adaptation, and (3) focus on mitigation, as illustrated conceptually in Fig. 1 (see also Klein et al., 2007). While Fig. 1 suggests that the costs and benefits of mitigation, adaptation and residual damages can be traded off against each other, there are conceptual and analytical problems that complicate such an approach. These relate to spatial and temporal scales, and to risks and uncertainty (Swart and Raes, 2007). Mitigation and adaptation are processes that take place at different spatial scales. While mitigation action is often taken at the national or local scale, the benefits are shared globally. As a result, a critical factor in the success and costs of climate policy is the degree of international cooperation (Barker et al., 2009; Clarke et al., 2010; van Vliet et al., 2009; van Vuuren et al., 2009). For adaptation, in contrast, both costs and benefits occur on multiple scales, from local to national and even international. An enabling environment at a larger scale can still enhance adaptation at a smaller scale (e.g. local capacity-building funded by international financing mechanisms). For these kinds of reasons, assessments of mitigation tend to concentrate on the global level, while, by contrast, adaptation research mostly focuses on the local scale. The dynamics over time of mitigation and adaptation are also an important factor. Stringent mitigation scenarios typically require strong, early reduction of emissions. Climate change impacts of these scenarios, however, will in the short term (the first decades) hardly differ from those in scenarios without climate change policy, due to the large inertia of the climate system. In contrast, some associated impacts (e.g. co-benefits in reduced local air pollution) may be realized at a much faster pace. Adaptation measures are likely to yield private and social benefits over the near term. For instance, simple adaptation measures such as air conditioning can bring clear short-term benefits.
Some important exceptions exist that may require decades to implement, such as changes in spatial planning or large-scale engineering works for flood protection (see Hallegatte, 2009). Other important factors are risk and uncertainty. Our understanding of climate change faces many uncertainties. Key uncertainties to be identified comprise epistemic, data, model, and ontic uncertainties (Schneider and Kuntz-Duriseti, 2002; van Vuuren et al., 2008a). Examples of factors that involve uncertainty are (i) future emissions, (ii) the climate system, (iii) future vulnerability and exposure to climate risks and (iv) mitigation costs. Taking mitigative action reduces some uncertainties, since it reduces the originating sources of climate change and reveals the actual mitigation costs (Barker, 2003; Piani et al., 2005). Mitigation may, however, also add to risks. For example, bio-energy, if implemented unsustainably, may offset one set of risks (climate change) while creating another, different set of risks (biodiversity loss and reduced food security). One way of dealing with risks is to include assessments of probabilities. This is often done using past evidence, extrapolated to cover specific future circumstances. Other uncertainties (for instance unknowable shocks and surprises) are more difficult to deal with in a quantitative sense, but justify acknowledgement of ignorance. Scenarios can be used to explore the potential for extreme events and the robustness of various policy portfolios, but this is not often done (Berkhout et al., 2002). Traditionally, the disciplines involved in mitigation research and adaptation research have different ways of describing uncertainty. While mitigation research often uses quantitative methods and concentrates on mean estimates, adaptation research often focuses more on qualitative descriptions of uncertainty and concentrates on the risks of hazardous events, even if these have a low probability of occurrence. These different perceptions of uncertainty may complicate an integrated assessment of different strategies (Swart et al., 2009). 2.2 Types of scenarios We can group scenarios into different classes based on how they treat mitigation and adaptation. First, we define a baseline scenario as a trajectory of events assuming no major feedbacks from climate change and no specific policy efforts on either mitigation or adaptation (such a scenario may still include many actions that indirectly influence the ability to mitigate or adapt to climate change; for instance, increasing income levels can be expected to coincide with greater investment in health services, reducing the risks of climate-related diseases such as malaria). The main purpose of this type of scenario is analytical, serving as a point of reference for other scenarios. Second, adaptation scenarios describe a world in which societies are responding to climate change impacts. Their purpose is to explore the types of technologies and policies required to adapt to climate change, the avoided damage and the associated costs. Adaptation includes so-called autonomous adaptation (i.e. actions that occur without specific government action) and planned adaptation. Third, mitigation scenarios describe a world including policies aiming to limit climate change. Their purpose is to explore the types of technologies and policies required to minimize climate change, and the associated costs.
As there will always be remaining impacts, the fourth class, adaptation and mitigation scenarios, combines both types of responses to climate change. Possibly, this fourth category of scenarios could re-order policy options according to the synergies that might exist between adaptation and mitigation options, e.g. for some re-afforestation options. Each of these scenarios is connected to a broader social, political and cultural context in which it is assumed to arise. In exploring a preferred mix of mitigation, adaptation and residual damage, two main approaches exist: (i) the impact- and risk-based approach, which describes potential impacts as a function of global mean temperature increase (and thus mitigation), and (ii) cost–benefit analysis, which identifies monetary costs and benefits in order to maximize welfare (see for instance Nordhaus, 2008; Tol, 2002c). In both cases, we believe it to be more useful, and more reflective of the issue, to describe the relationships between different response strategies than to seek to determine an optimum. Given the complexities and uncertainties laid out in Section 2.1, we believe no optimal mitigation, adaptation or combined strategy can be pursued in reality. 2.3 Integrated analysis An integrated analysis of mitigation and adaptation can be achieved in different ways: e.g., by using one single, so-called integrated assessment model, or by exchanging information between different models and disciplines, assessing the available literature and making results comparable. Both methods are organized around the cause–effect chain of climate change, i.e. describing the relationship between economic activities (income, energy use, agriculture, etc.), emissions, climate change and impacts – and the related feedbacks (Fig. 2). The scheme in fact also forms the backbone of information flows around scenarios for the IPCC reports (Moss et al., 2010). Scenarios are developed first by integrated assessment and emission modelers (focusing on economic driving forces, energy and land use, and GHG emissions (IPCC "Working Group III")). Subsequently, the emission trajectories are used in climate models to assess the impacts of climate change (IPCC "Working Group I"). Finally, the scenarios are used for impact, adaptation and vulnerability analyses (IPCC "Working Group II"). The involvement of different research disciplines and working groups implies that it is difficult to account for feedbacks between the different areas. Integrated assessment models capture only a limited number of the possible feedbacks (frequently omitted feedbacks include the impact of food and water security on population and economic drivers; relationships between water scarcity and food production; the impact of climate change on energy use; etc.). Ignoring (some of) these feedbacks may be reasonable if they are not substantial enough to significantly influence the system. For analytical reasons, there are major advantages to organizing scenario development within disciplinary fields and considering a limited number of feedbacks. It allows researchers to focus on elements of the chain that they understand well and to add the required amount of detail, without being confronted with the complications of interlinkages. However, this may change in a situation of increased focus on integrated analysis of mitigation and adaptation strategies. Some examples of why an integrated approach may be necessary are: i.
Climate impacts, such as those triggered by extreme events, may be so severe that they undermine the economic assumptions of the original scenario; ii. Climate impacts could be substantial in agriculture, so that estimates of land-use-related emissions that do not take impacts into account might be wrong, and the mitigation potential of bio-energy may be affected; and iii. There may be competing claims for land areas attractive for both mitigation and adaptation purposes. Thus, an interesting question is whether the need for more integrated analysis is so urgent that more complex modes of integration are needed (interactive coupling of models; one complex model), or whether the impacts can be handled separately, simplifying the analysis framework. The time horizon and the decision focus may also be important here, e.g. whether potential tipping points are taken into account (Lenton et al., 2008). The few available studies that have looked into this question seem to suggest that in most sectors the adaptation implications of any mitigation project are small, as are the emissions generated by most adaptation activities (Klein et al., 2007). The most integrated analyses to date come from cost–benefit-oriented integrated assessment models like FUND, DICE and MERGE (Manne and Richels, 2005; Nordhaus, 2008; Tol, 2002c) – but these models typically aggregate climate impacts into a limited number of rather abstract damage functions. We believe that over time, with the growing intensity of both mitigation and adaptation measures across many sectors, the need for joint assessment with sufficient detail will intensify. The scenarios presented here, based on the current state of the art in modeling and scenario development, take a first step. The same scenarios are used in one assessment for both mitigation and impact analysis, and we explicitly address mitigation and adaptation strategies (either as part of the scenarios or within the models used for the different impacts). However, many feedbacks are not accounted for. We come back at the end of the paper to the role of more integrated (but also more complex) scenarios. 2.4 Methods used in this paper As described above, several types of scenarios can be identified: baseline, mitigation, adaptation and adaptation–mitigation scenarios. These scenario types are also presented in this paper. For the baseline/adaptation scenario, we adopt intermediate assumptions for most socio-economic drivers. Scenario assumptions are described in Sections 3 and 4. The scenarios do not include mitigation, leading to a global mean temperature increase of 4 °C above pre-industrial levels by 2100. While we describe possible impacts and adaptation in these scenarios, we do not include feedbacks on the original drivers. In the mitigation scenarios, stringent mitigation efforts are included, leading to a global mean temperature increase of 2 °C. Using the median value for climate sensitivity given by the IPCC of 3 °C (Meehl et al., 2007), this translates into a stabilization level of around 450 ppm CO2-equivalent (CO2-equiv.); a back-of-envelope version of this conversion is sketched below. The impacts of climate policy on economic drivers are not accounted for – but several other relationships are coupled (e.g. land use). In most of the paper, we thus ignore potential impacts of climate change and climate policy on the economic assumptions. In Section 5.8, however, we discuss their impacts within a simple economic model (FAIR) to provide some insight into the possible size of the economic consequences at the global scale.
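For intuition, the sensitivity-to-concentration translation quoted above can be reproduced with the standard logarithmic forcing approximation. This is only a back-of-envelope sketch, not the MAGICC calculation used in the paper; the pre-industrial level of 278 ppm CO2-equiv. and the functional form are assumptions of the illustration.

```python
import math

def equilibrium_warming(conc_ppm, sensitivity=3.0, preindustrial_ppm=278.0):
    """Equilibrium warming (deg C) for a CO2-equivalent concentration, using
    the logarithmic forcing approximation dT = S * ln(C / C0) / ln(2),
    where S is the warming per doubling of CO2 (climate sensitivity)."""
    return sensitivity * math.log(conc_ppm / preindustrial_ppm) / math.log(2.0)

def stabilization_level(target_dt, sensitivity=3.0, preindustrial_ppm=278.0):
    """Invert the relation: the concentration consistent with a target."""
    return preindustrial_ppm * 2.0 ** (target_dt / sensitivity)

print(f"450 ppm CO2-eq -> {equilibrium_warming(450):.1f} deg C at equilibrium")
print(f"2 deg C target -> {stabilization_level(2.0):.0f} ppm CO2-eq")
```

With a sensitivity of 3 °C this returns roughly 2.1 °C for 450 ppm and about 440 ppm for a 2 °C target, consistent with the rounded 450 ppm figure used in the text.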
Several model tools are used. The scenarios are mainly developed using the IMAGE integrated assessment model (Bouwman et al., 2006). The IMAGE model describes developments in energy and land use in the 21st century based on assumptions for population and the world economy, combined with assumptions for technology development and consumption patterns. The model projects climate change (as indexed by global mean temperature change and sea-level rise) at the global scale, and constructs spatial scenarios for changes in monthly temperature and rainfall on a 0.5° × 0.5° grid by pattern scaling of downscaled climate-model patterns. The output of IMAGE is used in the model DIVA to describe sea-level rise; in the global hydrology model Mac-PDM to estimate consequences for water stress; in the TIMER energy model to estimate implications for heating and cooling demand; in the MARA/ARMA malaria suitability model for impacts on malaria; and in the FAIR model for a monetary cost–benefit analysis. Moreover, we discuss more generally the implications for agriculture (based on IPCC AR4) and extreme events. Appendix A provides a brief description of all models used. In our descriptions, we focus on the global level (in view of the limited space). Clearly, this leads to limitations in our discussion of adaptation. The experiments depend on the design of each model, and thus the number of scenarios that can be presented differs between impacts. This implies that the study should be interpreted as a first illustration of an integrated assessment, and not as a holistic study on adaptation and its limits. 3 Results: socio-economic trends in the baseline scenario 3.1 Population development and economic growth We assume that population follows the medium-fertility variant of the 2004 revision of the World Population Projections (UN, 2005) up to 2050, and the UN's long-range medium projections up to 2100 (Fig. 3). This implies that the global population steadily increases to almost 9.1 billion people by 2050 and stabilizes at about 9.2 billion people over the subsequent 50 years up to 2100. The scenario takes a middle ground within the range of population forecasts (see Fig. 3). For economic growth up to 2050, the scenario follows projections linked to the Cambridge model E3MG (Barker and Scrieciu, 2010; Barker et al., 2008). The scenario was extended beyond 2050 using the economic growth projections of the SRES-based B2 scenario (IMAGE-team, 2001). Quantitatively, the scenario is a medium-to-high economic growth scenario, which is mainly the result of optimistic growth assumptions for China and India. The OECD economies are projected to remain the richest in the world in per capita terms, but in terms of total economic activity the importance of developing regions grows rapidly. The growth of GDP per capita is between 0 and 2% per annum in Africa, the Middle East and Latin America. In Asia, it falls from the current high levels to 3% per annum in 2050. 3.2 Energy use and greenhouse gas emissions for the baseline scenario Energy use in the baseline scenario is made consistent with a baseline published by the European Commission (EC, 2006). Despite a further decrease in energy intensity, world energy consumption more than doubles in the 2000–2050 period and increases by another 25% in the 2050–2100 period (Fig. 4). Over the whole century, energy supply remains dominated by fossil fuels.
While oil and natural gas production peak and decline during the century, the use of coal increases during the whole scenario period. Non-fossil energy production also increases rapidly. Nuclear energy use increases by a factor of two to three, to 76 EJ, over the period until 2100; the use of biomass increases strongly, while hydro-electricity production increases by about 60–80%. The largest relative increase is that of wind and solar energy, which rises from less than 1% of all non-fossil energy to between 10 and 14% in 2050. Total renewable energy use is 120–140 EJ in 2050, and 190 EJ in 2100. The trends described above imply that emissions of CO2 from energy activities more than double in the period to 2050, and rise by another third between 2050 and 2100 (see Fig. 3). As such, the scenario forms an intermediate baseline scenario within the literature range (Fisher et al., 2007). Non-CO2 GHGs (in particular methane) increase steadily in the period 2000–2050, but at a slower rate than CO2 (as their driver, agriculture, is expected to grow more slowly than the energy sector). CO2 emissions from land use fall back to zero during the first half of the century. The area of agricultural land lies within the range of similar scenarios that have recently been published, although at the low end of the range (Rose et al., 2007). 4 Results for the mitigation scenario and climate scenarios 4.1 Energy use and greenhouse gas emissions The mitigation scenario aims at stabilising GHGs at around 450 ppm CO2-equiv. (see also van Vuuren et al., 2007, 2010). The scenario allows for an initial overshoot of the concentration to about 510 ppm CO2-equiv.; Den Elzen and van Vuuren (2007) have shown earlier that a limited overshoot of the concentration allows similar climate targets to be met at lower costs. Emission reductions are achieved in various ways. One element is increased energy efficiency, which reduces the total amount of energy use (a 20% reduction in 2050 compared to the baseline) (see Fig. 4). The scenario also shows an increasing use of energy from non-fossil sources, which accounts for most of the growth in total energy use. Non-fossil energy use increases from about 15% of total primary energy use in 2010 to more than 30% in 2050, and is over 40% of the total by the end of the century. Most of this growth is due to an increase in bio-energy use. Carbon capture and storage is applied in most remaining stationary uses of fossil fuels. Finally, non-CO2 greenhouse gas emissions are also reduced. As a result, global emissions peak around 2020 and decline further with time. Emissions are reduced by more than 70% compared to the baseline in 2050, and by more than 80% by 2100. The consequences of mitigation policies affect not only the energy sector, but also land use. Substantial additional land areas are used for afforestation and bio-energy (see Fig. 5). Model comparison studies show that the mitigation scenarios presented here are consistent with the current literature, although models show significant differences in the contribution of various reduction measures (Clarke et al., 2010; Edenhofer et al., 2010). According to the IMAGE model calculations, the abatement costs of the emission reductions are in the order of 1–2% of GDP (i.e. the annual additional expenditures, which can be compared to the current expenditure of around 1.5% of GDP on environmental policy in OECD countries) (Fig. 6). The literature range for comparable scenarios is in the order of 0.5–5.5% in 2100.
Most studies agree that these additional expenditures would lead to a reduction of GDP. We discuss this further in Section 5.8. 4.2 Climate change under the baseline and mitigation scenarios The atmospheric GHG concentrations and the associated global mean temperature change resulting from the emissions of the two scenarios are shown in Fig. 7 (solid lines indicate best-guess values), based on the IMAGE model calculations. The IMAGE model uses the MAGICC model to calculate changes in global mean temperature. The MAGICC model was used earlier for similar IMAGE scenarios by van Vuuren et al. (2008b) to calculate trajectories for greenhouse gas concentrations and temperature, including uncertainty ranges. Here, the uncertainty ranges used for the MAGICC calculations were based on existing runs of more complex carbon cycle and climate models. We have used the implied ranges in greenhouse gas concentration and temperature outcomes to depict the uncertainty ranges here as well, indicated by the shaded areas in the graph. For temperature, the wider shaded area indicates the uncertainty resulting from uncertainty in the carbon cycle and in climate sensitivity. For the baseline scenario, global mean temperature increases almost linearly to 2.1 °C above pre-industrial levels in 2050 and to 3.7 °C in 2100 (uncertainty range 3–5 °C). In the mitigation scenario, the global mean temperature increase by 2100 is limited to 1.9 °C. Again, there is considerable uncertainty: Fig. 7 indicates that by the end of the century the mitigation case could also lead to a temperature increase of 2.6 °C compared to pre-industrial levels. As the mitigation scenario presented here is among the most stringent in the scientific literature (cf. Clarke et al., 2010; Edenhofer et al., 2010; Fisher et al., 2007), two important conclusions can be drawn. First, the analysis indicates that global warming can be moderated but not halted. Second, the observation that even a stringent scenario could lead to considerably more climate change than 2 °C implies that hedging adaptation policies against more warming might have considerable value. For example, such a policy may be to '… aim for 2 °C, but prepare for 3 °C'. In the assessment of impacts below, we focus on the central climate change projections. Changes in mean monthly temperature and precipitation across the globe at the 0.5° × 0.5° scale, associated with the global average temperature changes, have been constructed by rescaling patterns derived from the HadCM2 climate model (Fig. 8). These patterns show that the change in annual mean temperature is larger at high latitudes than at low latitudes, and they show considerable spatial variation in the change in rainfall. Considerable disagreement exists about the expected patterns of climate change, especially for precipitation: the impact results presented in this paper therefore represent only one possible outcome. 5 Results: impacts and adaptation in the different scenarios 5.1 Introduction IPCC's Fourth Assessment Report (IPCC, 2007) gives an overview of climate impacts. Some of these impacts result from changes in average climate, but other impacts may result from changes in extreme events. Table 1 summarizes some of the impacts, for health, agriculture, water availability, coastal flooding, urban areas and the energy system, and large-scale disruptions of the climate system (biodiversity and ecosystem services, in contrast, have not been included).
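The pattern-scaling step described above reduces, in essence, to multiplying a per-degree GCM change pattern by the global mean warming from the simple climate model. Below is a minimal sketch; the warming pattern is made up (it merely mimics polar amplification) and stands in for the HadCM2-derived fields.

```python
import numpy as np

def pattern_scale(pattern, global_dt):
    """Pattern scaling: local change = (local change per degree of global
    mean warming, taken from a GCM run) x (global mean warming from the
    simple climate model). `pattern` is in deg C (or %) per deg C of
    global warming, on the impact models' 0.5 x 0.5 degree grid."""
    return pattern * global_dt

# Toy 360 x 720 monthly temperature-change pattern with amplified warming
# at high latitudes (an invented but typical shape).
lat = np.linspace(-89.75, 89.75, 360)
pattern = 0.8 + 0.9 * (np.abs(lat[:, None]) / 90.0) * np.ones((360, 720))

dT_baseline, dT_mitigation = 3.7, 1.9  # global mean warming in 2100 (Section 4.2)
local_4C = pattern_scale(pattern, dT_baseline)
local_2C = pattern_scale(pattern, dT_mitigation)
print(f"polar-cell warming: baseline {local_4C[0, 0]:.1f} C, mitigation {local_2C[0, 0]:.1f} C")
```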
As noted earlier, most of the literature has treated climate change as "a gradual phenomenon" (Agrawala and Fankhauser, 2008). This is problematic for impacts characterized by low probabilities coupled with high impacts (see below). In this exploratory analysis, we sketch some of the impacts and adaptation requirements. We aimed to cover several key impacts mentioned in Table 1, but the assessment was limited by the availability of models that could easily be coupled. Therefore, rather than intending to be exhaustive, the descriptions provide some indication of the magnitude of some impacts and key adaptation challenges. In presenting our results, we have used several new model runs based on the scenarios discussed above (e.g. for malaria, water resources, sea-level rise, and heating and cooling demand). We have, however, also assessed existing information from the IPCC Fourth Assessment Report in the context of the two scenarios presented here (temperature-related mortality, agriculture and extreme events). 5.2 Human health: temperature-related mortality and malaria Health impacts of climate change need to be seen in the context of other, more important drivers of human health, including lifestyle-related factors (Hilderink et al., 2008). We focus here on temperature-related mortality and malaria. 5.2.1 Temperature-related mortality Temperature-related mortality impacts may occur via changes in extreme temperatures, changes in average temperatures, or changes in the seasonal variation of temperatures, with the literature showing varying results. McMichael et al. (1996) estimated temperature-related mortality using relative risk ratios, showing that there is an optimum temperature at which the death rate is lowest (also known as the U-shaped dose–response relation). If temperatures increase, heat-stress-related mortality increases, but cold-related mortality decreases. Tol (2002a) concluded that in monetary terms the reduction in cold-related mortality due to climate change outweighs the increase in heat-related mortality. This conclusion is, however, influenced by the approach used to value a life, and it is also subject to the large uncertainties in the relationships between average and regional temperatures and between temperature and health. Adaptation may occur through the adjustment of human physiology to higher temperatures (McMichael et al., 1996), through changes in behavior, and through increased use of air conditioning (Kinney et al., 2008). Given the complexities in using dose–response relationships between temperature and mortality, we have not attempted to quantify these here. 5.2.2 Malaria Considerable attention has been paid to the relationship between malaria and climate change. In this paper, we also focus on climate-induced changes in malaria risks. Annually, more than one million people, mostly African children, die from malaria, a vector-borne infectious disease. The Anopheles mosquitoes (the vector which spreads the malaria infection) can only survive in climates with high average temperatures, no frost and sufficient precipitation. The MARA/ARMA malaria suitability model (Craig et al., 1999) incorporates these factors to determine climatically suitable areas. Mortality due to malaria is, however, also heavily influenced by factors such as access to preventative measures (including indoor spraying and insecticide-treated bed nets) and access to health care. In the MARA/ARMA model these factors are linked to income and urbanization.
Fig. 9 shows the results of this model for the scenarios of this paper. The impact of autonomous adaptation (as a function of rising income) reduces malaria deaths by around 50%, especially in Africa (mainly due to better provision of health care). In contrast, the impacts of climate – and especially the difference between the mitigation scenario and the baseline case – are much smaller. Mitigation reduces malaria health risks by about 2% (2050). Adaptation, therefore, has a much more decisive influence on malaria control than mitigation (this finding appears robust across the available literature). 5.3 Agriculture: impacts on yields Easterling et al. (2007) have synthesized a large amount of research on the impacts of climate change on crop growth, with and without adaptation. The results were summarized as a function of global mean temperature increase, although in reality changes in temperature and precipitation patterns and CO2 fertilisation all play a role. For instance, the impacts of CO2 fertilisation partly offset the impact of climate change. The results can be used to assess the climate impacts for our scenarios by using the best-fit polynomials from Easterling et al. (2007), which indicate the impact on yield as a function of mean temperature change; a stylized version of this calculation is sketched below. (Footnote 1: We have in each case taken the global mean temperature change for a scenario and used it as an indication of the average local temperature change to be expected. This means that our impact estimates are likely to be conservative, as the temperature increase is likely to be stronger than the global average over many land areas.) We looked at the impacts for the baseline (4 °C) and mitigation (2 °C) scenarios, with and without adaptation, for maize, wheat and rice (see Fig. 10; results are presented for tropical and temperate zones in 2100; these impacts are additional to the yield increases resulting from factors other than climate change). Although the results are very uncertain, some conclusions seem possible. First, the baseline scenario (no adaptation) causes a very substantial decrease in yields (relative to the situation without climate change) for all cases shown: climate change impacts may reduce yields for the aggregated regions shown by 10–35% for the crops studied (2050). Second, engaging in either mitigation or adaptation limits the decrease in yields. In the tropics, however, impacts remain negative and typically in the order of a 10% loss. Third, the combination of mitigation and adaptation may result in an improvement over today's situation. Agricultural impacts may be more positive for temperate regions, but only if the advantages of higher temperatures are not offset by the impacts of extreme weather. These results underline the need to look at both mitigation and adaptation. The results presented are based on the IPCC assessment and represent a wide range of models. The results can also be illustrated by individual studies. Tubiello and Fischer (2007), for instance, found that a mitigation scenario could significantly reduce the global costs of climate change in agriculture. Similarly, Fischer et al. (2007) illustrated the importance of adaptation for water irrigation requirements: they found that mitigation reduced agricultural water requirements by about 40%, leaving 60% of the impacts requiring adaptation. When dealing with impacts on agriculture, both drought and heat wave stress play important roles.
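Before turning to the European drought and heat results, the polynomial yield-response approach described above can be made concrete. The quadratic coefficients below are invented purely for illustration (they mimic the qualitative pattern of tropical losses and fading temperate gains reported above); the actual best-fit polynomials are those of Easterling et al. (2007).

```python
import numpy as np

# Hypothetical quadratic yield-response curves: % yield change as a function
# of mean warming (deg C). Shapes only mimic the qualitative findings above;
# the real coefficients are in Easterling et al. (2007).
CURVES = {
    ("tropics", "no adaptation"):   np.poly1d([-0.5, -4.0, 0.0]),
    ("tropics", "adaptation"):      np.poly1d([-0.5, -1.0, 3.0]),
    ("temperate", "no adaptation"): np.poly1d([-1.0, 2.0, 0.0]),
    ("temperate", "adaptation"):    np.poly1d([-1.0, 4.0, 4.0]),
}

for dt, label in [(4.0, "baseline (4 C)"), (2.0, "mitigation (2 C)")]:
    for (zone, adapt), poly in CURVES.items():
        print(f"{label:16s} {zone:9s} {adapt:13s} -> {poly(dt):+.0f}% yield")
```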
Fig. 11 shows, for Europe, the impact of drought and heat wave stress on crop yields for a 2 °C warming scenario, assuming various forms of adaptation (Mechler et al., 2010; Moriondo et al., 2010). (Footnote 2: Calculations were done using the CropSyst model on the basis of the HadCM3 climate model for the 2030–2060 time slice. Winter and summer crop yields were simulated for spring wheat with today's and future crop management practices. The adaptation options considered comprised shifting the sowing date by a few days and using cultivars with a longer/shorter growth cycle.) Results show that Southern Europe and parts of France are already particularly exposed to drought and heat stress today, and this situation is expected to worsen even under the 2 °C (mitigation) scenario (Fig. 11, panel A). When considering the two adaptation strategies in combination with mitigation (Fig. 11, panels B and C), many regions in Europe may actually benefit. Northern Europe, in particular, could exploit the advantage of higher precipitation by using crop varieties with a longer growing cycle. In contrast, in Southern Europe the same adaptation options would result in an added negative impact, since crop development would shift towards summer, when longer dry spells and heat waves may significantly affect crop growth. The results also show that while there are some region-specific limits to adaptation, overall adaptation would effectively reduce impacts on the agricultural sector in Europe. 5.4 Water resources: potential water availability The effects of the two scenarios on exposure to changes in water resources stress are assessed using a global-scale water resources impact model (Arnell, 2003). Fig. 12 shows the percentage change in average annual runoff by 2100 (relative to the 1961–1990 mean) under the baseline scenario and the mitigation scenario (with the HadCM2 climate model pattern). We define watersheds to be in a water-stressed condition if average annual runoff is less than 1000 m³/capita/year (other definitions are also used in the literature). The effect of climate change is indexed by summing (i) the populations living in water-stressed watersheds where runoff decreases (increases) significantly (typically by more than 5–10%) and (ii) the population living in watersheds that become water-stressed (cease to be water-stressed) due to climate change; this counting rule is sketched in code below. The numbers of people exposed to an increase and to a decrease in water stress due to climate change have not been summed, for two reasons: (i) the adverse effects of having less water are greater than the beneficial effects of having more water in a water-stressed catchment, and (ii) the regions with an increase and a decrease in exposure to water resources stress are widely separated, and "surpluses" in one area do not offset "deficits" in another. The results show substantial differences in exposure to increased water resource stress in 2050, 2080 and 2100 between the mitigation and baseline scenarios. In 2020, there is little difference in runoff between the two scenarios. Fig. 13 shows the numbers of people exposed to an increase or decrease in water resource stress due to climate change under the two scenarios. In both the baseline and the mitigation scenario, the number of people living in water-stressed watersheds who apparently benefit from increased water availability is larger than the number exposed to a reduction in runoff, but – as outlined above – we do not focus on the net effect.
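A minimal sketch of the exposure-counting rule just described, on toy watershed data. The 1000 m³/capita/year threshold is the one quoted above; the 7.5% cut-off for a "significant" runoff change is an assumption taken from within the 5–10% range mentioned in the text.

```python
import numpy as np

STRESS_THRESHOLD = 1000.0  # m3/capita/year (Section 5.4)

def exposure_to_increased_stress(runoff_base, runoff_clim, population,
                                 significant_change=0.075):
    """People exposed to *increased* water resources stress: those in
    water-stressed watersheds with a significant runoff decrease, plus those
    in watersheds pushed below the stress threshold by climate change.
    Increases and decreases in exposure are deliberately not netted out."""
    stressed_base = runoff_base < STRESS_THRESHOLD
    stressed_clim = runoff_clim < STRESS_THRESHOLD
    decrease = (runoff_clim - runoff_base) / runoff_base < -significant_change
    newly_stressed = ~stressed_base & stressed_clim
    exposed = (stressed_base & stressed_clim & decrease) | newly_stressed
    return population[exposed].sum()

# Toy data: per-capita runoff (m3/capita/yr) and population (millions).
base = np.array([1500.0, 900.0, 1100.0, 700.0])
clim = np.array([1400.0, 700.0, 950.0, 800.0])
pop = np.array([50.0, 120.0, 80.0, 60.0])
print(f"{exposure_to_increased_stress(base, clim, pop):.0f} million people exposed")
```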
The numbers of people exposed to changes in water resources stress are sensitive to the assumed pattern of climate change. Compared to the baseline, the mitigation scenario reduces the numbers exposed to an increase in water resources stress by 135 million (reducing impacts by 12%), 281 million (a 20% reduction) and 457 million (a 30% reduction) in 2050, 2080 and 2100, respectively. At the same time, however, there are also people benefiting from climate change. The relative size of the groups with positive and negative impacts depends on the climate model used (here only the Hadley pattern has been used). Clearly, mitigation also decreases the number of people benefiting from climate change. It is also clear that mitigation does not eliminate the water supply impacts of climate change, and adaptation will be required for the remaining billion people exposed to increased water resource stress due to climate change. Adaptation may include measures to increase water storage, transport of water, or reduction of water demand through increased efficiency. The underlying results show that the effects of mitigation vary significantly by region. In fact, in some regions mitigation may even increase the number of people exposed to increased stress. Specific uncertainty analysis shows that results are highly dependent on the uncertainty in the changes in the precipitation pattern due to climate change. 5.5 Sea level rise Another important impact of climate change is rising sea levels. Global mean sea-level rise has been projected for both scenarios using the MAGICC component of the IMAGE model. Due to the delayed response of sea level to global warming, the projections mainly diverge in the second part of the century: sea-level rise is 35 and 31 cm in 2050 in the 4 °C and 2 °C scenarios, respectively, and 71 and 49 cm in 2100. These projections do not include a potential accelerated contribution from the ice sheets of Greenland and Antarctica, which could lead to higher sea-level rise, but the underlying processes are insufficiently understood and are currently not included in climate models (Meehl et al., 2007; Nicholls et al., 2010; Vermeer and Rahmstorf, 2009). We use the DIVA model to assess both the damage and the adaptation costs of sea-level rise, associated storm surges and socio-economic development under the two scenarios, taking into account coastal erosion (both direct and indirect), forced migration, coastal flooding (including rivers) and salinity intrusion into deltas and estuaries. For each scenario the model is first run without and then with adaptation, in terms of raising dikes and nourishing beaches (DINAS-COAST Consortium, 2006; Hinkel and Klein, 2009). Further impacts, such as salinity intrusion into coastal aquifers and loss of coastal wetlands and biodiversity, as well as further adaptation options, such as salinity intrusion barriers, port upgrades, set-back zones and ecosystem-based protection, could not be included due to the unavailability of global data and general models of these processes. Fig. 14 shows that, independent of the level of mitigation, adaptation reduces global overall costs rather effectively, which illustrates the necessity of engaging in adaptation even under ambitious mitigation. At the aggregated scale, more damage can be avoided through an adaptation-only strategy than through a mitigation-only strategy, although a combination of the two has the strongest positive impact.
From the perspective of poorer and small island countries, however, stringent mitigation is necessary to keep risks at manageable levels. Even without sea-level rise, adaptation would be cost-effective in order to protect the assets situated in the floodplain, which increase due to socio-economic development alone. While this would involve substantial investment flows (tens of billions of US$ worldwide), they are a relatively small fraction of global GDP, even for sea-level rise at the level of the baseline scenario. However, for individual countries or regions (particularly small island states) these costs can be a much larger fraction of GDP, including the risk of complete loss. 5.6 Heating and cooling demand (settlements and society) Climate change is likely to influence the demand for space cooling and heating. We have therefore developed a set of simple relationships to describe heating and air-conditioning demand in the residential sector and explored the impacts of climate change on this simulated energy demand (Isaac and van Vuuren, 2009); a stylized degree-day calculation is sketched in code below. Clearly, changes in population and income are projected to lead to considerable growth in the energy demand for heating and air conditioning in the coming century (see Fig. 15, no-climate-change case). Driven by climate, changes in cooling and heating practices are examples of autonomous adaptation (i.e. without policy intervention). Adaptation is not universal, however, since the population will not always be able to respond. Unfulfilled demand for heating and cooling can lead to health impacts (as described in Section 5.2) and to loss of labour productivity. In addition to these effects, there is reduced comfort when indoor temperatures rise above a given level. Fig. 15 shows that, globally, the autonomous increase in energy demand due to increasing income and wealth (without taking climate change into account) is much larger than the difference between the energy demand in the baseline scenario and in the mitigation scenario (Isaac and van Vuuren (2009) show this is a robust result for other baselines as well). The effect of climate change on combined energy demand is also smaller than the effect on heating and air conditioning separately, since increases in air conditioning compensate for decreases in heating. On the regional and country level, impacts can be far more significant: for example, in India we project a large increase in energy demand due to increased cooling, while in Western Europe and the USA we project a substantial decrease due to reduced heating. 5.7 Extreme events Climate change is expected to lead to changes in the frequency and intensity of some weather-related extreme events (Parry et al., 2007). Extremes like floods, droughts, heat waves and storm surges could become more frequent and intense, while cold extremes, such as cold spells, are likely to become less frequent and weaker. Assessing the risks of climate change based on changes in average conditions only runs the risk that changes in extreme-event risks are averaged out. A more risk-based, geographically explicit method is therefore preferable. However, knowledge on disaster impacts is complex and contested. To date, there are only a limited number of national-level studies taking a probabilistic approach to projecting future risk in the presence of climate change, mostly focusing on flood risk (Mechler et al., 2010). One such study on the pan-European scale, by Feyen et al. (2009), computed that expected annual damages would triple under a baseline scenario.
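As a stylized illustration of the degree-day logic behind the heating and cooling demand estimates of Section 5.6 (the calibrated TIMER functions are more elaborate; the 18 °C base temperature, 30-day months and the monthly climatology below are assumptions of this sketch, not model inputs):

```python
def degree_days(monthly_mean_temps, base_temp=18.0):
    """Heating and cooling degree-days from monthly mean temperatures,
    approximated with 30-day months. 18 deg C is a commonly used base
    temperature; it is an assumption here, not the TIMER calibration."""
    hdd = sum(max(base_temp - t, 0.0) * 30 for t in monthly_mean_temps)
    cdd = sum(max(t - base_temp, 0.0) * 30 for t in monthly_mean_temps)
    return hdd, cdd

# Invented monthly climatology for a temperate location, today vs. +3 deg C.
today = [2, 4, 8, 12, 17, 21, 24, 23, 19, 13, 7, 3]
warmer = [t + 3 for t in today]

for label, temps in [("today", today), ("+3 C", warmer)]:
    hdd, cdd = degree_days(temps)
    print(f"{label:6s} HDD={hdd:6.0f}  CDD={cdd:5.0f}")
```

Warming shrinks heating degree-days and grows cooling degree-days, which is why the two effects partly offset each other in the combined demand discussed above.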
A key constraint on quantitative risk approaches is the uncertainty in the climate projections. For precipitation, for instance, models often disagree on the sign of changes at the local scale. This is especially important for studies looking, for instance, at flood risk. While the Mechler et al. (2010) study aimed to project future risk, the authors found the projections to be so uncertain that they refrained from projecting future flood risk based on an estimate of today's flood impacts. Current models and data, however, seem sufficient to assess the combined risk of drought and heat wave stress on agriculture with a relatively high level of certainty (these are slower phenomena). Some examples of work in the context of the 2 °C and 4 °C scenarios are provided here. Several studies have looked into flood-affected people at the global scale (Hirabayashi and Kanae, 2009; Kundzewicz et al., 2010). Regression of samples shows that the average global number of people affected by 100-year floods per year is projected to be 211 million for the mitigation scenario (2 °C), compared to 544 million for the baseline (4 °C). Mirza et al. (2003) showed that for Bangladesh, a flood-vulnerable country, even the 2 °C scenario is expected to increase the projected flooded area by at least 23–29%. It should be noted, however, that the uncertainties about exposure, vulnerability and adaptation still lead to a wide range of estimates for the costs of future flood damage. With respect to drought, the projections made by Burke et al. (2006) show that the number of extreme drought events per 100 years and the mean drought duration are likely to increase by factors of two and six, respectively, for the baseline scenario by the 2090s. Evidence suggests that damages from weather- and climate-related impacts have already increased in the present day, but these increases are mainly due to growth in wealth and population (Bouwer, 2010). However, climate change is expected to intensify over time, and is likely to become a more significant contributor to rising damages in the future. The most recent IPCC report indicates that the costs of major events are expected to range from several percent of annual regional GDP and income in very large regions with very strong economies, to more than 25% in smaller areas (Parry et al., 2007). Disaster losses for highly exposed small island states have in the past in fact exceeded annual GDP (Cummins and Mahul, 2009). 5.8 Economic evaluation of impacts Cost–benefit analysis (CBA) is used to express the costs and benefits of different climate strategies in terms of a common monetary unit. We use the CBA module of the FAIR model (see Appendix A) here to obtain some idea of impacts at a more aggregated scale. For mitigation costs, the FAIR model uses the information from the IMAGE model presented earlier. The climate damage and adaptation cost functions used in FAIR are derived from the AD-DICE model (De Bruin et al., 2009a; Hof et al., 2009a). In short, AD-DICE estimates adaptation costs based on the damage function of the DICE model (Nordhaus and Boyer, 2000). AD-DICE separates this function into an adaptation cost function and a residual damage function, based on an assessment of each impact category described in the DICE model – agriculture, coastal zones, health, settlements, non-market time use, other vulnerable markets and catastrophic impacts. For this study, we assumed an optimal adaptation response to climate change
(i.e. given a level of temperature change, the model minimizes the sum of adaptation costs and residual impacts). The impact estimates used in DICE (and thus FAIR) include (i) real, measurable economic costs (so-called market costs) and (ii) other, intangible losses (non-market losses), which are monetized using the willingness-to-pay concept. The damage functions are not directly related to the physical or economic damages described earlier in this section, as they are derived from a separate source. It has been shown earlier that the FAIR estimates of adaptation costs are consistent with the range of values reported in the literature (Hof et al., 2009a). Under default settings of the FAIR model and a discount rate of 2.5%, the discounted costs of climate change impacts as a share of global GDP for the period 2005–2200 amount to nearly 4.5% in the baseline (Fig. 16); the discounting mechanics are sketched in code below. These costs may seem higher than suggested by the limited set of sectoral analyses presented above, but they include more sectors and also the impacts of possible catastrophic events (Nordhaus and Boyer, 2000). Annual costs rise sharply over time, reaching 17% in 2200 (note that impact estimates are very uncertain, and both higher and lower values can be found in the literature (Parry et al., 2007; Stern, 2006; Tol, 2002b)). Scenarios with only adaptation or only mitigation reduce discounted costs substantially, to around 2.5% (Fig. 16). Hof et al. (2008) have shown that the results of CBA of climate change are very sensitive to model assumptions, with the discount rate playing the most important role. The discount rate is especially important due to the different cost profiles over time of the adaptation-only and mitigation-only scenarios. (Footnote 3: A discount rate of 5% leads to discounted costs of 0.8% and 1.9% for the adaptation-only and mitigation-only scenarios, respectively. If a discount rate of 1.4% is used (equal to the discount rate used by Stern (2006)), the discounted costs are 3.2% and 2.5% for the adaptation-only and mitigation-only scenarios, respectively.) With our discount rate of 2.5%, the combination of mitigation and adaptation leads to the lowest discounted costs, namely 2% of GDP. Consistent with the literature, the adaptation investments are assessed to be smaller than the mitigation investments and residual damages. However, they are very important in limiting residual damages. Some important caveats need to be mentioned. First, the calculations cannot be regarded as reliable for the extreme tails of risks (i.e. low-probability, high-impact events). As a subjective assessment of how to handle such risks is involved, Weitzman (2008) questioned the usefulness of CBA for policymakers. Secondly, the value of the discount rate used to account for time preference and risk is heavily debated, with arguments relating to subjective time preference and risk perception (Nordhaus, 2008; Price, 2005; Stern, 2006). As mentioned above, the value of the discount rate can have a large effect on the results. Finally, non-market impacts need subjective quantification of damages; while it is difficult to monetize these impacts in general, it is even more difficult for irreversible changes, for example a warming of the oceans leading to the loss of coral reefs (Ackerman and Heinzerling, 2004). 5.9 Uncertainties in climate change, impacts and adaptation There are many sources of uncertainty in projections of future climate change and its impacts.
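Before detailing those uncertainties, the discount-rate sensitivity reported in Section 5.8 can be illustrated with a bare-bones present-value calculation. The 2% GDP growth rate and the quadratic damage path below are hypothetical stand-ins chosen only to show how the discounted share moves with the discount rate; they are not the FAIR model's internals.

```python
def npv(stream, rate, t0=2005):
    """Net present value of (year, value) pairs, discounted to t0."""
    return sum(v / (1.0 + rate) ** (y - t0) for y, v in stream)

# Hypothetical paths, sampled every decade: GDP grows 2%/yr; damages rise
# from ~0% of GDP toward the 17%-of-GDP-by-2200 figure quoted above.
years = list(range(2005, 2201, 10))
gdp = [(y, 1.02 ** (y - 2005)) for y in years]
dmg = [(y, g * 0.17 * ((y - 2005) / 195.0) ** 2) for (y, g) in gdp]

for r in (0.014, 0.025, 0.05):  # Stern-like, central, and high discount rates
    share = npv(dmg, r) / npv(gdp, r)
    print(f"discount rate {r:.1%}: discounted damages = {share:.1%} of discounted GDP")
```

Because damages are back-loaded while GDP grows steadily, a higher discount rate shrinks the damage share much faster than the GDP denominator, which is exactly the sensitivity noted in the footnote above.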
Uncertainties are associated with every step in the causal chain: emissions, climatic drivers (e.g. the carbon cycle), climate (mainly climate sensitivity and the pattern of climate change), and impacts (including adaptive capacity). As a result, different studies might give very different results for the same emission scenario. In fact, these differences are often larger than those arising in a particular model under different emission scenarios. For example, for precipitation changes at the end of the century, the multi-model ensemble mean exceeds the inter-model standard deviation only at high latitudes (Kundzewicz et al., 2007). Uncertainties in climate change projections increase with the length of the time horizon. In the near term (e.g. the 2020s), climate model uncertainties play the most important role, while over longer time horizons (e.g. the 2090s), uncertainties due to the selection of the emissions scenario become increasingly significant (Jenkins and Lowe, 2003). The impact of future climate change on extreme events is particularly uncertain. This is partly due to a mismatch between the large spatial and temporal scales of coarse-resolution climate models and the local occurrence and short life of some weather extremes (e.g. cloudburst precipitation and flash floods). As impacts and adaptation take place at the local scale, detailed information is needed – which implies an increase in uncertainty. The large uncertainty ranges suggest that planning for adaptation should not be based on a single scenario, but that a wide range of projections needs to be accounted for. 6 Conclusions In this paper, we have discussed how scenario analysis may contribute to the assessment of mitigation and adaptation strategies. We have also presented two integrated scenarios as a starting point for analysis. The scenarios have explicitly treated mitigation and adaptation action for several indicators – and cover several important linkages and feedbacks between socio-economic development and impacts (e.g. the impacts of climate change on land use and mitigation are accounted for). We specified impacts in those scenarios for a selected number of indicators, focusing mainly on mean climate changes. Based on our work, we draw the following conclusions: • By describing two contrasting sets of possible climate change trajectories for the world, we have created the basis for a more integrated analysis of the interaction between mitigation, adaptation and climate impacts. The first scenario (no mitigation) is expected to lead to a global mean temperature increase of around 4 °C by the end of the century (for the most likely values of climate parameters, and current economic trends). This scenario has high adaptation needs, as has been shown in some of our analyses. The second scenario assumes stringent mitigation and limits global mean temperature change to 2 °C, with a probability of 50%. Even under this scenario, substantial adaptation measures will be needed. • Integrated scenario analysis as presented here can form a good basis for exploring the different consequences of policy choices (including uncertainties); it is not feasible, given the uncertainties, to determine an optimal mix between mitigation, adaptation and residual damages.
As discussed in this paper, the weighing of the consequences of climate change and the various policy responses is complicated by large differences in scale, space and time; large uncertainties; and clear differences in interest between actors (whether they are perpetrators or victims of climate change, for instance). As a result, subjective interpretation of risks will always play an important role. Still, scenario analysis can provide a description of possible consequences and risks. At this stage, the monetary assessment of costs and benefits (Section 5.8) could not be linked to the description of physical change in the preceding sections. • Effective climate policy includes both adaptation and mitigation. Model calculations show that mitigation scenarios can be designed that limit the increase of global mean temperature to 2 °C for a best-guess climate sensitivity. However, even these stringent scenarios can still result in a global mean temperature increase of more than 2.5 °C (and at best a temperature increase of 1.5 °C), and in regional temperature changes that are far greater. The need for a combination of mitigation and adaptation has been shown for most of the impacts explored in this paper. For example, adaptation can be more effective than mitigation in dealing with sea-level rise (at least during the 21st century), but mitigation still has a role to play in reducing damages and the costs of adaptation. Agriculture presents an example where adaptation and mitigation are both clearly necessary. Crop yields are projected to suffer negative impacts in many regions due to climate change in the absence of both adaptation and mitigation action. Without stringent mitigation, adaptation could limit the negative impacts, but not remove them. An advantage of mitigation is that it affects all impact categories, while adaptation needs to be tailored to impacts and contexts. • While the impacts of climate change can be severe and, depending on subjective choices, may warrant stringent climate policy, the impacts assessed in this study (given the state of the art) are likely to remain secondary to the influences of population change and economic growth at the global scale. Yet important caveats apply (see below). While climate change may have an impact on millions of people, other challenges are likely to influence people and governance more significantly. It should be noted, however, that we have covered only a limited set of impacts and focused mostly on mean estimates of gradual climate change and, for instance, not on catastrophic, very high-impact, extremely low-probability events (Weitzman, 2008). Such events may in fact be so severe that the conclusion above no longer holds. If costs at the global scale remain relatively low, there is less need for global analysis to include all feedbacks on the main drivers in order to keep the storylines consistent. Clearly, at the local scale the situation is likely to be very different; impacts for individual countries can be far more substantial than at the global scale. For example, sea-level rise is very important for some low-lying island states and countries, which could be significantly affected by large adaptation costs and/or damages (up to complete destruction). For agriculture, positive and negative impacts are projected to occur in different places and at different times – with low-income countries often experiencing relatively more negative impacts.
Agriculture in temperate regions, where it is currently temperature-limited, could benefit. All in all, we believe it useful to pursue the development of integrated scenarios further, specifying them at a regional scale. While this paper presents a useful first step, it has also left many feedbacks unaccounted for. • The overall mitigation costs in this study are estimated to be in the order of 1–2% of GDP for the 2 °C scenario. The mitigation scenario reduces the risks of climate change. There are several types of benefits of investments in mitigation. First, climate-related damages and the costs of adaptation are reduced. Second, uncertainty is also reduced, which is important given the risks involved. While we argue there can be no optimal trade-off between mitigation and adaptation at a global level, we have shown that over the longer run the costs and benefits of mitigation and adaptation are of an equivalent magnitude. • Important foci for further analysis include the linkages between the assessment of physical changes and monetary impact analysis, variability and changes in extreme events, the potential role of large-scale disruptions, and governance. In our and other assessments, the focus has mostly been on changes in mean values, yet there is considerable concern about extreme events (resulting in natural disasters) associated with climate variability, and also about large-scale disruptions (such as the disintegration of the West Antarctic Ice Sheet), which are not accurately described by average values. Projections of changes in climate variability remain highly uncertain, which to date often prevents analyses from robustly projecting future extreme-event risk. The role of different actors is another issue: some forms of adaptation require active governmental involvement, while other forms are likely to be implemented by private investors, such as the installation of space cooling systems. The differences between these two adaptation protagonists are relevant for future scenario development. Acknowledgements The research presented in this paper was performed as part of the EU-funded ADAM research project. An earlier version of this paper was published as part of the book "Making Climate Change Work for Us", edited by Hulme and Neufeldt and published by Cambridge University Press in 2010. Appendix A Model descriptions A.1 IMAGE 2.4 The IMAGE 2.4 integrated assessment model (Bouwman et al., 2006) consists of a set of linked and integrated models that together describe important elements of the long-term dynamics of global environmental change, such as air pollution, climate change, and land-use change. As part of IMAGE, the global energy model TIMER (van Vuuren et al., 2006) describes the long-term dynamics of the demand and production of primary and secondary energy and the related emissions of greenhouse gases and regional air pollutants. The model's behavior is mainly determined by the substitution processes of various technologies on the basis of long-term prices and fuel preferences. The agricultural model of IMAGE models the productivity of 7 crop groups and 5 animal categories (Leemans and Born, 1994). The regional production of agricultural goods is distributed spatially (at 0.5° × 0.5°) on the basis of a set of allocation rules (Alcamo et al., 1998). Both the land-use change maps and the agricultural activity data are used to model emissions from land use (change).
The emissions of GHGs are used by the MAGICC model to calculate global mean temperature change (Wigley and Raper, 2001). Patterns of temperature change are obtained by making a link to climate change patterns generated by a general circulation model (GCM). Limitations: IMAGE provides a physically oriented description of human activities (use of tons of oil, production of tons of cereals, etc.). A fuller macro-economic description only emerges from cooperation with other models. The broad coverage of IMAGE as an Integrated Assessment Model implies that many critical uncertainties influence the model outcomes. In this context, use of a single baseline (as in the ADAM project) does not do full justice to the fundamental uncertainties involved. A.2 FAIR The climate policy model FAIR (Den Elzen et al., 2008) is used in conjunction with the IMAGE model to determine the reduction rates across different emission sources. Global climate calculations make use of the simple climate model MAGICC 4.1 (Wigley, 2003; Wigley and Raper, 2001). Required global emission reductions are derived by taking the difference between the baseline and a global emission pathway. The FAIR cost model distributes these between the regions following a least-cost approach, using regional marginal abatement cost curves (MACs) for the different emission sources. Recently, the FAIR model has been extended with damage and adaptation cost curves (based on the AD-DICE model (De Bruin et al., 2009b)) and the ability to estimate macro-economic impacts on GDP growth (Hof et al., 2008). This allows the model to explore the economic impacts of combined mitigation and adaptation strategies. Limitations: In its aim to be flexible, the FAIR model does not include a sectoral macro-economic model or an energy model. The model thus works from a partial equilibrium approach – and broader consequences of climate policy can only be studied by forwarding the FAIR results to other (linked) models. A.3 DIVA DIVA (Dynamic and Interactive Vulnerability Assessment) is an integrated model of coastal systems that was developed, together with its own coastal database, within the EU-funded project DINAS-COAST (Dynamic and Interactive Assessment of National, Regional and Global Vulnerability of Coastal Zones to Sea-Level Rise; http://www.pik-potsdam.de/dinas-coast/) (DINAS-COAST Consortium, 2006; Hinkel and Klein, 2009). DIVA produces quantitative information on a range of ecological, social and economic coastal vulnerability indicators from sub-national to global scales, covering all coastal nations. The model consists of a number of modules developed by experts from various engineering, natural and social science disciplines. Based on climatic and socio-economic scenarios, the model assesses coastal erosion (both direct and indirect), coastal flooding (including rivers), wetland change and salinity intrusion into deltas and estuaries. DIVA also considers coastal adaptation in terms of raising dikes and nourishing beaches, and includes several predefined adaptation strategies such as no protection, full protection or optimal protection. Limitations: DIVA excludes the following processes that are likely to affect coastal impacts, but cannot currently be modeled with confidence: changes in storm frequency and intensity, local distribution of GDP and population growth due to rapid coastal development and urbanization, and salinity intrusion into coastal aquifers.
Further important uncertainties arise due to the coarse resolution and accuracy of elevation data. A.4 TIMER cooling/heating energy demand The TIMER cooling/heating energy demand model (Isaac and van Vuuren, 2009) describes the energy use for cooling and heating as a function of several factors, including population levels, changing income levels and climate. For both heating and cooling, empirical data are used to calibrate a set of system-dynamic demand functions. Climate (cooling and heating degree days) plays an important role. The model is able to account for the impacts of climate change. Limitations: The empirical basis on which the model is calibrated is relatively poor for developing countries. The model does not contain a description of the different ways cooling and heating demand can be supplied, nor of the costs involved in substituting one technology for the other. A.5 Water resources impact model The water resources impact model (Arnell, 2003, 2004) has two components. The first simulates river runoff across the entire global land surface (at 0.5° × 0.5°) using the macro-scale hydrological model Mac-PDM, and the second determines indicators of water resources stress at the watershed level by calculating per capita water resource availability. A watershed is assumed to be exposed to water resources stress if it has an annual average runoff equivalent to less than 1000 m³/capita/year, a semi-arbitrary threshold widely used to identify water-stressed regions. Climate change leads to an increase in exposure to water resources stress if it causes runoff in a water-stressed watershed to decrease significantly, or causes the watershed to fall below the threshold. Climate change leads to an apparent reduction in exposure for the opposite trends. These changes cannot be directly compared; whilst a reduction in runoff (and an increase in exposure) is highly likely to be adverse, an increase in runoff (and apparent decrease in exposure) may not be beneficial if the additional water cannot be stored, or if it occurs during high-flow seasons as increased flooding. The number of people living in watersheds exposed to an increase in water resources stress can be used as an indicator of exposure to climate change. The actual impacts (in terms of real water shortages) will depend on the water management structures in place. Limitations: The hydrological model does not perfectly simulate the volume of river runoff, and in particular tends to overestimate runoff in semi-arid regions. The water resources indicator is a measure of exposure to impact, not actual impact; it can be seen as a surrogate for the demand for adaptation. A.6 Malaria risks Malaria vectors, the mosquitoes spreading the infection, can only survive in suitable climates with high average temperatures, no frost and enough precipitation. The MARA/ARMA malaria suitability model (Craig et al., 1999) incorporates these climatic factors to determine climatically suitable areas. The climatic levels required for the maximum suitability of 1, and for the minimum suitability of 0, are shown in Table A.1. For indicators with levels between those required for suitability 0 and 1, a value is calculated using a simple function (Craig et al., 1999). All these factors are calculated at a 0.5° × 0.5° grid level, making use of the output from the IMAGE model (Bouwman et al., 2006). Total climatic malaria suitability for each grid cell is determined by the lowest of these three indices.
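To make the suitability rule concrete, here is a minimal Python sketch of the fuzzy-membership scheme described above: each climatic factor is mapped onto [0, 1] and the cell's suitability is the minimum of the three indices. The `lo`/`hi` thresholds are illustrative placeholders standing in for Table A.1 (not reproduced here), so this is a sketch of the scheme rather than the calibrated MARA/ARMA model.

```python
import numpy as np

def fuzzy_suitability(x, lo, hi):
    """Piecewise-linear membership: 0 at or below `lo`, 1 at or above `hi`,
    linearly interpolated in between."""
    return np.clip((x - lo) / (hi - lo), 0.0, 1.0)

def malaria_suitability(temp_c, precip_mm, frost_free_frac):
    """Climatic malaria suitability of one grid cell, following the
    min-of-three-indices rule of Craig et al. (1999).
    The lo/hi thresholds below are placeholders, not the Table A.1 values."""
    s_temp = fuzzy_suitability(temp_c, lo=18.0, hi=22.0)          # mean temperature (deg C)
    s_precip = fuzzy_suitability(precip_mm, lo=60.0, hi=80.0)     # monthly precipitation (mm)
    s_frost = fuzzy_suitability(frost_free_frac, lo=0.0, hi=1.0)  # frost-free fraction of year
    return np.minimum(np.minimum(s_temp, s_precip), s_frost)

# One hypothetical 0.5 x 0.5 degree grid cell:
print(malaria_suitability(temp_c=20.0, precip_mm=75.0, frost_free_frac=1.0))  # -> 0.5
```

Taking the minimum rather than, say, the product reflects a limiting-factor view: a cell with ideal temperature but too little rain is still unsuitable.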
Limitations : The MARA/ARMA model describes suitability for malaria vectors. It does not provide a process description of the spread of mosquitos, nor does it explicitly describe how people may react to increased risk levels. References Ackerman and Heinzerling, 2004 F. Ackerman L. Heinzerling Priceless: On Knowing the Price of Everything and the Value of Nothing 2004 The New Press New York Agrawala and Fankhauser, 2008 S. Agrawala S. Fankhauser Economic Aspects of Adaptation to Climate Change. Costs, Benefits and Policy Instruments 2008 OECD Paris Alcamo et al., 1998 J. Alcamo E. Kreileman M. Krol R. Leemans J. Bollen J.V. Minnen M. Schaeffer S. Toet B. de Vries Global modelling of environmental change: an overview of IMAGE 2.1 J. Alcamo R. Leemans E. Kreileman Global Change Scenarios of the 21st Century. Results from the IMAGE 2.1 Model 1998 Elsevier Science Ltd. Oxford 3 94 Arnell, 2003 N. Arnell Effects of IPCC SRES emissions scenarios on river runoff: a global perspective Hydrology and Earth System Sciences 7 5 2003 619 641 Arnell, 2004 N. Arnell Climate change and global water resources: SRES emissions and socio-economic scenarios Global Environmental Change 14 1 2004 31 52 Arnell et al., 2002 N.W. Arnell M.G.R. Cannell M. Hulme R.S. Kovats J.F.B. Mitchell R.J. Nicholls M.L. Parry M.J.L. Livermore A. White The consequences of CO 2 stabilisation for the impacts of climate change Climatic Change 53 4 2002 413 446 Bakkenes et al., 2006 M. Bakkenes B. Eickhout R. Alkemade Impacts of different climate stabilisation scenarios on plant species in Europe Global Environmental Change 16 1 2006 19 28 Barker, 2003 T. Barker Representing global climate change, adaptation and mitigation Global Environmental Change 13 2003 1 6 Barker et al., 2009 Barker, T., Kenber, M., Scrieciu, S., Ryan, D., 2009. Breaking the Climate Deadlock. Cutting the Cost: The Economic Benefits of Collaborative Climate Action. The Climate Group, The Office of Tony Blair, 4CMR – University of Cambridge and Cambridge Econometrics. Barker and Scrieciu, 2010 T. Barker S.S. Scrieciu Modelling low stabilisation with E3MG: towards a ‘New Economics’ approach to simulating energy-environment-economy system dynamics The Energy Journal 31 Special issue 1 2010 137 164 Barker et al., 2008 T. Barker S.S. Scrieciu T. Foxon Achieving the G8 50% target: modelling induced and accelerated technological change using the macro-econometric model E3MG Climate Policy 8 2008 S30 S45 Berkhout et al., 2002 F. Berkhout J. Hertin A. Jordan Socio-economic futures in climate change impact assessment: using scenarios as ‘learning machines’ Global Environmental Change 12 2 2002 83 95 Bouwer, 2010 L.M. Bouwer Have disaster losses increased due to anthropogenic climate change? Bulletin of the American Meteorological Society 2010 10.1175/2010BAMS3092.1 Bouwman et al., 2006 A.F. Bouwman T. Kram K. Klein Goldewijk Integrated Modelling of Global Environmental Change. An Overview of IMAGE 2.4 2006 Netherlands Environmental Assessment Agency Bilthoven 228 pp. (Publication 500110002/2006) Burke et al., 2006 E.J. Burke S.J. Brown N. Christidis Modelling the recent evolution of global drought and projections for the 21st century with the Hadley Centre climate model Journal of Hydrometeorology 7 2006 1113 1125 Clarke et al., 2010 L. Clarke J. Edmonds V. Krey R. Richels S. Rose M. Tavoni International climate policy architectures: overview of the EMF 22 international scenarios Energy Economics 31 Suppl. 2 2010 S64 S81 Copenhagen Accord, 2009 Copenhagen Accord, 2009. 
(Copenhagen Accord of 18 December 2009). United Nations Climate Change Conference 2009, Copenhagen. Craig et al., 1999 M.H. Craig R.W. Snow D. le Sueur A climate-based distribution model of malaria transmission in Africa Parasitology Today 15 3 1999 105 111 Cummins and Mahul, 2009 J.D. Cummins O. Mahul Catastrophe Risk Financing in Developing Countries: Principles for Public Intervention 2009 The World Bank Washington, DC De Bruin et al., 2009a K.C. De Bruin R.B. Dellink S. Agrawala Economic Aspects of Adaptation to Climate Change: Integrated Assessment Modelling of Adaptation Costs and Benefits 2009 OECD Paris De Bruin et al., 2009b K.C. De Bruin R.B. Dellink R.S.J. Tol AD-DICE: an implementation of adaptation in the DICE model Climatic Change 95 1–2 2009 63 81 Den Elzen et al., 2008 M.G.J. Den Elzen P.L. Lucas D.P. van Vuuren Regional abatement action and costs under allocation schemes for emission allowances for achieving low CO 2 -equivalent concentrations Climatic Change 90 3 2008 243 268 Den Elzen and van Vuuren, 2007 M.G.J. Den Elzen D.P. van Vuuren Peaking profiles for achieving long-term temperature targets with more likelihood at lower costs Proceedings of the National Academy of Sciences of the United States of America 104 46 2007 17931 17936 DINAS-COAST Consortium, 2006 DINAS-COAST Consortium, 2006. DIVA 1.5.5 CD-ROM, Potsdam Institute for Climate Impact Research, Potsdam, Germany. Easterling et al., 2007 W. Easterling P. Aggarwal P. Batima K. Brander L. Erda M. Howden A. Kirilenko J. Morton J.-F. Soussana J. Schmidhuber F.N. Tubiello Food, fibre and forest products M.L. Parry O.F. Canziani J.P. Palutikof P.J. van der Linden C.E. Hanson Climate Change 2007: Impacts, Adaptation and Vulnerability. Contribution of Working Group II to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change 2007 Cambridge University Press Cambridge, UK EC, 2006 EC World Energy Technology Outlook 2050 (WETO H2) 2006 European Commission Brussels Edenhofer et al., 2010 O. Edenhofer B. Knopf T. Barker L. Baumstark E. Bellevrat B. Chateau P. Criqui M. Isaac A. Kitous S. Kypreos M. Leimbach K. Lessmann B. Magné S. Scrieciu H. Turton D.P. van Vuuren The economics of low stabilization: model comparison of mitigation strategies and costs The Energy Journal 31 SI-1 2010 11 48 Environmental Change Institute, 2009 Environmental Change Institute, 2009. International Climate Conference – 4 Degrees and Beyond. Environmental Change Institute, Oxford University, 28–30 September, Oxford, UK. Feyen et al., 2009 L. Feyen J.I. Barredo R. Dankers Implications of global warming and urban land use change on flooding in Europe J. Feyen K. Shannon M. Neville Water and Urban Development Paradigms. Towards an Integration of Engineering, Design and Management Approaches 2009 Taylor and Francis Group London Fischer et al., 2007 G. Fischer F.N. Tubiello H. van Velthuizen D.A. Wiberg Climate change impacts on irrigation water requirements: effects of mitigation, 1990–2080 Technological Forecasting and Social Change 74 7 2007 1083 1107 Fisher et al., 2007 B. Fisher N. Nakicenovic K. Alfsen J. Corfee Morlot F. de la Chesnaye J.-C. Hourcade K. Jiang M. Kainuma E. La Rovere A. Matysek A. Rana K. Riahi R. Richels S. Rose D. van Vuuren R. Warren P. Ambrosi F. Birol D. Bouille C. Clapp B. Eickhout T. Hanaoka M.D. Mastrandrea Y. Matsuoko B. O’Neill H. Pitcher S. Rao F. Toth Issues related to mitigation in the long-term context B. Metz O. Davidson P. Bosch R. Dave L. Meyer Climate Change 2007. 
Mitigation of Climate Change. Contribution of Working Group III to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change 2007 Cambridge University Press New York 169 250 Hallegatte, 2009 S. Hallegatte Strategies to adapt to an uncertain climate change Global Environmental Change 19 2009 240 247 Hayashi et al., 2010 A. Hayashi K. Akimoto F. Sano S. Mori T. Tomoda Evaluation of global warming impacts for different levels of stabilization as a step toward determination of the long-term stabilization target Climatic Change 98 2010 87 112 Hilderink et al., 2008 H. Hilderink P.L. Lucas A. ten Hove M. Kok M. de Vos P. Janssen J. Meijer A. Faber A. Ignaciuk A. Petersen H.J.M. de Vries Towards a Global Integrated Sustainability Model 2008 Netherlands Environmental Assessment Agency Bilthoven Hinkel and Klein, 2009 J. Hinkel R.J.T. Klein Integrating knowledge to assess coastal vulnerability to sea-level rise: the development of the DIVA tool Global Environmental Change 19 3 2009 384 395 Hirabayashi and Kanae, 2009 Y. Hirabayashi S. Kanae First estimate of the future global population at risk of flooding Hydrological Research Letters 3 2009 6 9 Hof et al., 2009a A.F. Hof K.C. de Bruin R.B. Dellink M.G.J. den Elzen D.P. van Vuuren The effect of different mitigation strategies on international financing of adaptation Environmental Science and Policy 12 7 2009 832 843 Hof et al., 2009b A.F. Hof K. de Bruin R. Dellink M.G.J. den Elzen D.P. van Vuuren Costs, benefits and inter-linkages between adaptation and mitigation F. Biermann P. Pattberg F. Zelli Global Climate Governance After 2012: Architecture, Agency and Adaptation 2009 Cambridge University Press Cambridge Hof et al., 2008 A.F. Hof M.G.J. den Elzen D.P. van Vuuren Analysing the costs and benefits of climate policy: value judgements and scientific uncertainties Global Environmental Change 18 3 2008 412 424 IMAGE-team, 2001 IMAGE-team, 2001. The IMAGE 2.2 implementation of the IPCC SRES scenarios. A comprehensive analysis of emissions, climate change and impacts in the 21st century. RIVM CD-ROM publication 481508018, National Institute for Public Health and the Environment, Bilthoven, the Netherlands. IPCC, 2007 IPCC (Ed.), 2007. Climate Change 2007: Synthesis Report. Contribution of Working Groups I, II and III to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change. IPCC, Geneva, 104 pp. Isaac and van Vuuren, 2009 M. Isaac D.P. van Vuuren Modeling global residential sector energy demand for heating and air conditioning in the context of climate change Energy Policy 37 2 2009 507 521 Jenkins and Lowe, 2003 Jenkins, G., Lowe, J., 2003. Handling uncertainties in the UKCIP02 scenarios of climate change. Hadley Centre Technical Note 44, Met Office, Exeter. Kinney et al., 2008 P.L. Kinney M.S. O’Neill M.L. Bell J. Schwartz Approaches for estimating effects of climate change on heat-related deaths: challenges and opportunities Environmental Science and Policy 11 87 2008 Klein et al., 2007 R.J.T. Klein S. Huq F. Denton T.E. Downing R.G. Richels J.B. Robinson F.L. Toth Inter-relationships between Adaptation and Mitigation. Climate Change 2007. Impacts, Adaptation and Vulnerability. Contribution of Working Group II. Report of the Intergovernmental Panel on Climate Change 2007 Cambridge University Press Cambridge 745 777 Krol et al., 1997 M. Krol J. Alcamo R. 
Leemans Global and regional impacts of stabilizing atmospheric CO 2 Mitigation and Adaptation Strategies for Global Change 1 1997 341 361 Kundzewicz et al., 2010 Z.W. Kundzewicz Y. Hirabayashi S. Kanae River floods in the changing climate – observations and projections Water Resources Management 2010 10.1007/s11269-009-9571-6 Kundzewicz et al., 2007 Z.W. Kundzewicz L.J. Mata N. Arnell P. Döll P. Kabat B. Jiménez K. Miller T. Oki Z. Şen I. Shiklomanov Freshwater resources and their management M.L. Parry O.F. Canziani J.P. Palutikof C.E. Hanson P.J. van der Linden Climate Change 2007: Impacts, Adaptation and Vulnerability. Contribution of Working Group II to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change 2007 Cambridge University Press Cambridge, UK Leemans and Born, 1994 R. Leemans G.J.v.d. Born Determining the potential global distribution of natural vegetation, crops and agricultural productivity Water, Air and Soil Pollution 76 1994 133 161 Lenton et al., 2008 T.M. Lenton H. Held E. Kriegler J.W. Hall W. Lucht S. Rahmstorf H.J. Schellnhuber Tipping elements in the Earth's climate system Proceedings of the National Academy of Sciences of the United States of America 105 6 2008 1786 1793 Manne and Richels, 2005 A.S. Manne R.G. Richels Merge: an integrated assessment model for global climate change R. Loulou J.-P. Waaub G. Zaccour Energy and Environment 2005 Springer USA McMichael et al., 1996 A. McMichael A. Haines R. Slooff S. Kovats Climate Change and Human Health 1996 World Health Organization Geneva Mechler et al., 2010 R. Mechler S. Hochrainer A. Aaheim Z. Kundzewicz N. Lugeri M. Moriondo H. Salen M. Bindi I. Banaszak A. Chorynski E. Genovese H. Kalirai J. Linnerooth-Bayer C. Lavalle D. McEvoy P. Matczak M. Radziejewski D. Rübbelke M.-J. Schelhaas M. Szwed A. Wreford Risk management approach for assessing adaptation to changing flood and drought risks in Europe M. Hulme H. Neufeldt Making Climate Change Work for Us: European Perspectives on Adaptation and Mitigation Strategies 2010 Cambridge University Cambridge, UK Meehl et al., 2007 G.A. Meehl T.F. Stocker W.D. Collins P. Friedlingstein A.T. Gaye J.M. Gregory A. Kitoh R. Knutti J.M. Murphy A. Noda S.C.B. Raper I.G. Watterson A.J. Weaver Z.-C. Zhao Global climate projections S. Solomon Climate Change 2007: The Physical Science Basis. Contribution of Working Group I to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change 2007 Cambridge University Press Cambridge Metz et al., 2007 B. Metz O.R. Davidson P.R. Bosch R. Dave L.A. Meyer Climate Change. Mitigation of Climate Change. Contribution of Working Group III to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change 2007 Cambridge University Press Cambridge, United Kingdom Mirza et al., 2003 M.M.Q. Mirza R.A. Warrick N.J. Ericksen The implications of climate change on floods of the Ganges, Brahmaputra and Meghna Rrivers in Bangladesh Climatic Change 57 2003 287 318 Moriondo et al., 2010 M. Moriondo M. Bindi Z.W. Kundzewicz M. Szwed A. Chorynski P. Matczak M. Radziejewski D. McEvoy A. Wreford Impact and adaptation opportunities for European agriculture in response to climatic change and variability Mitigation and Adaptation Strategies for Global Change 15 7 2010 657 679 Moss et al., 2010 R.H. Moss J.A. Edmonds K.A. Hibbard M.R. Manning S.K. Rose D.P. van Vuuren T.R. Carter S. Emori M. Kainuma T. Kram G.A. Meehl J.F.B. Mitchell N. Nakicenovic K. Riahi S.J. Smith R.J. Stouffer A.M. Thomson J.P. Weyant T.J. 
Wilbanks The next generation of scenarios for climate change research and assessment Nature 2010 10.1038/nature08823 Nakicenovic et al., 2000 N. Nakicenovic Special Report on Emissions Scenarios (SRES) 2000 Cambridge University Press Cambridge, UK Nakicenovic et al., 2006 N. Nakicenovic P. Kolp K. Riahi M. Kainuma T. Hanaoka Assessment of Emissions Scenarios Revisited Environmental Economics and Policy Studies 7 3 2006 137 173 Nicholls and Lowe, 2004 R.J. Nicholls J.A. Lowe Benefits of mitigation of climate change for coastal areas Global Environmental Change 14 3 2004 229 244 Nicholls et al., 2010 R.J. Nicholls N. Marinova J.A. Lowe S. Brown P. Vellinga D. de Gusmao J. Hinkel R.S.J. Tol Sea-level rise and its possible impacts given a “beyond 4 degree world” in the 21st century Philosophical Transactions of the Royal Society 2010 10.1098/rsta.2010.029 Nordhaus and Boyer, 2000 W.D. Nordhaus J. Boyer Warming the World: Economic Models for Global Warming 2000 MIT Press Cambridge, MA pp. 315–328 Nordhaus, 2008 W.D. Nordhaus A Question of Balance Weighing the Options on Global Warming Policies 2008 Yale University Press New Haven and London Parry et al., 2007 M.L. Parry O.F. Canziani J.P. Palutikof P.J. van der Linden C.E. Hanson Climate Change 2007: Impacts, Adaptation and Vulnerability. Contribution of Working Group II to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change 2007 Cambridge University Press Cambridge Patt et al., 2010 A.G. Patt D.P. van Vuuren F. Berkhout A. Aaheim A.F. Hof M. Isaac R. Mechler Adaptation in integrated assessment modeling: Where do we stand? Climatic Change 99 3 2010 383 402 Piani et al., 2005 C. Piani D.J. Frame D.A. Stainforth M.R. Allen Constraints on climate change from a multi-thousand member ensemble of simulations Geophysical Research Letters 32 2005 L23825 Price, 2005 C. Price An intergenerational perspective on effects of environmental changes: discounting the future's viewpoint J.L. Innes G.M. Hickey H.F. Hoen Forestry and Environmental Change: Socioeconomic and Political Dimensions 2005 International Union on Forestry Research Organisations (IUFRO) Vienna Rose et al., 2007 Rose, S., Ahammad, H., Eickhout, B., Fisher, B., Kurosawa, A., Rao, S., Riahi, K., van Vuuren, D. 2007. Land in climate stabilization modeling: initial observations Energy Modeling Forum Report. Stanford University. Schneider and Kuntz-Duriseti, 2002 S.H. Schneider K. Kuntz-Duriseti Uncertainty and climate change policy S.H. Schneider A. Rosencranz O. Niles Climate Change Policy: A Survey 2002 Island Press Washington, DC Stern, 2006 N. Stern Stern Review on the Economics of Climate Change 2006 Cambridge University Press Cambridge Swart et al., 2009 R. Swart L. Bernstein M. Ha-Duong A. Petersen Agreeing to disagree: uncertainty management in assessing climate change, impacts and responses by the IPCC Climatic Change 92 2009 1 29 Swart and Raes, 2007 R. Swart F. Raes Making integration of adaptation and mitigation work: mainstreaming into sustainable development policies? Climate Policy 7 4 2007 288 303 Tol, 2002a R. Tol Estimates of the damage costs of climate change. Part II. Dynamic estimates Environmental and Resource Economics 21 2 2002 135 160 Tol, 2002b R.S.J. Tol Estimates of the damage costs of climate change. Part 1. Benchmark estimates Environmental and Resource Economics 21 1 2002 47 73 Tol, 2002c R.S.J. 
Tol Welfare specifications and optimal control of climate change: an application of fund Energy Economics 24 4 2002 367 376 Tubiello and Fischer, 2007 F.N. Tubiello G. Fischer Reducing climate change impacts on agriculture: global and regional effects of mitigation, 2000–2080 Technological Forecasting and Social Change 74 7 2007 1030 1056 UN, 2005 UN, 2005. World Population Prospects: The 2004 Revision. CD-ROM Edition – Extended Dataset. United Nations publications, Sales No. E.05.XIII.12, United Nations, Department of Economic and Social Affairs, Population Division. van Vliet et al., 2009 J. van Vliet M.G.J. den Elzen D.P. van Vuuren Meeting radiative forcing targets under delayed participation Energy Economics 31 Suppl. 2 2009 S152 S162 van Vuuren et al., 2006 D.P. van Vuuren B. van Ruijven M. Hoogwijk M. Isaac B. De Vries TIMER 2: model description and application L. Bouwman T. Kram K. Klein-Goldewijk Integrated Modelling of Global Environmental Change. An overview of IMAGE 2.4 2006 MNP - Netherlands Environmental Assessment Agency Bilthoven van Vuuren et al., 2007 D.P. van Vuuren M.G.J. Den Elzen P.L. Lucas B. Eickhout B.J. Strengers B. Van Ruijven S. Wonink R. Van Houdt Stabilizing greenhouse gas concentrations at low levels: an assessment of reduction strategies and costs Climatic Change 81 2 2007 119 159 van Vuuren et al., 2008a D.P. van Vuuren B. De Vries A. Beusen P.S.C. Heuberger Conditional probabilistic estimates of 21st century greenhouse gas emissions based on the storylines of the IPCC-SRES scenarios Global Environmental Change 18 4 2008 635 654 van Vuuren et al., 2008b D.P. van Vuuren M. Meinshausen G.K. Plattner F. Joos K.M. Strassmann S.J. Smith T.M.L. Wigley S.C.B. Raper K. Riahi F. De La Chesnaye M.G.J. Den Elzen J. Fujino K. Jiang N. Nakicenovic S. Paltsev J.M. Reilly Temperature increase of 21st century mitigation scenarios Proceedings of the National Academy of Sciences of the United States of America 105 40 2008 15258 15262 van Vuuren et al., 2009 D.P. van Vuuren M.G.J. den Elzen J. van Vliet T. Kram P. Lucas M. Isaac Comparison of different climate regimes: the impact of broadening participation Energy Policy 37 12 2009 5351 5362 van Vuuren et al., 2010 D.P. van Vuuren E. Stehfest M.G.J. den Elzen J. Van Vliet M. Isaac Exploring scenarios that keep greenhouse gas radiative forcing below 3 W/m 2 in 2100 Energy Economics 31 Special Issue 1 2010 165 192 Vermeer and Rahmstorf, 2009 M. Vermeer S. Rahmstorf Global sea level linked to global temperature Proceedings of the National Academy of Sciences of the United States of America 106 2009 21527 21532 Weitzman, 2008 Weitzman, M.L., 2008. On modeling and interpreting the economics of catastrophic climate change. Wigley, 2003 T.M.L. Wigley MAGICC/SCENGEN 4.1: Technical Manual 2003 UCAR - Climate and Global Dynamics Division Boulder, CO Wigley and Raper, 2001 T.M.L. Wigley S.C.B. 
Raper Interpretation of high projections for global-mean warming Science 293 2001 451 454|设想情景用于探讨不确定情况下不同适应和缓解战略的后果。在本文中,我们使用了两种情景来探讨发展: (1)没有缓解措施导致全球平均气温到2100年上升4摄氏度; (2)一个雄心勃勃的缓解策略导致到2100年上升2摄氏度。就第二种情况而言,气候系统的不确定性意味着不能排除全球平均气温上升3摄氏度或更多的可能性。我们的分析表明,在许多情况下,适应和减缓不是权衡,而是补充。例如,在缓解设想方案中,因气候变化而面临更大水资源压力的人数可以大幅度减少,但仍然需要对面临更大压力的其余大量人口进行适应。另一个例子是海平面上升,从全球和纯货币的角度来看,适应(直到2100年)似乎比缓解更有效。然而,从较贫穷和小岛屿国家的角度来看,严格的缓解措施对于将风险保持在可控水平是必要的。就农业而言,只有基于适应和缓解相结合的设想方案才能避免严重的气候变化影响。关键词情景综合评估气候变化缓解适应气候影响1引言情景分析是评估气候变化和气候变化政策的一个非常重要的工具,它使分析人员能够探索经济发展、温室气体排放、气候和生态系统等因素之间复杂而不确定的未来相互作用。这些因素共同决定了缓解和适应政策的必要性和可能性。设想情景还可以作为一种手段,协调参与气候研究领域的各种不同研究群体的假设,从而更好地比较其结果。因此,情景在缓解和适应研究中得到了广泛的应用(参见 Metz 等,2007; Parry 等,2007)(特别是来自排放情景特别报告(SRES)的情景(Nakicenovic 等,2000))。Moss 等人(2010)指出,由于 SRES 对场景分析的信息需求正在发生变化。首先,人们对探索适应与缓解之间的关系越来越感兴趣。正如 Moss 等人(2010)所指出的,这将需要进一步整合气候研究中涉及的不同分析传统的信息。第二,除了迄今为止探讨的无气候政策情景之外,人们对明确探讨气候政策影响的情景也越来越感兴趣。具体而言,在没有气候政策的情况下,能够评估长期气候目标的“成本”和“收益”是非常有意义的。在本文中,我们遵循这一思路,探讨情景分析如何能够促进对未来适应和缓解战略的联合评估。这样的联合评估有以下几个原因: (1)首选的缓解策略取决于预期的气候影响和适应成本; (2)考虑到适应气候变化的局限性; (3)一些适应和缓解策略可能相互作用; (4)最后,气候变化的影响可能有需要考虑的重要反馈。这种分析在战略层面上是最有用的,而不是针对个人的适应(或缓解)决策。鉴于这一目的,我们在本文中讨论了两个主要的情景,其中包括适应和减缓战略的要素(见本文的进一步内容) ,导致本世纪末全球平均气温上升4摄氏度和2摄氏度。这两个温度水平已经开始成为标志性的数字,代表着在没有减缓政策(4摄氏度)和国际气候谈判的温度目标(2摄氏度)(2009年哥本哈根协议)的情况下的潜在结果。可以说,如果政治领导人要在减缓、适应和气候影响之间做出明智的选择,了解这两个温度水平的影响是至关重要的(环境变化研究所,2009)。缓解和适应战略的综合评估由于方法上的差异而受到阻碍。考虑到当地环境的重要性,综合评估模型很难描述适应过程(Patt et al。 ,2010)。一个实际问题是,迄今为止,影响文献的相当一部分集中在非政策情景下的影响(例外包括 Arnell 等,2002; Bakkenes 等,2006; Hayashi 等,2010; Krol 等,1997; Nicholls 和 Lowe,2004)。因此,本文提出了一个基于耦合信息的广义情景评估——但没有假装是完整的或完全集成的。作为一项边做边学的活动,本文件打算说明4摄氏度和2摄氏度世界之间的重要区别,但也要确定进行综合情景分析所涉及的一些实际问题。这意味着,与现有文献相比,最重要的进步是我们提出了一个基于一致情景的多部门分析。鉴于目前综合评估模型的先进水平,已经使用几个松散耦合模型进行了试验。因此,一些重要的联系无法得到解决,如农业的适应性反应,这可能涉及灌溉(见第5.3节)和水需求(第5.4节)。事实上,本文提出的一个重要问题是,是否需要进行全面综合分析,或者部分综合是否足够。本文的内容安排如下: 我们首先讨论在开发能够为适应和缓解政策决策提供信息的设想方案时所遇到的一些方法上的复杂问题。接下来,我们讨论两种主要情景在社会经济驱动因素方面的差异(第3和第4部分)。在第5节中,我们探讨了适应和减缓战略对气候变化各种影响的潜在后果。2评估气候战略和情景发展(理论和方法)2.1应对气候变化的不同战略气候变化及其响应可能导致三种形式的成本(不一定是货币) : (1)气候影响的(剩余)成本,(2)适应的成本和(3)缓解的成本。至少在理论上,这对应于三种不同的策略: (1)“自由放任”(接受气候变化) ,(2)关注适应,(3)关注缓解,如图1所示(另见 Klein 等,2007)。虽然图1表明,缓解、适应和剩余损害的成本和收益可以相互交换,但存在一些概念和分析问题,使这种办法复杂化。这些与空间和时间尺度、风险和不确定性有关(SwartandRaes,2007)。缓解和适应是在不同空间尺度上发生的过程。虽然缓解行动通常是在国家或地方范围内采取的,但好处是全球共享的。因此,气候政策成功和成本的关键因素是国际合作的程度(Barker 等,2009; Clarke 等,2010; van Vliet 等,2009; van Vuuren 等,2009)。相比之下,对于适应而言,成本和收益在从地方到国家乃至国际的多个尺度上都存在。较大规模的扶持性环境仍然可以在较小规模上加强适应(例如,由国际融资机制资助的地方能力建设)。由于这些原因,缓解评估往往集中在全球一级,而相比之下,适应研究大多集中在地方一级。随着时间的推移,缓解和适应的动态也是一个重要因素。严格的缓解方案通常需要强有力的早期减排。然而,由于气候系统内部的巨大惯性,这些假设情景的气候变化影响在短期(前几十年)与没有气候变化政策的假设情景几乎没有差别。相比之下,一些相关的影响(例如减少当地空气污染的共同利益)可以以更快的速度实现。适应措施可能在短期内产生私人和社会效益。例如,空气调节等简单的适应措施可以带来明显的短期效益。一些重要的例外存在,可能需要几十年的实施,如空间规划的变化或大规模的工程工程防洪(见哈勒盖特,2009年)。其他重要因素是风险和不确定性。我们对气候变化的理解面临许多不确定性。要确定的关键不确定性包括认知、数据、模型和实体不确定性(施奈德和 Kuntz-Duriseti,2002; van Vuuren 等,2008a)。涉及不确定因素的例子有: (i)未来的排放量,(ii)气候系统,(iii)未来的脆弱性和对气候风险的暴露,以及(iv)缓解成本。采取缓解行动减少了一些不确定性,因为它减少了气候变化的源头,并揭示了实际的缓解成本(Barker,2003; Piani 等,2005)。然而,缓解措施也可能增加风险。例如,如果以不可持续的方式实施生物能源,可能会抵消一组风险(气候变化) ,同时产生另一组不同的风险(生物多样性丧失和粮食安全下降)。处理风险的一种方法是包括概率评估。这通常是使用过去的证据,推断以涵盖特定的未来情况。其他不确定性(例如不可知的冲击和意外)在量化意义上更难处理,但它们证明了承认无知的合理性。情景可以用来探索极端事件的可能性和各种政策组合的稳健性,但这并不常见(Berkhout et al。 ,2002)。传统上,涉及缓解研究和适应研究的学科对不确定性有不同的描述方式。虽然缓解研究往往使用定量方法并侧重于平均估计,但适应研究往往更侧重于对不确定性的定性描述,并侧重于危险事件的风险,即使这些事件发生的概率很低。这些不同的不确定性感知可能会使不同策略的综合评估复杂化(Swartet al。 ,2009)。2.2场景的类型我们可以根据缓解和适应的考虑将场景分为不同的类别。首先,我们将基线情景定义为一个事件的轨迹,假设没有来自气候变化的重大反馈,也没有关于缓解或适应的具体政策努力(这种情景可能仍然包括许多间接影响缓解或适应气候变化能力的行动; 
例如,可以预期收入水平的增加与对减少疟疾等气候相关疾病风险的卫生服务的更大投资相一致)。这种类型的场景的主要目的是进行分析,作为其他场景的参考点。其次,适应情景描述了一个社会正在应对气候变化影响的世界。其目的是探讨适应气候变化所需的技术和政策类型、避免的损害和相关费用。适应包括所谓的自主适应(即在没有特定政府行动的情况下发生的行动)和有计划的适应。第三,缓解方案描述了一个包括旨在限制气候变化的政策的世界。其目的是探讨最大限度地减少气候变化及相关成本所需的技术和政策类型。由于总是存在剩余的影响,第四组,适应和缓解情景综合了两种类型的气候变化应对措施。可能的话,这第四类情景可以根据适应和缓解备选办法之间可能存在的协同作用,例如对于一些重新造林备选办法,重新排列政策备选办法。每一种情况都与更广泛的社会、政治和文化背景有关,在这种背景下,它们被认为会出现。在探索缓解、适应和残余损害的优选组合时,存在两种主要方法: (i)将潜在影响描述为全球平均气温上升(从而缓解)的功能的影响和基于风险的方法,以及(ii)成本效益分析,其中确定货币成本和收益,以最大限度地提高福利(例如,参见 Nordhaus,2008; Tol,2002c)。在这两种情况下,我们认为,描述不同应对战略之间的关系比寻求确定最佳办法更有用,也更能反映问题。鉴于第2.1节所列出的复杂性和不确定性,我们认为在现实中不可能采取任何最佳的缓解、适应或联合策略。2.3综合分析缓解和适应的综合分析可以通过不同方式实现: 例如,使用单一的所谓综合评估模型,或在不同模型和学科之间交流信息,评估现有文献并使结果具有可比性。这两种方法都是围绕气候变化的因果链进行组织的,即描述经济活动(收入、能源使用、农业等)、排放、气候变化和影响之间的关系——以及相关的反馈(图2)。实际上,该方案也构成了 IPCC 报告情景信息流的主干(Moss et al。 ,2010)。情景首先由综合评估和排放模型制定(侧重于经济驱动力、能源和土地使用以及温室气体排放(IPCC“工作组 III”))。随后,排放轨迹在气候模型中被用来评估气候变化的影响(IPCC“工作组 I”)。最后,这些情景被用于影响、适应和脆弱性分析(IPCC“工作组 II”)。不同研究学科和工作组的参与意味着很难说明不同领域之间的反馈意见。综合评估模型只能获取有限数量的可能反馈(经常被忽略的反馈包括粮食和水安全对人口和经济驱动因素的影响; 水资源短缺与粮食生产之间的关系; 气候变化对能源使用的影响等)。如果这些反馈不足以对系统产生重大影响,忽略(其中一些)可能是合理的。出于分析原因,在学科领域内组织场景开发并考虑有限数量的反馈有很大的优势。它使研究人员能够专注于他们很好地理解的链条要素,并增加所需的细节数量,而不必面对相互联系的复杂性。然而,在更加注重对缓解和适应战略进行综合分析的情况下,这种情况可能会改变。关于为什么需要采取综合办法的一些例子是: 一、气候影响,例如极端事件引发的影响,可能非常严重,破坏了原先设想的经济假设; 二。气候影响可能对农业产生重大影响,因此,对未考虑影响的土地使用相关排放量的估计可能是错误的,生物能源的缓解潜力可能受到影响;。对于对缓解和适应都有吸引力的土地面积,可能存在相互竞争的权利主张。因此,一个有趣的问题是,是否需要更加集成的分析是如此迫切,以至于需要更加复杂的集成模式(模型的交互耦合; 一个复杂的模型) ,或者是否可以单独处理影响,简化分析框架。时间范围和决策焦点在这里可能也很重要,例如是否考虑到潜在的临界点(Lenton et al。 ,2008)。研究这个问题的少数现有研究似乎表明,在大多数部门,任何缓解项目的适应影响都很小,大多数适应活动产生的排放量也很小(Klein et al。 ,2007)。迄今为止,最完整的分析来自以成本效益为导向的综合评估模型,如基金、 DICE 和 MERGE (Manne and Richels,2005; Nordhaus,2008; Tol,2002c) ,但这些模型通常将气候影响聚合为有限数量的相当抽象的损害函数。我们认为,随着时间的推移,随着许多部门减缓和适应措施的力度不断加大,将更加需要进行足够详细的联合评估。这里提供的场景基于建模和场景开发的当前技术状态,迈出了第一步。缓解和影响评估的一项评估使用了相同的设想方案,我们明确提出了缓解和适应战略(作为设想方案的一部分或在用于不同影响的模型中)。然而,许多反馈没有被考虑进去。在本文的最后,我们回到了更加集成(但也更加复杂)的场景的作用。2.4本文件使用的方法如上所述,可以确定几种情景: 基线情景、缓解情景、适应情景和适应-缓解情景。本文还介绍了这些场景类型。对于基准/适应情景,我们假设大多数社会经济驱动因素的中间假设。情景假设在第3和第4节中描述。这些设想方案不包括缓解措施,导致到2100年全球平均气温比工业化前水平上升4摄氏度。虽然我们描述了在这些情况下可能产生的影响和适应,但我们没有包括对原始驱动因素的反馈。在缓解设想方案中,包括严格的缓解努力,导致全球平均气温上升2摄氏度。使用 IPCC 给出的3 ° C 的气候敏感性中值(Meehl 等,2007) ,这意味着稳定水平约为450 ppm CO2当量(CO2当量).气候政策对经济驱动因素的影响没有被考虑在内——但是其他几个关系是耦合的(例如土地使用)。因此,在大多数文章中,我们忽略了气候变化和气候政策对经济假设的潜在影响。然而,在第5.8节中,我们用一个简单的经济模型(FAIR)来讨论它们的影响,以提供对全球范围内经济后果的可能规模的一些见解。使用了几个模型工具。这些情景主要是使用 IMAGE 综合评估模型开发的(Bouwman 等,2006)。IMAGE 模型根据对人口和世界经济的假设,结合对技术发展和消费模式的假设,描述了21世纪能源和土地使用的发展情况。该模型预测了全球范围内的气候变化(以全球平均温度变化和海平面上升为指标) ,并通过模式缩放缩小的气候模式模式构建了0.5 ° × 0.5 ° 网格上月气温和降雨量变化的空间场景。IMAGE 的产出用于描述海平面上升的 DIVA 模型; 用于估计水资源紧张后果的 Mac-PDM 全球水文模型; 用于估计加热和降温需求影响的 TIMER 能源模型; 用于疟疾对疟疾影响的 MARA/ARMA 适宜性模型和用于货币成本效益分析的 FAIR 模型。此外,我们更一般地讨论对农业的影响(基于 IPCC AR4)和极端事件。附录 A 提供了所有使用模型的简要描述。在我们的描述中,我们关注于全局级别(考虑到空间有限)。显然,这导致了我们讨论适应的局限性。实验取决于每个模型的设计,因此,不同影响之间可以提出的假设情景的数量是不同的。这意味着研究报告应被解释为综合评估的第一个例证,而不是关于适应及其限制的全面研究。3结果: 基线情景中的社会经济趋势3.1人口发展和经济增长我们假设人口遵循2004年世界人口预测修订版(UN,2005)到2050年的中等生育率变量,以及联合国到2100年的长期中等预测(图3)。这意味着,到2050年,全球人口将稳步增至近91亿,并在随后的50年中稳定在约92亿人,直至2100年。这种情况在人口预测范围内采取中间立场(见图3)。对于直到2050年的经济增长,这种情况遵循与剑桥模型 E3MG 相关的预测(巴克和 Scrieciu,2010; 巴克等人,2008)。利用基于 SRES 的 B2情景(IMAGE-team,2001)的经济增长预测,该情景延伸到2050年以后。从数量上看,这是一个中高速经济增长的情景,主要是对中国和印度经济增长的乐观假设的结果。按人均计算,经合组织经济体预计仍将是世界上最富有的经济体,但就经济活动总量而言,发展中区域的重要性迅速增加。在非洲、中东和拉丁美洲,人均国内生产总值的年增长率在0% 到2% 之间。在亚洲,到2050年,这一比例将从目前的高水平降至每年3% 。3.2基线情景下的能源使用和温室气体排放基线情景下的能源使用与欧盟委员会(EC,2006)公布的基线保持一致。尽管能源强度进一步降低,世界能源消费在2000-2050年期间增加了一倍以上,在2050-2100年期间又增加了25% (图4)。整个世纪以来,能源供应仍然以化石燃料为主。当石油和天然气的生产在本世纪达到高峰和下降时,煤炭的使用在整个情景期间增加。此外,非化石能源的产量也在迅速增长。在截至2100年的这段时间里,核能使用量增加了2到3倍,达到76 
EJ,生物质使用量强劲增加,而水力发电产量增加了大约60% 到80% 。相对增长最大的是风能和太阳能; 到2050年,风能和太阳能在所有非化石能源中的比例将从不到1% 上升到10% 至14% 。2050年可再生能源总使用量为120-140 EJ,2100年为190 EJ。上述趋势表明,能源活动的二氧化碳排放量在2050年之前增加了一倍以上,在2050年至2100年之间又增加了三分之一(见图3)。因此,该方案在文献范围内形成了一个中间基线方案(Fisher et al。 ,2007)。非二氧化碳温室气体(尤其是甲烷)在2000-2050年期间稳步增长,但增长速度低于二氧化碳(作为其驱动因素,农业的增长速度预计将低于能源部门)。本世纪上半叶,土地利用产生的二氧化碳排放量回落至零。农业用地面积位于最近公布的类似情景的范围内,尽管处于该范围的低端(Rose et al。 ,2007)。4缓解方案和气候方案的结果4.1能源使用和温室气体排放缓解方案旨在将温室气体稳定在大约450 ppm CO2-当量。(参见 van Vuuren 等人,2007,2010)。这种情况允许初始浓度超调至大约510 ppm 二氧化碳当量。DenElzen 和 van Vuuren (2007)早些时候已经表明,有限的过度集中可以以较低的成本达到类似的气候目标。减少排放的方法有很多种。一个要素是提高能源效率,从而减少能源使用总量(2050年与基准相比减少20%)(见图4)。该设想方案还显示,非化石能源的使用日益增加,占能源使用总量增长的大部分。非化石能源的使用从2010年占一次能源使用总量的15% 增加到2050年的30% 以上,到本世纪末占总量的40% 以上。这种增长主要是由于生物能源使用的增加。碳捕获和储存应用于化石燃料的大多数固定用途。最后,还减少了非二氧化碳温室气体的排放。其结果是,全球排放量在2020年左右达到峰值,并随着时间的推移进一步减少。与2050年的基准相比,排放量减少了70% 以上,到2100年减少了80% 以上。减缓政策的后果不仅影响能源部门,而且影响土地使用。大量额外的土地用于造林和生物能源(见图5)。模型比较研究表明,这里提出的缓解方案与目前的文献是一致的,尽管模型显示各种减排措施的贡献有显着差异(Clarke 等,2010; Edenhofer 等,2010)。根据 IMAGE 模型计算,减排成本约为 GDP 的1-2% (即每年的额外支出,可与经合组织国家目前约占 GDP 1.5% 的环境政策支出相比较)(图6)。可比情景的文献范围在2100年的0.5.5% 左右。大多数研究都认为,这些额外的支出将导致国内生产总值的减少。我们将在第5.8节进一步讨论这个问题。4.2基线和缓解情景下的气候变化基于 IMAGE 模型计算,大气温室气体浓度和由这两种情景排放引起的相关平均全球温度变化如图7所示(实线表示最佳猜测值)。IMAGE 模型使用 MAGICC 模型来计算全球平均温度的变化。早期,van Vuuren 等人(2008b)将 MAGICC 模型用于类似的 IMAGE 场景,以计算温室气体浓度和温度(包括不确定范围)的轨迹。在这里,用于 MAGICC 计算的不确定性范围是基于现有的更复杂的碳循环和气候模型。我们使用了温室气体浓度范围和温度结果的含义来描述这里的不确定性范围,如图中阴影区域所示。对于温度,较宽的阴影区域表示由于碳循环和气候敏感性的不确定性而产生的不确定性。对于基线情景,全球平均气温在2050年几乎与工业化前水平线性上升至2.1摄氏度,在2100年上升至3.7摄氏度(不确定性范围为3-5摄氏度)。在缓解方案中,到2100年全球平均气温上升幅度限制在1.9摄氏度。同样,存在相当大的不确定性。图7表明,到本世纪末,与工业化前水平相比,减缓气候变化的情况也可能导致气温上升2.6摄氏度。由于这里提出的缓解方案是科学文献中最严格的方案之一(参考 Clarke 等,2010; Edenhofer 等,2010; Fisher 等,2007) ,可以得出两个重要的结论。首先,分析表明,全球变暖可以得到缓解,但不能停止。第二,严格的设想也可能导致气候变化大大超过2摄氏度这一观察结果可能意味着,对冲适应政策以防止气候变暖加剧可能具有相当大的价值。例如,这些政策可能是“ ... ... 
目标是2摄氏度,但准备3摄氏度”。在下面的影响评估中,我们关注的是中央气候变化预测。通过源自 HadCM2气候模型的重新尺度模式(图8)构建了与全球平均温度变化相关的全球0.5 ° × 0.5 ° 尺度上的月平均温度和降水量的变化。结果表明,高纬度地区年平均气温变化大于低纬度地区,降水量变化具有明显的空间变化特征。关于气候变化的预期模式,特别是降水的模式,存在着相当大的分歧: 因此,本文件提出的影响结果只代表一种可能的结果。5结果: 不同情景下的影响和适应5.1导言 IPCC 的第四次评估报告(IPCC,2007)概述了气候影响。其中一些影响来自平均气候的变化,但其他影响可能来自极端事件的变化。表1总结了一些影响,包括健康、农业、水资源供应、沿海洪水、城市地区和能源系统,以及气候系统的大规模破坏(相比之下,生物多样性和生态系统服务没有包括在内)。如前所述,大多数文献都将气候变化视为“渐进现象”(Agrawala and Fankhauser,2008)。这对于低概率和高拥有属性的影响是有问题的(见下文)。在这个探索性的分析中,我们描绘了一些影响和适应需求。我们的目标是涵盖表1中提到的几个关键影响,但是评估受到可以很容易耦合的模型的限制。因此,这些描述并不打算详尽无遗,而是在一定程度上说明了一些影响的严重程度和关键的适应挑战。在介绍我们的结果时,我们使用了几个基于上述情景的新模型运行(例如疟疾、水资源、海平面上升、加热和降温需求)。然而,我们也根据这里提出的两种情景(与温度相关的死亡率、农业和极端事件)评估了 IPCC 第四次评估报告中的现有信息。5.2人类健康: 与温度相关的死亡率和疟疾气候变化的健康影响需要在其他更重要的人类健康驱动因素的背景下看待,包括与生活方式相关的因素(Hilderink 等,2008)。我们在这里关注与温度有关的死亡率和疟疾。5.2.1与温度有关的死亡率与温度有关的死亡率影响可能通过极端温度的变化、平均温度的变化或温度的季节性变化而发生,文献显示的结果各不相同。McMichael 等(1996)使用相对风险比对温度相关死亡率进行了估计,表明存在死亡率最低的最佳温度(也称为 U 形剂量-反应关系)。如果温度升高,热应激相关的死亡率增加,但寒冷相关的死亡率下降。Tol (2002a)的结论是,从货币角度来看,由于气候变化导致的与寒冷有关的死亡率的下降超过与热有关的死亡率的增加。然而,这一结论受到用于评估生命价值的方法的影响,也受到平均和区域温度以及温度和健康之间关系的巨大不确定性的影响。适应可能发生在人体生理学适应更高的温度(麦克迈克尔等人,1996) ,行为的改变和空气调节使用的增加(Kinney 等人,2008)。考虑到使用温度和死亡率之间的剂量-反应关系的复杂性,我们还没有尝试在这里量化这些。5.2.2疟疾疟疾与气候变化之间的关系引起了人们的极大关注。在本文中,我们还重点讨论了气候引起的疟疾风险变化。每年有100多万人死于疟疾,其中大多数是非洲儿童。疟疾是一种由病媒传播的传染病。按蚊(传播疟疾感染的媒介)只能在平均温度高、没有霜冻和降水充足的气候中生存。MARA/ARMA 疟疾适宜性模型(Craig et al。 ,1999)综合了这些因素来确定气候适宜的地区。然而,疟疾造成的死亡率也受到诸如获得预防措施(包括室内喷洒和经杀虫剂处理的蚊帐)和获得医疗保健等因素的严重影响。在 MARA/ARMA 模型中,这些因素与收入和城市化有关。图9显示了这个模型在本文情景下的结果。自主适应的影响(作为收入增加的功能)减少了约50% 的疟疾死亡,特别是在非洲(主要是由于更好地提供卫生保健)。相比之下,气候的影响——特别是缓解设想方案与基准设想方案之间的差异要小得多。减缓措施将疟疾健康风险降低约2% (2050年)。因此,适应对疟疾控制的影响比缓解更具决定性(这一发现在现有文献中似乎很有说服力)。5.3农业: 对产量的影响伊斯特林等人(2007)综合了大量关于气候变化对作物生长的影响的研究,包括适应和不适应。结果总结为全球平均气温升高的函数,尽管实际上气温和降水模式的变化以及 CO2施肥都起作用。例如,二氧化碳施肥的影响部分抵消了气候变化的影响。结果可以用来评估我们的情景的气候影响使用最佳拟合多项式从伊斯特林等人(2007年) ,这表明产量的影响作为平均温度变化的函数。我们在每种情况下都采用全球平均气温变化作为一种假设,并将其作为预期的局部平均气温变化的指示。这意味着我们的影响估计可能是保守的,因为在许多陆地地区,气温上升可能比全球平均水平更强。我们研究了基线(4摄氏度)和缓解(2摄氏度)情景对玉米、小麦和水稻的影响,包括适应和不适应(见图10; 2100年热带和温带地区的结果; 这些影响是由于气候变化以外的其他因素导致的产量增加的额外影响)。虽然结果很不确定,但有些结论似乎是可能的。首先,基准情景(无适应性)导致所有情况下的产量(相对于没有气候变化的情况)大幅度下降: 气候变化影响可能会使所研究作物(2050年)的总体产量下降10-35% 。其次,无论是采取缓解还是适应措施,都会限制产量的下降。然而,在热带地区,影响仍然是负面的,通常在10% 左右的损失。第三,缓解和适应相结合可能会导致从今天的情况得到改善。对温带地区的农业影响可能更为积极,但前提是气温较高的优势不被极端天气的影响所抵消。这些结果突出表明,需要同时考虑缓解和适应问题。所提出的结果以气专委的评估为基础,代表了范围广泛的模型。结果也可以通过个别研究来说明。比如,Tubiello 和 Fischer (2007)发现,缓解方案可以显著降低全球农业气候变化的成本。同样,Fischer 等人(2007)阐述了适应水灌溉需求的重要性。他们发现,缓解措施减少了约40% 的农业用水需求,剩下60% 的影响需要适应。在处理对农业的影响时,干旱和热浪胁迫都起着重要作用。图11显示了在欧洲,假设各种形式的适应(Mechler 等,2010; Moriondo 等,2010) ,干旱和热浪胁迫对2 °C 升温情景下作物产量的影响。22在 HADCM3气候模式的基础上,利用 Cropsyst 模式对2030-2060年的时间片进行了计算。利用当前和未来的作物管理措施,模拟了春小麦的冬夏季作物产量。所考虑的适应选择包括将播种日期推迟几天和使用生长周期较长/较短的品种。结果显示,南欧和法国部分地区如今已经特别容易受到干旱和热应激的影响,即使在2摄氏度(缓解)的情况下,这种情况预计也会恶化(图11图 A)。当考虑两种适应策略与缓解措施相结合时(图11图 B 和 C) ,欧洲的许多地区实际上可能会受益。尤其是北欧,可以利用降水量较高的优势,使用生长周期较长的作物品种。相比之下,在南欧,同样的适应办法将产生额外的负面影响,因为作物发展将转向夏季,而夏季较长的干旱期和热浪可能会严重影响作物生长。此外,结果表明,虽然适应存在一些区域特有的限制,但整体适应将有效减少对欧洲农业部门的影响。5.4水资源: 潜在的水资源可利用性使用全球范围的水资源影响模型(Arnell,2003)评估了这两种情景对水资源压力变化的影响。图12显示了在基线情景和缓解情景(使用 HadCM2气候模型模式)下,到2100年(相对于1961-1990年的平均值)平均年径流量的百分比变化。如果年平均径流量小于1000 m3/人均/年,我们将流域定义为处于水资源紧张状态(文献中也使用了其他定义)。气候变化的影响是通过总结(i)生活在径流显着减少(增加)的水资源紧张流域的人口(通常超过5-10%)和(ii)生活在由于气候变化而变得水资源紧张(不再水资源紧张)的流域的人口。因气候变化而面对水资源压力上升或下降的人数没有计算在内,原因有二: (i)水资源压力下降的负面影响大于水资源压力下降的有利影响; 
(ii)面对水资源压力上升或下降的地区分布广泛,一个地区的“盈余”不能抵消另一个地区的“亏损”。结果显示,在2050年、2080年和2100年,缓解和基线假设情景之间,水资源紧张程度增加的风险暴露存在显著差异。到2020年,这两种情况的径流量几乎没有差别。图13显示了在这两种情景下,由于气候变化而面临水资源压力增加或减少的人数。在基线和缓解方案中,生活在缺水流域的人们显然从增加的可用水量中受益,这个数字大于暴露于径流减少的人数,但是,如上所述,我们并不关注净效应。面临水资源压力变化的人数对假定的气候变化模式很敏感。与基线相比,缓解方案在2050年、2080年和2100年分别减少了1.35亿(减少12% 的影响)、2.81亿(减少20%)和4.57亿(减少30%)面临水资源压力增加的人口。然而,与此同时,也有人从气候变化中受益。具有积极和消极影响的群体的相对规模取决于所使用的气候模型(这里只使用了哈德利模式)。显然,减缓气候变化也减少了从气候变化中受益的人数。同样显而易见的是,缓解并不能消除气候变化对供水的影响,因此需要对气候变化而面临更大水资源压力的其余10亿人进行适应。适应措施可以包括增加水的储存、水的运输或通过提高效率来减少水的需求。基本结果表明,缓解效果因地区而异。事实上,在一些地区,缓解甚至可能增加暴露于压力增加的人数。具体的不确定性分析表明,结果高度依赖于气候变化引起的降水模式变化的不确定性。5.5海平面上升气候变化的另一个重要影响是海平面上升。利用 IMAGE 模型的 MAGICC 组成部分,预测了这两种情况下的全球平均海平面上升。由于海平面对全球变暖的反应迟缓,预测主要在本世纪下半叶出现分歧: 在4摄氏度和2摄氏度的情况下,2050年海平面上升分别为35厘米和31厘米,在2100年分别为71厘米和49厘米。这些预测不包括格陵兰岛和南极洲冰盖的潜在加速贡献,这可能导致更高的海平面上升,但潜在的过程没有得到充分的理解,目前不包括在气候模型中(Meehl 等,2007; Nicholls 等,2010; Vermeer 和 Rahmstorf,2009)。我们使用 DIVA 模型来评估海平面上升、相关风暴潮和社会经济发展在这两种情况下的损害和适应成本,同时考虑到海岸侵蚀(直接和间接)、强迫迁移、沿海洪水(包括河流)以及盐度入侵三角洲和河口。对于每种情况,模型首先在没有堤坝的情况下运行,然后根据提高堤坝和滋养海滩的情况进行适应(DINAS-COAST Consortium,2006; Hinkel and Klein,2009)。由于缺乏全球数据和这些过程的一般模型,无法列入诸如沿海含水层盐度入侵、沿海湿地和生物多样性丧失等进一步影响,以及诸如盐度入侵屏障、港口升级、倒退区和基于生态系统的保护等进一步适应办法。图14显示,与缓解水平无关的是,适应相当有效地降低了全球总体成本,这说明即使在目标远大的缓解情况下也必须进行适应。在总体规模上,仅通过适应战略比仅通过缓解战略可以避免更多的损害,尽管两者结合起来产生的积极影响最大。然而,从较贫穷和小岛屿国家的角度来看,严格的缓解措施对于将风险保持在可控水平是必要的。即使没有海平面上升,为了保护仅由于社会经济发展而增加的泛滥平原资产,适应措施也是具有成本效益的。虽然这将涉及大量的投资流动(全球数百亿美元) ,但它们在全球 GDP 中所占的比例相对较小,即使对于基线情景下的海平面上升而言也是如此。然而,对于个别国家或地区(特别是小岛屿国家)来说,这些成本可能占 GDP 的比例要大得多,包括完全丧失的风险。5.6供热和供冷需求(居住区和社会)气候变化可能会影响空间供冷和供热的需求。因此,我们建立了一套简单的关系来描述住宅部门的供暖和空气调节需求,并探索了气候变化对这种模拟能源需求的影响(Isaac and van Vuuren,2009)。显然,人口和收入的变化预计将导致下个世纪取暖和空气调节的能源需求大幅增长(见图15,没有气候变化的例子)。在气候的驱动下,制冷和制热实践的变化是自主适应的例子(即没有政策干预)。然而,适应并不是普遍的,因为人们并不总是能够做出反应。供暖和制冷需求得不到满足可能导致健康影响(如第5.2节所述)和劳动生产率的损失。除了这些影响,当室内温度超过一定水平时,舒适度也会降低。图15显示,在全球范围内,由于收入和财富的增加而不考虑气候变化的能源需求的自主增长远远大于基线情景和缓解情景中的能源需求之间的差异(Isaac 和 van Vuuren (2009)显示,这对其他基线也是一个强有力的结果)。气候变化对综合能源需求的影响也小于单独对供暖和空气调节的影响,因为空气调节的增加弥补了供暖的减少。在区域和国家层面,影响可能更为显著: 例如,在印度,我们预计由于冷却增加,能源需求将大幅增加,而在西欧和美国,我们预计由于供暖减少,能源需求将大幅减少。5.7极端事件气候变化预计将导致一些与天气有关的极端事件的频率和强度发生变化(Parry et al。 ,2007)。像洪水、干旱、热浪和风暴潮这样的极端天气可能会变得更加频繁和强烈,而寒冷的极端天气,如寒潮,可能会变得不那么频繁和弱化。根据平均条件的变化来评估气候变化的风险——只有极端事件风险的变化被平均化的风险。因此,一种更基于风险、更明确地理位置的方法更可取。然而,关于灾害影响的知识是复杂和有争议的。迄今为止,只有有限的国家级研究采用概率方法预测气候变化存在的未来风险,主要集中在洪水风险(Mechler et al。 ,2010)。Feyen 等人(2009)在泛欧范围内进行的一项研究计算出,在基线情景下,预期的年度损失将增加三倍。定量风险方法的一个关键制约因素是气候预测的不确定性。对于降水,例如,模型往往不同意在局部尺度的变化迹象。这对于寻找例如洪水风险的研究尤其重要。虽然 Mechler 等人(2010)的研究旨在预测未来的风险,但他们发现未来的预测是如此的不确定,以至于作者没有根据对当今洪水影响的估计来预测未来的洪水风险。然而,目前的模型和数据似乎足以以相对较高的确定性(较慢的现象)评估干旱和热浪对农业造成的综合风险。这里提供了在2 °C 和4 °C 情景下的一些工作实例。一些研究调查了全球范围内受洪水影响的人们(Hirabayashi 和 Kanae,2009; Kundzewicz 等,2010)。样本回归分析显示,在缓解方案(2摄氏度)中,全球每年受100年洪灾影响的人口平均为2.11亿人,而基线(4摄氏度)为5.44亿人。Mirza 等人(2003)指出,对于孟加拉国这样一个易受水灾影响的国家来说,即使是2摄氏度的情况,预计也会使预计的洪水泛滥面积至少增加23-29% 。然而,应当指出的是,由于暴露、脆弱性和适应方面的不确定性,对未来洪灾损失的估计范围仍然很广。关于干旱,Burke 等人(2006)对2090年代的预测表明,对于2090年代的基线情景,每100年的极端干旱事件数量和平均干旱持续时间可能分别增加2倍和6倍。有证据表明,天气和气候相关影响造成的损害在当今已经增加,但这主要是由于财富和人口的增加(Bouwer,2010)。然而,气候变化预计将随着时间的推移而增加,并可能成为未来损害增加的一个更重要的因素。IPCC 最近的报告指出,重大事件的成本预计从占地区年 GDP 和收入的几个百分点到经济强劲的大地区的25% 不等(Parry et al。 ,2007)。过去受灾严重的小岛屿国家的灾害损失实际上已经超过了年度 GDP (Cummins and Mahul,2009)。5.8影响的经济评估成本-收益分析(CBA)是用一个共同的货币单位来表示不同战略的气候变化的成本和收益。我们在这里使用 FAIR 模型的 CBA 模块(参见模型附录 A)来获得一些关于更加聚合规模的影响的概念。对于缓解成本,FAIR 模型使用了前面介绍的 IMAGE 模型的信息。FAIR 中使用的气候损害和适应成本函数是从 AD-DICE 模型中推导出来的(De Bbu 等,2009a; Hof 等,2009a)。简而言之,AD-DICE 根据 DICE 模型的损伤函数估计适应成本(Nordhaus and Boyer,2000)。AD-DICE 将这些功能分为损害成本函数和残余损害函数,基于 DICE 
模型中描述的每个影响类别的评估-农业,沿海地区,健康,住区,非市场时间使用,其他脆弱市场和灾难性影响。在这项研究中,我们假设了对气候变化的最佳适应响应(即给定一个温度变化水平,模型将适应成本和剩余影响的总和最小化)。DICE (因此 FAIR)中使用的影响估计包括: (i)实际的,可测量的经济成本(所谓的市场成本) ; 和(ii)其他的,无形的损失(非市场损失) ,使用支付意愿概念货币化。损害函数与本节前面所述的物理或经济损害没有直接关系,因为它们来自单独的来源。早先已经表明,适应成本的 FAIR 结果与文献中报道的值范围一致(Hof et al。 ,2009a)。在 FAIR 模型的默认设置和2.5% 的贴现率下,2005-2200年期间气候变化影响造成的贴现成本占全球 GDP 的比例在基线水平上接近4.5% (图16)。这些成本可能看起来高于上面提到的有限的部门分析,但是包括更多的部门和可能的灾难性事件的影响(Nordhaus and Boyer,2000)。年度成本随着时间的推移急剧上升,到2200年达到17% (注意,影响估计非常不确定,文献中可以找到更高和更低的值(Parry 等,2007; Stern,2006; Tol,2002b))。只有适应或缓解的情况下,折扣成本大幅降低到2.5% 左右(图16)。Hof 等人(2008)的研究表明,气候变化的 CBA 结果对模型假设非常敏感,其中折现率起着最重要的作用。贴现率特别重要,因为随着时间的推移,与仅适应和仅缓解设想相关的成本函数不同。3.5% 的贴现率导致仅适应情景和仅缓解情景的贴现成本分别为0.8% 和1.9% 。如果使用1.4% 的贴现率(相当于 Stern (2006)使用的贴现率) ,仅适应情景和仅缓解情景的贴现成本分别为3.2% 和2.5% 。在贴现率为2.5% 的情况下,缓解与适应相结合,贴现成本最低,即占 GDP 的2% 。与文献资料一致,适应性投资被评估为小于缓解投资和剩余损害。然而,它们在限制残余损伤方面是非常重要的。需要提及一些重要的警告。首先,对于风险的极端尾部(即低概率、高影响事件) ,计算不能被视为可靠。作为对如何处理这些风险的主观评估,Weitzman (2008)质疑 CBA 对决策者的有用性。其次,考虑到时间偏好和风险的贴现率的价值目前正在激烈争论,争论涉及主观时间偏好和风险感知(Nordhaus,2008; Price,2005; Stern,2006)。如上所述,贴现率的价值可以对结果产生很大的影响。最后,非市场影响需要对损害进行主观量化; 虽然这些影响很难货币化,但一般来说,不可逆转的变化更加困难,例如导致珊瑚礁丧失的海洋变暖(Ackerman and Heinzerling,2004)。5.9气候变化、影响和适应方面的不确定性对未来气候变化及其影响的预测有许多不确定性的来源。不确定性与因果链中的每一步都有关联: 排放、气候驱动因素(如碳循环)、气候(主要是气候敏感性和气候变化模式)以及影响(包括适应能力)。因此,对于同样的排放情景,不同的研究可能会给出非常不同的结果。事实上,这些差异往往大于不同排放情景下某一特定模型产生的差异。例如,对于本世纪末的降水变化,多模式集合平均值仅在高纬度地区超过模式间的标准差(Kundzewicz et al。 ,2007)。气候变化预测的不确定性随着时间的推移而增加。在近期(例如2020年代) ,气候模型的不确定性起着最重要的作用; 而在较长的时期(例如2090年代) ,由于排放情景的选择而产生的不确定性变得越来越重要(Jenkins and Lowe,2003)。未来气候变化对极端事件的影响尤其不确定。这部分是由于粗分辨率气候模型的较大空间和时间尺度与某些极端天气(如暴雨降水和山洪)的局部发生和短期生命之间的不匹配。由于影响和适应是在当地范围内发生的,因此需要详细的信息——这意味着不确定性的增加。较大的不确定性范围表明,适应规划不应基于单一的假设情景,而是需要考虑到大范围的预测。6结论在本文中,我们讨论了情景分析如何有助于评估缓解和适应战略。我们还提出了两个集成的场景作为分析的起点。这些设想方案明确地将缓解和适应行动纳入了几个指标,并涵盖了社会经济发展与影响之间的几个重要联系和反馈(例如,考虑了气候变化对土地利用和缓解的影响)。我们为选定的一些指标确定了这些设想方案的影响,主要侧重于平均气候变化。基于我们的工作,我们得出以下结论: •通过描述两套对比鲜明的世界可能的气候变化轨迹,我们为对减缓、适应和气候影响之间的相互作用进行更加综合的分析奠定了基础。第一种情况(不采取缓解措施)预计将导致全球平均气温在本世纪末上升4摄氏度左右(气候参数和当前经济趋势的最有可能值)。正如我们的一些分析所显示的,这种情况有很高的适应需求。第二种设想假设有严格的缓解措施,并将全球平均气温变化限制在2 °C,概率为50% 。即使在这种情况下,也需要大量的适应措施。•这里提出的综合情景分析可以为探讨政策选择的不同后果(包括不确定性)奠定良好基础; 鉴于不确定性,确定缓解、适应和剩余损害之间的最佳组合是不可行的。正如本文所讨论的那样,衡量气候变化的后果和各种政策回应的复杂性在于规模、空间和时间上的巨大差异; 巨大的不确定性; 以及行为者之间利益的明显差异(例如,他们是气候变化的肇事者还是受害者)。因此,对风险的主观解释将始终发挥重要作用。尽管如此,情景分析可以提供对可能结果和风险的描述。在这个阶段,成本和收益的货币评估(第5.8节)不能与前面章节中对物理变化的描述联系起来。有效的气候政策包括适应和减缓。模型计算表明,可以设计缓解设想方案,使全球平均气温上升2摄氏度,从而达到对气候敏感性的最佳猜测。然而,即使是这些严格的情况也可能导致全球平均气温上升超过2.5摄氏度(最多上升1.5摄氏度)和区域气温变化更大。本文件探讨的大多数影响都表明需要将缓解和适应结合起来。例如,在应对海平面上升(至少在21世纪)方面,适应措施可能比缓解措施更有效,但缓解措施在减少损害和降低适应成本方面仍然可以发挥作用。农业提供了一个明显需要适应和缓解的例子。如果不采取适应和减缓行动,预计许多区域的农作物产量将因气候变化而受到不利影响。如果没有严格的缓解措施,适应只能限制负面影响,而不能消除它们。缓解的一个好处是,它影响到所有影响类别,而适应需要根据影响和环境进行调整。•尽管气候变化的影响可能很严重,而且根据主观选择,可能需要制定严格的气候政策,但本研究评估的影响(鉴于目前的技术水平)可能仍然是全球范围内人口变化和经济增长的次要影响。然而,需要注意的是(见下文)。虽然气候变化可能对数百万人产生影响,但其他挑战可能对人民和治理产生更大的影响。然而,应该指出的是,我们只涉及了有限的一组影响,并且主要集中在对逐渐变化的气候的平均估计上,例如,没有涉及灾难性的、影响非常大的、极低概率的事件(Weitzman,2008)。这些事件实际上可能非常严重,以至于上述结论不再成立。如果全球范围的成本仍然相对较低,就不太需要进行全球分析,以便根据故事情节的一致性纳入对主要驱动因素的所有反馈。显然,在地方一级,情况可能大不相同; 对个别国家的影响可能远远大于全球一级的影响。例如,海平面上升对一些低洼岛国和国家非常重要,这些国家可能会受到巨大的适应成本和/或损害(直至完全毁灭)的显著影响。对农业而言,预计积极和消极影响将在不同地方和不同时间发生,低收入国家往往受到相对较为负面的影响。目前温度受到限制的温带地区的农业可能会受益。总之,我们认为,进一步制定在区域范围内进一步具体说明这些情况的综合设想是有益的。虽然本文提出了一个有用的第一步,它也留下了许多反馈意见仍然没有说明。本研究中的总体缓解成本估计为2 °C 情景下国内生产总值的1-2% 左右。缓解方案降低了气候变化的风险。在缓解方面的投资有几种类型的好处。首先,与气候相关的损害和适应成本得到降低。其次,不确定性也会减少,考虑到所涉及的风险,这一点很重要。虽然我们认为,在全球一级,缓解和适应之间不可能存在最佳的平衡,但我们已经表明,从长远来看,缓解和适应的成本和收益是相当的。进一步分析的重点包括评估实际变化与货币影响分析之间的联系、极端事件的可变性和变化、大规模干扰和治理的潜在作用。在我们和其他评估中,主要关注的是平均值的变化,然而,人们对与气候变化有关的极端事件(导致自然灾害) 
,以及大规模的破坏(如西南极冰盾的解体)有相当大的关注,这些破坏并没有被平均值准确地描述。对气候变异性变化的预测高度不确定,迄今为止常常妨碍分析有力地预测未来的极端事件风险。不同行为者的作用是另一个问题; 某些形式的适应需要政府的积极参与; 其他形式的适应可能由私人投资者实施,例如安装空间冷却系统。这两个适应主体之间的差异与未来的情景发展有关。本文中提出的研究是作为欧盟资助的 ADAM 研究项目的一部分进行的。这篇论文的早期版本是作为《让气候变化为我们服务》一书的一部分出版的,该书由休姆和纽菲尔德编辑,剑桥大学出版社于2010年出版。附录 A.模型描述 A.1 IMAGE 2.4 IMAGE 2.4综合评估模型(Bouwman et al。 ,2006)由一系列相互关联的综合模型组成,这些模型共同描述了全球环境变化长期动态的重要因素,如空气污染、气候变化和土地使用变化。作为 IMAGE 的一部分,全球能源模型 TIMER (van Vuuren et al。 ,2006)描述了一次能源和二次能源需求和生产的长期动态以及温室气体和区域空气污染物的相关排放。模型行为主要是由各种技术的替代过程决定的,基于长期价格和燃料偏好。IMAGE 的农业模型模拟了7个作物类别和5个动物类别的生产力(Leemans 和 Born,1994)。根据一套分配规则,农产品的区域生产在空间上(0.5 ° × 0.5 °)分布(Alcamo et al。土地利用变化图和农业活动数据都被用来模拟土地利用(变化)产生的排放。MAGICC 模型利用温室气体排放量来计算全球平均气温变化(Wigley and Raper,2001)。温度变化模式是通过与大气环流模式(GCM)产生的气候变化模式联系而获得的。局限性: IMAGE 提供了对人类活动的物理描述(使用成吨的石油,生产成吨的谷物等)。更全面的宏观经济描述只能通过与其他模型的合作得到。IMAGE 作为综合评估模型的广泛覆盖范围意味着许多关键的不确定性影响模型的结果。在这种情况下,使用单一的基线(如在 ADAM 项目中)并不能完全满足所涉及的基本不确定性。A. 2 FAIR 气候政策模型 FAIR (Den Elzen 等,2008)与 IMAGE 模型一起用于确定不同排放源的减排率。全球气候计算利用了简单的气候模型 MAGICC 4.1(Wigley,2003; Wigley and Raper,2001)。所要求的全球减排量是通过计算基准和全球排放路径之间的差额得出的。FAIR 成本模型使用不同排放源的区域边际减排成本曲线(MAC) ,采用最小成本方法,在各区域之间进行分配。最近,FAIR 模型已经扩展到损害和适应成本曲线(基于 AD-DICE 模型(De Bbu 等,2009b)和估计宏观经济对 GDP 增长的影响的能力(Hof 等,2008))。这使模型能够探讨减缓和适应综合战略的经济影响。限制: 为了灵活起见,公平竞争模式不包括部门宏观经济模式或能源模式。因此,该模型从一个局部均衡的方法工作-和更深层次的后果气候政策只能通过转发公平的结果到其他(相关)模型研究。A.3 DIVA DIVA (动态和互动脆弱性评估)是在欧盟资助的项目 DINAS-COAST 44中开发的一个沿海系统的综合模型,连同其适当的沿海数据库,对沿海地区对海平面上升的国家、区域和全球脆弱性进行动态和互动评估; http://www.pik-potsdam.de/DINAS-COAST/。(DINAS-COAST Consortium,2006; Hinkel and Klein,2009).《综合发展战略》提供了一系列生态、社会和经济沿海脆弱性指标的定量信息,从国家以下各级到全球各级,涵盖所有沿海国家。该模型由来自各种工程、自然和社会科学学科的专家开发的若干模块组成。根据气候和社会经济情景,该模型评估了海岸侵蚀(直接和间接)、海岸洪水(包括河流)、湿地变化和盐度入侵三角洲和河口。家庭影响评估还从提高堤坝和滋养海滩的角度考虑沿海适应问题,并包括一些预先确定的适应战略,例如不予保护、充分保护或最佳保护。限制因素: 《综合可持续发展战略》排除了可能影响沿海影响的下列进程,但目前无法有把握地建立模型: 风暴频率和强度的变化、沿海快速发展和城市化导致的国内生产总值的地方分布和人口增长,以及盐度侵入沿海含水层。由于高程数据的粗分辨率和精度,还会产生更多的重要不确定性。A. 4 TIMER ——冷却/加热能源需求 TIMER 冷却/加热能源需求模型(Isaac and van Vuuren,2009)描述了冷却和加热能源的使用是几个因素的函数,包括人口水平、不断变化的收入水平和气候。对于加热和冷却,经验数据被用来校准一组系统-动态需求函数。气候(降温和升温度日)起着重要作用。该模型能够解释气候变化的影响。局限性: 对发展中国家而言,校准模型的经验基础相对较差。该模型没有描述可以提供冷却和加热需求的不同方式以及用一种技术替代另一种技术所涉及的成本。水资源影响模型水资源影响模型(Arnell,2003,2004)有两个组成部分。第一阶段采用宏观尺度水文模型 Mac-PDM 模拟全球地表(0.5 ° × 0.5 °)的径流,第二阶段通过计算人均水资源可利用率确定流域水资源压力指标。如果一个流域的年平均径流量低于每人每年1000立方米,则假定该流域面临水资源压力,这是一个半任意的阈值,广泛用于确定水资源压力区域。如果气候变化导致缺水流域的径流量显著减少,或者导致流域降低到阈值以下,那么气候变化将导致水资源压力暴露的增加。气候变化导致对相反趋势的暴露明显减少。这些变化不能直接比较; 虽然径流量的减少(和暴露量的增加)极有可能是不利的,但如果额外的水不能储存,或者在洪水增加的高流量季节发生,则径流量的增加(和暴露量的明显减少)可能不是有益的。生活在水资源压力增加的流域的人口数量可以作为暴露于气候变化的一个指标。实际影响(就真正的水资源短缺而言)将取决于现有的水资源管理结构。局限性: 水文模型不能完全模拟河流径流量,特别是在半干旱地区倾向于高估径流量。水资源指标是衡量受影响程度的指标,而不是实际影响; 它可以被视为适应需求的替代指标。答.6疟疾的危险疟疾媒介,蚊子传播感染,只能生存在适当的气候与高平均温度,没有霜冻和足够的降水。MARA/ARMA 疟疾适宜性模型(Craig et al。 ,1999)综合了这些气候因素来确定气候适宜区域。最大适合度为1和最小适合度为0所需的气候水平见表 A.1。对于水平介于0或1适合性所需水平之间的指标,使用简单函数计算水平(Craig et al。 ,1999)。利用 IMAGE 模型的输出结果,所有这些因子都是在半乘以半度的网格水平上计算的(Bouwman et al。 ,2006)。每个栅格细胞的总气候疟疾适应性是由这三个指数中的最低值决定的。局限性: MARA/ARMA 模型描述了对疟疾病媒的适用性。它没有提供蚊子传播的过程说明,也没有明确说明人们可能对增加的风险水平作出何种反应。参考文献 Ackerman and Heinzerling,2004 F。 Ackerman L。 Heinzerling 无价之宝: 关于了解一切事物的价格和一无所有的价值2004 The New Press 纽约 Agrawala and Fankhauser,2008 S。 Agrawala S。 Fankhauser 适应气候变化的经济方面。成本、收益和政策工具2008经合组织巴黎 Alcamo 等人,1998年 J。 Alcamo E。 Krol R。 Leemans J。 Bolen J。 Minnen M。 Schaeffer S。 Toet B。 Vries 全球环境变化模型: IMAGE 2.1 J。IMAGE 2.1型号1998爱思唯尔科技有限公司测试结果。Arnell,2004 N.Arnell 气候变化与全球水资源: 气候变化与社会经济情景全球环境变化14120043152 Arnell 等,2002 N.W。Arnell M.G.R.Cannell M. Hulme R.S.Kovats J.F.B.Mitchell R.J.Nicholls M.L.Parry M.J.L.利弗莫尔 · A · 怀特二氧化碳稳定化对气候变化影响的后果气候变化5342002413446 Bakkenes 等,2006 M。 Bakkenes B. Eickhout R. 
Alkemade 不同气候稳定化情景对欧洲植物物种的影响全球环境变化16120061928,2003年代表全球气候变化、适应和减缓全球环境变化132003年16巴克等人,2009年巴克,t。 ,肯伯,M。 ,Scrieciu,S。打破气候僵局。降低成本: 合作气候行动的经济效益。气候小组,托尼 · 布莱尔办公室,4CMR-剑桥大学和 Cambridge Econometrics。Barker and Scrieciu,2010 T Barker SS.Scrieciu 用 E3MG 模拟低稳定性: 走向模拟能源-环境-经济系统动力学的“新经济学”方法能源期刊31特刊12010137164 Barker 等,2008 T。Scrieciu T. Foxon 实现八国集团50% 的目标: 使用宏观经济计量模型 E3MG 气候政策82008 S30 S45 Berkhout 等,2002 F. Berkhout J. Hertin A. Jordan 气候变化影响评估中的社会经济未来: 使用情景作为“学习机器”全球环境变化12220028395 Bouwer,2010 L.M。人为气候变化造成的灾害损失增加了吗?2010年美国气象学会简报10.1175/2010BAMS 3092.1 Bouwman 等人,2006年 A.F. Bouwman T. Kram K. Klein Goldewijk 全球环境变化综合模拟。2006年荷兰环境评估机构 Bilthoven 228页。(出版物500110002/2006) Burke 等人,2006 E.J。 Burke S.J。 Brown N. Christidis 利用哈德利中心气候模型对21世纪全球干旱的近期演变和预测进行建模。《水文气象学杂志》2006年7月1113日1125 Clarke 等人,2010年 L。 Clarke J. Edmonds V. Krey R. Richels S. Rose M. Tavoni 国际气候政策架构: EMF 22国际情景的概述能源经济学31号补编。2010年 S64/s81哥本哈根协议,2009年哥本哈根协议,2009年。(2009年12月18日哥本哈根协议)。2009年联合国气候变化会议,哥本哈根。Craig 等人,1999 M.H。Craig R.W.基于气候的疟疾在非洲传播的分布模型寄生虫学今天1531999105111康明斯和马胡尔,2009 J.D。发展中国家巨灾风险融资: 公共干预原则,2009年世界银行华盛顿,DC De Bbu 等,2009a K.C。德布鲁姆 R.B。Dellink S. Agrawala 适应气候变化的经济方面: 适应成本和效益综合评估模型,2009年经合组织巴黎德布鲁恩等,2009 b。德布鲁姆 R.B。Dellink R.S.J.Tol AD-DICE: DICE 模型气候变化951-220096381 Den Elzen 等,2008 M.G。J.登伊尔森私人侦探社。Lucas D.P.为实现低二氧化碳当量浓度而采取的区域减排行动和分配办法下的排放限额费用气候变化9032008243268 Den Elzen 和 van Vuuren,2007 M.G。J.Den Elzen D.P.104-46/2007/17931/17936 DINAS-COAST 财团,2006年 DINAS-COAST 财团,2006年,更有可能以更低的成本实现长期温度目标的范维伦峰值美国国家科学院院刊。DIVA 1.5.5光盘,德国波茨坦气候影响研究所。伊斯特林等,2007 W。伊斯特林 P。阿加瓦尔 P。巴蒂玛 K。布兰德 L。埃尔达 M。霍登 A。基里连科 J。莫顿 J。Soussana J. Schmidhuber F.N. Tubiello 食品、纤维和森林产品 M.L. Parry O.F. Canziani J.P. Palutikof P.J. van der Linden C.E. Hanson 气候变化2007: 影响、适应和脆弱性。第二工作组对政府间气候变化专门委员会2007年剑桥大学出版社第四次评估报告的贡献剑桥,英国欧共体,2006年欧共体2050年世界能源技术展望(WETO H2)2006年欧盟委员会布鲁塞尔埃登霍费尔等人,2010 O。 Edenhofer B。 Knopf T。 Barker L。 Baumstart E。 Bellevrat B。 Chateau P。 Criqui M。 Isaac A。 Kitous S。 Kypreos M。 Leimbach K。 Lessmann B。 Magné S。 Scrieciu H。 Turton D。Van Vuuren 低稳定性的经济学: 减缓战略和成本的模型比较能源杂志31 SI-120101148环境变化研究所,2009环境变化研究所,2009。国际气候会议-4度及以上。英国牛津大学环境变化研究所,9月28日至30日。Feyen J.I. Barredo R. Dankers 全球变暖和城市土地利用变化对欧洲洪水的影响。迈向工程、设计和管理方法的一体化2009 Taylor and Francis Group London Fischer et al。 ,2007 G. Fischer F.N. Tubiello H. van Velthuizen D.Wiberg 气候变化对灌溉用水需求的影响: 缓解的影响,1990-2080技术预测和社会变化747200710831107 Fisher et al。时间序列: K.Jiang M.Kainuma E. La Rovere A. Matysek A. Rana K. Riahi R. Richels S. Rose D. van Vuuren R. Warren P. Ambrosi F. Birol D. Bouille C. Clapp B. Eickhout T. Hanaoka M.D. Mastrandrea Y.Matsuoko B.O’Neill H. Pitcher S. Rao F. Toth 与长期减缓有关的问题:。减缓气候变化。第三工作组对2007年政府间气候变化专门委员会剑桥大学出版社第四次评估报告的贡献,2010年 A. Hayashi K. Akimoto F. Sano S. Mori T. Tomoda 评估全球变暖对不同稳定水平的影响,作为确定长期稳定目标气候变化的一个步骤98201087112 Hilderink et al。 ,2008 H. Hilderink P.L。卢卡斯 · A · 滕霍夫 · 科克 · M · 德沃斯 · P · 詹森 · J · 梅耶尔 · A · 费伯尔 · A · 伊格纳修克 · A · 彼得森 · H · J。M.2008年荷兰环境评估机构 Bilthoven Hinkel 和 Klein,2009年 J。T.Klein 整合知识以评估海平面上升对沿海脆弱性的影响: DIVA 工具全球环境变化的发展1932009384395 Hirabayashi and Kanae,2009 Y. Hirabayashi S. 
Kanae 第一次估计未来全球人口面临洪水风险水文研究快报3200969 Hof 等,2009a A.F。霍夫 K.C。德布鲁姆 R.B。Dellink M.G.J.Elzen D.P.不同减缓战略对国际适应融资的影响环境科学和政策1272009832843 Hof 等,2009b A.F。霍夫・德・布鲁姆・ R ・德林克・ M.G。J.Elzen D.P.2012年后的全球气候治理: 架构、机构和适应2009年剑桥大学出版社剑桥霍夫等人,2008年 A.F。Hof M.G.J.Elzen D.P.范维伦分析气候政策的成本和收益: 价值判断和科学不确定性全球环境变化1832008412424 IMAGE-team,2001 IMAGE-team,2001。IPCC SRES 情景的图像2.2实现。全面分析21世纪的排放、气候变化和影响。RIVM 光盘出版物481508018,比尔特霍芬国家公共卫生与环境研究所。政府间气候变化专门委员会,2007年,2007年。2007年气候变化: 综合报告。第一、第二和第三工作组对政府间气候变化专门委员会第四次评估报告的贡献。政府间气候变化专门委员会,日内瓦,104页。能源政策背景下的全球住宅部门供暖和空气调节的能源需求建模。处理 UKCIP02气候变化情景中的不确定性。埃克塞特气象局哈德利中心技术说明44。Kinney et al。 ,2008 P.L。 Kinney M.S. O’Neill M.L。 Bell J. Schwartz 用于估计气候变化对与热有关的死亡的影响的方法: 挑战和机会环境科学和政策11872008 Klein et al。 ,2007 R.J.T. Klein S. Huq F. Denton T.E. Downing R.G. Richels J.B. Robinson F.L. 适应和缓解之间的相互关系。2007年气候变化。影响、适应和脆弱性。第二工作组的贡献。2007年政府间气候变化专门委员会报告剑桥大学出版社剑桥745777 Krol 等人,1997年 M. Krol J. Alcamo R. Leemans 稳定大气中二氧化碳的全球和区域影响全球变化的缓解和适应策略11997年341361 Kundzewicz 等人,2010 Z.W。Kundzewicz y. Hirabayashi 气候变化中的 S. Kanae River 洪水——水资源管理的观察和预测2010年10.1007/s11269-009-9571-6 Kundzewicz 等,2007 z.W。Kundzewicz L.J.Mata N. Arnell P. Döll P. Kabat B. Jiménez K. Miller T. Oki Z. en I. Shiklomanov 淡水资源及其管理。招架。坎齐亚尼 J.P。普鲁提克。Hanson P.J.2007年范德林登气候变化: 影响、适应和脆弱性。第二工作组对2007年政府间气候变化专门委员会剑桥大学出版社第四次评估报告的贡献。翻译。D.确定自然植被、作物和农业生产力的潜在全球分布水、空气和土壤污染761994133161 Lenton 等人,2008 T.M。Lenton H 拘留了 E. Kriegler J.W。鲁赫特 · S · 拉姆斯托夫 · H · J 大厅。Schellnhuber 地球气候系统中的倾斜元素美国国家科学院院刊1056200817861793。Manne R.G.Richels Merge: 全球气候变化综合评估模型。Zaccour Energy and Environment 2005 Springer USA McMichael et al。 ,1996 A. McMichael A. Haines R. Sloff S. Kovats Climate Change and Human Health 1996 World Health Organization Geneva Mechler et al。 ,2010 R. Mechler S. Hochrainer A. Aaheim Z. Kundzewicz N. Lugeri M. Moriondo H. Salen M. Bindi I. Banaszak A. Chorynski E. Genovese H. Kalirai J. Linnerooth-Bayer C. Lavalle D. McEvoy P. Matczak M. RadzieJewski D. Rübbelke M.-J。评估欧洲适应不断变化的洪水和干旱风险的风险管理方法 M.Hulme H. Neufeldt 让气候变化为我们服务: 关于适应和减缓战略的欧洲观点2010年剑桥大学,英国 Meehl 等人,2007年 G.A. Meehl T.F. Stocker W.D. Collins Friedlingstein A.T. Gaye J.M. Gregory A. Kitoh R. Knutti J.M. Murphy A. Noda S.C.B. Raper I.G. Watterson A.J. Weaver Z- C。赵全球气候预测2007所罗门气候变化: 物理科学基础。第一工作组对2007年剑桥大学出版社第四次评估报告的政府间气候变化专门委员会。减缓气候变化。第三工作组对2007年政府间气候变化专门委员会剑桥大学出版社第四次评估报告的贡献剑桥,United Kingdom Mirza 等,2003 M.M。问:。Mirza R.A.Warrick N.J.气候变化对恒河、 Brahmaputra 和梅格纳河洪水的影响孟加拉国气候变化572003287318 Moriondo et al。欧洲农业应对气候变化和可变性的影响和适应机会全球变化的缓解和适应战略1572010657679 Moss 等,2010 R.H。Moss J.A.Edmonds K.A.Hibbard M.R.Manning S.K.Rose D.P.Van Vuuren T.R.Kainuma T. Kram G.A.Meehl J.F.B.Mitchell N. Nakicenovic K. Riahi S.J.史密斯 R.J。早上吃饱。汤姆森 J.P。Weyant T.J.Nakicenovic 等,2000 N。 Nakicenovic 排放情景特别报告(SRES)2000剑桥大学出版社剑桥,英国 Nakicenovic 等,2000,nakicenovic P. Kolp K. Riahi M. Kainuma T. Hanaoka 排放情景评估重新审视环境经济学和政策研究732006137173 Nicholls and Lowe,2004 R.J。Nicholls J.A.全球环境变化1432004229244 Nicholls 等,2010 R.J。Nicholls N. Marinova J.A.劳 · S · 布朗 · P · 维林加 · D · 德 · 古斯芒 · J · 欣克尔。J.21世纪英国皇家学会哲学汇刊2010年10月10日。2010.029 Nordhaus and Boyer,2000 W.D.诺德豪斯 · J · 博耶(Nordhaus J. Boyer)《全球变暖: 2000年全球变暖的经济模型》麻省理工学院出版社,剑桥,马萨诸塞州,第10页。315-328 Nordhaus,2008 W.D.平衡的问题衡量全球变暖政策的选择2008年纽黑文和伦敦帕里等人的耶鲁大学出版社,2007年。招架。坎齐亚尼 J.P。PJ 的 Palutikof。Van der Linden C.E.2007年汉森气候变化: 影响、适应和脆弱性。返回文章页面第二工作组对2007年剑桥大学出版社第四次评估报告的贡献政府间气候变化专门委员会:?气候变化9932010383402 Piani 等人,2005 C. Piani D.J。陷害地方检察官。Stainforth M.R.艾伦对气候变化的约束来自数千名成员的模拟地球物理研究通讯322005 L23825 Price,2005 c 价格关于环境变化影响的代际视角: 折现未来的观点 J.L。Innes 通用汽车。Hickey H.F.2005年国际林业研究组织联合会(国际林研组织)维也纳罗斯等人,2007年,Ahammad,h,a,Rao,S. ,Riahi,K. 
,van Vuuren,D. 2007.土地在气候稳定模拟中的作用: 初步观测能源模拟论坛报告。斯坦福大学。Schneider 和 Kuntz-Duriseti,2002 S.H。不确定性与气候变化政策。奈尔斯气候变化政策: 一项调查2002年岛屿出版社华盛顿特区斯特恩,2006年 N。斯特恩气候变化经济学评论2006年剑桥大学出版社剑桥斯沃特等人,斯瓦特 · L · 伯恩斯坦 · M · Ha-Duong A. Petersen 同意不同意: 政府间气候变化专门委员会评估气候变化、影响和应对措施的不确定性管理922009129斯瓦特和雷斯,2007 R · 斯瓦特 · F · 雷斯将适应和缓解工作结合起来: 纳入可持续发展政策的主流?气候政策742007288303 Tol,2002a。第二部分。环境和资源经济学2122002135160托尔,2002年 b。第一部分。基准估计环境和资源经济学21120024773 Tol,2002 c R.S.J. Tol 福利规格和气候变化的最佳控制: 基金的应用能源经济学2442002367376 Tubiello and Fischer,2007 F.N. Tubiello G. Fischer 减少气候变化对农业的影响: 减缓的全球和区域影响,2000-2080年技术预测和社会变化747200710301056联合国,2005年。世界人口前景: 2004年修订本。光盘版-扩展数据集。联合国出版物。E.05.十三.12,联合国,经济和社会事务部,人口司。Van Vliet et al。 ,2009 J.van Vliet M.G.J.den Elzen D.P. van Vuuren 会议根据延迟参与的能源经济31号补充辐射效应制定了目标。22009 S152 S162 van Vuuren et al。 ,2006 D.P. van Vuuren B。 van Ruijven M。 Hoogwijk M。 Isaac B。 De Vries TIMER 2: 模型描述和应用 L。 Bouwman T。 Kram K。 Klein-Goldewijk 全球环境变化综合模型。IMAGE 2.42006 MNP 概述-荷兰环境评估机构 Bilthoven van Vuuren 等,2007 D.P。Van Vuuren M.G.J.登伊尔森私人侦探社。Lucas B. Eickhout B.J.稳定低水平的温室气体浓度: 减少战略和成本的评估气候变化8122007119159 Van Vuuren 等,2008 a D.P。Van Vuuren B. De Vries A. Beusen P.S.C.21世纪温室气体排放的有条件概率估计基于 IPCC-SRES 设想的全球环境变化1842008635654 van Vuuren 等,2008b D.P。Van Vuuren M Meinshausen G.K.普拉特纳 · F · 乔斯 · K.M。斯特拉斯曼 S.J。Smith T.M.L.Wigley S.C.B.Raper K. Riahi F. De La Chesnaye M.G.J.登藤野 K。江 N。 Nakicenovic S。 Paltsev J.M。21世纪减缓美国国家科学院院刊的温度上升。Van Vuuren M.G.J.艾萨克不同气候制度的比较: 扩大参与的影响能源政策3712200953515362 van Vuuren 等,2010 D.P。Van Vuuren E. Stehfest M.G.J.艾萨克探索将温室气体辐射效应控制在3瓦/平方米以下的情景能源经济学31特刊12010165192维梅尔和拉姆斯托夫,2009年 M. Vermeer S. Rahmstorf 与全球气温美国国家科学院院刊相关的全球海平面10620092152721532 Weitzman,2008 Weitzman,M.L。,2008年。灾难性气候变化的经济学模型与解释。Wigley,2003 T.M.L. Wigley MAGICC/SCENGEN 4.1: 技术手册2003 UCAR-气候和全球动力学分部博尔德,CO Wigley and Raper,2001 T.M.L. Wigley S.C.B. Raper 全球平均变暖科学高预测的解释2932001451454|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Fairness+in+Graph+Machine+Learning:+Recent+Advances+and+Future+Prospectives)|0| +|[Socially Responsible Machine Learning: A Causal Perspective](https://doi.org/10.1145/3580305.3599571)|Raha Moraffah, AmirHossein Karimi, Adrienne Raglin, Huan Liu|Pukyong Natl Univ, Grad Sch Management Technol, Busan, South Korea; Shanghai Lixin Univ Accounting & Finance, Shanghai, Peoples R China|The underlying assumption of using investor sentiment to predict stock prices, stock market returns, and liquidity is that of synergy between stock prices and investor sentiment. However, this synergistic relationship has received little attention in the literature. This paper investigates the synergistic pattern between stock prices and investor sentiment using social media messages from stock market investors and natural language processing techniques. At the macro level, we reveal extremely significant positive synergy between investor sentiment and stock prices. That is, when a stock price rises, investor sentiment rises, and when a stock price falls, investor sentiment falls. However, this synergy may be reversed or even disappear over a specific time period. Through a segmented measurement of the synergy between stock prices and investor sentiment over the course of a day, we also find that investor sentiment on social media is forward looking. This provides theoretical support for using investor sentiment in stock price prediction. We also examine the effect of lockdowns, the most draconian response to COVID-19, on synergy between stock prices and investor sentiment through causal inference machine learning. 
Our analysis shows that external anxiety can significantly affect synergy between stock prices and investor sentiment, but this effect can promote either positive or negative synergy. This paper offers a new perspective on stock price forecasting, investor sentiment, behavioral finance, and the impact of COVID-19 on the stock markets. Copyright (c) 2022 Borsa Istanbul Anonim Şirketi. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).|利用投资者情绪来预测股价、股市回报和流动性的基本假设是股价和投资者情绪之间的协同作用。然而,这种协同关系在文献中很少受到关注。本文利用来自股市投资者的社交媒体信息和自然语言处理技术,研究了股票价格与投资者情绪之间的协同关系。在宏观层面,我们发现投资者情绪与股价之间存在极其显著的正协同效应。也就是说,当股价上涨时,投资者情绪上升,当股价下跌时,投资者情绪下降。然而,这种协同作用可能会逆转,甚至在特定的时间段内消失。通过对股票价格和投资者情绪在一天内的协同效应进行分段测量,我们还发现,社交媒体上的投资者情绪具有前瞻性。这为利用投资者情绪进行股价预测提供了理论支持。我们亦会透过因果推理机器学习,研究封锁对股价与投资者情绪之间的协同效应的影响。封锁是对2019冠状病毒疾病最严厉的回应。我们的分析表明,外部焦虑可以显著影响股票价格和投资者情绪之间的协同效应,但这种效应可以促进正面或负面的协同效应。本文为股票价格预测、投资者情绪、行为金融学以及2019冠状病毒疾病对股票市场的影响提供了一个新的视角。版权所有(c)2022 Borsa Istanbul Anonim Şirketi。由 Elsevier B.V 出版。这是 CC BY-NC-ND 许可证下的一篇开放存取文章( http://creativecommons.org/licenses/BY-NC-ND/4.0/)。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Socially+Responsible+Machine+Learning:+A+Causal+Perspective)|0| |[Training Large-scale Foundation Models on Emerging AI Chips](https://doi.org/10.1145/3580305.3599573)|Aashiq Muhamed, Christian Bock, Rahul Solanki, Youngsuk Park, Yida Wang, Jun Huan||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Training+Large-scale+Foundation+Models+on+Emerging+AI+Chips)|0| |[How to DP-fy ML: A Practical Tutorial to Machine Learning with Differential Privacy](https://doi.org/10.1145/3580305.3599561)|Natalia Ponomareva, Sergei Vassilvitskii, Zheng Xu, Brendan McMahan, Alexey Kurakin, Chiyuan Zhang||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=How+to+DP-fy+ML:+A+Practical+Tutorial+to+Machine+Learning+with+Differential+Privacy)|0| -|[Trustworthy Machine Learning: Robustness, Generalization, and Interpretability](https://doi.org/10.1145/3580305.3599574)|Jindong Wang, Haoliang Li, Haohan Wang, Sinno Jialin Pan, Xing Xie|Univ Wisconsin, Madison, WI 53706 USA; Google, Mountain View, CA USA|An emerging problem in trustworthy machine learning is to train models that produce robust interpretations for their predictions. We take a step towards solving this problem through the lens of axiomatic attribution of neural networks. Our theory is grounded in the recent work, Integrated Gradients (IG) [STY17], in axiomatically attributing a neural network's output change to its input change. We propose training objectives in classic robust optimization models to achieve robust IG attributions. Our objectives give principled generalizations of previous objectives designed for robust predictions, and they naturally degenerate to classic soft-margin training for one-layer neural networks. We also generalize previous theory and prove that the objectives for different robust optimization models are closely related.
Experiments demonstrate the effectiveness of our method, and also point to intriguing problems which hint at the need for better optimization techniques or better neural network architectures for robust attribution training.|值得信赖的机器学习中一个新出现的问题是训练模型,为它们的预测产生可靠的解释。我们通过神经网络的公理属性透镜来解决这个问题。我们的理论是基于最近的工作,综合梯度(IG)[ STY17] ,在公理归因于一个神经网络的输出变化的输入变化。在经典的鲁棒优化模型中,我们提出训练目标来获得鲁棒 IG 属性。我们的目标提供了原则性的概括以前的目标设计的稳健预测,他们自然退化到经典的软边际训练的一层神经网络。我们还推广了以往的理论,证明了不同鲁棒优化模型的目标是密切相关的。实验证明了我们的方法的有效性,也指出了有趣的问题,暗示需要更好的优化技术或更好的神经网络架构的鲁棒性归因训练。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Trustworthy+Machine+Learning:+Robustness,+Generalization,+and+Interpretability)|0| -|[Large-Scale Graph Neural Networks: The Past and New Frontiers](https://doi.org/10.1145/3580305.3599565)|Rui Xue, Haoyu Han, Tong Zhao, Neil Shah, Jiliang Tang, Xiaorui Liu|Graduate School of Agricultural and Life Sciences, The University of Tokyo, Yayoi 1-1-1, Bunkyo-ku, Tokyo 113-8657, Japan; Maekawa Manufacturing Co., Ltd., Tatsuzawa 2000, Moriya-shi, Ibaraki-Prefecture 302-0118, Japan; College of Biosystems Engineering and Food Science, Zhejiang University, Huajiachi Campus, Hangzhou, PR China|Highlights ► A real-time detection method by UV–Vis spectroscopy was developed for monitoring of ATP and viable cells on the meat surface. ► A linear relationship was observed between the ATP amount and plate count, with a determination coefficient of 0.95. ► The 2nd derivative of reflectance spectra gave a high correlation for the first 48 h with both ATP amount and viable cell count at 318 nm. Abstract Cleanliness monitoring at slaughterhouses depends on traditional methods, e.g., visual inspection or swabbing. Visual inspection is not always accurate. Swabbing requires skilled workers and a subsequent plate count or ATP bioluminescence assay. To solve these problems, a rapid technique based on non-destructive UV–Vis reflectance was developed to monitor ATP and viable cells. Samples were the lean part of pork loin. The samples stored at 15 °C were analyzed at 0, 24, 48, 72, 84 and 96 h for ATP, plate count and UV–Vis reflectance. The reflectance spectra were measured from 240 to 540 nm at 20 °C, and then an area of 40 × 40 mm 2 of the sample surface was swabbed for the determination of plate count and ATP amount. The plate count on the sample surface increased from the initial count of 29 to 3.2 × 10 7 CFU/cm 2 after 84 h. The ATP amount also increased with time, from the initial amount of 9.2 × 10 −15 to 2.8 × 10 −10 mol/cm 2 after 84 h. A linear relationship was observed between the ATP amount and plate count, with a determination coefficient of 0.95. The 2nd derivative of the raw spectra at 318 nm gave a high correlation with both ATP amount and viable cell count for the first 48 h, with determination coefficients of 0.89 and 0.83, respectively. The results strongly suggested that UV–Vis reflectance spectrum analysis at an optimal wavelength could be used for real-time monitoring of ATP and/or plate count on the meat surface. Keywords ATP Sanitation Real-time monitoring Non-destructive detection Pork Spectroscopy Plate count Absorbance Reflectance Meat Quality 1 Introduction Muscle foods that include both meat and poultry are an integral part of the human diet and have been so for several thousand years. However, within the past two decades public concern, as well as awareness, has been raised due to high-profile food safety issues such as the BSE and foot and mouth epidemics (Fox, 2001; Pickrell and Enserink, 2001).
These outbreaks, along with concerns over specific pathogenic bacteria within meats, have illustrated the need for a rapid and accurate detection system for microbial spoilage of meats within what is a large-scale production industry whose turnover is billions of £ and $ per annum (Ellis and Goodacre, 2001). The major role of microorganisms in the spoilage of food and the role of food as a vector for the transmission of microbes responsible for food-borne disease are well recognized. At slaughterhouses for poultry, pork and beef, monitoring of cleanliness depends mainly on traditional methods of visual inspection, swabbing and subsequent viable cell counting or ATP bioluminescence (Hawronskyj and Holah, 1997). This is especially important for microbial hazards associated with food processing. In the case of poultry, pork and beef processing, verification of the efficacy of preventive measures to reduce or eliminate microbial hazards may be achieved by routine carcass analysis using a cultural method, i.e., the classical Standard Plate Count (Bautista et al., 1997). However, the development of more rapid, 'real-time' methods for microbiological quality control has been of interest to scientists ever since routine microbiological analysis was first applied to foods. Rapid detection methods based on the detection of whole cells or their metabolites can be divided into two main classes: direct methods are based on the detection of cells with or without incubation, and indirect methods are based on the measurement of metabolic products or other changes caused by cell growth (Vanne et al., 1996). Although rapid detection methods have been under development, conventional methods for microbial monitoring are still used on slaughterhouse job sites. Such methods usually require operator skill, long analysis times and high expense. Moreover, visual inspection is not always accurate, and swabbing requires a skilled worker and a further plate count analysis, which usually takes 24–48 h. The conventional microbiological approach to food sampling has changed over the last half century, and it has been estimated that there are currently in excess of 40 methods to measure and detect bacterial spoilage in meats (Jay, 2005; Nychas et al., 1988). The development of rapid microbiological test procedures over the last two decades can be divided into two main groups: enumeration and presence–absence tests. Several commercial presence–absence (P-A) test kits are available; Clark and El-Shaarawi (1993) evaluated them over a 6-month period in 1990, using the Ontario Ministry of the Environment P-A test for comparison. Current rapid enumeration methods are generally based on microscopy, ATP bioluminescence or the measurement of electrical phenomena (Ellis and Goodacre, 2001). The use of an ATP bioluminescence assay is a logical approach and relies on the fact that all living cells contain adenosine 5'-triphosphate (ATP), a universal energy donor for metabolism (Bautista et al., 1997). Detection of the high-energy molecule ATP extracted from cells is a widely used indirect assay method. The ATP amount is measured as the light energy released by the luciferin–luciferase system in the presence of magnesium ions (Stanley, 1989). The assay is rapid: only a few seconds in hygiene monitoring applications and less than an hour for most other samples.
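To make the quantification concrete, the sketch below shows how a luminometer reading in relative light units (RLU) could be converted to an ATP amount through a log-log standard curve, as is done with the ATP standard solution in Section 2.5.3. This is an illustrative sketch only: numpy is assumed, and the calibration and sample values are hypothetical, not data from this study.

```python
# Illustrative only: hypothetical standard-curve data, not measurements from this study.
import numpy as np

# Known ATP amounts (mol per 100 ul) from serial dilutions of a standard solution,
# paired with hypothetical luminometer readings (RLU).
std_atp_mol = np.array([1e-16, 1e-15, 1e-14, 1e-13, 1e-12, 1e-11])
std_rlu = np.array([3.2e1, 2.9e2, 3.1e3, 2.8e4, 3.0e5, 2.9e6])

# Luciferase light output is roughly proportional to ATP, so a straight line
# in log-log space is a reasonable calibration model.
slope, intercept = np.polyfit(np.log10(std_atp_mol), np.log10(std_rlu), 1)

def rlu_to_atp_mol(rlu):
    """Invert the log-log standard curve: RLU -> ATP amount (mol per 100 ul)."""
    return 10 ** ((np.log10(rlu) - intercept) / slope)

# Mean of two replicate readings per swab, mirroring the protocol in Section 2.5.3.
sample_rlu = np.mean([4.1e4, 3.9e4])
print(f"Estimated ATP: {rlu_to_atp_mol(sample_rlu):.2e} mol/100 ul")
```

Fitting in log-log space keeps the calibration stable across the several orders of magnitude spanned by the dilution series.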
This technology was previously thought to have limitations because ATP is present in all viable cells: intrinsic ATP originating from sources other than the target cells must be removed enzymatically before the assay (Vanne et al., 1996). Siragusa et al. (1996) stated that the major challenge in using microbial ATP as a means of determining total microbial populations in food samples is the separation of non-microbial ATP from microbial ATP. The basis of their Rapid-microbial ATP assay was a filtration device in which somatic ATP was extracted; then, within the same device, extraction of bacterial ATP was followed by its quantification. In the case of microscopic methods, sophisticated techniques have been developed in which microorganisms are stained with fluorescent dyes and viewed with an epifluorescent microscope. ATP bioluminescence works by measuring ATP levels in bacterial cells in culture in order to calculate the number of cells present in that culture (Champiat et al., 2001; de Boer and Beurmer, 1999; D'Souza, 2001; Siragusa et al., 1996). The problem with this method is that ATP is the primary energy source of all living cells, and the food samples themselves will also contain large amounts of this chemical, which has to be destroyed before microbial ATP can be measured. Consequently, the measurement of ATP bioluminescence is probably best suited to the detection of contaminated surfaces on equipment and machinery associated with food production and preparation (Ellis and Goodacre, 2001). When only viable cells are to be detected, the above-mentioned limitation of ATP bioluminescence technology and the drawback of the microscopic method have to be taken into account. However, the total amount of ATP originating from both meat and viable cells is itself important in cleanliness evaluation, because ATP of meat origin acts as a nutrient source for the bacteria that cause spoilage. Owing to the advantages of being non-destructive, requiring no chemical preparation and offering fast inspection, spectroscopy has been studied extensively for determining properties of agricultural products, though less for meat products than for plant materials (Chan et al., 2002). According to the literature, VIS/NIRS technology has been used in pork to determine intramuscular fat (Hoving-Bolink et al., 2005; Savenije et al., 2006), fatty acid composition (Fernandez-Cabanas et al., 2007; Gonzalez-Martin et al., 2003, 2005), color (Cozzoliono et al., 2003), water-holding capacity (Brondum et al., 2000), presence of the RN − genetic allele (Josell et al., 2000), and Duroc and Iberian pork neural network classification (del Moral et al., 2009), but it has not been applied to the direct qualitative classification of meats of varied quality and price (del Moral et al., 2009). Moreover, only a few reports are available on the determination of food product quality using reflectance data. Against this background, the objective of this study was to develop a real-time detection method for monitoring ATP and viable cells on the meat surface using reflectance spectra, for use in sanitation management. 2 Materials and methods 2.1 Meat samples The lean part of pork loin, sliced 5 mm thick, was obtained from a retailer. The pig had been slaughtered 3 days earlier, and the meat had been kept under marketing conditions at the retail shop.
A total of 24 sliced samples were cut into pieces of about 6 × 6 cm 2 and individually placed in sterilized Petri dishes. 2.2 Experimental setup The samples were separated into six groups of four samples each and were stored in a constant-temperature chamber at 15 °C. The storage temperature was selected as the highest temperature in a working room of a slaughterhouse, where the temperature is usually controlled between 10 and 15 °C in consideration of workers' health, according to our conversation with the slaughterhouse management. Measurements were conducted after 0, 24, 48, 72, 84 and 96 h of storage. Each value shown is the mean of four pieces. The experiment was repeated three times to validate the results, and similar results were obtained in all repetitions. Here, for simplicity, results of only one experiment are shown. 2.3 UV–Vis reflectance spectrum A dual-beam spectrometer (UV-3600, Shimadzu Co., Kyoto, Japan) equipped with an integrating sphere was used for recording the reflectance spectrum from the meat sample surface (9 × 20 mm 2 ). The measured wavelength range was 240–1200 nm with a resolution of 2 nm; however, only the results from 240 to 540 nm are shown in Section 3. In order to confirm the maximum absorption wavelength of ATP, the transmittance of serial dilutions of ATP standard solution (LL-100-1, TOYO B-Net Co., Tokyo, Japan) was measured in 10 mm quartz cells. 2.4 Spectral data pre-treatment Spectral data are often pre-processed to reduce undesirable systematic noise, such as baseline variation, light scattering and path-length differences, and to enhance the contribution of the chemical composition (Tigabu and Oden, 2002). In this study, two types of pre-processing were employed: Savitzky–Golay 1st and 2nd derivatives. In our case, a possible source of systematic variation is the slight path-length difference arising from the positioning of individual meat samples of slightly different sizes during scanning. 2.5 Sampling protocol and microbiological analysis 2.5.1 Sampling protocol Sampling of material on the pork meat surface (40 × 40 mm 2 ), covering the area used for the spectroscopic measurement, was carried out using a swab technique. To ensure adequate sampling, the sample was swabbed in a horizontal pattern and again in a vertical pattern, with the swab rotated between the index finger and thumb in a back-and-forth motion, according to Bautista et al. (1997). The end of the cotton bud used for swabbing was cut off into 9 ml of sterilized water, and the swab sample was stirred well for the subsequent plate count and ATP determination. 2.5.2 Plate count Serial dilutions of the swab sample were prepared from the phosphate buffer solution in which the swab was immersed, and 1 ml of each dilution was dispensed onto Petrifilms™ (AC plate, Sumitomo 3M Ltd., Tokyo, Japan) for total aerobic counts. The Petrifilms™ were incubated for 48 h at 35 °C. 2.5.3 ATP bioluminescence assay One hundred microliters of the swab sample (phosphate buffer solution in which the swab was immersed) was injected into a fresh cuvette placed in a luminometer (Luminescenser MCA, Atto Corporation, Tokyo, Japan), and then 100 μl of Extractant (LL-100-2, Toyo B-Net Co. Ltd., Tokyo, Japan) was added. After 10 s, 100 μl of luciferin–luciferase complex (LL-100-1, Toyo B-Net Co. Ltd., Tokyo, Japan) was added, and the light output was measured. From each swab, two measurements were taken and the means were calculated to determine relative light units (RLU).
The RLU was then converted into the amount of ATP using a standard curve constructed with ATP standard solution (LL-100-1, Toyo B-Net Co. Ltd., Tokyo, Japan) in the range of 10 −16 –10 −11 mol/100 μl. 2.6 Statistical analysis Four pieces of pork meat were selected at random for each storage time period. Regression analysis was carried out to determine the relationship between ATP content and plate count. The raw spectra contained background effects; they were therefore transformed using the 1st and 2nd derivatives, and the better of the two was selected. 3 Results 3.1 Plate count The plate count on the sample meat surface increased with storage time. At the outset of the experiment, the initial count was 29 CFU/cm 2 , and after 84 h of storage it was 3.2 × 10 7 CFU/cm 2 . 3.2 ATP content The amount of ATP increased with storage time from the initial amount of 9.2 × 10 −15 to 2.8 × 10 −10 mol/cm 2 (84 h after storage). A linear relationship was observed between the amount of ATP and the plate count, with a determination coefficient (R 2 ) of 0.95, as shown in Fig. 1 . 3.3 Absorption maximum of pure ATP The transmittance of ATP solutions at concentrations from 1 × 10 −4 to 5.85 × 10 −6 M is shown in Fig. 2 . The transmittance decreased with an increase in ATP concentration, and the spectra taken for all samples of different ATP concentrations showed that the maximum absorbance, corresponding to the decrease in transmittance, was at 260 nm ( Fig. 2 ). 3.4 Estimation of ATP and plate count from reflectance The reflectance spectra obtained at 0–84 h of storage are shown in Fig. 3 for the UV–Vis range (240–540 nm). There was very little difference between the reflectance at 0 h and that at 24 h. The reflectance of samples taken at 48, 72 and 84 h, however, showed a decreasing trend with increasing storage time. The 2nd derivative of the reflectance data, selected as the better of the 1st and 2nd derivatives, is shown in Fig. 4 . Many upward and downward peaks were observed, and a correlation analysis of the peaks at 298, 318, 344 and 374 nm in the UV range was conducted. Fig. 5 shows the correlation coefficient between the 2nd derivative of reflectance and log (ATP). This gave a high correlation between the values of the 2nd derivative and log (ATP). Considering the bathochromic shift, any of these four wavelengths could correspond to the absorption maximum of ATP. 4 Discussion Spectroscopic methods have gained importance in the evaluation of food quality attributes during the last decades (Nadai, 1983; Nadai and Mihalyi-Kengyel, 1984). Although NIR spectra reflect several parameters relating to the complex quality of food (Williams and Norris, 2001), information on ATP and/or microorganisms cannot be detected in the NIR range. Therefore, in the present study, UV–Vis was applied in the range from 240 to 540 nm. 4.1 Plate count In this study, samples were evaluated as fresh as long as bacterial counts remained below the boundary of 10 7 CFU/g and no putrid odor could be perceived. After 72 h, the plate count reached the order of 10 7 CFU/g and samples gave off a faint putrid odor. These samples were in the initial stage of spoilage and would be regarded as unacceptable. Plate count is a fundamental index of meat spoilage, and a count of 10 7 CFU/g in meat is regarded as unacceptable (Brown, 1982). Detection at the order of 10 6 CFU/g is important, as this level is reached just before the meat becomes unacceptable.
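As an aside on the pre-treatment selected in Section 3.4: the Savitzky–Golay 2nd derivative of a reflectance spectrum takes only a few lines. The sketch below is a rough illustration, assuming scipy and numpy; the spectra are synthetic placeholders, and the window length and polynomial order are arbitrary choices rather than settings from this study.

```python
# Illustrative only: synthetic spectra stand in for the measured reflectance data.
import numpy as np
from scipy.signal import savgol_filter

wavelengths = np.arange(240, 541, 2)  # 240-540 nm at 2 nm resolution, as in Section 2.3
rng = np.random.default_rng(0)
spectra = rng.uniform(0.2, 0.6, (6, wavelengths.size))  # 6 storage times x spectral bands

# Savitzky-Golay 2nd derivative along the wavelength axis;
# delta is the 2 nm sampling step, so the derivative is per nm.
d2 = savgol_filter(spectra, window_length=11, polyorder=3, deriv=2, delta=2.0, axis=1)

# Correlate the derivative value at 318 nm with log10(ATP) across storage times
# (only the first two ATP values below come from the paper; the rest are placeholders).
log_atp = np.log10([9.2e-15, 4.0e-14, 7.5e-13, 2.1e-11, 2.8e-10, 9.0e-10])
idx_318 = int(np.argmin(np.abs(wavelengths - 318)))
r = np.corrcoef(d2[:, idx_318], log_atp)[0, 1]
print(f"Correlation at 318 nm: r = {r:.2f}")
```

With real spectra, repeating this correlation at each wavelength reproduces the kind of curve shown in Fig. 5, from which the best-correlated wavelength (here 318 nm) is selected.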
Fresh meats generally have a pH range between 5.5 and 5.9 and contain sufficient glucose and other simple carbohydrates to support approximately 10 9 CFU/cm 2 . The organisms that grow fastest and utilize glucose at refrigeration temperatures are the pseudomonads (Gill and Newton, 1977; Jay, 2005; Seymour et al., 1994). At levels of 10 7 CFU/cm 2 , off-odors may become evident in the form of a faint 'dairy' type aroma, and once the surface population of bacteria has reached 10 8 CFU/cm 2 , the supply of simple carbohydrates has been exhausted and recognizable off-odors develop, leading to what is known as 'sensory' spoilage (Jackson et al., 1997; Jay, 2005; Stanbridge and Davies, 1998). The development of off-odors depends on the extent to which free amino acid utilization has occurred, and these odors have been variously described as dairy/buttery/fatty/cheesy at 10 7 CFU/cm 2 , through a sickly sweet/fruity aroma at 10 8 CFU/cm 2 , to a finally putrid odor at 10 9 CFU/cm 2 (Adams and Moss, 2007; Dainty et al., 1985). 4.2 ATP content Fig. 1 shows the linear relationship between log 10 ATP and log 10 plate count. From this figure, both the ATP analysis and the plate count method were able to assess the hygiene of the pork meat samples. The ATP analysis provides only an estimate of the total bacterial count and cannot differentiate between bacteria (Baumgart, 1993). Theoretically, ATP amounts as low as 100 fg (10 −13 g) can be measured, corresponding to about 100 bacterial cells. Under practical conditions the sensitivity is about 1000 fg (10 −12 g), which corresponds to about 1000 bacterial cells or one to two yeast cells (Heeschen et al., 1991). Stressed cells and cells in the stationary growth phase contain less ATP, which also affects the results (Bulte and Reuter, 1985). On the other hand, the amount of ATP in a sample provides an estimate of the active microbial population, which is important when considering the shelf life of the product. Stressed cells can also be allowed to resuscitate before the ATP assay (Graumlich, 1985). The enzyme luciferase converts the chemical energy provided by ATP into light in a stoichiometric reaction. Thus, the amount of light produced is proportional to the concentration of ATP present, which, in turn, is directly related to the number of cells in the sample (Bautista et al., 1997). ATP bioluminescence is also useful for monitoring microbial contamination in scalding and chilling tanks within a meat processing operation. In ATP bioluminescence assays for carcass contamination and process water quality, microbial cells are removed by filtration before they are lysed to release intracellular ATP. To simplify the method, it would be desirable if this step could be eliminated to allow direct detection of ATP on swabs of the carcass surface, in much the same way as in ATP bioluminescence hygiene monitoring tests (Griffiths, 1996). However, there would be no way of differentiating ATP from microbial and non-microbial sources using a swab assay, although results would be obtained within 2 min, as opposed to the 10–15 min required when a filtration step is incorporated (Bautista et al., 1997). Siragusa et al. (1996) developed a segmented-model statistical approach to determine the lower limits of assay sensitivity and used this model to analyze in-plant data. According to them, the rapid microbial-ATP test responded in a linear fashion to levels of microbial contamination of >log 10 3.2 aerobic CFU/cm 2 for pork carcasses.
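The log 10 –log 10 linearity of Fig. 1 discussed above amounts to fitting a straight line to paired logarithms and reporting the determination coefficient; a minimal sketch, assuming numpy and hypothetical paired swab measurements:

```python
# Illustrative only: hypothetical ATP / plate-count pairs, not the study's data.
import numpy as np

atp_mol_per_cm2 = np.array([9.2e-15, 4.0e-14, 7.5e-13, 2.1e-11, 2.8e-10])
cfu_per_cm2 = np.array([2.9e1, 1.1e3, 4.0e4, 2.5e6, 3.2e7])

# Straight-line fit in log-log space, as in Fig. 1.
x, y = np.log10(atp_mol_per_cm2), np.log10(cfu_per_cm2)
slope, intercept = np.polyfit(x, y, 1)

# Determination coefficient R^2 of the fit.
y_hat = slope * x + intercept
r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - np.mean(y)) ** 2)
print(f"log10(CFU) = {slope:.2f} * log10(ATP) + {intercept:.2f},  R^2 = {r2:.2f}")
```

With the study's actual swab pairs, this is the fit that yields the reported determination coefficient of 0.95.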
4.3 Absorption maximum of pure ATP As shown in Fig. 2 , the transmittance decreased with an increase in ATP concentration, and the spectra of the different concentrations showed the minimum transmittance (i.e., maximum absorbance) at 260 nm. This wavelength of 260 nm is in accordance with the absorbance maximum of ATP (259 nm) previously reported by Bagshaw (2001). 4.4 Estimation of ATP and plate count from reflectance Reflectance ( Fig. 3 ) showed a decreasing trend with time in the UV–Vis range, although there was very little difference between the reflectance at 0 h and that at 24 h. To remove the background effect, the raw data were transformed using the 1st and 2nd derivatives; the 2nd derivative was chosen because the effect was clearer. The 2nd derivative technique is often used to process NIR data. It helps to separate overlapping absorption bands, remove baseline shifts and increase apparent spectral resolution (Lin et al., 2004), although derivatives are notoriously sensitive to noise (Tsai and Philpot, 1998). Many upward and downward peaks were observed when the 2nd derivative of the raw reflectance spectra was taken for all storage time periods (from 0 to 96 h) ( Fig. 4 ). A correlation analysis of the peaks at 298, 318, 344 and 374 nm was conducted; these selected wavelengths are all in the UV range, i.e., below 400 nm. For all selected wavelengths, the greatest differences were obtained between 0 and 96 h, and the maximum difference was at 318 nm; this wavelength could best differentiate the samples at 0 and 96 h. Fig. 5 shows the correlation coefficient between the 2nd derivative of reflectance and log (ATP). This gave a high correlation between the value of the 2nd derivative and log (ATP). Considering the bathochromic shift, any of these four wavelengths could correspond to the absorption maximum of ATP. On the other hand, it is widely known that the spectral absorption of ATP is usually masked by protein absorbance and cannot be exploited in spectroscopic studies (Bagshaw, 2001). However, the graph of the correlation coefficient between the 2nd derivative of reflectance and log (plate count), shown in Fig. 6 , is very similar in shape to Fig. 5 . This indicates that the 2nd derivative of reflectance carries information on the ATP in viable cells, an interpretation also supported by the finding that the amount of ATP corresponded to the plate count ( Fig. 1 ). From these considerations, the wavelength of 318 nm, showing the highest correlation coefficient, was selected. The linear relationship between the value of the 2nd derivative and log (ATP) for the first 48 h at 318 nm is shown in Fig. 7 , with a determination coefficient of 0.89. A similar relationship was also observed between the value of the 2nd derivative and log (plate count) for the first 48 h at 318 nm, with a determination coefficient of 0.83. The first 48 h were chosen because the pork samples were still fresh during this period. From these results, it is expected that selection of an appropriate wavelength could enable real-time monitoring of ATP and/or viable cell count on the meat surface from reflectance information. The plate count gives an estimate of microbial contamination, whereas the ATP bioluminescence method used in this study measures total ATP, from both microbial and non-microbial sources, and may be a better measure of the overall cleanliness of the carcass.
Therefore, an exact relationship between the two methods should not be expected, and results obtained from the two assay systems should be interpreted separately. Multiple linear regression using reflectance at more than one wavelength is a powerful tool for estimating ATP and/or viable cell count on the meat surface and can give higher predictive power. However, such an approach may lead to overfitting. Accordingly, in this study, only one wavelength (i.e., 318 nm) was selected for the prediction of ATP. 5 Conclusions A real-time detection method for monitoring ATP and viable cells on the meat surface using reflectance spectra was developed. The data showed that the plate count on the sample meat surface increased and corresponded closely to the increase in the amount of ATP during 84 h of storage at 15 °C. The linear relationship between the amount of ATP and the plate count was supported by a determination coefficient of 0.95. Reflectance showed a decreasing trend with time in the UV–Vis range, and at the 318 nm peak the 2nd derivative of reflectance was highly correlated with log (ATP). As a similarly high correlation was also observed between the 2nd derivative of reflectance and log (plate count), it is suggested that the 2nd derivative of reflectance carries information on the ATP in viable cells. From these observations, a linear relationship was obtained for estimating the amount of microbially derived ATP on the basis of reflectance analysis of the meat surface. Hence, the developed technique provides a powerful means of monitoring cleanliness at a slaughterhouse. Acknowledgements This research was partly funded by the Japan Society for the Promotion of Science (JSPS) Grant No. 19:07178 . References Adams and Moss, 2007 Adams, M.R., Moss, M.O., 2007. Food Microbiology, third ed. The Royal Society of Chemistry, Cambridge, pp. 138. Bagshaw, 2001 C.R. Bagshaw ATP analogues at a glance Journal of Cell Science 114 2001 459 460 Baumgart, 1993 J. Baumgart Lebensmitteluberwachung und–qualitatssicherung Mikrobiologisch- hygienische Schnellverfahren Fleischwirtschaft 73 1993 292 396 Bautista et al., 1997 D.A. Bautista D.W. Sprung S. Barbut M.W. Griffiths A sampling regime based on an ATP bioluminescence assay to assess the quality of poultry carcasses at critical control points during processing Food Research International 30 1997 803 809 Brondum et al., 2000 J. Brondum L. Munck P. Henckel A. Karlsson E. Tornberg S.B. Engelsen Prediction of water-holding capacity and composition of porcine meat by comparative spectroscopy Meat Science 55 2000 177 185 Brown, 1982 Brown, M. H. (1982). Meat microbiology. (p. 410). New York: Applied Science Publications. Bulte and Reuter, 1985 M. Bulte G. Reuter The bioluminescence as a rapid method for the determination of the microflora of meat International Journal of Food Microbiology 2 1985 371 381 Champiat et al., 2001 D. Champiat N. Matas B. Monofort H. Fraass Applications of bioluminescence to HACCP Luminescence 16 2001 193 198 Chan et al., 2002 D.E. Chan P.N. Walker E.W. Mills Prediction of pork quality characteristics using visible and NIR spectroscopy Transactions of ASAE 45 2002 1519 1527 Clark and El-Shaarawi, 1993 J.A. Clark A.H. El-Shaarawi Evaluation of commercial presence-absence test kits for detection of total coliforms, Escherichia coli, and other indicator bacteria Applied and Environmental Microbiology 59 2 1993 380 388 Cozzoliono et al., 2003 D. Cozzoliono N. Barlocco A. Vadell F. Ballesteros G.
Gallieta The use of visible and near-infrared reflectance spectroscopy to predict colour on both intact and homogenized pork muscle LWT-Food Science and Technology 36 2003 195 202 de Boer and Beurmer, 1999 E. de Boer R.R. Beurmer Methodology for detection and typing of food borne microorganisms International Journal of Food Microbiology 50 1999 119 130 del Moral et al., 2009 F.G. del Moral A. Guillen K.G. del Moral F. O’Valle L. Martinez R.G. del Moral Duroc and Iberian pork neural network classification by visible and near infrared reflectance spectroscopy Journal of Food Engineering 90 2009 540 547 Dainty et al., 1985 R.H. Dainty R.A. Edwards C.M. Hibbard Time course of volatile compound formation during refrigerated storage of naturally contaminated beef in air Journal of Applied Bacteriology 59 1985 303 309 D’Souza, 2001 S.F. D’Souza Microbial biosensors Biosensors and Bioelectronics 16 2001 337 353 Ellis and Goodacre, 2001 D.I. Ellis R. Goodacre Rapid and quantitative detection of the microbial spoilage of muscle foods: Current status and future trends Trends in Food Science & Technology. 12 2001 414 424 Fernandez-Cabanas et al., 2007 V.M. Fernandez-Cabanas A. Garrido-Varo J. Gracia-Olmo E. De Pedro P. Dardenne Optimisation of the spectral pre-treatments used for Iberian pig fat NIR calibration Chemometrics and Intelligent Laboratory System 87 2007 104 112 Fox, 2001 S. Fox WHO to convene on worldwide risk of BSE and CJD Infections in Medicine. 18 2001 69 Gonzalez-Martin et al., 2003 I. Gonzalez-Martin C. Gonzalez-Perez J. Hernandez-Mendez N. Alvarez-Gracia Determination of fatty acids in the subcutaneous fat of Iberian breed swine by near infrared spectroscopy (NIRS) with a fiber-optic probe Meat Science 65 2003 713 719 Gonzalez-Martin et al., 2005 I. Gonzalez-Martin C. Gonzalez-Perez N. Alvarez-Gracia J.M. Gonzalez-Cabrera On-line determination of fatty acids composition in intramuscular fat of Iberian pork loin by NIRS with a remote reflectance fibre optic probe Meat Science 65 2005 713 719 Gill and Newton, 1977 C.O. Gill K.G. Newton The development of aerobic spoilage flora on meat stored at chill temperatures Journal of Applied Bacteriology 43 1977 189 195 Graumlich, 1985 T.R. Graumlich Estimation of microbial populations in orange juice by bioluminescence Journal of Food Science 50 1985 116 117, 124 Griffiths, 1996 M.W. Griffiths The role of ATP bioluminescence in the food industry: new light on old problems Food Technology 50 6 1996 62 72 Hawronskyj and Holah, 1997 J.M. Hawronskyj J. Holah ATP: A universal hygiene monitor Trends in Food Science & Technology 8 1997 79 84 Heeschen et al., 1991 W.H. Heeschen G. Suhren G. Hahn Rapid methods in the dairy industry A. Vaheri R.C. Tilton A. Balows Rapid methods and Automation in Microbiology and Immunology 1991 Springer-Verlag Berlin Heidelberg 520 532 Hoving-Bolink et al., 2005 A.H. Hoving-Bolink H.W. Vedder J.W.M. Merks W.J.H. de Klein H.G.M. Reimert R. Frankhuizen W.H.A.M. van den Broek en E. Lambooji Perspective of NIRS measurements early post mortem for prediction of pork quality Meat Science 69 2005 417 423 Jackson et al., 1997 T.C. Jackson G.R. Acuff J.S. Dickson Meat, poultry, and seafood M.P. Doyle L.R. Beuchat T.J. Montville Food microbiology: fundamentals and frontiers 1997 ASM Press Washington DC 83 100 Jay, 2005 J.M. Jay Modern food microbiology sixth ed. 2005 Aspen Publishers Maryland Josell et al., 2000 A. Josell L. Martinsson C. Borggaard J.R. Anderson E.
Tornberg Determination of RN - phenotype in pigs at slaughter-line using visual and near-infrared spectroscopy Meat Science 55 2000 273 278 Nychas et al., 1988 G.J. Nychas V.M. Dillon R.G. Board Glucose, the key substrate in the microbiological changes occurring in meat and certain meat products Biotechnology and applied biochemistry 10 1988 203 231 Pickrell and Enserink, 2001 J. Pickrell M. Enserink Foot-and-mouth disease – UK outbreak is latest in global epidemic Science 291 2001 1677 Savenije et al., 2006 B. Savenije G.H. Geesink J.G.P. van der Palen G. Hemke Prediction of pork quality using visible/near infrared reflectance spectroscopy Meat Science 73 2006 181 184 Seymour et al., 1994 I.J. Seymour M.B. Cole P.J. Coote A substrate-mediated assay of bacterial proton efflux/influx to predict the degree of spoilage of beef mince stored at chill temperatures Journal of Applied Bacteriology 76 1994 608 615 Siragusa et al., 1996 G.R. Siragusa W.J. Dorsa C.N. Cutter Perino L.J. Kooh-maraie Use of a newly developed rapid microbial ATP bioluminescence assay to detect microbial contamination on poultry carcasses Journal of Bioluminescence and Chemilumoscence 11 1996 297 301 Stanbridge and Davies, 1998 L.H. Stanbridge A.R. Davies The microbiology of chill-stored meat Davies R. Board The microbiology of meat and poultry 1998 Blackie Academic & Professional London 174 219 Stanley, 1989 P.E. Stanley A review of bioluminescent ATP techniques in rapid microbiology Journal of Bioluminescence and Chemiluminescence 4 1989 375 380 Vanne et al., 1996 L. Vanne M. Karwoski S. Karppinen A.M. Sjoberg HACCP-based food quality control and rapid detection methods for microorganisms Food Control 7 1996 263 276|建立了一种紫外-可见光谱实时检测肉表面 ATP 和活细胞的方法。ATP 含量与平板计数呈线性关系,测定系数为0.95。反射光谱的二阶导数与前48h 的 ATP 含量和318nm 的活细胞计数均有较高的相关性。摘要屠宰场的清洁度监测依赖于传统的方法,例如目视检查或擦拭。目视检查并不总是准确的。采样需要熟练的工人和进一步的平板计数或 ATP 生物发光技术。为了解决这些问题,开发了一种基于无损紫外-可见光反射的快速检测技术来监测 ATP 和活细胞。样本为猪腰瘦肉。分别在0、24、48、72、84和96h 分析15 °C 保存的样品的 ATP、平板计数和紫外-可见光反射率。测定了样品在20 °C 下240 ~ 540nm 范围内的反射光谱,然后采集样品表面40 × 40mm2的面积,测定平板计数和 ATP 含量。84h 后,样品表面的平板计数由最初的29个增加到3.2 × 107CFU/cm2。84h 后,ATP 含量从最初的9.2 × 10 ~ (-15) mol/cm2增加到2.8 × 10 ~ (-10) mol/cm2。ATP 含量与平板计数呈线性关系,测定系数为0.95。原始光谱的二阶导数与 ATP 含量和活细胞计数在318nm 处的相关系数分别为0.89和0.83。实验结果表明,紫外-可见光反射光谱分析可以用于实时监测肉品表面 ATP 和/或平板计数,并且具有最佳波长。关键词 ATP 卫生实时监测无损检测猪肉光谱板计数吸光度反射率肉类质量1简介包括肉类和家禽的肌肉食品是人类饮食的一个组成部分,并已经存在了几千年。然而,在过去的二十年中,由于诸如疯牛病和口蹄疫等引人注目的食品安全问题(Fox,2001; Pickrell and Enserink,2001) ,公众的关注和意识已经提高。这些疾病的爆发,以及对肉类中特定病原菌的担忧,说明了在这个大规模生产行业中,对肉类微生物腐败的快速和准确检测系统的要求,这个行业每年的营业额达数十亿美元(Ellis and Goodacre,2001)。微生物在食物变质中的主要作用以及食物作为传播导致食源性疾病的微生物的媒介的作用已得到公认。在家禽、猪肉和牛肉的屠宰场,清洁度的监测主要依赖于传统的目视检查、拭子和随后的活细胞计数或 ATP 生物发光技术(Hawronskyj and Holah,1997)。这对于与食物加工过程有关的微生物危害尤其重要。就家禽、猪肉和牛肉加工而言,可以通过使用培养方法(即经典的标准盘计数法)进行常规屠体分析,来验证预防措施减少或消除微生物危害的有效性(Bautista et al。 ,1997)。然而,自从常规微生物分析应用于食品以来,科学家一直对发展更快速的“实时”微生物质量控制方法感兴趣。基于检测整个细胞或其代谢物的快速检测方法可以分为两大类: 直接方法基于检测有或没有孵育的细胞,间接方法基于测量代谢产物或由细胞生长引起的其他变化(Vanne 等,1996)。虽然快速检测方法正在发展中,传统的微生物监测方法在屠宰场的工作现场使用。然而,这些方法往往需要操作人员的技能,分析时间长,费用高。此外,目测检查并不总是准确的,擦拭需要熟练的工人和进一步的板计数分析,通常需要24-48小时。在过去的半个世纪中,传统的食品采样微生物学方法已经发生了变化,据估计,目前有超过40种方法来测量和检测肉类中的细菌变质(Jay,2005; Nychas 等,1988)。近二十年来微生物快速检测技术的发展可分为两大类: 计数检测和存在-缺失检测。1990年,通过使用安大略省环境部的 P-A 测试,Clark 和 El-Shaarawi (1993)比较了几种商业存在缺失(P-A)测试试剂盒区域,并在6个月内进行了评估。目前的快速计数方法一般基于显微镜、 ATP 生物发光或电现象的测量(Ellis and Goodacre,2001)。ATP 生物发光测定的使用是一种合乎逻辑的方法,并且依赖于所有活细胞都含有腺苷5’-三磷酸(ATP)的事实,ATP 是代谢的通用能量供体(Bautista 
等,1997)。检测从细胞中提取的高能分子三磷酸腺苷(ATP)是一种广泛使用的间接测定方法。三磷酸腺苷(ATP)量是以镁离子存在下荧光素-荧光素酶系统释放的光能量来测量的(Stanley,1989)。该分析是快速的,只有几秒钟的卫生监测应用和不到一个小时的大多数其他样品。以前,人们认为这种技术有局限性,因为 ATP 存在于所有活细胞中。因此,来自靶细胞的内源性 ATP 必须在测定前被酶去除(Vanne 等,1996)。Siragusa 等(1996)指出,使用微生物 ATP 作为确定食品样品中微生物总数的手段的主要挑战是从微生物 ATP 中分离非微生物 ATP。他们描述的快速微生物 ATP 测定的基础是使用过滤装置提取体细胞 ATP: 然后在同一装置内,提取细菌 ATP 后进行定量。在显微镜方法的情况下,复杂的技术已经开发出来,其中微生物被荧光染料染色,并用荧光显微镜观察。ATP 生物发光通过测量培养细菌细胞中的 ATP 水平来计算该培养物中存在的细胞数量(Champiat 等,2001; de Boer 和 Beurmer,1999; D’Souza,2001; Siragusa 等,1996)。这种方法的问题在于,ATP 是所有活细胞的主要能量来源,而且食品样本本身也含有大量这种化学物质,在测量微生物 ATP 之前,必须将其销毁。因此,ATP 生物发光的测量可能最适合于检测与食品生产和制备相关的设备和机械的污染表面(Ellis and Goodacre,2001)。在只检测活细胞的情况下,必须考虑 ATP 生物发光技术的上述局限性和显微技术的缺陷。然而,来源于肉类和活细胞的 ATP 总量在清洁度评估中具有足够的重要性,因为来源于肉类的 ATP 是导致细菌腐败的营养来源。由于无损、无化学制备和快速检测的优点,光谱学已被广泛研究用于确定农产品的性质,但与植物材料相比,肉类产品的性质较少(Chan et al。 ,2002)。根据文献,VIS/NIRS 技术已应用于猪肉中以测定肌间脂肪(Hoving-Bolink 等,2005; Savenije 等,2006) ,脂肪酸组成(Fernandez-Cabanas 等,2007; Gonzalez-Martin 等,2003,2005) ,颜色(Cozzoliono 等,2003) ,持水能力(Brondum 等,2000)(Josell et al。 ,2000) ,以及 Doroc 和伊比利亚 porl 神经网络分类(del Moral。 ,2009) ,但是它还没有被应用于不同质量和价格的肉类的直接定性分类(del Moral。 ,2009)。此外,利用反射率数据测定食品质量的报告屈指可数。从目前的情况来看,这项研究的目的是开发一种实时检测方法,以监测 ATP 和活细胞在肉表面的反射光谱,可用于卫生管理。2材料及方法2.1肉类样本2.1从零售商取得切成5毫米厚的猪腰样本的瘦肉部分。3天前被宰杀,在零售商店保存在市场条件下。共有24个切片的样品被切成约6 × 6cm2的块,并分别放置在消毒的培养皿中。2.2实验装置样品分为6组,每组4个样品,在15 °C 恒温箱中保存。根据我们与屠宰场管理人员的谈话,我们选择了屠宰场工作室的最高温度作为储存温度,考虑到工人的健康,通常将温度控制在10 °C 至15 °C 之间。在储存0,24,48,72,84和96小时后进行测量。显示的每个值是四个部分的平均值。实验重复三次以验证结果。在所有的重复实验中都得到了相似的结果。在这里,为了简单起见,只显示了一个实验的结果。2.3 UV-Vis 反射光谱日本京都岛津公司的 UV-3600双光束光谱仪配备了积分球装置,用于记录肉样表面(9 × 20mm2)的反射光谱。测量的波长范围为240-1200纳米,分辨率为2纳米; 然而,从240-540纳米的结果只显示在第3节。为了确定 ATP 的最大吸收波长,用10mm 石英电池测定了 ATP 标准溶液(LL-100-1,日本东京东洋 B-Net 公司)系列稀释液的透过率。2.4光谱数据预处理光谱数据经常被预处理,以减少不良的系统噪声,如基线变化、光散射、路径长度差异等,并增强化学成份的贡献(Tigabu 和 Oden,2002)。在这项研究中,两种类型的预处理采用: 萨维茨基-戈雷一阶和二阶导数。在我们的案例中,系统变化的可能来源可能是由于路径长度的微小差异产生的定位个别肉类样本与轻微不同的大小在扫描过程中。2.5采样方案和微生物分析2.5.1采样方案采用拭子技术对覆盖光谱测量区域的猪肉表面(40 × 40mm2)材料进行采样。为了确保足够的采样,根据 Bautista 等人(1997) ,以水平模式擦拭样品,并在食指和拇指之间来回旋转的垂直模式中再次擦拭样品。将棉签末端切入9ml 无菌水中,搅拌均匀,进一步检测棉签平板计数和 ATP 含量。2.5.2平板计数从浸入拭子的磷酸盐缓冲溶液中制备拭子样品的连续稀释液,并将1ml 稀释液分配到 Petri ilmsTM (AC 板,Sumitomo 3M Ltd. ,Tokyo,Japan)上以进行总有氧计数。在35 °C 下培养48小时。2.5.3 ATP 生物发光测定将100微升拭子样品(将拭子浸入其中的磷酸盐缓冲溶液)注射到置于发光计(Luminescenser MCA,Atto Corporation,Tokyo,Japan)中的新鲜试管中,然后将100μl 萃取剂(LL-100-2,Toyo B-Net Co. Ltd. ,Tokyo,Japan)加入其中。10秒后,加入100μl 萤光素-荧光素酶复合物(LL-100-1,日本东京东洋 B-Net Co. Ltd. ,Tokyo,Japan) ,并测量光输出。从每个拭子,采取两个测量和手段计算确定相对光单位(RLU)。然后通过用 ATP 标准溶液(LL-100-1,Toyo B-Net Co. Ltd. 
,Tokyo,Japan)在10-16-10-11摩尔/100微升范围内构建的标准曲线将 RLU 转化为 ATP 的量。2.6统计分析在贮存期间,随机抽取四块猪肉样本作统计分析。研究人员进行回归分析测试,以了解三磷酸腺苷(ATP)含量与盘子数量的关系。由于原始数据具有背景信息,因此采用一阶导数和二阶导数对原始数据进行转换,从中选出最优的一个。3结果3.1平板计数样品肉表面平板计数随贮藏时间延长而增加。试验开始时,初始计数为29CFU/cm2,贮藏84h 后为3.2 × 107CFU/cm2。3.2 ATP 含量随着贮藏时间的延长,ATP 含量由最初的9.2 × 10-5增加到2.8 × 10-10mol/cm2(贮藏后84h)。ATP 含量与平板计数呈线性关系,测定系数(R2)为0.95,如图1所示。3.3纯 ATP 的最大吸收量不同浓度(1 × 10-4 ~ 5.85 × 10-6M)的 ATP 溶液的透过率如图2所示。它表明,透过率随着 ATP 浓度的增加而降低,并且对不同 ATP 浓度的所有样品进行的光谱显示与透过率降低相关的最大吸光度在260nm 处(图2)。3.4根据反射率估计 ATP 和平板计数储存0-84小时获得的反射光谱如图3所示,在 UV-Vis 范围内(从240到540nm)。0小时和24小时的反射率差别很小。在48、72和84小时采集的样品的反射率随着储存时间的延长呈下降趋势。反射率数据的二阶导数被选为反射率一阶导数和二阶导数之间的最佳值,如图4所示。在紫外光谱范围内,观察到多个向上和向下的峰,并对298,318,344和374nm 的峰进行了相关性分析。图5显示了反射率二阶导数与对数(ATP)之间的相关系数。这给出了二阶导数和对数(ATP)之间的高度相关性。考虑到暗色移动,这四个波长中的任何一个都可以作为 ATP 的最大吸收。4在过去的几十年中,光谱方法在食品质量属性的评价中获得了重要性(Nadai,1983; Nadai 和 Mihalyi-Kengyel,1984)。尽管近红外光谱反映了与食品复杂质量有关的几个参数(Williams 和 Norris,2001) ,但是在近红外光谱范围内不能检测到 ATP 和/或微生物的信息。因此,在本研究中,UV-Vis 的应用范围从240到540纳米。4.1平板计数在这项研究中,样品被评估为新鲜,直到当细菌计数超过107CFU/g 的边界线,没有腐臭气味可以察觉。72小时后,平板计数达到107 CFU/g,样品散发出微弱的腐臭气味。这些样品处于变质的初始阶段,将被视为不可接受。盘子计数是肉类腐败的一个基本指标,肉类中107CFU/g 的计数被认为是不可接受的(Brown,1982)。检测106 CFU/g 的量级很重要,因为这是在肉类达到不可接受阶段之前完成的。新鲜肉类的 pH 值一般在5.5至5.9之间,并含有足够的葡萄糖和其他简单碳水化合物,以支持大约109 CFU/cm2。在冷藏温度下生长最快并利用葡萄糖的生物是假单胞菌(Gill and Newton,1977; Jay,2005; Seymour et al。 ,1994)。在107 CFU/cm2的水平下,异味可能会以淡淡的“奶制品”香味的形式变得明显,一旦表面细菌数量达到108 CFU/cm2,简单碳水化合物的供应已经耗尽,可识别的异味发展导致所谓的“感官”腐败(Jackson 等,1997; Jay,2005; Stanbridge and Davies,1998)。异味的发展取决于游离氨基酸利用发生的程度,这些气味已经被不同地描述为107CFU/cm2的乳制品/黄油/脂肪/奶酪,直到108CFU/cm2的病态甜味/水果香味,最后是109CFU/cm2的腐臭气味(Adams 和 Moss,2007; Dainty 等,1985)。4.2 ATP 含量图1显示了对数10 ATP 和对数10平板计数之间的线性关系。根据这个数字,ATP 分析法和平板计数法都能够评估猪肉样本的卫生情况。ATP 分析只能提供总细菌计数的估计,不能区分细菌(Baumgart,1993)。理论上,ATP 含量低至100fg (10-13g) ,相当于约100个细菌细胞。在实际条件下,灵敏度约为1000fg (10-12g) ,相当于约1000个细菌细胞或一至两个酵母细胞(Heeschen 等,1991)。应激细胞和处于静止生长阶段的细胞含有较少的 ATP,这也影响了结果(Bulte 和 Reuter,1985)。然而,另一方面,样品中 ATP 的含量提供了对活性微生物种群的估计,这在考虑产品的货架期时是很重要的。应激细胞也可以在 ATP 测定前复苏(Graumlich,1985)。这种酶,萤光素酶,通过化学计量反应将 ATP 提供的化学能转化为光。因此,产生的光量与存在的 ATP 浓度成正比,而 ATP 浓度又与样品中细胞的数量直接相关(Bautista 等,1997)。ATP 生物发光也可用于监测肉类加工过程中烫伤和冷却罐中的微生物污染。在对屠体污染和处理水质的 ATP 生物发光测定中,微生物细胞在裂解释放细胞内 ATP 之前通过过滤除去。为了简化这种方法,如果能够消除这一步骤以允许在屠体表面的拭子上直接检测 ATP,就像 ATP 生物发光卫生监测测试(Griffiths,1996)一样是可取的。然而,使用拭子测定将无法区分 ATP 与微生物和非微生物来源,但是结果将在2分钟内获得,而不是当纳入过滤步骤时所需的10-15分钟(Bautista 等,1997)。Siragusa 等(1996)开发了分段模型统计方法来确定检测灵敏度的下限,并使用该模型分析植物内数据。根据他们的研究,快速微生物 ATP 测试以线性方式响应猪肉胴体 > log 103.2需氧 CFU/cm2的微生物污染水平。4.3纯 ATP 的吸收峰如图2所示,透过率随 ATP 浓度的增加而降低,不同的光谱在260nm 处表现出最小的吸光度。260nm 的波长与之前 Bagshaw (2001)报道的 ATP (259nm)的最大吸光度一致。4.4从反射反射率估计 ATP 和平板计数(图3)在 UV-Vis 范围内随着时间的推移显示出减少的趋势,尽管在0小时和24小时的反射率之间差异很小。为了消除背景效应原始数据通过使用第一和第二导数进行转换。然而,选择二阶导数是因为它的效果更加明显。二阶导数技术是近红外数据处理的常用方法。它有助于分离重叠吸收带,消除基线偏移和增加明显的光谱分辨率(Lin et al。 ,2004) ,尽管衍生物是众所周知的对噪声敏感(Tsai 和 Philpot,1998)。在所有贮存时间段(0-96小时)的原始反射光谱二阶导数测量时,观察到许多向上和向下的峰值(图4)。对298、318、344和374nm 波段的峰进行了相关性分析。这些选定的波长在紫外线范围内,即小于400纳米。最大的差异获得所有选定的波长,在时间0和96小时之间。时间0 ~ 96h 的最大差异在318nm 范围内。这个波长范围主要可以区分0和96小时的样品。图5显示了反射率的二阶导数与对数(ATP)之间的相关系数。这给出了二阶导数的值与对数(ATP)之间的高度相关性。考虑到暗色移动,这四个波长中的任何一个都可以作为 ATP 的最大吸收。另一方面,众所周知,ATP 的光谱吸收通常被蛋白质吸收所掩盖,在光谱学研究中不能被利用(Bagshaw,2001)。然而,图6所示的反射率二阶导数与对数(平板计数)之间的相关系数图在形状上与图5非常相似。这表明反射率的二阶导数涉及活细胞内 ATP 的信息。对这一点的理解也得到了 ATP 含量与平板计数相对应的结果的支持(图1)。从这些考虑,波长318纳米显示最高的相关系数被选中。在318nm 处的前48h,二阶导数的值与对数(ATP)之间的线性关系如图7所示,测定系数为0.89。在318nm 处测定前48h,二阶导数值与对数(平板计数)之间也存在相似的关系,测定系数为0.83。这里选择的前48小时的持续时间意味着猪肉样本是新鲜的。从这些结果可以看出,选择合适的波长可以利用反射率信息实时监测肉表面 ATP 和/或活细胞计数。平板计数给出了微生物污染的估计值,而本研究中使用的 ATP 生物发光法测量来自微生物和非微生物来源的总 
ATP,可能是对屠体整体清洁度的更好测量。因此,不应期望两种方法之间有确切的关系,从两种测定系统获得的结果应该分开解释。在不同波长下使用多个反射率的多个线性回归分析是估计肉表面 ATP 和/或存活细胞数量的有力工具,并且可以导致更高的预测能力。然而,这样的范式可能会导致过度拟合。因此,在这项研究中,只有一个波长(即,318纳米)被选择用于 ATP 的预测。5结论建立了一种利用反射光谱实时监测肉表面 ATP 和活细胞的方法。结果表明,在15 °C 下贮藏84h,样品表面的平板数量增加,与 ATP 含量的增加完全一致。其测定系数为0.95,支持 ATP 含量与平板计数之间的线性关系。在紫外-可见光范围内,反射率随时间呈下降趋势,在318nm 的峰值,反射率的二阶导数与对数(ATP)呈高度相关。由于反射率的二阶导数与对数(平板计数)之间也存在相似的高相关性,提示反射率的二阶导数涉及活细胞中 ATP 的信息。根据这些观察结果,在肉表面反射率分析的基础上,给出了估算微生物源性 ATP 含量的线性关系。因此,开发的技术可以提供一个强有力的方式监测清洁度在屠宰场。这项研究部分由日本科学促进会(JSPS)拨款19:07178资助。
,Tokyo,Japan)在10-16-10-11摩尔/100微升范围内构建的标准曲线将 RLU 转化为 ATP 的量。2.6统计分析在贮存期间,随机抽取四块猪肉样本作统计分析。研究人员进行回归分析测试,以了解三磷酸腺苷(ATP)含量与盘子数量的关系。由于原始数据具有背景信息,因此采用一阶导数和二阶导数对原始数据进行转换,从中选出最优的一个。3结果3.1平板计数样品肉表面平板计数随贮藏时间延长而增加。试验开始时,初始计数为29CFU/cm2,贮藏84h 后为3.2 × 107CFU/cm2。3.2 ATP 含量随着贮藏时间的延长,ATP 含量由最初的9.2 × 10-5增加到2.8 × 10-10mol/cm2(贮藏后84h)。ATP 含量与平板计数呈线性关系,测定系数(R2)为0.95,如图1所示。3.3纯 ATP 的最大吸收量不同浓度(1 × 10-4 ~ 5.85 × 10-6M)的 ATP 溶液的透过率如图2所示。它表明,透过率随着 ATP 浓度的增加而降低,并且对不同 ATP 浓度的所有样品进行的光谱显示与透过率降低相关的最大吸光度在260nm 处(图2)。3.4根据反射率估计 ATP 和平板计数储存0-84小时获得的反射光谱如图3所示,在 UV-Vis 范围内(从240到540nm)。0小时和24小时的反射率差别很小。在48、72和84小时采集的样品的反射率随着储存时间的延长呈下降趋势。反射率数据的二阶导数被选为反射率一阶导数和二阶导数之间的最佳值,如图4所示。在紫外光谱范围内,观察到多个向上和向下的峰,并对298,318,344和374nm 的峰进行了相关性分析。图5显示了反射率二阶导数与对数(ATP)之间的相关系数。这给出了二阶导数和对数(ATP)之间的高度相关性。考虑到暗色移动,这四个波长中的任何一个都可以作为 ATP 的最大吸收。4在过去的几十年中,光谱方法在食品质量属性的评价中获得了重要性(Nadai,1983; Nadai 和 Mihalyi-Kengyel,1984)。尽管近红外光谱反映了与食品复杂质量有关的几个参数(Williams 和 Norris,2001) ,但是在近红外光谱范围内不能检测到 ATP 和/或微生物的信息。因此,在本研究中,UV-Vis 的应用范围从240到540纳米。4.1平板计数在这项研究中,样品被评估为新鲜,直到当细菌计数超过107CFU/g 的边界线,没有腐臭气味可以察觉。72小时后,平板计数达到107 CFU/g,样品散发出微弱的腐臭气味。这些样品处于变质的初始阶段,将被视为不可接受。盘子计数是肉类腐败的一个基本指标,肉类中107CFU/g 的计数被认为是不可接受的(Brown,1982)。检测106 CFU/g 的量级很重要,因为这是在肉类达到不可接受阶段之前完成的。新鲜肉类的 pH 值一般在5.5至5.9之间,并含有足够的葡萄糖和其他简单碳水化合物,以支持大约109 CFU/cm2。在冷藏温度下生长最快并利用葡萄糖的生物是假单胞菌(Gill and Newton,1977; Jay,2005; Seymour et al。 ,1994)。在107 CFU/cm2的水平下,异味可能会以淡淡的“奶制品”香味的形式变得明显,一旦表面细菌数量达到108 CFU/cm2,简单碳水化合物的供应已经耗尽,可识别的异味发展导致所谓的“感官”腐败(Jackson 等,1997; Jay,2005; Stanbridge and Davies,1998)。异味的发展取决于游离氨基酸利用发生的程度,这些气味已经被不同地描述为107CFU/cm2的乳制品/黄油/脂肪/奶酪,直到108CFU/cm2的病态甜味/水果香味,最后是109CFU/cm2的腐臭气味(Adams 和 Moss,2007; Dainty 等,1985)。4.2 ATP 含量图1显示了对数10 ATP 和对数10平板计数之间的线性关系。根据这个数字,ATP 分析法和平板计数法都能够评估猪肉样本的卫生情况。ATP 分析只能提供总细菌计数的估计,不能区分细菌(Baumgart,1993)。理论上,ATP 含量低至100fg (10-13g) ,相当于约100个细菌细胞。在实际条件下,灵敏度约为1000fg (10-12g) ,相当于约1000个细菌细胞或一至两个酵母细胞(Heeschen 等,1991)。应激细胞和处于静止生长阶段的细胞含有较少的 ATP,这也影响了结果(Bulte 和 Reuter,1985)。然而,另一方面,样品中 ATP 的含量提供了对活性微生物种群的估计,这在考虑产品的货架期时是很重要的。应激细胞也可以在 ATP 测定前复苏(Graumlich,1985)。这种酶,萤光素酶,通过化学计量反应将 ATP 提供的化学能转化为光。因此,产生的光量与存在的 ATP 浓度成正比,而 ATP 浓度又与样品中细胞的数量直接相关(Bautista 等,1997)。ATP 生物发光也可用于监测肉类加工过程中烫伤和冷却罐中的微生物污染。在对屠体污染和处理水质的 ATP 生物发光测定中,微生物细胞在裂解释放细胞内 ATP 之前通过过滤除去。为了简化这种方法,如果能够消除这一步骤以允许在屠体表面的拭子上直接检测 ATP,就像 ATP 生物发光卫生监测测试(Griffiths,1996)一样是可取的。然而,使用拭子测定将无法区分 ATP 与微生物和非微生物来源,但是结果将在2分钟内获得,而不是当纳入过滤步骤时所需的10-15分钟(Bautista 等,1997)。Siragusa 等(1996)开发了分段模型统计方法来确定检测灵敏度的下限,并使用该模型分析植物内数据。根据他们的研究,快速微生物 ATP 测试以线性方式响应猪肉胴体 > log 103.2需氧 CFU/cm2的微生物污染水平。4.3纯 ATP 的吸收峰如图2所示,透过率随 ATP 浓度的增加而降低,不同的光谱在260nm 处表现出最小的吸光度。260nm 的波长与之前 Bagshaw (2001)报道的 ATP (259nm)的最大吸光度一致。4.4从反射反射率估计 ATP 和平板计数(图3)在 UV-Vis 范围内随着时间的推移显示出减少的趋势,尽管在0小时和24小时的反射率之间差异很小。为了消除背景效应原始数据通过使用第一和第二导数进行转换。然而,选择二阶导数是因为它的效果更加明显。二阶导数技术是近红外数据处理的常用方法。它有助于分离重叠吸收带,消除基线偏移和增加明显的光谱分辨率(Lin et al。 ,2004) ,尽管衍生物是众所周知的对噪声敏感(Tsai 和 Philpot,1998)。在所有贮存时间段(0-96小时)的原始反射光谱二阶导数测量时,观察到许多向上和向下的峰值(图4)。对298、318、344和374nm 波段的峰进行了相关性分析。这些选定的波长在紫外线范围内,即小于400纳米。最大的差异获得所有选定的波长,在时间0和96小时之间。时间0 ~ 96h 的最大差异在318nm 范围内。这个波长范围主要可以区分0和96小时的样品。图5显示了反射率的二阶导数与对数(ATP)之间的相关系数。这给出了二阶导数的值与对数(ATP)之间的高度相关性。考虑到暗色移动,这四个波长中的任何一个都可以作为 ATP 的最大吸收。另一方面,众所周知,ATP 的光谱吸收通常被蛋白质吸收所掩盖,在光谱学研究中不能被利用(Bagshaw,2001)。然而,图6所示的反射率二阶导数与对数(平板计数)之间的相关系数图在形状上与图5非常相似。这表明反射率的二阶导数涉及活细胞内 ATP 的信息。对这一点的理解也得到了 ATP 含量与平板计数相对应的结果的支持(图1)。从这些考虑,波长318纳米显示最高的相关系数被选中。在318nm 处的前48h,二阶导数的值与对数(ATP)之间的线性关系如图7所示,测定系数为0.89。在318nm 处测定前48h,二阶导数值与对数(平板计数)之间也存在相似的关系,测定系数为0.83。这里选择的前48小时的持续时间意味着猪肉样本是新鲜的。从这些结果可以看出,选择合适的波长可以利用反射率信息实时监测肉表面 ATP 和/或活细胞计数。平板计数给出了微生物污染的估计值,而本研究中使用的 ATP 生物发光法测量来自微生物和非微生物来源的总 
ATP,可能是对屠体整体清洁度的更好测量。因此,不应期望两种方法之间有确切的关系,从两种测定系统获得的结果应该分开解释。在不同波长下使用多个反射率的多个线性回归分析是估计肉表面 ATP 和/或存活细胞数量的有力工具,并且可以导致更高的预测能力。然而,这样的范式可能会导致过度拟合。因此,在这项研究中,只有一个波长(即,318纳米)被选择用于 ATP 的预测。5结论建立了一种利用反射光谱实时监测肉表面 ATP 和活细胞的方法。结果表明,在15 °C 下贮藏84h,样品表面的平板数量增加,与 ATP 含量的增加完全一致。其测定系数为0.95,支持 ATP 含量与平板计数之间的线性关系。在紫外-可见光范围内,反射率随时间呈下降趋势,在318nm 的峰值,反射率的二阶导数与对数(ATP)呈高度相关。由于反射率的二阶导数与对数(平板计数)之间也存在相似的高相关性,提示反射率的二阶导数涉及活细胞中 ATP 的信息。根据这些观察结果,在肉表面反射率分析的基础上,给出了估算微生物源性 ATP 含量的线性关系。因此,开发的技术可以提供一个强有力的方式监测清洁度在屠宰场。这项研究部分由日本科学促进会(JSPS)拨款19:07178资助。参考文献 Adams and Moss 2007 Adams M.R. Moss 作案手法2007。食品微生物学。英国皇家化学学会,剑桥,第138页。巴格肖,2001年产。Bagshaw ATP 类似物一目了然细胞科学杂志1142001459460 Baumgart,1993 J. Baumgart Lebensmitteluberwachung und-qualitatssecherung Mikrobiologch-hygienische Schnellverfahren Fleischwirtschaft 731993292396 Bautista 等,1997 D.A。Bautista D.W.Sprung S. Barbut M.W.Griffiths 基于 ATP 生物发光测定的采样制度,用于评估食品研究国际组织在加工过程中关键控制点的家禽尸体的质量。301997803809 Brondum et al。 ,2000 J. Brondum L. Munck P. Henckel A. Karlsson E. Tornberg S.B。利用比较光谱法对猪肉的恩格尔森预测和持水性及其组成的研究[55]。肉类微生物学。(p. 410).纽约: 应用科学出版物。生物发光法作为一种快速测定肉类微生物区系的方法,国际食品微生物学杂志,1985年2月37日,381 Champiat 等人,2001 D. Champiat N. Matas B. Monofort H. Fraass 生物发光在 HACCP 发光中的应用162001193198 Chan 等,2002 D.E。陈 PN。Walker E.W.利用 ASAE 45200215191527 Clark 和 El-Shaarawi 的可见光和近红外光谱法预测猪肉品质特性。Clark A.H.用于检测总大肠菌群、大肠桿菌和其他指示细菌应用与环境微生物学的商业存在-缺失试剂盒的评估5921993380388 Cozzoliono 等人,2003 D. Cozzoliono N. Barlocco A. Vadell F. Ballesteros G. Gallieta 使用可见光和近红外反射光谱来预测完整和均化猪肉的颜色 LWT-Food Science and Technology 362003195202 de Boer and Beurmer,1999E.De Boer R.R.Beurmer 食源性微生物检测和分型方法国际食品微生物学杂志501999119130 del Moral 等,2009 F.G。德尔 · 道德 · A · 吉伦 · KG。马丁内斯 R.G。可见光和近红外反射光谱学对杜洛克和伊比利亚 Porl 神经网络的分类食品工程杂志902009540547 Dainty 等,1985 R.H。小巧的 R.A。爱德华兹 C.M。空气中自然污染牛肉冷藏过程中挥发性化合物形成的时间过程应用细菌学杂志591985303309 d’Souza,2001 S.F. 。D’Souza 微生物生物传感器生物传感器和生物电子学162001337353 Ellis and Goodacre,2001 D。肌肉食品微生物腐败的快速定量检测: 食品科学与技术的现状与未来趋势。122001414424 Fernandez-Cabanas 等,2007 V.M. Fernandez-Cabanas A. Garrido-Varo J. Gracia-Olmo E. De Pedro P. Dardenne 优化用于伊比利亚猪脂肪近红外校准的光谱预处理化学计量学和智能实验室系统872007104112 Fox,2001 S.Fox 世界卫生组织召开关于世界范围内疯牛病和 CJD 医学感染风险的会议。冈萨雷斯-马丁等人,2003 I. 冈萨雷斯-马丁冈萨雷斯-佩雷斯 · 埃尔南德斯 · 门德斯 · N · 阿尔瓦雷斯 · 格拉西亚用光纤探针近红外光谱学(NIRS)测定伊比利亚种猪皮下脂肪中的脂肪酸。肉类科学652003713719冈萨雷斯-马丁等人,2005年 I. Gonzalez-Martin C. Gonzalez-Perez N. Alvarez-Gracia J.M。冈萨雷斯-卡布雷拉利用远程反射光纤探针在线测定伊比利亚猪腰肌间脂肪脂肪酸组成。Gill K.G.牛顿冷藏肉类上有氧腐败菌群的发展应用细菌学杂志431977189195 Graumlich,1985 T.R。《食品科学生物发光杂志》1985116117,124 Griffiths,1996 M.W 对橙汁中微生物种群的 Graumlich 估算。三磷酸腺苷生物发光在食品工业中的作用: 对旧问题的新认识食品技术50619966272 Hawronskyj and Holah,1997 J.M。Hawronskyj J. Holah ATP: 食品科学与技术通用卫生监测器趋势819977984 Heeschen 等,1991 W.H。乳品工业中的快速方法。《微生物学和免疫学中的快速方法和自动化》 ,1991年,施普林格-出版社,Berlin Heidelberg 520532霍文-博林克等,2005年。霍温-博林克 H.W。Vedder J.W.M.Merks W.J.H.De Klein H.G.M.Reimert R. Frankhuizen W.H.A.M.范登布鲁克。返回文章页面兰布吉对近红外光谱仪测量结果的透视译者: pestwave 肉类科学692005417423杰克逊等人,1997年杰克逊 G.R. Acuff J.S. 迪克森肉类,家禽和海鲜 M.P. 道尔 L.R. 博伊查特 T.J. 蒙特维尔食品微生物学: 基础和前沿1997年美国广播公司出版社华盛顿 DC 83100杰伊,2005 J.M. 杰伊现代食品微生物学第六版。2005年 Aspen 出版社 Maryland Josell 等人,2000年 A. Josell L. Martinsson C. 
We propose an approach for identifying argumentative questions in web search queries. Argumentative questions ask for reasons supporting a stance, such as "Should marijuana be legalized?". Controversial topics comprise opposing stances and hence a variety of arguments that can support or oppose them. Argumentative questions are a challenge for search engines, which should answer both the pro and con sides of a controversy to avoid biasing users towards one stance. To analyze this problem further, we sampled questions on 19 controversial topics from a large Yandex search log and asked human annotators to label them as factual, method, or argumentative questions. The result is a collection of 39,340 labeled questions, 28% of which are argumentative, indicating the need for systems specialized in this type of question. A comparative analysis of the three question types shows that asking for reasons and predictions is among the most important characteristics of argumentative questions. To verify the feasibility of the classification task, we developed a BERT-based classifier that maps questions to question types, achieving a promising macro-averaged F1-score of 0.78.|我们提出了一种识别网络搜索查询中争议性问题的方法。有争议的问题会询问支持某种立场的理由,比如"大麻是否应该合法化?"有争议的话题包含相反的立场,因此可以支持或反对的各种论点。争议性问题对搜索引擎来说是一个挑战,因为它们应该同时回答正反两方面的争议,以避免用户偏向某一立场。为了进一步分析这个问题,我们从一个大型 Yandex 搜索日志中抽取了19个有争议话题的问题,并让人工注释者将它们标记为事实、方法或论证性的问题。结果是收集了39,340个带标签的问题,其中28% 是有争议的,表明需要为这类问题开发专门的系统。对三种问题类型的比较分析表明,提问原因和预测是议论题的最重要特征之一。为了验证分类任务的可行性,我们开发了一个基于 BERT 的分类器来将问题映射到问题类型,得到了一个很有前途的宏观平均 F1 得分0.78。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Identifying+Argumentative+Questions+in+Web+Search+Logs)|1| |[An MLP-based Algorithm for Efficient Contrastive Graph Recommendations](https://doi.org/10.1145/3477495.3531874)|Siwei Liu, Iadh Ounis, Craig Macdonald|University of Glasgow, Glasgow, United Kingdom|Graph-based recommender systems (GBRSs) have achieved promising performance by incorporating the user-item bipartite graph using the Graph Neural Network (GNN). Among GBRSs, the information from each user and item's multi-hop neighbours is effectively conveyed between nodes through neighbourhood aggregation and message passing. Although effective, existing neighbourhood information aggregation and passing functions are usually computationally expensive. Motivated by the emerging contrastive learning technique, we design a simple neighbourhood construction method in conjunction with the contrastive objective function to simulate the neighbourhood information processing of GNN. In addition, we propose a simple algorithm based on Multilayer Perceptron (MLP) for learning users and items' representations with extra non-linearity while lowering computational burden compared with multi-layer GNNs. Our extensive empirical experiments on three public datasets demonstrate that our proposed model, i.e. MLP-CGRec, can reduce the GPU memory consumption and training time by up to 24.0% and 33.1%, respectively, without significantly degenerating the recommendation accuracy in comparison with competitive baselines.|基于图形的推荐系统(GBRS)通过使用图形神经网络(GNN)结合用户项二分图已经取得了良好的性能。在 GBRS 中,每个用户和项目的多跳邻居信息通过邻居聚合和消息传递在节点之间有效地传递。虽然有效,但现有的邻里信息聚合和传递函数通常计算开销很大。受新兴的对比学习技术的启发,我们设计了一种简单的邻域构造方法,结合对比目标函数来模拟 GNN 的邻域信息处理。此外,我们提出了一个简单的基于多层感知机(MLP)的学习算法,用于学习用户和项目的额外非线性表示,同时与多层 GNN 相比,降低了计算负担。我们在三个公共数据集上的广泛实验表明,我们提出的模型,即 MLP-CGRec,可以分别减少 GPU 内存消耗和训练时间高达24.0% 和33.1% ,与竞争性基线相比,不会显着降低推荐准确性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=An+MLP-based+Algorithm+for+Efficient+Contrastive+Graph+Recommendations)|1| |[Entity-Conditioned Question Generation for Robust Attention Distribution in Neural Information Retrieval](https://doi.org/10.1145/3477495.3531878)|Revanth Gangi Reddy, Md. Arafat Sultan, Martin Franz, Avirup Sil, Heng Ji|UIUC, Champaign, IL, USA; IBM Research AI, Yorktown Heights, NY, USA|We show that supervised neural information retrieval (IR) models are prone to learning sparse attention patterns over passage tokens, which can result in key phrases including named entities receiving low attention weights, eventually leading to model under-performance. Using a novel targeted synthetic data generation method that identifies poorly attended entities and conditions the generation episodes on those, we teach neural IR to attend more uniformly and robustly to all entities in a given passage.
On two public IR benchmarks, we empirically show that the proposed method helps improve both the model's attention patterns and retrieval performance, including in zero-shot settings.|我们发现监督神经信息检索(IR)模型更倾向于学习通道标记上的稀疏注意模式,这可能导致包括命名实体在内的关键短语注意力权重较低,最终导致模型表现不佳。使用一种新的有针对性的合成数据生成方法,识别参与程度较低的实体,并对这些实体上的生成事件进行条件化处理,我们教导神经 IR 在给定的段落中更加统一和稳健地参与所有实体。实验结果表明,该方法有助于提高模型的注意模式和检索性能,包括在零镜头设置下的检索性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Entity-Conditioned+Question+Generation+for+Robust+Attention+Distribution+in+Neural+Information+Retrieval)|1| -|[C3: Continued Pretraining with Contrastive Weak Supervision for Cross Language Ad-Hoc Retrieval](https://doi.org/10.1145/3477495.3531886)|Eugene Yang, Suraj Nair, Ramraj Chandradevan, Rebecca IglesiasFlores, Douglas W. Oard|University of Maryland, College Park, College Park, MD, USA; University of Pennsylvania, Philadelphia, PA, USA; Johns Hopkins University, Baltimore, MD, USA; Emory University, Atlanta, GA, USA|Pretrained language models have improved effectiveness on numerous tasks, including ad-hoc retrieval. Recent work has shown that continuing to pretrain a language model with auxiliary objectives before fine-tuning on the retrieval task can further improve retrieval effectiveness. Unlike monolingual retrieval, designing an appropriate auxiliary task for cross-language mappings is challenging. To address this challenge, we use comparable Wikipedia articles in different languages to further pretrain off-the-shelf multilingual pretrained models before fine-tuning on the retrieval task. We show that our approach yields improvements in retrieval effectiveness.|预先训练的语言模型可以提高包括即席检索在内的许多任务的有效性。最近的研究表明,在对检索任务进行微调之前,继续预训练带有辅助目标的语言模型可以进一步提高检索效率。与单语检索不同,为跨语言映射设计合适的辅助任务具有挑战性。为了解决这个问题,我们使用不同语言的维基百科文章来进一步预训练现成的多语言预训模型,然后再对检索任务进行微调。我们展示了我们的方法在检索效率方面的改进。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=C3:+Continued+Pretraining+with+Contrastive+Weak+Supervision+for+Cross+Language+Ad-Hoc+Retrieval)|1| +|[C3: Continued Pretraining with Contrastive Weak Supervision for Cross Language Ad-Hoc Retrieval](https://doi.org/10.1145/3477495.3531886)|Eugene Yang, Suraj Nair, Ramraj Chandradevan, Rebecca IglesiasFlores, Douglas W. Oard|University of Maryland, College Park, College Park, MD, USA; Emory University, Atlanta, GA, USA; University of Pennsylvania, Philadelphia, PA, USA; Johns Hopkins University, Baltimore, MD, USA|Pretrained language models have improved effectiveness on numerous tasks, including ad-hoc retrieval. Recent work has shown that continuing to pretrain a language model with auxiliary objectives before fine-tuning on the retrieval task can further improve retrieval effectiveness. Unlike monolingual retrieval, designing an appropriate auxiliary task for cross-language mappings is challenging. To address this challenge, we use comparable Wikipedia articles in different languages to further pretrain off-the-shelf multilingual pretrained models before fine-tuning on the retrieval task. 
We show that our approach yields improvements in retrieval effectiveness.|预先训练的语言模型可以提高包括即席检索在内的许多任务的有效性。最近的研究表明,在对检索任务进行微调之前,继续预训练带有辅助目标的语言模型可以进一步提高检索效率。与单语检索不同,为跨语言映射设计合适的辅助任务具有挑战性。为了解决这个问题,我们使用不同语言的维基百科文章来进一步预训练现成的多语言预训模型,然后再对检索任务进行微调。我们展示了我们的方法在检索效率方面的改进。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=C3:+Continued+Pretraining+with+Contrastive+Weak+Supervision+for+Cross+Language+Ad-Hoc+Retrieval)|1| |[A Meta-learning Approach to Fair Ranking](https://doi.org/10.1145/3477495.3531892)|Yuan Wang, Zhiqiang Tao, Yi Fang|Santa Clara University, Santa Clara, CA, USA|In recent years, the fairness in information retrieval (IR) system has received increasing research attention. While the data-driven ranking models achieve significant improvements over traditional methods, the dataset used to train such models is usually biased, which causes unfairness in the ranking models. For example, the collected imbalance dataset on the subject of the expert search usually leads to systematic discrimination on the specific demographic groups such as race, gender, etc, which further reduces the exposure for the minority group. To solve this problem, we propose a Meta-learning based Fair Ranking (MFR) model that could alleviate the data bias for protected groups through an automatically-weighted loss. Specifically, we adopt a meta-learning framework to explicitly train a meta-learner from an unbiased sampled dataset (meta-dataset), and simultaneously, train a listwise learning-to-rank (LTR) model on the whole (biased) dataset governed by "fair" loss weights. The meta-learner serves as a weighting function to make the ranking loss attend more on the minority group. To update the parameters of the weighting function and the ranking model, we formulate the proposed MFR as a bilevel optimization problem and solve it using the gradients through gradients. Experimental results on several real-world datasets demonstrate that the proposed method achieves a comparable ranking performance and significantly improves the fairness metric compared with state-of-the-art methods.|近年来,信息检索制度的公平性越来越受到研究人员的关注。虽然数据驱动的排序模型比传统的方法有了显著的改进,但是用于训练这些模型的数据集往往存在偏差,从而导致排序模型的不公平性。例如,所收集的关于专家搜索主题的不平衡数据集通常导致对特定人口群体如种族、性别等的系统性歧视,从而进一步减少少数群体的暴露程度。为了解决这一问题,我们提出了一种基于元学习的公平排序(MFR)模型,该模型通过自动加权损失来减轻受保护群体的数据偏差。具体而言,我们采用元学习框架来显式地从无偏的采样数据集(元数据集)中训练元学习者,同时在受“公平”损失权重管理的整个(有偏的)数据集上训练列表学习到排名(LTR)模型。元学习者作为一个权重函数,使排名的损失更多地出现在少数群体中。为了更新权重函数和排名模型的参数,我们将建议的最小生成最佳化问题(mFR)表示为一个双层次模型,并使用梯度来解决这个问题。在实际数据集上的实验结果表明,与现有方法相比,该方法具有可比较的排序性能,并显著提高了公平性度量。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Meta-learning+Approach+to+Fair+Ranking)|1| -|[Where Does the Performance Improvement Come From?: - A Reproducibility Concern about Image-Text Retrieval](https://doi.org/10.1145/3477495.3531715)|Jun Rao, Fei Wang, Liang Ding, Shuhan Qi, Yibing Zhan, Weifeng Liu, Dacheng Tao|Harbin Institute of Technology, Shenzhen, shenzhen, China; China University of Petroleum (East China), qingdao, China; JD Explore Academy, beijing, China|This article aims to provide the information retrieval community with some reflections on recent advances in retrieval learning by analyzing the reproducibility of image-text retrieval models. Due to the increase of multimodal data over the last decade, image-text retrieval has steadily become a major research direction in the field of information retrieval. Numerous researchers train and evaluate image-text retrieval algorithms using benchmark datasets such as MS-COCO and Flickr30k. 
Research in the past has mostly focused on performance, with multiple state-of-the-art methodologies being suggested in a variety of ways. According to their assertions, these techniques provide improved modality interactions and hence more precise multimodal representations. In contrast to previous works, we focus on the reproducibility of the approaches and the examination of the elements that lead to improved performance by pretrained and nonpretrained models in retrieving images and text. To be more specific, we first examine the related reproducibility concerns and explain why our focus is on image-text retrieval tasks. Second, we systematically summarize the current paradigm of image-text retrieval models and the stated contributions of those approaches. Third, we analyze various aspects of the reproduction of pretrained and nonpretrained retrieval models. To complete this, we conducted ablation experiments and obtained some influencing factors that affect retrieval recall more than the improvement claimed in the original paper. Finally, we present some reflections and challenges that the retrieval community should consider in the future. Our source code is publicly available at https://github.com/WangFei-2019/Image-text-Retrieval.|本文旨在通过分析图像-文本检索模型的可重复性,为信息检索提供有关检索学习最新进展的一些反思。由于过去十年来多模态数据的增加,图像-文本检索逐渐成为信息检索领域的一个主要研究方向。许多研究人员使用基准数据集(如 MS-COCO 和 Flickr30k)训练和评估图像-文本检索算法。过去的研究主要集中在性能方面,以各种方式提出了多种最先进的方法。根据他们的断言,这些技术提供了改进的模态交互,因此更精确的多模态表示。与以往的工作相比,我们关注的方法的可重复性和检查的元素,导致改善性能的预训练和未经预训练的模型检索图像和文本。更具体地说,我们首先考察相关的重复性问题,并解释为什么我们的重点是图像文本检索任务。其次,系统地总结了当前图像文本检索模型的研究范式以及这些方法的贡献。第三,我们分析了预训练和非预训练检索模型复制的各个方面。为此,我们进行了消融实验,得到了一些影响检索回忆的因素,这些因素对检索回忆的影响大于原文中提出的改进。最后,我们提出了一些反思和挑战,检索社区应该考虑在未来。我们的源代码可以在 https://github.com/wangfei-2019/image-text-retrieval 上公开。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Where+Does+the+Performance+Improvement+Come+From?:+-+A+Reproducibility+Concern+about+Image-Text+Retrieval)|1| -|[Competitive Search](https://doi.org/10.1145/3477495.3532771)|Oren Kurland, Moshe Tennenholtz|Zhejiang Univ, Hangzhou, Peoples R China; Univ New South Wales, Business Sch, Sydney, NSW, Australia; European Univ Inst, Fiesole, Italy; Univ Chicago, Booth Sch Business, Chicago, IL 60637 USA|This essay surveys the literature on directed search and competitive search equilibrium, covering theory and a variety of applications. These models share features with traditional search theory, but also differ in important ways. They share features with general equilibrium theory, but with explicit frictions. Equilibria are often efficient, mainly because markets price goods plus the time required to get them. The approach is tractable and arguably realistic. Results are presented for finite and continuum economies. Private information and sorting with heterogeneity are analyzed. 
While emphasizing issues and applications, we also provide several hard-to-find technical results.|本文综述了有关定向搜索和竞争搜索均衡、覆盖理论和各种应用的文献。这些模型具有与传统搜索理论相同的特点,但在一些重要方面又有所不同。它们与一般均衡理论有共同的特点,但存在明显的摩擦。均衡往往是有效的,主要是因为市场对商品定价,加上购买商品所需的时间。这种方法容易处理,而且可以说是现实的。给出了有限经济体和连续经济体的结果。分析了私有信息和异构排序问题。在强调问题和应用程序的同时,我们还提供了一些难以找到的技术结果。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Competitive+Search)|1| -|[A Dataset for Sentence Retrieval for Open-Ended Dialogues](https://doi.org/10.1145/3477495.3531727)|Itay Harel, Hagai Taitelbaum, Idan Szpektor, Oren Kurland|Technion - Israel Institute of Technology, Haifa, Israel; TSG IT Advanced Systems Ltd., Tel Aviv, Israel; Google Research, Tel Aviv, Israel|We address the task of sentence retrieval for open-ended dialogues. The goal is to retrieve sentences from a document corpus that contain information useful for generating the next turn in a given dialogue. Prior work on dialogue-based retrieval focused on specific types of dialogues: either conversational QA or conversational search. To address a broader scope of this task where any type of dialogue can be used, we constructed a dataset that includes open-ended dialogues from Reddit, candidate sentences from Wikipedia for each dialogue and human annotations for the sentences. We report the performance of several retrieval baselines, including neural retrieval models, over the dataset. To adapt neural models to the types of dialogues in the dataset, we explored an approach to induce a large-scale weakly supervised training data from Reddit. Using this training set significantly improved the performance over training on the MS MARCO dataset.|我们讨论开放式对话的句子检索任务。其目的是从文档语料库中检索句子,这些句子包含有用的信息,用于在给定的对话中生成下一个回合。先前的基于对话的检索工作集中在特定类型的对话: 会话 QA 或会话搜索。为了扩大任何类型的对话都可以使用的范围,我们构建了一个数据集,其中包括来自 Reddit 的开放式对话、来自 Wikipedia 的每个对话的候选句子以及句子的人工注释。我们报告了在数据集上的几个检索基线的性能,包括神经检索模型。为了使神经模型适应数据集中的对话类型,我们探索了一种从 Reddit 引入大规模弱监督训练数据的方法。使用这个训练集显著提高了在 MS MARCO 数据集上的训练性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Dataset+for+Sentence+Retrieval+for+Open-Ended+Dialogues)|1| -|[iRec: An Interactive Recommendation Framework](https://doi.org/10.1145/3477495.3531754)|Thiago Silva, Nícollas Silva, Heitor Werneck, Carlos Mito, Adriano C. M. Pereira, Leonardo Rocha|Universidade Federal de Minas Gerais, Belo Horizonte, Brazil; Universidade Federal de São João Del Rei, São João Del Rei, Brazil|Nowadays, most e-commerce and entertainment services have adopted interactive Recommender Systems (RS) to guide the entire journey of users into the system. This task has been addressed as a Multi-Armed Bandit problem where systems must continuously learn and recommend at each iteration. However, despite the recent advances, there is still a lack of consensus on the best practices to evaluate such bandit solutions. Several variables might affect the evaluation process, but most of the works have only been concerned about the accuracy of each method. Thus, this work proposes an interactive RS framework named iRec. It covers the whole experimentation process by following the main RS guidelines. The iRec provides three modules to prepare the dataset, create new recommendation agents, and simulate the interactive scenario. 
Moreover, it also contains several state-of-the-art algorithms, a hyperparameter tuning module, distinct evaluation metrics, different ways of visualizing the results, and statistical validation.|目前,大多数电子商务和娱乐服务都采用交互式推荐系统(RS)来引导用户进入系统的整个过程。这个任务已经被解决为一个多臂老虎机问题,系统必须在每次迭代中不断学习和推荐。然而,尽管最近取得了一些进展,但是在评估这种土匪解决方案的最佳实践方面仍然缺乏共识。有几个变量可能会影响评价过程,但大多数工作只关心每种方法的准确性。因此,本文提出了一个交互式 RS 框架 iRec。它涵盖了遵循 RS 主要准则的整个实验过程。IRec 提供了三个模块来准备数据集、创建新的推荐代理和模拟交互场景。此外,它还包含一些最先进的算法、一个超参数调整模块、不同的评估指标、不同的结果可视化方法以及统计验证。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=iRec:+An+Interactive+Recommendation+Framework)|1| -|[RecDelta: An Interactive Dashboard on Top-k Recommendation for Cross-model Evaluation](https://doi.org/10.1145/3477495.3531674)|YiShyuan Chiang, YuZe Liu, ChenFeng Tsai, JingKai Lou, MingFeng Tsai, ChuanJu Wang|National Chengchi University, Taipei, Taiwan Roc; Academia Sinica, Taipei, Taiwan Roc; KKStream Technologies, Taipei, Taiwan Roc|In this demonstration, we present RecDelta, an interactive tool for the cross-model evaluation of top-k recommendation. RecDelta is a web-based information system where people visually compare the performance of various recommendation algorithms and their recommended items. In the proposed system, we visualize the distribution of the δ scores between algorithms--a distance metric measuring the intersection between recommendation lists. Such visualization allows for rapid identification of users for whom the items recommended by different algorithms diverge or vice versa; then, one can further select the desired user to present the relationship between recommended items and his/her historical behavior. RecDelta benefits both academics and practitioners by enhancing model explainability as they develop recommendation algorithms with their newly gained insights. Note that while the system is now online at https://cfda.csie.org/recdelta, we also provide a video recording at https://tinyurl.com/RecDelta to introduce the concept and the usage of our system.|在这个演示中,我们展示了 RecDelta,一个用于 top-k 推荐的跨模型评估的交互式工具。RecDelta 是一个基于网络的信息系统,人们可以在这个系统中直观地比较各种推荐算法及其推荐项目的性能。在所提出的系统中,我们可视化算法之间 δ 分数的分布——一个测量推荐列表之间交集的距离度量。这种可视化允许快速识别由不同算法推荐的条目有差异的用户,反之亦然; 然后,人们可以进一步选择所需的用户来表示推荐条目和他/她的历史行为之间的关系。RecDelta 通过增强模型的可解释性使学者和从业者受益,因为他们利用新获得的见解开发推荐算法。请注意,虽然该系统现已上网,但我们亦提供 https://cfda.csie.org/recdelta 录像 https://tinyurl.com/recdelta ,介绍该系统的概念及用途。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=RecDelta:+An+Interactive+Dashboard+on+Top-k+Recommendation+for+Cross-model+Evaluation)|1| +|[Where Does the Performance Improvement Come From?: - A Reproducibility Concern about Image-Text Retrieval](https://doi.org/10.1145/3477495.3531715)|Jun Rao, Fei Wang, Liang Ding, Shuhan Qi, Yibing Zhan, Weifeng Liu, Dacheng Tao|China University of Petroleum (East China), qingdao, China; Harbin Institute of Technology, Shenzhen, shenzhen, China; JD Explore Academy, beijing, China|This article aims to provide the information retrieval community with some reflections on recent advances in retrieval learning by analyzing the reproducibility of image-text retrieval models. Due to the increase of multimodal data over the last decade, image-text retrieval has steadily become a major research direction in the field of information retrieval. Numerous researchers train and evaluate image-text retrieval algorithms using benchmark datasets such as MS-COCO and Flickr30k. Research in the past has mostly focused on performance, with multiple state-of-the-art methodologies being suggested in a variety of ways. 
According to their assertions, these techniques provide improved modality interactions and hence more precise multimodal representations. In contrast to previous works, we focus on the reproducibility of the approaches and the examination of the elements that lead to improved performance by pretrained and nonpretrained models in retrieving images and text. To be more specific, we first examine the related reproducibility concerns and explain why our focus is on image-text retrieval tasks. Second, we systematically summarize the current paradigm of image-text retrieval models and the stated contributions of those approaches. Third, we analyze various aspects of the reproduction of pretrained and nonpretrained retrieval models. To complete this, we conducted ablation experiments and obtained some influencing factors that affect retrieval recall more than the improvement claimed in the original paper. Finally, we present some reflections and challenges that the retrieval community should consider in the future. Our source code is publicly available at https://github.com/WangFei-2019/Image-text-Retrieval.|本文旨在通过分析图像-文本检索模型的可重复性,为信息检索提供有关检索学习最新进展的一些反思。由于过去十年来多模态数据的增加,图像-文本检索逐渐成为信息检索领域的一个主要研究方向。许多研究人员使用基准数据集(如 MS-COCO 和 Flickr30k)训练和评估图像-文本检索算法。过去的研究主要集中在性能方面,以各种方式提出了多种最先进的方法。根据他们的断言,这些技术提供了改进的模态交互,因此更精确的多模态表示。与以往的工作相比,我们关注的方法的可重复性和检查的元素,导致改善性能的预训练和未经预训练的模型检索图像和文本。更具体地说,我们首先考察相关的重复性问题,并解释为什么我们的重点是图像文本检索任务。其次,系统地总结了当前图像文本检索模型的研究范式以及这些方法的贡献。第三,我们分析了预训练和非预训练检索模型复制的各个方面。为此,我们进行了消融实验,得到了一些影响检索回忆的因素,这些因素对检索回忆的影响大于原文中提出的改进。最后,我们提出了一些反思和挑战,检索社区应该考虑在未来。我们的源代码可以在 https://github.com/wangfei-2019/image-text-retrieval 上公开。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Where+Does+the+Performance+Improvement+Come+From?:+-+A+Reproducibility+Concern+about+Image-Text+Retrieval)|1| +|[Competitive Search](https://doi.org/10.1145/3477495.3532771)|Oren Kurland, Moshe Tennenholtz|Univ Chicago, Booth Sch Business, Chicago, IL 60637 USA; European Univ Inst, Fiesole, Italy; Univ New South Wales, Business Sch, Sydney, NSW, Australia; Zhejiang Univ, Hangzhou, Peoples R China|This essay surveys the literature on directed search and competitive search equilibrium, covering theory and a variety of applications. These models share features with traditional search theory, but also differ in important ways. They share features with general equilibrium theory, but with explicit frictions. Equilibria are often efficient, mainly because markets price goods plus the time required to get them. The approach is tractable and arguably realistic. Results are presented for finite and continuum economies. Private information and sorting with heterogeneity are analyzed. While emphasizing issues and applications, we also provide several hard-to-find technical results.|本文综述了有关定向搜索和竞争搜索均衡、覆盖理论和各种应用的文献。这些模型具有与传统搜索理论相同的特点,但在一些重要方面又有所不同。它们与一般均衡理论有共同的特点,但存在明显的摩擦。均衡往往是有效的,主要是因为市场对商品定价,加上购买商品所需的时间。这种方法容易处理,而且可以说是现实的。给出了有限经济体和连续经济体的结果。分析了私有信息和异构排序问题。在强调问题和应用程序的同时,我们还提供了一些难以找到的技术结果。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Competitive+Search)|1| +|[A Dataset for Sentence Retrieval for Open-Ended Dialogues](https://doi.org/10.1145/3477495.3531727)|Itay Harel, Hagai Taitelbaum, Idan Szpektor, Oren Kurland|TSG IT Advanced Systems Ltd., Tel Aviv, Israel; Google Research, Tel Aviv, Israel; Technion - Israel Institute of Technology, Haifa, Israel|We address the task of sentence retrieval for open-ended dialogues. 
The goal is to retrieve sentences from a document corpus that contain information useful for generating the next turn in a given dialogue. Prior work on dialogue-based retrieval focused on specific types of dialogues: either conversational QA or conversational search. To address a broader scope of this task where any type of dialogue can be used, we constructed a dataset that includes open-ended dialogues from Reddit, candidate sentences from Wikipedia for each dialogue and human annotations for the sentences. We report the performance of several retrieval baselines, including neural retrieval models, over the dataset. To adapt neural models to the types of dialogues in the dataset, we explored an approach to induce a large-scale weakly supervised training data from Reddit. Using this training set significantly improved the performance over training on the MS MARCO dataset.|我们讨论开放式对话的句子检索任务。其目的是从文档语料库中检索句子,这些句子包含有用的信息,用于在给定的对话中生成下一个回合。先前的基于对话的检索工作集中在特定类型的对话: 会话 QA 或会话搜索。为了扩大任何类型的对话都可以使用的范围,我们构建了一个数据集,其中包括来自 Reddit 的开放式对话、来自 Wikipedia 的每个对话的候选句子以及句子的人工注释。我们报告了在数据集上的几个检索基线的性能,包括神经检索模型。为了使神经模型适应数据集中的对话类型,我们探索了一种从 Reddit 引入大规模弱监督训练数据的方法。使用这个训练集显著提高了在 MS MARCO 数据集上的训练性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Dataset+for+Sentence+Retrieval+for+Open-Ended+Dialogues)|1| +|[iRec: An Interactive Recommendation Framework](https://doi.org/10.1145/3477495.3531754)|Thiago Silva, Nícollas Silva, Heitor Werneck, Carlos Mito, Adriano C. M. Pereira, Leonardo Rocha|Universidade Federal de São João Del Rei, São João Del Rei, Brazil; Universidade Federal de Minas Gerais, Belo Horizonte, Brazil|Nowadays, most e-commerce and entertainment services have adopted interactive Recommender Systems (RS) to guide the entire journey of users into the system. This task has been addressed as a Multi-Armed Bandit problem where systems must continuously learn and recommend at each iteration. However, despite the recent advances, there is still a lack of consensus on the best practices to evaluate such bandit solutions. Several variables might affect the evaluation process, but most of the works have only been concerned about the accuracy of each method. Thus, this work proposes an interactive RS framework named iRec. It covers the whole experimentation process by following the main RS guidelines. The iRec provides three modules to prepare the dataset, create new recommendation agents, and simulate the interactive scenario. Moreover, it also contains several state-of-the-art algorithms, a hyperparameter tuning module, distinct evaluation metrics, different ways of visualizing the results, and statistical validation.|目前,大多数电子商务和娱乐服务都采用交互式推荐系统(RS)来引导用户进入系统的整个过程。这个任务已经被解决为一个多臂老虎机问题,系统必须在每次迭代中不断学习和推荐。然而,尽管最近取得了一些进展,但是在评估这种土匪解决方案的最佳实践方面仍然缺乏共识。有几个变量可能会影响评价过程,但大多数工作只关心每种方法的准确性。因此,本文提出了一个交互式 RS 框架 iRec。它涵盖了遵循 RS 主要准则的整个实验过程。IRec 提供了三个模块来准备数据集、创建新的推荐代理和模拟交互场景。此外,它还包含一些最先进的算法、一个超参数调整模块、不同的评估指标、不同的结果可视化方法以及统计验证。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=iRec:+An+Interactive+Recommendation+Framework)|1| +|[RecDelta: An Interactive Dashboard on Top-k Recommendation for Cross-model Evaluation](https://doi.org/10.1145/3477495.3531674)|YiShyuan Chiang, YuZe Liu, ChenFeng Tsai, JingKai Lou, MingFeng Tsai, ChuanJu Wang|KKStream Technologies, Taipei, Taiwan Roc; Academia Sinica, Taipei, Taiwan Roc; National Chengchi University, Taipei, Taiwan Roc|In this demonstration, we present RecDelta, an interactive tool for the cross-model evaluation of top-k recommendation. 
RecDelta is a web-based information system where people visually compare the performance of various recommendation algorithms and their recommended items. In the proposed system, we visualize the distribution of the δ scores between algorithms--a distance metric measuring the intersection between recommendation lists. Such visualization allows for rapid identification of users for whom the items recommended by different algorithms diverge or vice versa; then, one can further select the desired user to present the relationship between recommended items and his/her historical behavior. RecDelta benefits both academics and practitioners by enhancing model explainability as they develop recommendation algorithms with their newly gained insights. Note that while the system is now online at https://cfda.csie.org/recdelta, we also provide a video recording at https://tinyurl.com/RecDelta to introduce the concept and the usage of our system.|在这个演示中,我们展示了 RecDelta,一个用于 top-k 推荐的跨模型评估的交互式工具。RecDelta 是一个基于网络的信息系统,人们可以在这个系统中直观地比较各种推荐算法及其推荐项目的性能。在所提出的系统中,我们可视化算法之间 δ 分数的分布——一个测量推荐列表之间交集的距离度量。这种可视化允许快速识别由不同算法推荐的条目有差异的用户,反之亦然; 然后,人们可以进一步选择所需的用户来表示推荐条目和他/她的历史行为之间的关系。RecDelta 通过增强模型的可解释性使学者和从业者受益,因为他们利用新获得的见解开发推荐算法。请注意,虽然该系统现已上网,但我们亦提供 https://cfda.csie.org/recdelta 录像 https://tinyurl.com/recdelta ,介绍该系统的概念及用途。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=RecDelta:+An+Interactive+Dashboard+on+Top-k+Recommendation+for+Cross-model+Evaluation)|1| |[Asyncval: A Toolkit for Asynchronously Validating Dense Retriever Checkpoints During Training](https://doi.org/10.1145/3477495.3531658)|Shengyao Zhuang, Guido Zuccon|The University of Queensland, Brisbane, QLD, Australia|The process of model checkpoint validation refers to the evaluation of the performance of a model checkpoint executed on a held-out portion of the training data while learning the hyperparameters of the model. This model checkpoint validation process is used to avoid over-fitting and determine when the model has converged so as to stop training. A simple and efficient strategy to validate deep learning checkpoints is the addition of validation loops to execute during training. However, the validation of dense retrievers (DR) checkpoints is not as trivial -- and the addition of validation loops is not efficient. This is because, in order to accurately evaluate the performance of a DR checkpoint, the whole document corpus needs to be encoded into vectors using the current checkpoint before any actual retrieval operation for checkpoint validation can be performed. This corpus encoding process can be very time-consuming if the document corpus contains millions of documents (e.g., 8.8M for MS MARCO v1 and 21M for Natural Questions). Thus, a naïve use of validation loops during training will significantly increase training time. To address this issue, we propose Asyncval: a Python-based toolkit for efficiently validating DR checkpoints during training. Instead of pausing the training loop for validating DR checkpoints, Asyncval decouples the validation loop from the training loop, uses another GPU to automatically validate new DR checkpoints and thus permits to perform validation asynchronously from training. Asyncval also implements a range of different corpus subset sampling strategies for validating DR checkpoints; these strategies allow to further speed up the validation process. We provide an investigation of these methods in terms of their impact on validation time and validation fidelity. 
Asyncval is made available as an open-source project at https://github.com/ielab/asyncval.|模型检查点验证过程是指在学习模型超参数的同时,对在训练数据的保持部分上执行的模型检查点的性能进行评估。该模型检查点验证过程用于避免过拟合,判断模型何时收敛,从而停止训练。验证深度学习检查点的一个简单而有效的策略是在训练期间添加验证循环来执行。然而,密集检索器(DR)检查点的验证并不那么简单——并且添加验证循环的效率也不高。这是因为,为了准确地评估 DR 检查点的性能,在执行检查点验证的任何实际检索操作之前,需要使用当前检查点将整个文档语料库编码到向量中。如果文档语料库包含数百万个文档(例如,MS MARCO v1为8.8 M,自然问题为21M) ,那么这个语料库编码过程可能非常耗时。因此,在训练期间天真地使用验证循环将显著增加训练时间。为了解决这个问题,我们提出了 Asyncval: 一个基于 Python 的工具包,用于在培训期间有效地验证 DR 检查点。Asyncval 没有暂停用于验证 DR 检查点的训练循环,而是将验证循环与训练循环解耦,使用另一个 GPU 自动验证新的 DR 检查点,从而允许从训练中异步执行验证。Asyncval 还实现了一系列不同的语料库子集采样策略来验证 DR 检查点; 这些策略允许进一步加快验证过程。我们根据这些方法对验证时间和验证保真度的影响对它们进行了研究。Asyncval 作为一个开源项目在 https://github.com/ielab/Asyncval 上可以使用。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Asyncval:+A+Toolkit+for+Asynchronously+Validating+Dense+Retriever+Checkpoints+During+Training)|1| -|[TaskMAD: A Platform for Multimodal Task-Centric Knowledge-Grounded Conversational Experimentation](https://doi.org/10.1145/3477495.3531679)|Alessandro Speggiorin, Jeffrey Dalton, Anton Leuski|University of Glasgow, Glasgow, United Kingdom; University of Southern California, Los Angeles, CA, USA|The role of conversational assistants continues to evolve, beyond simple voice commands to ones that support rich and complex tasks in the home, car, and even virtual reality. Going beyond simple voice command and control requires agents and datasets blending structured dialogue, information seeking, grounded reasoning, and contextual question-answering in a multimodal environment with rich image and video content. In this demo, we introduce Task-oriented Multimodal Agent Dialogue (TaskMAD), a new platform that supports the creation of interactive multimodal and task-centric datasets in a Wizard-of-Oz experimental setup. TaskMAD includes support for text and voice, federated retrieval from text and knowledge bases, and structured logging of interactions for offline labeling. Its architecture supports a spectrum of tasks that span open-domain exploratory search to traditional frame-based dialogue tasks. It's open-source and offers rich capability as a platform used to collect data for the Amazon Alexa Prize Taskbot challenge, TREC Conversational Assistance track, undergraduate student research, and others. TaskMAD is distributed under the MIT license.|会话助理的角色不断演变,从简单的语音命令到支持家庭、汽车甚至虚拟现实中丰富而复杂的任务的语音命令。超越简单的语音命令和控制需要代理和数据集在一个具有丰富图像和视频内容的多通道环境中,将结构化对话、信息搜索、基础推理和上下文问答相结合。在这个演示中,我们介绍了面向任务的多通道 Agent 对话(TaskMAD) ,这是一个新的平台,支持在 Wizard-of-Oz 实验设置中创建交互式多通道和以任务为中心的数据集。TaskMAD 包括对文本和语音的支持、对文本和知识库的联合检索,以及用于离线标记的交互的结构化日志记录。它的体系结构支持一系列任务,从开放领域的探索性搜索到传统的基于框架的对话任务。它是开源的,提供了丰富的能力,作为一个平台,用于收集数据的亚马逊 Alexa 奖任务机器人挑战,TREC 对话援助轨道,本科生研究,和其他。TaskMAD 是在 MIT 许可下发布的。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=TaskMAD:+A+Platform+for+Multimodal+Task-Centric+Knowledge-Grounded+Conversational+Experimentation)|1| +|[TaskMAD: A Platform for Multimodal Task-Centric Knowledge-Grounded Conversational Experimentation](https://doi.org/10.1145/3477495.3531679)|Alessandro Speggiorin, Jeffrey Dalton, Anton Leuski|University of Southern California, Los Angeles, CA, USA; University of Glasgow, Glasgow, United Kingdom|The role of conversational assistants continues to evolve, beyond simple voice commands to ones that support rich and complex tasks in the home, car, and even virtual reality. 
Going beyond simple voice command and control requires agents and datasets blending structured dialogue, information seeking, grounded reasoning, and contextual question-answering in a multimodal environment with rich image and video content. In this demo, we introduce Task-oriented Multimodal Agent Dialogue (TaskMAD), a new platform that supports the creation of interactive multimodal and task-centric datasets in a Wizard-of-Oz experimental setup. TaskMAD includes support for text and voice, federated retrieval from text and knowledge bases, and structured logging of interactions for offline labeling. Its architecture supports a spectrum of tasks that span open-domain exploratory search to traditional frame-based dialogue tasks. It's open-source and offers rich capability as a platform used to collect data for the Amazon Alexa Prize Taskbot challenge, TREC Conversational Assistance track, undergraduate student research, and others. TaskMAD is distributed under the MIT license.|会话助理的角色不断演变,从简单的语音命令到支持家庭、汽车甚至虚拟现实中丰富而复杂的任务的语音命令。超越简单的语音命令和控制需要代理和数据集在一个具有丰富图像和视频内容的多通道环境中,将结构化对话、信息搜索、基础推理和上下文问答相结合。在这个演示中,我们介绍了面向任务的多通道 Agent 对话(TaskMAD) ,这是一个新的平台,支持在 Wizard-of-Oz 实验设置中创建交互式多通道和以任务为中心的数据集。TaskMAD 包括对文本和语音的支持、对文本和知识库的联合检索,以及用于离线标记的交互的结构化日志记录。它的体系结构支持一系列任务,从开放领域的探索性搜索到传统的基于框架的对话任务。它是开源的,提供了丰富的能力,作为一个平台,用于收集数据的亚马逊 Alexa 奖任务机器人挑战,TREC 对话援助轨道,本科生研究,和其他。TaskMAD 是在 MIT 许可下发布的。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=TaskMAD:+A+Platform+for+Multimodal+Task-Centric+Knowledge-Grounded+Conversational+Experimentation)|1| |[IRVILAB: Gamified Searching on Multilingual Wikipedia](https://doi.org/10.1145/3477495.3531662)|Paavo Arvola, Tuulikki Alamettälä|Tampere University, Tampere, Finland|Information retrieval (IR) evaluation can be considered as a form of competition in matching documents and queries. This paper introduces a learning environment based on gamification of query construction for document retrieval, called IRVILAB (Information Retrieval Virtual Lab). The lab has modules for creating standard evaluation settings, one for topic creation including relevance assessments and another for performance evaluation of user queries. In addition, multilingual Wikipedia online collection enables a module, where relevance assessments are translated to other languages. The underlying game utilizes IR performance metrics to measure and give feedback on participants' information retrieval performance. It aims to improve participants' search skills, subject knowledge and contributes to science education by introducing an experimental method. 
Distinctive features of the system include algorithmic relevance assessments and automatic recall base translation.|信息检索评估可被视为一种在配对文件和查询方面的竞争形式。本文介绍了一个基于文献检索查询结构游戏化的学习环境 IRVILAB (信息检索虚拟实验室)。该实验室有用于创建标准评估设置的模块,一个用于主题创建,包括相关性评估,另一个用于用户查询的性能评估。此外,多语言维基百科在线收集支持一个模块,其中相关性评估被翻译成其他语言。基础游戏利用红外表现指标来衡量和反馈参与者的信息检索表现。它旨在提高参与者的搜索技能,学科知识和有助于科学教育的实验方法。该系统的显著特点包括算法相关性评估和自动回忆库翻译。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=IRVILAB:+Gamified+Searching+on+Multilingual+Wikipedia)|1| |[Improving Efficiency and Robustness of Transformer-based Information Retrieval Systems](https://doi.org/10.1145/3477495.3532681)|Edmon Begoli, Sudarshan Srinivasan, Maria Mahbub|Oak Ridge National Laboratory (ORNL), Oak Ridge, TN, USA|This tutorial focuses on both theoretical and practical aspects of improving the efficiency and robustness of transformer-based approaches, so that these can be effectively used in practical, high-scale, and high-volume information retrieval (IR) scenarios. The tutorial is inspired and informed by our work and experience while working with massive narrative datasets (8.5 billion medical notes), and by our basic research and academic experience with transformer-based IR tasks. Additionally, the tutorial focuses on techniques for making transformer-based IR robust against adversarial (AI) exploitation. This is a recent concern in the IR domain that we needed to take into concern, and we want to want to share some of the lessons learned and applicable principles with our audience. Finally, an important, if not critical, element of this tutorial is its focus on didacticism -- delivering tutorial content in a clear, intuitive, plain-speak fashion. Transformers are a challenging subject, and, through our teaching experience, we observed a great value and a great need to explain all relevant aspects of this architecture and related principles in the most straightforward, precise, and intuitive manner. That is the defining style of our proposed tutorial.|本教程重点介绍提高基于变压器的方法的效率和稳健性的理论和实践方面,以便这些方法能够有效地用于实际的、大规模的和大容量的信息检索(IR)场景。本教程的灵感来自于我们在处理大量叙述性数据集(85亿医学笔记)时的工作和经验,以及我们在基于变压器的 IR 任务方面的基础研究和学术经验。此外,本教程还重点介绍了使基于变压器的 IR 针对对手(AI)开发具有鲁棒性的技术。这是国际关系领域最近关注的一个问题,我们需要加以关注,我们希望与我们的听众分享一些经验教训和适用的原则。最后,本教程的一个重要的(如果不是关键的)元素是它对教学法的关注——以一种清晰、直观、直白的方式提供教程内容。变压器是一个具有挑战性的课题,并且,通过我们的教学经验,我们观察到一个巨大的价值和一个巨大的需要,以最直接,精确和直观的方式解释这个架构和相关原则的所有相关方面。这就是我们提议的教程的定义风格。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Improving+Efficiency+and+Robustness+of+Transformer-based+Information+Retrieval+Systems)|1| -|[Self-Supervised Learning for Recommender System](https://doi.org/10.1145/3477495.3532684)|Chao Huang, Xiang Wang, Xiangnan He, Dawei Yin|Baidu Inc., Beijing, China; University of Hong Kong, Hong Kong, Hong Kong; University of Science and Technology of China, Hefei, China|Recommender systems have become key components for a wide spectrum of web applications (e.g., E-commerce sites, video sharing platforms, lifestyle applications, etc), so as to alleviate the information overload and suggest items for users. However, most existing recommendation models follow a supervised learning manner, which notably limits their representation ability with the ubiquitous sparse and noisy data in practical applications. Recently, self-supervised learning (SSL) has become a promising learning paradigm to distill informative knowledge from unlabeled data, without the heavy reliance on sufficient supervision signals. 
Inspired by the effectiveness of self-supervised learning, recent efforts bring SSL's superiority into various recommendation representation learning scenarios with augmented auxiliary learning tasks. In this tutorial, we aim to provide a systemic review of existing self-supervised learning frameworks and analyze the corresponding challenges for various recommendation scenarios, such as general collaborative filtering paradigm, social recommendation, sequential recommendation, and multi-behavior recommendation. We then raise discussions and future directions of this area. With the introduction of this emerging and promising topic, we expect the audience to have a deep understanding of this domain. We also seek to promote more ideas and discussions, which facilitates the development of self-supervised learning recommendation techniques.|推荐系统已成为一系列网上应用程式(例如电子商贸网站、影片分享平台、生活方式应用程式等)的重要组成部分,以纾缓信息超载及为使用者提供建议。然而,现有的大多数推荐模型都遵循监督式学习的方式,这明显地限制了它们在实际应用中对普遍存在的稀疏和嘈杂数据的表示能力。近年来,自监督学习(SSL)已成为从未标记数据中提取信息知识的一种很有前途的学习方法,而不需要过多地依赖于足够的监督信号。受自我监督学习效果的启发,最近的研究将 SSL 的优越性应用到各种推荐表示学习场景中,并增加了辅助学习任务。在本教程中,我们的目标是提供一个现有的自我监督学习框架的系统回顾,并分析各种推荐场景的相应挑战,如一般协同过滤范式,社会推荐,顺序推荐和多行为推荐。然后,我们提出这一领域的讨论和未来方向。随着这个新兴的和有前途的主题的介绍,我们希望听众对这个领域有一个深刻的理解。我们也寻求促进更多的想法和讨论,以促进自我监督学习推荐技术的发展。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Self-Supervised+Learning+for+Recommender+System)|1| +|[Self-Supervised Learning for Recommender System](https://doi.org/10.1145/3477495.3532684)|Chao Huang, Xiang Wang, Xiangnan He, Dawei Yin|University of Science and Technology of China, Hefei, China; University of Hong Kong, Hong Kong, Hong Kong; Baidu Inc., Beijing, China|Recommender systems have become key components for a wide spectrum of web applications (e.g., E-commerce sites, video sharing platforms, lifestyle applications, etc), so as to alleviate the information overload and suggest items for users. However, most existing recommendation models follow a supervised learning manner, which notably limits their representation ability with the ubiquitous sparse and noisy data in practical applications. Recently, self-supervised learning (SSL) has become a promising learning paradigm to distill informative knowledge from unlabeled data, without the heavy reliance on sufficient supervision signals. Inspired by the effectiveness of self-supervised learning, recent efforts bring SSL's superiority into various recommendation representation learning scenarios with augmented auxiliary learning tasks. In this tutorial, we aim to provide a systemic review of existing self-supervised learning frameworks and analyze the corresponding challenges for various recommendation scenarios, such as general collaborative filtering paradigm, social recommendation, sequential recommendation, and multi-behavior recommendation. We then raise discussions and future directions of this area. With the introduction of this emerging and promising topic, we expect the audience to have a deep understanding of this domain. 
We also seek to promote more ideas and discussions, which facilitates the development of self-supervised learning recommendation techniques.|推荐系统已成为一系列网上应用程式(例如电子商贸网站、影片分享平台、生活方式应用程式等)的重要组成部分,以纾缓信息超载及为使用者提供建议。然而,现有的大多数推荐模型都遵循监督式学习的方式,这明显地限制了它们在实际应用中对普遍存在的稀疏和嘈杂数据的表示能力。近年来,自监督学习(SSL)已成为从未标记数据中提取信息知识的一种很有前途的学习方法,而不需要过多地依赖于足够的监督信号。受自我监督学习效果的启发,最近的研究将 SSL 的优越性应用到各种推荐表示学习场景中,并增加了辅助学习任务。在本教程中,我们的目标是提供一个现有的自我监督学习框架的系统回顾,并分析各种推荐场景的相应挑战,如一般协同过滤范式,社会推荐,顺序推荐和多行为推荐。然后,我们提出这一领域的讨论和未来方向。随着这个新兴的和有前途的主题的介绍,我们希望听众对这个领域有一个深刻的理解。我们也寻求促进更多的想法和讨论,以促进自我监督学习推荐技术的发展。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Self-Supervised+Learning+for+Recommender+System)|1| |[ReNeuIR: Reaching Efficiency in Neural Information Retrieval](https://doi.org/10.1145/3477495.3531704)|Sebastian Bruch, Claudio Lucchese, Franco Maria Nardini|Pinecone, New York, NY, USA; Ca' Foscary University of Venice, Venice, Italy; ISTI-CNR, Pisa, Italy|Perhaps the applied nature of information retrieval research goes some way to explain the community's rich history of evaluating machine learning models holistically, understanding that efficacy matters but so does the computational cost incurred to achieve it. This is evidenced, for example, by more than a decade of research on efficient training and inference of large decision forest models in learning-to-rank. As the community adopts even more complex, neural network-based models in a wide range of applications, questions on efficiency have once again become relevant. We propose this workshop as a forum for a critical discussion of efficiency in the era of neural information retrieval, to encourage debate on the current state and future directions of research in this space, and to promote more sustainable research by identifying best practices in the development and evaluation of neural models for information retrieval.|也许信息检索研究的应用性质在某种程度上可以解释社区整体评估机器学习模型的丰富历史,理解功效很重要,但实现它所需的计算成本也很重要。例如,十多年来对大型决策森林模型在学习排序中的有效训练和推理的研究就证明了这一点。随着社会采用更为复杂的、基于神经网络的模型在广泛的应用中,有关效率的问题再次变得相关。我们建议这个研讨会作为一个论坛,就神经信息检索时代的效率进行批判性讨论,鼓励就该领域的研究现状和未来方向展开辩论,并通过确定开发和评估神经信息检索模型的最佳做法,促进更可持续的研究。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ReNeuIR:+Reaching+Efficiency+in+Neural+Information+Retrieval)|1| |[Generating Knowledge-based Explanation for Recommendation from Review](https://doi.org/10.1145/3477495.3531683)|Zuoxi Yang|South China University of Technology, Tianhe, Guangzhou, China|Reasonable explanation is helpful to increase the trust and satisfaction of user to the recommender system. Among many previous studies, there is growing concern about generating explanation based on review text. Collaborative filtering is one of the most successful approaches to predict user's preference. However, most of them suffer from data sparsity problem. Researcher often utilizes auxiliary data to address this problem, such as review, knowledge graph (KG), image and so on. Some researchers have proven that recommendation accuracy can be improved via incorporating rating and review data. Besides, neural network is also applied to learn more powerful representations for user and item from the review data. For example, convolution neural network (CNN) is used to extract representation from review text by using convolutional filters. Recurrent neural network (RNN) is another widely used model, which can encode the sequential behaviours as hidden states. However, most of them lack the ability to generate explanation. 
In order to generate explanations, two main approaches are used, i.e., the template-based approach and the generation-based approach. The template-based approach usually requires defining several templates. Then, these templates are further filled with different personalized features/words. Although they can offer readable explanations, they rely heavily on pre-defined templates. This causes large manual effort and limits the expressiveness of their explanations. Due to the strong generation ability of natural language models, the generation-based approach is capable of generating explanations without templates, which can largely enhance the expressiveness of the generated sentences. Although they can generate more free-form and flexible explanations, the explanations may tend to be uninformative. To tackle these challenges of the above-mentioned work, we propose Generating Knowledge-based Explanation for Recommendation from Review (GKER) to provide informative explanations. Unlike the traditional generation-based approach with a multi-task framework, we design a single-task framework to simultaneously model the user's preference and explanation generation. Multi-task training usually needs more manual effort and time overhead. In this unitary framework, we inject the user's sentiment preference into the explanation generation, aiming at capturing the user's interest while producing high-quality explanations. Specifically, we build three graphs, including a bipartite graph, a KG and a co-occurrence graph. All of them are integrated to form a unitary graph, thus bringing together the semantics of user-item interactions, the KG and reviews. Based on this integrated graph, it is possible to learn more effective representations for users and items. To make better use of the integrated KG, a graph convolution network (GCN) is utilized to obtain improved embeddings due to its superior representation learning ability. We argue that these embeddings can contain more semantic interaction signals with the help of the integrated KG and GCN. After obtaining these expressive embeddings, a multilayer perceptron (MLP) layer is further employed to capture non-linear interaction signals between user and item, aiming at predicting the user's rating accurately. The predicted rating is regarded as a sentiment indicator to explore why the user likes or dislikes the target item. To investigate the association between the sentiment indicator and the related review data, a transformer-enhanced encoder-decoder architecture is designed to produce informative and topic-relevant explanations. Besides, aspect semantics are added to this architecture through an attention mechanism. In this framework, the transformer is utilized as a "teacher" model to supervise the generation process of the encoder-decoder. Finally, experiments conducted on three datasets have shown the state-of-the-art performance of GKER. There are some research issues for discussion: 1) although the KG is a useful tool for recommendation accuracy and explainability, it is always incomplete in the real world. Hence, it is worth completing it for recommendation.
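Before the entry's remaining discussion point, a toy sketch may help make the GKER pipeline above concrete: a GCN over the integrated graph (bipartite, KG and co-occurrence edges merged into one adjacency) feeding an MLP whose output serves as the sentiment indicator. Everything here (names, shapes, the toy graph) is a hypothetical illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn

class SimpleGCNLayer(nn.Module):
    """One propagation step over the integrated graph: relu(A_hat @ X @ W)."""
    def __init__(self, dim_in, dim_out):
        super().__init__()
        self.lin = nn.Linear(dim_in, dim_out)

    def forward(self, a_hat, x):
        # a_hat: (N, N) row-normalized adjacency of the integrated graph
        return torch.relu(self.lin(a_hat @ x))

class RatingHead(nn.Module):
    """Two GCN layers, then an MLP scoring a user-item pair."""
    def __init__(self, dim):
        super().__init__()
        self.gcn1 = SimpleGCNLayer(dim, dim)
        self.gcn2 = SimpleGCNLayer(dim, dim)
        self.mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, a_hat, x, users, items):
        h = self.gcn2(a_hat, self.gcn1(a_hat, x))       # node embeddings
        pair = torch.cat([h[users], h[items]], dim=-1)  # user-item pairs
        return self.mlp(pair).squeeze(-1)               # predicted rating

# Toy usage: 6 nodes (users, items, KG entities) in one merged graph.
N, D = 6, 16
a = torch.eye(N) + torch.rand(N, N).round()    # toy 0/1 adjacency with self-loops
a_hat = a / a.sum(dim=1, keepdim=True)         # row-normalize
model = RatingHead(D)
sentiment_indicator = model(a_hat, torch.randn(N, D),
                            torch.tensor([0, 1]), torch.tensor([3, 4]))
print(sentiment_indicator.shape)               # torch.Size([2])
```

In the full model, this predicted rating would then condition the transformer-supervised encoder-decoder that writes the explanation.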
2) Besides, as for explainability, more metrics are still needed to evaluate the quality of the generated explanations.|合理的解释有助于提高用户对推荐系统的信任和满意度。在以往的许多研究中,人们越来越关注基于评论文本生成解释。协同过滤是预测用户偏好最成功的方法之一。然而,它们中的大多数都存在数据稀疏问题。研究人员经常利用辅助数据来解决这个问题,如复习、知识图(KG)、图像等。一些研究人员已经证明,通过整合评分和评论数据,可以提高推荐的准确性。此外,还应用神经网络从复习数据中学习更强大的用户和项目表示。例如,卷积神经网络(CNN)通过使用卷积滤波器从复习文本中提取表示。递归神经网络(RNN)是另一种广泛使用的模型,它可以将序列行为编码为隐藏状态。然而,他们中的大多数缺乏产生解释的能力。为了产生解释,主要使用了两种方法,即基于模板的方法和基于生成的方法。基于模板的方法通常需要定义几个模板。然后,这些模板将进一步填充不同的个性化特征/词语。尽管它们可以提供可读的解释,但它们严重依赖于预定义的模板。它导致大量的人工操作,限制了它们的解释表达。由于自然语言模型具有很强的生成能力,基于生成的方法能够在没有模板的情况下生成解释,从而大大提高了生成句子的表达能力。虽然他们可以产生更自由和灵活的解释,解释可能往往是无益的。为了应对上述工作的这些挑战,我们提出了一个基于知识生成的评审推荐解释(GKER)来提供信息解释。与传统的基于生成的多任务框架方法不同,我们设计了一个单任务框架来同时模拟用户的偏好和解释生成。多任务训练通常需要更多的人工努力和时间开销。在这个统一的框架中,我们将用户的情感偏好引入到解释生成中,目的是在获取用户兴趣的同时生成高质量的解释。具体来说,我们构造了三个图,包括一个二部图、一个 KG 图和一个共现图。它们集成在一起形成一个统一的图形,从而实现了用户-项目交互、 KG 和评论之间的语义关系。基于这个集成图表,可以学习更有效的表示用户和项目。为了更好地利用集成 KG,利用图卷积网络(GCN)优越的表示学习能力来获得改进的嵌入。我们认为,这些嵌入可以包含更多的语义交互信号的帮助下,整合 KG 和 GCN。在获得这些广泛的嵌入之后,一个多层感知机(MLP)层被进一步用来捕获用户和项目之间的非线性交互信号,旨在准确预测用户的评分。预测的评分将被视为一个情绪指标,以探索为什么用户喜欢或不喜欢的目标项目。为了研究情绪指标与相关评论数据之间的关联,设计了一个变压器增强型编解码器结构,以产生信息丰富和与主题相关的解释。此外,通过注意机制在体系结构中添加了方面语义。在这个框架中,变压器被用作“教师”模型来监督编码器-解码器过程的生成。最后,在三个数据集上进行的实验显示了 GKER 的最新性能。有一些研究问题值得讨论: 1)虽然 KG 是一个有效的推荐准确性和可解释性的工具,但在现实世界中它总是不完整的。因此,为了获得推荐,完成它是值得的。2)对于可解释性,还需要更多的指标来评价其解释的质量。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Generating+Knowledge-based+Explanation+for+Recommendation+from+Review)|1| |[Improving Fairness and Transparency for Artists in Music Recommender Systems](https://doi.org/10.1145/3477495.3531681)|Karlijn Dinnissen|Utrecht University, Utrecht, Netherlands|Streaming services have become one of today's main sources of music consumption, with music recommender systems (MRS) as important components. The MRS' choices strongly influence what users consume, and vice versa. Therefore, there is a growing interest in ensuring the fairness of these choices for all stakeholders involved. Firstly, for users, unfairness might result in some users receiving lower-quality recommendations in terms of accuracy and coverage. Secondly, item provider (i.e. artist) unfairness might result in some artists receiving less exposure, and therefore less revenue. However, it is challenging to improve fairness without a decrease in, for instance, overall recommendation quality or user satisfaction. Additional complications arise when balancing possibly domain-specific objectives for multiple stakeholders at once. While fairness research exists from both the user and artist perspective in the music domain, there is a lack of research directly consulting artists---with Ferraro et al. (2021) as an exception. When interacting with recommendation systems and evaluating their fairness, the many factors influencing recommendation system decisions can cause another difficulty: lack of transparency. Artists indicate they would appreciate more transparency in MRS---both towards the user and themselves. While e.g. Millecamp et al.
(2019) use explanations to increase transparency for MRS users, to the best of our knowledge, no research has addressed improving transparency for artists this way.|流媒体服务已经成为当今音乐消费的主要来源之一,音乐推荐系统(MRS)是其中的重要组成部分。MRS 的选择强烈地影响用户的消费,反之亦然。因此,确保这些选择对所有相关利益攸关方的公平性越来越受到关注。首先,对于用户来说,不公平可能会导致一些用户在准确性和覆盖率方面得到低质量的推荐。其次,项目提供者(即艺术家)的不公平可能导致一些艺术家获得较少的曝光率,从而减少收入。然而,在不降低例如整体推荐质量或用户满意度的情况下提高公平性是具有挑战性的。当同时为多个涉众平衡可能的领域特定目标时,会出现额外的复杂性。虽然公平性研究存在于音乐领域的用户和艺术家两个角度,但缺乏直接咨询艺术家的研究——费拉罗等人(2021)是一个例外。当与推荐系统进行交互并评估其公平性时,影响推荐系统决策的诸多因素会导致另一个困难: 缺乏透明度。艺术家们表示,他们希望 MRS 更加透明——无论是对用户还是对他们自己。虽然 Millecamp 等人(2019)使用解释来增加 MRS 用户的透明度,据我们所知,还没有研究以这种方式提高艺术家的透明度。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Improving+Fairness+and+Transparency+for+Artists+in+Music+Recommender+Systems)|1| |[Exploring Modular Task Decomposition in Cross-domain Named Entity Recognition](https://doi.org/10.1145/3477495.3531976)|Xinghua Zhang, Bowen Yu, Yubin Wang, Tingwen Liu, Taoyu Su, Hongbo Xu|Institute of Information Engineering, Chinese Academy of Sciences & University of Chinese Academy of Sciences, Beijing, China|Cross-domain Named Entity Recognition (NER) aims to transfer knowledge from the source domain to the target, alleviating expensive labeling costs in the target domain. Most prior studies acquire domain-invariant features under the end-to-end sequence-labeling framework where each token is assigned a compositional label (e.g., B-LOC). However, the complexity of cross-domain transfer may be increased over this complicated labeling scheme, which leads to sub-optimal results, especially when there are significantly distinct entity categories across domains. In this paper, we aim to explore the task decomposition in cross-domain NER. Concretely, we suggest a modular learning approach in which two sub-tasks (entity span detection and type classification) are learned by separate functional modules to perform respective cross-domain transfer with corresponding strategies. Compared with the compositional labeling scheme, the label spaces are smaller and closer across domains especially in entity span detection, leading to easier transfer in each sub-task. And then we combine two sub-tasks to achieve the final result with modular interaction mechanism, and deploy the adversarial regularization for generalized and robust learning in low-resource target domains. Extensive experiments over 10 diverse domain pairs demonstrate that the proposed method is superior to state-of-the-art cross-domain NER methods in an end-to-end fashion (about average 6.4% absolute F1 score increase). 
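The label-space decomposition in the cross-domain NER entry above is easy to picture on concrete tags: a compositional label such as "B-LOC" splits into a span-detection label ("B") and an entity type ("LOC"), so each module transfers over a smaller, more shareable label space. A minimal sketch on toy data (the helper name is ours, not the paper's):

```python
def decompose(labels):
    """Split compositional BIO tags into span labels and type labels."""
    span_labels, type_labels = [], []
    for lab in labels:
        if lab == "O":
            span_labels.append("O")
            type_labels.append(None)                # no entity type on this token
        else:
            boundary, ent_type = lab.split("-", 1)  # "B-LOC" -> "B", "LOC"
            span_labels.append(boundary)            # span module sees only B/I/O
            type_labels.append(ent_type)            # type module sees only types
    return span_labels, type_labels

spans, types = decompose(["B-LOC", "I-LOC", "O", "B-PER"])
print(spans)  # ['B', 'I', 'O', 'B']
print(types)  # ['LOC', 'LOC', None, 'PER']
```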
Further analyses show the effectiveness of modular task decomposition and its great potential in cross-domain NER.|跨域命名实体识别(NER)旨在将知识从源域传递到目标域,减少目标域中昂贵的标记代价。大多数先前的研究在端到端序列标记框架下获得域不变特征,其中每个标记被分配一个组合标签(例如,B-LOC)。然而,这种复杂的标记方案可能会增加跨域传输的复杂性,从而导致次优结果,特别是当跨域存在明显不同的实体类别时。本文旨在研究跨域 NER 中的任务分解问题。具体地说,我们提出了一种模块化学习方法,其中两个子任务(实体跨度检测和类型分类)由不同的功能模块学习,以执行各自的跨域传输和相应的策略。与组合标记方案相比,标记空间更小,跨域更紧密,特别是在实体跨度检测中,使得每个子任务之间的传递更加容易。然后将两个子任务结合起来,采用模块化交互机制实现最终结果,并在低资源目标域上部署广义鲁棒学习的对抗正则化。超过10个不同域对的广泛实验表明,所提出的方法以端到端的方式优于最先进的跨域 NER 方法(平均绝对 F1评分增加约6.4%)。进一步的分析表明模块化任务分解的有效性及其在跨域 NER 中的巨大潜力。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Exploring+Modular+Task+Decomposition+in+Cross-domain+Named+Entity+Recognition)|1| -|[Graph Adaptive Semantic Transfer for Cross-domain Sentiment Classification](https://doi.org/10.1145/3477495.3531984)|Kai Zhang, Qi Liu, Zhenya Huang, Mingyue Cheng, Kun Zhang, Mengdi Zhang, Wei Wu, Enhong Chen|Hefei University of Technology, Hefei, China; Anhui Province Key Lab. of Big Data Analysis and Application, University of S&T of China & State Key Laboratory of Cognitive Intelligence, Hefei, China; Meituan, Beijing, China; Anhui Province Key Lab. of Big Data Analysis and Application, University of S&T of China, Hefei, China|Cross-domain sentiment classification (CDSC) aims to use the transferable semantics learned from the source domain to predict the sentiment of reviews in the unlabeled target domain. Existing studies in this task attach more attention to the sequence modeling of sentences while largely ignoring the rich domain-invariant semantics embedded in graph structures (i.e., the part-of-speech tags and dependency relations). As an important aspect of exploring characteristics of language comprehension, adaptive graph representations have played an essential role in recent years. To this end, in the paper, we aim to explore the possibility of learning invariant semantic features from graph-like structures in CDSC. Specifically, we present Graph Adaptive Semantic Transfer (GAST) model, an adaptive syntactic graph embedding method that is able to learn domain-invariant semantics from both word sequences and syntactic graphs. More specifically, we first raise a POS-Transformer module to extract sequential semantic features from the word sequences as well as the part-of-speech tags. Then, we design a Hybrid Graph Attention (HGAT) module to generate syntax-based semantic features by considering the transferable dependency relations. Finally, we devise an Integrated aDaptive Strategy (IDS) to guide the joint learning process of both modules. Extensive experiments on four public datasets indicate that GAST achieves comparable effectiveness to a range of state-of-the-art models.|跨域情感分类(CDSC)的目的是利用从源域学到的可转移语义来预测未标记目标域中的评论情感。现有的研究更多地关注句子的序列建模,而忽略了图结构中的丰富的领域不变语义(即词性标签和依赖关系)。自适应图表示作为探索语言理解特征的一个重要方面,近年来发挥了重要作用。为此,本文旨在探索在 CDSC 中从类图结构中学习不变语义特征的可能性。具体来说,我们提出了一种自适应语义转移(GAST)模型,这是一种自适应语法图嵌入方法,能够从词序列和语法图中学习领域不变语义。更具体地说,我们首先提出一个 POS 转换器模块来从词序列和词性标签中提取序列语义特征。然后,我们设计了一个混合图注意(HGAT)模块,通过考虑可转移的依赖关系来生成基于语法的语义特征。最后,我们设计了一个综合自适应策略(IDS)来指导两个模块的联合学习过程。对四个公共数据集的大量实验表明,GAST 达到了与一系列最先进的模型相当的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Graph+Adaptive+Semantic+Transfer+for+Cross-domain+Sentiment+Classification)|1| +|[Graph Adaptive Semantic Transfer for Cross-domain Sentiment Classification](https://doi.org/10.1145/3477495.3531984)|Kai Zhang, Qi Liu, Zhenya Huang, Mingyue Cheng, Kun Zhang, Mengdi Zhang, Wei Wu, Enhong Chen|Meituan, Beijing, China; Anhui Province Key Lab. 
of Big Data Analysis and Application, University of S&T of China & State Key Laboratory of Cognitive Intelligence, Hefei, China; Hefei University of Technology, Hefei, China; Anhui Province Key Lab. of Big Data Analysis and Application, University of S&T of China, Hefei, China|Cross-domain sentiment classification (CDSC) aims to use the transferable semantics learned from the source domain to predict the sentiment of reviews in the unlabeled target domain. Existing studies in this task attach more attention to the sequence modeling of sentences while largely ignoring the rich domain-invariant semantics embedded in graph structures (i.e., the part-of-speech tags and dependency relations). As an important aspect of exploring characteristics of language comprehension, adaptive graph representations have played an essential role in recent years. To this end, in the paper, we aim to explore the possibility of learning invariant semantic features from graph-like structures in CDSC. Specifically, we present Graph Adaptive Semantic Transfer (GAST) model, an adaptive syntactic graph embedding method that is able to learn domain-invariant semantics from both word sequences and syntactic graphs. More specifically, we first raise a POS-Transformer module to extract sequential semantic features from the word sequences as well as the part-of-speech tags. Then, we design a Hybrid Graph Attention (HGAT) module to generate syntax-based semantic features by considering the transferable dependency relations. Finally, we devise an Integrated aDaptive Strategy (IDS) to guide the joint learning process of both modules. Extensive experiments on four public datasets indicate that GAST achieves comparable effectiveness to a range of state-of-the-art models.|跨域情感分类(CDSC)的目的是利用从源域学到的可转移语义来预测未标记目标域中的评论情感。现有的研究更多地关注句子的序列建模,而忽略了图结构中的丰富的领域不变语义(即词性标签和依赖关系)。自适应图表示作为探索语言理解特征的一个重要方面,近年来发挥了重要作用。为此,本文旨在探索在 CDSC 中从类图结构中学习不变语义特征的可能性。具体来说,我们提出了一种自适应语义转移(GAST)模型,这是一种自适应语法图嵌入方法,能够从词序列和语法图中学习领域不变语义。更具体地说,我们首先提出一个 POS 转换器模块来从词序列和词性标签中提取序列语义特征。然后,我们设计了一个混合图注意(HGAT)模块,通过考虑可转移的依赖关系来生成基于语法的语义特征。最后,我们设计了一个综合自适应策略(IDS)来指导两个模块的联合学习过程。对四个公共数据集的大量实验表明,GAST 达到了与一系列最先进的模型相当的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Graph+Adaptive+Semantic+Transfer+for+Cross-domain+Sentiment+Classification)|1| |[Hybrid CNN Based Attention with Category Prior for User Image Behavior Modeling](https://doi.org/10.1145/3477495.3531854)|Xin Chen, Qingtao Tang, Ke Hu, Yue Xu, Shihang Qiu, Jia Cheng, Jun Lei|Meituan, Shanghai, China|User historical behaviors are proved useful for Click Through Rate (CTR) prediction in online advertising system. In Meituan, one of the largest e-commerce platform in China, an item is typically displayed with its image and whether a user clicks the item or not is usually influenced by its image, which implies that user's image behaviors are helpful for understanding user's visual preference and improving the accuracy of CTR prediction. Existing user image behavior models typically use a two-stage architecture, which extracts visual embeddings of images through off-the-shelf Convolutional Neural Networks (CNNs) in the first stage, and then jointly trains a CTR model with those visual embeddings and non-visual features. We find that the two-stage architecture is sub-optimal for CTR prediction. Meanwhile, precisely labeled categories in online ad systems contain abundant visual prior information, which can enhance the modeling of user image behaviors. 
However, off-the-shelf CNNs without category prior may extract category unrelated features, limiting CNN's expression ability. To address the two issues, we propose a hybrid CNN based attention module, unifying user's image behaviors and category prior, for CTR prediction. Our approach achieves significant improvements in both online and offline experiments on a billion scale real serving dataset.|在线广告系统中,用户的历史行为对点击率(CTR)的预测非常有用。美团是中国最大的电子商务平台之一,一个商品通常与其图像一起显示,用户是否点击该商品通常受其图像的影响,这意味着用户的图像行为有助于理解用户的视觉偏好,提高点击率预测的准确性。现有的用户图像行为模型通常采用两阶段结构,在第一阶段通过现成的卷积神经网络(CNN)提取图像的视觉嵌入,然后将这些视觉嵌入和非视觉特征联合训练 CTR 模型。我们发现两阶段结构对于 CTR 预测是次优的。同时,在线广告系统中精确标注的类别含有丰富的视觉先验信息,可以增强用户图像行为的建模能力。然而,没有类别先验的现成 CNN 可能会提取类别不相关的特征,限制了 CNN 的表达能力。针对这两个问题,提出了一种基于混合细胞神经网络的注意模块,统一用户的图像行为和类别先验,用于 CTR 预测。我们的方法在10亿规模的实际服务数据集的两个在线和离线实验中都取得了显著的改进。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Hybrid+CNN+Based+Attention+with+Category+Prior+for+User+Image+Behavior+Modeling)|1| |[When Online Meets Offline: Exploring Periodicity for Travel Destination Prediction](https://doi.org/10.1145/3477495.3531859)|Wanjie Tao, Liangyue Li, Chen Chen, Zulong Chen, Hong Wen|Alibaba Group, Hangzhou, UNK, China; University of Virginia, Petersburg, UNK, USA|Online travel platforms (OTPs), e.g., booking.com and Ctrip.com, deliver travel experiences to online users by providing travel-related products. One key problem facing OTPs is to predict users' future travel destination, which has many important applications, e.g., proactively recommending users flight tickets or hotels in the destination city. Although much progress has been made for the next POI recommendation, they are largely sub-optimal for travel destination prediction on OTPs, due to the unique characteristics exhibited from users' travel behaviors such as offline spatial-temporal periodicity and online multi-interest exploration. In this paper, we propose an online-offline periodicity-aware information gain network, OOPIN, for travel destination prediction on OTPs. The key components of the model are (1) an offline mobility pattern extractor, which extracts spatial-temporal periodicity along with the sequential dependencies from the visited city sequence; and (2) an online multi-interests exploration module that discovers destinations that the user might be interested in but not yet visited from their online interaction data.Comprehensive experiments on real-world OTP demonstrate the superior performance of the proposed model for travel destination prediction compared with state-of-the-art methods.|在线旅游平台(OTP) ,例如 booking.com 和 Trip.com,通过提供与旅游相关的产品,为在线用户提供旅游体验。OTP 面临的一个关键问题是预测用户未来的旅游目的地,这有许多重要的应用程序,例如,主动向用户推荐目的地城市的机票或酒店。尽管下一个 POI 推荐已经取得了很大的进展,但由于用户的出行行为表现出离线时空周期性和在线多兴趣探索等独特特征,它们在 OTP 上对旅游目的地的预测大多是次优的。在本文中,我们提出了一个在线-离线周期感知信息增益网络,OOPIN,用于 OTP 上的旅游目的地预测。该模型的关键组成部分是: (1)离线移动模式提取器,它从访问的城市序列中提取出时空周期以及顺序依赖关系; (2)在线多兴趣探索模块,它从用户的在线交互数据中发现用户可能感兴趣但尚未访问的目的地。在现实生活中的 OTP 实验表明,与现有的预测方法相比,本文提出的旅游目的地预测模型具有更好的预测性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=When+Online+Meets+Offline:+Exploring+Periodicity+for+Travel+Destination+Prediction)|1| |[Modeling User Behavior With Interaction Networks for Spam Detection](https://doi.org/10.1145/3477495.3531875)|Prabhat Agarwal, Manisha Srivastava, Vishwakarma Singh, Charles Rosenberg|Pinterest, San Francisco, CA, USA|Spam is a serious problem plaguing web-scale digital platforms which facilitate user content creation and distribution. It compromises platform's integrity, performance of services like recommendation and search, and overall business. 
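One way to read the category-prior idea in the hybrid CNN attention entry above: let the candidate ad's (precisely labeled) category embedding act as the attention query over the user's historical image-behavior features. The sketch below is plain scaled dot-product attention under assumed shapes, not the paper's exact module:

```python
import torch
import torch.nn.functional as F

def category_prior_attention(category_emb, behavior_embs):
    # category_emb: (B, D) embedding of the candidate item's category
    # behavior_embs: (B, T, D) CNN features of T historical image behaviors
    scores = torch.einsum("bd,btd->bt", category_emb, behavior_embs)
    weights = F.softmax(scores / behavior_embs.size(-1) ** 0.5, dim=-1)
    return torch.einsum("bt,btd->bd", weights, behavior_embs)  # (B, D) summary

summary = category_prior_attention(torch.randn(2, 8), torch.randn(2, 5, 8))
print(summary.shape)  # torch.Size([2, 8])
```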
Spammers engage in a variety of abusive and evasive behavior which are distinct from non-spammers. Users' complex behavior can be well represented by a heterogeneous graph rich with node and edge attributes. Learning to identify spammers in such a graph for a web-scale platform is challenging because of its structural complexity and size. In this paper, we propose SEINE (Spam DEtection using Interaction NEtworks), a spam detection model over a novel graph framework. Our graph simultaneously captures rich users' details and behavior and enables learning on a billion-scale graph. Our model considers neighborhood along with edge types and attributes, allowing it to capture a wide range of spammers. SEINE, trained on a real dataset of tens of millions of nodes and billions of edges, achieves a high performance of 80% recall with 1% false positive rate. SEINE achieves comparable performance to the state-of-the-art techniques on a public dataset while being pragmatic to be used in a large-scale production system.|垃圾邮件是困扰网络规模的数字平台的一个严重问题,这些平台促进了用户内容的创建和分发。它损害了平台的完整性、推荐和搜索等服务的性能以及整体业务。垃圾邮件发送者与非垃圾邮件发送者有所不同,他们从事各种辱骂和回避行为。用户的复杂行为可以很好地用一个具有丰富节点和边属性的异构图来表示。由于其结构的复杂性和规模,学习在这样一个网络规模平台的图表中识别垃圾邮件发送者是具有挑战性的。在本文中,我们提出了 SEINE (使用交互网络的垃圾邮件检测) ,一个新的图形框架下的垃圾邮件检测模型。我们的图形同时捕捉丰富用户的细节和行为,并支持在十亿级图形上学习。我们的模型考虑了邻域以及边缘类型和属性,允许它捕获范围广泛的垃圾邮件发送者。SEINE 在数千万个节点和数十亿条边的真实数据集上进行训练,实现了80% 的高性能召回率和1% 的假阳性率。SEINE 在公共数据集上实现了与最先进技术相当的性能,同时在大规模生产系统中使用也很实用。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Modeling+User+Behavior+With+Interaction+Networks+for+Spam+Detection)|1| -|[ArchivalQA: A Large-scale Benchmark Dataset for Open-Domain Question Answering over Historical News Collections](https://doi.org/10.1145/3477495.3531734)|Jiexin Wang, Adam Jatowt, Masatoshi Yoshikawa|University of Innsbruck, Innsbruck, Austria; Kyoto University, Kyoto, Japan|In the last few years, open-domain question answering (ODQA) has advanced rapidly due to the development of deep learning techniques and the availability of large-scale QA datasets. However, the current datasets are essentially designed for synchronic document collections (e.g., Wikipedia). Temporal news collections such as long-term news archives spanning decades are rarely used in training the models despite they are quite valuable for our society. To foster the research in the field of ODQA on such historical collections, we present ArchivalQA, a large question answering dataset consisting of 532,444 question-answer pairs which is designed for temporal news QA. We divide our dataset into four subparts based on the question difficulty levels and the containment of temporal expressions, which we believe are useful for training and testing ODQA systems characterized by different strengths and abilities. 
The novel QA dataset-constructing framework that we introduce can be also applied to generate high-quality, non-ambiguous questions over other types of temporal document collections.|近年来,随着深度学习技术的发展和大规模问答数据集的出现,开放域问答技术得到了迅速的发展。然而,当前的数据集基本上是为同步文档集合而设计的(例如,Wikipedia)。时态新闻集合,如跨越数十年的长期新闻档案,很少用于训练模型,尽管它们对我们的社会有相当大的价值。为了促进 ODQA 领域对这些历史文献的研究,我们提出了 ArchivalQA,一个由532,444个问答对组成的大型问答数据集,它是为时事新闻 QA 而设计的。我们根据问题的难度水平和时间表达式的限制将数据集分为四个子部分,我们相信这对于训练和测试具有不同拥有属性和能力的 ODQA 系统是有用的。我们介绍的新的 QA 数据集构建框架也可以应用于生成高质量的、非歧义的问题,而不是其他类型的时态文档集合。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ArchivalQA:+A+Large-scale+Benchmark+Dataset+for+Open-Domain+Question+Answering+over+Historical+News+Collections)|1| -|[Structure and Semantics Preserving Document Representations](https://doi.org/10.1145/3477495.3532062)|Natraj Raman, Sameena Shah, Manuela Veloso|J.P. Morgan AI Research, New York, NY, USA; J.P. Morgan AI Research, London, United Kingdom|Retrieving relevant documents from a corpus is typically based on the semantic similarity between the document content and query text. The inclusion of structural relationship between documents can benefit the retrieval mechanism by addressing semantic gaps. However, incorporating these relationships requires tractable mechanisms that balance structure with semantics and take advantage of the prevalent pre-train/fine-tune paradigm. We propose here a holistic approach to learning document representations by integrating intra-document content with inter-document relations. Our deep metric learning solution analyzes the complex neighborhood structure in the relationship network to efficiently sample similar/dissimilar document pairs and defines a novel quintuplet loss function that simultaneously encourages document pairs that are semantically relevant to be closer and structurally unrelated to be far apart in the representation space. Furthermore, the separation margins between the documents are varied flexibly to encode the heterogeneity in relationship strengths. The model is fully fine-tunable and natively supports query projection during inference. We demonstrate that it outperforms competing methods on multiple datasets for document retrieval tasks.|从语料库中检索相关文档通常是基于文档内容和查询文本之间的语义相似性。文档之间结构化关系的引入有利于解决语义空白,提高检索机制的效率。然而,合并这些关系需要易处理的机制,平衡结构与语义,并利用流行的预训练/微调范式的优势。在这里,我们提出了一种通过整合文档内容和文档间关系来学习文档表示的整体方法。我们的深度度量学习解决方案分析了关系网络中复杂的邻域结构,以有效地采样相似/不相似的文档对,并定义了一种新的五元组丢失函数,同时鼓励语义相关的文档对在表示空间中更加紧密,结构上不相关。此外,文档之间的分离边界可以灵活变化,以编码关系强度的异质性。该模型是完全可调的,并且在推理过程中本身支持查询投影。我们展示了它在多个数据集的文献检索任务中优于竞争方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Structure+and+Semantics+Preserving+Document+Representations)|1| +|[ArchivalQA: A Large-scale Benchmark Dataset for Open-Domain Question Answering over Historical News Collections](https://doi.org/10.1145/3477495.3531734)|Jiexin Wang, Adam Jatowt, Masatoshi Yoshikawa|Kyoto University, Kyoto, Japan; University of Innsbruck, Innsbruck, Austria|In the last few years, open-domain question answering (ODQA) has advanced rapidly due to the development of deep learning techniques and the availability of large-scale QA datasets. However, the current datasets are essentially designed for synchronic document collections (e.g., Wikipedia). Temporal news collections such as long-term news archives spanning decades are rarely used in training the models despite they are quite valuable for our society. 
To foster the research in the field of ODQA on such historical collections, we present ArchivalQA, a large question answering dataset consisting of 532,444 question-answer pairs which is designed for temporal news QA. We divide our dataset into four subparts based on the question difficulty levels and the containment of temporal expressions, which we believe are useful for training and testing ODQA systems characterized by different strengths and abilities. The novel QA dataset-constructing framework that we introduce can be also applied to generate high-quality, non-ambiguous questions over other types of temporal document collections.|近年来,随着深度学习技术的发展和大规模问答数据集的出现,开放域问答技术得到了迅速的发展。然而,当前的数据集基本上是为同步文档集合而设计的(例如,Wikipedia)。时态新闻集合,如跨越数十年的长期新闻档案,很少用于训练模型,尽管它们对我们的社会有相当大的价值。为了促进 ODQA 领域对这些历史文献的研究,我们提出了 ArchivalQA,一个由532,444个问答对组成的大型问答数据集,它是为时事新闻 QA 而设计的。我们根据问题的难度水平和时间表达式的限制将数据集分为四个子部分,我们相信这对于训练和测试具有不同拥有属性和能力的 ODQA 系统是有用的。我们介绍的新的 QA 数据集构建框架也可以应用于生成高质量的、非歧义的问题,而不是其他类型的时态文档集合。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ArchivalQA:+A+Large-scale+Benchmark+Dataset+for+Open-Domain+Question+Answering+over+Historical+News+Collections)|1| +|[Structure and Semantics Preserving Document Representations](https://doi.org/10.1145/3477495.3532062)|Natraj Raman, Sameena Shah, Manuela Veloso|J.P. Morgan AI Research, London, United Kingdom; J.P. Morgan AI Research, New York, NY, USA|Retrieving relevant documents from a corpus is typically based on the semantic similarity between the document content and query text. The inclusion of structural relationship between documents can benefit the retrieval mechanism by addressing semantic gaps. However, incorporating these relationships requires tractable mechanisms that balance structure with semantics and take advantage of the prevalent pre-train/fine-tune paradigm. We propose here a holistic approach to learning document representations by integrating intra-document content with inter-document relations. Our deep metric learning solution analyzes the complex neighborhood structure in the relationship network to efficiently sample similar/dissimilar document pairs and defines a novel quintuplet loss function that simultaneously encourages document pairs that are semantically relevant to be closer and structurally unrelated to be far apart in the representation space. Furthermore, the separation margins between the documents are varied flexibly to encode the heterogeneity in relationship strengths. The model is fully fine-tunable and natively supports query projection during inference. We demonstrate that it outperforms competing methods on multiple datasets for document retrieval tasks.|从语料库中检索相关文档通常是基于文档内容和查询文本之间的语义相似性。文档之间结构化关系的引入有利于解决语义空白,提高检索机制的效率。然而,合并这些关系需要易处理的机制,平衡结构与语义,并利用流行的预训练/微调范式的优势。在这里,我们提出了一种通过整合文档内容和文档间关系来学习文档表示的整体方法。我们的深度度量学习解决方案分析了关系网络中复杂的邻域结构,以有效地采样相似/不相似的文档对,并定义了一种新的五元组丢失函数,同时鼓励语义相关的文档对在表示空间中更加紧密,结构上不相关。此外,文档之间的分离边界可以灵活变化,以编码关系强度的异质性。该模型是完全可调的,并且在推理过程中本身支持查询投影。我们展示了它在多个数据集的文献检索任务中优于竞争方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Structure+and+Semantics+Preserving+Document+Representations)|1| |[Aspect Feature Distillation and Enhancement Network for Aspect-based Sentiment Analysis](https://doi.org/10.1145/3477495.3531938)|Rui Liu, Jiahao Cao, Nannan Sun, Lei Jiang|Institute of Information Engineering, Chinese Academy of Sciences & University of Chinese Academy of Sciences, Beijing, China|Aspect-based sentiment analysis (ABSA) is a fine-grained sentiment analysis task designed to identify the polarity of a target aspect. 
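The quintuplet loss in the document-representation entry above extends the familiar hinge-with-margin building block; the sketch below shows only that building block, with the margin exposed as a per-relation knob for encoding relationship strength. It is a hedged illustration, not the paper's exact five-way formulation:

```python
import torch
import torch.nn.functional as F

def margin_pair_loss(anchor, positive, negative, margin):
    d_pos = F.pairwise_distance(anchor, positive)  # semantically relevant pair
    d_neg = F.pairwise_distance(anchor, negative)  # structurally unrelated pair
    # hinge: the positive must beat the negative by at least `margin`
    return F.relu(d_pos - d_neg + margin).mean()

a, p, n = (torch.randn(4, 32, requires_grad=True) for _ in range(3))
loss = margin_pair_loss(a, p, n, margin=0.5)  # margin varied per relation type
loss.backward()
print(float(loss))
```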
Some works introduce various attention mechanisms to fully mine the relevant context words of different aspects, and use the traditional cross-entropy loss to fine-tune the models for the ABSA task. However, the attention mechanism paying partial attention to aspect-unrelated words inevitably introduces irrelevant noise. Moreover, the cross-entropy loss lacks discriminative learning of features, which makes it difficult to exploit the implicit information of intra-class compactness and inter-class separability. To overcome these challenges, we propose an Aspect Feature Distillation and Enhancement Network (AFDEN) for the ABSA task. We first propose a dual-feature extraction module to extract aspect-related and aspect-unrelated features through the attention mechanisms and graph convolutional networks. Then, to eliminate the interference of aspect-unrelated words, we design a novel aspect-feature distillation module containing a gradient reverse layer that learns aspect-unrelated contextual features through adversarial training, and an aspect-specific orthogonal projection layer to further project aspect-related features into the orthogonal space of aspect-unrelated features. Finally, we propose an aspect-feature enhancement module that leverages supervised contrastive learning to capture the implicit information between the same sentiment labels and between different sentiment labels. Experimental results on three public datasets demonstrate that our AFDEN model achieves state-of-the-art performance and verify the effectiveness and robustness of our model.|基于方面的情绪分析(ABSA)是一个细粒度的情绪分析任务,旨在识别目标方面的极性。一些工作引入了各种注意机制来充分挖掘不同方面的相关上下文词,并利用传统的交叉熵损失对 ABSA 任务的模型进行了微调。然而,部分关注体无关词的注意机制不可避免地会引入不相关的噪声。此外,交叉熵损失缺乏对特征的判别学习,这使得利用类内紧性和类间可分性的隐含信息变得困难。为了克服这些挑战,我们提出了一个面向特征提取和增强网络(AFDEN)的 ABSA 任务。首先提出一个双特征提取模块,通过注意机制和图卷积网络提取与方面相关和与方面无关的特征。然后,为了消除方面无关词的干扰,设计了一个新的方面特征提取模块,该模块包含一个梯度反向层,通过对抗性训练学习方面无关的上下文特征,以及一个方面特定的正交投影层,进一步将方面相关特征投影到方面无关特征的正交空间中。最后,我们提出了一个侧面特征增强模块,该模块利用监督对比学习来捕获相同情感标签之间和不同情感标签之间的隐含信息。在三个公共数据集上的实验结果表明,我们的 AFDEN 模型达到了最先进的性能,验证了模型的有效性和鲁棒性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Aspect+Feature+Distillation+and+Enhancement+Network+for+Aspect-based+Sentiment+Analysis)|1| |[Detecting Frozen Phrases in Open-Domain Question Answering](https://doi.org/10.1145/3477495.3531793)|Mostafa Yadegari, Ehsan Kamalloo, Davood Rafiei|University of Alberta, Edmonton, AB, Canada|There is essential information in the underlying structure of words and phrases in natural language questions, and this structure has been extensively studied. In this paper, we study one particular structure, referred to as frozen phrases, that is highly expected to transfer as a whole from questions to answer passages. Frozen phrases, if detected, can be helpful in open-domain Question Answering (QA) where identifying the localized context of a given input question is crucial. An interesting question is if frozen phrases can be accurately detected. We cast the problem as a sequence-labeling task and create synthetic data from existing QA datasets to train a model. We further plug this model into a sparse retriever that is made aware of the detected phrases. 
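Two AFDEN ingredients named above have compact standard realizations: a gradient reversal layer (for adversarially learning aspect-unrelated context) and an orthogonal projection that strips the aspect-unrelated direction from aspect-related features. A sketch with assumed shapes; the surrounding encoder and losses are omitted:

```python
import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)                # identity on the forward pass

    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.lamb * grad_out, None  # flip gradients on the way back

def orthogonal_project(h_related, h_unrelated, eps=1e-8):
    """Project h_related onto the orthogonal complement of h_unrelated."""
    u = h_unrelated / (h_unrelated.norm(dim=-1, keepdim=True) + eps)
    return h_related - (h_related * u).sum(-1, keepdim=True) * u

h_rel, h_unrel = torch.randn(2, 16), torch.randn(2, 16)
h_unrel_adv = GradReverse.apply(h_unrel, 1.0)  # would feed an aspect classifier
h_clean = orthogonal_project(h_rel, h_unrel)
print((h_clean * h_unrel).sum(-1).abs())       # ~0: orthogonal per sample
```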
Our experiments reveal that detecting frozen phrases whose presence in answer documents is highly plausible yields significant improvements in retrieval as well as in the end-to-end accuracy of open-domain QA models.|自然语言问题中的词和短语的基本结构蕴含着重要的信息,这种结构已经得到了广泛的研究。在本文中,我们研究了一个特殊的结构,称为冻结短语,这是高度期望转移作为一个整体从问题,以回答段落。如果检测到冻结的短语,在开放域问答(QA)中可能会有所帮助,因为识别给定输入问题的本地化上下文是至关重要的。一个有趣的问题是,是否可以准确地检测到冻结的短语。我们将问题转换为序列标记任务,并从现有的 QA 数据集创建合成数据来训练模型。我们进一步将这个模型插入到一个稀疏的检索器中,这个检索器可以识别检测到的短语。我们的实验表明,检测在应答文档中出现的高度合理的冻结短语可以显著提高检索效率以及开放域 QA 模型的端到端准确性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Detecting+Frozen+Phrases+in+Open-Domain+Question+Answering)|1| |[Understanding User Satisfaction with Task-oriented Dialogue Systems](https://doi.org/10.1145/3477495.3531798)|Clemencia Siro, Mohammad Aliannejadi, Maarten de Rijke|University of Amsterdam, Amsterdam, Netherlands|Dialogue systems are evaluated depending on their type and purpose. Two categories are often distinguished: (1) task-oriented dialogue systems (TDSs), which are typically evaluated on utility, i.e., their ability to complete a specified task, and (2) open-domain chat-bots, which are evaluated on the user experience, i.e., based on their ability to engage a person. What is the influence of user experience on the user satisfaction rating of TDSs as opposed to, or in addition to, utility? We collect data by providing an additional annotation layer for dialogues sampled from the ReDial dataset, a widely used conversational recommendation dataset. Unlike prior work, we annotate the sampled dialogues at both the turn and dialogue level on six dialogue aspects: relevance, interestingness, understanding, task completion, efficiency, and interest arousal. The annotations allow us to study how different dialogue aspects influence user satisfaction. We introduce a comprehensive set of user experience aspects derived from the annotators' open comments that can influence users' overall impression. We find that the concept of satisfaction varies across annotators and dialogues, and show that a relevant turn is significant for some annotators, while for others, an interesting turn is all they need. Our analysis indicates that the proposed user experience aspects provide a fine-grained analysis of user satisfaction that is not captured by a monolithic overall human rating.|根据其类型和用途对 AcpDS 进行评估。通常区分为两类: 初始列举 * 项目 acpTDS,通常根据实用性进行评估,例如,它们完成指定任务的能力; 以及项目开放域聊天机器人,根据用户体验进行评估,例如,根据它们吸引人的能力进行评估。相对于效用,用户体验对 acpTDS 的用户满意度有什么影响?我们通过为从 ReDial 数据集(一个广泛使用的会话推荐数据集)采样的对话提供额外的注释层来收集数据。与之前的工作不同,我们在转向和对话两个层面对样本对话进行了注释: 相关性、趣味性、理解性、任务完成、效率和兴趣激发。注释允许我们研究不同的对话方面如何影响用户满意度。我们介绍了一套全面的用户体验方面来自注释者的开放评论,可以影响用户的整体印象。我们发现满意度的概念在不同的注释者和对话中是不同的,并且表明相关的转向对于一些注释者是重要的,而对于其他人,一个有趣的转向是他们所需要的。我们的分析表明,建议的用户体验方面提供了一个细粒度的用户满意度分析,而不是由单一的整体人类评分。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Understanding+User+Satisfaction+with+Task-oriented+Dialogue+Systems)|1| |[On Survivorship Bias in MS MARCO](https://doi.org/10.1145/3477495.3531832)|Prashansa Gupta, Sean MacAvaney|University of Glasgow, Glasgow, United Kingdom|Survivorship bias is the tendency to concentrate on the positive outcomes of a selection process and overlook the results that generate negative outcomes. We observe that this bias could be present in the popular MS MARCO dataset, given that annotators could not find answers to 38--45% of the queries, leading to these queries being discarded in training and evaluation processes.
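For the frozen-phrases entry above, the synthetic labeling step can be pictured as follows: question spans that reappear verbatim in the answer passage get B/I tags, everything else O, turning detection into sequence labeling. The toy matcher below is our simplification of that idea, not the paper's data pipeline:

```python
def tag_frozen_phrases(question_tokens, passage_text):
    tags = ["O"] * len(question_tokens)
    i = 0
    while i < len(question_tokens):
        # longest question span starting at i that appears in the passage
        j = len(question_tokens)
        while j > i and " ".join(question_tokens[i:j]).lower() not in passage_text.lower():
            j -= 1
        if j - i >= 2:                        # keep only multi-token spans
            tags[i] = "B"
            tags[i + 1:j] = ["I"] * (j - i - 1)
            i = j
        else:
            i += 1
    return tags

q = "who wrote the origin of species".split()
p = "Charles Darwin wrote The Origin of Species in 1859."
print(list(zip(q, tag_frozen_phrases(q, p))))
# [('who', 'O'), ('wrote', 'B'), ('the', 'I'), ('origin', 'I'), ('of', 'I'), ('species', 'I')]
```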
Although we find that some discarded queries in MS MARCO are ill-defined or otherwise unanswerable, many are valid questions that could be answered had the collection been annotated more completely (around two thirds using modern ranking techniques). This survivability problem distorts the MS MARCO collection in several ways. We find that it affects the natural distribution of queries in terms of the type of information needed. When used for evaluation, we find that the bias likely yields a significant distortion of the absolute performance scores observed. Finally, given that MS MARCO is frequently used for model training, we train models based on subsets of MS MARCO that simulate more survivorship bias. We find that models trained in this setting are up to 9.9% worse when evaluated on versions of the dataset with more complete annotations, and up to 3.5% worse at zero-shot transfer. Our findings are complementary to other recent suggestions for further annotation of MS MARCO, but with a focus on discarded queries.|倖存者偏差就是倾向于专注于选择过程的积极结果,而忽视产生消极结果的结果。我们观察到,这种偏见可能存在于流行的 MS MARCO 数据集中,因为注释者无法找到38-45% 的查询的答案,导致这些查询在培训和评估过程中被丢弃。尽管我们发现 MS MARCO 中的一些丢弃的查询是不明确的或者无法回答的,但是许多是有效的问题,如果集合被更完整地注释(使用现代排序技术约占三分之二) ,则可以回答这些问题。这个生存性问题在几个方面扭曲了 MS MARCO 集合。我们发现,根据所需信息的类型,它会影响查询的自然分布。当用于评估时,我们发现这种偏差很可能导致观察到的绝对绩效得分的显著扭曲。最后,鉴于微软 MARCO 经常被用于模型训练,我们基于微软 MARCO 的子集来训练模型,以模拟更多的倖存者偏差。我们发现,在这种情况下训练的模型在使用更完整注释的数据集版本进行评估时差异高达9.9% ,而在零镜头传输时差异高达3.5% 。我们的发现是对其他最近的建议进一步注释微软 MARCO 的补充,但重点放在丢弃的查询。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=On+Survivorship+Bias+in+MS+MARCO)|1| |[Bias Mitigation for Evidence-aware Fake News Detection by Causal Intervention](https://doi.org/10.1145/3477495.3531850)|Junfei Wu, Qiang Liu, Weizhi Xu, Shu Wu|Institute of Automation, Chinese Academy of Sciences & University of Chinese Academy of Sciences, Beijing, China|Evidence-based fake news detection judges the veracity of news against relevant evidence. However, models tend to memorize the dataset biases within spurious correlations between news patterns and veracity labels as shortcuts, rather than learning how to integrate the information behind them to reason. As a consequence, models may suffer from a serious failure when facing real-life conditions where most news has different patterns. Inspired by the success of causal inference, we propose a novel framework for debiasing evidence-based fake news detection by causal intervention (code available at https://github.com/CRIPAC-DIG/CF-FEND). Under this framework, the model is first trained on the original biased dataset in the ordinary way, then it makes conventional predictions and counterfactual predictions simultaneously in the testing stage, where counterfactual predictions are based on the intervened evidence. Relatively unbiased predictions are obtained by subtracting intervened outputs from the conventional ones.
Extensive experiments conducted on several datasets demonstrate our method's effectiveness and generality on debiased datasets.|基于证据的假新闻检测就是根据相关证据来判断新闻的真实性。然而,模型倾向于记住新闻模式和准确性标签之间的虚假相关性中的数据集偏差作为快捷方式,而不是学习如何整合它们背后的信息进行推理。因此,模型可能会遭受严重的失败,当面对现实生活的情况下,大多数新闻有不同的模式。受到因果推理成功的启发,我们提出了一个新的框架,用于消除基于证据的假新闻检测脚注通过因果干预可以获得的 https://github.com/cripac-dig/cf-fend 代码。在此框架下,该模型首先像普通工作一样对原始偏差数据集进行训练,然后在测试阶段同时进行常规预测和反事实预测,反事实预测是基于介入的证据。相对无偏的预测是通过减去传统的干预输出得到的。在多个数据集上进行的大量实验证明了该方法在去偏数据集上的有效性和通用性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Bias+Mitigation+for+Evidence-aware+Fake+News+Detection+by+Causal+Intervention)|1| -|[Preference Enhanced Social Influence Modeling for Network-Aware Cascade Prediction](https://doi.org/10.1145/3477495.3532042)|Likang Wu, Hao Wang, Enhong Chen, Zhi Li, Hongke Zhao, Jianhui Ma|University of Science and Technology of China & State Key Laboratory of Cognitive Intelligence, Hefei, China; Tianjin University, Hefei, China|Network-aware cascade size prediction aims to predict the final reposted number of user-generated information via modeling the propagation process in social networks. Estimating the user's reposting probability by social influence, namely state activation plays an important role in the information diffusion process. Therefore, Graph Neural Networks (GNN), which can simulate the information interaction between nodes, has been proved as an effective scheme to handle this prediction task. However, existing studies including GNN-based models usually neglect a vital factor of user's preference which influences the state activation deeply. To that end, we propose a novel framework to promote cascade size prediction by enhancing the user preference modeling according to three stages, i.e., preference topics generation, preference shift modeling, and social influence activation. Our end-to-end method makes the user activating process of information diffusion more adaptive and accurate. Extensive experiments on two large-scale real-world datasets have clearly demonstrated the effectiveness of our proposed model compared to state-of-the-art baselines.|网络感知级联规模预测的目的是通过建立社交网络中的传播过程模型来预测用户生成信息的最终转发数量。通过社会影响即状态激活来估计用户的转发概率在信息传播过程中起着重要作用。因此,能够模拟节点间信息交互的图神经网络(GNN)已被证明是处理这一预测任务的有效方案。然而,现有的研究,包括基于 GNN 的模型,往往忽略了用户偏好的一个重要因素,这个因素对状态激活有着深刻的影响。为此,我们提出了一个新的框架来促进级联规模预测,通过增强用户偏好建模的三个阶段,即偏好主题的生成,偏好转移模型和社会影响激活。我们的端到端方法使得信息传播的用户激活过程更具适应性和准确性。在两个大规模真实世界数据集上的大量实验已经清楚地证明了我们提出的模型相对于最先进的基线的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Preference+Enhanced+Social+Influence+Modeling+for+Network-Aware+Cascade+Prediction)|1| +|[Preference Enhanced Social Influence Modeling for Network-Aware Cascade Prediction](https://doi.org/10.1145/3477495.3532042)|Likang Wu, Hao Wang, Enhong Chen, Zhi Li, Hongke Zhao, Jianhui Ma|Tianjin University, Hefei, China; University of Science and Technology of China & State Key Laboratory of Cognitive Intelligence, Hefei, China|Network-aware cascade size prediction aims to predict the final reposted number of user-generated information via modeling the propagation process in social networks. Estimating the user's reposting probability by social influence, namely state activation plays an important role in the information diffusion process. Therefore, Graph Neural Networks (GNN), which can simulate the information interaction between nodes, has been proved as an effective scheme to handle this prediction task. 
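Stepping back to the causal-intervention entry above: its inference-time debiasing reduces to subtracting counterfactual predictions (made on intervened, evidence-ablated inputs) from conventional ones. A minimal sketch; the trade-off weight alpha is our illustrative assumption:

```python
import torch

def debiased_logits(logits_full, logits_counterfactual, alpha=1.0):
    # logits_full: model output given news + real evidence
    # logits_counterfactual: output when evidence is intervened (e.g., masked)
    return logits_full - alpha * logits_counterfactual

full = torch.tensor([[2.0, -1.0]])  # leans "fake", partly via news pattern
cf = torch.tensor([[1.5, -0.5]])    # what the pattern alone would predict
print(debiased_logits(full, cf))    # tensor([[ 0.5000, -0.5000]])
```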
However, existing studies including GNN-based models usually neglect a vital factor of user's preference which influences the state activation deeply. To that end, we propose a novel framework to promote cascade size prediction by enhancing the user preference modeling according to three stages, i.e., preference topics generation, preference shift modeling, and social influence activation. Our end-to-end method makes the user activating process of information diffusion more adaptive and accurate. Extensive experiments on two large-scale real-world datasets have clearly demonstrated the effectiveness of our proposed model compared to state-of-the-art baselines.|网络感知级联规模预测的目的是通过建立社交网络中的传播过程模型来预测用户生成信息的最终转发数量。通过社会影响即状态激活来估计用户的转发概率在信息传播过程中起着重要作用。因此,能够模拟节点间信息交互的图神经网络(GNN)已被证明是处理这一预测任务的有效方案。然而,现有的研究,包括基于 GNN 的模型,往往忽略了用户偏好的一个重要因素,这个因素对状态激活有着深刻的影响。为此,我们提出了一个新的框架来促进级联规模预测,通过增强用户偏好建模的三个阶段,即偏好主题的生成,偏好转移模型和社会影响激活。我们的端到端方法使得信息传播的用户激活过程更具适应性和准确性。在两个大规模真实世界数据集上的大量实验已经清楚地证明了我们提出的模型相对于最先进的基线的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Preference+Enhanced+Social+Influence+Modeling+for+Network-Aware+Cascade+Prediction)|1| |[Users and Contemporary SERPs: A (Re-)Investigation](https://doi.org/10.1145/3477495.3531719)|Nirmal Roy, David Maxwell, Claudia Hauff|Delft University of Technology, Delft, Netherlands|The Search Engine Results Page (SERP) has evolved significantly over the last two decades, moving away from the simple ten blue links paradigm to considerably more complex presentations that contain results from multiple verticals and granularities of textual information. Prior works have investigated how user interactions on the SERP are influenced by the presence or absence of heterogeneous content (e.g., images, videos, or news content), the layout of the SERP (list vs. grid layout), and task complexity. In this paper, we reproduce the user studies conducted in prior works---specifically those of Arguello et al. (2012) and Siu et al. (2014)---to explore to what extent the findings from research conducted five to ten years ago still hold today as the average web user has become accustomed to SERPs with ever-increasing presentational complexity. To this end, we designed and ran a user study with four different SERP interfaces: (i) a heterogeneous grid; (ii) a heterogeneous list; (iii) a simple grid; and (iv) a simple list. We collected the interactions of 41 study participants over 12 search tasks for our analyses. We observed that SERP types and task complexity affect user interactions with search results.
We also find evidence to support most (6 out of 8) observations from~\citearguello2012task,siu2014first indicating that user interactions with different interfaces and to solve tasks of different complexity have remained mostly similar over time.|搜索引擎结果页面(SERP)在过去的二十年中发生了巨大的变化,从简单的十个蓝色链接范式转变为包含多个垂直结果和文本信息粒度的更复杂的表示。先前的工作已经研究了用户在 SERP 上的交互是如何受到异构内容(如图像、视频或新闻内容)、 SERP 布局(强调列表与网格布局)和任务复杂性的影响的。在这篇论文中,我们重现了在以前的研究中进行的用户研究——特别是那些 ~ citetguello2012task 和 ~ citetsiu2014first 的研究——来探索五到十年前的研究结果在多大程度上仍然适用于今天,因为普通的网络用户已经习惯了不断增加的表现复杂度的 SERP。为此,我们设计并运行了一个使用四种不同的 SERP 接口的用户研究: (i) ~ empa 异构网格; (ii) ~ empa 异构列表; (iii) ~ empa 简单网格; 和(iv) ~ empa 简单列表。我们收集了 $41 $研究参与者在 $12 $搜索任务中的交互作用,用于我们的分析。我们观察到 SERP 类型和任务复杂性影响用户与搜索结果的交互。我们还发现支持 ~ citearguello2012task,siu2014的大多数(8个中的6个)观察的证据首先表明,用户与不同界面的交互以及解决不同复杂度的任务随着时间的推移大多保持相似。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Users+and+Contemporary+SERPs:+A+(Re-)Investigation)|1| -|[ReMeDi: Resources for Multi-domain, Multi-service, Medical Dialogues](https://doi.org/10.1145/3477495.3531809)|Guojun Yan, Jiahuan Pei, Pengjie Ren, Zhaochun Ren, Xin Xin, Huasheng Liang, Maarten de Rijke, Zhumin Chen|WeChat Tencent, Qingdao, China; University of Amsterdam, Amsterdam, Netherlands; Shandong University, Qingdao, China|\AcpMDS aim to assist doctors and patients with a range of professional medical services, i.e., diagnosis, treatment and consultation. The development of \acpMDS is hindered because of a lack of resources. In particular. \beginenumerate* [label=(\arabic*) ] \item there is no dataset with large-scale medical dialogues that covers multiple medical services and contains fine-grained medical labels (i.e., intents, actions, slots, values), and \item there is no set of established benchmarks for \acpMDS for multi-domain, multi-service medical dialogues. \endenumerate* In this paper, we present \acsReMeDi, a set of \aclReMeDi \acusedReMeDi. ØurResources consists of two parts, the ØurResources dataset and the ØurResources benchmarks. The ØurResources dataset contains 96,965 conversations between doctors and patients, including 1,557 conversations with fine-gained labels. It covers 843 types of diseases, 5,228 medical entities, and 3 specialties of medical services across 40 domains. To the best of our knowledge, the ØurResources dataset is the only medical dialogue dataset that covers multiple domains and services, and has fine-grained medical labels. The second part of the ØurResources resources consists of a set of state-of-the-art models for (medical) dialogue generation. The ØurResources benchmark has the following methods: \beginenumerate* \item pretrained models (i.e., BERT-WWM, BERT-MED, GPT2, and MT5) trained, validated, and tested on the ØurResources dataset, and \item a \acfSCL method to expand the ØurResources dataset and enhance the training of the state-of-the-art pretrained models. \endenumerate* We describe the creation of the ØurResources dataset, the ØurResources benchmarking methods, and establish experimental results using the ØurResources benchmarking methods on the ØurResources dataset for future research to compare against. 
With this paper, we share the dataset, implementations of the benchmarks, and evaluation scripts.|该计划旨在为医生和病人提供一系列的专业医疗服务,包括诊断、治疗和咨询。由于缺乏资源,阻碍了 ACpMDS 的发展。尤其是。开始列举 * [ label = (阿拉伯语 *)]项目没有涵盖多种医疗服务并包含细粒度医疗标签(即意图,行动,插槽,价值)的大规模医疗对话的数据集,并且没有一套针对多领域,多服务医疗对话的 acpMDS 的既定基准。在本文中,我们介绍 acsReMeDi,一组 aclReMeDi acusedReMeDi。ØurResources 由两部分组成,ØurResources 数据集和 ØurResources 基准。ØurResources 的数据集包含96,965次医生与病人之间的对话,其中包括1557次带有罚款标签的对话。它涵盖了40个领域的843种疾病,5228个医疗实体和3个医疗服务专业。就我们所知,ØurResources 数据集是唯一一个涵盖多个领域和服务的医疗对话数据集,并具有细粒度的医疗标签。ØurResources 资源的第二部分包含一组用于(医疗)对话生成的最先进模型。ØurResources 基准有以下方法: 在 ØurResources 数据集上训练、验证和测试的 * 项目预训模型(即 BERT-WMM、 BERT-MED、 GPT2和 MT5) ,以及扩展 ØurResources 数据集和加强最先进的预训模型的训练的 acfSCL 方法。* 我们描述了 ØurResources 数据集的创建、 ØurResources 基准测试方法,并使用 ØurResources 基准测试方法建立了实验结果,以便将来的研究与之进行比较。在本文中,我们共享数据集、基准测试的实现和评估脚本。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ReMeDi:+Resources+for+Multi-domain,+Multi-service,+Medical+Dialogues)|1| +|[ReMeDi: Resources for Multi-domain, Multi-service, Medical Dialogues](https://doi.org/10.1145/3477495.3531809)|Guojun Yan, Jiahuan Pei, Pengjie Ren, Zhaochun Ren, Xin Xin, Huasheng Liang, Maarten de Rijke, Zhumin Chen|Shandong University, Qingdao, China; University of Amsterdam, Amsterdam, Netherlands; WeChat Tencent, Qingdao, China|\AcpMDS aim to assist doctors and patients with a range of professional medical services, i.e., diagnosis, treatment and consultation. The development of \acpMDS is hindered because of a lack of resources. In particular. \beginenumerate* [label=(\arabic*) ] \item there is no dataset with large-scale medical dialogues that covers multiple medical services and contains fine-grained medical labels (i.e., intents, actions, slots, values), and \item there is no set of established benchmarks for \acpMDS for multi-domain, multi-service medical dialogues. \endenumerate* In this paper, we present \acsReMeDi, a set of \aclReMeDi \acusedReMeDi. ØurResources consists of two parts, the ØurResources dataset and the ØurResources benchmarks. The ØurResources dataset contains 96,965 conversations between doctors and patients, including 1,557 conversations with fine-gained labels. It covers 843 types of diseases, 5,228 medical entities, and 3 specialties of medical services across 40 domains. To the best of our knowledge, the ØurResources dataset is the only medical dialogue dataset that covers multiple domains and services, and has fine-grained medical labels. The second part of the ØurResources resources consists of a set of state-of-the-art models for (medical) dialogue generation. The ØurResources benchmark has the following methods: \beginenumerate* \item pretrained models (i.e., BERT-WWM, BERT-MED, GPT2, and MT5) trained, validated, and tested on the ØurResources dataset, and \item a \acfSCL method to expand the ØurResources dataset and enhance the training of the state-of-the-art pretrained models. \endenumerate* We describe the creation of the ØurResources dataset, the ØurResources benchmarking methods, and establish experimental results using the ØurResources benchmarking methods on the ØurResources dataset for future research to compare against. 
With this paper, we share the dataset, implementations of the benchmarks, and evaluation scripts.|该计划旨在为医生和病人提供一系列的专业医疗服务,包括诊断、治疗和咨询。由于缺乏资源,阻碍了 ACpMDS 的发展。尤其是。开始列举 * [ label = (阿拉伯语 *)]项目没有涵盖多种医疗服务并包含细粒度医疗标签(即意图,行动,插槽,价值)的大规模医疗对话的数据集,并且没有一套针对多领域,多服务医疗对话的 acpMDS 的既定基准。在本文中,我们介绍 acsReMeDi,一组 aclReMeDi acusedReMeDi。ØurResources 由两部分组成,ØurResources 数据集和 ØurResources 基准。ØurResources 的数据集包含96,965次医生与病人之间的对话,其中包括1557次带有罚款标签的对话。它涵盖了40个领域的843种疾病,5228个医疗实体和3个医疗服务专业。就我们所知,ØurResources 数据集是唯一一个涵盖多个领域和服务的医疗对话数据集,并具有细粒度的医疗标签。ØurResources 资源的第二部分包含一组用于(医疗)对话生成的最先进模型。ØurResources 基准有以下方法: 在 ØurResources 数据集上训练、验证和测试的 * 项目预训模型(即 BERT-WMM、 BERT-MED、 GPT2和 MT5) ,以及扩展 ØurResources 数据集和加强最先进的预训模型的训练的 acfSCL 方法。* 我们描述了 ØurResources 数据集的创建、 ØurResources 基准测试方法,并使用 ØurResources 基准测试方法建立了实验结果,以便将来的研究与之进行比较。在本文中,我们共享数据集、基准测试的实现和评估脚本。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ReMeDi:+Resources+for+Multi-domain,+Multi-service,+Medical+Dialogues)|1| |[Online DATEing: A Web Interface for Temporal Annotations](https://doi.org/10.1145/3477495.3531670)|Dennis Aumiller, Satya Almasian, David Pohl, Michael Gertz|Heidelberg University, Heidelberg, Germany|Despite more than two decades of research on temporal tagging and temporal relation extraction, usable tools for annotating text remain very basic and hard to set up from an average end-user perspective, limiting the applicability of developments to a selected group of invested researchers. In this work, we aim to increase the accessibility of temporal tagging systems by presenting an intuitive web interface, called "Online DATEing", which simplifies the interaction with existing temporal annotation frameworks. Our system integrates several approaches in a single interface and streamlines the process of importing (and tagging) groups of documents, as well as making it accessible through a programmatic API. It further enables users to interactively investigate and visualize tagged texts, and is designed with an extensible API for the inclusion of new models or data formats. A web demonstration of our tool is available at https://onlinedating.ifi.uni-heidelberg.de and public code accessible at https://github.com/satya77/Temporal_Tagger_Service.|尽管在时间标签和时间关系提取方面进行了二十多年的研究,但用于文本注释的可用工具仍然非常基础,从一般最终用户的角度来看很难建立起来,这限制了开发的适用性,使其只能适用于一组经过投资的研究人员。在这项工作中,我们的目标是通过提出一个直观的网络界面,称为“在线 DATEing”,简化与现有的时态标注框架的交互,提高时态标注系统的可访问性。我们的系统在单个接口中集成了多种方法,并简化了导入(和标记)文档组的过程,同时还通过编程 API 使其可访问。它进一步使用户能够交互式地调查和可视化标记的文本,并且设计了一个可扩展的 API,用于包含新的模型或数据格式。我们的工具的网页演示可在 https://onlinedating.ifi.uni-heidelberg.de 下载,公众代码亦可在 https://github.com/satya77/temporal_tagger_service 下载。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Online+DATEing:+A+Web+Interface+for+Temporal+Annotations)|1| |[Few-shot Node Classification on Attributed Networks with Graph Meta-learning](https://doi.org/10.1145/3477495.3531978)|Yonghao Liu, Mengyu Li, Ximing Li, Fausto Giunchiglia, Xiaoyue Feng, Renchu Guan|Jilin University, Changchun, China; University of Trento, Trento, Italy|Attributed networks, as a manifestation of data in non-Euclidean domains, have a wide range of applications in the real world, such as molecular property prediction, social network analysis and anomaly detection. Node classification, as a fundamental research problem in attributed networks, has attracted increasing attention among research communities. However, most existing models cannot be directly applied to the data with limited labeled instances (\textiti.e., the few-shot scenario). 
Few-shot node classification on attributed networks is gradually becoming a research hotspot. Although several methods aim to integrate meta-learning with graph neural networks to address this problem, some limitations remain. First, they all assume node representation learning using graph neural networks in homophilic graphs; hence, suboptimal performance is obtained when these models are applied to heterophilic graphs. Second, existing models based on meta-learning entirely depend on instance-based statistics, which in few-shot settings are unavoidably degraded by data noise or outliers. Third, most previous models treat all sampled tasks equally and fail to adapt to their uniqueness, which has a significant impact on the overall performance of the model. To solve the above three limitations, we propose a novel graph meta-learning framework called Graph learning based on Prototype and Scaling & shifting transformation (Meta-GPS). More specifically, we introduce an efficient method for learning expressive node representations even on heterophilic graphs and propose utilizing a prototype-based approach to initialize parameters in meta-learning. Moreover, we also leverage the S^2 (scaling & shifting) transformation to learn effective transferable knowledge from diverse tasks. Extensive experimental results on six real-world datasets demonstrate the superiority of our proposed framework, which outperforms other state-of-the-art baselines by up to 13% absolute improvement in terms of related metrics.|属性网络作为非欧几里德领域数据的表现形式,在现实世界中有着广泛的应用,如分子特性预测、社会网络分析和异常检测分析。节点分类作为属性网络的一个基础研究问题,越来越受到研究界的重视。然而,大多数现有的模型不能直接应用于带有有限标签的实例的数据(即 few-shot 场景)。基于属性网络的少镜头节点分类正逐渐成为研究热点。虽然有几种方法旨在将元学习与图神经网络相结合来解决这个问题,但仍然存在一些局限性。首先,它们都假定在同态图中使用图神经网络进行节点表示学习。因此,当这些模型应用于异质图时,得到了次优的性能。其次,现有的基于元学习的模型完全依赖于基于实例的统计。在少镜头设置中,不可避免地会受到数据噪声或异常值的影响。第三,大多数以前的模型对所有抽样任务一视同仁,不能适应它们的唯一性,这对模型的整体性能有重大影响。为了解决这三个问题,提出了一种新的图元学习框架——基于原型和缩放与移位变换的图学习(Meta-GPS)。更具体地说,我们介绍了一种有效的学习表达式节点表示的方法,甚至在异质图上,并提出了利用基于原型的方法来初始化元学习中的参数。此外,我们还利用 S^2(缩放和转移)转换来学习不同任务中有效的可转移知识。在六个真实世界数据集上的大量实验结果证明了我们提出的框架的优越性,它在相关指标方面比其他最先进的基线表现出高达13% 的绝对改进。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Few-shot+Node+Classification+on+Attributed+Networks+with+Graph+Meta-learning)|1| |[Co-clustering Interactions via Attentive Hypergraph Neural Network](https://doi.org/10.1145/3477495.3531868)|Tianchi Yang, Cheng Yang, Luhao Zhang, Chuan Shi, Maodi Hu, Huaijun Liu, Tao Li, Dong Wang|Meituan, Beijing, China; Beijing University of Posts and Telecommunications, Beijing, China|With the rapid growth of interaction data, many clustering methods have been proposed to discover interaction patterns as prior knowledge beneficial to downstream tasks. Considering that an interaction can be seen as an action occurring among multiple objects, most existing methods model the objects and their pair-wise relations as nodes and links in graphs. However, they only model and leverage part of the information in real entire interactions, i.e., either decompose the entire interaction into several pair-wise sub-interactions for simplification, or only focus on clustering some specific types of objects, which limits the performance and explainability of clustering. To tackle this issue, we propose to Co-cluster the Interactions via Attentive Hypergraph neural network (CIAH).
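The S^2 (scaling & shifting) transformation in the Meta-GPS entry above can be pictured as cheap per-task modulation of shared parameters. The sketch below applies learnable scale and shift vectors to a shared linear layer's output; the exact placement in Meta-GPS may differ, and all names are ours:

```python
import torch
import torch.nn as nn

class ScaleShiftLinear(nn.Module):
    def __init__(self, dim_in, dim_out):
        super().__init__()
        self.base = nn.Linear(dim_in, dim_out)          # shared, slow weights
        self.gamma = nn.Parameter(torch.ones(dim_out))  # task-specific scale
        self.beta = nn.Parameter(torch.zeros(dim_out))  # task-specific shift

    def forward(self, x):
        return self.gamma * self.base(x) + self.beta

layer = ScaleShiftLinear(8, 4)
print(layer(torch.randn(3, 8)).shape)  # torch.Size([3, 4])
```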
Particularly, with more comprehensive modeling of interactions by hypergraph, we propose an attentive hypergraph neural network to encode the entire interactions, where an attention mechanism is utilized to select important attributes for explanations. Then, we introduce a salient method to guide the attention to be more consistent with real importance of attributes, namely saliency-based consistency. Moreover, we propose a novel co-clustering method to perform a joint clustering for the representations of interactions and the corresponding distributions of attribute selection, namely cluster-based consistency. Extensive experiments demonstrate that our CIAH significantly outperforms state-of-the-art clustering methods on both public datasets and real industrial datasets.|随着交互数据的快速增长,人们提出了许多聚类方法来发现交互模式作为有利于下游任务的先验知识。考虑到一个交互可以看作是多个对象之间的一个动作,现有的方法大多将对象及其成对关系建模为图中的节点和链接。然而,他们只是在真实的整个交互中对部分信息进行建模和利用,也就是说,要么将整个交互分解为几个成对的子交互以进行简化,要么只关注某些特定类型的对象的聚类,这限制了聚类的性能和可解释性。为了解决这一问题,我们提出了通过注意超图神经网络(CIAH)对相互作用进行共聚类。特别是,借助超图对交互作用进行更全面的建模,我们提出了一种注意力超图神经网络对整个交互作用进行编码,其中利用注意力机制选择重要属性以提供解释。然后,我们引入了一种显著性方法来引导注意力与属性的真实重要性更加一致,即基于显著性的一致性。此外,我们提出了一种新的共聚类方法,即基于聚类的一致性,对交互作用的表示和相应的属性选择分布进行联合聚类。大量的实验表明,我们的 CIAH 在公共数据集和实际工业数据集上都显著优于最先进的聚类方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Co-clustering+Interactions+via+Attentive+Hypergraph+Neural+Network)|1| |[Mutual Disentanglement Learning for Joint Fine-Grained Sentiment Classification and Controllable Text Generation](https://doi.org/10.1145/3477495.3532029)|Hao Fei, Chenliang Li, Donghong Ji, Fei Li|Wuhan University, Wuhan, China|Fine-grained sentiment classification (FGSC) task and fine-grained controllable text generation (FGSG) task are two representative applications of sentiment analysis, two of which together can actually form an inverse task prediction, i.e., the former aims to infer the fine-grained sentiment polarities given a text piece, while the latter generates text content that describes the input fine-grained opinions. Most of the existing work solves the FGSC and the FGSG tasks in isolation, while ignoring the complementary benefits in between. This paper combines FGSC and FGSG as a joint dual learning system, encouraging them to learn the advantages from each other. Based on the dual learning framework, we further propose decoupling the feature representations in two tasks into fine-grained aspect-oriented opinion variables and content variables respectively, by performing mutual disentanglement learning upon them. We also propose to transform the difficult "data-to-text'' generation fashion widely used in FGSG into an easier text-to-text generation fashion by creating surrogate natural language text as the model inputs. Experimental results on 7 sentiment analysis benchmarks including both the document-level and sentence-level datasets show that our method significantly outperforms the current strong-performing baselines on both the FGSC and FGSG tasks.
Automatic and human evaluations demonstrate that our FGSG model successfully generates fluent, diverse and rich content conditioned on fine-grained sentiments.|细粒度情绪分类任务(FGSC)和细粒度可控文本生成任务(FGSG)是情绪分析的两个具有代表性的应用,两者结合起来实际上可以形成一个反向的任务预测,即前者旨在推断给定文本片段的细粒度情绪极性,而后者生成描述输入细粒度意见的文本内容。现有的大多数工作是孤立地解决 FGSC 和 FGSG 任务,而忽略了它们之间的互补性。本文将 FGSC 和 FGSG 作为一个联合的对偶学习系统结合起来,鼓励它们相互学习对方的优势。在对偶学习框架的基础上,进一步提出将两个任务的特征表示分别解耦为细粒度面向方面的观点变量和内容变量,并对它们进行相互解缠学习。我们还建议通过创建替代自然语言文本作为模型输入,将 FGSG 中广泛使用的困难的“数据到文本”生成方式转换为更容易的文本到文本生成方式。对包括文档级和句子级数据集在内的7个情绪分析基准的实验结果表明,我们的方法在 FGSC 和 FGSG 任务上都显著优于目前表现强劲的基准。自动和人工评估表明,我们的 FGSG 模型成功地生成了以细粒度情感为条件的流畅、多样和丰富的内容。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Mutual+Disentanglement+Learning+for+Joint+Fine-Grained+Sentiment+Classification+and+Controllable+Text+Generation)|1| -|[Learning Disentangled Representations for Counterfactual Regression via Mutual Information Minimization](https://doi.org/10.1145/3477495.3532011)|Mingyuan Cheng, Xinru Liao, Quan Liu, Bin Ma, Jian Xu, Bo Zheng|Alibaba Group, Hangzhou, China; Alibaba Group, Beijing, China|Learning individual-level treatment effect is a fundamental problem in causal inference and has received increasing attention in many areas, especially in the user growth area which concerns many internet companies. Recently, disentangled representation learning methods that decompose covariates into three latent factors, including instrumental, confounding and adjustment factors, have witnessed great success in treatment effect estimation. However, it remains an open problem how to learn the underlying disentangled factors precisely. Specifically, previous methods fail to obtain independent disentangled factors, which is a necessary condition for identifying treatment effect. In this paper, we propose Disentangled Representations for Counterfactual Regression via Mutual Information Minimization (MIM-DRCFR), which uses a multi-task learning framework to share information when learning the latent factors and incorporates MI minimization learning criteria to ensure the independence of these factors. Extensive experiments including public benchmarks and real-world industrial user growth datasets demonstrate that our method performs much better than state-of-the-art methods.|个体水平治疗效应的学习是因果推理中的一个基本问题,在许多领域,特别是在涉及到许多互联网公司的用户增长领域受到越来越多的关注。近年来,将协变量分解为工具因素、混杂因素和调整因素的分离表征学习方法在评估治疗效果方面取得了很大的成功。然而,如何准确地认识这些潜在的分离因素仍然是一个悬而未决的问题。具体来说,以往的方法不能获得独立的解缠因子,这是确定治疗效果的必要条件。本文提出了基于互信息最小化的反事实回归分离表示方法(MIM-DRCFR) ,该方法采用多任务学习框架,在学习潜在因素时共享信息,并结合 MI 最小化学习准则,保证了这些因素的独立性。包括公共基准和真实世界工业用户增长数据集在内的大量实验表明,我们的方法比最先进的方法表现得更好。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+Disentangled+Representations+for+Counterfactual+Regression+via+Mutual+Information+Minimization)|1| -|[L3E-HD: A Framework Enabling Efficient Ensemble in High-Dimensional Space for Language Tasks](https://doi.org/10.1145/3477495.3531761)|Fangxin Liu, Haomin Li, Xiaokang Yang, Li Jiang|Shanghai Jiao Tong University, Shanghai, China; Shanghai Jiao Tong University, Shanghai , China; Tianjin University, Tianjin, China|Brain-inspired hyperdimensional computing (HDC) has been introduced as an alternative computing paradigm to achieve efficient and robust learning. HDC simulates cognitive tasks by mapping all data points to patterns of neural activity in the high-dimensional space, which has demonstrated promising performances in a wide range of applications such as robotics, biomedical signal processing, and genome sequencing.
Language tasks, generally solved using machine learning methods, are widely deployed on low-power embedded devices. However, existing HDC solutions suffer from major challenges that impede the deployment of low-power embedded devices: the storage and computation overhead of HDC models grows dramatically with (i) the number of dimensions and (ii) the complex similarity metric during the inference. In this paper, we proposed a novel ensemble framework for the language task, termed L3E-HD, which enables efficient HDC on low-power edge devices. L3E-HD accelerates the inference by mapping data points to a high-dimensional binary space to simplify similarity search, which dominates costly and frequent operation in HDC. Through marrying HDC with the ensemble technique, L3E-HD also addresses the severe accuracy degradation induced by the compression of the dimension and precision of the model. Our experiments show that the ensemble technique is naturally a perfect fit to boost HDCs. We find that our L3E-HD, which is faster, more efficient, and more accurate than conventional machine learning methods, can even surpass the accuracy of the full-precision model at a smaller model size. Code is released at: https://github.com/MXHX7199/SIGIR22-EnsembleHDC.|受大脑启发的超维计算(HDC)已被引入作为一种替代计算范式,以实现高效和健壮的学习。HDC 通过将所有数据点映射到高维空间中的神经活动模式来模拟认知任务,已经在机器人、生物医学信号处理和基因组测序等广泛的应用中展示了有前途的性能。语言任务通常采用机器学习方法来解决,在低功耗嵌入式设备上得到了广泛的应用。然而,现有的 HDC 解决方案面临着阻碍低功耗嵌入式设备部署的主要挑战: HDC 模型的存储和计算开销随着(i)维数和(ii)推断期间复杂的相似度量急剧增长。在本文中,我们提出了一个新的语言任务集成框架,称为 L3E-HD,它可以在低功耗边缘设备上实现高效的 HDC。L3E-HD 通过将数据点映射到高维二值空间来简化相似性搜索(即 HDC 中代价高昂且频繁的操作),从而加快推理速度。通过将 HDC 与集成技术相结合,L3E-HD 还解决了由于模型维度和精度的压缩而引起的精度严重下降的问题。我们的实验表明,集成技术天然非常适合用于增强 HDC。我们发现我们的 L3E-HD 比传统的机器学习方法更快、更高效、更精确,甚至可以在更小的模型尺寸下超过全精度模型的精度。代码发布于: https://github.com/mxhx7199/sigir22-ensemblehdc。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=L3E-HD:+A+Framework+Enabling+Efficient+Ensemble+in+High-Dimensional+Space+for+Language+Tasks)|1| +|[Learning Disentangled Representations for Counterfactual Regression via Mutual Information Minimization](https://doi.org/10.1145/3477495.3532011)|Mingyuan Cheng, Xinru Liao, Quan Liu, Bin Ma, Jian Xu, Bo Zheng|Alibaba Group, Beijing, China; Alibaba Group, Hangzhou, China|Learning individual-level treatment effect is a fundamental problem in causal inference and has received increasing attention in many areas, especially in the user growth area which concerns many internet companies. Recently, disentangled representation learning methods that decompose covariates into three latent factors, including instrumental, confounding and adjustment factors, have witnessed great success in treatment effect estimation. However, it remains an open problem how to learn the underlying disentangled factors precisely. Specifically, previous methods fail to obtain independent disentangled factors, which is a necessary condition for identifying treatment effect. In this paper, we propose Disentangled Representations for Counterfactual Regression via Mutual Information Minimization (MIM-DRCFR), which uses a multi-task learning framework to share information when learning the latent factors and incorporates MI minimization learning criteria to ensure the independence of these factors.
Extensive experiments including public benchmarks and real-world industrial user growth datasets demonstrate that our method performs much better than state-of-the-art methods.|个体水平治疗效应的学习是因果推理中的一个基本问题,在许多领域,特别是在涉及到许多互联网公司的用户增长领域受到越来越多的关注。近年来,将协变量分解为工具因素、混杂因素和调整因素的分离表征学习方法在评估治疗效果方面取得了很大的成功。然而,如何准确地认识这些潜在的分离因素仍然是一个悬而未决的问题。具体来说,以往的方法不能获得独立的解缠因子,这是确定治疗效果的必要条件。本文提出了基于互信息最小化的反事实回归分离表示方法(MIM-DRCFR) ,该方法采用多任务学习框架,在学习潜在因素时共享信息,并结合 MI 最小化学习准则,保证了这些因素的独立性。包括公共基准和真实世界工业用户增长数据集在内的大量实验表明,我们的方法比最先进的方法表现得更好。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+Disentangled+Representations+for+Counterfactual+Regression+via+Mutual+Information+Minimization)|1| +|[L3E-HD: A Framework Enabling Efficient Ensemble in High-Dimensional Space for Language Tasks](https://doi.org/10.1145/3477495.3531761)|Fangxin Liu, Haomin Li, Xiaokang Yang, Li Jiang|Shanghai Jiao Tong University, Shanghai , China; Tianjin University, Tianjin, China; Shanghai Jiao Tong University, Shanghai, China|Brain-inspired hyperdimensional computing (HDC) has been introduced as an alternative computing paradigm to achieve efficient and robust learning. HDC simulates cognitive tasks by mapping all data points to patterns of neural activity in the high-dimensional space, which has demonstrated promising performances in a wide range of applications such as robotics, biomedical signal processing, and genome sequencing. Language tasks, generally solved using machine learning methods, are widely deployed on low-power embedded devices. However, existing HDC solutions suffer from major challenges that impede the deployment of low-power embedded devices: the storage and computation overhead of HDC models grows dramatically with (i) the number of dimensions and (ii) the complex similarity metric during the inference. In this paper, we proposed a novel ensemble framework for the language task, termed L3E-HD, which enables efficient HDC on low-power edge devices. L3E-HD accelerates the inference by mapping data points to a high-dimensional binary space to simplify similarity search, which dominates costly and frequent operation in HDC. Through marrying HDC with the ensemble technique, L3E-HD also addresses the severe accuracy degradation induced by the compression of the dimension and precision of the model. Our experiments show that the ensemble technique is naturally a perfect fit to boost HDCs. We find that our L3E-HD, which is faster, more efficient, and more accurate than conventional machine learning methods, can even surpass the accuracy of the full-precision model at a smaller model size. 
Code is released at: https://github.com/MXHX7199/SIGIR22-EnsembleHDC.|受大脑启发的超维计算(HDC)已被引入作为一种替代计算范式,以实现高效和健壮的学习。HDC 通过将所有数据点映射到高维空间中的神经活动模式来模拟认知任务,已经在机器人、生物医学信号处理和基因组测序等广泛的应用中展示了有前途的性能。语言任务通常采用机器学习方法来解决,在低功耗嵌入式设备上得到了广泛的应用。然而,现有的 HDC 解决方案面临着阻碍低功耗嵌入式设备部署的主要挑战: HDC 模型的存储和计算开销随着(i)维数和(ii)推断期间复杂的相似度量急剧增长。在本文中,我们提出了一个新的语言任务集成框架,称为 L3E-HD,它可以在低功耗边缘设备上实现高效的 HDC。L3E-HD 通过将数据点映射到高维二值空间来简化相似性搜索(即 HDC 中代价高昂且频繁的操作),从而加快推理速度。通过将 HDC 与集成技术相结合,L3E-HD 还解决了由于模型维度和精度的压缩而引起的精度严重下降的问题。我们的实验表明,集成技术天然非常适合用于增强 HDC。我们发现我们的 L3E-HD 比传统的机器学习方法更快、更高效、更精确,甚至可以在更小的模型尺寸下超过全精度模型的精度。代码发布于: https://github.com/mxhx7199/sigir22-ensemblehdc。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=L3E-HD:+A+Framework+Enabling+Efficient+Ensemble+in+High-Dimensional+Space+for+Language+Tasks)|1| |[Graph Capsule Network with a Dual Adaptive Mechanism](https://doi.org/10.1145/3477495.3531764)|Xiangping Zheng, Xun Liang, Bo Wu, Yuhui Guo, Xuan Zhang|Renmin University of China, Beijing, China|While Graph Convolutional Networks (GCNs) have been extended to various fields of artificial intelligence with their powerful representation capabilities, recent studies have revealed that their ability to capture the part-whole structure of the graph is limited. Furthermore, though many GCNs variants have been proposed and obtained state-of-the-art results, they face the situation that much early information may be lost during the graph convolution step. To this end, we innovatively present an Graph Capsule Network with a Dual Adaptive Mechanism (DA-GCN) to tackle the above challenges. Specifically, this powerful mechanism is a dual-adaptive mechanism to capture the part-whole structure of the graph. One is an adaptive node interaction module to explore the potential relationship between interactive nodes. The other is an adaptive attention-based graph dynamic routing to select appropriate graph capsules, so that only favorable graph capsules are gathered and redundant graph capsules are restrained for better capturing the whole structure between graphs. Experiments demonstrate that our proposed algorithm has achieved the most advanced or competitive results on all datasets.|虽然图卷积网络(GCNs)以其强大的表示能力已经扩展到人工智能的各个领域,但最近的研究表明,它们捕获图的部分-整体结构的能力是有限的。此外,虽然已经提出了许多 GCNs 变体,并获得了最先进的结果,但是它们面临的情况是,在图卷积步骤中,许多早期信息可能丢失。为此,我们创新性地提出了一个具有双重适应机制的图胶囊网络(DA-GCN)来应对上述挑战。具体来说,这种强大的机制是一种双自适应机制,用于捕获图的部分-整体结构。一种是自适应节点交互模块,用于探索交互节点之间的潜在关系。另一种是基于自适应注意的图动态路由选择合适的图胶囊,从而只收集有利的图胶囊,抑制冗余的图胶囊,以便更好地捕捉图间的整体结构。实验结果表明,本文提出的算法在所有数据集上都取得了最先进或最有竞争力的结果。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Graph+Capsule+Network+with+a+Dual+Adaptive+Mechanism)|1| |[Training Entire-Space Models for Target-oriented Opinion Words Extraction](https://doi.org/10.1145/3477495.3531768)|Yuncong Li, Fang Wang, ShengHua Zhong|Tencent Inc, Shenzhen, China; Shenzhen University, Shenzhen, China|Target-oriented opinion words extraction (TOWE) is a subtask of aspect-based sentiment analysis (ABSA). Given a sentence and an aspect term occurring in the sentence, TOWE extracts the corresponding opinion words for the aspect term. TOWE has two types of instance. In the first type, aspect terms are associated with at least one opinion word, while in the second type, aspect terms do not have corresponding opinion words. However, previous researches trained and evaluated their models with only the first type of instance, resulting in a sample selection bias problem.
Specifically, TOWE models were trained with only the first type of instance, while these models would be utilized to make inference on the entire space with both the first type of instance and the second type of instance. Thus, the generalization performance will be hurt. Moreover, the performance of these models on the first type of instance cannot reflect their performance on entire space. To validate the sample selection bias problem, four popular TOWE datasets containing only aspect terms associated with at least one opinion word are extended and additionally include aspect terms without corresponding opinion words. Experimental results on these datasets show that training TOWE models on entire space will significantly improve model performance and evaluating TOWE models only on the first type of instance will overestimate model performance.|面向目标的意见词提取(TOWE)是基于方面的情感分析(ABSA)的一个子任务。给定一个句子以及句子中出现的一个方面词,TOWE 为该方面词提取相应的意见词。TOWE 有两种类型的实例。在第一种类型中,方面词至少与一个意见词相关联,而在第二种类型中,方面词没有相应的意见词。然而,以往的研究仅用第一类实例对模型进行训练和评价,导致样本选择偏差问题。具体来说,TOWE 模型只用第一种类型的实例进行训练,而这些模型却要在同时包含第一类实例和第二类实例的整个空间上进行推断。因此,泛化性能将受到影响。此外,这些模型在第一类实例上的性能不能反映它们在整个空间上的性能。为了验证样本选择偏差问题,我们对四个仅包含与至少一个意见词相关联的方面词的常用 TOWE 数据集进行了扩展,额外加入了没有相应意见词的方面词。在这些数据集上的实验结果表明,在整个空间上训练 TOWE 模型将显著提高模型的性能,而只在第一类实例上评估 TOWE 模型则会高估模型的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Training+Entire-Space+Models+for+Target-oriented+Opinion+Words+Extraction)|1| |[Point Prompt Tuning for Temporally Language Grounding](https://doi.org/10.1145/3477495.3531795)|Yawen Zeng|Tencent Inc., Shenzhen, China|The task of temporally language grounding (TLG) aims to locate a video moment from an untrimmed video that match a given textual query, which has attracted considerable research attention. In recent years, typical retrieval-based TLG methods are inefficient due to pre-segmented candidate moments, while localization-based TLG solutions adopt reinforcement learning resulting in unstable convergence. Therefore, how to perform TLG task efficiently and stably is a non-trivial work. Toward this end, we innovatively contribute a solution, Point Prompt Tuning (PPT), which formulates this task as a prompt-based multi-modal problem and integrates multiple sub-tasks to tuning performance. Specifically, a flexible prompt strategy is contributed to rewrite the query firstly, which contains both query, start point and end point. Thereafter, a multi-modal Transformer is adopted to fully learn the multi-modal context. Meanwhile, we design various sub-tasks to constrain the novel framework, namely matching task and localization task. Finally, the start and end points of matched video moment are straightforward predicted, simply yet stably.
Extensive experiments on two real-world datasets have well verified the effectiveness of our proposed solution.|时序语言定位(TLG)任务旨在从未剪辑的视频中定位与给定文本查询相匹配的视频片段,已经引起了相当多的研究关注。近年来,典型的基于检索的 TLG 方法由于需要预先分割候选片段而效率低下,而基于定位的 TLG 解决方案则采用强化学习,导致收敛不稳定。因此,如何高效、稳定地完成 TLG 任务是一项非常重要的工作。为此,我们创新性地提供了一个解决方案,即 Point Prompt Tuning (PPT) ,它将此任务描述为一个基于提示的多模态问题,并集成多个子任务来调优性能。提出了一种灵活的提示策略,首先对查询进行重写,使其同时包含查询、起始点和终止点。此后,采用多模态 Transformer 来充分学习多模态上下文。同时,我们设计了各种子任务来约束新的框架,即匹配任务和定位任务。最后,对匹配视频片段的起始点和终止点进行直接预测,简单而稳定。在两个实际数据集上的大量实验已经很好地验证了我们提出的解决方案的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Point+Prompt+Tuning+for+Temporally+Language+Grounding)|1| -|[What Makes a Good Podcast Summary?](https://doi.org/10.1145/3477495.3531802)|Rezvaneh Rezapour, Sravana Reddy, Rosie Jones, Ian Soboroff|NIST, Gaithersburg, MD, USA; Drexel University, Philadelphia, PA, USA; Spotify, Boston, MA, USA; ASAPP, New York, NY, USA|Abstractive summarization of podcasts is motivated by the growing popularity of podcasts and the needs of their listeners. Podcasting is a markedly different domain from news and other media that are commonly studied in the context of automatic summarization. As such, the qualities of a good podcast summary are yet unknown. Using a collection of podcast summaries produced by different algorithms alongside human judgments of summary quality obtained from the TREC 2020 Podcasts Track, we study the correlations between various automatic evaluation metrics and human judgments, as well as the linguistic aspects of summaries that result in strong evaluations.|对播客进行抽象式摘要的动机来自播客日益增长的流行度及其听众的需求。与自动摘要研究通常关注的新闻和其他媒体相比,播客是一个明显不同的领域。因此,一个好的播客摘要应具备哪些特质尚不清楚。我们使用由不同算法生成的播客摘要集合,以及来自 TREC 2020 Podcasts Track 的摘要质量人工评判,研究了各种自动评估指标与人工评判之间的相关性,以及获得高评价的摘要所具有的语言特征。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=What+Makes+a+Good+Podcast+Summary?)|1| -|[BSAL: A Framework of Bi-component Structure and Attribute Learning for Link Prediction](https://doi.org/10.1145/3477495.3531804)|Bisheng Li, Min Zhou, Shengzhong Zhang, Menglin Yang, Defu Lian, Zengfeng Huang|Huawei Noah's Ark Lab, Shenzhen, China; The Chinese University of Hong Kong, Hong Kong, China; Fudan University, Shanghai, China; University of Science and Technology of China, Hefei, China|Given the ubiquitous existence of graph-structured data, learning the representations of nodes for the downstream tasks ranging from node classification, link prediction to graph classification is of crucial importance. Regarding missing link inference of diverse networks, we revisit the link prediction techniques and identify the importance of both the structural and attribute information. However, the available techniques either heavily count on the network topology which is spurious in practice, or cannot integrate graph topology and features properly. To bridge the gap, we propose a bicomponent structural and attribute learning framework (BSAL) that is designed to adaptively leverage information from topology and feature spaces. Specifically, BSAL constructs a semantic topology via the node attributes and then gets the embeddings regarding the semantic view, which provides a flexible and easy-to-implement solution to adaptively incorporate the information carried by the node attributes. Then the semantic embedding together with topology embedding are fused together using attention mechanism for the final prediction.
Extensive experiments show the superior performance of our proposal and it significantly outperforms baselines on diverse research benchmarks.|由于图结构数据的普遍存在,学习下游任务的节点表示,从节点分类、链路预测到图分类,都是至关重要的。关于不同网络的缺失链接推断,我们重新审视了链接预测技术,并确定了结构和属性信息的重要性。然而,现有的技术要么严重依赖在实践中并不可靠的网络拓扑,要么不能恰当地整合图拓扑和特征。为了弥补这一差距,我们提出了一个双组件结构和属性学习框架(BSAL) ,该框架旨在自适应地利用拓扑和特征空间中的信息。具体来说,BSAL 通过节点属性构造一个语义拓扑,然后得到关于语义视图的嵌入,这为自适应地融合节点属性所携带的信息提供了一个灵活且易于实现的解决方案。然后利用注意力机制将语义嵌入和拓扑嵌入融合在一起进行最终预测。大量实验表明了我们所提方法的优越性能,其在多个研究基准上显著优于基线方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=BSAL:+A+Framework+of+Bi-component+Structure+and+Attribute+Learning+for+Link+Prediction)|1| -|[Tensor-based Graph Modularity for Text Data Clustering](https://doi.org/10.1145/3477495.3531834)|Rafika Boutalbi, Mira Ait Saada, Anastasiia Iurshina, Steffen Staab, Mohamed Nadif|University of Stuttgart, Stuttgart, Germany; Université de Paris, Paris, France|Graphs are used in several applications to represent similarities between instances. For text data, we can represent texts by different features such as bag-of-words, static embeddings (Word2vec, GloVe, etc.), and contextual embeddings (BERT, RoBERTa, etc.), leading to multiple similarities (or graphs) based on each representation. The proposal posits that incorporating the local invariance within every graph and the consistency across different graphs leads to a consensus clustering that improves the document clustering. This problem is complex and challenged with the sparsity and the noisy data included in each graph. To this end, we rely on the modularity metric, which effectively evaluates graph clustering in such circumstances. Therefore, we present a novel approach for text clustering based on both a sparse tensor representation and graph modularity. This leads to cluster texts (nodes) while capturing information arising from the different graphs. We iteratively maximize a Tensor-based Graph Modularity criterion. Extensive experiments on benchmark text clustering datasets are performed, showing that the proposed algorithm referred to as Tensor Graph Modularity -TGM- outperforms other baseline methods in terms of clustering task. The source code is available at https://github.com/TGMclustering/TGMclustering.|在许多应用中,图被用来表示实例之间的相似性。对于文本数据,我们可以通过不同的特征来表示文本,例如词袋、静态嵌入(Word2vec、 GloVe 等)和上下文嵌入(BERT、 RoBERTa 等) ,从而基于每种表示形式产生多种相似性(或图)。该方案假定,在每个图中引入局部不变性和不同图之间的一致性,可以形成共识聚类,从而改善文档聚类。这一问题较为复杂,且受到每个图中数据稀疏和噪声的挑战。为此,我们依赖模块度度量,它能在这种情况下有效地评估图聚类。因此,我们提出了一种基于稀疏张量表示和图模块度的文本聚类新方法。这样可以在对文本(节点)进行聚类的同时,捕获来自不同图的信息。我们迭代地最大化一个基于张量的图模块度准则。对基准文本聚类数据集进行了广泛的实验,结果表明所提出的称为张量图模块度(TGM)的算法在聚类任务上优于其他基线方法。源代码可在 https://github.com/tgmclustering/tgmclustering 获取。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Tensor-based+Graph+Modularity+for+Text+Data+Clustering)|1| -|[GraphAD: A Graph Neural Network for Entity-Wise Multivariate Time-Series Anomaly Detection](https://doi.org/10.1145/3477495.3531848)|Xu Chen, Qiu Qiu, Changshan Li, Kunqing Xie|Alibaba DAMO Academy, Hangzhou, China; Peking University, Beijing, China|In recent years, the emergence and development of third-party platforms have greatly facilitated the growth of the Online to Offline (O2O) business. However, the large amount of transaction data raises new challenges for retailers, especially anomaly detection in operating conditions. Thus, platforms begin to develop intelligent business assistants with embedded anomaly detection methods to reduce the management burden on retailers.
Traditional time-series anomaly detection methods capture underlying patterns from the perspectives of time and attributes, ignoring the difference between retailers in this scenario. Besides, similar transaction patterns extracted by the platforms can also provide guidance to individual retailers and enrich their available information without privacy issues. In this paper, we pose an entity-wise multivariate time-series anomaly detection problem that considers the time-series of each unique entity. To address this challenge, we propose GraphAD, a novel multivariate time-series anomaly detection model based on the graph neural network. GraphAD decomposes the Key Performance Indicator (KPI) into stable and volatility components and extracts their patterns in terms of attributes, entities and temporal perspectives via graph neural networks. We also construct a real-world entity-wise multivariate time-series dataset from the business data of Ele.me. The experimental results on this dataset show that GraphAD significantly outperforms existing anomaly detection methods.|近年来,第三方平台的出现和发展极大地促进了线上到线下(O2O)业务的增长。然而,大量的交易数据给零售商带来了新的挑战,尤其是经营状况的异常检测。因此,平台开始开发具有嵌入式异常检测方法的智能商务助理,以减轻零售商的管理负担。传统的时间序列异常检测方法从时间和属性的角度捕捉潜在的模式,忽略了在这种情况下零售商之间的差异。此外,平台所提取的类似交易模式亦可为个别零售商提供指引,并在不涉及私隐问题的情况下丰富他们的资料。在本文中,我们提出了一个实体级多变量时间序列异常检测问题,即考虑每个独立实体自身的时间序列。为了应对这一挑战,我们提出了一种新的基于图神经网络的多变量时间序列异常检测模型 GraphAD。GraphAD 将关键绩效指标(KPI)分解成稳定和波动的分量,并通过图神经网络从属性、实体和时间角度提取它们的模式。我们还利用 Ele.me 的业务数据构造了一个真实世界的实体级多变量时间序列数据集。在这个数据集上的实验结果显示,GraphAD 显著优于现有的异常检测方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=GraphAD:+A+Graph+Neural+Network+for+Entity-Wise+Multivariate+Time-Series+Anomaly+Detection)|1| +|[What Makes a Good Podcast Summary?](https://doi.org/10.1145/3477495.3531802)|Rezvaneh Rezapour, Sravana Reddy, Rosie Jones, Ian Soboroff|NIST, Gaithersburg, MD, USA; Spotify, Boston, MA, USA; ASAPP, New York, NY, USA; Drexel University, Philadelphia, PA, USA|Abstractive summarization of podcasts is motivated by the growing popularity of podcasts and the needs of their listeners. Podcasting is a markedly different domain from news and other media that are commonly studied in the context of automatic summarization. As such, the qualities of a good podcast summary are yet unknown. Using a collection of podcast summaries produced by different algorithms alongside human judgments of summary quality obtained from the TREC 2020 Podcasts Track, we study the correlations between various automatic evaluation metrics and human judgments, as well as the linguistic aspects of summaries that result in strong evaluations.|对播客进行抽象式摘要的动机来自播客日益增长的流行度及其听众的需求。与自动摘要研究通常关注的新闻和其他媒体相比,播客是一个明显不同的领域。因此,一个好的播客摘要应具备哪些特质尚不清楚。我们使用由不同算法生成的播客摘要集合,以及来自 TREC 2020 Podcasts Track 的摘要质量人工评判,研究了各种自动评估指标与人工评判之间的相关性,以及获得高评价的摘要所具有的语言特征。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=What+Makes+a+Good+Podcast+Summary?)|1| +|[BSAL: A Framework of Bi-component Structure and Attribute Learning for Link Prediction](https://doi.org/10.1145/3477495.3531804)|Bisheng Li, Min Zhou, Shengzhong Zhang, Menglin Yang, Defu Lian, Zengfeng Huang|The Chinese University of Hong Kong, Hong Kong, China; Fudan University, Shanghai, China; Huawei Noah's Ark Lab, Shenzhen, China; University of Science and Technology of China, Hefei, China|Given the ubiquitous existence of graph-structured data, learning the representations of nodes for the downstream tasks ranging from node classification, link prediction to graph classification is of crucial importance.
Regarding missing link inference of diverse networks, we revisit the link prediction techniques and identify the importance of both the structural and attribute information. However, the available techniques either heavily count on the network topology which is spurious in practice, or cannot integrate graph topology and features properly. To bridge the gap, we propose a bicomponent structural and attribute learning framework (BSAL) that is designed to adaptively leverage information from topology and feature spaces. Specifically, BSAL constructs a semantic topology via the node attributes and then gets the embeddings regarding the semantic view, which provides a flexible and easy-to-implement solution to adaptively incorporate the information carried by the node attributes. Then the semantic embedding together with topology embedding are fused together using attention mechanism for the final prediction. Extensive experiments show the superior performance of our proposal and it significantly outperforms baselines on diverse research benchmarks.|由于图结构数据的普遍存在,学习下游任务的节点表示,从节点分类、链路预测到图分类,都是至关重要的。关于不同网络的缺失链接推断,我们重新审视了链接预测技术,并确定了结构和属性信息的重要性。然而,现有的技术要么严重依赖在实践中并不可靠的网络拓扑,要么不能恰当地整合图拓扑和特征。为了弥补这一差距,我们提出了一个双组件结构和属性学习框架(BSAL) ,该框架旨在自适应地利用拓扑和特征空间中的信息。具体来说,BSAL 通过节点属性构造一个语义拓扑,然后得到关于语义视图的嵌入,这为自适应地融合节点属性所携带的信息提供了一个灵活且易于实现的解决方案。然后利用注意力机制将语义嵌入和拓扑嵌入融合在一起进行最终预测。大量实验表明了我们所提方法的优越性能,其在多个研究基准上显著优于基线方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=BSAL:+A+Framework+of+Bi-component+Structure+and+Attribute+Learning+for+Link+Prediction)|1| +|[Tensor-based Graph Modularity for Text Data Clustering](https://doi.org/10.1145/3477495.3531834)|Rafika Boutalbi, Mira Ait Saada, Anastasiia Iurshina, Steffen Staab, Mohamed Nadif|Université de Paris, Paris, France; University of Stuttgart, Stuttgart, Germany|Graphs are used in several applications to represent similarities between instances. For text data, we can represent texts by different features such as bag-of-words, static embeddings (Word2vec, GloVe, etc.), and contextual embeddings (BERT, RoBERTa, etc.), leading to multiple similarities (or graphs) based on each representation. The proposal posits that incorporating the local invariance within every graph and the consistency across different graphs leads to a consensus clustering that improves the document clustering. This problem is complex and challenged with the sparsity and the noisy data included in each graph. To this end, we rely on the modularity metric, which effectively evaluates graph clustering in such circumstances. Therefore, we present a novel approach for text clustering based on both a sparse tensor representation and graph modularity. This leads to cluster texts (nodes) while capturing information arising from the different graphs. We iteratively maximize a Tensor-based Graph Modularity criterion. Extensive experiments on benchmark text clustering datasets are performed, showing that the proposed algorithm referred to as Tensor Graph Modularity -TGM- outperforms other baseline methods in terms of clustering task.
The source code is available at https://github.com/TGMclustering/TGMclustering.|在许多应用中,图被用来表示实例之间的相似性。对于文本数据,我们可以通过不同的特征来表示文本,例如词袋、静态嵌入(Word2vec、 GloVe 等)和上下文嵌入(BERT、 RoBERTa 等) ,从而基于每种表示形式产生多种相似性(或图)。该方案假定,在每个图中引入局部不变性和不同图之间的一致性,可以形成共识聚类,从而改善文档聚类。这一问题较为复杂,且受到每个图中数据稀疏和噪声的挑战。为此,我们依赖模块度度量,它能在这种情况下有效地评估图聚类。因此,我们提出了一种基于稀疏张量表示和图模块度的文本聚类新方法。这样可以在对文本(节点)进行聚类的同时,捕获来自不同图的信息。我们迭代地最大化一个基于张量的图模块度准则。对基准文本聚类数据集进行了广泛的实验,结果表明所提出的称为张量图模块度(TGM)的算法在聚类任务上优于其他基线方法。源代码可在 https://github.com/tgmclustering/tgmclustering 获取。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Tensor-based+Graph+Modularity+for+Text+Data+Clustering)|1| +|[GraphAD: A Graph Neural Network for Entity-Wise Multivariate Time-Series Anomaly Detection](https://doi.org/10.1145/3477495.3531848)|Xu Chen, Qiu Qiu, Changshan Li, Kunqing Xie|Peking University, Beijing, China; Alibaba DAMO Academy, Hangzhou, China|In recent years, the emergence and development of third-party platforms have greatly facilitated the growth of the Online to Offline (O2O) business. However, the large amount of transaction data raises new challenges for retailers, especially anomaly detection in operating conditions. Thus, platforms begin to develop intelligent business assistants with embedded anomaly detection methods to reduce the management burden on retailers. Traditional time-series anomaly detection methods capture underlying patterns from the perspectives of time and attributes, ignoring the difference between retailers in this scenario. Besides, similar transaction patterns extracted by the platforms can also provide guidance to individual retailers and enrich their available information without privacy issues. In this paper, we pose an entity-wise multivariate time-series anomaly detection problem that considers the time-series of each unique entity. To address this challenge, we propose GraphAD, a novel multivariate time-series anomaly detection model based on the graph neural network. GraphAD decomposes the Key Performance Indicator (KPI) into stable and volatility components and extracts their patterns in terms of attributes, entities and temporal perspectives via graph neural networks. We also construct a real-world entity-wise multivariate time-series dataset from the business data of Ele.me. The experimental results on this dataset show that GraphAD significantly outperforms existing anomaly detection methods.|近年来,第三方平台的出现和发展极大地促进了线上到线下(O2O)业务的增长。然而,大量的交易数据给零售商带来了新的挑战,尤其是经营状况的异常检测。因此,平台开始开发具有嵌入式异常检测方法的智能商务助理,以减轻零售商的管理负担。传统的时间序列异常检测方法从时间和属性的角度捕捉潜在的模式,忽略了在这种情况下零售商之间的差异。此外,平台所提取的类似交易模式亦可为个别零售商提供指引,并在不涉及私隐问题的情况下丰富他们的资料。在本文中,我们提出了一个实体级多变量时间序列异常检测问题,即考虑每个独立实体自身的时间序列。为了应对这一挑战,我们提出了一种新的基于图神经网络的多变量时间序列异常检测模型 GraphAD。GraphAD 将关键绩效指标(KPI)分解成稳定和波动的分量,并通过图神经网络从属性、实体和时间角度提取它们的模式。我们还利用 Ele.me 的业务数据构造了一个真实世界的实体级多变量时间序列数据集。在这个数据集上的实验结果显示,GraphAD 显著优于现有的异常检测方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=GraphAD:+A+Graph+Neural+Network+for+Entity-Wise+Multivariate+Time-Series+Anomaly+Detection)|1| |[Lightweight Meta-Learning for Low-Resource Abstractive Summarization](https://doi.org/10.1145/3477495.3531908)|Taehun Huh, Youngjoong Ko|Sungkyunkwan University, Suwon-si, Gyeonggi-do, Republic of Korea|Recently, supervised abstractive summarization using high-resource datasets, such as CNN/DailyMail and Xsum, has achieved significant performance improvements. However, most of the existing high-resource dataset is biased towards a specific domain like news, and annotating document-summary pairs for low-resource datasets is too expensive.
Furthermore, the need for low-resource abstractive summarization task is emerging but existing methods for the task such as transfer learning still have domain shifting and overfitting problems. To address these problems, we propose a new framework for low-resource abstractive summarization using a meta-learning algorithm that can quickly adapt to a new domain using small data. For adaptive meta-learning, we introduce a lightweight module inserted into the attention mechanism of a pre-trained language model; the module is first meta-learned with high-resource task-related datasets and then is fine-tuned with the low-resource target dataset. We evaluate our model on 11 different datasets. Experimental results show that the proposed method achieves the state-of-the-art on 9 datasets in low-resource abstractive summarization.|最近,使用高资源数据集(如 CNN/DailyMail 和 Xsum)的监督抽象摘要已经取得了显著的性能改进。然而,大多数现有的高资源数据集偏向于新闻这样的特定领域,并且为低资源数据集注释文档-摘要对过于昂贵。此外,对低资源抽象摘要任务的需求正在出现,但现有的任务转移学习等方法仍然存在领域移位和过拟合问题。为了解决这些问题,我们提出了一个新的框架,低资源抽象摘要使用元学习算法,可以快速适应新的领域使用小数据。对于自适应元学习,我们在预训练语言模型的注意机制中引入了一个轻量级模块,该模块首先对高资源任务相关数据集进行元学习,然后对低资源目标数据集进行微调。我们在11个不同的数据集上评估我们的模型。实验结果表明,该方法在低资源抽象摘要的9个数据集上达到了最高水平。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Lightweight+Meta-Learning+for+Low-Resource+Abstractive+Summarization)|1| |[Task-Oriented Dialogue System as Natural Language Generation](https://doi.org/10.1145/3477495.3531920)|Weizhi Wang, Zhirui Zhang, Junliang Guo, Yinpei Dai, Boxing Chen, Weihua Luo||In this paper, we propose to formulate the task-oriented dialogue system as the purely natural language generation task, so as to fully leverage the large-scale pre-trained models like GPT-2 and simplify complicated delexicalization prepossessing. However, directly applying this method heavily suffers from the dialogue entity inconsistency caused by the removal of delexicalized tokens, as well as the catastrophic forgetting problem of the pre-trained model during fine-tuning, leading to unsatisfactory performance. To alleviate these problems, we design a novel GPT-Adapter-CopyNet network, which incorporates the lightweight adapter and CopyNet modules into GPT-2 to achieve better performance on transfer learning and dialogue entity generation. Experimental results conducted on the DSTC8 Track 1 benchmark and MultiWOZ dataset demonstrate that our proposed approach significantly outperforms baseline models with a remarkable performance on automatic and human evaluations.|本文提出将任务导向的对话系统设计为纯自然语言生成任务,以充分利用 GPT-2等大规模预训练模型,简化复杂的去词化过程。然而,直接应用这种方法,由于去词化标记引起的对话实体不一致,以及预训练模型在微调过程中的灾难性遗忘问题,导致性能不理想。为了解决这些问题,我们设计了一种新型的 GPT-Adapter-CopyNet 网络,它将轻量级适配器和 CopyNet 模块集成到 GPT-2中,以实现更好的传递学习和对话实体生成。在 DSTC8 Track1基准和 MultiWOZ 数据集上进行的实验结果表明,我们提出的方法显著优于基线模型,在自动和人工评估方面具有显著的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Task-Oriented+Dialogue+System+as+Natural+Language+Generation)|1| -|[Where Do Queries Come From?](https://doi.org/10.1145/3477495.3531711)|Marwah Alaofi, Luke Gallagher, Dana McKay, Lauren L. Saling, Mark Sanderson, Falk Scholer, Damiano Spina, Ryen W. White|Microsoft Research, Redmond, WA, USA; RMIT University, Melbourne, VIC, Australia|Where do queries -- the words searchers type into a search box -- come from? The Information Retrieval community understands the performance of queries and search engines extensively, and has recently begun to examine the impact of query variation, showing that different queries for the same information need produce different results. 
In an information environment where bad actors try to nudge searchers toward misinformation, this is worrisome. The source of query variation -- searcher characteristics, contextual or linguistic prompts, cognitive biases, or even the influence of external parties -- while studied in a piecemeal fashion by other research communities has not been studied by ours. In this paper we draw on a variety of literatures (including information seeking, psychology, and misinformation), and report some small experiments to describe what is known about where queries come from, and demonstrate a clear literature gap around the source of query variations in IR. We chart a way forward for IR to research, document and understand this important question, with a view to creating search engines that provide more consistent, accurate and relevant search results regardless of the searcher's framing of the query.|查询——搜索者在搜索框中输入的单词——从何而来?信息检索社区对查询和搜索引擎的性能已有广泛的认识,并且最近开始研究查询变异的影响,表明针对同一信息需求的不同查询会产生不同的结果。在恶意行为者试图将搜索者引向错误信息的信息环境中,这一点令人担忧。查询变异的来源——搜索者特征、上下文或语言提示、认知偏差,甚至外部各方的影响——虽然已被其他研究领域零散地研究过,但我们的领域尚未对其进行研究。本文借鉴了多类文献(包括信息搜索、心理学和错误信息) ,通过一些小型实验来描述关于查询来源的已知情况,并表明在信息检索领域,关于查询变异来源的研究存在明显的文献空白。我们为 IR 研究、记录和理解这个重要问题规划了一条前进的道路,以期创建无论搜索者如何表述查询都能提供更加一致、准确和相关的搜索结果的搜索引擎。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Where+Do+Queries+Come+From?)|1| +|[Where Do Queries Come From?](https://doi.org/10.1145/3477495.3531711)|Marwah Alaofi, Luke Gallagher, Dana McKay, Lauren L. Saling, Mark Sanderson, Falk Scholer, Damiano Spina, Ryen W. White|RMIT University, Melbourne, VIC, Australia; Microsoft Research, Redmond, WA, USA|Where do queries -- the words searchers type into a search box -- come from? The Information Retrieval community understands the performance of queries and search engines extensively, and has recently begun to examine the impact of query variation, showing that different queries for the same information need produce different results. In an information environment where bad actors try to nudge searchers toward misinformation, this is worrisome. The source of query variation -- searcher characteristics, contextual or linguistic prompts, cognitive biases, or even the influence of external parties -- while studied in a piecemeal fashion by other research communities has not been studied by ours. In this paper we draw on a variety of literatures (including information seeking, psychology, and misinformation), and report some small experiments to describe what is known about where queries come from, and demonstrate a clear literature gap around the source of query variations in IR.
We chart a way forward for IR to research, document and understand this important question, with a view to creating search engines that provide more consistent, accurate and relevant search results regardless of the searcher's framing of the query.|查询——搜索者在搜索框中输入的单词——从何而来?信息检索社区对查询和搜索引擎的性能已有广泛的认识,并且最近开始研究查询变异的影响,表明针对同一信息需求的不同查询会产生不同的结果。在恶意行为者试图将搜索者引向错误信息的信息环境中,这一点令人担忧。查询变异的来源——搜索者特征、上下文或语言提示、认知偏差,甚至外部各方的影响——虽然已被其他研究领域零散地研究过,但我们的领域尚未对其进行研究。本文借鉴了多类文献(包括信息搜索、心理学和错误信息) ,通过一些小型实验来描述关于查询来源的已知情况,并表明在信息检索领域,关于查询变异来源的研究存在明显的文献空白。我们为 IR 研究、记录和理解这个重要问题规划了一条前进的道路,以期创建无论搜索者如何表述查询都能提供更加一致、准确和相关的搜索结果的搜索引擎。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Where+Do+Queries+Come+From?)|1| |[OVQA: A Clinically Generated Visual Question Answering Dataset](https://doi.org/10.1145/3477495.3531724)|Yefan Huang, Xiaoli Wang, Feiyan Liu, Guofeng Huang||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=OVQA:+A+Clinically+Generated+Visual+Question+Answering+Dataset)|1| |[Monant Medical Misinformation Dataset: Mapping Articles to Fact-Checked Claims](https://doi.org/10.1145/3477495.3531726)|Ivan Srba, Branislav Pecher, Matús Tomlein, Róbert Móro, Elena Stefancova, Jakub Simko, Mária Bieliková||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Monant+Medical+Misinformation+Dataset:+Mapping+Articles+to+Fact-Checked+Claims)|1| |[ViQuAE, a Dataset for Knowledge-based Visual Question Answering about Named Entities](https://doi.org/10.1145/3477495.3531753)|Paul Lerner, Olivier Ferret, Camille Guinaudeau, Hervé Le Borgne, Romaric Besançon, José G. Moreno, Jesús LovónMelgarejo||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ViQuAE,+a+Dataset+for+Knowledge-based+Visual+Question+Answering+about+Named+Entities)|1| |[DIANES: A DEI Audit Toolkit for News Sources](https://doi.org/10.1145/3477495.3531660)|Xiaoxiao Shang, Zhiyuan Peng, Qiming Yuan, Sabiq Khan, Lauren Xie, Yi Fang, Subramaniam Vincent||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=DIANES:+A+DEI+Audit+Toolkit+for+News+Sources)|1| -|[NAS-CTR: Efficient Neural Architecture Search for Click-Through Rate Prediction](https://doi.org/10.1145/3477495.3532030)|Guanghui Zhu, Feng Cheng, Defu Lian, Chunfeng Yuan, Yihua Huang|Nanjing University, Nanjing, China; University of Science and Technology of China, Hefei, China|Click-Through Rate (CTR) prediction has been widely used in many machine learning tasks such as online advertising and personalization recommendation. Unfortunately, given a domain-specific dataset, searching effective feature interaction operations and combinations from a huge candidate space requires significant expert experience and computational costs. Recently, Neural Architecture Search (NAS) has achieved great success in discovering high-quality network architectures automatically. However, due to the diversity of feature interaction operations and combinations, the existing NAS-based work that treats the architecture search as a black-box optimization problem over a discrete search space suffers from low efficiency. Therefore, it is essential to explore a more efficient architecture search method. To achieve this goal, we propose NAS-CTR, a differentiable neural architecture search approach for CTR prediction. First, we design a novel and expressive architecture search space and a continuous relaxation scheme to make the search space differentiable.
Second, we formulate the architecture search for CTR prediction as a joint optimization problem with discrete constraints on architectures and leverage proximal iteration to solve the constrained optimization problem. Additionally, a straightforward yet effective method is proposed to eliminate the aggregation of skip connections. Extensive experimental results reveal that NAS-CTR can outperform the SOTA human-crafted architectures and other NAS-based methods in both test accuracy and search efficiency.|点进率预测已经广泛应用于许多机器学习任务,例如在线广告和个性化推荐。不幸的是,给定一个特定领域的数据集,从一个巨大的候选空间中搜索有效的特征交互操作和组合需要大量的专家经验和计算成本。近年来,神经网络体系结构搜索(NAS)在自动发现高质量的网络体系结构方面取得了巨大的成功。然而,由于功能交互操作和组合的多样性,现有的基于 NAS 的工作将体系结构搜索视为离散搜索空间上的黑盒子最佳化问题,效率低下。因此,有必要探索一种更有效的体系结构搜索方法。为了实现这一目标,我们提出了 NAS-CTR,一种用于 CTR 预测的可微分神经结构搜索方法。首先,我们设计了一个新颖的、具有表现力的体系结构搜索空间和一个连续松弛方案,使搜索空间具有可微性。其次,我们将 CTR 预测的体系结构搜索描述为一个联合最佳化问题,对体系结构进行离散约束,并利用近端迭代来解决约束最佳化问题。此外,提出了一种简单而有效的方法来消除跳跃连接的聚集。大量的实验结果表明,NAS-CTR 在测试精度和搜索效率方面都优于 SOTA 人工架构和其他基于 NAS 的方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=NAS-CTR:+Efficient+Neural+Architecture+Search+for+Click-Through+Rate+Prediction)|0| -|[Learning to Enrich Query Representation with Pseudo-Relevance Feedback for Cross-lingual Retrieval](https://doi.org/10.1145/3477495.3532013)|Ramraj Chandradevan, Eugene Yang, Mahsa Yarmohammadi, Eugene Agichtein|Johns Hopkins University, Baltimore, MD, USA; Emory University, Atlanta, GA, USA|Cross-lingual information retrieval (CLIR) aims to provide access to information across languages. Recent pre-trained multilingual language models brought large improvements to the natural language tasks, including cross-lingual adhoc retrieval. However, pseudo-relevance feedback (PRF), a family of techniques for improving ranking using the contents of top initially retrieved items, has not been explored with neural CLIR retrieval models. Two of the challenges are incorporating feedback from long documents, and cross-language knowledge transfer. To address these challenges, we propose a novel neural CLIR architecture, NCLPRF, capable of incorporating PRF feedback from multiple potentially long documents, which enables improvements to query representation in the shared semantic space between query and document languages. The additional information that the feedback documents provide in a target language, can enrich the query representation, bringing it closer to relevant documents in the embedding space. 
The proposed model performance across three CLIR test collections in Chinese, Russian, and Persian languages, exhibits significant improvements over traditional and SOTA neural CLIR baselines across all three collections.|跨语言信息检索(CLIR)旨在提供跨语言的信息获取途径。最近的预训练多语言语言模型给自然语言任务带来了很大的改进,包括跨语言的即席检索。然而,伪相关反馈(PRF)作为一类利用初始检索结果中排名靠前条目的内容来改进排序的技术,尚未与神经 CLIR 检索模型结合进行探索。其中两个挑战是整合来自长文档的反馈,以及跨语言的知识转移。为了应对这些挑战,我们提出了一个新的神经 CLIR 架构 NCLPRF,能够融合来自多个可能很长的文档的 PRF 反馈,从而改进查询在查询语言和文档语言共享语义空间中的表示。反馈文档以目标语言提供的附加信息可以丰富查询表示,使其更接近嵌入空间中的相关文档。所提出的模型在中文、俄文和波斯文三个 CLIR 测试集合上的性能,相比传统和 SOTA 神经 CLIR 基线在所有三个集合上都有显著的改进。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+to+Enrich+Query+Representation+with+Pseudo-Relevance+Feedback+for+Cross-lingual+Retrieval)|0| -|[Incorporating Retrieval Information into the Truncation of Ranking Lists for Better Legal Search](https://doi.org/10.1145/3477495.3531998)|Yixiao Ma, Qingyao Ai, Yueyue Wu, Yunqiu Shao, Yiqun Liu, Min Zhang, Shaoping Ma|University of Utah, Salt Lake City, UT, USA; Tsinghua University, Beijing, China|The truncation of ranking lists predicted by retrieval models is vital to ensure users' search experience. Particularly, in specific vertical domains where documents are usually complicated and extensive (e.g., legal cases), the cost of browsing results is much higher than traditional IR tasks (e.g., Web search) and setting a reasonable cut-off position is quite necessary. While it is straightforward to apply existing result list truncation approaches to legal case retrieval, the effectiveness of these methods is limited because they only focus on simple document statistics and usually fail to capture the context information of documents in the ranking list. These existing efforts also treat result list truncation as an isolated task instead of a component in the entire ranking process, limiting the usage of truncation in practical systems. To tackle these limitations, we propose LeCut, a ranking list truncation model for legal case retrieval. LeCut utilizes contextual features of the retrieval task to capture the semantic-level similarity between documents and decides the best cut-off position with attention mechanisms. We further propose a Joint Optimization of Truncation and Reranking (JOTR) framework based on LeCut to improve the performance of truncation and retrieval tasks simultaneously. Comparison against competitive baselines on public benchmark datasets demonstrates the effectiveness of LeCut and JOTR.
A case study is conducted to visualize the cut-off positions of LeCut and the process of how JOTR improves both retrieval and truncation tasks.|检索模型预测的排名列表的截断对于保证用户的搜索体验至关重要。特别是,在特定的垂直领域,文档通常是复杂和广泛的(例如,法律案件) ,浏览结果的成本远远高于传统的 IR 任务(例如,网络搜索) ,设置一个合理的截止位置是非常必要的。虽然将现有的结果清单截断方法应用于法律案件检索很简单,但这些方法的有效性有限,因为它们只侧重于简单的文件统计,通常无法捕捉排名清单中文件的上下文信息。这些现有的工作还将结果列表截断视为一个孤立的任务,而不是整个排序过程中的一个组件,从而限制了截断在实际系统中的使用。为了解决这些局限性,我们提出了 LeCut,一种用于法律案例检索的排序列表截断模型。LeCut 利用检索任务的上下文特征来捕获文档之间的语义级相似性,并通过注意机制确定最佳截止位置。进一步提出了一种基于 LeCut 的联合优化截断与重排(JOTR)框架,以同时提高截断与检索任务的性能。与公共基准数据集的竞争基线进行比较,可以证明 LeCut 和 JOTR 的有效性。通过案例研究,可视化 LeCut 的截止位置以及 JOTR 如何改进检索和截断任务的过程。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Incorporating+Retrieval+Information+into+the+Truncation+of+Ranking+Lists+for+Better+Legal+Search)|0| -|[Ada-Ranker: A Data Distribution Adaptive Ranking Paradigm for Sequential Recommendation](https://doi.org/10.1145/3477495.3531931)|Xinyan Fan, Jianxun Lian, Wayne Xin Zhao, Zheng Liu, Chaozhuo Li, Xing Xie|Microsoft Research Asia, Beijing, China; Renmin University of China, Beijing, China|A large-scale recommender system usually consists of recall and ranking modules. The goal of ranking modules (aka rankers) is to elaborately discriminate users' preference on item candidates proposed by recall modules. With the success of deep learning techniques in various domains, we have witnessed the mainstream rankers evolve from traditional models to deep neural models. However, the way that we design and use rankers remains unchanged: offline training the model, freezing the parameters, and deploying it for online serving. Actually, the candidate items are determined by specific user requests, in which underlying distributions (e.g., the proportion of items for different categories, the proportion of popular or new items) are highly different from one another in a production environment. The classical parameter-frozen inference manner cannot adapt to dynamic serving circumstances, making rankers' performance compromised. In this paper, we propose a new training and inference paradigm, termed as Ada-Ranker, to address the challenges of dynamic online serving. Instead of using parameter-frozen models for universal serving, Ada-Ranker can adaptively modulate parameters of a ranker according to the data distribution of the current group of item candidates. We first extract distribution patterns from the item candidates. Then, we modulate the ranker by the patterns to make the ranker adapt to the current data distribution. Finally, we use the revised ranker to score the candidate list. In this way, we empower the ranker with the capacity of adapting from a global model to a local model which better handles the current task. As a first study, we examine our Ada-Ranker paradigm in the sequential recommendation scenario. 
Experiments on three datasets demonstrate that Ada-Ranker can effectively enhance various base sequential models and also outperform a comprehensive set of competitive baselines.|一个大规模的推荐系统通常包括召回和排序模块。排序模块(又称排序器)的目标是精细区分用户对召回模块提出的候选项的偏好。随着深度学习技术在各个领域的成功,我们目睹了主流排序器从传统模型演变为深度神经模型。然而,我们设计和使用排序器的方式保持不变: 离线训练模型,冻结参数,并部署它在线服务。实际上,候选项是由特定的用户请求决定的,在这种情况下,底层分布(例如,不同类别的项目比例,流行项目或新项目的比例)在生产环境中彼此之间差异很大。传统的参数冻结推理方式不能适应动态服务环境,使得排序器的性能受到影响。在本文中,我们提出了一个新的训练和推理范式,称为 Ada-Ranker,以解决动态在线服务的挑战。Ada-Ranker 可以根据当前候选项组的数据分布自适应地调整排序器的参数,而不必使用通用服务的参数冻结模型。我们首先从候选项中提取分布模式。然后,根据模式对排序器进行调整,使排序器适应当前的数据分布。最后,我们使用调整后的排序器对候选列表进行打分。通过这种方式,我们赋予排序器从全局模型自适应到能更好处理当前任务的局部模型的能力。作为初步研究,我们在序列推荐场景中检验了我们的 Ada-Ranker 范式。在三个数据集上的实验表明,Ada-Ranker 能够有效地增强各种基础序列模型,并且表现优于一组全面的有竞争力的基线。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Ada-Ranker:+A+Data+Distribution+Adaptive+Ranking+Paradigm+for+Sequential+Recommendation)|0| -|[Retrieval and Recommendation Systems at the Crossroads of Artificial Intelligence, Ethics, and Regulation](https://doi.org/10.1145/3477495.3532683)|Markus Schedl, Emilia Gómez, Elisabeth Lex|European Commission, Joint Research Centre and Universitat Pompeu Fabra, Seville/Barcelona, Spain; Johannes Kepler University Linz & Linz Institue of Technology, Linz, Austria; Graz University of Technology, Graz, Austria|This tutorial aims at providing its audience an interdisciplinary overview about the topics of fairness and non-discrimination, diversity, and transparency of AI systems, tailored to the research fields of information retrieval and recommender systems. By means of this tutorial, we would like to equip the mostly technical audience of SIGIR with the necessary understanding of the ethical implications of their research and development on the one hand, and of recent political and legal regulations that address the aforementioned challenges on the other hand.|本教程旨在为读者提供一个关于人工智能系统的公平性和非歧视性、多样性和透明度等主题的跨学科概述,适用于信息检索和推荐系统的研究领域。通过本教程,我们希望让 SIGIR 的大多数技术读者一方面了解他们的研究和发展的道德影响,另一方面了解解决上述挑战的最新政治和法律法规。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Retrieval+and+Recommendation+Systems+at+the+Crossroads+of+Artificial+Intelligence,+Ethics,+and+Regulation)|0| -|[Adversarial Filtering Modeling on Long-term User Behavior Sequences for Click-Through Rate Prediction](https://doi.org/10.1145/3477495.3531788)|Xiaochen Li, Jian Liang, Xialong Liu, Yu Zhang|Lazada Group, Beijing, China; Alibaba Group, Beijing, China|Rich user behavior information is of great importance for capturing and understanding user interest in click-through rate (CTR) prediction. To improve the richness, collecting long-term behaviors becomes a typical approach in academy and industry but at the cost of increasing online storage and latency. Recently, researchers have proposed several approaches to shorten long-term behavior sequence and then model user interests. These approaches reduce online cost efficiently but do not well handle the noisy information in long-term user behavior, which may deteriorate the performance of CTR prediction significantly. To obtain better cost/performance trade-off, we propose a novel Adversarial Filtering Model (ADFM) to model long-term user behavior. ADFM uses a hierarchical aggregation representation to compress raw behavior sequence and then learns to remove useless behavior information with an adversarial filtering mechanism. The selected user behaviors are fed into interest extraction module for CTR prediction.
Experimental results on public datasets and industrial dataset demonstrate that our method achieves significant improvements over state-of-the-art models.|丰富的用户行为信息对于在点击率(CTR)预测中捕捉和理解用户兴趣非常重要。为了提高丰富性,收集长期行为成为学术界和工业界的一种典型方法,但代价是增加在线存储和延迟。最近,研究人员提出了几种方法来缩短长期行为序列,然后对用户兴趣进行建模。这些方法有效地降低了在线成本,但不能很好地处理长期用户行为中的噪声信息,这可能会严重影响 CTR 预测的性能。为了获得更好的成本/性能权衡,我们提出了一种新的对抗过滤模型(ADFM)来建模长期用户行为。ADFM 使用分层聚合表示来压缩原始行为序列,然后学习使用对抗性过滤机制去除无用的行为信息。选定的用户行为随后被输入兴趣提取模块以进行点击率预测。在公共数据集和工业数据集上的实验结果表明,该方法比现有的模型有明显的改进。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Adversarial+Filtering+Modeling+on+Long-term+User+Behavior+Sequences+for+Click-Through+Rate+Prediction)|0| +|[NAS-CTR: Efficient Neural Architecture Search for Click-Through Rate Prediction](https://doi.org/10.1145/3477495.3532030)|Guanghui Zhu, Feng Cheng, Defu Lian, Chunfeng Yuan, Yihua Huang|University of Science and Technology of China, Hefei, China; Nanjing University, Nanjing, China|Click-Through Rate (CTR) prediction has been widely used in many machine learning tasks such as online advertising and personalization recommendation. Unfortunately, given a domain-specific dataset, searching effective feature interaction operations and combinations from a huge candidate space requires significant expert experience and computational costs. Recently, Neural Architecture Search (NAS) has achieved great success in discovering high-quality network architectures automatically. However, due to the diversity of feature interaction operations and combinations, the existing NAS-based work that treats the architecture search as a black-box optimization problem over a discrete search space suffers from low efficiency. Therefore, it is essential to explore a more efficient architecture search method. To achieve this goal, we propose NAS-CTR, a differentiable neural architecture search approach for CTR prediction. First, we design a novel and expressive architecture search space and a continuous relaxation scheme to make the search space differentiable. Second, we formulate the architecture search for CTR prediction as a joint optimization problem with discrete constraints on architectures and leverage proximal iteration to solve the constrained optimization problem. Additionally, a straightforward yet effective method is proposed to eliminate the aggregation of skip connections. Extensive experimental results reveal that NAS-CTR can outperform the SOTA human-crafted architectures and other NAS-based methods in both test accuracy and search efficiency.|点进率预测已经广泛应用于许多机器学习任务,例如在线广告和个性化推荐。不幸的是,给定一个特定领域的数据集,从一个巨大的候选空间中搜索有效的特征交互操作和组合需要大量的专家经验和计算成本。近年来,神经网络体系结构搜索(NAS)在自动发现高质量的网络体系结构方面取得了巨大的成功。然而,由于功能交互操作和组合的多样性,现有的基于 NAS 的工作将体系结构搜索视为离散搜索空间上的黑盒子最佳化问题,效率低下。因此,有必要探索一种更有效的体系结构搜索方法。为了实现这一目标,我们提出了 NAS-CTR,一种用于 CTR 预测的可微分神经结构搜索方法。首先,我们设计了一个新颖的、具有表现力的体系结构搜索空间和一个连续松弛方案,使搜索空间具有可微性。其次,我们将 CTR 预测的体系结构搜索描述为一个联合最佳化问题,对体系结构进行离散约束,并利用近端迭代来解决约束最佳化问题。此外,提出了一种简单而有效的方法来消除跳跃连接的聚集。大量的实验结果表明,NAS-CTR 在测试精度和搜索效率方面都优于 SOTA 人工架构和其他基于 NAS 的方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=NAS-CTR:+Efficient+Neural+Architecture+Search+for+Click-Through+Rate+Prediction)|0| +|[Learning to Enrich Query Representation with Pseudo-Relevance Feedback for Cross-lingual Retrieval](https://doi.org/10.1145/3477495.3532013)|Ramraj Chandradevan, Eugene Yang, Mahsa Yarmohammadi, Eugene Agichtein|Emory University, Atlanta, GA, USA; Johns Hopkins University, Baltimore, MD, USA|Cross-lingual information retrieval (CLIR) aims to provide access to information across languages.
Recent pre-trained multilingual language models brought large improvements to natural language tasks, including cross-lingual ad hoc retrieval. However, pseudo-relevance feedback (PRF), a family of techniques for improving ranking using the contents of top initially retrieved items, has not been explored with neural CLIR retrieval models. Two of the challenges are incorporating feedback from long documents and cross-language knowledge transfer. To address these challenges, we propose a novel neural CLIR architecture, NCLPRF, capable of incorporating PRF feedback from multiple potentially long documents, which enables improvements to query representation in the shared semantic space between query and document languages. The additional information that the feedback documents provide in a target language can enrich the query representation, bringing it closer to relevant documents in the embedding space. The proposed model's performance across three CLIR test collections in Chinese, Russian, and Persian exhibits significant improvements over traditional and SOTA neural CLIR baselines across all three collections.|跨语言信息检索(CLIR)旨在提供跨语言的信息获取途径。最近预先训练的多语言模型给自然语言任务带来了很大的改进,包括跨语言的即席检索。然而,伪相关反馈(PRF)作为一种利用最初检索项目的内容来提高排名的技术,尚未在神经元 CLIR 检索模型中得到应用。其中两个挑战是整合来自长文档的反馈,以及跨语言的知识转移。为了应对这些挑战,我们提出了一个新的神经 CLIR 架构,NCLPRF,能够合并来自多个潜在的长文档的 PRF 反馈,这使得查询语言和文档语言之间的共享语义空间中的查询表示得到改进。反馈文档以目标语言提供的附加信息可以丰富查询表示,使其更接近嵌入空间中的相关文档。在中文,俄文和波斯语的三个 CLIR 测试集合中,所提出的模型性能比所有三个集合中的传统和 SOTA 神经 CLIR 基线都有显著的改进。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+to+Enrich+Query+Representation+with+Pseudo-Relevance+Feedback+for+Cross-lingual+Retrieval)|0| +|[Incorporating Retrieval Information into the Truncation of Ranking Lists for Better Legal Search](https://doi.org/10.1145/3477495.3531998)|Yixiao Ma, Qingyao Ai, Yueyue Wu, Yunqiu Shao, Yiqun Liu, Min Zhang, Shaoping Ma|Tsinghua University, Beijing, China; University of Utah, Salt Lake City, UT, USA|The truncation of ranking lists predicted by retrieval models is vital to ensure users' search experience. Particularly, in specific vertical domains where documents are usually complicated and extensive (e.g., legal cases), the cost of browsing results is much higher than in traditional IR tasks (e.g., Web search) and setting a reasonable cut-off position is quite necessary. While it is straightforward to apply existing result list truncation approaches to legal case retrieval, the effectiveness of these methods is limited because they only focus on simple document statistics and usually fail to capture the context information of documents in the ranking list. These existing efforts also treat result list truncation as an isolated task instead of a component in the entire ranking process, limiting the usage of truncation in practical systems. To tackle these limitations, we propose LeCut, a ranking list truncation model for legal case retrieval. LeCut utilizes contextual features of the retrieval task to capture the semantic-level similarity between documents and decides the best cut-off position with attention mechanisms. We further propose a Joint Optimization of Truncation and Reranking (JOTR) framework based on LeCut to improve the performance of truncation and retrieval tasks simultaneously. Comparison against competitive baselines on public benchmark datasets demonstrates the effectiveness of LeCut and JOTR.
A case study is conducted to visualize the cut-off positions of LeCut and the process of how JOTR improves both retrieval and truncation tasks.|检索模型预测的排名列表的截断对于保证用户的搜索体验至关重要。特别是,在特定的垂直领域,文档通常是复杂和广泛的(例如,法律案件) ,浏览结果的成本远远高于传统的 IR 任务(例如,网络搜索) ,设置一个合理的截止位置是非常必要的。虽然将现有的结果清单截断方法应用于法律案件检索很简单,但这些方法的有效性有限,因为它们只侧重于简单的文件统计,通常无法捕捉排名清单中文件的上下文信息。这些现有的工作还将结果列表截断视为一个孤立的任务,而不是整个排序过程中的一个组件,从而限制了截断在实际系统中的使用。为了解决这些局限性,我们提出了 LeCut,一种用于法律案例检索的排序列表截断模型。LeCut 利用检索任务的上下文特征来捕获文档之间的语义级相似性,并通过注意机制确定最佳截止位置。进一步提出了一种基于 LeCut 的联合优化截断与重排(JOTR)框架,以同时提高截断与检索任务的性能。与公共基准数据集的竞争基线进行比较,可以证明 LeCut 和 JOTR 的有效性。通过案例研究,可视化 LeCut 的截止位置以及 JOTR 如何改进检索和截断任务的过程。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Incorporating+Retrieval+Information+into+the+Truncation+of+Ranking+Lists+for+Better+Legal+Search)|0| +|[Ada-Ranker: A Data Distribution Adaptive Ranking Paradigm for Sequential Recommendation](https://doi.org/10.1145/3477495.3531931)|Xinyan Fan, Jianxun Lian, Wayne Xin Zhao, Zheng Liu, Chaozhuo Li, Xing Xie|Renmin University of China, Beijing, China; Microsoft Research Asia, Beijing, China|A large-scale recommender system usually consists of recall and ranking modules. The goal of ranking modules (aka rankers) is to elaborately discriminate users' preference on item candidates proposed by recall modules. With the success of deep learning techniques in various domains, we have witnessed the mainstream rankers evolve from traditional models to deep neural models. However, the way that we design and use rankers remains unchanged: offline training the model, freezing the parameters, and deploying it for online serving. Actually, the candidate items are determined by specific user requests, in which underlying distributions (e.g., the proportion of items for different categories, the proportion of popular or new items) are highly different from one another in a production environment. The classical parameter-frozen inference manner cannot adapt to dynamic serving circumstances, making rankers' performance compromised. In this paper, we propose a new training and inference paradigm, termed as Ada-Ranker, to address the challenges of dynamic online serving. Instead of using parameter-frozen models for universal serving, Ada-Ranker can adaptively modulate parameters of a ranker according to the data distribution of the current group of item candidates. We first extract distribution patterns from the item candidates. Then, we modulate the ranker by the patterns to make the ranker adapt to the current data distribution. Finally, we use the revised ranker to score the candidate list. In this way, we empower the ranker with the capacity of adapting from a global model to a local model which better handles the current task. As a first study, we examine our Ada-Ranker paradigm in the sequential recommendation scenario. 
Experiments on three datasets demonstrate that Ada-Ranker can effectively enhance various base sequential models and also outperform a comprehensive set of competitive baselines.|一个大规模的推荐系统通常包括召回和排名模块。排序模块(又称排序器)的目标是精心区分用户对召回模块提出的候选项的偏好。随着深度学习技术在各个领域的成功,我们目睹了主流的排名从传统模型演变为深度神经模型。然而,我们设计和使用排名的方式保持不变: 离线训练模型,冻结参数,并部署它在线服务。实际上,候选项是由特定的用户请求决定的,在这种情况下,底层分布(例如,不同类别的项目比例,流行项目或新项目的比例)在生产环境中彼此之间差异很大。传统的参数冻结推理方式不能适应动态服务环境,使得排序器的性能受到影响。在本文中,我们提出了一个新的训练和推理范式,称为 Ada-Ranker,以解决动态在线服务的挑战。Ada-Ranker 可以根据当前项目候选者组的数据分布自适应地调整排序器的参数,而不必使用通用服务的参数冻结模型。我们首先从候选项中提取分布模式。然后,根据模式对排序器进行调整,使排序器适应当前的数据分布。最后,我们使用修改后的排名对候选人列表进行评分。通过这种方式,我们赋予排名者从全球模型到更好地处理当前任务的局部模型的适应能力。作为第一个研究,我们在顺序推荐场景中检查我们的 Ada-Ranker 范式。在三个数据集上的实验表明,Ada-Ranker 能够有效地增强各种基本序列模型,并且表现优于一组综合的竞争基线。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Ada-Ranker:+A+Data+Distribution+Adaptive+Ranking+Paradigm+for+Sequential+Recommendation)|0| +|[Retrieval and Recommendation Systems at the Crossroads of Artificial Intelligence, Ethics, and Regulation](https://doi.org/10.1145/3477495.3532683)|Markus Schedl, Emilia Gómez, Elisabeth Lex|Graz University of Technology, Graz, Austria; European Commission, Joint Research Centre and Universitat Pompeu Fabra, Seville/Barcelona, Spain; Johannes Kepler University Linz & Linz Institute of Technology, Linz, Austria|This tutorial aims at providing its audience an interdisciplinary overview about the topics of fairness and non-discrimination, diversity, and transparency of AI systems, tailored to the research fields of information retrieval and recommender systems. By means of this tutorial, we would like to equip the mostly technical audience of SIGIR with the necessary understanding of the ethical implications of their research and development on the one hand, and of recent political and legal regulations that address the aforementioned challenges on the other hand.|本教程旨在为读者提供一个关于人工智能系统的公平性和非歧视性、多样性和透明度等主题的跨学科概述,适用于信息检索和推荐系统的研究领域。通过本教程,我们希望让 SIGIR 的大多数技术读者一方面了解他们的研究和发展的道德影响,另一方面了解解决上述挑战的最新政治和法律法规。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Retrieval+and+Recommendation+Systems+at+the+Crossroads+of+Artificial+Intelligence,+Ethics,+and+Regulation)|0| +|[Adversarial Filtering Modeling on Long-term User Behavior Sequences for Click-Through Rate Prediction](https://doi.org/10.1145/3477495.3531788)|Xiaochen Li, Jian Liang, Xialong Liu, Yu Zhang|Alibaba Group, Beijing, China; Lazada Group, Beijing, China|Rich user behavior information is of great importance for capturing and understanding user interest in click-through rate (CTR) prediction. To improve the richness, collecting long-term behaviors becomes a typical approach in academia and industry but at the cost of increasing online storage and latency. Recently, researchers have proposed several approaches to shorten long-term behavior sequences and then model user interests. These approaches reduce online cost efficiently but do not handle the noisy information in long-term user behavior well, which may deteriorate the performance of CTR prediction significantly. To obtain a better cost/performance trade-off, we propose a novel Adversarial Filtering Model (ADFM) to model long-term user behavior. ADFM uses a hierarchical aggregation representation to compress the raw behavior sequence and then learns to remove useless behavior information with an adversarial filtering mechanism. The selected user behaviors are fed into an interest extraction module for CTR prediction.
Experimental results on public datasets and an industrial dataset demonstrate that our method achieves significant improvements over state-of-the-art models.|丰富的用户行为信息对于捕捉和理解用户对点进率预测的兴趣非常重要。为了提高丰富性,收集长期行为成为学术界和工业界的一种典型方法,但代价是增加在线存储和延迟。最近,研究人员提出了几种方法来缩短长期行为序列,然后模型用户的兴趣。这些方法有效地降低了在线成本,但不能很好地处理长期用户行为中的噪声信息,这可能会严重影响 CTR 预测的性能。为了获得更好的性价比,我们提出了一种新的对抗过滤模型(ADFM)来模拟长期用户行为。ADFM 使用分层聚合表示来压缩原始行为序列,然后学习使用对抗性过滤机制去除无用的行为信息。将选定的用户行为反馈到兴趣提取模块中进行点击率预测。在公共数据集和工业数据集上的实验结果表明,该方法比现有的模型有明显的改进。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Adversarial+Filtering+Modeling+on+Long-term+User+Behavior+Sequences+for+Click-Through+Rate+Prediction)|0| |[LoL: A Comparative Regularization Loss over Query Reformulation Losses for Pseudo-Relevance Feedback](https://doi.org/10.1145/3477495.3532017)|Yunchang Zhu, Liang Pang, Yanyan Lan, Huawei Shen, Xueqi Cheng|Institute of Computing Technology, CAS & University of Chinese Academy of Sciences, Beijing, China; Data Intelligence System Research Center, Institute of Computing Technology, CAS, Beijing, China; Tsinghua University, Beijing, China; Data Intelligence System Research Center, Institute of Computing Technology, CAS & University of Chinese Academy of Sciences, Beijing, China|Pseudo-relevance feedback (PRF) has proven to be an effective query reformulation technique to improve retrieval accuracy. It aims to alleviate the mismatch of linguistic expressions between a query and its potentially relevant documents. Existing PRF methods independently treat revised queries originating from the same query but using different numbers of feedback documents, resulting in severe query drift. Without comparing the effects of two different revisions from the same query, a PRF model may incorrectly focus on the additional irrelevant information introduced by the extra feedback, and thus reformulate a query that is less effective than the revision using less feedback. Ideally, if a PRF model can distinguish between irrelevant and relevant information in the feedback, the more feedback documents there are, the better the revised query will be. To bridge this gap, we propose the Loss-over-Loss (LoL) framework to compare the reformulation losses between different revisions of the same query during training. Concretely, we revise an original query multiple times in parallel using different amounts of feedback and compute their reformulation losses. Then, we introduce an additional regularization loss on these reformulation losses to penalize revisions that use more feedback but incur larger losses. With such comparative regularization, the PRF model is expected to learn to suppress the extra irrelevant information by comparing the effects of different revised queries. Further, we present a differentiable query reformulation method to implement this framework. This method revises queries in the vector space and directly optimizes the retrieval performance of query vectors, applicable to both sparse and dense retrieval models.
Empirical evaluation demonstrates the effectiveness and robustness of our method for two typical sparse and dense retrieval models.|伪相关反馈(PRF)已被证明是一种有效的查询重构技术,以提高检索的准确性。它旨在缓解查询与潜在相关文档之间的语言表达不匹配问题。现有的 PRF 方法独立处理来自同一查询但使用不同数量的反馈文档的修改查询,导致严重的查询漂移。如果不比较来自同一查询的两个不同修订的效果,PRF 模型可能会错误地关注更多反馈中增加的附加不相关信息,从而使用更少的反馈重新表述比修订更低效的查询。理想情况下,如果 PRF 模型能够区分反馈中的不相关信息和相关信息,那么反馈文档越多,修改后的查询就越好。为了弥补这一差距,我们提出了损失超过损失(LoL)框架来比较同一查询在培训期间不同修订版本之间的重构损失。具体来说,我们使用不同数量的反馈并行多次修改原始查询,并计算它们的重新表述损失。然后,我们引入一个额外的正则化损失对这些重制损失,以惩罚修订使用更多的反馈,但获得更大的损失。通过这种比较正则化,PRF 模型可以通过比较不同修订查询的效果来抑制额外增加的不相关信息。进一步,我们提出了一个可微查询重构方法来实现这个框架。该方法对向量空间中的查询进行修正,直接优化查询向量的检索性能,适用于稀疏和密集检索模型。实验结果表明,该方法对两种典型的稀疏和密集检索模型具有较好的鲁棒性和有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=LoL:+A+Comparative+Regularization+Loss+over+Query+Reformulation+Losses+for+Pseudo-Relevance+Feedback)|0| -|[Determinantal Point Process Likelihoods for Sequential Recommendation](https://doi.org/10.1145/3477495.3531965)|Yuli Liu, Christian J. Walder, Lexing Xie|Australian National University & Data61, CSIRO, Canberra, Australia; Data61, CSIRO & Australian National University, Canberra, Australia|Sequential recommendation is a popular task in academic research and close to real-world application scenarios, where the goal is to predict the next action(s) of the user based on his/her previous sequence of actions. In the training process of recommender systems, the loss function plays an essential role in guiding the optimization of recommendation models to generate accurate suggestions for users. However, most existing sequential recommendation techniques focus on designing algorithms or neural network architectures, and few efforts have been made to tailor loss functions that fit naturally into the practical application scenario of sequential recommender systems. Ranking-based losses, such as cross-entropy and Bayesian Personalized Ranking (BPR), are widely used in the sequential recommendation area. We argue that such objective functions suffer from two inherent drawbacks: i) the dependencies among elements of a sequence are overlooked in these loss formulations; ii) instead of balancing accuracy (quality) and diversity, only generating accurate results has been over-emphasized. We therefore propose two new loss functions based on the Determinantal Point Process (DPP) likelihood that can be adaptively applied to estimate the subsequent item or items. The DPP-distributed item set captures natural dependencies among temporal actions, and a quality vs. diversity decomposition of the DPP kernel pushes us to go beyond accuracy-oriented loss functions. Experimental results using the proposed loss functions on three real-world datasets show marked improvements over state-of-the-art sequential recommendation methods in both quality and diversity metrics.|顺序推荐是学术研究中的一个热门任务,它接近于真实的应用场景,其目标是根据用户以前的操作顺序预测他/她的下一个操作。在推荐系统的培训过程中,损失函数对于指导推荐模型的优化,为用户提供准确的建议起着至关重要的作用。然而,现有的顺序推荐技术大多侧重于设计算法或神经网络体系结构,很少有人努力去调整自然适合顺序推荐系统实际应用场景的损失函数。基于排序的损失,如交叉熵和贝叶斯个性化排序(BPR)被广泛应用于序列推荐领域。我们认为这样的目标函数有两个固有的缺点: i)在这些损失公式中忽略了序列元素之间的依赖关系; ii)没有平衡准确性(质量)和多样性,只有产生准确的结果被过分强调。因此,我们提出两个新的基于行列式点过程(DPP)可能性的损失函数,可以自适应地应用于估计随后的项目。DPP 分布式项目集捕获时间操作之间的自然依赖关系,DPP 内核的质量与多样性分解促使我们超越面向准确性的损失函数。在三个实际数据集上使用提出的损失函数的实验结果显示,在质量和多样性度量方面,该方法比最先进的顺序推荐方法有明显的改进。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Determinantal+Point+Process+Likelihoods+for+Sequential+Recommendation)|0|
+|[Determinantal Point Process Likelihoods for Sequential Recommendation](https://doi.org/10.1145/3477495.3531965)|Yuli Liu, Christian J. Walder, Lexing Xie|Data61, CSIRO & Australian National University, Canberra, Australia; Australian National University & Data61, CSIRO, Canberra, Australia|Sequential recommendation is a popular task in academic research and close to real-world application scenarios, where the goal is to predict the next action(s) of the user based on his/her previous sequence of actions. In the training process of recommender systems, the loss function plays an essential role in guiding the optimization of recommendation models to generate accurate suggestions for users. However, most existing sequential recommendation techniques focus on designing algorithms or neural network architectures, and few efforts have been made to tailor loss functions that fit naturally into the practical application scenario of sequential recommender systems. Ranking-based losses, such as cross-entropy and Bayesian Personalized Ranking (BPR), are widely used in the sequential recommendation area. We argue that such objective functions suffer from two inherent drawbacks: i) the dependencies among elements of a sequence are overlooked in these loss formulations; ii) instead of balancing accuracy (quality) and diversity, only generating accurate results has been over-emphasized. We therefore propose two new loss functions based on the Determinantal Point Process (DPP) likelihood that can be adaptively applied to estimate the subsequent item or items. The DPP-distributed item set captures natural dependencies among temporal actions, and a quality vs. diversity decomposition of the DPP kernel pushes us to go beyond accuracy-oriented loss functions. Experimental results using the proposed loss functions on three real-world datasets show marked improvements over state-of-the-art sequential recommendation methods in both quality and diversity metrics.|顺序推荐是学术研究中的一个热门任务,它接近于真实的应用场景,其目标是根据用户以前的操作顺序预测他/她的下一个操作。在推荐系统的培训过程中,损失函数对于指导推荐模型的优化,为用户提供准确的建议起着至关重要的作用。然而,现有的顺序推荐技术大多侧重于设计算法或神经网络体系结构,很少有人努力去调整自然适合顺序推荐系统实际应用场景的损失函数。基于排序的损失,如交叉熵和贝叶斯个性化排序(BPR)被广泛应用于序列推荐领域。我们认为这样的目标函数有两个固有的缺点: i)在这些损失公式中忽略了序列元素之间的依赖关系; ii)没有平衡准确性(质量)和多样性,只有产生准确的结果被过分强调。因此,我们提出两个新的基于行列式点过程(DPP)可能性的损失函数,可以自适应地应用于估计随后的项目。DPP 分布式项目集捕获时间操作之间的自然依赖关系,DPP 内核的质量与多样性分解促使我们超越面向准确性的损失函数。在三个实际数据集上使用提出的损失函数的实验结果显示,在质量和多样性度量方面,该方法比最先进的顺序推荐方法有明显的改进。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Determinantal+Point+Process+Likelihoods+for+Sequential+Recommendation)|0| |[Re-weighting Negative Samples for Model-Agnostic Matching](https://doi.org/10.1145/3477495.3532053)|Jiazhen Lou, Hong Wen, Fuyu Lv, Jing Zhang, Tengfei Yuan, Zhao Li|The University of Sydney, Darlington, NSW, Australia; Zhejiang University, Hangzhou, China; Alibaba Group, Hangzhou, China|Recommender Systems (RS), as an efficient tool to discover items of interest to users from a very large corpus, have attracted more and more attention from academia and industry. As the initial stage of RS, large-scale matching is fundamental yet challenging. A typical recipe is to learn user and item representations with a two-tower architecture and then calculate the similarity score between both representation vectors, which, however, still struggles with how to properly handle negative samples. In this paper, we find that the common practice of randomly sampling negative samples from the entire space and treating them equally is not an optimal choice, since the negative samples from different sub-spaces at different stages have different importance to a matching model.
To address this issue, we propose a novel method named Unbiased Model-Agnostic Matching Approach (UMA2). It consists of two basic modules: 1) a General Matching Model (GMM), which is model-agnostic and can be implemented as any embedding-based two-tower model; and 2) a Negative Samples Debias Network (NSDN), which discriminates negative samples by borrowing the idea of Inverse Propensity Weighting (IPW) and re-weighs the loss in GMM. UMA2 seamlessly integrates these two modules in an end-to-end multi-task learning framework. Extensive experiments on both a real-world offline dataset and an online A/B test demonstrate its superiority over state-of-the-art methods.|推荐系统(RS)作为一种从庞大的语料库中发现用户感兴趣的项目的有效工具,越来越受到学术界和业界的关注。作为遥感的初始阶段,大规模匹配是一个基础性的挑战。一个典型的方法是使用双塔体系结构学习用户和项目表示,然后计算两个表示向量之间的相似度得分,但是如何正确处理负样本仍然是一个难题。在本文中,我们发现从整个空间中随机抽取负样本并平等对待它们的常见做法并不是最优选择,因为不同阶段不同子空间中的负样本对匹配模型的重要性不同。为了解决这一问题,我们提出了一种新的方法——无偏模型-不可知匹配方法(UMA2)。它包括两个基本模块: 1)通用匹配模型(GMM) ,该模型与模型无关,可以作为任何嵌入式双塔模型实现; 2)负样本偏差网络(NSDN) ,该网络借助逆倾向加权(IPW)的思想对负样本进行判别,并在 GMM 中重新权衡损失。UMA2在端到端多任务学习框架中无缝地集成了这两个模块。通过对现实世界离线数据集和在线 A/B 测试的大量实验,证明了该方法优于最先进的方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Re-weighting+Negative+Samples+for+Model-Agnostic+Matching)|0| -|[Item-Provider Co-learning for Sequential Recommendation](https://doi.org/10.1145/3477495.3531756)|Lei Chen, Jingtao Ding, Min Yang, Chengming Li, Chonggang Song, Lingling Yi|Sun Yat-sen University, Shenzhen, China; Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China; Tencent Inc., Shenzhen, China|Sequential recommender systems (SRSs) have become a research hotspot recently due to their powerful ability in capturing users' dynamic preferences. The key idea behind SRSs is to model the sequential dependencies over the user-item interactions. However, we argue that users' preferences are not only determined by the items they view or purchase but also affected by the item-providers with which users have interacted. For instance, in a short-video scenario, a user may click on a video because he/she is attracted to either the video content or simply the video-providers, as the vloggers are his/her idols. Motivated by the above observations, in this paper, we propose IPSRec, a novel Item-Provider co-learning framework for Sequential Recommendation. Specifically, we propose two representation learning methods (single-stream and cross-stream) to learn comprehensive item and user representations based on the user's historical item sequence and provider sequence. Then, contrastive learning is employed to further enhance the user embeddings in a self-supervised manner, which treats the representations of a specific user learned from the item side as well as the item-provider side as the positive pair and treats the representations of different users in the batch as the negative samples. Extensive experiments on three real-world SRS datasets demonstrate that IPSRec achieves substantially better results than the strong competitors.
For reproducibility, our code and data are available at https://github.com/siat-nlp/IPSRec.|顺序推荐系统(SRS)由于具有捕获用户动态偏好的强大功能,近年来成为研究的热点。SRS 背后的关键思想是在用户-项目交互之间建立顺序依赖关系模型。然而,我们认为用户的偏好不仅取决于他们的观点或购买项目,而且还受到项目供应商的用户已经互动。例如,在一个短视频场景中,用户可能会点击一个视频,因为他/她要么被视频内容吸引,要么被视频提供商吸引,因为视频博客是他/她的偶像。基于上述观察,本文提出了一种新的项目提供者协同学习的序贯推荐框架 IPSRec。具体来说,我们提出了两种表示学习方法(单蒸汽和跨流)来学习综合项目和用户表示基于用户的历史项目序列和提供者序列。然后,采用对比学习的方法,以自监督的方式进一步增强用户嵌入,将从项目侧和项目提供者侧学习到的特定用户的表征视为正对,将批处理中不同用户的表征视为负样本。对三个实际 SRS 数据集的大量实验表明,IPSRec 比强大的竞争对手获得了更好的结果。为确保重复性,我们的代码和数据可在 https://github.com/siat-nlp/ipsrec 查阅。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Item-Provider+Co-learning+for+Sequential+Recommendation)|0| -|[Inconsistent Ranking Assumptions in Medical Search and Their Downstream Consequences](https://doi.org/10.1145/3477495.3531898)|Daniel Cohen, Kevin Du, Bhaskar Mitra, Laura Mercurio, Navid Rekabsaz, Carsten Eickhoff|Johannes Kepler Univ Linz, Linz, Austria; Brown Univ, Providence, RI 02912 USA; Brown Univ, Alpert Med Sch, Providence, RI 02912 USA; Microsoft, Montreal, PQ, Canada|Given a query, neural retrieval models predict point estimates of relevance for each document; however, a significant drawback of relying solely on point estimates is that they contain no indication of the model's confidence in its predictions. Despite this lack of information, downstream methods such as reranking, cutoff prediction, and none-of-the-above classification are still able to learn effective functions to accomplish their respective tasks. Unfortunately, these downstream methods can suffer poor performance when the initial ranking model loses confidence in its score predictions. This becomes increasingly important in high-stakes settings, such as medical searches that can influence health decision making. Recent work has resolved this lack of information by introducing Bayesian uncertainty to capture the possible distribution of a document score. This paper presents the use of this uncertainty information as an indicator of how well downstream methods will function over a ranklist. We highlight a significant bias against certain disease-related queries within the posterior distribution of a neural model, and show that this bias in a model's predictive distribution propagates to downstream methods. Finally, we introduce a multi-distribution uncertainty metric, confidence decay, as a valid way of partially identifying these failure cases in an offline setting without the need for any user feedback.|给定一个查询,神经检索模型预测每个文档的相关性点估计; 然而,仅仅依赖点估计的一个显著缺点是,它们不包含模型对其预测的置信度的指示。尽管缺乏这种信息,下游方法,如重新排序,截止预测,以及没有上述分类仍然能够学习有效的功能,以完成各自的任务。不幸的是,当初始排名模型对其分数预测失去信心时,这些下游方法的性能可能会很差。这在高风险环境中变得越来越重要,例如可以影响健康决策的医学搜索。最近的工作通过引入贝叶斯不确定性来捕获文档分数的可能分布,解决了这种信息缺乏的问题。本文介绍了使用这种不确定性信息作为一个指标,以及下游方法将如何在一个排名表的功能。我们强调,在神经模型的后验概率中,某些与疾病相关的查询存在显著的偏差,并表明模型预测分布的这种偏差会传播到下游方法。最后,我们介绍了一个多分布不确定性度量,置信度衰减,作为一个有效的方法,部分识别这些失败案例在脱机设置,而不需要任何用户反馈。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Inconsistent+Ranking+Assumptions+in+Medical+Search+and+Their+Downstream+Consequences)|0| +|[Item-Provider Co-learning for Sequential Recommendation](https://doi.org/10.1145/3477495.3531756)|Lei Chen, Jingtao Ding, Min Yang, Chengming Li, Chonggang Song, Lingling Yi|Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China; Tencent Inc., Shenzhen, China; Sun Yat-sen University, Shenzhen, China|Sequential recommender systems (SRSs) have become a research hotspot recently due to their powerful ability in capturing users' dynamic preferences.
The key idea behind SRSs is to model the sequential dependencies over the user-item interactions. However, we argue that users' preferences are not only determined by the items they view or purchase but also affected by the item-providers with which users have interacted. For instance, in a short-video scenario, a user may click on a video because he/she is attracted to either the video content or simply the video-providers, as the vloggers are his/her idols. Motivated by the above observations, in this paper, we propose IPSRec, a novel Item-Provider co-learning framework for Sequential Recommendation. Specifically, we propose two representation learning methods (single-stream and cross-stream) to learn comprehensive item and user representations based on the user's historical item sequence and provider sequence. Then, contrastive learning is employed to further enhance the user embeddings in a self-supervised manner, which treats the representations of a specific user learned from the item side as well as the item-provider side as the positive pair and treats the representations of different users in the batch as the negative samples. Extensive experiments on three real-world SRS datasets demonstrate that IPSRec achieves substantially better results than the strong competitors. For reproducibility, our code and data are available at https://github.com/siat-nlp/IPSRec.|顺序推荐系统(SRS)由于具有捕获用户动态偏好的强大功能,近年来成为研究的热点。SRS 背后的关键思想是在用户-项目交互之间建立顺序依赖关系模型。然而,我们认为用户的偏好不仅取决于他们的观点或购买项目,而且还受到项目供应商的用户已经互动。例如,在一个短视频场景中,用户可能会点击一个视频,因为他/她要么被视频内容吸引,要么被视频提供商吸引,因为视频博客是他/她的偶像。基于上述观察,本文提出了一种新的项目提供者协同学习的序贯推荐框架 IPSRec。具体来说,我们提出了两种表示学习方法(单蒸汽和跨流)来学习综合项目和用户表示基于用户的历史项目序列和提供者序列。然后,采用对比学习的方法,以自监督的方式进一步增强用户嵌入,将从项目侧和项目提供者侧学习到的特定用户的表征视为正对,将批处理中不同用户的表征视为负样本。对三个实际 SRS 数据集的大量实验表明,IPSRec 比强大的竞争对手获得了更好的结果。为确保重复性,我们的代码和数据可在 https://github.com/siat-nlp/ipsrec 查阅。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Item-Provider+Co-learning+for+Sequential+Recommendation)|0| +|[Inconsistent Ranking Assumptions in Medical Search and Their Downstream Consequences](https://doi.org/10.1145/3477495.3531898)|Daniel Cohen, Kevin Du, Bhaskar Mitra, Laura Mercurio, Navid Rekabsaz, Carsten Eickhoff|Brown Univ, Alpert Med Sch, Providence, RI 02912 USA; Microsoft, Montreal, PQ, Canada; Brown Univ, Providence, RI 02912 USA; Johannes Kepler Univ Linz, Linz, Austria|Given a query, neural retrieval models predict point estimates of relevance for each document; however, a significant drawback of relying solely on point estimates is that they contain no indication of the model's confidence in its predictions. Despite this lack of information, downstream methods such as reranking, cutoff prediction, and none-of-the-above classification are still able to learn effective functions to accomplish their respective tasks. Unfortunately, these downstream methods can suffer poor performance when the initial ranking model loses confidence in its score predictions. This becomes increasingly important in high-stakes settings, such as medical searches that can influence health decision making. Recent work has resolved this lack of information by introducing Bayesian uncertainty to capture the possible distribution of a document score. This paper presents the use of this uncertainty information as an indicator of how well downstream methods will function over a ranklist.
We highlight a significant bias against certain disease-related queries within the posterior distribution of a neural model, and show that this bias in a model's predictive distribution propagates to downstream methods. Finally, we introduce a multi-distribution uncertainty metric, confidence decay, as a valid way of partially identifying these failure cases in an offline setting without the need for any user feedback.|给定一个查询,神经检索模型预测每个文档的相关性点估计; 然而,仅仅依赖点估计的一个显著缺点是,它们不包含模型对其预测的置信度的指示。尽管缺乏这种信息,下游方法,如重新排序,截止预测,以及没有上述分类仍然能够学习有效的功能,以完成各自的任务。不幸的是,当初始排名模型对其分数预测失去信心时,这些下游方法的性能可能会很差。这在高风险环境中变得越来越重要,例如可以影响健康决策的医学搜索。最近的工作通过引入贝叶斯不确定性来捕获文档分数的可能分布,解决了这种信息缺乏的问题。本文介绍了使用这种不确定性信息作为一个指标,以及下游方法将如何在一个排名表的功能。我们强调,在神经模型的后验概率中,某些与疾病相关的查询存在显著的偏差,并表明模型预测分布的这种偏差会传播到下游方法。最后,我们介绍了一个多分布不确定性度量,置信度衰减,作为一个有效的方法,部分识别这些失败案例在脱机设置,而不需要任何用户反馈。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Inconsistent+Ranking+Assumptions+in+Medical+Search+and+Their+Downstream+Consequences)|0| |[Exploiting Session Information in BERT-based Session-aware Sequential Recommendation](https://doi.org/10.1145/3477495.3531910)|Jinseok Jamie Seol, Youngrok Ko, Sanggoo Lee|Seoul National University, Seoul, Republic of Korea|In recommendation systems, utilizing the user interaction history as sequential information has resulted in great performance improvement. However, in many online services, user interactions are commonly grouped by sessions that presumably share preferences, which requires a different approach from ordinary sequence representation techniques. To this end, sequence representation models with a hierarchical structure or various viewpoints have been developed but with a rather complex network structure. In this paper, we propose three methods to improve recommendation performance by exploiting session information while minimizing additional parameters in a BERT-based sequential recommendation model: using session tokens, adding session segment embeddings, and a time-aware self-attention. We demonstrate the feasibility of the proposed methods through experiments on widely used recommendation datasets.|在推荐系统中,利用用户交互历史作为序列信息,可以大大提高推荐系统的性能。然而,在许多在线服务中,用户交互通常按照可能共享首选项的会话进行分组,这需要一种不同于普通序列表示技术的方法。为此,开发了具有层次结构或不同视点的序列表示模型,但其网络结构相当复杂。在基于 BERT 的顺序推荐模型中,我们提出了利用会话信息同时最小化附加参数来提高推荐性能的三种方法: 使用会话令牌、增加会话段嵌入和有时间意识的自我注意。通过在广泛使用的推荐数据集上的实验,验证了该方法的可行性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Exploiting+Session+Information+in+BERT-based+Session-aware+Sequential+Recommendation)|0| -|[Towards Reproducible Machine Learning Research in Information Retrieval](https://doi.org/10.1145/3477495.3532686)|Ana Lucic, Maurits J. R. Bleeker, Maarten de Rijke, Koustuv Sinha, Sami Jullien, Robert Stojnic|University of Amsterdam, Amsterdam, Netherlands; McGill University, Montreal, Canada; Facebook AI Research, London, United Kingdom|While recent progress in the field of machine learning (ML) and information retrieval (IR) has been significant, the reproducibility of these cutting-edge results is often lacking, with many submissions failing to provide the necessary information in order to ensure subsequent reproducibility. Despite the introduction of self-check mechanisms before submission (such as the Reproducibility Checklist, criteria for evaluating reproducibility during reviewing at several major conferences, an artifact review and badging framework, and dedicated reproducibility tracks and challenges at major IR conferences), the motivation for executing reproducible research is lacking in the broader information community.
We propose this tutorial as a gentle introduction to help ensure reproducible research in IR, with a specific emphasis on ML aspects of IR research.|虽然机器学习(ML)和信息检索学习(IR)领域的最新进展显著,但这些尖端结果的可重复性往往缺乏,许多提交的文件未能提供必要的信息,以确保随后的可重复性。尽管在提交之前引入了自我检查机制(例如重现性检查表,在几个主要会议上评估重现性的标准,工件审查和徽章框架,以及在主要 IR 会议上专门的重现性轨道和挑战),在更广泛的信息社区中缺乏执行可重现性研究的动机。我们建议本教程作为一个温和的介绍,以帮助确保在 IR 的重复性研究,并特别强调机器学习方面的 IR 研究。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Towards+Reproducible+Machine+Learning+Research+in+Information+Retrieval)|0| -|[Structured and Natural Responses Co-generation for Conversational Search](https://doi.org/10.1145/3477495.3532063)|Chenchen Ye, Lizi Liao, Fuli Feng, Wei Ji, TatSeng Chua|University of Science and Technology of China, Hefei, China; Singapore Management University, Singapore, Singapore; National University of Singapore, Singapore, Singapore; Sea-NExT Joint Lab, National University of Singapore, Singapore, Singapore|Generating fluent and informative natural responses while maintaining representative internal states for search optimization is critical for conversational search systems. Existing approaches either 1) predict structured dialog acts first and then generate natural responses; or 2) map conversation context to natural responses directly in an end-to-end manner. Both kinds of approaches have shortcomings. The former suffers from error accumulation, while the semantic associations between structured acts and natural responses are confined to a single direction. The latter emphasizes generating natural responses but fails to predict structured acts. Therefore, we propose a neural co-generation model that generates the two concurrently. The key lies in a shared latent space shaped by two informed priors. Specifically, we design structured dialog acts and natural response auto-encoding as two auxiliary tasks in an interconnected network architecture. It allows for concurrent generation and bidirectional semantic associations. The shared latent space also enables asynchronous reinforcement learning for further joint optimization. Experiments show that our model achieves significant performance improvements.|生成流畅和信息丰富的自然反应,同时为搜索引擎优化保留有代表性的内部状态,这对会话搜索系统至关重要。现有的方法包括: 1)预测结构化对话首先发生,然后产生自然反应; 或者2)以端到端的方式将对话上下文直接映射到自然反应。这两种方法都有缺点。结构化行为和自然反应之间的语义联系是单向的,而结构化行为和自然反应之间的语义联系是单向的。后者强调产生自然反应,但无法预测结构化行为。因此,我们提出了一个神经元协同生成模型,并生成两个。关键在于一个共享的潜在空间,由两个知情的前任塑造。具体来说,我们设计了结构化对话行为和自然响应自动编码作为互联网络体系结构中的两个辅助任务。它允许并发生成和双向语义关联。共享潜在空间还支持异步强化学习,以进一步优化联合。实验结果表明,该模型取得了显著的性能改善。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Structured+and+Natural+Responses+Co-generation+for+Conversational+Search)|0| +|[Towards Reproducible Machine Learning Research in Information Retrieval](https://doi.org/10.1145/3477495.3532686)|Ana Lucic, Maurits J. R. Bleeker, Maarten de Rijke, Koustuv Sinha, Sami Jullien, Robert Stojnic|Facebook AI Research, London, United Kingdom; McGill University, Montreal, Canada; University of Amsterdam, Amsterdam, Netherlands|While recent progress in the field of machine learning (ML) and information retrieval (IR) has been significant, the reproducibility of these cutting-edge results is often lacking, with many submissions failing to provide the necessary information in order to ensure subsequent reproducibility.
Despite the introduction of self-check mechanisms before submission (such as the Reproducibility Checklist, criteria for evaluating reproducibility during reviewing at several major conferences, an artifact review and badging framework, and dedicated reproducibility tracks and challenges at major IR conferences), the motivation for executing reproducible research is lacking in the broader information community. We propose this tutorial as a gentle introduction to help ensure reproducible research in IR, with a specific emphasis on ML aspects of IR research.|虽然机器学习(ML)和信息检索学习(IR)领域的最新进展显著,但这些尖端结果的可重复性往往缺乏,许多提交的文件未能提供必要的信息,以确保随后的可重复性。尽管在提交之前引入了自我检查机制(例如重现性检查表,在几个主要会议上评估重现性的标准,工件审查和徽章框架,以及在主要 IR 会议上专门的重现性轨道和挑战),在更广泛的信息社区中缺乏执行可重现性研究的动机。我们建议本教程作为一个温和的介绍,以帮助确保在 IR 的重复性研究,并特别强调机器学习方面的 IR 研究。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Towards+Reproducible+Machine+Learning+Research+in+Information+Retrieval)|0| +|[Structured and Natural Responses Co-generation for Conversational Search](https://doi.org/10.1145/3477495.3532063)|Chenchen Ye, Lizi Liao, Fuli Feng, Wei Ji, TatSeng Chua|Singapore Management University, Singapore, Singapore; University of Science and Technology of China, Hefei, China; National University of Singapore, Singapore, Singapore; Sea-NExT Joint Lab, National University of Singapore, Singapore, Singapore|Generating fluent and informative natural responses while maintaining representative internal states for search optimization is critical for conversational search systems. Existing approaches either 1) predict structured dialog acts first and then generate natural responses; or 2) map conversation context to natural responses directly in an end-to-end manner. Both kinds of approaches have shortcomings. The former suffers from error accumulation, while the semantic associations between structured acts and natural responses are confined to a single direction. The latter emphasizes generating natural responses but fails to predict structured acts. Therefore, we propose a neural co-generation model that generates the two concurrently. The key lies in a shared latent space shaped by two informed priors. Specifically, we design structured dialog acts and natural response auto-encoding as two auxiliary tasks in an interconnected network architecture. It allows for concurrent generation and bidirectional semantic associations. The shared latent space also enables asynchronous reinforcement learning for further joint optimization. Experiments show that our model achieves significant performance improvements.|生成流畅和信息丰富的自然反应,同时为搜索引擎优化保留有代表性的内部状态,这对会话搜索系统至关重要。现有的方法包括: 1)预测结构化对话首先发生,然后产生自然反应; 或者2)以端到端的方式将对话上下文直接映射到自然反应。这两种方法都有缺点。结构化行为和自然反应之间的语义联系是单向的,而结构化行为和自然反应之间的语义联系是单向的。后者强调产生自然反应,但无法预测结构化行为。因此,我们提出了一个神经元协同生成模型,并生成两个。关键在于一个共享的潜在空间,由两个知情的前任塑造。具体来说,我们设计了结构化对话行为和自然响应自动编码作为互联网络体系结构中的两个辅助任务。它允许并发生成和双向语义关联。共享潜在空间还支持异步强化学习,以进一步优化联合。实验结果表明,该模型取得了显著的性能改善。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Structured+and+Natural+Responses+Co-generation+for+Conversational+Search)|0| |[PEVAE: A Hierarchical VAE for Personalized Explainable Recommendation](https://doi.org/10.1145/3477495.3532039)|Zefeng Cai, Zerui Cai|East China Normal University, Shanghai, China|Variational autoencoders (VAEs) have been widely applied in recommendations. One reason is that their amortized inferences are beneficial for overcoming data sparsity. However, in explainable recommendation, which generates natural language explanations, they are still rarely explored.
Thus, we aim to extend VAE to explainable recommendation. In this task, we find that VAE can generate acceptable explanations for users with few relevant training samples; however, it tends to generate less personalized explanations than autoencoders (AEs) for users with relatively sufficient samples. We conjecture that information shared by different users in VAE disturbs the information for a specific user. To deal with this problem, we present PErsonalized VAE (PEVAE) that generates personalized natural language explanations for explainable recommendation. Moreover, we propose two novel mechanisms to aid our model in generating more personalized explanations, including 1) Self-Adaption Fusion (SAF), which manipulates the latent space in a self-adaption manner to control the influence of shared information. In this way, our model can enjoy the advantage of overcoming the sparsity of data while generating more personalized explanations for a user with relatively sufficient training samples. 2) DEpendence Maximization (DEM), which strengthens the dependence between recommendations and explanations by maximizing the mutual information. It makes the explanation more specific to the input user-item pair and thus improves the personalization of the generated explanations. Extensive experiments show PEVAE can generate more personalized explanations, and further analyses demonstrate the practical effect of our proposed methods.|变分自动编码器(VAE)已被广泛应用于建议。一个原因是,他们的摊销推断有利于克服数据稀疏。然而,在生成自然语言解释的可解释推荐中,它们仍然很少被探索。因此,我们的目标是将 VAE 扩展到可解释的推荐。在这个任务中,我们发现 VAE 可以在相关训练样本较少的情况下为用户生成可接受的解释,但是,与自动编码器(AE)相比,它往往在样本相对充足的情况下为用户生成较少的个性化解释。我们推测 VAE 中不同用户共享的信息会干扰特定用户的信息。为了解决这个问题,我们提出了个性化的 VAE (PEVAE) ,它可以为解释性推荐生成个性化的自然语言解释。此外,我们提出了两种新的机制来帮助我们的模型产生更多的个性化解释,包括1)自适应融合(SAF)以自适应的方式操纵潜在空间来控制共享信息的影响。通过这种方式,我们的模型可以在克服数据稀疏性的同时,通过相对充足的训练样本为用户生成更加个性化的解释。2)依赖最大化(DEM)通过最大化相互信息来增强推荐与解释之间的依赖性。它使解释更加具体到输入用户项对,从而改进了生成的解释的个性化。大量的实验表明,PEVAE 可以产生更加个性化的解释,进一步的分析表明,我们提出的方法的实际效果。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=PEVAE:+A+Hierarchical+VAE+for+Personalized+Explainable+Recommendation)|0| -|[Counterfactual Learning To Rank for Utility-Maximizing Query Autocompletion](https://doi.org/10.1145/3477495.3531958)|Adam Block, Rahul Kidambi, Daniel N. Hill, Thorsten Joachims, Inderjit S. Dhillon|University of Texas at Austin, Austin, TX, USA; Amazon Music, San Francisco, CA, USA; Massachusetts Institute of Technology, Cambridge, MA, USA; Amazon Search, Berkeley, CA, USA|Conventional methods for query autocompletion aim to predict which completed query a user will select from a list. A shortcoming of this approach is that users often do not know which query will provide the best retrieval performance on the current information retrieval system, meaning that any query autocompletion methods trained to mimic user behavior can lead to suboptimal query suggestions. To overcome this limitation, we propose a new approach that explicitly optimizes the query suggestions for downstream retrieval performance. We formulate this as a problem of ranking a set of rankings, where each query suggestion is represented by the downstream item ranking it produces. We then present a learning method that ranks query suggestions by the quality of their item rankings. The algorithm is based on a counterfactual learning approach that is able to leverage feedback on the items (e.g., clicks, purchases) to evaluate query suggestions through an unbiased estimator, thus avoiding the assumption that users write or select optimal queries.
We establish theoretical support for the proposed approach and provide learning-theoretic guarantees. We also present empirical results on publicly available datasets, and demonstrate real-world applicability using data from an online shopping store.|查询自动完成的传统方法旨在预测用户将从列表中选择哪个已完成的查询。这种方法的一个缺点是,用户通常不知道哪个查询将在当前的信息检索系统上提供最佳的检索性能,这意味着任何经过训练的模拟用户行为的查询自动完成方法都可能导致次优的查询建议。为了克服这一限制,我们提出了一种新的方法,显式优化查询建议的下游检索性能。我们将这个问题表述为对一组排名进行排序的问题,其中每个查询建议由它产生的下游项目排名表示。然后,我们提出了一种学习方法,根据项目排名的质量对查询建议进行排序。该算法基于一种反事实学习方法,能够利用对项目(如点击、购买)的反馈,通过一个无偏估计器来评估查询建议,从而避免了用户编写或选择最佳查询的假设。我们为提出的方法建立了理论支持,并提供了学习理论保证。我们还提出了公开可用数据集的实证结果,并证明了真实世界的适用性使用数据从网上购物商店。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Counterfactual+Learning+To+Rank+for+Utility-Maximizing+Query+Autocompletion)|0| -|[Automatic Expert Selection for Multi-Scenario and Multi-Task Search](https://doi.org/10.1145/3477495.3531942)|Xinyu Zou, Zhi Hu, Yiming Zhao, Xuchu Ding, Zhongyi Liu, Chenliang Li, Aixin Sun|Wuhan University, Wuhan, China; Ant Group, Hangzhou, China; Nanyang Technological University, Singapore, Singapore|Multi-scenario learning (MSL) enables a service provider to cater for users' fine-grained demands by separating services for different user sectors, e.g., by user's geographical region. Under each scenario there is a need to optimize multiple task-specific targets, e.g., click-through rate and conversion rate, known as multi-task learning (MTL). Recent solutions for MSL and MTL are mostly based on the multi-gate mixture-of-experts (MMoE) architecture. The MMoE structure is typically static and its design requires domain-specific knowledge, making it less effective in handling both MSL and MTL. In this paper, we propose a novel Automatic Expert Selection framework for Multi-scenario and Multi-task search, named AESM2. AESM2 integrates both MSL and MTL into a unified framework with automatic structure learning. Specifically, AESM2 stacks multi-task layers over multi-scenario layers. This hierarchical design enables us to flexibly establish intrinsic connections between different scenarios, and at the same time also supports high-level feature extraction for different tasks. At each multi-scenario/multi-task layer, a novel expert selection algorithm is proposed to automatically identify scenario-/task-specific and shared experts for each input. Experiments over two real-world large-scale datasets demonstrate the effectiveness of AESM2 over a battery of strong baselines. An online A/B test also shows substantial performance gains on multiple metrics. Currently, AESM2 has been deployed online for serving major traffic.|多场景学习可让服务供应商因应用户的细粒度需求,将不同用户界别的服务(例如按用户地理地区)分开。在每种情况下,都需要优化多个特定任务的目标,例如点击率和转换率,称为多任务学习(MTL)。最近的 MSL 和 MTL 解决方案大多基于多门混合专家(MMoE)体系结构。MMoE 结构通常是静态的,其设计需要特定于领域的知识,因此在处理 MSL 和 MTL 时效率较低。本文提出了一种面向多场景多任务搜索的自动专家选择框架 AESM2。AESM2将 MSL 和 MTL 集成到一个具有自动结构学习的统一框架中。具体来说,AESM2在多场景层上堆叠多任务层。这种分层设计使我们能够灵活地建立不同场景之间的内在联系,同时也支持不同任务的高级特征提取。在每个多场景/多任务层,提出了一种新的专家选择算法来自动识别每个输入的场景/任务特定的和共享的专家。通过两个真实世界的大规模数据集的实验证明了 AESM2在一组强基线上的有效性。在线 A/B 测试还显示了在多个指标上的大量性能增益。目前,AESM2已经部署在线服务主要流量。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Automatic+Expert+Selection+for+Multi-Scenario+and+Multi-Task+Search)|0|
+|[Counterfactual Learning To Rank for Utility-Maximizing Query Autocompletion](https://doi.org/10.1145/3477495.3531958)|Adam Block, Rahul Kidambi, Daniel N. Hill, Thorsten Joachims, Inderjit S. Dhillon|Massachusetts Institute of Technology, Cambridge, MA, USA; University of Texas at Austin, Austin, TX, USA; Amazon Search, Berkeley, CA, USA; Amazon Music, San Francisco, CA, USA|Conventional methods for query autocompletion aim to predict which completed query a user will select from a list. A shortcoming of this approach is that users often do not know which query will provide the best retrieval performance on the current information retrieval system, meaning that any query autocompletion methods trained to mimic user behavior can lead to suboptimal query suggestions. To overcome this limitation, we propose a new approach that explicitly optimizes the query suggestions for downstream retrieval performance. We formulate this as a problem of ranking a set of rankings, where each query suggestion is represented by the downstream item ranking it produces. We then present a learning method that ranks query suggestions by the quality of their item rankings. The algorithm is based on a counterfactual learning approach that is able to leverage feedback on the items (e.g., clicks, purchases) to evaluate query suggestions through an unbiased estimator, thus avoiding the assumption that users write or select optimal queries. We establish theoretical support for the proposed approach and provide learning-theoretic guarantees. We also present empirical results on publicly available datasets, and demonstrate real-world applicability using data from an online shopping store.|查询自动完成的传统方法旨在预测用户将从列表中选择哪个已完成的查询。这种方法的一个缺点是,用户通常不知道哪个查询将在当前的信息检索系统上提供最佳的检索性能,这意味着任何经过训练的模拟用户行为的查询自动完成方法都可能导致次优的查询建议。为了克服这一限制,我们提出了一种新的方法,显式优化查询建议的下游检索性能。我们将这个问题表述为对一组排名进行排序的问题,其中每个查询建议由它产生的下游项目排名表示。然后,我们提出了一种学习方法,根据项目排名的质量对查询建议进行排序。该算法基于一种反事实学习方法,能够利用对项目(如点击、购买)的反馈,通过一个无偏估计器来评估查询建议,从而避免了用户编写或选择最佳查询的假设。我们为提出的方法建立了理论支持,并提供了学习理论保证。我们还提出了公开可用数据集的实证结果,并证明了真实世界的适用性使用数据从网上购物商店。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Counterfactual+Learning+To+Rank+for+Utility-Maximizing+Query+Autocompletion)|0| +|[Automatic Expert Selection for Multi-Scenario and Multi-Task Search](https://doi.org/10.1145/3477495.3531942)|Xinyu Zou, Zhi Hu, Yiming Zhao, Xuchu Ding, Zhongyi Liu, Chenliang Li, Aixin Sun|Ant Group, Hangzhou, China; Nanyang Technological University, Singapore, Singapore; Wuhan University, Wuhan, China|Multi-scenario learning (MSL) enables a service provider to cater for users' fine-grained demands by separating services for different user sectors, e.g., by user's geographical region. Under each scenario there is a need to optimize multiple task-specific targets, e.g., click-through rate and conversion rate, known as multi-task learning (MTL). Recent solutions for MSL and MTL are mostly based on the multi-gate mixture-of-experts (MMoE) architecture. The MMoE structure is typically static and its design requires domain-specific knowledge, making it less effective in handling both MSL and MTL. In this paper, we propose a novel Automatic Expert Selection framework for Multi-scenario and Multi-task search, named AESM2. AESM2 integrates both MSL and MTL into a unified framework with automatic structure learning. Specifically, AESM2 stacks multi-task layers over multi-scenario layers. This hierarchical design enables us to flexibly establish intrinsic connections between different scenarios, and at the same time also supports high-level feature extraction for different tasks.
At each multi-scenario/multi-task layer, a novel expert selection algorithm is proposed to automatically identify scenario-/task-specific and shared experts for each input. Experiments over two real-world large-scale datasets demonstrate the effectiveness of AESM2 over a battery of strong baselines. An online A/B test also shows substantial performance gains on multiple metrics. Currently, AESM2 has been deployed online for serving major traffic.|多场景学习可让服务供应商因应用户的细粒度需求,将不同用户界别的服务(例如按用户地理地区)分开。在每种情况下,都需要优化多个特定任务的目标,例如点击率和转换率,称为多任务学习(MTL)。最近的 MSL 和 MTL 解决方案大多基于多门混合专家(MMoE)体系结构。MMoE 结构通常是静态的,其设计需要特定于领域的知识,因此在处理 MSL 和 MTL 时效率较低。本文提出了一种面向多场景多任务搜索的自动专家选择框架 AESM2。AESM2将 MSL 和 MTL 集成到一个具有自动结构学习的统一框架中。具体来说,AESM2在多场景层上堆叠多任务层。这种分层设计使我们能够灵活地建立不同场景之间的内在联系,同时也支持不同任务的高级特征提取。在每个多场景/多任务层,提出了一种新的专家选择算法来自动识别每个输入的场景/任务特定的和共享的专家。通过两个真实世界的大规模数据集的实验证明了 AESM2在一组强基线上的有效性。在线 A/B 测试还显示了在多个指标上的大量性能增益。目前,AESM2已经部署在线服务主要流量。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Automatic+Expert+Selection+for+Multi-Scenario+and+Multi-Task+Search)|0| |[Multi-Agent RL-based Information Selection Model for Sequential Recommendation](https://doi.org/10.1145/3477495.3532022)|Kaiyuan Li, Pengfei Wang, Chenliang Li|Wuhan University, Beijing, China; Beijing University of Posts and Telecommunications, Beijing, China|For sequential recommenders, the coarse-grained yet sparse sequential signals mined from massive user-item interactions have become the bottleneck to further improving recommendation performance. To alleviate the sparseness problem, exploiting auxiliary semantic features (e.g., textual descriptions, visual images and knowledge graphs) to enrich contextual information has turned into a mainstream methodology. Though effective, we argue that these different heterogeneous features certainly include much noise which may overwhelm the valuable sequential signals, and therefore easily reach the phenomenon of negative collaboration (i.e., 1 + 1 < 2). How to design a flexible strategy to select proper auxiliary information and alleviate the negative collaboration towards a better recommendation is still an interesting and open question. Unfortunately, few works have addressed this challenge in sequential recommendation. In this paper, we introduce a Multi-Agent RL-based Information Selection Model (named MARIS) to explore an effective collaboration between different kinds of auxiliary information and sequential signals in an automatic way. Specifically, MARIS formalizes the auxiliary feature selection as a cooperative Multi-agent Markov Decision Process. For each auxiliary feature type, MARIS resorts to using an agent to determine whether a specific kind of auxiliary feature should be imported to achieve a positive collaboration. In between, a QMIX network is utilized to coordinate their joint selection actions and produce an episode corresponding to an effective combination of different auxiliary features for the whole historical sequence. Considering the lack of supervised selection signals, we further devise a novel reward-guided sampling strategy to leverage an exploitation and exploration scheme for episode sampling. By preserving them in a replay buffer, MARIS learns the action-value function and the reward alternately for optimization.
Extensive experiments on four real-world datasets demonstrate that our model obtains significant performance improvement over up-to-date state-of-the-art recommendation models.|对于序列推荐系统来说,从大量用户交互中挖掘出的粗粒度稀疏序列信号已经成为进一步提高推荐性能的瓶颈。为了解决这一问题,利用辅助语义特征(如文本描述、视觉图像和知识图形)丰富上下文信息成为主流方法论。虽然有效,我们认为这些不同的异质性特征肯定包括大量的噪声,这可能压倒有价值的序列信号,因此很容易达到负协作现象(即1 + 1 > 2)。如何设计一种灵活的策略,选择合适的辅助信息,减轻负面协作,以获得更好的推荐,仍然是一个有趣而开放的问题。不幸的是,很少有作品在连续推荐中解决了这个问题。本文提出了一种基于多 Agent RL 的信息 S 选择模型(MARIS) ,用于探索不同辅助信息与序列信号之间的自动有效协作。具体来说,MARIS 将辅助特征选择形式化为一个合作的多代理马可夫决策过程。对于每一种辅助特征类型,MARIS 都使用一个代理来确定是否需要导入一种特定的辅助特征来实现积极的协作。其间,利用 QMIX 网络协同它们的联合选择行动,产生对应于整个历史序列的不同辅助特征的有效组合的情节。考虑到缺乏监督选择信号,我们进一步设计了一种新的奖励引导抽样策略,以利用开发和探索方案的情节抽样。通过将它们保存在一个重播缓冲区中,MARIS 学习动作-价值函数和优化的报酬。在四个真实世界数据集上的大量实验表明,我们的模型比最新的最先进的推荐模型获得了显著的性能改进。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Multi-Agent+RL-based+Information+Selection+Model+for+Sequential+Recommendation)|0| -|[Neural Statistics for Click-Through Rate Prediction](https://doi.org/10.1145/3477495.3531762)|Yanhua Huang, Hangyu Wang, Yiyun Miao, Ruiwen Xu, Lei Zhang, Weinan Zhang|Shanghai Jiao Tong University, Shanghai, China; Xiaohongshu Inc., Shanghai, China|With the success of deep learning, click-through rate (CTR) predictions are transitioning from shallow approaches to deep architectures. Current deep CTR prediction usually follows the Embedding & MLP paradigm, where the model embeds categorical features into latent semantic space. This paper introduces a novel embedding technique called neural statistics that instead learns explicit semantics of categorical features by incorporating feature engineering as an innate prior into the deep architecture in an end-to-end manner. Besides, since the statistical information changes over time, we study how to adapt to the distribution shift in the MLP module efficiently. Offline experiments on two public datasets validate the effectiveness of neural statistics against state-of-the-art models. We also apply it to a large-scale recommender system via online A/B tests, where the user's satisfaction is significantly improved.|随着深度学习的成功,点进率预测(ctrl)正从浅层方法向深层架构过渡。目前的深度 CTR 预测通常遵循嵌入与 MLP 范式,该模型将范畴特征嵌入到潜在语义空间中。本文介绍了一种新的嵌入技术,称为神经统计学,通过将特征工程作为一种先天优势以端到端的方式结合到深层体系结构中,来学习范畴特征的显性语义。此外,由于统计信息随时间变化,我们研究了如何有效地适应 MLP 模块中的分布变化。在两个公共数据集上的离线实验验证了针对最先进模型的神经统计的有效性。我们还通过在线 A/B 测试将其应用于大规模的推荐系统测试,用户的满意度显著提高。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Neural+Statistics+for+Click-Through+Rate+Prediction)|0| +|[Neural Statistics for Click-Through Rate Prediction](https://doi.org/10.1145/3477495.3531762)|Yanhua Huang, Hangyu Wang, Yiyun Miao, Ruiwen Xu, Lei Zhang, Weinan Zhang|Xiaohongshu Inc., Shanghai, China; Shanghai Jiao Tong University, Shanghai, China|With the success of deep learning, click-through rate (CTR) predictions are transitioning from shallow approaches to deep architectures. Current deep CTR prediction usually follows the Embedding & MLP paradigm, where the model embeds categorical features into latent semantic space. This paper introduces a novel embedding technique called neural statistics that instead learns explicit semantics of categorical features by incorporating feature engineering as an innate prior into the deep architecture in an end-to-end manner. Besides, since the statistical information changes over time, we study how to adapt to the distribution shift in the MLP module efficiently. Offline experiments on two public datasets validate the effectiveness of neural statistics against state-of-the-art models. 
We also apply it to a large-scale recommender system via online A/B tests, where the user's satisfaction is significantly improved.|随着深度学习的成功,点进率预测(CTR)正从浅层方法向深层架构过渡。目前的深度 CTR 预测通常遵循嵌入与 MLP 范式,该模型将范畴特征嵌入到潜在语义空间中。本文介绍了一种新的嵌入技术,称为神经统计学,通过将特征工程作为一种先天优势以端到端的方式结合到深层体系结构中,来学习范畴特征的显性语义。此外,由于统计信息随时间变化,我们研究了如何有效地适应 MLP 模块中的分布变化。在两个公共数据集上的离线实验验证了针对最先进模型的神经统计的有效性。我们还通过在线 A/B 测试将其应用于大规模的推荐系统测试,用户的满意度显著提高。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Neural+Statistics+for+Click-Through+Rate+Prediction)|0| |[Towards Results-level Proportionality for Multi-objective Recommender Systems](https://doi.org/10.1145/3477495.3531787)|Ladislav Peska, Patrik Dokoupil|Charles University, Prague, Czech Rep|The main focus of our work is the problem of multiple objectives optimization (MOO) while providing a final list of recommendations to the user. Currently, system designers can tune MOO by setting importance of individual objectives, usually in some kind of weighted average setting. However, this does not have to translate into the presence of such objectives in the final results. In contrast, in our work we would like to allow system designers or end-users to directly quantify the required relative ratios of individual objectives in the resulting recommendations, e.g., the final results should have 60% relevance, 30% diversity and 10% novelty. If individual objectives are transformed to represent quality on the same scale, these result conditioning expressions may greatly contribute towards recommendations tuneability and explainability as well as user's control over recommendations. To achieve this task, we propose an iterative algorithm inspired by the mandates allocation problem in public elections. The algorithm is applicable as long as per-item marginal gains of individual objectives can be calculated. Effectiveness of the algorithm is evaluated on several settings of the relevance-novelty-diversity optimization problem. Furthermore, we also outline several options to scale individual objectives to represent similar value for the user.|我们工作的主要重点是多目标优化(MOO)问题,同时向用户提供最终的建议列表。目前,系统设计者可以通过设定个人目标的重要性来调整 MOO,通常是在某种加权平均数设置中。然而,这并不意味着在最终结果中存在这样的目标。相比之下,在我们的工作中,我们希望允许系统设计者或最终用户直接量化结果建议中各个目标所需的相对比例,例如,最终结果应该有60% 的相关性,30% 的多样性和10% 的新颖性。如果将单个目标转换为在同一尺度上表示质量,那么这些结果条件表达式可能极大地有助于建议的可调整性和可解释性,以及用户对建议的控制。为了实现这一任务,我们提出了一种受公共选举中席位分配问题启发的迭代算法。只要能够计算出单个目标的单项边际收益,该算法是可行的。该算法的有效性是根据相关性-新颖性-多样性最佳化问题进行评估的。此外,我们还概述了几个选项,以缩放单个目标,表示用户的类似价值。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Towards+Results-level+Proportionality+for+Multi-objective+Recommender+Systems)|0| -|[Transform Cold-Start Users into Warm via Fused Behaviors in Large-Scale Recommendation](https://doi.org/10.1145/3477495.3531797)|Pengyang Li, Rong Chen, Quan Liu, Jian Xu, Bo Zheng|Alibaba Group, Hangzhou, China; Zhejiang University, Hangzhou, China|Recommendation for cold-start users who have very limited data is a canonical challenge in recommender systems. Existing deep recommender systems utilize user content features and behaviors to produce personalized recommendations, yet often face significant performance degradation on cold-start users compared to existing ones due to the following challenges: (1) Cold-start users may have a quite different distribution of features from existing users. (2) The few behaviors of cold-start users are hard to be exploited. In this paper, we propose a recommender system called Cold-Transformer to alleviate these problems.
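The results-level proportionality entry above describes an iterative, elections-inspired allocation of result slots to objectives. A toy sketch of one such greedy allocation, assuming per-item marginal gains are given (as the paper requires) and using a Sainte-Laguë-style quotient, which is an assumption; the paper's exact rule may differ.

```python
import numpy as np

rng = np.random.default_rng(2)

# Target shares for each objective in the final top-N list.
targets = {"relevance": 0.6, "diversity": 0.3, "novelty": 0.1}
N, n_items = 10, 50
# Per-item marginal gains per objective (toy; assumed precomputable).
gains = {obj: rng.random(n_items) for obj in targets}

selected, counts = [], {obj: 0 for obj in targets}
for _ in range(N):
    # Sainte-Lague-style quotient: the most under-served objective wins the slot.
    obj = max(targets, key=lambda o: targets[o] / (2 * counts[o] + 1))
    counts[obj] += 1
    remaining = [i for i in range(n_items) if i not in selected]
    selected.append(max(remaining, key=lambda i: gains[obj][i]))

print(counts)      # roughly a 6/3/1 split over the 10 slots
print(selected)
```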
Specifically, we design context-based Embedding Adaption to offset the differences in feature distribution. It transforms the embedding of cold-start users into a warm state that is more like existing ones to represent corresponding user preferences. Furthermore, to exploit the few behaviors of cold-start users and characterize the user context, we propose Label Encoding that models Fused Behaviors of positive and negative feedback simultaneously, which are relatively more sufficient. Last, to perform large-scale industrial recommendations, we keep the two-tower architecture that de-couples user and target item. Extensive experiments on public and industrial datasets show that Cold-Transformer significantly outperforms state-of-the-art methods, including those that are deep coupled and less scalable.|对于数据非常有限的冷启动用户的推荐在推荐系统中是一个典型的挑战。现有的深度推荐系统利用用户内容特征和行为来产生个性化的推荐,但是由于以下挑战,冷启动用户的性能往往比现有的推荐系统有显著的下降: (1)冷启动用户可能具有与现有用户完全不同的特征分布。(2)冷启动用户的少数行为很难被利用。在这篇文章中,我们提出了一个叫做冷变压器的推荐系统来缓解这些问题。具体来说,我们设计了基于上下文的嵌入适应,以抵消特征分布的差异。它将冷启动用户的嵌入转换为更像现有用户的暖状态,以表示相应的用户偏好。此外,为了充分利用冷启动用户的少数行为并刻画用户上下文特征,我们提出了标签编码方法,该方法同时对正反馈和负反馈的融合行为进行建模,相对来说比较充分。最后,为了执行大规模的工业建议,我们保留了解耦用户和目标项目的双塔架构。在公共和工业数据集上进行的大量实验表明,冷变压器的性能明显优于最先进的方法,包括那些深度耦合和可伸缩性较差的方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Transform+Cold-Start+Users+into+Warm+via+Fused+Behaviors+in+Large-Scale+Recommendation)|0| -|[Coarse-to-Fine Sparse Sequential Recommendation](https://doi.org/10.1145/3477495.3531732)|Jiacheng Li, Tong Zhao, Jin Li, Jim Chan, Christos Faloutsos, George Karypis, SooMin Pantel, Julian J. McAuley|Carnegie Mellon University, Pittsburgh, PA, USA; University of California, San Diego, La Jolla, CA, USA; Amazon, Seattle, WA, USA|Sequential recommendation aims to model dynamic user behavior from historical interactions. Self-attentive methods have proven effective at capturing short-term dynamics and long-term preferences. Despite their success, these approaches still struggle to model sparse data, on which they struggle to learn high-quality item representations. We propose to model user dynamics from shopping intents and interacted items simultaneously. The learned intents are coarse-grained and work as prior knowledge for item recommendation. To this end, we present a coarse-to-fine self-attention framework, namely CaFe, which explicitly learns coarse-grained and fine-grained sequential dynamics. Specifically, CaFe first learns intents from coarse-grained sequences which are dense and hence provide high-quality user intent representations. Then, CaFe fuses intent representations into item encoder outputs to obtain improved item representations. Finally, we infer recommended items based on representations of items and corresponding intents. 
Experiments on sparse datasets show that CaFe outperforms state-of-the-art self-attentive recommenders by 44.03% [email protected] on average.|顺序推荐旨在从历史交互中建立动态用户行为模型。事实证明,自我关注的方法在捕捉短期动态和长期偏好方面是有效的。尽管这些方法取得了成功,但它们仍然难以建立稀疏数据的模型,难以在稀疏数据上学习高质量的项目表示。我们建议同时从购物意图和交互项目建立用户动态模型。学习意图是粗粒度的,作为项目推荐的先验知识。为此,我们提出了一个由粗到细的自我注意框架,即 CaFe,它显式地学习粗粒度和细粒度的序列动力学。具体来说,CaFe 首先从密集的粗粒度序列中学习意图,因此提供高质量的用户意图表示。然后,CaFe 将意图表示融合到项编码器输出中,以获得改进的项表示。最后,根据项目的表示和相应的意图推断推荐项目。在稀疏数据集上的实验表明,CaFe 的性能平均比最先进的自我关注推荐系统高出44.03% 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Coarse-to-Fine+Sparse+Sequential+Recommendation)|0| -|[Conversational Recommendation via Hierarchical Information Modeling](https://doi.org/10.1145/3477495.3531830)|Quan Tu, Shen Gao, Yanran Li, Jianwei Cui, Bin Wang, Rui Yan|Xiaomi AI Lab, Beijing, China; Renmin University of China, Beijing, China; Peking University, Beijing, China|Conversational recommendation system aims to recommend appropriate items to user by directly asking preference on attributes or recommending item list. However, most of existing methods only employ the flat item and attribute relationship, and ignore the hierarchical relationship connected by the similar user which can provide more comprehensive information. And these methods usually use the user accepted attributes to represent the conversational history and ignore the hierarchical information of sequential transition in the historical turns. In this paper, we propose Hierarchical Information-aware Conversational Recommender (HICR) to model the two types of hierarchical information to boost the performance of CRS. Experiments conducted on four benchmark datasets verify the effectiveness of our proposed model.|会话推荐系统旨在通过直接询问用户对属性的偏好或推荐项目列表来向用户推荐合适的项目。然而,现有的方法大多只使用平面项目和属性关系,而忽略了相似用户之间的层次关系,这样可以提供更全面的信息。这些方法通常使用用户接受的属性来表示会话历史,而忽略了历史转折中顺序转换的层次信息。本文提出了基于层次信息感知的会话推荐系统(HICR) ,对两种层次信息进行建模,以提高会话推荐系统的性能。在四个基准数据集上进行的实验验证了该模型的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Conversational+Recommendation+via+Hierarchical+Information+Modeling)|0| -|[CTnoCVR: A Novelty Auxiliary Task Making the Lower-CTR-Higher-CVR Upper](https://doi.org/10.1145/3477495.3531843)|Dandan Zhang, Haotian Wu, Guanqi Zeng, Yao Yang, Weijiang Qiu, Yujie Chen, Haoyuan Hu|China Electric Power Research Institute, Beijing, China; Beijing Jiaotong University, Beijing, China; Cainiao Network, Hangzhou, China; Zhejiang Lab, Hangzhou, China|In recent years, multi-task learning models based on deep learning in recommender systems have attracted increasing attention from researchers in industry and academia. Accurately estimating post-click conversion rate (CVR) is often considered as the primary task of multi-task learning in recommender systems. However, some advertisers may try to get higher click-through rates (CTR) by over-decorating their ads, which may result in excessive exposure to samples with lower CVR. For example, some only eye-catching clickbait have higher CTR, but actually, CVR is very low. As a result, the overall performance of the recommender system will be hurt. In this paper, we introduce a novelty auxiliary task called CTnoCVR, which aims to predict the probability of events with click but no-conversion, in various state-of-the-art multi-task models of recommender systems to promote samples with high CVR but low CTR. 
Plentiful Experiments on a large-scale dataset gathered from traffic logs of Taobao's recommender system demonstrate that the introduction of CTnoCVR task significantly improves the prediction effect of CVR under various multi-task frameworks. In addition, we conduct the online test and evaluate the effectiveness of our proposed method to make those samples with high CVR and low CTR rank higher.|近年来,推荐系统中基于深度学习的多任务学习模型越来越受到业界和学术界的关注。在推荐系统中,准确估计点击后转换率(CVR)常常被认为是多任务学习的首要任务。然而,一些广告商可能试图通过过度装饰他们的广告来获得更高的点击率(CTR) ,这可能导致过度暴露于低 CVR 的样品。例如,一些只有吸引眼球的点击诱饵有较高的点击率,但实际上,CVR 是非常低的。因此,推荐系统的整体表现将受到影响。本文介绍了一种新颖的辅助任务 CTnoCVR,该任务在推荐系统的多任务模型中预测点击不转换的事件发生概率,以提升高 CVR 低 CTR 的样本。大量的实验表明,在多任务框架下,引入 CTnoCVR 任务可以显著提高 CVR 的预测效果。这些实验都是从淘宝推荐系统的流量日志中收集的大规模数据集中得到的。此外,我们进行了在线测试,并评估了我们提出的方法的有效性,使高 CVR 和低 CTR 排名的样本。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=CTnoCVR:+A+Novelty+Auxiliary+Task+Making+the+Lower-CTR-Higher-CVR+Upper)|0| +|[Transform Cold-Start Users into Warm via Fused Behaviors in Large-Scale Recommendation](https://doi.org/10.1145/3477495.3531797)|Pengyang Li, Rong Chen, Quan Liu, Jian Xu, Bo Zheng|Zhejiang University, Hangzhou, China; Alibaba Group, Hangzhou, China|Recommendation for cold-start users who have very limited data is a canonical challenge in recommender systems. Existing deep recommender systems utilize user content features and behaviors to produce personalized recommendations, yet often face significant performance degradation on cold-start users compared to existing ones due to the following challenges: (1) Cold-start users may have a quite different distribution of features from existing users. (2) The few behaviors of cold-start users are hard to be exploited. In this paper, we propose a recommender system called Cold-Transformer to alleviate these problems. Specifically, we design context-based Embedding Adaption to offset the differences in feature distribution. It transforms the embedding of cold-start users into a warm state that is more like existing ones to represent corresponding user preferences. Furthermore, to exploit the few behaviors of cold-start users and characterize the user context, we propose Label Encoding that models Fused Behaviors of positive and negative feedback simultaneously, which are relatively more sufficient. Last, to perform large-scale industrial recommendations, we keep the two-tower architecture that de-couples user and target item. Extensive experiments on public and industrial datasets show that Cold-Transformer significantly outperforms state-of-the-art methods, including those that are deep coupled and less scalable.|对于数据非常有限的冷启动用户的推荐在推荐系统中是一个典型的挑战。现有的深度推荐系统利用用户内容特征和行为来产生个性化的推荐,但是由于以下挑战,冷启动用户的性能往往比现有的推荐系统有显著的下降: (1)冷启动用户可能具有与现有用户完全不同的特征分布。(2)冷启动用户的少数行为很难被利用。在这篇文章中,我们提出了一个叫做冷变压器的推荐系统来缓解这些问题。具体来说,我们设计了基于上下文的嵌入适应,以抵消特征分布的差异。它将冷启动用户的嵌入转换为更像现有用户的暖状态,以表示相应的用户偏好。此外,为了充分利用冷启动用户的少数行为并刻画用户上下文特征,我们提出了标签编码方法,该方法同时对正反馈和负反馈的融合行为进行建模,相对来说比较充分。最后,为了执行大规模的工业建议,我们保留了解耦用户和目标项目的双塔架构。在公共和工业数据集上进行的大量实验表明,冷变压器的性能明显优于最先进的方法,包括那些深度耦合和可伸缩性较差的方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Transform+Cold-Start+Users+into+Warm+via+Fused+Behaviors+in+Large-Scale+Recommendation)|0| +|[Coarse-to-Fine Sparse Sequential Recommendation](https://doi.org/10.1145/3477495.3531732)|Jiacheng Li, Tong Zhao, Jin Li, Jim Chan, Christos Faloutsos, George Karypis, SooMin Pantel, Julian J. 
McAuley|Amazon, Seattle, WA, USA; University of California, San Diego, La Jolla, CA, USA; Carnegie Mellon University, Pittsburgh, PA, USA|Sequential recommendation aims to model dynamic user behavior from historical interactions. Self-attentive methods have proven effective at capturing short-term dynamics and long-term preferences. Despite their success, these approaches still struggle to model sparse data, on which they struggle to learn high-quality item representations. We propose to model user dynamics from shopping intents and interacted items simultaneously. The learned intents are coarse-grained and work as prior knowledge for item recommendation. To this end, we present a coarse-to-fine self-attention framework, namely CaFe, which explicitly learns coarse-grained and fine-grained sequential dynamics. Specifically, CaFe first learns intents from coarse-grained sequences which are dense and hence provide high-quality user intent representations. Then, CaFe fuses intent representations into item encoder outputs to obtain improved item representations. Finally, we infer recommended items based on representations of items and corresponding intents. Experiments on sparse datasets show that CaFe outperforms state-of-the-art self-attentive recommenders by 44.03% [email protected] on average.|顺序推荐旨在从历史交互中建立动态用户行为模型。事实证明,自我关注的方法在捕捉短期动态和长期偏好方面是有效的。尽管这些方法取得了成功,但它们仍然难以建立稀疏数据的模型,难以在稀疏数据上学习高质量的项目表示。我们建议同时从购物意图和交互项目建立用户动态模型。学习意图是粗粒度的,作为项目推荐的先验知识。为此,我们提出了一个由粗到细的自我注意框架,即 CaFe,它显式地学习粗粒度和细粒度的序列动力学。具体来说,CaFe 首先从密集的粗粒度序列中学习意图,因此提供高质量的用户意图表示。然后,CaFe 将意图表示融合到项编码器输出中,以获得改进的项表示。最后,根据项目的表示和相应的意图推断推荐项目。在稀疏数据集上的实验表明,CaFe 的性能平均比最先进的自我关注推荐系统高出44.03% 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Coarse-to-Fine+Sparse+Sequential+Recommendation)|0| +|[Conversational Recommendation via Hierarchical Information Modeling](https://doi.org/10.1145/3477495.3531830)|Quan Tu, Shen Gao, Yanran Li, Jianwei Cui, Bin Wang, Rui Yan|Renmin University of China, Beijing, China; Peking University, Beijing, China; Xiaomi AI Lab, Beijing, China|Conversational recommendation system aims to recommend appropriate items to user by directly asking preference on attributes or recommending item list. However, most of existing methods only employ the flat item and attribute relationship, and ignore the hierarchical relationship connected by the similar user which can provide more comprehensive information. And these methods usually use the user accepted attributes to represent the conversational history and ignore the hierarchical information of sequential transition in the historical turns. In this paper, we propose Hierarchical Information-aware Conversational Recommender (HICR) to model the two types of hierarchical information to boost the performance of CRS. 
Experiments conducted on four benchmark datasets verify the effectiveness of our proposed model.|会话推荐系统旨在通过直接询问用户对属性的偏好或推荐项目列表来向用户推荐合适的项目。然而,现有的方法大多只使用平面项目和属性关系,而忽略了相似用户之间的层次关系,这样可以提供更全面的信息。这些方法通常使用用户接受的属性来表示会话历史,而忽略了历史转折中顺序转换的层次信息。本文提出了基于层次信息感知的会话推荐系统(HICR) ,对两种层次信息进行建模,以提高会话推荐系统的性能。在四个基准数据集上进行的实验验证了该模型的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Conversational+Recommendation+via+Hierarchical+Information+Modeling)|0| +|[CTnoCVR: A Novelty Auxiliary Task Making the Lower-CTR-Higher-CVR Upper](https://doi.org/10.1145/3477495.3531843)|Dandan Zhang, Haotian Wu, Guanqi Zeng, Yao Yang, Weijiang Qiu, Yujie Chen, Haoyuan Hu|Beijing Jiaotong University, Beijing, China; China Electric Power Research Institute, Beijing, China; Cainiao Network, Hangzhou, China; Zhejiang Lab, Hangzhou, China|In recent years, multi-task learning models based on deep learning in recommender systems have attracted increasing attention from researchers in industry and academia. Accurately estimating post-click conversion rate (CVR) is often considered as the primary task of multi-task learning in recommender systems. However, some advertisers may try to get higher click-through rates (CTR) by over-decorating their ads, which may result in excessive exposure to samples with lower CVR. For example, some only eye-catching clickbait have higher CTR, but actually, CVR is very low. As a result, the overall performance of the recommender system will be hurt. In this paper, we introduce a novel auxiliary task called CTnoCVR, which aims to predict the probability of events with click but no-conversion, in various state-of-the-art multi-task models of recommender systems to promote samples with high CVR but low CTR. Plentiful experiments on a large-scale dataset gathered from traffic logs of Taobao's recommender system demonstrate that the introduction of the CTnoCVR task significantly improves the prediction effect of CVR under various multi-task frameworks. In addition, we conduct the online test and evaluate the effectiveness of our proposed method to make those samples with high CVR and low CTR rank higher.|近年来,推荐系统中基于深度学习的多任务学习模型越来越受到业界和学术界的关注。在推荐系统中,准确估计点击后转换率(CVR)常常被认为是多任务学习的首要任务。然而,一些广告商可能试图通过过度装饰他们的广告来获得更高的点击率(CTR) ,这可能导致过度暴露于低 CVR 的样品。例如,一些只有吸引眼球的点击诱饵有较高的点击率,但实际上,CVR 是非常低的。因此,推荐系统的整体表现将受到影响。本文介绍了一种新颖的辅助任务 CTnoCVR,该任务在推荐系统的多任务模型中预测点击不转换的事件发生概率,以提升高 CVR 低 CTR 的样本。大量的实验表明,在多任务框架下,引入 CTnoCVR 任务可以显著提高 CVR 的预测效果。这些实验都是从淘宝推荐系统的流量日志中收集的大规模数据集中得到的。此外,我们进行了在线测试,并评估了我们提出的方法在使高 CVR、低 CTR 样本获得更高排名方面的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=CTnoCVR:+A+Novelty+Auxiliary+Task+Making+the+Lower-CTR-Higher-CVR+Upper)|0| |[Smooth-AUC: Smoothing the Path Towards Rank-based CTR Prediction](https://doi.org/10.1145/3477495.3531865)|Shuang Tang, Fangyuan Luo, Jun Wu|Beijing Jiaotong University, Beijing, China|Deep neural networks (DNNs) have been a key technique for click-through rate (CTR) estimation, yet existing DNNs-based CTR models neglect the inconsistency between their optimization objectives (e.g., Binary Cross Entropy, BCE) and CTR ranking metrics (e.g., Area Under the ROC Curve, AUC). It is noteworthy that directly optimizing AUC by gradient-descent methods is difficult due to the non-differentiable Heaviside function built into AUC. To this end, we propose a smooth approximation of AUC, called smooth-AUC (SAUC), towards the rank-based CTR prediction.
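The CTnoCVR entry above reduces to deriving a click-but-no-conversion label from the logs and adding it as an extra supervised head. A minimal sketch of such a joint objective follows; the loss weight and the stand-in predicted probabilities are assumptions, not the paper's architecture.

```python
import numpy as np

def bce(p, y):
    """Element-wise binary cross-entropy."""
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

# Toy batch: click and conversion labels from logged traffic.
click = np.array([1, 1, 0, 1], dtype=float)
conv  = np.array([0, 1, 0, 0], dtype=float)
ctnocvr = click * (1 - conv)      # auxiliary label: clicked but not converted

# Stand-in predicted probabilities from three task heads.
p_ctr, p_cvr, p_ctnocvr = [np.clip(np.random.rand(4), 1e-6, 1 - 1e-6)
                           for _ in range(3)]

# Joint objective: CTR and CVR tasks plus the CTnoCVR auxiliary task
# (the weight alpha is an assumed hyper-parameter, not from the paper).
alpha = 0.5
loss = (bce(p_ctr, click).mean()
        + bce(p_cvr, conv).mean()
        + alpha * bce(p_ctnocvr, ctnocvr).mean())
print(round(loss, 3))
```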
Specifically, SAUC relaxes the Heaviside function via sigmoid with a temperature coefficient (aiming at controlling the function sharpness) in order to facilitate the gradient-based optimization. Furthermore, SAUC is a plug-and-play objective that can be used in any DNNs-based CTR model. Experimental results on two real-world datasets demonstrate that SAUC consistently improves the recommendation accuracy of current DNNs-based CTR models.|深度神经网络(DNN)一直是点进率评估的关键技术,然而现有的基于 DNN 的 CTR 模型忽略了它们的优化目标(例如,二进制交叉熵,BCE)和 CTR 排名指标(例如,ROC Curve 下面积,AUC)之间的不一致性。值得注意的是,由于 AUC 内置的不可微单位阶跃函数,用梯度下降法直接优化 AUC 是困难的。为此,我们提出了一种平滑近似的 AUC,称为平滑 AUC (SAUC) ,用于基于秩的 CTR 预测。具体来说,SAUC 通过带温度系数的 sigmoid(用于控制函数的锐度)来松弛单位阶跃函数,以便于基于梯度的优化。此外,SAUC 是一个即插即用的目标,可以在任何基于 DNN 的 CTR 模型中使用。在两个实际数据集上的实验结果表明,SAUC 一致地提高了当前基于 DNN 的 CTR 模型的推荐精度。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Smooth-AUC:+Smoothing+the+Path+Towards+Rank-based+CTR+Prediction)|0| |[Alignment Rationale for Query-Document Relevance](https://doi.org/10.1145/3477495.3531883)|Youngwoo Kim, Razieh Rahimi, James Allan|University of Massachusetts Amherst, Amherst, MA, USA|Deep neural networks are widely used for text pair classification tasks such as ad hoc information retrieval. These deep neural networks are not inherently interpretable and require additional efforts to get rationale behind their decisions. Existing explanation models are not yet capable of inducing alignments between the query terms and the document terms -- which part of the document rationales are responsible for which part of the query? In this paper, we study how the input perturbations can be used to infer or evaluate alignments between the query and document spans, which best explain the black-box ranker's relevance prediction. We use different perturbation strategies and accordingly propose a set of metrics to evaluate the faithfulness of alignment rationales to the model. Our experiments show that the defined metrics based on substitution-based perturbation are more successful in preferring higher-quality alignments, compared to the deletion-based metrics.|深度神经网络广泛用于文本对分类任务,如 ad hoc 信息检索。这些深层神经网络本质上是不可解释的,需要额外的努力来获得其决策背后的理由。现有的解释模型还不能在查询术语和文档术语之间引入对齐——文档基本原理的哪一部分负责查询的哪一部分?在本文中,我们研究了如何利用输入扰动来推断或评估查询和文档跨度之间的对齐,这最好地解释了黑盒排名的相关性预测。我们使用不同的摄动策略,并相应地提出了一套度量来评估对齐基本原理的忠实性模型。我们的实验表明,与基于删除的度量相比,基于替换扰动的度量更容易获得高质量的比对。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Alignment+Rationale+for+Query-Document+Relevance)|0| -|[Learning to Rank Knowledge Subgraph Nodes for Entity Retrieval](https://doi.org/10.1145/3477495.3531888)|Parastoo Jafarzadeh, Zahra Amirmahani, Faezeh Ensan|Ferdowsi University of Mashhad, Mashhad, Iran; Ryerson University, Toronto, ON, Canada|The importance of entity retrieval, the task of retrieving a ranked list of related entities from big knowledge bases given a textual query, has been widely acknowledged in the literature. In this paper, we propose a novel entity retrieval method that addresses the important challenge that revolves around the need to effectively represent and model context in which entities relate to each other. Based on our proposed method, a model is firstly trained to retrieve and prune a subgraph of a textual knowledge graph that represents contextual relationships between entities. Secondly, a deep model is introduced to reason over the textual content of nodes, edges, and the given question and score and rank entities in the subgraph.
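The SAUC relaxation above is directly expressible: replace the Heaviside step over pairwise positive-negative score differences with a temperature-controlled sigmoid. A small self-contained sketch, where the temperature value is an arbitrary choice:

```python
import numpy as np

def hard_auc(scores_pos, scores_neg):
    """Exact AUC: fraction of pos-neg pairs ranked correctly (Heaviside)."""
    diff = scores_pos[:, None] - scores_neg[None, :]
    return (diff > 0).mean()

def sauc(scores_pos, scores_neg, tau=0.05):
    """Smoothed AUC: the Heaviside step is replaced by a sigmoid whose
    temperature tau controls sharpness (smaller tau = closer to hard AUC)."""
    diff = scores_pos[:, None] - scores_neg[None, :]
    return (1.0 / (1.0 + np.exp(-diff / tau))).mean()

pos = np.array([0.9, 0.6, 0.4])   # scores of positive (clicked) items
neg = np.array([0.5, 0.2])        # scores of negative items
print(hard_auc(pos, neg), round(sauc(pos, neg), 3))
# The smooth version is differentiable, so 1 - sauc(...) can serve as a
# plug-and-play loss in gradient-based CTR training.
```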
We show experimentally that our approach outperforms state-of-the-art methods on a number of benchmarks for entity retrieval.|实体检索的重要性在文献中得到了广泛的认可。在本文中,我们提出了一种新的实体检索方法,以解决围绕着需要有效地表示和模型实体相互关联的上下文的重要挑战。基于该方法,首先训练一个模型来检索和剪枝表示实体间上下文关系的文本知识图的子图。其次,引入一个深度模型来推理子图中节点、边和给定问题的文本内容以及子图中的得分和排序实体。我们的实验表明,我们的方法在实体检索的许多基准上优于最先进的方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+to+Rank+Knowledge+Subgraph+Nodes+for+Entity+Retrieval)|0| +|[Learning to Rank Knowledge Subgraph Nodes for Entity Retrieval](https://doi.org/10.1145/3477495.3531888)|Parastoo Jafarzadeh, Zahra Amirmahani, Faezeh Ensan|Ryerson University, Toronto, ON, Canada; Ferdowsi University of Mashhad, Mashhad, Iran|The importance of entity retrieval, the task of retrieving a ranked list of related entities from big knowledge bases given a textual query, has been widely acknowledged in the literature. In this paper, we propose a novel entity retrieval method that addresses the important challenge that revolves around the need to effectively represent and model context in which entities relate to each other. Based on our proposed method, a model is firstly trained to retrieve and prune a subgraph of a textual knowledge graph that represents contextual relationships between entities. Secondly, a deep model is introduced to reason over the textual content of nodes, edges, and the given question, and to score and rank entities in the subgraph. We show experimentally that our approach outperforms state-of-the-art methods on a number of benchmarks for entity retrieval.|实体检索的重要性在文献中得到了广泛的认可。在本文中,我们提出了一种新的实体检索方法,以解决围绕着需要有效地表示和模型实体相互关联的上下文的重要挑战。基于该方法,首先训练一个模型来检索和剪枝表示实体间上下文关系的文本知识图的子图。其次,引入一个深度模型,对节点、边和给定问题的文本内容进行推理,并对子图中的实体进行打分和排序。我们的实验表明,我们的方法在实体检索的许多基准上优于最先进的方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+to+Rank+Knowledge+Subgraph+Nodes+for+Entity+Retrieval)|0| |[ELECRec: Training Sequential Recommenders as Discriminators](https://doi.org/10.1145/3477495.3531894)|Yongjun Chen, Jia Li, Caiming Xiong|Salesforce Research, Palo Alto, CA, USA|Sequential recommendation is often considered as a generative task, i.e., training a sequential encoder to generate the next item of a user's interests based on her historical interacted items. Despite their prevalence, these methods usually require training with more meaningful samples to be effective, which otherwise will lead to a poorly trained model. In this work, we propose to train the sequential recommenders as discriminators rather than generators. Instead of predicting the next item, our method trains a discriminator to distinguish if a sampled item is a 'real' target item or not. A generator, as an auxiliary model, is trained jointly with the discriminator to sample plausible alternative next items and will be thrown out after training. The trained discriminator is considered as the final SR model and denoted as ELECRec.
Experiments conducted on four datasets demonstrate the effectiveness and efficiency of the proposed approach.|顺序推荐通常被认为是一个生成任务,例如,训练一个顺序编码器根据用户的历史交互项目生成下一个用户感兴趣的项目。尽管这些方法普遍存在,但通常需要训练更有意义的样本才能有效,否则将导致训练不足的模型。在这项工作中,我们建议训练顺序推荐器作为鉴别器,而不是生成器。我们的方法不是预测下一个项目,而是训练一个鉴别器来区分一个采样的项目是否是“真正的”目标项目。发电机作为辅助模型,与鉴别器联合训练,以抽样合理的替代下一个项目,并将在训练后抛出。训练后的鉴别器被认为是最终的 SR 模型,并表示为模型名。在四个数据集上进行的实验表明了该方法的有效性和高效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ELECRec:+Training+Sequential+Recommenders+as+Discriminators)|0| |[A2A-API: A Prototype for Biomedical Information Retrieval Research and Benchmarking](https://doi.org/10.1145/3477495.3531667)|Maciej Rybinski, Liam Watts, Sarvnaz Karimi|CSIRO Data61, Sydney, NSW, Australia|Finding relevant literature is crucial for biomedical research and in the practice of evidence-based medicine, making biomedical search an important application area within the field of information retrieval. This is recognised by the broader IR community, and in particular by the organisers of Text Retrieval Conference (TREC) as early as 2003. While TREC provides crucial evaluation resources, to get started in biomedical IR one needs to tackle an important software engineering hurdle of parsing, indexing, and deploying several large document collections. Moreover, many newcomers to the field often face a steep learning curve, where theoretical concepts are tangled up with technical aspects. Finally, many of the existing baselines and systems are difficult to reproduce. We aim to alleviate all three of these bottlenecks with the launch of A2A-API. It is a RESTful API which serves as an easy-to-use and programming-language-independent interface to existing biomedical TREC collections. It builds upon A2A, our system for biomedical information retrieval benchmarking, and extends it with additional functionalities. Apart from providing programmatic access to the features of the original A2A system - focused principally on benchmarking - A2A-API supports biomedical IR researchers in development of systems featuring reranking and query reformulation components. In this demonstration, we illustrate the capabilities of A2A-API with comprehensive use cases.|寻找相关文献对于生物医学研究和循证医学的实践至关重要,这使得生物医学搜索成为信息检索领域的一个重要应用领域。早在2003年,更广泛的信息检索社区,特别是文本检索会议(TREC)的组织者就认识到了这一点。虽然 TREC 提供了关键的评估资源,但要开始学习生物医学 IR,需要解决一个重要的软件工程障碍,即解析、索引和部署几个大型文档集。此外,该领域的许多新手往往面临一个陡峭的学习曲线,其中理论概念与技术方面纠缠在一起。最后,许多现有的基线和系统很难再现。我们的目标是通过推出 A2A-API 来缓解所有这三个瓶颈。它是一个 RESTful API,作为一个易于使用和独立于编程语言的接口,用于现有的生物医学 TREC 集合。它建立在我们的生物医学信息检索基准测试系统 A2A 的基础上,并扩展了其他功能。除了提供对原始 A2A 系统特性的程序访问(主要侧重于基准测试)外,A2A-API 还支持生物医学红外研究人员开发具有重新排序和查询重新制定组件的系统。在本演示中,我们通过全面的用例说明了 A2A-API 的功能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A2A-API:+A+Prototype+for+Biomedical+Information+Retrieval+Research+and+Benchmarking)|0| |[Learning to Rank Instant Search Results with Multiple Indices: A Case Study in Search Aggregation for Entertainment](https://doi.org/10.1145/3477495.3536334)|Scott Rome, Sardar Hamidian, Richard Walsh, Kevin Foley, Ferhan Ture|Comcast, Washington, DC, USA; Comcast, Philadelphia, PA, USA; Comcast, Sunnyvale, CA, USA|At Xfinity, an instant search system provides a variety of results for a given query from different sources. For each keystroke, new results are rendered on screen to the user, which could contain movies, television series, sporting events, music videos, news clips, person pages, and other result types. Users are also able to use the Xfinity Voice Remote to submit longer queries, some of which are more open-ended. 
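The ELECRec entry above trains the recommender as a discriminator over real versus sampled next items. Below is a toy logistic-regression sketch of one discriminator update; the mean-pooling encoder and the uniform negative sampler are simplifying stand-ins for the paper's learned sequence encoder and generator.

```python
import numpy as np

rng = np.random.default_rng(1)

n_items, d = 100, 8
emb = rng.normal(size=(n_items, d))           # toy item embeddings

def seq_repr(seq):
    """Stand-in sequence encoder: mean pooling over item embeddings."""
    return emb[seq].mean(axis=0)

w = np.zeros(2 * d)                           # discriminator weights

def disc_prob(seq, item):
    """P(item is the real next item | sequence)."""
    z = np.concatenate([seq_repr(seq), emb[item]])
    return 1.0 / (1.0 + np.exp(-w @ z))

def update(seq, real_item, sampled_item, lr=0.1):
    """One discriminator step: real target labelled 1, sampled item 0."""
    global w
    for item, label in ((real_item, 1.0), (sampled_item, 0.0)):
        z = np.concatenate([seq_repr(seq), emb[item]])
        w += lr * (label - disc_prob(seq, item)) * z   # logistic-loss gradient

seq = [3, 17, 42]
update(seq, real_item=7, sampled_item=int(rng.integers(n_items)))
print(round(disc_prob(seq, 7), 3))
```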
Examples of queries include incomplete words which match multiple results through lexical matching (i.e., "ali"), topical searches ("vampire movies"), and more specific longer searches ("Movies with Adam Sandler"). Since results can be based on lexical matches, semantic matches, item-to-item similarity matches, or a variety of business logic driven sources, a key challenge is how to combine results into a single list. To accomplish this, we propose merging the lists via a Learning to Rank (LTR) neural model which takes into account the search query. This combined list can be personalized via a second LTR neural model with knowledge of the user's search history and metadata of the programs. Because instant search is under-represented in the literature, we present our learnings from research to aid other practitioners.|在 Xfinity,即时搜索系统为来自不同来源的特定查询提供多种结果。对于每次按键,新的结果都会在屏幕上呈现给用户,其中可能包含电影、电视剧、体育赛事、音乐视频、新闻剪辑、人物页面和其他结果类型。用户还可以使用 Xfinity Voice Remote 提交更长的查询,其中一些查询更为开放。查询的例子包括通过词汇匹配(例如“ ali”)匹配多个结果的不完整单词、主题搜索(“吸血鬼电影”)和更具体的长搜索(“与 Adam Sandler 的电影”)。由于结果可以基于词汇匹配、语义匹配、项目间相似性匹配或各种业务逻辑驱动源,因此一个关键的挑战是如何将结果组合成一个单独的列表。为了实现这一点,我们建议通过一个学习排序(LTR)神经模型,考虑到搜索查询合并列表。这个组合列表可以通过第二个具有用户搜索历史和程序元数据知识的 LTR 神经模型进行个性化。因为即时搜索在文献中的代表性不足,我们提出我们从研究中学到的东西来帮助其他从业者。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+to+Rank+Instant+Search+Results+with+Multiple+Indices:+A+Case+Study+in+Search+Aggregation+for+Entertainment)|0| |[Scalable Exploration for Neural Online Learning to Rank with Perturbed Feedback](https://doi.org/10.1145/3477495.3532057)|Yiling Jia, Hongning Wang|University of Virginia, Charlottesville, VA, USA|Deep neural networks (DNNs) demonstrates significant advantages in improving ranking performance in retrieval tasks. Driven by the recent developments in optimization and generalization of DNNs, learning a neural ranking model online from its interactions with users becomes possible. However, the required exploration for model learning has to be performed in the entire neural network parameter space, which is prohibitively expensive and limits the application of such online solutions in practice. In this work, we propose an efficient exploration strategy for online interactive neural ranker learning based on bootstrapping. Our solution is based on an ensemble of ranking models trained with perturbed user click feedback. The proposed method eliminates explicit confidence set construction and the associated computational overhead, which enables the online neural rankers training to be efficiently executed in practice with theoretical guarantees. 
Extensive comparisons with an array of state-of-the-art OL2R algorithms on two public learning to rank benchmark datasets demonstrate the effectiveness and computational efficiency of our proposed neural OL2R solution.|深层神经网络(DNN)在提高检索任务的排序性能方面具有显著的优势。在 DNN 优化和泛化的最新发展的驱动下,从与用户的交互中学习在线神经排序模型成为可能。然而,模型学习所需要的探索必须在整个神经网络参数空间中进行,这是非常昂贵的,并且限制了这种在线解决方案在实际中的应用。本文提出了一种基于自举的在线交互式神经排序学习的有效探索策略。我们的解决方案是基于一个排名模型的集合训练与不安的用户点击反馈。该方法消除了显式置信集结构和相关的计算开销,使在线神经排序训练能够在理论保证的情况下在实际应用中有效地执行。通过与一系列最先进的 OL2R 算法在两个公共学习基准数据集上的广泛比较,证明了我们提出的神经 OL2R 解决方案的有效性和计算效率。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Scalable+Exploration+for+Neural+Online+Learning+to+Rank+with+Perturbed+Feedback)|0| -|[Towards Validating Long-Term User Feedbacks in Interactive Recommendation Systems](https://doi.org/10.1145/3477495.3531869)|Hojoon Lee, Dongyoon Hwang, Kyushik Min, Jaegul Choo|KAIST, SeongNam, Republic of Korea; KAKAO Enterprise, SeongNam, Republic of Korea|Interactive Recommender Systems (IRSs) have attracted a lot of attention, due to their ability to model interactive processes between users and recommender systems. Numerous approaches have adopted Reinforcement Learning (RL) algorithms, as these can directly maximize users' cumulative rewards. In IRS, researchers commonly utilize publicly available review datasets to compare and evaluate algorithms. However, user feedback provided in public datasets merely includes instant responses (e.g., a rating), with no inclusion of delayed responses (e.g., the dwell time and the lifetime value). Thus, the question remains whether these review datasets are an appropriate choice to evaluate the long-term effects in IRS. In this work, we revisited experiments on IRS with review datasets and compared RL-based models with a simple reward model that greedily recommends the item with the highest one-step reward. Following extensive analysis, we can reveal three main findings: First, a simple greedy reward model consistently outperforms RL-based models in maximizing cumulative rewards. Second, applying higher weighting to long-term rewards leads to degradation of recommendation performance. Third, user feedbacks have mere long-term effects in the benchmark datasets. Based on our findings, we conclude that a dataset has to be carefully verified and that a simple greedy baseline should be included for a proper evaluation of RL-based IRS approaches. 
Our code and dataset are available at https://github.com/dojeon-ai/irs_validation.|交互式推荐系统(IRS)由于能够对用户和推荐系统之间的交互过程进行建模而引起了人们的广泛关注。许多方法都采用了强化学习算法,因为这些算法可以直接最大化用户的累积回报。在 IRS 中,研究人员通常利用公开的评论数据集来比较和评估算法。然而,在公共数据集中提供的用户反馈只包括即时响应(例如,评级) ,没有包括延迟响应(例如,停留时间和生命周期值)。因此,问题仍然是这些审查数据集是否是评估 IRS 长期影响的合适选择。在这项工作中,我们重新回顾了 IRS 的实验与评论数据集,并比较了基于 RL 的模型与一个简单的奖励模型,贪婪地推荐项目具有最高的一步奖励。经过广泛的分析,我们可以揭示三个主要的发现: 第一,一个简单的贪婪报酬模型在最大化累积报酬方面始终优于基于 RL 的模型。其次,对长期奖励加权会导致推荐绩效的下降。第三,用户反馈在基准数据集中只有长期效果。基于我们的研究结果,我们得出结论,一个数据集必须被仔细验证,并且一个简单的贪婪基线应该被包括在一个基于 RL 的 IRS 方法的正确评估中。我们的代码和数据集可在 https://github.com/dojeon-ai/irs_validation 下载。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Towards+Validating+Long-Term+User+Feedbacks+in+Interactive+Recommendation+Systems)|0| -|[Structure-Aware Semantic-Aligned Network for Universal Cross-Domain Retrieval](https://doi.org/10.1145/3477495.3532061)|Jialin Tian, Xing Xu, Kai Wang, Zuo Cao, Xunliang Cai, Heng Tao Shen|University of Electronic Science and Technology of China, Chengdu, China; Meituan, Shanghai, China; University of Electronic Science and Technology of China & Peng Cheng Laboratory, Chengdu, China|The goal of cross-domain retrieval (CDR) is to search for instances of the same category in one domain by using a query from another domain. Existing CDR approaches mainly consider the standard scenario that the cross-domain data for both training and testing come from the same categories and underlying distributions. However, these methods cannot be well extended to the newly emerging task of universal cross-domain retrieval (UCDR), where the testing data belong to the domain and categories not present during training. Compared to CDR, the UCDR task is more challenging due to (1) visually diverse data from multi-source domains, (2) the domain shift between seen and unseen domains, and (3) the semantic shift across seen and unseen categories. To tackle these problems, we propose a novel model termed Structure-Aware Semantic-Aligned Network (SASA) to align the heterogeneous representations of multi-source domains without loss of generalizability for the UCDR task. Specifically, we leverage the advanced Vision Transformer (ViT) as the backbone and devise a distillation-alignment ViT (DAViT) with a novel token-based strategy, which incorporates two complementary distillation and alignment tokens into the ViT architecture. In addition, the distillation token is devised to improve the generalizability of our model by structure information preservation and the alignment token is used to improve discriminativeness with trainable categorical prototypes. 
Extensive experiments on three large-scale benchmarks, i.e., Sketchy, TU-Berlin, and DomainNet, demonstrate the superiority of our SASA method over the state-of-the-art UCDR and ZS-SBIR methods.|跨域检索(CDR)的目标是通过使用来自另一个域的查询在一个域中搜索相同类别的实例。现有的 CDR 方法主要考虑这样的标准场景: 用于培训和测试的跨域数据来自相同的类别和底层分布。然而,这些方法不能很好地推广到新出现的通用跨域检索(UCDR)任务,其中的测试数据属于领域和类别不存在的训练过程中。与 CDR 相比,UCDR 任务更具挑战性,因为(1)来自多源域的视觉多样化数据,(2)可见和不可见域之间的域转移,以及(3)跨可见和不可见类别的语义转移。为了解决这些问题,我们提出了一种称为结构感知语义对齐网络(SASA)的新模型,该模型可以在不损失 UCDR 任务通用性的前提下对多源域的异构表示进行对齐。具体而言,我们利用先进的视觉变压器(ViT)作为骨干,并设计了一种蒸馏对准 ViT (DAViT) ,其具有基于令牌的新策略,其将两个互补的蒸馏和对准令牌合并到 ViT 体系结构中。此外,通过结构信息的保留,设计了精馏令牌来提高模型的泛化能力,并利用对齐令牌来提高可训练范畴原型的区分能力。在 Sketchy、 TU-Berlin 和 DomainNet 这三个大型基准测试上的大量实验证明了我们的 SASA 方法优于最先进的 UCDR 和 ZS-SBIR 方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Structure-Aware+Semantic-Aligned+Network+for+Universal+Cross-Domain+Retrieval)|0| -|[Enhancing Top-N Item Recommendations by Peer Collaboration](https://doi.org/10.1145/3477495.3531773)|Yang Sun, Fajie Yuan, Min Yang, Alexandros Karatzoglou, Li Shen, Xiaoyan Zhao|Harbin Institute of Technology, Shenzhen, Shenzhen, China; Google Research, London, United Kingdom; Westlake University, Hangzhou, China; JD Explore Academy, Beijing, China; Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China|Deep neural networks (DNN) based recommender models often require numerous parameters to achieve remarkable performance. However, this inevitably brings redundant neurons, a phenomenon referred to as over-parameterization. In this paper, we plan to exploit such redundancy phenomena for recommender systems (RS), and propose a top-N item recommendation framework called PCRec that leverages collaborative training of two recommender models of the same network structure, termed peer collaboration. We first introduce two criteria to identify the importance of parameters of a given recommender model. Then, we rejuvenate the unimportant parameters by copying parameters from its peer network. After such an operation and retraining, the original recommender model is endowed with more representation capacity by possessing more functional model parameters. To show its generality, we instantiate PCRec by using three well-known recommender models. We conduct extensive experiments on two real-world datasets, and show that PCRec yields significantly better performance than its counterpart with the same model (parameter) size.|基于深度神经网络(DNN)的推荐模型往往需要大量的参数才能达到显著的性能。然而,这不可避免地带来了多余的神经元,这种现象被称为过度参数化。在本文中,我们计划在推荐系统中利用这种冗余现象,并提出了一个名为 PCRec 的前 N 项推荐框架,该框架利用了两个相同网络结构的推荐模型的协同训练,称为对等协作。我们首先引入两个标准来确定一个给定的推荐模型参数的重要性。然后,我们通过从其对等网络中复制参数来恢复不重要的参数。经过这样的操作和再训练,原有的推荐模型具有更多的功能模型参数,从而具有更强的表示能力。为了显示其通用性,我们使用三个著名的推荐模型来实例化 PCRec。我们在两个真实世界的数据集上进行了广泛的实验,结果表明,与相同模型(参数)大小的同类数据集相比,PCRec 产生了明显更好的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Enhancing+Top-N+Item+Recommendations+by+Peer+Collaboration)|0| +|[Towards Validating Long-Term User Feedbacks in Interactive Recommendation Systems](https://doi.org/10.1145/3477495.3531869)|Hojoon Lee, Dongyoon Hwang, Kyushik Min, Jaegul Choo|KAKAO Enterprise, SeongNam, Republic of Korea; KAIST, SeongNam, Republic of Korea|Interactive Recommender Systems (IRSs) have attracted a lot of attention, due to their ability to model interactive processes between users and recommender systems. Numerous approaches have adopted Reinforcement Learning (RL) algorithms, as these can directly maximize users' cumulative rewards. 
In IRS, researchers commonly utilize publicly available review datasets to compare and evaluate algorithms. However, user feedback provided in public datasets merely includes instant responses (e.g., a rating), with no inclusion of delayed responses (e.g., the dwell time and the lifetime value). Thus, the question remains whether these review datasets are an appropriate choice to evaluate the long-term effects in IRS. In this work, we revisited experiments on IRS with review datasets and compared RL-based models with a simple reward model that greedily recommends the item with the highest one-step reward. Following extensive analysis, we can reveal three main findings: First, a simple greedy reward model consistently outperforms RL-based models in maximizing cumulative rewards. Second, applying higher weighting to long-term rewards leads to degradation of recommendation performance. Third, user feedbacks have mere long-term effects in the benchmark datasets. Based on our findings, we conclude that a dataset has to be carefully verified and that a simple greedy baseline should be included for a proper evaluation of RL-based IRS approaches. Our code and dataset are available at https://github.com/dojeon-ai/irs_validation.|交互式推荐系统(IRS)由于能够对用户和推荐系统之间的交互过程进行建模而引起了人们的广泛关注。许多方法都采用了强化学习算法,因为这些算法可以直接最大化用户的累积回报。在 IRS 中,研究人员通常利用公开的评论数据集来比较和评估算法。然而,在公共数据集中提供的用户反馈只包括即时响应(例如,评级) ,没有包括延迟响应(例如,停留时间和生命周期值)。因此,问题仍然是这些审查数据集是否是评估 IRS 长期影响的合适选择。在这项工作中,我们重新回顾了 IRS 的实验与评论数据集,并比较了基于 RL 的模型与一个简单的奖励模型,贪婪地推荐项目具有最高的一步奖励。经过广泛的分析,我们可以揭示三个主要的发现: 第一,一个简单的贪婪报酬模型在最大化累积报酬方面始终优于基于 RL 的模型。其次,对长期奖励加权会导致推荐绩效的下降。第三,用户反馈在基准数据集中只有长期效果。基于我们的研究结果,我们得出结论,一个数据集必须被仔细验证,并且一个简单的贪婪基线应该被包括在一个基于 RL 的 IRS 方法的正确评估中。我们的代码和数据集可在 https://github.com/dojeon-ai/irs_validation 下载。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Towards+Validating+Long-Term+User+Feedbacks+in+Interactive+Recommendation+Systems)|0| +|[Structure-Aware Semantic-Aligned Network for Universal Cross-Domain Retrieval](https://doi.org/10.1145/3477495.3532061)|Jialin Tian, Xing Xu, Kai Wang, Zuo Cao, Xunliang Cai, Heng Tao Shen|Meituan, Shanghai, China; University of Electronic Science and Technology of China & Peng Cheng Laboratory, Chengdu, China; University of Electronic Science and Technology of China, Chengdu, China|The goal of cross-domain retrieval (CDR) is to search for instances of the same category in one domain by using a query from another domain. Existing CDR approaches mainly consider the standard scenario that the cross-domain data for both training and testing come from the same categories and underlying distributions. However, these methods cannot be well extended to the newly emerging task of universal cross-domain retrieval (UCDR), where the testing data belong to the domain and categories not present during training. Compared to CDR, the UCDR task is more challenging due to (1) visually diverse data from multi-source domains, (2) the domain shift between seen and unseen domains, and (3) the semantic shift across seen and unseen categories. To tackle these problems, we propose a novel model termed Structure-Aware Semantic-Aligned Network (SASA) to align the heterogeneous representations of multi-source domains without loss of generalizability for the UCDR task. Specifically, we leverage the advanced Vision Transformer (ViT) as the backbone and devise a distillation-alignment ViT (DAViT) with a novel token-based strategy, which incorporates two complementary distillation and alignment tokens into the ViT architecture. 
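The "Towards Validating Long-Term User Feedbacks" entry above leans on a deliberately simple baseline: always recommend the item with the highest immediate one-step reward. A toy sketch under the assumption of a known per-item expected reward:

```python
import numpy as np

rng = np.random.default_rng(3)

n_items, horizon = 20, 5
reward = rng.random(n_items)          # assumed expected one-step reward per item

def greedy_recommend(available):
    """The simple baseline: pick the item with the highest immediate
    (one-step) reward, ignoring any long-term effects."""
    return max(available, key=lambda i: reward[i])

available = set(range(n_items))
total = 0.0
for _ in range(horizon):
    item = greedy_recommend(available)
    total += reward[item]             # cumulative reward over the session
    available.remove(item)            # no repeats within a session
print(round(total, 3))
```

The paper's point is that, on review datasets lacking delayed feedback, even RL policies struggle to beat this one-step greedy rule on cumulative reward.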
In addition, the distillation token is devised to improve the generalizability of our model by structure information preservation and the alignment token is used to improve discriminativeness with trainable categorical prototypes. Extensive experiments on three large-scale benchmarks, i.e., Sketchy, TU-Berlin, and DomainNet, demonstrate the superiority of our SASA method over the state-of-the-art UCDR and ZS-SBIR methods.|跨域检索(CDR)的目标是通过使用来自另一个域的查询在一个域中搜索相同类别的实例。现有的 CDR 方法主要考虑这样的标准场景: 用于培训和测试的跨域数据来自相同的类别和底层分布。然而,这些方法不能很好地推广到新出现的通用跨域检索(UCDR)任务,其中的测试数据属于领域和类别不存在的训练过程中。与 CDR 相比,UCDR 任务更具挑战性,因为(1)来自多源域的视觉多样化数据,(2)可见和不可见域之间的域转移,以及(3)跨可见和不可见类别的语义转移。为了解决这些问题,我们提出了一种称为结构感知语义对齐网络(SASA)的新模型,该模型可以在不损失 UCDR 任务通用性的前提下对多源域的异构表示进行对齐。具体而言,我们利用先进的视觉变压器(ViT)作为骨干,并设计了一种蒸馏对准 ViT (DAViT) ,其具有基于令牌的新策略,其将两个互补的蒸馏和对准令牌合并到 ViT 体系结构中。此外,通过结构信息的保留,设计了精馏令牌来提高模型的泛化能力,并利用对齐令牌来提高可训练范畴原型的区分能力。在 Sketchy、 TU-Berlin 和 DomainNet 这三个大型基准测试上的大量实验证明了我们的 SASA 方法优于最先进的 UCDR 和 ZS-SBIR 方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Structure-Aware+Semantic-Aligned+Network+for+Universal+Cross-Domain+Retrieval)|0| +|[Enhancing Top-N Item Recommendations by Peer Collaboration](https://doi.org/10.1145/3477495.3531773)|Yang Sun, Fajie Yuan, Min Yang, Alexandros Karatzoglou, Li Shen, Xiaoyan Zhao|Google Research, London, United Kingdom; JD Explore Academy, Beijing, China; Westlake University, Hangzhou, China; Harbin Institute of Technology, Shenzhen, Shenzhen, China; Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China|Deep neural networks (DNN) based recommender models often require numerous parameters to achieve remarkable performance. However, this inevitably brings redundant neurons, a phenomenon referred to as over-parameterization. In this paper, we plan to exploit such redundancy phenomena for recommender systems (RS), and propose a top-N item recommendation framework called PCRec that leverages collaborative training of two recommender models of the same network structure, termed peer collaboration. We first introduce two criteria to identify the importance of parameters of a given recommender model. Then, we rejuvenate the unimportant parameters by copying parameters from its peer network. After such an operation and retraining, the original recommender model is endowed with more representation capacity by possessing more functional model parameters. To show its generality, we instantiate PCRec by using three well-known recommender models. We conduct extensive experiments on two real-world datasets, and show that PCRec yields significantly better performance than its counterpart with the same model (parameter) size.|基于深度神经网络(DNN)的推荐模型往往需要大量的参数才能达到显著的性能。然而,这不可避免地带来了多余的神经元,这种现象被称为过度参数化。在本文中,我们计划在推荐系统中利用这种冗余现象,并提出了一个名为 PCRec 的前 N 项推荐框架,该框架利用了两个相同网络结构的推荐模型的协同训练,称为对等协作。我们首先引入两个标准来确定一个给定的推荐模型参数的重要性。然后,我们通过从其对等网络中复制参数来恢复不重要的参数。经过这样的操作和再训练,原有的推荐模型具有更多的功能模型参数,从而具有更强的表示能力。为了显示其通用性,我们使用三个著名的推荐模型来实例化 PCRec。我们在两个真实世界的数据集上进行了广泛的实验,结果表明,与相同模型(参数)大小的同类数据集相比,PCRec 产生了明显更好的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Enhancing+Top-N+Item+Recommendations+by+Peer+Collaboration)|0| |[Learning-to-Rank at the Speed of Sampling: Plackett-Luce Gradient Estimation with Minimal Computational Complexity](https://doi.org/10.1145/3477495.3531842)|Harrie Oosterhuis|Radboud University, Nijmegen, Netherlands|Plackett-Luce gradient estimation enables the optimization of stochastic ranking models within feasible time constraints through sampling techniques. 
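The PCRec entry above rejuvenates unimportant parameters by copying them from a peer model of the same architecture. A minimal sketch using parameter magnitude as the importance criterion, one plausible choice; the paper defines its own two criteria.

```python
import numpy as np

rng = np.random.default_rng(4)

w_a = rng.normal(size=100)    # parameters of recommender A
w_b = rng.normal(size=100)    # peer recommender B (same network structure)

def rejuvenate(own, peer, frac=0.2):
    """Replace the least important parameters (here: smallest magnitude,
    an illustrative criterion) with the peer's values."""
    k = int(frac * own.size)
    idx = np.argsort(np.abs(own))[:k]   # indices of "unimportant" parameters
    own = own.copy()
    own[idx] = peer[idx]
    return own

w_a_new = rejuvenate(w_a, w_b)
print(np.sum(w_a_new != w_a), "parameters copied from the peer")
```

After such an exchange, both models would be retrained, which is where the claimed capacity gain comes from.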
Unfortunately, the computational complexity of existing methods does not scale well with the length of the rankings, i.e. the ranking cutoff, nor with the item collection size. In this paper, we introduce the novel PL-Rank-3 algorithm that performs unbiased gradient estimation with a computational complexity comparable to the best sorting algorithms. As a result, our novel learning-to-rank method is applicable in any scenario where standard sorting is feasible in reasonable time. Our experimental results indicate large gains in the time required for optimization, without any loss in performance. For the field, our contribution could potentially allow state-of-the-art learning-to-rank methods to be applied to much larger scales than previously feasible.|Plackett-Luce 梯度估计可以通过抽样技术在可行的时间约束下优化随机排序模型。遗憾的是,现有方法的计算复杂度并不能很好地与排名的长度(即排名截止值)和项目集合的大小相适应。在本文中,我们介绍了一种新的 PL-Rank-3算法,该算法执行无偏梯度估计,其计算复杂度与最佳排序算法相当。因此,我们的新学习排序方法适用于任何情况下,标准排序是可行的在合理的时间。我们的实验结果表明,优化所需的时间大大增加,性能没有任何损失。对于这个领域,我们的贡献可能使最先进的学习排名方法应用于比以前可行的更大的范围。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning-to-Rank+at+the+Speed+of+Sampling:+Plackett-Luce+Gradient+Estimation+with+Minimal+Computational+Complexity)|0| |[Rethinking Correlation-based Item-Item Similarities for Recommender Systems](https://doi.org/10.1145/3477495.3532055)|Katsuhiko Hayashi|Hokkaido University, Sapporo, Japan|This paper studies correlation-based item-item similarity measures for recommendation systems. While current research on recommender systems is directed toward deep learning-based approaches, nearest neighbor methods have been still used extensively in commercial recommender systems due to their simplicity. A crucial step in item-based nearest neighbor methods is to compute similarities between items, which are generally estimated through correlation measures like Pearson. The purpose of this paper is to re-investigate the effectiveness of correlation-based nearest neighbor methods on several benchmark datasets that have been used for recommendation evaluation in recent years. This paper also provides a more effective estimation method for correlation measures than the classical Pearson correlation coefficient and shows that this leads to significant improvements in recommendation performance.|本文研究了基于相关性的推荐系统项目相似性度量。虽然目前对推荐系统的研究主要集中在基于深度学习的方法上,但是最近邻方法由于其简单性在商业推荐系统中仍然得到了广泛的应用。基于项目的最近邻方法的一个关键步骤是计算项目之间的相似性,这通常通过相关度量(如 Pearson)来估计。本文旨在重新研究基于相关性的最近邻方法在近年来用于推荐评价的几个基准数据集上的有效性。本文还提供了一种比经典的皮尔逊相关系数更有效的相关度量估计方法,结果表明这种方法可以显著提高推荐性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Rethinking+Correlation-based+Item-Item+Similarities+for+Recommender+Systems)|0| -|[DeSCoVeR: Debiased Semantic Context Prior for Venue Recommendation](https://doi.org/10.1145/3477495.3531877)|Sailaja Rajanala, Arghya Pal, Manish Singh, Raphael C.W. Phan, KokSheik Wong|Monash University Malaysia, Bandar Sunway, Malaysia; Harvard Medical School, Boston, MA, USA; Indian Institute of Technology Hyderabad, Hyderabad, India|We present a novel semantic context prior-based venue recommendation system that uses only the title and the abstract of a paper. Based on the intuition that the text in the title and abstract have both semantic and syntactic components, we demonstrate that a joint training of a semantic feature extractor and syntactic feature extractor collaboratively leverages meaningful information that helps to provide venues for papers. 
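The Plackett-Luce entry above rests on sampling rankings from a score-parameterised PL model. The sampling step itself is cheap via the Gumbel trick, as this sketch shows; PL-Rank-3's actual contribution, the efficient unbiased gradient estimator, is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(5)

scores = np.array([2.0, 1.0, 0.5, 0.0])    # model scores for 4 items

def sample_pl_ranking(scores):
    """Draw a ranking from the Plackett-Luce distribution induced by the
    scores: add Gumbel noise and sort, which is equivalent to sequential
    softmax sampling without replacement."""
    gumbel = -np.log(-np.log(rng.random(scores.size)))
    return np.argsort(-(scores + gumbel))

# Monte-Carlo check: how often each item lands at rank 1.
hits = np.zeros(scores.size)
for _ in range(10000):
    hits[sample_pl_ranking(scores)[0]] += 1
print(hits / 10000)                          # approaches softmax(scores)
print(np.exp(scores) / np.exp(scores).sum())
```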
The proposed methodology that we call DeSCoVeR at first elicits these semantic and syntactic features using a Neural Topic Model and text classifier respectively. The model then executes a transfer learning optimization procedure to perform a contextual transfer between the feature distributions of the Neural Topic Model and the text classifier during the training phase. DeSCoVeR also mitigates the document-level label bias using a Causal back-door path criterion and a sentence-level keyword bias removal technique. Experiments on the DBLP dataset show that DeSCoVeR outperforms the state-of-the-art methods.|我们提出了一个新的基于语义上下文先验的场地推荐系统,它只使用文章的标题和摘要。基于标题和摘要中的文本同时具有语义和句法成分的直觉,我们证明了语义特征提取器和句法特征提取器的联合训练协同利用有意义的信息,有助于为论文提供场所。我们提出的方法,我们称为 DeSCoVeR 首先引出这些语义和句法特征使用神经主题模型和文本分类器分别。然后,该模型执行一个迁移学习优化过程,在训练阶段在神经主题模型的特征分布和文本分类器之间进行上下文迁移。DeSCoVeR 还使用因果后门路径标准和句子级关键字偏差消除技术来减轻文档级标签偏差。在 DBLP 数据集上的实验表明,DeSCoVeR 方法的性能优于最先进的方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=DeSCoVeR:+Debiased+Semantic+Context+Prior+for+Venue+Recommendation)|0| -|[Revisiting Bundle Recommendation: Datasets, Tasks, Challenges and Opportunities for Intent-aware Product Bundling](https://doi.org/10.1145/3477495.3531904)|Zhu Sun, Jie Yang, Kaidong Feng, Hui Fang, Xinghua Qu, Yew Soon Ong|Institute of High Performance Computing and Centre for Frontier AI Research, A*STAR, Singapore, Singapore; Yanshan University, Qinhuangdao, China; Bytedance AI Lab, Singapore, Singapore; A*STAR Centre for Frontier AI Research and Nanyang Technological University, Singapore, Singapore; Delft University of Technology, Delft, Netherlands; Shanghai University of Finance and Economics, Shanghai, China|Product bundling is a commonly-used marketing strategy in both offline retailers and online e-commerce systems. Current research on bundle recommendation is limited by: (1) noisy datasets, where bundles are defined by heuristics, e.g., products co-purchased in the same session; and (2) specific tasks, holding unrealistic assumptions, e.g., the availability of bundles for recommendation directly. In this paper, we propose to take a step back and consider the process of bundle recommendation from a holistic user experience perspective. We first construct high-quality bundle datasets with rich meta information, particularly bundle intents, through a carefully designed crowd-sourcing task. We then define a series of tasks that together, support all key steps in a typical bundle recommendation process, from bundle detection, completion, ranking, to explanation and auto-naming. Finally, we conduct extensive experiments and in-depth analysis that demonstrate the challenges of bundle recommendation, arising from the need for capturing complex relations among users, products, and bundles, as well as the research opportunities, especially in graph-based neural methods. To sum up, our study delivers new data sources, opens up new research directions, and provides useful guidance for product bundling in real e-commerce platforms. 
Our datasets are available at GitHub (\urlhttps://github.com/BundleRec/bundle_recommendation ).|绑售是线下零售商和在线电子商务系统中常用的营销策略。目前对捆绑推荐的研究受到以下因素的限制: (1)有噪音的数据集,其中捆绑包是由启发式定义的,例如,在同一会话中共同购买的产品; (2)具体的任务,持有不切实际的假设,例如,捆绑包的可用性直接推荐。在本文中,我们建议退一步,从整体用户体验的角度来考虑捆绑推荐的过程。我们首先通过一个精心设计的众包任务,构建包含丰富元信息的高质量捆绑数据集,特别是捆绑意图。然后,我们定义一系列任务,这些任务一起支持典型的包推荐过程中的所有关键步骤,从包检测、完成、排名到解释和自动命名。最后,我们进行了广泛的实验和深入的分析,展示了捆绑推荐的挑战,由于需要捕获用户、产品和捆绑之间的复杂关系,以及研究机会,特别是在基于图的神经方法。总之,我们的研究提供了新的数据来源,开辟了新的研究方向,并为绑售在真正的电子商务平台上提供了有用的指导。我们的数据集可以在 GitHub 上获得(urlhttps:// GitHub.com/bundlerec/bundle_recommendation )。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Revisiting+Bundle+Recommendation:+Datasets,+Tasks,+Challenges+and+Opportunities+for+Intent-aware+Product+Bundling)|0| +|[DeSCoVeR: Debiased Semantic Context Prior for Venue Recommendation](https://doi.org/10.1145/3477495.3531877)|Sailaja Rajanala, Arghya Pal, Manish Singh, Raphael C.W. Phan, KokSheik Wong|Indian Institute of Technology Hyderabad, Hyderabad, India; Monash University Malaysia, Bandar Sunway, Malaysia; Harvard Medical School, Boston, MA, USA|We present a novel semantic context prior-based venue recommendation system that uses only the title and the abstract of a paper. Based on the intuition that the text in the title and abstract have both semantic and syntactic components, we demonstrate that a joint training of a semantic feature extractor and syntactic feature extractor collaboratively leverages meaningful information that helps to provide venues for papers. The proposed methodology that we call DeSCoVeR at first elicits these semantic and syntactic features using a Neural Topic Model and text classifier respectively. The model then executes a transfer learning optimization procedure to perform a contextual transfer between the feature distributions of the Neural Topic Model and the text classifier during the training phase. DeSCoVeR also mitigates the document-level label bias using a Causal back-door path criterion and a sentence-level keyword bias removal technique. Experiments on the DBLP dataset show that DeSCoVeR outperforms the state-of-the-art methods.|我们提出了一个新的基于语义上下文先验的场地推荐系统,它只使用文章的标题和摘要。基于标题和摘要中的文本同时具有语义和句法成分的直觉,我们证明了语义特征提取器和句法特征提取器的联合训练协同利用有意义的信息,有助于为论文提供场所。我们提出的方法,我们称为 DeSCoVeR 首先引出这些语义和句法特征使用神经主题模型和文本分类器分别。然后,该模型执行一个迁移学习优化过程,在训练阶段在神经主题模型的特征分布和文本分类器之间进行上下文迁移。DeSCoVeR 还使用因果后门路径标准和句子级关键字偏差消除技术来减轻文档级标签偏差。在 DBLP 数据集上的实验表明,DeSCoVeR 方法的性能优于最先进的方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=DeSCoVeR:+Debiased+Semantic+Context+Prior+for+Venue+Recommendation)|0| +|[Revisiting Bundle Recommendation: Datasets, Tasks, Challenges and Opportunities for Intent-aware Product Bundling](https://doi.org/10.1145/3477495.3531904)|Zhu Sun, Jie Yang, Kaidong Feng, Hui Fang, Xinghua Qu, Yew Soon Ong|Shanghai University of Finance and Economics, Shanghai, China; Delft University of Technology, Delft, Netherlands; A*STAR Centre for Frontier AI Research and Nanyang Technological University, Singapore, Singapore; Institute of High Performance Computing and Centre for Frontier AI Research, A*STAR, Singapore, Singapore; Yanshan University, Qinhuangdao, China; Bytedance AI Lab, Singapore, Singapore|Product bundling is a commonly-used marketing strategy in both offline retailers and online e-commerce systems. 
Current research on bundle recommendation is limited by: (1) noisy datasets, where bundles are defined by heuristics, e.g., products co-purchased in the same session; and (2) specific tasks, holding unrealistic assumptions, e.g., the availability of bundles for recommendation directly. In this paper, we propose to take a step back and consider the process of bundle recommendation from a holistic user experience perspective. We first construct high-quality bundle datasets with rich meta information, particularly bundle intents, through a carefully designed crowd-sourcing task. We then define a series of tasks that together, support all key steps in a typical bundle recommendation process, from bundle detection, completion, ranking, to explanation and auto-naming. Finally, we conduct extensive experiments and in-depth analysis that demonstrate the challenges of bundle recommendation, arising from the need for capturing complex relations among users, products, and bundles, as well as the research opportunities, especially in graph-based neural methods. To sum up, our study delivers new data sources, opens up new research directions, and provides useful guidance for product bundling in real e-commerce platforms. Our datasets are available at GitHub (https://github.com/BundleRec/bundle_recommendation).|绑售是线下零售商和在线电子商务系统中常用的营销策略。目前对捆绑推荐的研究受到以下因素的限制: (1)有噪音的数据集,其中捆绑包是由启发式定义的,例如,在同一会话中共同购买的产品; (2)具体的任务,持有不切实际的假设,例如,捆绑包的可用性直接推荐。在本文中,我们建议退一步,从整体用户体验的角度来考虑捆绑推荐的过程。我们首先通过一个精心设计的众包任务,构建包含丰富元信息的高质量捆绑数据集,特别是捆绑意图。然后,我们定义一系列任务,这些任务一起支持典型的包推荐过程中的所有关键步骤,从包检测、完成、排名到解释和自动命名。最后,我们进行了广泛的实验和深入的分析,展示了捆绑推荐的挑战,由于需要捕获用户、产品和捆绑之间的复杂关系,以及研究机会,特别是在基于图的神经方法。总之,我们的研究提供了新的数据来源,开辟了新的研究方向,并为绑售在真正的电子商务平台上提供了有用的指导。我们的数据集可以在 GitHub 上获得(https://github.com/BundleRec/bundle_recommendation)。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Revisiting+Bundle+Recommendation:+Datasets,+Tasks,+Challenges+and+Opportunities+for+Intent-aware+Product+Bundling)|0| |[Query Facet Mapping and its Applications in Streaming Services: The Netflix Case Study](https://doi.org/10.1145/3477495.3536330)|Sudeep Das, Ivan Provalov, Vickie Zhang, Weidong Zhang|Netflix Inc., Los Gatos, CA, USA|In an instant search setting such as Netflix Search where results are returned in response to every keystroke, determining how a partial query maps onto broad classes of relevant entities or facets --- such as videos, talent, and genres --- can facilitate a better understanding of the underlying objective of that query. Such a query-to-facet mapping system has a multitude of applications. It can help improve the quality of search results, drive meaningful result organization, and can be leveraged to establish trust by being transparent with Netflix members when they search for an entity that is not available on the service. By anticipating the relevant facets with each keystroke entry, the system can also better guide the experience within a search session. When aggregated across queries, the facets can reveal interesting patterns of member interest. A key challenge for building such a system is to judiciously balance lexical similarity with behavioral relevance.
In this paper, we present a high level overview of a Query Facet Mapping system that we have developed at Netflix, describe its main components, provide evaluation results with real-world data, and outline several potential applications.|在像 Netflix Search 这样的即时搜索设置中,每次按键都会返回结果,确定一个部分查询如何映射到相关实体或方面的广泛类别——比如视频、人才和类型——可以促进对该查询的潜在目标的更好理解。这种查询到面的映射系统有大量的应用程序。它可以帮助提高搜索结果的质量,推动有意义的结果组织,并且可以通过在 Netflix 成员搜索服务中不可用的实体时对其保持透明来建立信任。通过预测每个按键输入的相关方面,系统还可以更好地指导搜索会话中的体验。当跨查询聚合时,方面可以显示成员感兴趣的有趣模式。建立这样一个系统的关键挑战是明智地平衡词汇相似性和行为相关性。本文对我们在 Netflix 上开发的 Query Facet Mapping 系统进行了高层次的概述,描述了它的主要组件,提供了实际数据的评估结果,并概述了几个潜在的应用。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Query+Facet+Mapping+and+its+Applications+in+Streaming+Services:+The+Netflix+Case+Study)|0| |[Implicit Feedback for Dense Passage Retrieval: A Counterfactual Approach](https://doi.org/10.1145/3477495.3531994)|Shengyao Zhuang, Hang Li, Guido Zuccon|The University of Queensland, Brisbane, QLD, Australia|In this paper we study how to effectively exploit implicit feedback in Dense Retrievers (DRs). We consider the specific case in which click data from a historic click log is available as implicit feedback. We then exploit such historic implicit interactions to improve the effectiveness of a DR. A key challenge that we study is the effect that biases in the click signal, such as position bias, have on the DRs. To overcome the problems associated with the presence of such bias, we propose the Counterfactual Rocchio (CoRocchio) algorithm for exploiting implicit feedback in Dense Retrievers. We demonstrate both theoretically and empirically that dense query representations learnt with CoRocchio are unbiased with respect to position bias and lead to higher retrieval effectiveness. We make available the implementations of the proposed methods and the experimental framework, along with all results at https://github.com/ielab/Counterfactual-DR.|本文研究了密集检索器(DRs)中如何有效地利用内隐反馈。我们考虑这样一个特定的情况,在这种情况下,来自历史点击日志的点击数据可以作为隐式反馈使用。然后,我们利用这种历史性的隐性相互作用来提高 DR 的有效性。我们研究的一个关键挑战是点击信号中的偏差(如位置偏差)对 DR 的影响。为了克服与存在这种偏差相关的问题,我们提出反事实 Rocchio (CoRocchio)算法用于利用致密检索器中的隐性反馈。我们从理论和实验两方面证明了 CoRocchio 学习的密集查询表示对位置偏差是无偏的,从而提高了检索效率。我们提供了建议方法和实验框架的实施,以及所有 https://github.com/ielab/counterfactual-dr 的结果。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Implicit+Feedback+for+Dense+Passage+Retrieval:+A+Counterfactual+Approach)|0| -|[Offline Evaluation of Ranked Lists using Parametric Estimation of Propensities](https://doi.org/10.1145/3477495.3532032)|Vishwa Vinay, Manoj Kilaru, David Arbour|Adobe Research, San Jose, CA, USA; Adobe Research, Bangalore, India; University of California, San Diego, CA, USA|Search engines and recommendation systems attempt to continually improve the quality of the experience they afford to their users. Refining the ranker that produces the lists displayed in response to user requests is an important component of this process. A common practice is for the service providers to make changes (e.g. new ranking features, different ranking models) and A/B test them on a fraction of their users to establish the value of the change. An alternative approach estimates the effectiveness of the proposed changes offline, utilising previously collected clickthrough data on the old ranker to posit what the user behaviour on ranked lists produced by the new ranker would have been. A majority of offline evaluation approaches invoke the well studied inverse propensity weighting to adjust for biases inherent in logged data. 
In this paper, we propose the use of parametric estimates for these propensities. Specifically, by leveraging well known learning-to-rank methods as subroutines, we show how accurate offline evaluation can be achieved when the new rankings to be evaluated differ from the logged ones.|搜索引擎和推荐系统试图不断提高它们为用户提供的体验的质量。优化生成响应用户请求的列表的排名是这个过程的一个重要组成部分。一个常见的做法是,服务提供商进行更改(例如,新的排名功能,不同的排名模型)和 A/B 测试他们的一小部分用户,以建立变化的价值。另一种方法是利用先前收集到的老排名者的点击数据,来估计新排名者生成的排名表上的用户行为的有效性。大多数离线评估方法都会调用经过充分研究的倾向性反向加权来调整测井数据中固有的偏差。在本文中,我们提出了这些倾向的参数估计的使用。具体来说,通过利用众所周知的学习排名方法作为子程序,我们展示了当评估的新排名与记录的排名不同时,如何实现准确的离线评估。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Offline+Evaluation+of+Ranked+Lists+using+Parametric+Estimation+of+Propensities)|0| -|[CAPTOR: A Crowd-Aware Pre-Travel Recommender System for Out-of-Town Users](https://doi.org/10.1145/3477495.3531949)|Haoran Xin, Xinjiang Lu, Nengjun Zhu, Tong Xu, Dejing Dou, Hui Xiong|The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China; Shanghai University, Shanghai, China; University of Science and Technology of China, Hefei, China; Baidu Research, Beijing, China|Pre-travel out-of-town recommendation aims to recommend Point-of-Interests (POIs) to the users who plan to travel out of their hometown in the near future yet have not decided where to go, i.e., their destination regions and POIs both remain unknown. It is a non-trivial task since the searching space is vast, which may lead to distinct travel experiences in different out-of-town regions and eventually confuse decision-making. Besides, users' out-of-town travel behaviors are affected not only by their personalized preferences but heavily by others' travel behaviors. To this end, we propose a Crowd-Aware Pre-Travel Out-of-town Recommendation framework (CAPTOR) consisting of two major modules: spatial-affined conditional random field (SA-CRF) and crowd behavior memory network (CBMN). Specifically, SA-CRF captures the spatial affinity among POIs while preserving the inherent information of POIs. Then, CBMN is proposed to maintain the crowd travel behaviors w.r.t. each region through three affiliated blocks reading and writing the memory adaptively. We devise the elaborated metric space with a dynamic mapping mechanism, where the users and POIs are distinguishable both inherently and geographically. Extensive experiments on two real-world nationwide datasets validate the effectiveness of CAPTOR against the pre-travel out-of-town recommendation task.|旅行前出城推荐的目的是向那些计划在不久的将来离开家乡但还没有决定去哪里旅行的用户推荐他们的兴趣点,也就是说,他们的目的地和兴趣点都是未知的。由于搜索空间巨大,这是一个非常重要的任务,可能会导致在不同的城外地区有不同的旅行体验,并最终混淆决策。此外,用户的出城旅游行为不仅受到个人偏好的影响,还受到他人旅游行为的影响。为此,我们提出了一个基于人群感知的预先出城推荐框架(CAPTOR) ,该框架由两个主要模块组成: 空间仿真条件随机域(SA-CRF)和人群行为记忆网络(cBMN)。特别地,SA-CRF 捕获 POI 之间的空间亲和性,同时保留 POI 的固有信息。然后,提出了通过三个附属块自适应地读写记忆来维持每个区域的人群出行行为。我们使用动态映射机制设计了详细的度量空间,其中用户和 POI 在本质上和地理上都是可以区分的。在两个真实世界的全国性数据集上进行了大量的实验,验证了 CAPTOR 对于出城前的推荐任务的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=CAPTOR:+A+Crowd-Aware+Pre-Travel+Recommender+System+for+Out-of-Town+Users)|0| +|[Offline Evaluation of Ranked Lists using Parametric Estimation of Propensities](https://doi.org/10.1145/3477495.3532032)|Vishwa Vinay, Manoj Kilaru, David Arbour|University of California, San Diego, CA, USA; Adobe Research, San Jose, CA, USA; Adobe Research, Bangalore, India|Search engines and recommendation systems attempt to continually improve the quality of the experience they afford to their users. 
Refining the ranker that produces the lists displayed in response to user requests is an important component of this process. A common practice is for the service providers to make changes (e.g. new ranking features, different ranking models) and A/B test them on a fraction of their users to establish the value of the change. An alternative approach estimates the effectiveness of the proposed changes offline, utilising previously collected clickthrough data on the old ranker to posit what the user behaviour on ranked lists produced by the new ranker would have been. A majority of offline evaluation approaches invoke the well studied inverse propensity weighting to adjust for biases inherent in logged data. In this paper, we propose the use of parametric estimates for these propensities. Specifically, by leveraging well known learning-to-rank methods as subroutines, we show how accurate offline evaluation can be achieved when the new rankings to be evaluated differ from the logged ones.|搜索引擎和推荐系统试图不断提高它们为用户提供的体验的质量。优化生成响应用户请求的列表的排名是这个过程的一个重要组成部分。一个常见的做法是,服务提供商进行更改(例如,新的排名功能,不同的排名模型)和 A/B 测试他们的一小部分用户,以建立变化的价值。另一种方法是利用先前收集到的老排名者的点击数据,来估计新排名者生成的排名表上的用户行为的有效性。大多数离线评估方法都会调用经过充分研究的倾向性反向加权来调整测井数据中固有的偏差。在本文中,我们提出了这些倾向的参数估计的使用。具体来说,通过利用众所周知的学习排名方法作为子程序,我们展示了当评估的新排名与记录的排名不同时,如何实现准确的离线评估。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Offline+Evaluation+of+Ranked+Lists+using+Parametric+Estimation+of+Propensities)|0| +|[CAPTOR: A Crowd-Aware Pre-Travel Recommender System for Out-of-Town Users](https://doi.org/10.1145/3477495.3531949)|Haoran Xin, Xinjiang Lu, Nengjun Zhu, Tong Xu, Dejing Dou, Hui Xiong|Baidu Research, Beijing, China; Shanghai University, Shanghai, China; University of Science and Technology of China, Hefei, China; The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China|Pre-travel out-of-town recommendation aims to recommend Point-of-Interests (POIs) to the users who plan to travel out of their hometown in the near future yet have not decided where to go, i.e., their destination regions and POIs both remain unknown. It is a non-trivial task since the searching space is vast, which may lead to distinct travel experiences in different out-of-town regions and eventually confuse decision-making. Besides, users' out-of-town travel behaviors are affected not only by their personalized preferences but heavily by others' travel behaviors. To this end, we propose a Crowd-Aware Pre-Travel Out-of-town Recommendation framework (CAPTOR) consisting of two major modules: spatial-affined conditional random field (SA-CRF) and crowd behavior memory network (CBMN). Specifically, SA-CRF captures the spatial affinity among POIs while preserving the inherent information of POIs. Then, CBMN is proposed to maintain the crowd travel behaviors w.r.t. each region through three affiliated blocks reading and writing the memory adaptively. We devise the elaborated metric space with a dynamic mapping mechanism, where the users and POIs are distinguishable both inherently and geographically. 
Extensive experiments on two real-world nationwide datasets validate the effectiveness of CAPTOR against the pre-travel out-of-town recommendation task.|旅行前出城推荐的目的是向那些计划在不久的将来离开家乡但还没有决定去哪里旅行的用户推荐他们的兴趣点,也就是说,他们的目的地和兴趣点都是未知的。由于搜索空间巨大,这是一个非常重要的任务,可能会导致在不同的城外地区有不同的旅行体验,并最终混淆决策。此外,用户的出城旅游行为不仅受到个人偏好的影响,还受到他人旅游行为的影响。为此,我们提出了一个基于人群感知的预先出城推荐框架(CAPTOR) ,该框架由两个主要模块组成: 空间仿真条件随机域(SA-CRF)和人群行为记忆网络(CBMN)。特别地,SA-CRF 捕获 POI 之间的空间亲和性,同时保留 POI 的固有信息。然后,提出了通过三个附属块自适应地读写记忆来维持每个区域的人群出行行为。我们使用动态映射机制设计了详细的度量空间,其中用户和 POI 在本质上和地理上都是可以区分的。在两个真实世界的全国性数据集上进行了大量的实验,验证了 CAPTOR 对于出城前的推荐任务的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=CAPTOR:+A+Crowd-Aware+Pre-Travel+Recommender+System+for+Out-of-Town+Users)|0| |[Unify Local and Global Information for Top-N Recommendation](https://doi.org/10.1145/3477495.3532070)|Xiaoming Liu, Shaocong Wu, Zhaohan Zhang, Chao Shen||Knowledge graph (KG), integrating complex information and containing rich semantics, is widely considered as side information to enhance the recommendation systems. However, most of the existing KG-based methods concentrate on encoding the structural information in the graph, without utilizing the collaborative signals in user-item interaction data, which are important for understanding user preferences. Therefore, the representations learned by these models are insufficient for representing semantic information of users and items in the recommendation environment. The combination of both kinds of data provides a good chance to solve this problem, but it faces the following challenges: i) the inner correlations in user-item interaction data are difficult to capture from one side of the user or item; ii) capturing the knowledge associations on the whole KG would introduce noises and variously influence the recommendation results; iii) the semantic gap between both kinds of data is hard to alleviate. To tackle this research gap, we propose a novel duet representation learning framework named KADM to fuse local information (user-item interaction data) and global information (external knowledge graph) for the top-N recommendation, which is composed of two separate sub-models. One learns the local representations by discovering the inner correlations in local information with a knowledge-aware co-attention mechanism, and another learns the global representations by encoding the knowledge associations in global information with a relation-aware attention network. The two sub-models are jointly trained as part of the semantic fusion network to compute the user preferences, which discriminates the contribution of the two sub-models under the special context. We conduct experiments on two real-world datasets, and the evaluations show that KADM significantly outperforms state-of-the-art methods. 
Further ablation studies confirm that the duet architecture performs significantly better than either sub-model on the recommendation tasks.|知识图集成了复杂的信息,包含丰富的语义,被广泛认为是增强推荐系统的边信息。然而,现有的基于 KG 的方法大多集中于对图中的结构信息进行编码,而没有利用用户交互数据中的协作信号,这对于理解用户偏好非常重要。因此,这些模型所学到的表示方法不足以表示推荐环境中用户和项目的语义信息。这两种数据的结合为解决这一问题提供了很好的机会,但它面临着以下挑战: 1)用户项目交互数据的内部相关性难以从用户或项目的一侧获取; 2)捕捉整个 KG 的知识关联会引入噪声并对推荐结果产生各种影响; 3)两种数据之间的语义差异难以缓解。为了解决这一问题,本文提出了一种新的二元表示学习框架 KADM,它融合了顶层 N 推荐的局部信息(用户项目交互数据)和全局信息(外部知识图) ,该框架由两个独立的子模型组成。一种是通过知识感知共注意机制发现局部信息的内在相关性来学习局部表征,另一种是通过关系感知注意网络对全局信息中的知识关联进行编码来学习全局表征。将这两个子模型作为语义融合网络的一部分进行联合训练,以计算用户偏好,从而区分两个子模型在特定语境下的贡献。我们在两个真实世界的数据集上进行了实验,结果表明 KADM 的性能明显优于最先进的方法。进一步的消融研究证实,二重奏架构在推荐任务上的表现明显优于任何一个子模型。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Unify+Local+and+Global+Information+for+Top-N+Recommendation)|0| |[Deployable and Continuable Meta-learning-Based Recommender System with Fast User-Incremental Updates](https://doi.org/10.1145/3477495.3531964)|Renchu Guan, Haoyu Pang, Fausto Giunchiglia, Ximing Li, Xuefeng Yang, Xiaoyue Feng|Tencent, Shenzhen, China; Jilin University, Changchun, China; University of Trento, Trento, Italy|User cold-start is a major challenge in building personalized recommender systems. Due to the lack of sufficient interactions, it is difficult to effectively model new users. One of the main solutions is to obtain an initial model through meta-learning (mainly gradient-based methods) and adapt it to new users with a few steps of gradient descent. Although these methods have achieved remarkable performance, they are still far from being usable in real-world applications due to their high-demand data processing, heavy computational burden, and inability to perform effective user-incremental update. In this paper, we propose a deployable and continuable meta-learning-based recommendation (DCMR) approach, which can achieve fast user-incremental updating with task replay and first-order gradient descent. Specifically, we introduce a dual-constrained task sampler, distillation-based loss functions, and an adaptive controller in this framework to balance the trade-off between stability and plasticity in updating. In summary, DCMR can be updated while serving new users; in other words, it learns continuously and rapidly from a sequential user stream and is able to make recommendations at any time. The extensive experiments conducted on three benchmark datasets illustrate the superiority of our model.|用户冷启动是构建个性化推荐系统的主要挑战。由于缺乏足够的交互,很难对新用户进行有效的建模。其中一个主要的解决方案是通过元学习(主要是基于梯度的方法)获得一个初始模型,并通过几步梯度下降使其适应新用户。虽然这些方法已经取得了显著的性能,但是由于其高需求的数据处理、沉重的计算负担以及不能执行有效的用户增量更新,它们在实际应用中仍然远远不能使用。在本文中,我们提出了一种可部署和可持续的基于元学习的推荐(DCMR)方法,它可以通过任务重播和一阶梯度下降法实现快速的用户增量更新。具体来说,我们引入了一个双约束任务采样器,基于蒸馏的损失函数,以及在这个框架中的一个自适应控制器,以在更新过程中平衡稳定性和可塑性之间的权衡。总之,DCMR 可以在为新用户提供服务的同时进行更新; 换句话说,它可以从连续的用户流中不断快速地学习,并且能够在任何时候提出建议。在三个基准数据集上进行的大量实验表明了该模型的优越性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Deployable+and+Continuable+Meta-learning-Based+Recommender+System+with+Fast+User-Incremental+Updates)|0| -|[Bias Mitigation for Toxicity Detection via Sequential Decisions](https://doi.org/10.1145/3477495.3531945)|Lu Cheng, Ahmadreza Mosallanezhad, Yasin N. Silva, Deborah L. Hall, Huan Liu|Arizona State University, Tempe, AZ, USA; Arizona State University, Glendale, AZ, USA; Loyola University Chicago, Chicago, IL, USA|Increased social media use has contributed to the greater prevalence of abusive, rude, and offensive textual comments. 
Machine learning models have been developed to detect toxic comments online, yet these models tend to show biases against users with marginalized or minority identities (e.g., females and African Americans). Established research in debiasing toxicity classifiers often (1) takes a static or batch approach, assuming that all information is available and then making a one-time decision; and (2) uses a generic strategy to mitigate different biases (e.g., gender and racial biases) that assumes the biases are independent of one another. However, in real scenarios, the input typically arrives as a sequence of comments/words over time instead of all at once. Thus, decisions based on partial information must be made while additional input is arriving. Moreover, social bias is complex by nature. Each type of bias is defined within its unique context, which, consistent with intersectionality theory within the social sciences, might be correlated with the contexts of other forms of bias. In this work, we consider debiasing toxicity detection as a sequential decision-making process where different biases can be interdependent. In particular, we study debiasing toxicity detection with two aims: (1) to examine whether different biases tend to correlate with each other; and (2) to investigate how to jointly mitigate these correlated biases in an interactive manner to minimize the total amount of bias. At the core of our approach is a framework built upon theories of sequential Markov Decision Processes that seeks to maximize the prediction accuracy and minimize the bias measures tailored to individual biases. Evaluations on two benchmark datasets empirically validate the hypothesis that biases tend to be correlated and corroborate the effectiveness of the proposed sequential debiasing strategy.|越来越多的社交媒体使用导致了辱骂、粗鲁和冒犯性的文字评论更加普遍。机器学习模型已经被开发用来检测网上的有毒评论,然而这些模型往往显示出对边缘化或少数族裔身份的用户(例如,女性和非裔美国人)的偏见。已建立的减少毒性分类器的研究通常(1)采用静态或批量方法,假设所有信息都可用,然后做出一次性决策; (2)使用通用策略来减轻假定偏见彼此独立的不同偏见(例如性别和种族偏见)。然而,在真实的场景中,输入通常是以注释/单词序列的形式随着时间的推移而到达,而不是一次性全部到达。因此,当额外的输入到达时,必须根据部分信息做出决策。此外,社会偏见本质上是复杂的。每种类型的偏见都是在其独特的背景下定义的,这与社会科学中的交叉性理论一致,可能与其他形式的偏见的背景相关。在这项工作中,我们认为去偏毒性检测是一个连续的决策过程中,不同的偏见可以相互依赖。具体而言,我们研究去偏毒性检测有两个目的: (1)检查不同的偏倚是否倾向于相互关联; (2)研究如何以交互方式共同减轻这些相关偏倚,以最小化偏倚总量。我们的方法的核心是一个建立在序贯马尔可夫决策过程理论基础上的框架,该框架寻求最大限度地提高预测的准确性,最小化针对个别偏差的偏差测量。对两个基准数据集的评估经验验证了偏差倾向于相关的假设,并证实了所提出的序贯去偏策略的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Bias+Mitigation+for+Toxicity+Detection+via+Sequential+Decisions)|0| -|[Regulating Group Exposure for Item Providers in Recommendation](https://doi.org/10.1145/3477495.3531760)|Mirko Marras, Ludovico Boratto, Guilherme Ramos, Gianni Fenu|University of Lisbon, Lisbon, Portugal; University of Cagliari, Cagliari, Italy|Engaging all content providers, including newcomers or minority demographic groups, is crucial for online platforms to keep growing and working. Hence, while building recommendation services, the interests of those providers should be valued. In this paper, we consider providers as grouped based on a common characteristic in settings in which certain provider groups have low representation of items in the catalog and, thus, in the user interactions. Then, we envision a scenario wherein platform owners seek to control the degree of exposure to such groups in the recommendation process. To support this scenario, we rely on disparate exposure measures that characterize the gap between the share of recommendations given to groups and the target level of exposure pursued by the platform owners. 
We then propose a re-ranking procedure that ensures desired levels of exposure are met. Experiments show that, while supporting certain groups of providers by rendering them with the target exposure, beyond-accuracy objectives experience significant gains with negligible impact in recommendation utility.|吸引所有内容提供商,包括新来者或少数族裔群体,对于在线平台保持增长和运作至关重要。因此,在构建推荐服务时,应该重视这些提供者的利益。在本文中,我们认为提供程序是基于一个共同特征进行分组的,在这种情况下,某些提供程序组在目录中的项表示较低,因此在用户交互中也是如此。然后,我们设想一个场景,其中平台所有者寻求控制在推荐过程中暴露于这些群体的程度。为了支持这一设想,我们依靠不同的曝光度量标准,这些标准体现了给予群体的建议份额与平台所有者追求的曝光度目标水平之间的差距。然后,我们提出了一个重新排序的程序,以确保所需的暴露水平得到满足。实验表明,虽然支持某些群体的供应商,使他们的目标暴露,超过准确性的目标经历了显着的收益,对推荐效用的影响可以忽略不计。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Regulating+Group+Exposure+for+Item+Providers+in+Recommendation)|0| -|[IPR: Interaction-level Preference Ranking for Explicit feedback](https://doi.org/10.1145/3477495.3531777)|ShihYang Liu, HsienHao Chen, ChihMing Chen, MingFeng Tsai, ChuanJu Wang|National Chengchi University, Taipei, Taiwan Roc; Academia Sinica, Taipei, Taiwan Roc; National Chengchi University, Academia Sinica, Taipei, Taiwan Roc|Explicit feedback---user input regarding their interest in an item---is the most helpful information for recommendation as it comes directly from the user and shows their direct interest in the item. Most approaches either treat the recommendation given such feedback as a typical regression problem or regard such data as implicit and then directly adopt approaches for implicit feedback; both methods, however, tend to yield unsatisfactory performance in top-k recommendation. In this paper, we propose interaction-level preference ranking (IPR), a novel pairwise ranking embedding learning approach to better utilize explicit feedback for recommendation. Experiments conducted on three real-world datasets show that IPR yields the best results compared to six strong baselines.|明确的反馈——用户关于他们对某个项目感兴趣的输入——是对推荐最有帮助的信息,因为它直接来自用户,并显示了他们对该项目的直接兴趣。大多数方法要么将给出的这种反馈视为典型的回归问题,要么将这种数据视为隐式的,然后直接采用隐式反馈的方法; 然而,这两种方法在 top-k 推荐中的表现往往都不令人满意。在本文中,我们提出了交互层次偏好排序(IPR) ,这是一种新的嵌入学习的成对排序方法,以更好地利用显式反馈进行推荐。在三个实际数据集上进行的实验表明,与六个强基线相比,IPR 产生的结果最好。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=IPR:+Interaction-level+Preference+Ranking+for+Explicit+feedback)|0| -|[MP2: A Momentum Contrast Approach for Recommendation with Pointwise and Pairwise Learning](https://doi.org/10.1145/3477495.3531813)|Menghan Wang, Yuchen Guo, Zhenqi Zhao, Guangzheng Hu, Yuming Shen, Mingming Gong, Philip H. S. Torr|University of Oxford, London, United Kingdom; The University of Melbourne, Melbourne, VIC, Australia; eBay Inc., Shanghai, China; Tencent Inc., Shanghai, China|Binary pointwise labels (aka implicit feedback) are heavily leveraged by deep learning based recommendation algorithms nowadays. In this paper we discuss how the limited expressiveness of these labels may fail to accommodate varying degrees of user preference, and thus lead to conflicts during model training, which we call annotation bias. To solve this issue, we find the soft-labeling property of pairwise labels could be utilized to alleviate the bias of pointwise labels. To this end, we propose a momentum contrast framework (MP2) that combines pointwise and pairwise learning for recommendation. MP2 has a three-tower network structure: one user network and two item networks. The two item networks are used for computing pointwise and pairwise loss respectively. To alleviate the influence of the annotation bias, we perform a momentum update to ensure a consistent item representation. 
Extensive experiments on real-world datasets demonstrate the superiority of our method against state-of-the-art recommendation algorithms.|二进制点态标签(即隐式反馈)是当今基于深度学习的推荐算法的重要组成部分。在本文中,我们讨论了这些标签的有限表达可能无法适应不同程度的用户偏好,从而导致模型训练过程中的冲突,我们称之为注释偏差。为了解决这个问题,我们发现可以利用成对标签的软标签特性来缓解点态标签的偏差。为此,我们提出了一个动量对比框架(MP2) ,结合点态和成对学习的推荐。MP2 具有三塔网络结构: 一个用户网络和两个项目网络。两项网络分别用于计算逐点损失和成对损失。为了减轻注释偏差的影响,我们进行动量更新以确保项目表示的一致性。在真实世界数据集上的大量实验证明了我们的方法对最先进的推荐算法的优越性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=MP2:+A+Momentum+Contrast+Approach+for+Recommendation+with+Pointwise+and+Pairwise+Learning)|0| -|[Deep Page-Level Interest Network in Reinforcement Learning for Ads Allocation](https://doi.org/10.1145/3477495.3531847)|Guogang Liao, Xiaowen Shi, Ze Wang, Xiaoxu Wu, Chuheng Zhang, Yongkang Wang, Xingxing Wang, Dong Wang|Tsinghua University, Beijing, China; Meituan, Beijing, China|A mixed list of ads and organic items is usually displayed in feed and how to allocate the limited slots to maximize the overall revenue is a key problem. Meanwhile, user behavior modeling is essential in recommendation and advertising (e.g., CTR prediction and ads allocation). Most previous works only model point-level positive feedback (i.e., click), which neglect the page-level information of feedback and other types of feedback. To this end, we propose Deep Page-level Interest Network (DPIN) to model the page-level user preference and exploit multiple types of feedback. Specifically, we introduce four different types of page-level feedback, and capture user preference for item arrangement under different receptive fields through the multi-channel interaction module. Through extensive offline and online experiments on Meituan food delivery platform, we demonstrate that DPIN can effectively model the page-level user preference and increase the revenue.|一个广告和有机项目的混合列表通常显示在饲料和如何分配有限的插槽,以最大限度地提高总收入是一个关键问题。同时,用户行为建模对于推荐和广告(例如,点击率预测和广告分配)至关重要。大多数以前的作品只是模拟点级别的正反馈(例如,点击) ,而忽略了反馈和其他类型的反馈的页级信息。为此,我们提出了深层页面级兴趣网络(DPIN)来建模页面级用户偏好,并利用多种类型的反馈。具体来说,我们引入了四种不同类型的页面级反馈,并通过多通道交互模块捕捉用户对不同接收域下项目排列的偏好。通过在美团外卖平台上进行的大量线下和线上实验,我们证明了 DPIN 可以有效地模拟页面级别的用户偏好,并增加收入。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Deep+Page-Level+Interest+Network+in+Reinforcement+Learning+for+Ads+Allocation)|0| -|[Improving Micro-video Recommendation via Contrastive Multiple Interests](https://doi.org/10.1145/3477495.3531861)|Beibei Li, Beihong Jin, Jiageng Song, Yisong Yu, Yiyuan Zheng, Wei Zhou|MX Media Co., Ltd., Singapore, Singapore; Institute of Software, Chinese Academy of Sciences & University of Chinese Academy of Sciences, Beijing, China|With the rapid increase of micro-video creators and viewers, how to make personalized recommendations from a large number of candidates to viewers begins to attract more and more attention. However, existing micro-video recommendation models rely on expensive multi-modal information and learn an overall interest embedding that cannot reflect the user's multiple interests in micro-videos. Recently, contrastive learning provides a new opportunity for refining the existing recommendation techniques. Therefore, in this paper, we propose to extract contrastive multi-interests and devise a micro-video recommendation model CMI. Specifically, CMI learns multiple interest embeddings for each user from his/her historical interaction sequence, in which the implicit orthogonal micro-video categories are used to decouple multiple user interests. 
Moreover, it establishes the contrastive multi-interest loss to improve the robustness of interest embeddings and the performance of recommendations. The results of experiments on two micro-video datasets demonstrate that CMI achieves state-of-the-art performance over existing baselines.|随着微视频制作者和观众的迅速增多,如何从大量的候选人中向观众提供个性化的推荐,开始引起越来越多的关注。然而,现有的微视频推荐模型依赖于昂贵的多模态信息,学习的总体兴趣嵌入不能反映用户在微视频中的多重兴趣。近年来,对比学习为完善现有的推荐技术提供了一个新的机会。因此,本文提出提取对比多兴趣并设计一个微视频推荐模型 CMI。具体来说,CMI 从每个用户的历史交互序列中学习多个兴趣嵌入,其中使用隐式正交微视频类别来解耦多个用户兴趣。此外,本文还建立了对比的多利益损失模型,以提高利益嵌入的鲁棒性和建议的执行效率。在两个微视频数据集上的实验结果表明,CMI 在现有的基线上取得了最好的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Improving+Micro-video+Recommendation+via+Contrastive+Multiple+Interests)|0| +|[Bias Mitigation for Toxicity Detection via Sequential Decisions](https://doi.org/10.1145/3477495.3531945)|Lu Cheng, Ahmadreza Mosallanezhad, Yasin N. Silva, Deborah L. Hall, Huan Liu|Arizona State University, Tempe, AZ, USA; Loyola University Chicago, Chicago, IL, USA; Arizona State University, Glendale, AZ, USA|Increased social media use has contributed to the greater prevalence of abusive, rude, and offensive textual comments. Machine learning models have been developed to detect toxic comments online, yet these models tend to show biases against users with marginalized or minority identities (e.g., females and African Americans). Established research in debiasing toxicity classifiers often (1) takes a static or batch approach, assuming that all information is available and then making a one-time decision; and (2) uses a generic strategy to mitigate different biases (e.g., gender and racial biases) that assumes the biases are independent of one another. However, in real scenarios, the input typically arrives as a sequence of comments/words over time instead of all at once. Thus, decisions based on partial information must be made while additional input is arriving. Moreover, social bias is complex by nature. Each type of bias is defined within its unique context, which, consistent with intersectionality theory within the social sciences, might be correlated with the contexts of other forms of bias. In this work, we consider debiasing toxicity detection as a sequential decision-making process where different biases can be interdependent. In particular, we study debiasing toxicity detection with two aims: (1) to examine whether different biases tend to correlate with each other; and (2) to investigate how to jointly mitigate these correlated biases in an interactive manner to minimize the total amount of bias. At the core of our approach is a framework built upon theories of sequential Markov Decision Processes that seeks to maximize the prediction accuracy and minimize the bias measures tailored to individual biases. 
Evaluations on two benchmark datasets empirically validate the hypothesis that biases tend to be correlated and corroborate the effectiveness of the proposed sequential debiasing strategy.|越来越多的社交媒体使用导致了辱骂、粗鲁和冒犯性的文字评论更加普遍。机器学习模型已经被开发用来检测网上的有毒评论,然而这些模型往往显示出对边缘化或少数族裔身份的用户(例如,女性和非裔美国人)的偏见。已建立的减少毒性分类器的研究通常(1)采用静态或批量方法,假设所有信息都可用,然后做出一次性决策; (2)使用通用策略来减轻假定偏见彼此独立的不同偏见(例如性别和种族偏见)。然而,在真实的场景中,输入通常是以注释/单词序列的形式随着时间的推移而到达,而不是一次性全部到达。因此,当额外的输入到达时,必须根据部分信息做出决策。此外,社会偏见本质上是复杂的。每种类型的偏见都是在其独特的背景下定义的,这与社会科学中的交叉性理论一致,可能与其他形式的偏见的背景相关。在这项工作中,我们认为去偏毒性检测是一个连续的决策过程中,不同的偏见可以相互依赖。具体而言,我们研究去偏毒性检测有两个目的: (1)检查不同的偏倚是否倾向于相互关联; (2)研究如何以交互方式共同减轻这些相关偏倚,以最小化偏倚总量。我们的方法的核心是一个建立在序贯马尔可夫决策过程理论基础上的框架,该框架寻求最大限度地提高预测的准确性,最小化针对个别偏差的偏差测量。对两个基准数据集的评估经验验证了偏差倾向于相关的假设,并证实了所提出的序贯去偏策略的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Bias+Mitigation+for+Toxicity+Detection+via+Sequential+Decisions)|0| +|[Regulating Group Exposure for Item Providers in Recommendation](https://doi.org/10.1145/3477495.3531760)|Mirko Marras, Ludovico Boratto, Guilherme Ramos, Gianni Fenu|University of Cagliari, Cagliari, Italy; University of Lisbon, Lisbon, Portugal|Engaging all content providers, including newcomers or minority demographic groups, is crucial for online platforms to keep growing and working. Hence, while building recommendation services, the interests of those providers should be valued. In this paper, we consider providers as grouped based on a common characteristic in settings in which certain provider groups have low representation of items in the catalog and, thus, in the user interactions. Then, we envision a scenario wherein platform owners seek to control the degree of exposure to such groups in the recommendation process. To support this scenario, we rely on disparate exposure measures that characterize the gap between the share of recommendations given to groups and the target level of exposure pursued by the platform owners. We then propose a re-ranking procedure that ensures desired levels of exposure are met. Experiments show that, while supporting certain groups of providers by rendering them with the target exposure, beyond-accuracy objectives experience significant gains with negligible impact in recommendation utility.|吸引所有内容提供商,包括新来者或少数族裔群体,对于在线平台保持增长和运作至关重要。因此,在构建推荐服务时,应该重视这些提供者的利益。在本文中,我们认为提供程序是基于一个共同特征进行分组的,在这种情况下,某些提供程序组在目录中的项表示较低,因此在用户交互中也是如此。然后,我们设想一个场景,其中平台所有者寻求控制在推荐过程中暴露于这些群体的程度。为了支持这一设想,我们依靠不同的曝光度量标准,这些标准体现了给予群体的建议份额与平台所有者追求的曝光度目标水平之间的差距。然后,我们提出了一个重新排序的程序,以确保所需的暴露水平得到满足。实验表明,虽然支持某些群体的供应商,使他们的目标暴露,超过准确性的目标经历了显着的收益,对推荐效用的影响可以忽略不计。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Regulating+Group+Exposure+for+Item+Providers+in+Recommendation)|0| +|[IPR: Interaction-level Preference Ranking for Explicit feedback](https://doi.org/10.1145/3477495.3531777)|ShihYang Liu, HsienHao Chen, ChihMing Chen, MingFeng Tsai, ChuanJu Wang|National Chengchi University, Academia Sinica, Taipei, Taiwan Roc; Academia Sinica, Taipei, Taiwan Roc; National Chengchi University, Taipei, Taiwan Roc|Explicit feedback---user input regarding their interest in an item---is the most helpful information for recommendation as it comes directly from the user and shows their direct interest in the item. Most approaches either treat the recommendation given such feedback as a typical regression problem or regard such data as implicit and then directly adopt approaches for implicit feedback; both methods, however, tend to yield unsatisfactory performance in top-k recommendation. 
In this paper, we propose interaction-level preference ranking (IPR), a novel pairwise ranking embedding learning approach to better utilize explicit feedback for recommendation. Experiments conducted on three real-world datasets show that IPR yields the best results compared to six strong baselines.|明确的反馈——用户关于他们对某个项目感兴趣的输入——是对推荐最有帮助的信息,因为它直接来自用户,并显示了他们对该项目的直接兴趣。大多数方法要么将给出的这种反馈视为典型的回归问题,要么将这种数据视为隐式的,然后直接采用隐式反馈的方法; 然而,这两种方法在 top-k 推荐中的表现往往都不令人满意。在本文中,我们提出了交互层次偏好排序(IPR) ,这是一种新的嵌入学习的成对排序方法,以更好地利用显式反馈进行推荐。在三个实际数据集上进行的实验表明,与六个强基线相比,IPR 产生的结果最好。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=IPR:+Interaction-level+Preference+Ranking+for+Explicit+feedback)|0| +|[MP2: A Momentum Contrast Approach for Recommendation with Pointwise and Pairwise Learning](https://doi.org/10.1145/3477495.3531813)|Menghan Wang, Yuchen Guo, Zhenqi Zhao, Guangzheng Hu, Yuming Shen, Mingming Gong, Philip H. S. Torr|eBay Inc., Shanghai, China; The University of Melbourne, Melbourne, VIC, Australia; Tencent Inc., Shanghai, China; University of Oxford, London, United Kingdom|Binary pointwise labels (aka implicit feedback) are heavily leveraged by deep learning based recommendation algorithms nowadays. In this paper we discuss how the limited expressiveness of these labels may fail to accommodate varying degrees of user preference, and thus lead to conflicts during model training, which we call annotation bias. To solve this issue, we find the soft-labeling property of pairwise labels could be utilized to alleviate the bias of pointwise labels. To this end, we propose a momentum contrast framework (MP2) that combines pointwise and pairwise learning for recommendation. MP2 has a three-tower network structure: one user network and two item networks. The two item networks are used for computing pointwise and pairwise loss respectively. To alleviate the influence of the annotation bias, we perform a momentum update to ensure a consistent item representation. Extensive experiments on real-world datasets demonstrate the superiority of our method against state-of-the-art recommendation algorithms.|二进制点态标签(即隐式反馈)是当今基于深度学习的推荐算法的重要组成部分。在本文中,我们讨论了这些标签的有限表达可能无法适应不同程度的用户偏好,从而导致模型训练过程中的冲突,我们称之为注释偏差。为了解决这个问题,我们发现可以利用成对标签的软标签特性来缓解点态标签的偏差。为此,我们提出了一个动量对比框架(MP2) ,结合点态和成对学习的推荐。MP2 具有三塔网络结构: 一个用户网络和两个项目网络。两项网络分别用于计算逐点损失和成对损失。为了减轻注释偏差的影响,我们进行动量更新以确保项目表示的一致性。在真实世界数据集上的大量实验证明了我们的方法对最先进的推荐算法的优越性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=MP2:+A+Momentum+Contrast+Approach+for+Recommendation+with+Pointwise+and+Pairwise+Learning)|0| +|[Deep Page-Level Interest Network in Reinforcement Learning for Ads Allocation](https://doi.org/10.1145/3477495.3531847)|Guogang Liao, Xiaowen Shi, Ze Wang, Xiaoxu Wu, Chuheng Zhang, Yongkang Wang, Xingxing Wang, Dong Wang|Meituan, Beijing, China; Tsinghua University, Beijing, China|A mixed list of ads and organic items is usually displayed in feed and how to allocate the limited slots to maximize the overall revenue is a key problem. Meanwhile, user behavior modeling is essential in recommendation and advertising (e.g., CTR prediction and ads allocation). Most previous works only model point-level positive feedback (i.e., click), which neglect the page-level information of feedback and other types of feedback. To this end, we propose Deep Page-level Interest Network (DPIN) to model the page-level user preference and exploit multiple types of feedback. 
Specifically, we introduce four different types of page-level feedback, and capture user preference for item arrangement under different receptive fields through the multi-channel interaction module. Through extensive offline and online experiments on Meituan food delivery platform, we demonstrate that DPIN can effectively model the page-level user preference and increase the revenue.|一个广告和有机项目的混合列表通常显示在饲料和如何分配有限的插槽,以最大限度地提高总收入是一个关键问题。同时,用户行为建模对于推荐和广告(例如,点击率预测和广告分配)至关重要。大多数以前的作品只是模拟点级别的正反馈(例如,点击) ,而忽略了反馈和其他类型的反馈的页级信息。为此,我们提出了深层页面级兴趣网络(DPIN)来建模页面级用户偏好,并利用多种类型的反馈。具体来说,我们引入了四种不同类型的页面级反馈,并通过多通道交互模块捕捉用户对不同接收域下项目排列的偏好。通过在美团外卖平台上进行的大量线下和线上实验,我们证明了 DPIN 可以有效地模拟页面级别的用户偏好,并增加收入。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Deep+Page-Level+Interest+Network+in+Reinforcement+Learning+for+Ads+Allocation)|0| +|[Improving Micro-video Recommendation via Contrastive Multiple Interests](https://doi.org/10.1145/3477495.3531861)|Beibei Li, Beihong Jin, Jiageng Song, Yisong Yu, Yiyuan Zheng, Wei Zhou|Institute of Software, Chinese Academy of Sciences & University of Chinese Academy of Sciences, Beijing, China; MX Media Co., Ltd., Singapore, Singapore|With the rapid increase of micro-video creators and viewers, how to make personalized recommendations from a large number of candidates to viewers begins to attract more and more attention. However, existing micro-video recommendation models rely on expensive multi-modal information and learn an overall interest embedding that cannot reflect the user's multiple interests in micro-videos. Recently, contrastive learning provides a new opportunity for refining the existing recommendation techniques. Therefore, in this paper, we propose to extract contrastive multi-interests and devise a micro-video recommendation model CMI. Specifically, CMI learns multiple interest embeddings for each user from his/her historical interaction sequence, in which the implicit orthogonal micro-video categories are used to decouple multiple user interests. Moreover, it establishes the contrastive multi-interest loss to improve the robustness of interest embeddings and the performance of recommendations. The results of experiments on two micro-video datasets demonstrate that CMI achieves state-of-the-art performance over existing baselines.|随着微视频制作者和观众的迅速增多,如何从大量的候选人中向观众提供个性化的推荐,开始引起越来越多的关注。然而,现有的微视频推荐模型依赖于昂贵的多模态信息,学习的总体兴趣嵌入不能反映用户在微视频中的多重兴趣。近年来,对比学习为完善现有的推荐技术提供了一个新的机会。因此,本文提出提取对比多兴趣并设计一个微视频推荐模型 CMI。具体来说,CMI 从每个用户的历史交互序列中学习多个兴趣嵌入,其中使用隐式正交微视频类别来解耦多个用户兴趣。此外,本文还建立了对比的多利益损失模型,以提高利益嵌入的鲁棒性和建议的执行效率。在两个微视频数据集上的实验结果表明,CMI 在现有的基线上取得了最好的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Improving+Micro-video+Recommendation+via+Contrastive+Multiple+Interests)|0| |[Can Users Predict Relative Query Effectiveness?](https://doi.org/10.1145/3477495.3531893)|Oleg Zendel, Melika P. Ebrahim, J. Shane Culpepper, Alistair Moffat, Falk Scholer|The University of Melbourne, Melbourne, VIC, Australia; RMIT University, Melbourne, VIC, Australia|Any given information need can be expressed via a wide range of possible queries. Recent work with such query variations has demonstrated that different queries can fetch notably divergent sets of documents, even when the queries have identical intents and superficial similarity. That is, different users might receive SERPs of quite different effectiveness for the same information need. That observation then raises an interesting question: do users have a sense of how useful any given query will be? 
Can they anticipate the effectiveness of alternative queries for the same retrieval need? To explore that question we designed and carried out a crowd-sourced user study in which we asked subjects to consider an information need statement expressed as a backstory, and then provide their opinions as to the relative usefulness of a set of queries ostensibly addressing that objective. We solicited opinions using two different interfaces: one that collected absolute ratings of queries, and one that required that the subjects place a set of queries into "order". We found that crowd workers are reasonably consistent in their estimates of how effective queries are likely to be, and also that their estimates correlate positively with actual system performance.|任何给定的信息需求都可以通过各种可能的查询来表示。最近对这种查询变体的研究表明,不同的查询可以获取明显不同的文档集,即使查询具有相同的意图和表面上的相似性。也就是说,对于相同的信息需求,不同的用户可能会收到效果完全不同的 SERP。这种观察提出了一个有趣的问题: 用户是否知道任何给定的查询有多大用处?他们能够预测相同检索需求的替代查询的有效性吗?为了探索这个问题,我们设计并进行了一个众包用户研究,在这个研究中,我们要求受试者考虑一个表达为背景故事的信息需求陈述,然后提供他们对一组表面上针对该目标的查询的相对有用性的意见。我们使用两种不同的界面来征求意见: 一种是收集查询的绝对评分,另一种是要求被试将一组查询按顺序排列。我们发现,人群工作者对查询可能的有效性的估计是相当一致的,而且他们的估计与实际系统性能正相关。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Can+Users+Predict+Relative+Query+Effectiveness?)|0| |[Is Non-IID Data a Threat in Federated Online Learning to Rank?](https://doi.org/10.1145/3477495.3531709)|Shuyi Wang, Guido Zuccon|The University of Queensland, Brisbane, QLD, Australia|In this perspective paper we study the effect of non independent and identically distributed (non-IID) data on federated online learning to rank (FOLTR) and chart directions for future work in this new and largely unexplored research area of Information Retrieval. In the FOLTR process, clients participate in a federation to jointly create an effective ranker from the implicit click signal originating in each client, without the need to share data (documents, queries, clicks). A well-known factor that affects the performance of federated learning systems, and that poses serious challenges to these approaches, is that there may be some type of bias in the way data is distributed across clients. While FOLTR systems are in their own right a type of federated learning system, the presence and effect of non-IID data in FOLTR have not been studied. To this aim, we first enumerate possible data distribution settings that may showcase data bias across clients and thus give rise to the non-IID problem. Then, we study the impact of each setting on the performance of the current state-of-the-art FOLTR approach, the Federated Pairwise Differentiable Gradient Descent (FPDGD), and we highlight which data distributions may pose a problem for FOLTR methods. We also explore how common approaches proposed in the federated learning literature address non-IID issues in FOLTR. 
This allows us to unveil new research gaps that, we argue, future research in FOLTR should consider.|在这篇前瞻性的论文中,我们研究了非独立和同分布(非 IID)数据对联邦在线学习排名(FOLTR)的影响,以及未来工作的图表方向,这是一个新的、很大程度上尚未探索的信息检索研究领域。在 FOLTR 过程中,客户端参与到一个联合中,从每个客户端发出的隐式点击信号中共同创建一个有效的排名,而不需要共享数据(文档、查询、点击)。影响联邦学习系统性能的一个众所周知的因素是,数据在客户端之间的分布方式可能存在某种偏差,这对这些方法提出了严峻的挑战。虽然 FOLTR 系统本身就是一种联邦学习系统,但是对于 FOLTR 中非 IID 数据的存在和影响还没有进行研究。为此,我们首先列举可能的数据分布设置,这些设置可能显示客户端之间的数据偏差,从而引起非 IID 问题。然后,我们研究了每种设置对当前最先进的 FOLTR 方法——联邦成对可微分梯度下降法(fPDGD)——性能的影响,并强调了哪些数据分布可能会给 FOLTR 方法带来问题。我们还探讨了联合学习文献中提出的常用方法如何解决 FOLTR 中的非 IID 问题。这使我们能够揭示新的研究差距,我们认为,在 FOLTR 的未来研究应该考虑。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Is+Non-IID+Data+a+Threat+in+Federated+Online+Learning+to+Rank?)|0| -|[On Natural Language User Profiles for Transparent and Scrutable Recommendation](https://doi.org/10.1145/3477495.3531873)|Filip Radlinski, Krisztian Balog, Fernando Diaz, Lucas Dixon, Ben Wedin|Google, Montreal, Canada; Google, Stavanger, Norway; Google, Cambridge, MA, USA; Google, Paris, France; Google, London, United Kingdom|Natural interaction with recommendation and personalized search systems has received tremendous attention in recent years. We focus on the challenge of supporting people's understanding and control of these systems and explore a fundamentally new way of thinking about representation of knowledge in recommendation and personalization systems. Specifically, we argue that it may be both desirable and possible for algorithms that use natural language representations of users' preferences to be developed. We make the case that this could provide significantly greater transparency, as well as affordances for practical actionable interrogation of, and control over, recommendations. Moreover, we argue that such an approach, if successfully applied, may enable a major step towards systems that rely less on noisy implicit observations while increasing portability of knowledge of one's interests.|近年来,与推荐系统和个性化检索系统的自然交互受到了极大的关注。我们重点关注支持人们理解和控制这些系统的挑战,并探索一种在推荐和个性化系统中表示知识的全新思维方式。具体来说,我们认为开发使用用户偏好的自然语言表示的算法是可取的,也是可能的。我们认为,这可以提供更大的透明度,以及提供实际可行的审讯和控制,建议。此外,我们认为,这种方法,如果成功地应用,可能使一个重大的步骤,系统的依赖噪音较少的隐含观察,同时增加了一个人的兴趣知识的可移植性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=On+Natural+Language+User+Profiles+for+Transparent+and+Scrutable+Recommendation)|0| -|[Retrieval-Enhanced Machine Learning](https://doi.org/10.1145/3477495.3531722)|Hamed Zamani, Fernando Diaz, Mostafa Dehghani, Donald Metzler, Michael Bendersky|Google Research, Montréal, PQ, Canada; Google Research, Mountain View, CA, USA; Google Research, Amsterdam, Netherlands; University of Massachusetts Amherst, Amherst, MA, USA|Although information access systems have long supported people in accomplishing a wide range of tasks, we propose broadening the scope of users of information access systems to include task-driven machines, such as machine learning models. In this way, the core principles of indexing, representation, retrieval, and ranking can be applied and extended to substantially improve model generalization, scalability, robustness, and interpretability. We describe a generic retrieval-enhanced machine learning (REML) framework, which includes a number of existing models as special cases. REML challenges information retrieval conventions, presenting opportunities for novel advances in core areas, including optimization. 
The REML research agenda lays a foundation for a new style of information access research and paves a path towards advancing machine learning and artificial intelligence.|尽管信息访问系统长期以来一直支持人们完成各种各样的任务,但我们建议扩大信息访问系统的用户范围,以包括任务驱动的机器,如机器学习模型。通过这种方式,可以应用和扩展索引、表示、检索和排序的核心原则,从而大大提高模型泛化、可伸缩性、健壮性和可解释性。我们描述了一个通用的检索增强机器学习(REML)框架,其中包括一些现有的模型作为特殊情况。REML 挑战了信息检索惯例,为核心领域的新进展提供了机会,包括优化。REML 研究议程为新型的信息获取研究奠定了基础,为机器学习和人工智能的发展铺平了道路。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Retrieval-Enhanced+Machine+Learning)|0| +|[On Natural Language User Profiles for Transparent and Scrutable Recommendation](https://doi.org/10.1145/3477495.3531873)|Filip Radlinski, Krisztian Balog, Fernando Diaz, Lucas Dixon, Ben Wedin|Google, Paris, France; Google, Montreal, Canada; Google, Stavanger, Norway; Google, London, United Kingdom; Google, Cambridge, MA, USA|Natural interaction with recommendation and personalized search systems has received tremendous attention in recent years. We focus on the challenge of supporting people's understanding and control of these systems and explore a fundamentally new way of thinking about representation of knowledge in recommendation and personalization systems. Specifically, we argue that it may be both desirable and possible for algorithms that use natural language representations of users' preferences to be developed. We make the case that this could provide significantly greater transparency, as well as affordances for practical actionable interrogation of, and control over, recommendations. Moreover, we argue that such an approach, if successfully applied, may enable a major step towards systems that rely less on noisy implicit observations while increasing portability of knowledge of one's interests.|近年来,与推荐系统和个性化检索系统的自然交互受到了极大的关注。我们重点关注支持人们理解和控制这些系统的挑战,并探索一种在推荐和个性化系统中表示知识的全新思维方式。具体来说,我们认为开发使用用户偏好的自然语言表示的算法是可取的,也是可能的。我们认为,这可以提供更大的透明度,以及提供实际可行的审讯和控制,建议。此外,我们认为,这种方法,如果成功地应用,可能使一个重大的步骤,系统的依赖噪音较少的隐含观察,同时增加了一个人的兴趣知识的可移植性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=On+Natural+Language+User+Profiles+for+Transparent+and+Scrutable+Recommendation)|0| +|[Retrieval-Enhanced Machine Learning](https://doi.org/10.1145/3477495.3531722)|Hamed Zamani, Fernando Diaz, Mostafa Dehghani, Donald Metzler, Michael Bendersky|University of Massachusetts Amherst, Amherst, MA, USA; Google Research, Amsterdam, Netherlands; Google Research, Montréal, PQ, Canada; Google Research, Mountain View, CA, USA|Although information access systems have long supported people in accomplishing a wide range of tasks, we propose broadening the scope of users of information access systems to include task-driven machines, such as machine learning models. In this way, the core principles of indexing, representation, retrieval, and ranking can be applied and extended to substantially improve model generalization, scalability, robustness, and interpretability. We describe a generic retrieval-enhanced machine learning (REML) framework, which includes a number of existing models as special cases. REML challenges information retrieval conventions, presenting opportunities for novel advances in core areas, including optimization. 
The REML research agenda lays a foundation for a new style of information access research and paves a path towards advancing machine learning and artificial intelligence.|尽管信息访问系统长期以来一直支持人们完成各种各样的任务,但我们建议扩大信息访问系统的用户范围,以包括任务驱动的机器,如机器学习模型。通过这种方式,可以应用和扩展索引、表示、检索和排序的核心原则,从而大大提高模型泛化、可伸缩性、健壮性和可解释性。我们描述了一个通用的检索增强机器学习(REML)框架,其中包括一些现有的模型作为特殊情况。REML 挑战了信息检索惯例,为核心领域的新进展提供了机会,包括优化。REML 研究议程为新型的信息获取研究奠定了基础,为机器学习和人工智能的发展铺平了道路。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Retrieval-Enhanced+Machine+Learning)|0| |[Multi-CPR: A Multi Domain Chinese Dataset for Passage Retrieval](https://doi.org/10.1145/3477495.3531736)|Dingkun Long, Qiong Gao, Kuan Zou, Guangwei Xu, Pengjun Xie, Ruijie Guo, Jian Xu, Guanjun Jiang, Luxi Xing, Ping Yang|Alibaba Group, Hangzhou, China|Passage retrieval is a fundamental task in information retrieval (IR) research, which has drawn much attention recently. In the English field, the availability of large-scale annotated dataset (e.g., MS MARCO) and the emergence of deep pre-trained language models (e.g., BERT) have resulted in a substantial improvement of existing passage retrieval systems. However, in the Chinese field, especially for specific domains, passage retrieval systems are still immature due to quality-annotated dataset being limited by scale. Therefore, in this paper, we present a novel multi-domain Chinese dataset for passage retrieval (Multi-CPR). The dataset is collected from three different domains, including E-commerce, Entertainment video and Medical. Each dataset contains millions of passages and a certain amount of human annotated query-passage related pairs. We implement various representative passage retrieval methods as baselines. We find that the performance of retrieval models trained on dataset from general domain will inevitably decrease on specific domain. Nevertheless, a passage retrieval system built on in-domain annotated dataset can achieve significant improvement, which indeed demonstrates the necessity of domain labeled data for further optimization. We hope the release of the Multi-CPR dataset could benchmark Chinese passage retrieval task in specific domain and also make advances for future studies.|短文检索是信息检索研究中的一项基础性工作,近年来备受关注。在英语领域,大规模注释数据集(例如 MS MARCO)的可用性和深度预训练语言模型(例如 BERT)的出现使现有的文章检索系统得到了实质性的改进。然而,在中文领域,特别是在特定领域,由于质量注释数据集受到规模的限制,文章检索系统还不成熟。因此,本文提出了一种新的多领域中文文本检索数据集(Multi-CPR)。该数据集收集自三个不同的领域,包括电子商务,娱乐视频和医疗。每个数据集包含数百万个段落和一定数量的人工注释的查询-段落相关对。我们实现了各种具有代表性的文章检索方法作为基线。研究发现,对一般领域数据集训练的检索模型在特定领域的性能不可避免地会下降。然而,建立在域内注释数据集上的文章检索系统可以取得显著的改进,这确实说明了域标记数据进一步优化的必要性。我们希望通过多 CPR 数据集的发布,能够为特定领域的中文文章检索任务提供基准,并为今后的研究提供参考。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Multi-CPR:+A+Multi+Domain+Chinese+Dataset+for+Passage+Retrieval)|0| 
MIMICS-Duo contains fine-grained annotations on clarification questions and their candidate answers and enhances the existing MIMICS datasets by enabling multi-dimensional evaluation of search clarification methods, including online and offline evaluation. We conduct extensive analysis to demonstrate the relationship between offline and online search clarification datasets and outline several research directions enabled by MIMICS-Duo. We believe that this resource will help researchers better understand clarification in search.|提出澄清问题是一个活跃的研究领域,然而,培训和评估搜索澄清方法的资源是不够的。为了解决这个问题,我们描述了 MIMICS-Duo,这是一个新的免费数据集,包含306个具有多重澄清的搜索查询(总共1,034个查询-澄清对)。MIMICS-Duo 包含关于澄清问题及其候选答案的细粒度注释,并通过支持搜索澄清方法的多维评估(包括在线和离线评估)来增强现有的 MIMICS 数据集。我们进行了广泛的分析,以证明离线和在线搜索澄清数据集之间的关系,并概述了由 MIMICS-Duo 实现的几个研究方向。我们相信,这一资源将有助于研究人员更好地理解在搜索澄清。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=MIMICS-Duo:+Offline+&+Online+Evaluation+of+Search+Clarification)|0| -|[A Common Framework for Exploring Document-at-a-Time and Score-at-a-Time Retrieval Methods](https://doi.org/10.1145/3477495.3531657)|Andrew Trotman, Joel Mackenzie, Pradeesh Parameswaran, Jimmy Lin|University of Otago, Dunedin, New Zealand; University of Waterloo, Waterloo, Canada; The University of Queensland, Brisbane, QLD, Australia|Document-at-a-time (DaaT) and score-at-a-time (SaaT) query evaluation techniques are different approaches to top-k retrieval with inverted indexes. While modern systems are dominated by DaaT, the academic literature has seen decades of debate about the merits of each. Recently, there has been renewed interest in SaaT methods for learned sparse lexical models, where studies have shown that transformers generate "wacky weights" that appear to reduce opportunities for optimizations in DaaT methods. However, researchers currently lack an easy-to-use SaaT system to support further exploration. This is the gap that our work fills. Starting with a modern SaaT system (JASS), we built Python bindings in order to integrate into the DaaT Pyserini IR toolkit (Lucene). The result is a common frontend to both a DaaT and a SaaT system. We demonstrate how recent experiments with a wide range of learned sparse lexical models can be easily reproduced. Our contribution is a framework that enables future research comparing DaaT and SaaT methods in the context of modern neural retrieval models.|一次文档(DaaT)和一次得分(SaaT)查询评估技术是两种不同的方法,用于带有倒排索引的 top-k 检索。虽然现代系统是由 DaaT 主导的,但学术文献已经对每个系统的优点进行了数十年的争论。最近,人们对学习稀疏词汇模型的 SaaT 方法重新产生了兴趣,研究表明,变压器产生的“古怪的权重”似乎减少了 DaaT 方法优化的机会。然而,研究人员目前缺乏一个易于使用的 SaaT 系统来支持进一步的探索。这是我们的工作填补的空白。从一个现代 SaaT 系统(JASS)开始,我们构建了 Python 绑定,以便集成到 DaaT Pyserini IR 工具包(Lucene)中。其结果是 DaaT 和 SaaT 系统的共同前端。我们证明了最近的实验与广泛的学习稀疏词汇模型可以很容易地再现。我们的贡献是一个框架,使未来的研究比较 DaaT 和 SaaT 方法在现代神经检索模型的背景下。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Common+Framework+for+Exploring+Document-at-a-Time+and+Score-at-a-Time+Retrieval+Methods)|0| -|[BiTe-REx: An Explainable Bilingual Text Retrieval System in the Automotive Domain](https://doi.org/10.1145/3477495.3531665)|Viju Sudhi, Sabine Wehnert, Norbert Michael Homner, Sebastian Ernst, Mark Gonter, Andreas Krug, Ernesto William De Luca|Otto von Guericke University, Magdeburg, Germany; Audi AG, Ingolstadt, Germany|To satiate the comprehensive information need of users, retrieval systems surpassing the boundaries of language are inevitable in the present digital space in the wake of an ever-rising multilingualism. 
This work presents the first-of-its-kind Bilingual Text Retrieval Explanations (BiTe-REx) aimed at users performing competitor or wage analysis in the automotive domain. BiTe-REx supports users to gather a more comprehensive picture of their query by retrieving results regardless of the query language and enables them to make a more informed decision by exposing how the underlying model judges the relevance of documents. With a user study, we demonstrate statistically significant results on the understandability and helpfulness of the explanations provided by the system.|为了满足用户的综合信息需求,随着多种语言的日益普及,超越语言边界的检索系统在当今数字化空间中是不可避免的。这项工作提出了第一种双语文本检索解释(BiTe-REx) ,旨在用户执行竞争对手或工资分析在汽车领域。BiTe-REx 支持用户通过检索结果(不管查询语言如何)收集更全面的查询信息,并通过揭示底层模型如何判断文档的相关性,使用户能够做出更明智的决策。通过用户研究,我们证明了系统提供的解释的可理解性和有用性的统计学显著结果。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=BiTe-REx:+An+Explainable+Bilingual+Text+Retrieval+System+in+the+Automotive+Domain)|0| +|[MIMICS-Duo: Offline & Online Evaluation of Search Clarification](https://doi.org/10.1145/3477495.3531750)|Leila Tavakoli, Johanne R. Trippas, Hamed Zamani, Falk Scholer, Mark Sanderson|University of Melbourne, Melbourne, VIC, Australia; RMIT University, Melbourne, VIC, Australia; University of Massachusetts Amherst, Amherst, MA, USA|Asking clarification questions is an active area of research; however, resources for training and evaluating search clarification methods are not sufficient. To address this issue, we describe MIMICS-Duo, a new freely available dataset of 306 search queries with multiple clarifications (a total of 1,034 query-clarification pairs). MIMICS-Duo contains fine-grained annotations on clarification questions and their candidate answers and enhances the existing MIMICS datasets by enabling multi-dimensional evaluation of search clarification methods, including online and offline evaluation. We conduct extensive analysis to demonstrate the relationship between offline and online search clarification datasets and outline several research directions enabled by MIMICS-Duo. We believe that this resource will help researchers better understand clarification in search.|提出澄清问题是一个活跃的研究领域,然而,培训和评估搜索澄清方法的资源是不够的。为了解决这个问题,我们描述了 MIMICS-Duo,这是一个新的免费数据集,包含306个具有多重澄清的搜索查询(总共1,034个查询-澄清对)。MIMICS-Duo 包含关于澄清问题及其候选答案的细粒度注释,并通过支持搜索澄清方法的多维评估(包括在线和离线评估)来增强现有的 MIMICS 数据集。我们进行了广泛的分析,以证明离线和在线搜索澄清数据集之间的关系,并概述了由 MIMICS-Duo 实现的几个研究方向。我们相信,这一资源将有助于研究人员更好地理解在搜索澄清。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=MIMICS-Duo:+Offline+&+Online+Evaluation+of+Search+Clarification)|0| +|[A Common Framework for Exploring Document-at-a-Time and Score-at-a-Time Retrieval Methods](https://doi.org/10.1145/3477495.3531657)|Andrew Trotman, Joel Mackenzie, Pradeesh Parameswaran, Jimmy Lin|University of Waterloo, Waterloo, Canada; University of Otago, Dunedin, New Zealand; The University of Queensland, Brisbane, QLD, Australia|Document-at-a-time (DaaT) and score-at-a-time (SaaT) query evaluation techniques are different approaches to top-k retrieval with inverted indexes. While modern systems are dominated by DaaT, the academic literature has seen decades of debate about the merits of each. Recently, there has been renewed interest in SaaT methods for learned sparse lexical models, where studies have shown that transformers generate "wacky weights" that appear to reduce opportunities for optimizations in DaaT methods. However, researchers currently lack an easy-to-use SaaT system to support further exploration. This is the gap that our work fills. 
Starting with a modern SaaT system (JASS), we built Python bindings in order to integrate into the DaaT Pyserini IR toolkit (Lucene). The result is a common frontend to both a DaaT and a SaaT system. We demonstrate how recent experiments with a wide range of learned sparse lexical models can be easily reproduced. Our contribution is a framework that enables future research comparing DaaT and SaaT methods in the context of modern neural retrieval models.|一次文档(DaaT)和一次得分(SaaT)查询评估技术是两种不同的方法,用于带有倒排索引的 top-k 检索。虽然现代系统是由 DaaT 主导的,但学术文献已经对每个系统的优点进行了数十年的争论。最近,人们对学习稀疏词汇模型的 SaaT 方法重新产生了兴趣,研究表明,变压器产生的“古怪的权重”似乎减少了 DaaT 方法优化的机会。然而,研究人员目前缺乏一个易于使用的 SaaT 系统来支持进一步的探索。这是我们的工作填补的空白。从一个现代 SaaT 系统(JASS)开始,我们构建了 Python 绑定,以便集成到 DaaT Pyserini IR 工具包(Lucene)中。其结果是 DaaT 和 SaaT 系统的共同前端。我们证明了最近的实验与广泛的学习稀疏词汇模型可以很容易地再现。我们的贡献是一个框架,使未来的研究比较 DaaT 和 SaaT 方法在现代神经检索模型的背景下。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Common+Framework+for+Exploring+Document-at-a-Time+and+Score-at-a-Time+Retrieval+Methods)|0| +|[BiTe-REx: An Explainable Bilingual Text Retrieval System in the Automotive Domain](https://doi.org/10.1145/3477495.3531665)|Viju Sudhi, Sabine Wehnert, Norbert Michael Homner, Sebastian Ernst, Mark Gonter, Andreas Krug, Ernesto William De Luca|Audi AG, Ingolstadt, Germany; Otto von Guericke University, Magdeburg, Germany|To satiate the comprehensive information need of users, retrieval systems surpassing the boundaries of language are inevitable in the present digital space in the wake of an ever-rising multilingualism. This work presents the first-of-its-kind Bilingual Text Retrieval Explanations (BiTe-REx) aimed at users performing competitor or wage analysis in the automotive domain. BiTe-REx supports users to gather a more comprehensive picture of their query by retrieving results regardless of the query language and enables them to make a more informed decision by exposing how the underlying model judges the relevance of documents. With a user study, we demonstrate statistically significant results on the understandability and helpfulness of the explanations provided by the system.|为了满足用户的综合信息需求,随着多种语言的日益普及,超越语言边界的检索系统在当今数字化空间中是不可避免的。这项工作提出了第一种双语文本检索解释(BiTe-REx) ,旨在用户执行竞争对手或工资分析在汽车领域。BiTe-REx 支持用户通过检索结果(不管查询语言如何)收集更全面的查询信息,并通过揭示底层模型如何判断文档的相关性,使用户能够做出更明智的决策。通过用户研究,我们证明了系统提供的解释的可理解性和有用性的统计学显著结果。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=BiTe-REx:+An+Explainable+Bilingual+Text+Retrieval+System+in+the+Automotive+Domain)|0| |[Are Taylor's Posts Risky? Evaluating Cumulative Revelations in Online Personal Data: A persona-based tool for evaluating awareness of online risks and harms](https://doi.org/10.1145/3477495.3531659)|Leif Azzopardi, Jo Briggs, Melissa Duheric, Callum Nash, Emma Nicol, Wendy Moncur, Burkhard Schafer||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Are+Taylor's+Posts+Risky?+Evaluating+Cumulative+Revelations+in+Online+Personal+Data:+A+persona-based+tool+for+evaluating+awareness+of+online+risks+and+harms)|0| |[DDEN: A Heterogeneous Learning-to-Rank Approach with Deep Debiasing Experts Network](https://doi.org/10.1145/3477495.3536320)|Wenchao Xiu, Yiran Wang, Taofeng Xue, Kai Zhang, Qin Zhang, Zhonghuo Wu, Yifan Yang, Gong Zhang|Meituan, Shanghai, China|Learning-to-Rank(LTR) is widely used in many Information Retrieval(IR) scenarios, including web search and Location Based Services(LBS) search. However, most existing LTR techniques mainly focus on homogeneous ranking. 
Taking QAC in Dianping search as an example, heterogeneous documents including suggested queries (SQ) and Points-of-Interest (POI) need to be ranked and presented to enhance user experience. New challenges are faced when conducting heterogeneous ranking, including inconsistent feature space and more serious position bias caused by distinct representation spaces. Therefore, we propose the Deep Debiasing Experts Network (DDEN), a novel heterogeneous LTR approach based on a Mixture-of-Experts architecture and gating network, to deal with the inconsistent feature space of documents in the ranking system. Furthermore, DDEN mitigates the position bias by adopting an adversarial-debiasing framework embedded with heterogeneous LTR techniques. We conduct reproducible experiments on industrial datasets from Dianping, one of the largest local life platforms, and deploy DDEN in an online application. Results show that DDEN substantially improves ranking performance in offline evaluation and boosts the overall click-through rate in the online A/B test by 2.1%.|学习到排名(Learning-to-Rank,LTR)广泛应用于许多信息检索场景,包括网络搜索和基于位置的服务(Location Based Services,LBS)搜索。然而,大多数现有的 LTR 技术主要集中在同质排序。以点评搜索中的查询自动补全(QAC)为例,需要对包括建议查询(SQ)和兴趣点(POI)在内的异构文档进行排序和呈现,以提高用户体验。异构排序面临的新挑战包括不一致的特征空间和不同表示空间引起的更严重的位置偏差。为此,本文提出了一种基于专家混合体系结构和门网络的异构 LTR 方法——深度去偏专家网络(Deep Debiasing Experts Network,DDEN),用于处理排序系统中文档的不一致特征空间。此外,DDEN 通过采用嵌入异构 LTR 技术的对抗性消偏框架来缓解位置偏差。我们在本地最大的生活平台之一 Dianping 的工业数据集上进行可重复的实验,并在在线应用中部署 DDEN。结果显示,DDEN 大大提高了离线评估的排名表现,并使在线 A/B 测试的整体点进率提高了2.1%。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=DDEN:+A+Heterogeneous+Learning-to-Rank+Approach+with+Deep+Debiasing+Experts+Network)|0| |[An Intelligent Advertisement Short Video Production System via Multi-Modal Retrieval](https://doi.org/10.1145/3477495.3536323)|Yanheng Wei, Lianghua Huang, Yanhao Zhang, Yun Zheng, Pan Pan|Alibaba Group, Beijing, China|In its most basic form, advertising video production communicates a message about a product or service to the public. In the age of digital marketing, the most popular way to connect with audiences is through advertising videos. However, advertising video production is a costly and complicated process, from creation and material shooting through editing to the final commercial video. Therefore, producing qualified advertising videos is a capital- and talent-intensive task, which poses a huge challenge for start-ups or inexperienced ad creators. This paper proposes an intelligent advertising video production system driven by multi-modal retrieval, which only requires the input of descriptive copy. This system can automatically generate scripts, then extract key queries, retrieve related short video materials in the video library, and finally synthesize short advertising videos. The whole process minimizes human input, greatly reduces the threshold for advertising video production and greatly improves output and efficiency. It has a modular design to encourage the study of new multi-modal algorithms, which can be evaluated in batch mode. It can also integrate with a user interface, which allows user studies and data collection in an interactive mode, where the back end can be fully algorithmic or a Wizard-of-Oz setup.
The proposed system has been fully verified and has broad prospects in the production of short videos for commodity advertisements within Alibaba.|在最基本的形式中,广告视频制作向公众传达了关于产品或服务的信息。在数字营销时代,最流行的与观众联系的方式是通过广告视频。然而,广告视频制作是一个昂贵而复杂的过程,从创作、素材拍摄、编辑到最终的商业视频。因此,制作合格的广告视频是一项资本和人才密集型的任务,这对初创企业或缺乏经验的广告创作者来说是一个巨大的挑战。本文提出了一种基于多模态检索的智能广告视频制作系统,该系统只需要输入描述性文本。该系统可以自动生成脚本,然后提取关键查询,检索视频库中相关的短视频资料,最后合成广告短视频。整个过程最大限度地减少了人工投入,大大降低了广告视频制作的门槛,大大提高了产量和效率。它采用模块化设计,以鼓励对新的多模态算法的研究,这些算法可以在批处理模式下进行评估。它还可以集成一个用户界面,以交互模式支持用户研究和数据收集,其中后端可以是完全算法化的,也可以采用绿野仙踪式(Wizard-of-Oz)设置。建议的系统已经全面验证,在阿里巴巴制作商品广告短片方面具有广阔前景。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=An+Intelligent+Advertisement+Short+Video+Production+System+via+Multi-Modal+Retrieval)|0|
@@ -324,107 +324,107 @@
 |[What the Actual...Examining User Behaviour in Information Retrieval](https://doi.org/10.1145/3477495.3532687)|George Buchanan, Dana McKay|University of Melbourne, Melbourne, VIC, Australia; RMIT University, Melbourne, VIC, Australia|Conducting studies involving actual users is a recurring challenge in information retrieval. In this tutorial we will address the main strategic and tactical choices for engaging with, designing and executing user studies, considering both evaluation and formative investigation. The tension between reproducibility and ensuring natural user behaviour will be a recurring focus, seeking to help individual researchers make an intentional and well-argued choice for their research. The presenters have over fifty years of combined experience working in interactive information retrieval, and information interaction in general.|进行涉及实际使用者的研究是信息检索的一个反复出现的挑战。在本教程中,我们将讨论参与、设计和执行用户研究的主要战略和战术选择,同时考虑评估和形成性调查。可重复性和确保自然使用者行为之间的紧张关系将是一个反复出现的焦点,目的是帮助个别研究人员为其研究做出有意识和有充分理由的选择。主持人在互动信息检索和一般的信息互动方面有超过五十年的工作经验。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=What+the+Actual...Examining+User+Behaviour+in+Information+Retrieval)|0| |[User-centered Non-factoid Answer Retrieval](https://doi.org/10.1145/3477495.3531689)|Marwah Alaofi|RMIT University, Melbourne, VIC, Australia|In this research, we aim to examine the assumptions made about users when searching for non-factoid answers using search engines. That is, the way they approach non-factoid question-answering tasks, the language they use to express their questions, the variability in their queries and their behavior towards the provided answers. The investigation will also examine the extent to which these neglected factors affect retrieval performance and potentially highlight the importance of building more realistic methodologies and test collections that capture the real nature of this task. Through our preliminary work, we have begun to explore the characteristics of non-factoid question-answering queries and investigate query variability and their impact on modern retrieval models. Our preliminary results demonstrate notable differences between non-factoid questions sampled from a large query log and those used in QA datasets. In addition, our results demonstrate a profound effect of query variability on retrieval consistency, indicating a potential impact on retrieval performance that is worth studying. We highlight the importance of understanding user behaviour while searching for non-factoid answers, specifically the way they behave in response to receiving an answer.
This should advance our understanding of the support users require across different types of non-factoid questions and inform the design of interaction models that support learning and encourage exploration.|本研究旨在探讨使用搜寻引擎搜寻非事实性答案时,对使用者所作的假设。也就是说,他们处理非事实性问答任务的方式,他们用来表达他们的问题的语言,他们的查询的可变性和他们对提供的答案的行为。调查还将审查这些被忽视的因素在多大程度上影响检索性能,并可能强调建立更现实的方法和测试收集的重要性,以捕捉这一任务的真实性质。通过我们的初步工作,我们已经开始探索非事实问答查询的特点,并调查查询的可变性及其对现代检索模型的影响。我们的初步结果表明,从大型查询日志中抽样的非事实性问题与 QA 数据集中使用的问题之间存在显著差异。此外,我们的研究结果显示了查询变异性对检索一致性的深刻影响,表明了对检索性能的潜在影响,值得研究。我们强调了理解用户行为的重要性,同时寻找非事实性的答案,特别是他们的行为方式,以回应收到的答案。这将提高我们对用户在不同类型的非事实性问题中需要的支持的理解,并为支持学习和鼓励探索的交互模型的设计提供信息。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=User-centered+Non-factoid+Answer+Retrieval)|0| |[Intelligent Conversational Agents for Ambient Computing](https://doi.org/10.1145/3477495.3532087)|Ruhi Sarikaya|Amazon, Seattle, WA, USA|We are in the midst of an AI revolution. Three primary disruptive changes set off this revolution: 1) an increase in compute power, 2) mobile internet, and 3) advances in deep learning. The next decade is expected to be about the proliferation of Internet-of-Things (IoT) devices and sensors, which will generate exponentially larger amounts of data to reason over and pave the way for ambient computing. This will also give rise to new forms of interaction patterns with these systems. Users will have to interact with these systems under increasingly richer context and in real-time. Conversational AI has a critical role to play in this revolution, but only if it delivers on its promise of enabling natural, frictionless, and personalized interactions in any context the user is in, while hiding the complexity of these systems through ambient intelligence. However, current commercial conversational AI systems are trained primarily with a supervised learning paradigm, which is difficult, if not impossible, to scale by manually annotating data for increasingly complex sets of contextual conditions. Inherent ambiguity in natural language further complicates the problem. We need to devise new forms of learning paradigms and frameworks that will scale to this complexity.
In this talk, we present some early steps we are taking with Alexa, Amazon's Conversational AI system, to move from supervised learning to self-learning methods, where the AI relies on customer interactions for supervision in our journey to ambient intelligence.|我们正处于人工智能革命的中期。三个主要的颠覆性变化引发了这场革命: 1)计算能力的提高,移动互联网的发展,以及深度学习的进步。下一个十年预计将是物联网设备和传感器的激增,它们将产生指数级数量的数据来进行推理,并为环境计算铺平道路。这也将产生与这些系统交互模式的新形式。用户将不得不在日益丰富的上下文环境下与这些系统进行实时交互。对话式人工智能在这场革命中扮演着关键的角色,但前提是它能够在用户所处的任何环境中实现自然、无摩擦和个性化的交互,同时通过环境智能隐藏这些系统的复杂性。然而,目前的商业会话人工智能系统主要使用监督式学习范式进行训练,这种范式很难(如果不是不可能的话)通过手动为日益复杂的上下文条件集注释数据来扩展。自然语言中固有的歧义使问题进一步复杂化。我们需要设计新的学习范式和框架,以适应这种复杂性。在本次演讲中,我们将介绍亚马逊的对话式人工智能系统 Alexa 的一些早期步骤,该系统将从监督式学习转向自学习方法,在我们的环境智能过程中,人工智能依赖客户互动进行监督。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Intelligent+Conversational+Agents+for+Ambient+Computing)|0| -|[A Robust Computerized Adaptive Testing Approach in Educational Question Retrieval](https://doi.org/10.1145/3477495.3531928)|Yan Zhuang, Qi Liu, Zhenya Huang, Zhi Li, Binbin Jin, Haoyang Bi, Enhong Chen, Shijin Wang|Huawei Cloud Computing Technologies Co., Ltd, Hangzhou, China; Anhui Province Key Laboratory of Big Data Analysis and Application, University of Science and Technology of China & Institute of Artificial Intelligence, Hefei Comprehensive National Science Center & State Key Laboratory of Cognitive Intelligence, Hefei, China; State Key Laboratory of Cognitive Intelligence & iFLYTEK AI Research (Central China), iFLYTEK Co., Ltd, Hefei, China; Anhui Province Key Laboratory of Big Data Analysis and Application, University of Science and Technology of China & State Key Laboratory of Cognitive Intelligence, Hefei, China|Computerized Adaptive Testing (CAT) is a promising testing mode in personalized online education (e.g., GRE), which aims at measuring student's proficiency accurately and reducing test length. The "adaptive" is reflected in its selection algorithm that can retrieve best-suited questions for student based on his/her estimated proficiency at each test step. Although there are many sophisticated selection algorithms for improving CAT's effectiveness, they are restricted and perturbed by the accuracy of current proficiency estimate, thus lacking robustness. To this end, we investigate a general method to enhance the robustness of existing algorithms by leveraging student's "multi-facet" nature during tests. Specifically, we present a generic optimization criterion Robust Adaptive Testing (RAT) for proficiency estimation via fusing multiple estimates at each step, which maintains a multi-facet description of student's potential proficiency. We further provide theoretical analyses of such estimator's desirable statistical properties: asymptotic unbiasedness, efficiency, and consistency. 
Extensive experiments on perturbed synthetic data and three real-world datasets show that selection algorithms in our RAT framework are robust and yield substantial improvements.|计算机自适应测试(CAT)是个性化网络教育(如 GRE)中一种很有前途的测试模式,其目的是准确测量学生的水平,减少测试时间。“适应性”反映在其选择算法中,该算法可以根据学生在每个测试步骤中的估计熟练程度为学生检索最适合的问题。虽然有许多复杂的选择算法来提高 CAT 的有效性,但它们都受到当前水平估计精度的限制和干扰,因此缺乏鲁棒性。为此,我们研究了一种通用的方法,以增强现有的算法的健壮性,利用学生的“多方面”的性质在测试。具体来说,我们提出了一个通用的优化标准鲁棒自适应测试(RAT)的水平估计融合多个估计在每一个步骤,它保持了一个学生的潜在水平的多方面的描述。进一步从理论上分析了这类估计量的理想统计性质: 渐近无偏性、有效性和一致性。在扰动合成数据和三个实际数据集上的大量实验表明,我们的 RAT 框架中的选择算法是健壮的,并且产生了实质性的改进。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Robust+Computerized+Adaptive+Testing+Approach+in+Educational+Question+Retrieval)|0| -|[Forest-based Deep Recommender](https://doi.org/10.1145/3477495.3531980)|Chao Feng, Defu Lian, Zheng Liu, Xing Xie, Le Wu, Enhong Chen|Microsoft Research Asia, Beijing, China; University of Science and Technology of China, Hefei, China; Hefei university of Technology, Hefei, China|With the development of deep learning techniques, deep recommendation models also achieve remarkable improvements in terms of recommendation accuracy. However, due to the large number of candidate items in practice and the high cost of preference computation, these methods also suffer from low efficiency of recommendation. The recently proposed tree-based deep recommendation models alleviate the problem by directly learning tree structure and representations under the guidance of recommendation objectives. However, such models have two shortcomings. First, the max-heap assumption in the hierarchical tree, in which the preference for a parent node should be the maximum between the preferences for its children, is difficult to satisfy in their binary classification objectives. Second, the learned index only includes a single tree, which is different from the widely-used multiple trees index, providing an opportunity to improve the accuracy of recommendation. To this end, we propose a Deep Forest-based Recommender (DeFoRec for short) for an efficient recommendation. In DeFoRec, all the trees generated during training process are retained to form the forest. When learning node representation of each tree, we have to satisfy the max-heap assumption as much as possible and mimic beam search behavior over the tree in the training stage. This is achieved by DeFoRec to regard the training task as multi-classification over tree nodes at the same level. However, the number of tree nodes grows exponentially with levels, making us to train the preference model by the guidance of sampled-softmax technique. 
The experiments are conducted on real-world datasets, validating the effectiveness of the proposed preference model learning method and tree learning method.|随着深度学习技术的发展,深度推荐模型在推荐精度方面也取得了显著的提高。然而,由于实际中候选项数量大,偏好计算成本高,这些方法也存在推荐效率低的问题。最近提出的基于树的深度推荐模型通过在推荐目标的指导下直接学习树的结构和表示来解决这个问题。然而,这种模式有两个缺点。首先,层次树中的最大堆假设(父节点的首选项应该是其子节点的首选项之间的最大值)难以满足其二进制分类目标。其次,学习索引只包括一棵树,这与广泛使用的多棵树索引不同,为提高推荐的准确性提供了机会。为此,我们提出了一个基于深度森林的推荐器(简称 DeFoRec)来实现有效的推荐。在 DeFoRec 中,所有在训练过程中生成的树被保留以形成森林。在学习每棵树的节点表示时,必须尽可能满足最大堆假设,并在训练阶段模拟树上的束搜索行为。DeFoRec 将训练任务视为同一层次上的树节点上的多分类,从而实现了这一目标。然而,树节点的数量随着层次的增加呈指数增长,这使得我们在采样-软极大技术的指导下对偏好模型进行训练。在实际数据集上进行了实验,验证了所提出的偏好模型学习方法和树学习方法的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Forest-based+Deep+Recommender)|0| -|[Ranking Interruptus: When Truncated Rankings Are Better and How to Measure That](https://doi.org/10.1145/3477495.3532051)|Enrique Amigó, Stefano Mizzaro, Damiano Spina|UNED NLP & IR Group, Madrid, Spain; University of Udine, Udine, Italy; RMIT University, Melbourne, VIC, Australia|Most of information retrieval effectiveness evaluation metrics assume that systems appending irrelevant documents at the bottom of the ranking are as effective as (or not worse than) systems that have a stopping criteria to 'truncate' the ranking at the right position to avoid retrieving those irrelevant documents at the end. It can be argued, however, that such truncated rankings are more useful to the end user. It is thus important to understand how to measure retrieval effectiveness in this scenario. In this paper we provide both theoretical and experimental contributions. We first define formal properties to analyze how effectiveness metrics behave when evaluating truncated rankings. Our theoretical analysis shows that de-facto standard metrics do not satisfy desirable properties to evaluate truncated rankings: only Observational Information Effectiveness (OIE) -- a metric based on Shannon's information theory -- satisfies them all. We then perform experiments to compare several metrics on nine TREC datasets. 
According to our experimental results, the most appropriate metrics for truncated rankings are OIE and a novel extension of Rank-Biased Precision that adds a user effort factor penalizing the retrieval of irrelevant documents.|大多数信息检索有效性评估指标都假定,在排名底部附加不相关文档的系统与那些有停止标准的系统一样有效(或者不比那些有停止标准的系统差) ,后者会在正确的位置“截断”排名,以避免在最后检索到那些不相关的文档。然而,可以说,这种截断的排名对最终用户更有用。因此,了解如何在此场景中度量检索效率非常重要。在本文中,我们提供了理论和实验的贡献。我们首先定义形式属性来分析效率指标在评估截断排名时的表现。我们的理论分析表明,事实上的标准指标不能满足评估截断排名的理想属性: 只有观测信息有效性(OIE)——一个基于香农信息理论的指标——能够满足所有这些指标。然后,我们进行实验来比较九个 TREC 数据集上的几个指标。根据我们的实验结果,最适合截断排名的指标是 OIE 和一个新的扩展排名偏差精度,增加了用户的努力因素惩罚检索不相关的文档。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Ranking+Interruptus:+When+Truncated+Rankings+Are+Better+and+How+to+Measure+That)|0| +|[A Robust Computerized Adaptive Testing Approach in Educational Question Retrieval](https://doi.org/10.1145/3477495.3531928)|Yan Zhuang, Qi Liu, Zhenya Huang, Zhi Li, Binbin Jin, Haoyang Bi, Enhong Chen, Shijin Wang|Huawei Cloud Computing Technologies Co., Ltd, Hangzhou, China; Anhui Province Key Laboratory of Big Data Analysis and Application, University of Science and Technology of China & State Key Laboratory of Cognitive Intelligence, Hefei, China; Anhui Province Key Laboratory of Big Data Analysis and Application, University of Science and Technology of China & Institute of Artificial Intelligence, Hefei Comprehensive National Science Center & State Key Laboratory of Cognitive Intelligence, Hefei, China; State Key Laboratory of Cognitive Intelligence & iFLYTEK AI Research (Central China), iFLYTEK Co., Ltd, Hefei, China|Computerized Adaptive Testing (CAT) is a promising testing mode in personalized online education (e.g., GRE), which aims at measuring student's proficiency accurately and reducing test length. The "adaptive" is reflected in its selection algorithm that can retrieve best-suited questions for student based on his/her estimated proficiency at each test step. Although there are many sophisticated selection algorithms for improving CAT's effectiveness, they are restricted and perturbed by the accuracy of current proficiency estimate, thus lacking robustness. To this end, we investigate a general method to enhance the robustness of existing algorithms by leveraging student's "multi-facet" nature during tests. Specifically, we present a generic optimization criterion Robust Adaptive Testing (RAT) for proficiency estimation via fusing multiple estimates at each step, which maintains a multi-facet description of student's potential proficiency. We further provide theoretical analyses of such estimator's desirable statistical properties: asymptotic unbiasedness, efficiency, and consistency. 
Extensive experiments on perturbed synthetic data and three real-world datasets show that selection algorithms in our RAT framework are robust and yield substantial improvements.|计算机自适应测试(CAT)是个性化网络教育(如 GRE)中一种很有前途的测试模式,其目的是准确测量学生的水平,减少测试时间。“适应性”反映在其选择算法中,该算法可以根据学生在每个测试步骤中的估计熟练程度为学生检索最适合的问题。虽然有许多复杂的选择算法来提高 CAT 的有效性,但它们都受到当前水平估计精度的限制和干扰,因此缺乏鲁棒性。为此,我们研究了一种通用的方法,利用学生在测试中的“多方面”特性来增强现有算法的健壮性。具体来说,我们提出了一个通用的优化准则——鲁棒自适应测试(RAT),它通过在每一步融合多个估计来进行水平估计,从而保持对学生潜在水平的多方面描述。进一步从理论上分析了这类估计量的理想统计性质: 渐近无偏性、有效性和一致性。在扰动合成数据和三个实际数据集上的大量实验表明,我们的 RAT 框架中的选择算法是健壮的,并且产生了实质性的改进。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Robust+Computerized+Adaptive+Testing+Approach+in+Educational+Question+Retrieval)|0| +|[Forest-based Deep Recommender](https://doi.org/10.1145/3477495.3531980)|Chao Feng, Defu Lian, Zheng Liu, Xing Xie, Le Wu, Enhong Chen|Hefei University of Technology, Hefei, China; University of Science and Technology of China, Hefei, China; Microsoft Research Asia, Beijing, China|With the development of deep learning techniques, deep recommendation models also achieve remarkable improvements in terms of recommendation accuracy. However, due to the large number of candidate items in practice and the high cost of preference computation, these methods also suffer from low efficiency of recommendation. The recently proposed tree-based deep recommendation models alleviate the problem by directly learning tree structure and representations under the guidance of recommendation objectives. However, such models have two shortcomings. First, the max-heap assumption in the hierarchical tree, in which the preference for a parent node should be the maximum between the preferences for its children, is difficult to satisfy in their binary classification objectives. Second, the learned index only includes a single tree, which is different from the widely-used multiple trees index, providing an opportunity to improve the accuracy of recommendation. To this end, we propose a Deep Forest-based Recommender (DeFoRec for short) for efficient recommendation. In DeFoRec, all the trees generated during the training process are retained to form the forest. When learning the node representation of each tree, we have to satisfy the max-heap assumption as much as possible and mimic beam search behavior over the tree in the training stage. This is achieved by DeFoRec by regarding the training task as multi-classification over tree nodes at the same level. However, the number of tree nodes grows exponentially with levels, leading us to train the preference model under the guidance of the sampled-softmax technique.
The experiments are conducted on real-world datasets, validating the effectiveness of the proposed preference model learning method and tree learning method.|随着深度学习技术的发展,深度推荐模型在推荐精度方面也取得了显著的提高。然而,由于实际中候选项数量大,偏好计算成本高,这些方法也存在推荐效率低的问题。最近提出的基于树的深度推荐模型通过在推荐目标的指导下直接学习树的结构和表示来解决这个问题。然而,这种模式有两个缺点。首先,层次树中的最大堆假设(父节点的首选项应该是其子节点的首选项之间的最大值)难以满足其二进制分类目标。其次,学习索引只包括一棵树,这与广泛使用的多棵树索引不同,为提高推荐的准确性提供了机会。为此,我们提出了一个基于深度森林的推荐器(简称 DeFoRec)来实现有效的推荐。在 DeFoRec 中,所有在训练过程中生成的树被保留以形成森林。在学习每棵树的节点表示时,必须尽可能满足最大堆假设,并在训练阶段模拟树上的束搜索行为。DeFoRec 将训练任务视为同一层次上的树节点上的多分类,从而实现了这一目标。然而,树节点的数量随着层次的增加呈指数增长,这使得我们在采样-软极大技术的指导下对偏好模型进行训练。在实际数据集上进行了实验,验证了所提出的偏好模型学习方法和树学习方法的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Forest-based+Deep+Recommender)|0| +|[Ranking Interruptus: When Truncated Rankings Are Better and How to Measure That](https://doi.org/10.1145/3477495.3532051)|Enrique Amigó, Stefano Mizzaro, Damiano Spina|University of Udine, Udine, Italy; RMIT University, Melbourne, VIC, Australia; UNED NLP & IR Group, Madrid, Spain|Most of information retrieval effectiveness evaluation metrics assume that systems appending irrelevant documents at the bottom of the ranking are as effective as (or not worse than) systems that have a stopping criteria to 'truncate' the ranking at the right position to avoid retrieving those irrelevant documents at the end. It can be argued, however, that such truncated rankings are more useful to the end user. It is thus important to understand how to measure retrieval effectiveness in this scenario. In this paper we provide both theoretical and experimental contributions. We first define formal properties to analyze how effectiveness metrics behave when evaluating truncated rankings. Our theoretical analysis shows that de-facto standard metrics do not satisfy desirable properties to evaluate truncated rankings: only Observational Information Effectiveness (OIE) -- a metric based on Shannon's information theory -- satisfies them all. We then perform experiments to compare several metrics on nine TREC datasets. According to our experimental results, the most appropriate metrics for truncated rankings are OIE and a novel extension of Rank-Biased Precision that adds a user effort factor penalizing the retrieval of irrelevant documents.|大多数信息检索有效性评估指标都假定,在排名底部附加不相关文档的系统与那些有停止标准的系统一样有效(或者不比那些有停止标准的系统差) ,后者会在正确的位置“截断”排名,以避免在最后检索到那些不相关的文档。然而,可以说,这种截断的排名对最终用户更有用。因此,了解如何在此场景中度量检索效率非常重要。在本文中,我们提供了理论和实验的贡献。我们首先定义形式属性来分析效率指标在评估截断排名时的表现。我们的理论分析表明,事实上的标准指标不能满足评估截断排名的理想属性: 只有观测信息有效性(OIE)——一个基于香农信息理论的指标——能够满足所有这些指标。然后,我们进行实验来比较九个 TREC 数据集上的几个指标。根据我们的实验结果,最适合截断排名的指标是 OIE 和一个新的扩展排名偏差精度,增加了用户的努力因素惩罚检索不相关的文档。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Ranking+Interruptus:+When+Truncated+Rankings+Are+Better+and+How+to+Measure+That)|0| |[Offline Retrieval Evaluation Without Evaluation Metrics](https://doi.org/10.1145/3477495.3532033)|Fernando Diaz, Andres Ferraro|Mila - Quebec Artificial Intelligence Institute, Montréal, PQ, Canada; Google, Montréal, PQ, Canada|Offline evaluation of information retrieval and recommendation has traditionally focused on distilling the quality of a ranking into a scalar metric such as average precision or normalized discounted cumulative gain. We can use this metric to compare the performance of multiple systems for the same request. 
Although evaluation metrics provide a convenient summary of system performance, they also collapse subtle differences across users into a single number and can carry assumptions about user behavior and utility not supported across retrieval scenarios. We propose recall-paired preference (RPP), a metric-free evaluation method based on directly computing a preference between ranked lists. RPP simulates multiple user subpopulations per query and compares systems across these pseudo-populations. Our results across multiple search and recommendation tasks demonstrate that RPP substantially improves discriminative power while correlating well with existing metrics and being equally robust to incomplete data.|对信息检索和推荐的离线评估传统上侧重于将排名的质量提炼为一个标量指标,如平均精度或标准化折现累计增益。我们可以使用这个度量来比较同一个请求的多个系统的性能。虽然评估指标提供了一个方便的系统性能总结,但是它们也将用户之间的细微差异折叠成一个数字,并且可以对不支持检索场景的用户行为和实用程序进行假设。我们提出了一种基于直接计算排名表之间偏好的无度量评价方法——召回配对偏好(RPP)。RPP 模拟每个查询的多个用户子种群,并比较这些伪种群中的系统。我们在多个搜索和推荐任务中的结果表明,RPP 大大提高了识别能力,同时与现有指标关联良好,对不完整数据具有同样的鲁棒性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Offline+Retrieval+Evaluation+Without+Evaluation+Metrics)|0| -|[Pareto-Optimal Fairness-Utility Amortizations in Rankings with a DBN Exposure Model](https://doi.org/10.1145/3477495.3532036)|Till Kletti, JeanMichel Renders, Patrick Loiseau|Univ. Grenoble Alpes, Inria, CNRS, Grenoble INP, LIG, Grenoble, France; Naver Labs Europe, Meylan, France|In recent years, it has become clear that rankings delivered in many areas need not only be useful to the users but also respect fairness of exposure for the item producers. We consider the problem of finding ranking policies that achieve a Pareto-optimal tradeoff between these two aspects. Several methods were proposed to solve it; for instance a popular one is to use linear programming with a Birkhoff-von Neumann decomposition. These methods, however, are based on a classical Position Based exposure Model (PBM), which assumes independence between the items (hence the exposure only depends on the rank). In many applications, this assumption is unrealistic and the community increasingly moves towards considering other models that include dependences, such as the Dynamic Bayesian Network (DBN) exposure model. For such models, computing (exact) optimal fair ranking policies remains an open question. In this paper, we answer this question by leveraging a new geometrical method based on the so-called expohedron proposed recently for the PBM (Kletti et al., WSDM'22). We lay out the structure of a new geometrical object (the DBN-expohedron), and propose for it a Carathéodory decomposition algorithm of complexity $O(n^3)$, where n is the number of documents to rank. Such an algorithm enables expressing any feasible expected exposure vector as a distribution over at most n rankings; furthermore we show that we can compute the whole set of Pareto-optimal expected exposure vectors with the same complexity $O(n^3)$. Our work constitutes the first exact algorithm able to efficiently find a Pareto-optimal distribution of rankings. It is applicable to a broad range of fairness notions, including classical notions of meritocratic and demographic fairness. 
We empirically evaluate our method on the TREC2020 and MSLR datasets and compare it to several baselines in terms of Pareto-optimality and speed.|近年来,很明显,在许多领域提供的排名不仅需要对用户有用,而且还要尊重项目制作者的公平曝光。我们考虑的问题,找到排序的政策,实现了帕累托最优权衡这两个方面。人们提出了几种方法来解决这个问题,例如,一种流行的方法是使用伯克霍夫-冯诺依曼分解的线性规划。然而,这些方法是基于经典的基于位置的曝光模型(PBM) ,该模型假设项目之间的独立性(因此曝光只取决于排名)。在许多应用程序中,这种假设是不现实的,社区越来越倾向于考虑包含依赖关系的其他模型,例如动态贝氏网路暴露模型。对于这样的模型,计算(精确的)最优公平排序策略仍然是一个悬而未决的问题。在本文中,我们回答这个问题,利用一个新的几何方法的基础上,所谓的外三面体最近提出的 PBM (Kletti 等,WSDM’22)。我们给出了一个新的几何对象(DBN-expohedron)的结构,并提出了一个复杂度为 $O (n ^ 3) $的 Carathéodory 分解算法,其中 n 是要排序的文档数。这种算法能够表示任何可行的期望暴露矢量作为一个分布在最多 n 个排名; 此外,我们表明,我们可以计算整个集合的帕累托最优期望暴露矢量具有相同的复杂度 $O (n ^ 3) $。我们的工作构成了第一个精确的算法,能够有效地找到排名的帕累托最优分布。它适用于广泛的公平概念,包括精英统治和人口统计公平的经典概念。我们在 TREC2020和 MSLR 数据集上经验性地评估了我们的方法,并将其与几个基线在帕累托最优性和速度方面进行了比较。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Pareto-Optimal+Fairness-Utility+Amortizations+in+Rankings+with+a+DBN+Exposure+Model)|0| +|[Pareto-Optimal Fairness-Utility Amortizations in Rankings with a DBN Exposure Model](https://doi.org/10.1145/3477495.3532036)|Till Kletti, JeanMichel Renders, Patrick Loiseau|Naver Labs Europe, Meylan, France; Univ. Grenoble Alpes, Inria, CNRS, Grenoble INP, LIG, Grenoble, France|In recent years, it has become clear that rankings delivered in many areas need not only be useful to the users but also respect fairness of exposure for the item producers. We consider the problem of finding ranking policies that achieve a Pareto-optimal tradeoff between these two aspects. Several methods were proposed to solve it; for instance a popular one is to use linear programming with a Birkhoff-von Neumann decomposition. These methods, however, are based on a classical Position Based exposure Model (PBM), which assumes independence between the items (hence the exposure only depends on the rank). In many applications, this assumption is unrealistic and the community increasingly moves towards considering other models that include dependences, such as the Dynamic Bayesian Network (DBN) exposure model. For such models, computing (exact) optimal fair ranking policies remains an open question. In this paper, we answer this question by leveraging a new geometrical method based on the so-called expohedron proposed recently for the PBM (Kletti et al., WSDM'22). We lay out the structure of a new geometrical object (the DBN-expohedron), and propose for it a Carathéodory decomposition algorithm of complexity $O(n^3)$, where n is the number of documents to rank. Such an algorithm enables expressing any feasible expected exposure vector as a distribution over at most n rankings; furthermore we show that we can compute the whole set of Pareto-optimal expected exposure vectors with the same complexity $O(n^3)$. Our work constitutes the first exact algorithm able to efficiently find a Pareto-optimal distribution of rankings. It is applicable to a broad range of fairness notions, including classical notions of meritocratic and demographic fairness. 
We empirically evaluate our method on the TREC2020 and MSLR datasets and compare it to several baselines in terms of Pareto-optimality and speed.|近年来,很明显,在许多领域提供的排名不仅需要对用户有用,而且还要尊重项目制作者的公平曝光。我们考虑的问题是找到在这两个方面之间实现帕累托最优权衡的排序策略。人们提出了几种方法来解决这个问题,例如,一种流行的方法是使用伯克霍夫-冯诺依曼分解的线性规划。然而,这些方法是基于经典的基于位置的曝光模型(PBM) ,该模型假设项目之间的独立性(因此曝光只取决于排名)。在许多应用程序中,这种假设是不现实的,社区越来越倾向于考虑包含依赖关系的其他模型,例如动态贝氏网路暴露模型。对于这样的模型,计算(精确的)最优公平排序策略仍然是一个悬而未决的问题。在本文中,我们利用一种基于最近为 PBM 提出的所谓 expohedron 的新几何方法来回答这个问题(Kletti 等,WSDM’22)。我们给出了一个新的几何对象(DBN-expohedron)的结构,并提出了一个复杂度为 $O(n^3)$ 的 Carathéodory 分解算法,其中 n 是要排序的文档数。这种算法能够将任何可行的期望暴露向量表示为至多 n 个排名上的分布;此外,我们表明,我们可以以相同的复杂度 $O(n^3)$ 计算出帕累托最优期望暴露向量的整个集合。我们的工作构成了第一个精确的算法,能够有效地找到排名的帕累托最优分布。它适用于广泛的公平概念,包括精英统治和人口统计公平的经典概念。我们在 TREC2020和 MSLR 数据集上经验性地评估了我们的方法,并将其与几个基线在帕累托最优性和速度方面进行了比较。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Pareto-Optimal+Fairness-Utility+Amortizations+in+Rankings+with+a+DBN+Exposure+Model)|0| |[Risk-Sensitive Deep Neural Learning to Rank](https://doi.org/10.1145/3477495.3532056)|Pedro Henrique Silva Rodrigues, Daniel Xavier de Sousa, Thierson Couto Rosa, Marcos André Gonçalves|Federal University of Minas Gerais - UFMG, Belo Horizonte, Brazil; Federal University of Goiás - UFG, Goiânia, Brazil; Federal Institute of Goiás - IFG, Anápolis, Brazil|Learning to Rank (L2R) is the core task of many Information Retrieval systems. Recently, a great effort has been put on exploring Deep Neural Networks (DNNs) for L2R, with significant results. However, risk-sensitiveness, an important and recent advance in the L2R arena, which reduces variability and increases trust, has not been incorporated into Deep Neural L2R yet. Risk-sensitive measures are important to assess the risk of an IR system performing worse than a set of baseline IR systems for several queries. However, the risk-sensitive measures described in the literature have a non-smooth behavior, making them difficult, if not impossible, to be optimized by DNNs. In this work we solve this difficult problem by proposing a family of new loss functions -- RiskLoss -- that support a smooth risk-sensitive optimization. RiskLoss introduces two important contributions: (i) the substitution of the traditional NDCG or MAP metrics in risk-sensitive measures with smooth loss functions that evaluate the correlation between the predicted and the true relevance order of documents for a given query and (ii) the use of distinct versions of the same DNN architecture as baselines by means of a multi-dropout technique during the smooth risk-sensitive optimization, avoiding the inconvenience of assessing multiple IR systems as part of DNN training. We empirically demonstrate significant achievements of the proposed RiskLoss functions when used with recent DNN methods in the context of well-known web-search datasets such as WEB10K, YAHOO, and MQ2007. Our solutions reach improvements of 8% in effectiveness (NDCG) while improving risk-sensitiveness (the GeoRisk measure) by around 5% when applied together with a state-of-the-art Self-Attention DNN-L2R architecture. Furthermore, RiskLoss is capable of reducing the losses over the best evaluated baselines by 28% and significantly improving over the risk-sensitive state-of-the-art non-DNN method (by up to 13.3%) while keeping (or even increasing) overall effectiveness.
All these results ultimately establish a new level for the state-of-the-art on risk-sensitiveness and DNN-L2R research.|学习排名(L2R)是许多信息检索系统的核心任务。近年来,针对 L2R 的深层神经网络(DNN)的研究取得了显著的成果。然而,风险敏感性,一个重要的和最近在 L2R 领域的进展,减少变异性和增加信任,尚未被纳入深层神经 L2R。风险敏感度量对于评估一个 IR 系统在几个查询中的性能低于一组基准 IR 系统的风险非常重要。然而,文献中描述的风险敏感性措施有一个不平滑的行为,使他们难以,如果不是不可能,被 DNN 优化。在这项工作中,我们通过提出一系列新的损失函数——风险损失——来解决这个难题,这些函数支持平稳的风险敏感优化。风险损失引入了两个重要贡献: (i)用平滑损失函数替换风险敏感度量中的传统 NDCG 或 MAP 指标,评估给定查询的文档的预测和真实相关顺序之间的相关性; (ii)通过平滑风险敏感性优化期间的多退出技术使用相同 DNN 架构的不同版本作为基线,避免了评估多个 IR 系统作为 DNN 训练的一部分的不便。当与最近的 DNN 方法在诸如 WEB10K,YAHOO 和 MQ2007等著名的网络搜索数据集的背景下使用时,我们经验性地证明了所提出的风险损失函数的显着成就。我们的解决方案在与最先进的自我注意 DNN-L2R 架构一起应用时,有效性(NDCG)提高了8% ,而风险敏感性(风险测量)提高了约5% 。此外,风险损失能够比最佳评估基线减少28% 的损失,并且比风险敏感的最先进的非 DNN 方法(高达13.3%)显着改善,同时保持(甚至增加)总体有效性。所有这些结果最终为风险敏感性和 DNN-L2R 研究的最新水平奠定了一个新的基础。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Risk-Sensitive+Deep+Neural+Learning+to+Rank)|0| |[Adaptable Text Matching via Meta-Weight Regulator](https://doi.org/10.1145/3477495.3531932)|Bo Zhang, Chen Zhang, Fang Ma, Dawei Song|Beijing Institute of Technology, Beijing, China|Neural text matching models have been used in a range of applications such as question answering and natural language inference, and have yielded a good performance. However, these neural models are of a limited adaptability, resulting in a decline in performance when encountering test examples from a different dataset or even a different task. The adaptability is particularly important in the few-shot setting: in many cases, there is only a limited amount of labeled data available for a target dataset or task, while we may have access to a richly labeled source dataset or task. However, adapting a model trained on the abundant source data to a few-shot target dataset or task is challenging. To tackle this challenge, we propose a Meta-Weight Regulator (MWR), which is a meta-learning approach that learns to assign weights to the source examples based on their relevance to the target loss. Specifically, MWR first trains the model on the uniformly weighted source examples, and measures the efficacy of the model on the target examples via a loss function. By iteratively performing a (meta) gradient descent, high-order gradients are propagated to the source examples. These gradients are then used to update the weights of source examples, in a way that is relevant to the target performance. As MWR is model-agnostic, it can be applied to any backbone neural model. Extensive experiments are conducted with various backbone text matching models, on four widely used datasets and two tasks. 
The results demonstrate that our proposed approach significantly outperforms a number of existing adaptation methods and effectively improves the cross-dataset and cross-task adaptability of the neural text matching models in the few-shot setting.|神经文本匹配模型已经在问答、自然语言推理等领域得到了广泛的应用,并取得了良好的效果。然而,这些神经模型的适应性有限,当遇到来自不同数据集甚至不同任务的测试例子时,会导致性能下降。适应性在少镜头设置中尤其重要: 在许多情况下,目标数据集或任务只有有限数量的标记数据可用,而我们可以访问标记丰富的源数据集或任务。然而,将一个基于大量源数据训练的模型应用于少量目标数据集或任务是具有挑战性的。为了应对这一挑战,我们提出了一种元权重调节器(MWR) ,它是一种元学习方法,学习根据源示例与目标损失的相关性为其分配权重。具体来说,MWR 首先在均匀加权的源例子上训练模型,然后通过损失函数来度量模型对目标例子的有效性。通过迭代执行一个(元)梯度下降法,高阶梯度被传播到源示例。然后使用这些渐变来更新源示例的权重,其方式与目标性能相关。由于 MWR 是模型无关的,因此它可以应用于任何骨干神经网络模型。在四个广泛使用的数据集和两个任务上,使用各种骨干文本匹配模型进行了广泛的实验。结果表明,本文提出的方法明显优于现有的一些自适应方法,有效地提高了神经元文本匹配模型在少镜头情况下的跨数据集和跨任务适应性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Adaptable+Text+Matching+via+Meta-Weight+Regulator)|0| |[Re-thinking Knowledge Graph Completion Evaluation from an Information Retrieval Perspective](https://doi.org/10.1145/3477495.3532052)|Ying Zhou, Xuanang Chen, Ben He, Zheng Ye, Le Sun|University of Chinese Academy of Sciences & Institute of Software, Chinese Academy of Sciences, Beijing, China; South-Central University for Nationalities, Wuhan, China; Institute of Software, Chinese Academy of Sciences, Beijing, China|Knowledge graph completion (KGC) aims to infer missing knowledge triples based on known facts in a knowledge graph. Current KGC research mostly follows an entity ranking protocol, wherein the effectiveness is measured by the predicted rank of a masked entity in a test triple. The overall performance is then given by a micro(-average) metric over all individual answer entities. Due to the incomplete nature of the large-scale knowledge bases, such an entity ranking setting is likely affected by unlabelled top-ranked positive examples, raising questions on whether the current evaluation protocol is sufficient to guarantee a fair comparison of KGC systems. To this end, this paper presents a systematic study on whether and how the label sparsity affects the current KGC evaluation with the popular micro metrics. Specifically, inspired by the TREC paradigm for large-scale information retrieval (IR) experimentation, we create a relatively "complete" judgment set based on a sample from the popular FB15k-237 dataset following the TREC pooling method. According to our analysis, it comes as a surprise that switching from the original labels to our "complete" labels results in a drastic change of system ranking of a variety of 13 popular KGC models in terms of micro metrics. Further investigation indicates that the IR-like macro(-average) metrics are more stable and discriminative under different settings, meanwhile, less affected by label sparsity. 
Thus, for KGC evaluation, we recommend conducting TREC-style pooling to balance between human efforts and label completeness, and reporting also the IR-like macro metrics to reflect the ranking nature of the KGC task.|知识图完成(KGC)是基于知识图中已知事实推断出缺失的知识三元组。目前的 KGC 研究大多遵循一个实体排名协议,其中的有效性是衡量一个被掩盖的实体在一个测试三元组的预测排名。然后,通过对所有单个答案实体的微观(平均)度量给出总体表现。由于大规模知识库的不完整性,这种实体排名设置可能会受到没有标记的排名最高的积极实例的影响,从而引起目前的评价议定书是否足以保证公平比较 KGC 系统的问题。为此,本文利用当前流行的微观指标,对标签稀疏性是否以及如何影响当前 KGC 评价进行了系统的研究。具体来说,受到 TREC 大规模信息检索(IR)实验范例的启发,我们创建了一个相对“完整”的判断集,该判断集基于流行的 FB15k-237数据集的样本,采用 TREC 汇集方法。根据我们的分析,令人惊讶的是,从原始标签切换到我们的“完整”标签导致系统排名的急剧变化的各种13个流行的 KGC 模型在微观指标方面。进一步的研究表明,类 IR 宏(平均)指标在不同的设置下更加稳定和具有区分性,同时受标签稀疏性的影响较小。因此,对于 KGC 评估,我们建议进行 TREC 风格的池来平衡人工努力和标签完整性,并报告类似 IR 的宏指标来反映 KGC 任务的排名性质。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Re-thinking+Knowledge+Graph+Completion+Evaluation+from+an+Information+Retrieval+Perspective)|0| |[CRET: Cross-Modal Retrieval Transformer for Efficient Text-Video Retrieval](https://doi.org/10.1145/3477495.3531960)|Kaixiang Ji, Jiajia Liu, Weixiang Hong, Liheng Zhong, Jian Wang, Jingdong Chen, Wei Chu|Ant Group, Hangzhou, China|Given a text query, the text-to-video retrieval task aims to find the relevant videos in the database. Recently, model-based (MDB) methods have demonstrated superior accuracy than embedding-based (EDB) methods due to their excellent capacity of modeling local video/text correspondences, especially when equipped with large-scale pre-training schemes like ClipBERT. Generally speaking, MDB methods take a text-video pair as input and harness deep models to predict the mutual similarity, while EDB methods first utilize modality-specific encoders to extract embeddings for text and video, then evaluate the distance based on the extracted embeddings. Notably, MDB methods cannot produce explicit representations for text and video, instead, they have to exhaustively pair the query with every database item to predict their mutual similarities in the inference stage, which results in significant inefficiency in practical applications. In this work, we propose a novel EDB method CRET (Cross-modal REtrieval Transformer), which not only demonstrates promising efficiency in retrieval tasks, but also achieves better accuracy than existing MDB methods. The credits are mainly attributed to our proposed Cross-modal Correspondence Modeling (CCM) module and Gaussian Estimation of Embedding Space (GEES) loss. Specifically, the CCM module is composed by transformer decoders and a set of decoder centers. With the help of the learned decoder centers, the text/video embeddings can be efficiently aligned, without suffering from pairwise model-based inference. Moreover, to balance the information loss and computational overhead when sampling frames from a given video, we present a novel GEES loss, which implicitly conducts dense sampling in the video embedding space, without suffering from heavy computational cost. 
Extensive experiments show that without pre-training on extra datasets, our proposed CRET outperforms the state-of-the-art MDB methods that were pre-trained on additional datasets, meanwhile still shows promising efficiency in retrieval tasks.|给定一个文本查询,文本到视频检索任务的目的是在数据库中找到相关的视频。近年来,基于模型(MDB)的方法由于其优异的局部视频/文本对应建模能力而显示出优于基于嵌入的方法的准确性,特别是当配备了大规模的预训练方案,如 ClipBERT。一般来说,MDB 方法以文本-视频对为输入,利用深度模型来预测相似度,而 EDB 方法首先利用特定于模态的编码器来提取文本和视频的嵌入,然后根据提取的嵌入来评估距离。值得注意的是,MDB 方法不能生成文本和视频的显式表示,相反,它们必须将查询与每个数据库项穷举地配对,以预测它们在推理阶段的相似性,这导致了实际应用中的显著效率低下。本文提出了一种新的多模态检索转换器(CRET)方法,该方法不仅在检索任务中表现出良好的效率,而且比现有的 MDB 方法具有更高的准确率。这主要归功于我们提出的交叉模态对应建模(CCM)模块和嵌入空间的高斯估计(GEES)损失。具体来说,CCM 模块由变压器解码器和一组解码中心组成。借助于所学习的解码中心,文本/视频嵌入可以有效地对齐,而不会受到基于成对模型的推理的影响。此外,为了平衡从给定视频帧采样时的信息损失和计算开销,我们提出了一种新的 GEES 损失算法,该算法在视频嵌入空间中隐式地进行密集采样,不需要承担大量的计算开销。大量的实验表明,在不对额外数据集进行预训练的情况下,我们提出的 CRET 方法优于对额外数据集进行预训练的最先进的 MDB 方法,同时在检索任务中仍然显示出有希望的效率。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=CRET:+Cross-Modal+Retrieval+Transformer+for+Efficient+Text-Video+Retrieval)|0| -|[Learn from Unlabeled Videos for Near-duplicate Video Retrieval](https://doi.org/10.1145/3477495.3532010)|Xiangteng He, Yulin Pan, Mingqian Tang, Yiliang Lv, Yuxin Peng|Alibaba Group, Hangzhou, China; Peking University, Beijing, China|Near-duplicate video retrieval (NDVR) aims to find the copies or transformations of the query video from a massive video database. It plays an important role in many video related applications, including copyright protection, tracing, filtering and etc. Video representation and similarity search are crucial to any video retrieval system. To derive effective video representation, most video retrieval systems require a large amount of manually annotated data for training, making it costly inefficient. In addition, most retrieval systems are based on frame-level features for video similarity searching, making it expensive both storage wise and search wise. To address the above issues, we propose a video representation learning (VRL) approach to effectively address the above shortcomings. It first effectively learns video representation from unlabeled videos via contrastive learning to avoid the expensive cost of manual annotation. Then, it exploits transformer structure to aggregate frame-level features into clip-level to reduce both storage space and search complexity. It can learn the complementary and discriminative information from the interactions among clip frames, as well as acquire the frame permutation and missing invariant ability to support more flexible retrieval manners. 
Comprehensive experiments on two challenging near-duplicate video retrieval datasets, namely FIVR-200K and SVD, verify the effectiveness of our proposed VRL approach, which achieves the best performance of video retrieval on accuracy and efficiency.|近重复视频检索(NDVR)的目标是从海量视频数据库中查找查询视频的副本或变换。它在许多视频相关应用中起着重要作用,包括版权保护、跟踪、过滤等。视频表示和最近邻搜索对于任何视频检索系统都至关重要。为了获得有效的视频表示,大多数视频检索系统需要大量的人工注释数据进行训练,这使得系统效率低下。此外,大多数检索系统基于帧级特征进行视频相似性搜索,这使得存储和搜索成本都很高。针对上述问题,本文提出了一种视频表示学习(VRL)方法,有效地解决了上述问题。它首先通过对比学习有效地从未标记的视频中学习视频表示,从而避免了人工标注的昂贵成本。然后,利用变压器结构将帧级特征聚合为剪辑级特征,降低存储空间和搜索复杂度。它可以从剪辑帧之间的交互中学习互补信息和鉴别信息,获得帧排列和缺失不变量能力,支持更灵活的检索方式。通过对 FIVR-200K 和 SVD 两个具有挑战性的近重复视频检索数据集的综合实验,验证了本文提出的 VRL 方法的有效性,在准确性和效率方面达到了最佳的视频检索性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learn+from+Unlabeled+Videos+for+Near-duplicate+Video+Retrieval)|0| +|[Learn from Unlabeled Videos for Near-duplicate Video Retrieval](https://doi.org/10.1145/3477495.3532010)|Xiangteng He, Yulin Pan, Mingqian Tang, Yiliang Lv, Yuxin Peng|Peking University, Beijing, China; Alibaba Group, Hangzhou, China|Near-duplicate video retrieval (NDVR) aims to find the copies or transformations of the query video from a massive video database. It plays an important role in many video related applications, including copyright protection, tracing, filtering, etc. Video representation and similarity search are crucial to any video retrieval system. To derive effective video representation, most video retrieval systems require a large amount of manually annotated data for training, making it costly and inefficient. In addition, most retrieval systems are based on frame-level features for video similarity searching, making it expensive both storage-wise and search-wise. To address the above issues, we propose a video representation learning (VRL) approach. It first effectively learns video representation from unlabeled videos via contrastive learning to avoid the expensive cost of manual annotation. Then, it exploits transformer structure to aggregate frame-level features into clip-level to reduce both storage space and search complexity. It can learn the complementary and discriminative information from the interactions among clip frames, as well as acquire the frame permutation and missing invariant ability to support more flexible retrieval manners.
Comprehensive experiments on two challenging near-duplicate video retrieval datasets, namely FIVR-200K and SVD, verify the effectiveness of our proposed VRL approach, which achieves the best performance of video retrieval on accuracy and efficiency.|近重复视频检索(NDVR)的目标是从海量视频数据库中查找查询视频的副本或变换。它在许多视频相关应用中起着重要作用,包括版权保护、跟踪、过滤等。视频表示和最近邻搜索对于任何视频检索系统都至关重要。为了获得有效的视频表示,大多数视频检索系统需要大量的人工注释数据进行训练,这使得系统效率低下。此外,大多数检索系统基于帧级特征进行视频相似性搜索,这使得存储和搜索成本都很高。针对上述问题,本文提出了一种视频表示学习(VRL)方法,有效地解决了上述问题。它首先通过对比学习有效地从未标记的视频中学习视频表示,从而避免了人工标注的昂贵成本。然后,利用变压器结构将帧级特征聚合为剪辑级特征,降低存储空间和搜索复杂度。它可以从剪辑帧之间的交互中学习互补信息和鉴别信息,获得帧排列和缺失不变量能力,支持更灵活的检索方式。通过对 FIVR-200K 和 SVD 两个具有挑战性的近重复视频检索数据集的综合实验,验证了本文提出的 VRL 方法的有效性,在准确性和效率方面达到了最佳的视频检索性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learn+from+Unlabeled+Videos+for+Near-duplicate+Video+Retrieval)|0| |[Progressive Learning for Image Retrieval with Hybrid-Modality Queries](https://doi.org/10.1145/3477495.3532047)|Yida Zhao, Yuqing Song, Qin Jin|Renmin University of China, Beijing, China|Image retrieval with hybrid-modality queries, also known as composing text and image for image retrieval (CTI-IR), is a retrieval task where the search intention is expressed in a more complex query format, involving both vision and text modalities. For example, a target product image is searched using a reference product image along with text about changing certain attributes of the reference image as the query. It is a more challenging image retrieval task that requires both semantic space learning and cross-modal fusion. Previous approaches that attempt to deal with both aspects achieve unsatisfactory performance. In this paper, we decompose the CTI-IR task into a three-stage learning problem to progressively learn the complex knowledge for image retrieval with hybrid-modality queries. We first leverage the semantic embedding space for open-domain image-text retrieval, and then transfer the learned knowledge to the fashion-domain with fashion-related pre-training tasks. Finally, we enhance the pre-trained model from single-query to hybrid-modality query for the CTI-IR task. Furthermore, as the contribution of individual modality in the hybrid-modality query varies for different retrieval scenarios, we propose a self-supervised adaptive weighting strategy to dynamically determine the importance of image and text in the hybrid-modality query for better retrieval. 
Extensive experiments show that our proposed model significantly outperforms state-of-the-art methods in the mean of Recall@K by 24.9% and 9.5% on the Fashion-IQ and Shoes benchmark datasets, respectively.|基于混合模态查询的图像检索,即组合文本和图像进行图像检索(CTI-IR) ,是一种以更复杂的查询格式表达搜索意图的检索任务,涉及视觉和文本模态。例如,使用参考产品图像以及关于将参考图像的某些属性更改为查询的文本搜索目标产品图像。这是一项更具挑战性的图像检索任务,既需要语义空间学习,也需要跨模态融合。以前试图同时处理这两个方面的方法的性能都不令人满意。本文将 CTI-IR 任务分解为三阶段学习问题,逐步学习用于图像检索的复杂知识。我们首先利用语义嵌入空间进行开放领域的图文检索,然后利用与时尚相关的预训练任务将所学知识转移到时尚领域。最后,对 CTI-IR 任务的预训练模型进行了改进,从单查询模型改进为混合模态查询模型。此外,由于个体模态在混合模态查询中的贡献因检索场景的不同而不同,我们提出了一种自监督自适应加权策略,动态确定图像和文本在混合模态查询中的重要性,以便更好地检索。大量的实验表明,在 Fashion-IQ 和 Shoes 基准数据集上,我们提出的模型在平均 Recall@K 方面明显优于最先进的方法,分别提高了24.9% 和9.5% 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Progressive+Learning+for+Image+Retrieval+with+Hybrid-Modality+Queries)|0| |[Incorporating Explicit Knowledge in Pre-trained Language Models for Passage Re-ranking](https://doi.org/10.1145/3477495.3531997)|Qian Dong, Yiding Liu, Suqi Cheng, Shuaiqiang Wang, Zhicong Cheng, Shuzi Niu, Dawei Yin|Institute of Software, Chinese Academy of Sciences, Beijing, China; Baidu Inc., Beijing, China|Passage re-ranking aims to obtain a permutation over the candidate passage set from the retrieval stage. Re-rankers have boomed with Pre-trained Language Models (PLMs) due to their overwhelming advantages in natural language understanding. However, existing PLM-based re-rankers may easily suffer from vocabulary mismatch and lack of domain specific knowledge. To alleviate these problems, explicit knowledge contained in a knowledge graph is carefully introduced in our work. Specifically, we employ the existing knowledge graph, which is incomplete and noisy, and first apply it to the passage re-ranking task. To leverage reliable knowledge, we propose a novel knowledge graph distillation method and obtain a knowledge meta graph as the bridge between query and passage. To align both kinds of embedding in the latent space, we employ a PLM as the text encoder and a graph neural network over the knowledge meta graph as the knowledge encoder. Besides, a novel knowledge injector is designed for the dynamic interaction between text and knowledge encoder. Experimental results demonstrate the effectiveness of our method, especially on queries requiring in-depth domain knowledge.|文章重新排序是从检索阶段对候选文章集合进行排列。由于预训练语言模型在自然语言理解方面具有压倒性的优势,重新排名的语言模型得到了蓬勃发展。然而,现有的基于 PLM 的重新排名可能很容易受到词汇不匹配和缺乏领域特定知识的影响。为了解决这些问题,我们在工作中仔细介绍了知识图表中的外显知识。具体地说,我们利用现有的不完备且有噪声的知识图,首先将其应用于段落重排任务。为了利用可靠的知识,我们提出了一种新的知识图提取方法,并得到一个知识元图作为查询和文章之间的桥梁。为了在潜空间中对齐这两种嵌入,我们采用 PLM 作为文本编码器,知识元图上的图形神经网络作为知识编码器。此外,还设计了一种新颖的知识注入器,用于文本和知识编码器之间的动态交互。实验结果表明了该方法的有效性,特别是在需要深度领域知识的查询中。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Incorporating+Explicit+Knowledge+in+Pre-trained+Language+Models+for+Passage+Re-ranking)|0| -|[Axiomatically Regularized Pre-training for Ad hoc Search](https://doi.org/10.1145/3477495.3531943)|Jia Chen, Yiqun Liu, Yan Fang, Jiaxin Mao, Hui Fang, Shenghao Yang, Xiaohui Xie, Min Zhang, Shaoping Ma|Tsinghua University, Beijing, China; University of Delaware, Newark, DE, USA; Renmin University of China, Beijing, China|Recently, pre-training methods tailored for IR tasks have achieved great success. However, as the mechanisms behind the performance improvement remain under-investigated, the interpretability and robustness of these pre-trained models still need to be improved. Axiomatic IR aims to identify a set of desirable properties expressed mathematically as formal constraints to guide the design of ranking models.
Existing studies have already shown that considering certain axioms may help improve the effectiveness and interpretability of IR models. However, there still lack efforts of incorporating these IR axioms into pre-training methodologies. To shed light on this research question, we propose a novel pre-training method with \underlineA xiomatic \underlineRe gularization for ad hoc \underlineS earch (ARES). In the ARES framework, a number of existing IR axioms are re-organized to generate training samples to be fitted in the pre-training process. These training samples then guide neural rankers to learn the desirable ranking properties. Compared to existing pre-training approaches, ARES is more intuitive and explainable. Experimental results on multiple publicly available benchmark datasets have shown the effectiveness of ARES in both full-resource and low-resource (e.g., zero-shot and few-shot) settings. An intuitive case study also indicates that ARES has learned useful knowledge that existing pre-trained models (e.g., BERT and PROP) fail to possess. This work provides insights into improving the interpretability of pre-trained models and the guidance of incorporating IR axioms or human heuristics into pre-training methods.|近年来,针对 IR 任务的预训练方法取得了很大的成功。然而,由于性能改进背后的机制仍然没有得到充分的研究,这些预先训练的模型的可解释性和鲁棒性仍然需要改进。公理 IR 旨在识别一组用数学方法表示为形式约束的理想属性,以指导排序模型的设计。现有的研究已经表明,考虑某些公理可能有助于提高红外模型的有效性和可解释性。然而,仍然缺乏将这些 IR 公理纳入预训练方法的努力。针对这一问题,本文提出了一种基于下划线公理化下划线正则化的自组织下划线搜索(ARES)预训练方法。在 ARES 框架中,对一些现有的信息检索公理进行了重新组织,以生成培训样本,用于培训前进程。这些训练样本然后指导神经排序学习理想的排序属性。与现有的预训练方法相比,ARES 更加直观和易于解释。在多个公开可用的基准数据集上的实验结果显示了 ARES 在全资源和低资源(例如,零拍摄和少拍摄)环境下的有效性。一个直观的案例研究还表明,ARES 已经学到了有用的知识,现有的预训练模型(例如,BERT 和 PROP)不能拥有。这项工作为提高预训练模型的可解释性提供了见解,并指导将 IR 公理或人类启发式融入预训练方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Axiomatically+Regularized+Pre-training+for+Ad+hoc+Search)|0| -|[On the Role of Relevance in Natural Language Processing Tasks](https://doi.org/10.1145/3477495.3532034)|Artsiom Sauchuk, James Thorne, Alon Y. Halevy, Nicola Tonellotto, Fabrizio Silvestri|Sapienza University of Rome, Roma, Italy; Meta AI, Menlo Park, CA, USA; Cambridge University, London, United Kingdom; University of Pisa, Pisa, Italy|Many recent Natural Language Processing (NLP) task formulations, such as question answering and fact verification, are implemented as a two-stage cascading architecture. In the first stage an IR system retrieves "relevant'' documents containing the knowledge, and in the second stage an NLP system performs reasoning to solve the task. Optimizing the IR system for retrieving relevant documents ensures that the NLP system has sufficient information to operate over. These recent NLP task formulations raise interesting and exciting challenges for IR, where the end-user of an IR system is not a human with an information need, but another system exploiting the documents retrieved by the IR system to perform reasoning and address the user information need. Among these challenges, as we will show, is that noise from the IR system, such as retrieving spurious or irrelevant documents, can negatively impact the accuracy of the downstream reasoning module. Hence, there is the need to balance maximizing relevance while minimizing noise in the IR system. This paper presents experimental results on two NLP tasks implemented as a two-stage cascading architecture. We show how spurious or irrelevant retrieved results from the first stage can induce errors in the second stage. 
We use these results to ground our discussion of the research challenges that the IR community should address in the context of these knowledge-intensive NLP tasks.|近年来,自然语言处理(NLP)的许多任务公式,如问题回答和事实验证,都是作为一个两阶段级联结构实现的。在第一阶段,IR 系统检索包含知识的“相关”文档,在第二阶段,NLP 系统执行推理来解决任务。优化检索相关文件的 IR 系统,确保自然语言处理系统有足够的信息进行操作。这些最新的 NLP 任务公式为 IR 提出了有趣和令人兴奋的挑战,其中 IR 系统的最终用户不是一个有信息需求的人,而是另一个利用 IR 系统检索到的文档进行推理并满足用户信息需求的系统。在这些挑战中,正如我们将要展示的,来自 IR 系统的噪音,例如检索虚假或不相关的文档,可能会对下游推理模块的准确性产生负面影响。因此,需要在最大相关性和最小噪声之间取得平衡。本文给出了两个实现为两级级联结构的自然语言处理任务的实验结果。我们展示了如何伪造或不相关的检索结果从第一阶段可以导致错误的第二阶段。我们使用这些结果来基础我们的研究挑战的讨论,IR 社区应该在这些知识密集型的自然语言处理任务的背景下解决。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=On+the+Role+of+Relevance+in+Natural+Language+Processing+Tasks)|0| -|[Adversarial Graph Perturbations for Recommendations at Scale](https://doi.org/10.1145/3477495.3531763)|Huiyuan Chen, Kaixiong Zhou, KweiHerng Lai, Xia Hu, Fei Wang, Hao Yang|Rice University, Houston, TX, USA; Visa Research, Palo Alto, CA, USA|Graph Neural Networks (GNNs) provide a class of powerful architectures that are effective for graph-based collaborative filtering. Nevertheless, GNNs are known to be vulnerable to adversarial perturbations. Adversarial training is a simple yet effective way to improve the robustness of neural models. For example, many prior studies inject adversarial perturbations into either node features or hidden layers of GNNs. However, perturbing graph structures has been far less studied in recommendations. To bridge this gap, we propose AdvGraph to model adversarial graph perturbations during the training of GNNs. Our AdvGraph is mainly based on min-max robust optimization, where a universal graph perturbation is obtained through an inner maximization while the outer optimization aims to compute the model parameters of GNNs. However, directly optimizing the inner problem is challenging due to the discrete nature of the graph perturbations. To address this issue, an unbiased gradient estimator is further proposed to compute the gradients of discrete variables. Extensive experiments demonstrate that our AdvGraph is able to enhance the generalization performance of GNN-based recommenders.|图形神经网络(GNN)为基于图形的协同过滤提供了一种强大的架构。然而,GNN 是众所周知的脆弱的对抗性扰动。对抗训练是提高神经模型鲁棒性的一种简单而有效的方法。例如,许多先前的研究将对抗扰动注入到 GNN 的节点特征或隐层中。然而,令人不安的图形结构在推荐中却很少被研究。为了弥补这一差距,我们提出了 AdvGraph 来模拟 GNN 训练过程中的对抗图扰动。我们的 AdvGraph 主要是基于最小-最大鲁棒优化,其中通过内部最大化获得通用图摄动,而外部优化的目的是计算 GNN 的模型参数。然而,直接优化的内部问题是具有挑战性的,由于离散性质的图摄动。为了解决这一问题,进一步提出了一种无偏的梯度估计器来计算离散变量的梯度。大量的实验表明,我们的 AdvGraph 能够提高基于 GNN 的推荐器的泛化性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Adversarial+Graph+Perturbations+for+Recommendations+at+Scale)|0| +|[Axiomatically Regularized Pre-training for Ad hoc Search](https://doi.org/10.1145/3477495.3531943)|Jia Chen, Yiqun Liu, Yan Fang, Jiaxin Mao, Hui Fang, Shenghao Yang, Xiaohui Xie, Min Zhang, Shaoping Ma|Renmin University of China, Beijing, China; University of Delaware, Newark, DE, USA; Tsinghua University, Beijing, China|Recently, pre-training methods tailored for IR tasks have achieved great success. However, as the mechanisms behind the performance improvement remain under-investigated, the interpretability and robustness of these pre-trained models still need to be improved. Axiomatic IR aims to identify a set of desirable properties expressed mathematically as formal constraints to guide the design of ranking models. Existing studies have already shown that considering certain axioms may help improve the effectiveness and interpretability of IR models.
However, there still lack efforts of incorporating these IR axioms into pre-training methodologies. To shed light on this research question, we propose a novel pre-training method with Axiomatic Regularization for ad hoc Search (ARES). In the ARES framework, a number of existing IR axioms are re-organized to generate training samples to be fitted in the pre-training process. These training samples then guide neural rankers to learn the desirable ranking properties. Compared to existing pre-training approaches, ARES is more intuitive and explainable. Experimental results on multiple publicly available benchmark datasets have shown the effectiveness of ARES in both full-resource and low-resource (e.g., zero-shot and few-shot) settings. An intuitive case study also indicates that ARES has learned useful knowledge that existing pre-trained models (e.g., BERT and PROP) fail to possess. This work provides insights into improving the interpretability of pre-trained models and the guidance of incorporating IR axioms or human heuristics into pre-training methods.|近年来,针对 IR 任务的预训练方法取得了很大的成功。然而,由于性能改进背后的机制仍然没有得到充分的研究,这些预先训练的模型的可解释性和鲁棒性仍然需要改进。公理 IR 旨在识别一组用数学方法表示为形式约束的理想属性,以指导排序模型的设计。现有的研究已经表明,考虑某些公理可能有助于提高 IR 模型的有效性和可解释性。然而,仍然缺乏将这些 IR 公理纳入预训练方法的努力。针对这一问题,本文提出了一种基于公理化正则化的自组织搜索(ARES)预训练方法。在 ARES 框架中,对一些现有的信息检索公理进行了重新组织,以生成培训样本,用于培训前进程。这些训练样本然后指导神经排序学习理想的排序属性。与现有的预训练方法相比,ARES 更加直观和易于解释。在多个公开可用的基准数据集上的实验结果显示了 ARES 在全资源和低资源(例如,零拍摄和少拍摄)环境下的有效性。一个直观的案例研究还表明,ARES 已经学到了有用的知识,现有的预训练模型(例如,BERT 和 PROP)不能拥有。这项工作为提高预训练模型的可解释性提供了见解,并指导将 IR 公理或人类启发式融入预训练方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Axiomatically+Regularized+Pre-training+for+Ad+hoc+Search)|0| +|[On the Role of Relevance in Natural Language Processing Tasks](https://doi.org/10.1145/3477495.3532034)|Artsiom Sauchuk, James Thorne, Alon Y. Halevy, Nicola Tonellotto, Fabrizio Silvestri|Sapienza University of Rome, Roma, Italy; Cambridge University, London, United Kingdom; Meta AI, Menlo Park, CA, USA; University of Pisa, Pisa, Italy|Many recent Natural Language Processing (NLP) task formulations, such as question answering and fact verification, are implemented as a two-stage cascading architecture. In the first stage an IR system retrieves "relevant'' documents containing the knowledge, and in the second stage an NLP system performs reasoning to solve the task. Optimizing the IR system for retrieving relevant documents ensures that the NLP system has sufficient information to operate over. These recent NLP task formulations raise interesting and exciting challenges for IR, where the end-user of an IR system is not a human with an information need, but another system exploiting the documents retrieved by the IR system to perform reasoning and address the user information need. Among these challenges, as we will show, is that noise from the IR system, such as retrieving spurious or irrelevant documents, can negatively impact the accuracy of the downstream reasoning module. Hence, there is the need to balance maximizing relevance while minimizing noise in the IR system. This paper presents experimental results on two NLP tasks implemented as a two-stage cascading architecture. We show how spurious or irrelevant retrieved results from the first stage can induce errors in the second stage.
We use these results to ground our discussion of the research challenges that the IR community should address in the context of these knowledge-intensive NLP tasks.|近年来,自然语言处理(NLP)的许多任务公式,如问题回答和事实验证,都是作为一个两阶段级联结构实现的。在第一阶段,IR 系统检索包含知识的“相关”文档,在第二阶段,NLP 系统执行推理来解决任务。优化检索相关文件的 IR 系统,确保自然语言处理系统有足够的信息进行操作。这些最新的 NLP 任务公式为 IR 提出了有趣和令人兴奋的挑战,其中 IR 系统的最终用户不是一个有信息需求的人,而是另一个利用 IR 系统检索到的文档进行推理并满足用户信息需求的系统。在这些挑战中,正如我们将要展示的,来自 IR 系统的噪音,例如检索虚假或不相关的文档,可能会对下游推理模块的准确性产生负面影响。因此,需要在最大相关性和最小噪声之间取得平衡。本文给出了两个实现为两级级联结构的自然语言处理任务的实验结果。我们展示了如何伪造或不相关的检索结果从第一阶段可以导致错误的第二阶段。我们使用这些结果来基础我们的研究挑战的讨论,IR 社区应该在这些知识密集型的自然语言处理任务的背景下解决。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=On+the+Role+of+Relevance+in+Natural+Language+Processing+Tasks)|0| +|[Adversarial Graph Perturbations for Recommendations at Scale](https://doi.org/10.1145/3477495.3531763)|Huiyuan Chen, Kaixiong Zhou, KweiHerng Lai, Xia Hu, Fei Wang, Hao Yang|Visa Research, Palo Alto, CA, USA; Rice University, Houston, TX, USA|Graph Neural Networks (GNNs) provide a class of powerful architectures that are effective for graph-based collaborative filtering. Nevertheless, GNNs are known to be vulnerable to adversarial perturbations. Adversarial training is a simple yet effective way to improve the robustness of neural models. For example, many prior studies inject adversarial perturbations into either node features or hidden layers of GNNs. However, perturbing graph structures has been far less studied in recommendations. To bridge this gap, we propose AdvGraph to model adversarial graph perturbations during the training of GNNs. Our AdvGraph is mainly based on min-max robust optimization, where a universal graph perturbation is obtained through an inner maximization while the outer optimization aims to compute the model parameters of GNNs. However, directly optimizing the inner problem is challenging due to the discrete nature of the graph perturbations. To address this issue, an unbiased gradient estimator is further proposed to compute the gradients of discrete variables. Extensive experiments demonstrate that our AdvGraph is able to enhance the generalization performance of GNN-based recommenders.|图形神经网络(GNN)为基于图形的协同过滤提供了一种强大的架构。然而,GNN 是众所周知的脆弱的对抗性扰动。对抗训练是提高神经模型鲁棒性的一种简单而有效的方法。例如,许多先前的研究将对抗扰动注入到 GNN 的节点特征或隐层中。然而,令人不安的图形结构在推荐中却很少被研究。为了弥补这一差距,我们提出了 AdvGraph 来模拟 GNN 训练过程中的对抗图扰动。我们的 AdvGraph 主要是基于最小-最大鲁棒优化,其中通过内部最大化获得通用图摄动,而外部优化的目的是计算 GNN 的模型参数。然而,直接优化的内部问题是具有挑战性的,由于离散性质的图摄动。为了解决这一问题,进一步提出了一种无偏的梯度估计器来计算离散变量的梯度。大量的实验表明,我们的 AdvGraph 能够提高基于 GNN 的推荐器的泛化性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Adversarial+Graph+Perturbations+for+Recommendations+at+Scale)|0| |[Relevance under the Iceberg: Reasonable Prediction for Extreme Multi-label Classification](https://doi.org/10.1145/3477495.3531767)|JyunYu Jiang, WeiCheng Chang, Jiong Zhang, ChoJui Hsieh, HsiangFu Yu|University of California, Los Angeles, Los Angeles, CA, USA; Amazon Search, Palo Alto, CA, USA|In the era of big data, eXtreme Multi-label Classification (XMC) has already become one of the most essential research tasks to deal with enormous label spaces in machine learning applications. Instead of assessing every individual label, most XMC methods rely on label trees or filters to derive short ranked label lists as prediction, thereby reducing computational overhead. Specifically, existing studies obtain ranked label lists with a fixed length for prediction and evaluation. However, these predictions are unreasonable since data points have varied numbers of relevant labels.
The greatly small and large list lengths in evaluation, such as [email protected] and [email protected] , can also lead to the ignorance of other relevant labels or the tolerance of many irrelevant labels. In this paper, we aim to provide reasonable prediction for extreme multi-label classification with dynamic numbers of predicted labels. In particular, we propose a novel framework, Model-Agnostic List Truncation with Ordinal Regression (MALTOR), to leverage the ranking properties and truncate long ranked label lists for better accuracy. Extensive experiments conducted on six large-scale real-world benchmark datasets demonstrate that MALTOR significantly outperforms statistical baseline methods and conventional ranked list truncation methods in ad-hoc retrieval with both linear and deep XMC models. The results of an ablation study also shows the effectiveness of each individual component in our proposed MALTOR.|在大数据时代,极限多标签分类(XMC)已经成为处理机器学习应用中大量标签空间的重要研究课题之一。大多数 XMC 方法不是评估每个单独的标签,而是依靠标签树或过滤器来推导出短排名标签列表作为预测,从而减少计算开销。具体来说,现有的研究获得了具有固定长度的排名标签列表,用于预测和评价。然而,这些预测是不合理的,因为数据点有不同数量的相关标签。评估中的列表长度大大小小,如[ email protected ]和[ email protected ] ,也可能导致对其他相关标签的忽视或许多不相关标签的容忍。在本文中,我们的目的是提供合理的预测与预测标签的动态数量极端多标签分类。特别是,我们提出了一个新的框架,模型不可知列表与有序回归截断(MALTOR) ,以利用排名属性和截断长的排名标签列表,以更好的准确性。在六个大规模真实世界基准数据集上进行的大量实验表明,MALTOR 在线性和深度 XMC 模型的特别检索中显著优于统计基线方法和传统的排序列表截断方法。消融研究的结果也显示了我们提出的 MALTOR 中每个单独组件的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Relevance+under+the+Iceberg:+Reasonable+Prediction+for+Extreme+Multi-label+Classification)|0| |[Gating-adapted Wavelet Multiresolution Analysis for Exposure Sequence Modeling in CTR Prediction](https://doi.org/10.1145/3477495.3531771)|Xiaoxiao Xu, Zhiwei Fang, Qian Yu, Ruoran Huang, Chaosheng Fan, Yong Li, Yang He, Changping Peng, Zhangang Lin, Jingping Shao, Non Non|JD Com, Business Growth BU, Beijing, Peoples R China|The exposure sequence is being actively studied for user interest modeling in Click-Through Rate (CTR) prediction. However, the existing methods for exposure sequence modeling bring extensive computational burden and neglect noise problems, resulting in an excessively latency and the limited performance in online recommenders. In this paper, we propose to address the high latency and noise problems via Gating-adapted wavelet multiresolution analysis (Gama), which can effectively denoise the extremely long exposure sequence and adaptively capture the implied multi-dimension user interest with linear computational complexity. This is the first attempt to integrate non-parametric multiresolution analysis technique into deep neural network to model user exposure sequence. Extensive experiments on large scale benchmark dataset and real production dataset confirm the effectiveness of Gama for exposure sequence modeling, especially in cold-start scenarios. 
Benefiting from its low latency and high effectiveness, Gama has been deployed in our real large-scale industrial recommender, successfully serving over hundreds of millions of users.|我们正积极研究暴露次序,以建立用户兴趣模式,预测点进率。然而,现有的曝光序列建模方法存在计算量大、忽视噪声等问题,导致在线推荐系统的延迟过长,性能有限。在这篇文章中,我们提出利用门控自适应小波多解析度分析(Gama)来处理高延迟和噪声问题,它可以有效地去除极长曝光序列的噪声,并以线性计算复杂度自适应地捕捉隐含的多维用户兴趣。这是首次尝试将非参数多解析度分析技术与深层神经网络相结合,建立用户暴露序列模型。在大规模基准数据集和实际生产数据集上的大量实验证实了伽马方法对曝光序列建模的有效性,特别是在冷启动情况下。得益于它的低延迟和高效率,伽马已经部署在我们真正的大规模工业推荐,成功地为数亿用户服务。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Gating-adapted+Wavelet+Multiresolution+Analysis+for+Exposure+Sequence+Modeling+in+CTR+Prediction)|0| -|[Animating Images to Transfer CLIP for Video-Text Retrieval](https://doi.org/10.1145/3477495.3531776)|Yu Liu, Huai Chen, Lianghua Huang, Di Chen, Bin Wang, Pan Pan, Lisheng Wang|Shanghai Jiao Tong University, Shanghai, China; DAMO Academy, Alibaba Group, Beijing, China; DAMO Academy, Alibaba Group, Hangzhou, China|Recent works show the possibility of transferring the CLIP (Contrastive Language-Image Pretraining) model for video-text retrieval with promising performance. However, due to the domain gap between static images and videos, CLIP-based video-text retrieval models with interaction-based matching perform far worse than models with representation-based matching. In this paper, we propose a novel image animation strategy to transfer the image-text CLIP model to video-text retrieval effectively. By imitating the video shooting components, we convert widely used image-language corpus to synthesized video-text data for pretraining. To reduce the time complexity of interaction matching, we further propose a coarse to fine framework which consists of dual encoders for fast candidates searching and a cross-modality interaction module for fine-grained re-ranking. The coarse to fine framework with the synthesized video-text pretraining provides significant gains in retrieval accuracy while preserving efficiency. Comprehensive experiments conducted on MSR-VTT, MSVD, and VATEX datasets demonstrate the effectiveness of our approach.|最近的研究表明,将对比语言-图像预训练(CLIP)模型应用于视频文本检索具有良好的性能。然而,由于静态图像和视频之间存在领域差异,基于 CLIP 的基于交互匹配的视频文本检索模型的性能远远不如基于表示匹配的模型。本文提出了一种新的图像动画策略,将图像-文本 CLIP 模型有效地转化为视频-文本检索。通过模拟视频拍摄组件,将广泛使用的图像语言语料库转换为合成的视频文本数据进行预训练。为了降低交互匹配的时间复杂度,我们进一步提出了一个由双编码器组成的快速候选搜索框架和一个交叉模式交互模块组成的细粒度重排序框架。综合视频文本预训练的粗细框架在保持检索效率的同时,提高了检索精度。在 MSR-VTT、 MSVD 和 VATEX 数据集上进行的综合实验证明了该方法的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Animating+Images+to+Transfer+CLIP+for+Video-Text+Retrieval)|0| -|[Image-Text Retrieval via Contrastive Learning with Auxiliary Generative Features and Support-set Regularization](https://doi.org/10.1145/3477495.3531783)|Lei Zhang, Min Yang, Chengming Li, Ruifeng Xu|Shenzhen Institutes of Advanced Technology, The Chinese Academy of Sciences, Shenzhen, China; Sun Yat-sen University, Shenzhen, China; Harbin Institute of Technology, Shenzhen, China; Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China|In this paper, we bridge the heterogeneity gap between different modalities and improve image-text retrieval by taking advantage of auxiliary image-to-text and text-to-image generative features with contrastive learning. Concretely, contrastive learning is devised to narrow the distance between the aligned image-text pairs and push apart the distance between the unaligned pairs from both inter- and intra-modality perspectives with the help of cross-modal retrieval features and auxiliary generative features.
In addition, we devise a support-set regularization term to further improve contrastive learning by constraining the distance between each image/text and its corresponding cross-modal support-set information contained in the same semantic category. To evaluate the effectiveness of the proposed method, we conduct experiments on three benchmark datasets (i.e., MIRFLICKR-25K, NUS-WIDE, MS COCO). Experimental results show that our model significantly outperforms the strong baselines for cross-modal image-text retrieval. For reproducibility, we submit the code and data publicly at: https://github.com/Hambaobao/CRCGS.|本文通过对比学习,利用图像到文本和文本到图像生成的辅助特征,弥补了不同检索方式之间的异质性差距,提高了图像-文本检索的性能。具体来说,对比学习是通过跨模态检索特征和辅助生成特征来缩小图像-文本对之间的距离,并从情态间和情态内的角度分离未对齐图像-文本对之间的距离。此外,我们设计了一个支持集正则化项来进一步改善对比学习,约束图像/文本之间的距离及其相应的跨模态支持集信息包含在同一语义范畴。为了评估该方法的有效性,我们对三个基准数据集(即 MIRFLICKR-25K,NUS-WIDE,MS COCO)进行了实验。实验结果表明,该模型在图像文本检索中的性能明显优于强基线检索。为了重现性,我们公开在 https://github.com/Hambaobao/CRCGS 提交代码和数据。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Image-Text+Retrieval+via+Contrastive+Learning+with+Auxiliary+Generative+Features+and+Support-set+Regularization)|0| -|[Denoising Time Cycle Modeling for Recommendation](https://doi.org/10.1145/3477495.3531785)|Sicong Xie, Qunwei Li, Weidi Xu, Kaiming Shen, Shaohu Chen, Wenliang Zhong|Ant Group, Hangzhou, China; Ant Group, Shanghai, China; Ant Group, Beijing, China|Recently, modeling temporal patterns of user-item interactions has attracted much attention in recommender systems. We argue that existing methods ignore the variety of temporal patterns of user behaviors. We define the subset of user behaviors that are irrelevant to the target item as noises, which limits the performance of target-related time cycle modeling and affects the recommendation performance. In this paper, we propose Denoising Time Cycle Modeling (DiCycle), a novel approach to denoise user behaviors and select the subset of user behaviors that are highly related to the target item. DiCycle is able to explicitly model diverse time cycle patterns for recommendation. Extensive experiments are conducted on both public benchmarks and a real-world dataset, demonstrating the superior performance of DiCycle over the state-of-the-art recommendation methods.|近年来,推荐系统中用户-项目交互的时间模式建模引起了人们的广泛关注。我们认为现有的方法忽略了用户行为的时间模式的多样性。将与目标项无关的用户行为定义为噪声,限制了目标相关时间周期建模的性能,影响了推荐性能。本文提出了一种新的去噪时间周期建模方法(DiCycle) ,用于去除用户行为的噪声,并选择与目标项高度相关的用户行为子集。DiCycle 能够显式地为推荐建模不同的时间周期模式。在公共基准测试和真实世界数据集上进行了广泛的实验,证明了 DiCycle 优于最先进的推荐方法的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Denoising+Time+Cycle+Modeling+for+Recommendation)|0| +|[Animating Images to Transfer CLIP for Video-Text Retrieval](https://doi.org/10.1145/3477495.3531776)|Yu Liu, Huai Chen, Lianghua Huang, Di Chen, Bin Wang, Pan Pan, Lisheng Wang|DAMO Academy, Alibaba Group, Hangzhou, China; DAMO Academy, Alibaba Group, Beijing, China; Shanghai Jiao Tong University, Shanghai, China|Recent works show the possibility of transferring the CLIP (Contrastive Language-Image Pretraining) model for video-text retrieval with promising performance. However, due to the domain gap between static images and videos, CLIP-based video-text retrieval models with interaction-based matching perform far worse than models with representation-based matching. In this paper, we propose a novel image animation strategy to transfer the image-text CLIP model to video-text retrieval effectively.
By imitating the video shooting components, we convert widely used image-language corpus to synthesized video-text data for pretraining. To reduce the time complexity of interaction matching, we further propose a coarse to fine framework which consists of dual encoders for fast candidates searching and a cross-modality interaction module for fine-grained re-ranking. The coarse to fine framework with the synthesized video-text pretraining provides significant gains in retrieval accuracy while preserving efficiency. Comprehensive experiments conducted on MSR-VTT, MSVD, and VATEX datasets demonstrate the effectiveness of our approach.|最近的研究表明,将对比语言-图像预训练(CLIP)模型应用于视频文本检索具有良好的性能。然而,由于静态图像和视频之间存在领域差异,基于 CLIP 的基于交互匹配的视频文本检索模型的性能远远不如基于表示匹配的模型。本文提出了一种新的图像动画策略,将图像-文本 CLIP 模型有效地转化为视频-文本检索。通过模拟视频拍摄组件,将广泛使用的图像语言语料库转换为合成的视频文本数据进行预训练。为了降低交互匹配的时间复杂度,我们进一步提出了一个由双编码器组成的快速候选搜索框架和一个交叉模式交互模块组成的细粒度重排序框架。综合视频文本预训练的粗细框架在保持检索效率的同时,提高了检索精度。在 MSR-VTT、 MSVD 和 VATEX 数据集上进行的综合实验证明了该方法的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Animating+Images+to+Transfer+CLIP+for+Video-Text+Retrieval)|0| +|[Image-Text Retrieval via Contrastive Learning with Auxiliary Generative Features and Support-set Regularization](https://doi.org/10.1145/3477495.3531783)|Lei Zhang, Min Yang, Chengming Li, Ruifeng Xu|Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China; Shenzhen Institutes of Advanced Technology, The Chinese Academy of Sciences, Shenzhen, China; Sun Yat-sen University, Shenzhen, China; Harbin Institute of Technology, Shenzhen, China|In this paper, we bridge the heterogeneity gap between different modalities and improve image-text retrieval by taking advantage of auxiliary image-to-text and text-to-image generative features with contrastive learning. Concretely, contrastive learning is devised to narrow the distance between the aligned image-text pairs and push apart the distance between the unaligned pairs from both inter- and intra-modality perspectives with the help of cross-modal retrieval features and auxiliary generative features. In addition, we devise a support-set regularization term to further improve contrastive learning by constraining the distance between each image/text and its corresponding cross-modal support-set information contained in the same semantic category. To evaluate the effectiveness of the proposed method, we conduct experiments on three benchmark datasets (i.e., MIRFLICKR-25K, NUS-WIDE, MS COCO). Experimental results show that our model significantly outperforms the strong baselines for cross-modal image-text retrieval. 
For reproducibility, we submit the code and data publicly at: https://github.com/Hambaobao/CRCGS.|本文通过对比学习,利用图像到文本和文本到图像生成的辅助特征,弥补了不同检索方式之间的异质性差距,提高了图像-文本检索的性能。具体来说,对比学习是通过跨模态检索特征和辅助生成特征来缩小图像-文本对之间的距离,并从情态间和情态内的角度分离未对齐图像-文本对之间的距离。此外,我们设计了一个支持集正则化项来进一步改善对比学习,约束图像/文本之间的距离及其相应的跨模态支持集信息包含在同一语义范畴。为了评估该方法的有效性,我们对三个基准数据集(即 MIRFLICKR-25K,NUS-WIDE,MS COCO)进行了实验。实验结果表明,该模型在图像文本检索中的性能明显优于强基线检索。为了重现性,我们公开在 https://github.com/Hambaobao/CRCGS 提交代码和数据。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Image-Text+Retrieval+via+Contrastive+Learning+with+Auxiliary+Generative+Features+and+Support-set+Regularization)|0| +|[Denoising Time Cycle Modeling for Recommendation](https://doi.org/10.1145/3477495.3531785)|Sicong Xie, Qunwei Li, Weidi Xu, Kaiming Shen, Shaohu Chen, Wenliang Zhong|Ant Group, Hangzhou, China; Ant Group, Beijing, China; Ant Group, Shanghai, China|Recently, modeling temporal patterns of user-item interactions has attracted much attention in recommender systems. We argue that existing methods ignore the variety of temporal patterns of user behaviors. We define the subset of user behaviors that are irrelevant to the target item as noises, which limits the performance of target-related time cycle modeling and affects the recommendation performance. In this paper, we propose Denoising Time Cycle Modeling (DiCycle), a novel approach to denoise user behaviors and select the subset of user behaviors that are highly related to the target item. DiCycle is able to explicitly model diverse time cycle patterns for recommendation. Extensive experiments are conducted on both public benchmarks and a real-world dataset, demonstrating the superior performance of DiCycle over the state-of-the-art recommendation methods.|近年来,推荐系统中用户-项目交互的时间模式建模引起了人们的广泛关注。我们认为现有的方法忽略了用户行为的时间模式的多样性。将与目标项无关的用户行为定义为噪声,限制了目标相关时间周期建模的性能,影响了推荐性能。本文提出了一种新的去噪时间周期建模方法(DiCycle) ,用于去除用户行为的噪声,并选择与目标项高度相关的用户行为子集。DiCycle 能够显式地为推荐建模不同的时间周期模式。在公共基准测试和真实世界数据集上进行了广泛的实验,证明了 DiCycle 优于最先进的推荐方法的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Denoising+Time+Cycle+Modeling+for+Recommendation)|0| |[Value Penalized Q-Learning for Recommender Systems](https://doi.org/10.1145/3477495.3531796)|Chengqian Gao, Ke Xu, Kuangqi Zhou, Lanqing Li, Xueqian Wang, Bo Yuan, Peilin Zhao||Scaling reinforcement learning (RL) to recommender systems (RS) is promising since maximizing the expected cumulative rewards for RL agents meets the objective of RS, i.e., improving customers' long-term satisfaction. A key approach to this goal is offline RL, which aims to learn policies from logged data rather than expensive online interactions. In this paper, we propose Value Penalized Q-learning (VPQ), a novel uncertainty-based offline RL algorithm that penalizes the unstable Q-values in the regression target using uncertainty-aware weights, achieving the conservative Q-function without the need of estimating the behavior policy, suitable for RS with a large number of items.
Experiments on two real-world datasets show the proposed method serves as a gain plug-in for existing RS models.|由于推荐系统能够最大化推荐系统代理商的预期累积回报,从而达到推荐系统的目标,即提高客户的长期满意度,因此,将推荐系统扩展到推荐系统的强化学习是有前景的。实现这一目标的一个关键方法是离线 RL,它旨在从记录的数据中学习策略,而不是昂贵的在线交互。本文提出了一种新的基于不确定性的离线 RL 算法——价值惩罚 Q 学习算法(VPQ) ,该算法利用不确定性感知权值惩罚回归目标中不稳定的 Q 值,不需要估计行为策略就可以实现保守的 Q 函数,适用于大项目的 RS。在两个实际数据集上的实验表明,该方法可以作为现有 RS 模型的增益插件。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Value+Penalized+Q-Learning+for+Recommender+Systems)|0| |[From Cluster Ranking to Document Ranking](https://doi.org/10.1145/3477495.3531819)|Egor Markovskiy, Fiana Raiber, Shoham Sabach, Oren Kurland|Technion, Haifa, Israel; Yahoo Research, Haifa, Israel|The common approach of using clusters of similar documents for ad hoc document retrieval is to rank the clusters in response to the query; then, the cluster ranking is transformed to document ranking. We present a novel supervised approach to transform cluster ranking to document ranking. The approach allows to simultaneously utilize different clusterings and the resultant cluster rankings; this helps to improve the modeling of the document similarity space. Empirical evaluation shows that using our approach results in performance that substantially transcends the state-of-the-art in cluster-based document retrieval.|使用类似文档集群进行特别文献检索的常见方法是根据查询对集群进行排序,然后将集群排序转换为文档排序。提出了一种新的监督方法将聚类排序转换为文档排序。该方法允许同时使用不同的聚类和由此产生的聚类排名; 这有助于改进文档相似性空间的建模。经验性的评估表明,使用我们的方法所产生的效果远远超过了基于集群的文献检索的最新水平。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=From+Cluster+Ranking+to+Document+Ranking)|0| -|[ILMART: Interpretable Ranking with Constrained LambdaMART](https://doi.org/10.1145/3477495.3531840)|Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, Alberto Veneri|Ca' Foscari University of Venice, Venice, Italy; Ca' Foscari University of Venice & ISTI-CNR, Venice, Italy; ISTI-CNR, Pisa, Italy|Interpretable Learning to Rank (LtR) is an emerging field within the research area of explainable AI, aiming at developing intelligible and accurate predictive models. While most of the previous research efforts focus on creating post-hoc explanations, in this paper we investigate how to train effective and intrinsically-interpretable ranking models. Developing these models is particularly challenging and it also requires finding a trade-off between ranking quality and model complexity. State-of-the-art rankers, made of either large ensembles of trees or several neural layers, exploit in fact an unlimited number of feature interactions making them black boxes. Previous approaches on intrinsically-interpretable ranking models address this issue by avoiding interactions between features thus paying a significant performance drop with respect to full-complexity models. Conversely, ILMART, our novel and interpretable LtR solution based on LambdaMART, is able to train effective and intelligible models by exploiting a limited and controlled number of pairwise feature interactions. 
Exhaustive and reproducible experiments conducted on three publicly-available LtR datasets show that ILMART outperforms the current state-of-the-art solution for interpretable ranking of a large margin with a gain of nDCG of up to 8%.|可解释排序学习是可解释人工智能研究领域中的一个新兴领域,其目标是建立可理解的、准确的预测模型。以往的研究大多侧重于创建事后解释,本文主要研究如何训练有效且内在可解释的排序模型。开发这些模型尤其具有挑战性,而且还需要在排名质量和模型复杂性之间找到平衡。最先进的排名器,由大型的树木集合或几个神经层组成,实际上利用了无限数量的特征交互,使它们成为黑盒子。以前关于内在可解释的排名模型的方法通过避免特性之间的交互来解决这个问题,因此相对于全复杂模型来说,性能下降很大。相反,我们基于 lambdaMART 的新颖且可解释的 ILMART LtR 解决方案,能够通过利用有限和可控数量的成对特征交互来培训有效和可理解的模型。在三个公开可用的 LtR 数据集上进行的详尽和可重复的实验表明,ILMART 在可解释的大幅度排名方面优于目前的最先进的解决方案,nDCG 的增益高达8% 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ILMART:+Interpretable+Ranking+with+Constrained+LambdaMART)|0| +|[ILMART: Interpretable Ranking with Constrained LambdaMART](https://doi.org/10.1145/3477495.3531840)|Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, Alberto Veneri|Ca' Foscari University of Venice & ISTI-CNR, Venice, Italy; Ca' Foscari University of Venice, Venice, Italy; ISTI-CNR, Pisa, Italy|Interpretable Learning to Rank (LtR) is an emerging field within the research area of explainable AI, aiming at developing intelligible and accurate predictive models. While most of the previous research efforts focus on creating post-hoc explanations, in this paper we investigate how to train effective and intrinsically-interpretable ranking models. Developing these models is particularly challenging and it also requires finding a trade-off between ranking quality and model complexity. State-of-the-art rankers, made of either large ensembles of trees or several neural layers, exploit in fact an unlimited number of feature interactions making them black boxes. Previous approaches on intrinsically-interpretable ranking models address this issue by avoiding interactions between features thus paying a significant performance drop with respect to full-complexity models. Conversely, ILMART, our novel and interpretable LtR solution based on LambdaMART, is able to train effective and intelligible models by exploiting a limited and controlled number of pairwise feature interactions. Exhaustive and reproducible experiments conducted on three publicly-available LtR datasets show that ILMART outperforms the current state-of-the-art solution for interpretable ranking of a large margin with a gain of nDCG of up to 8%.|可解释排序学习是可解释人工智能研究领域中的一个新兴领域,其目标是建立可理解的、准确的预测模型。以往的研究大多侧重于创建事后解释,本文主要研究如何训练有效且内在可解释的排序模型。开发这些模型尤其具有挑战性,而且还需要在排名质量和模型复杂性之间找到平衡。最先进的排名器,由大型的树木集合或几个神经层组成,实际上利用了无限数量的特征交互,使它们成为黑盒子。以前关于内在可解释的排名模型的方法通过避免特性之间的交互来解决这个问题,因此相对于全复杂模型来说,性能下降很大。相反,我们基于 lambdaMART 的新颖且可解释的 ILMART LtR 解决方案,能够通过利用有限和可控数量的成对特征交互来培训有效和可理解的模型。在三个公开可用的 LtR 数据集上进行的详尽和可重复的实验表明,ILMART 在可解释的大幅度排名方面优于目前的最先进的解决方案,nDCG 的增益高达8% 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ILMART:+Interpretable+Ranking+with+Constrained+LambdaMART)|0| |[On Extractive Summarization for Profile-centric Neural Expert Search in Academia](https://doi.org/10.1145/3477495.3531713)|Rennan C. Lima, Rodrygo L. T. Santos|Universidade Federal de Minas Gerais, Belo Horizonte, Brazil|Identifying academic experts is crucial for the progress of science, enabling researchers to connect, form networks, and collaborate on the most pressing research problems. A key challenge for ranking experts in response to a query is how to infer their expertise from the publications they coauthored. Profile-centric approaches represent candidate experts by concatenating all their publications into a text-based profile.
Despite offering a complete picture of each candidate's scientific output, such lengthy profiles make it inefficient to leverage state-of-the-art neural architectures for inferring expertise. To overcome this limitation, we investigate the suitability of extractive summarization as a mechanism to reduce candidate profiles for semantic encoding using Transformers. Our thorough experiments with a representative academic search test collection demonstrate the benefits of encoding summarized profiles for an improved expertise inference.|识别学术专家对科学进步至关重要,使研究人员能够在最紧迫的研究问题上进行联系、形成网络和协作。对专家进行排名以回答问题的一个关键挑战是如何从他们合著的出版物中推断出他们的专业知识。以配置文件为中心的方法通过将候选专家的所有出版物连接到一个基于文本的配置文件中来代表他们。尽管提供了每个候选人的科学成果的完整图片,这样冗长的档案使得利用最先进的神经结构来推断专业知识效率低下。为了克服这个限制,我们研究了提取摘要作为使用 Transformers 减少语义编码候选配置文件的机制的适用性。我们对一个有代表性的学术搜索测试集合进行了彻底的实验,证明了对概要进行编码以改进专业知识推理的好处。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=On+Extractive+Summarization+for+Profile-centric+Neural+Expert+Search+in+Academia)|0| |[Joint Optimization of Ad Ranking and Creative Selection](https://doi.org/10.1145/3477495.3531855)|Kaiyi Lin, Xiang Zhang, Feng Li, Pengjie Wang, Qingqing Long, Hongbo Deng, Jian Xu, Bo Zheng|Alibaba Group, Beijing, China|In e-commerce, ad creatives play an important role in effectively delivering product information to users. The purpose of online creative selection is to learn users' preferences for ad creatives, and to select the most appealing design for users to maximize Click-Through Rate (CTR). However, the existing common practices in the industry usually place the creative selection after the ad ranking stage, and thus the optimal creative fails to reflect the influence on the ad ranking stage. To address these issues, we propose a novel Cascade Architecture of Creative Selection (CACS), which is built before the ranking stage to joint optimization of intra-ad creative selection and inter-ad ranking. To improve the efficiency, we design a classic two-tower structure and allow creative embeddings of the creative selection stage to share with the ranking stage. To boost the effectiveness, on the one hand, we propose a soft label list-wise ranking distillation method to distill the ranking knowledge from the ranking stage to guide CACS learning; and on the other hand, we also design an adaptive dropout network to encourage the model to probabilistically ignore ID features in favor of content features to learn multi-modal representations of the creative. Most of all, the ranking model obtains the optimal creative information of each ad from our CACS, and uses all available features to improve the performance of the ranking model. 
We have launched our solution in Taobao advertising platform and have obtained significant improvements both in offline and online evaluations.|在电子商务中,广告创意人员在有效地向用户传递产品信息方面发挥着重要作用。在线创意选择的目的是了解用户对广告创意的偏好,并为用户选择最具吸引力的设计,以最大限度地提高点进率。然而,现有的行业惯例通常将创意选择置于广告排名阶段之后,因此最优创意未能反映出对广告排名阶段的影响。为了解决这些问题,我们提出了一种新颖的创意选择级联体系结构(CACS) ,该体系结构建立在排名阶段之前,以联合优化内部广告创意选择和内部广告排名。为了提高效率,我们设计了一个经典的双塔结构,并允许创造性的嵌入创造性的选择阶段与排名阶段共享。为了提高效率,一方面,我们提出了一种软标签列表式的排序精馏方法,从排序阶段提取排序知识来指导 CACS 学习; 另一方面,我们还设计了一个自适应辍学网络来鼓励模型忽略 ID 特征而有利于内容特征学习创造性的多模态表示。最重要的是,排名模型从我们的 CACS 中获得每个广告的最佳创意信息,并利用所有可用的功能来改善排名模型的性能。我们已经在淘宝广告平台上推出了我们的解决方案,并且在线下和在线评估方面都取得了显著的进步。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Joint+Optimization+of+Ad+Ranking+and+Creative+Selection)|0| |[Long Document Re-ranking with Modular Re-ranker](https://doi.org/10.1145/3477495.3531860)|Luyu Gao, Jamie Callan|Carnegie Mellon University, Pittsburgh, PA, USA|Long document re-ranking has been a challenging problem for neural re-rankers based on deep language models like BERT. Early work breaks the documents into short passage-like chunks. These chunks are independently mapped to scalar scores or latent vectors, which are then pooled into a final relevance score. These encode-and-pool methods however inevitably introduce an information bottleneck: the low dimension representations. In this paper, we propose instead to model full query-to-document interaction, leveraging the attention operation and modular Transformer re-ranker framework. First, document chunks are encoded independently with an encoder module. An interaction module then encodes the query and performs joint attention from the query to all document chunk representations. We demonstrate that the model can use this new degree of freedom to aggregate important information from the entire document. Our experiments show that this design produces effective re-ranking on two classical IR collections Robust04 and ClueWeb09, and a large-scale supervised collection MS-MARCO document ranking.|长文档重新排序一直是基于 BERT 等深层语言模型的神经网络重新排序的一个具有挑战性的问题。早期的工作将文档分解成短小的段落。这些块被独立映射到标量分数或潜在向量,然后将它们汇集到最终的相关分数中。然而,这些编码和池方法不可避免地引入了一个信息瓶颈: 低维表示。在本文中,我们提出了利用注意力操作和模块化的 formerre-rank 框架来建立完全的查询到文档的交互模型。首先,文档块使用编码器模块进行独立编码。然后,交互模块对查询进行编码,并执行从查询到所有文档块表示的联合注意。我们证明该模型可以使用这种新的自由度来聚合整个文档中的重要信息。我们的实验表明,这种设计产生了有效的重新排序的两个经典的 IR 收集 Robust04 和 ClueWeb09,以及一个大规模的监督收集 MS-MARCO 文件排序。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Long+Document+Re-ranking+with+Modular+Re-ranker)|0| -|[Relation Extraction as Open-book Examination: Retrieval-enhanced Prompt Tuning](https://doi.org/10.1145/3477495.3531746)|Xiang Chen, Lei Li, Ningyu Zhang, Chuanqi Tan, Fei Huang, Luo Si, Huajun Chen|Zhejiang University, Hangzhou, China; Alibaba Group, Hangzhou, China; Zhejiang University, Hangzhou, China|Pre-trained language models have contributed significantly to relation extraction by demonstrating remarkable few-shot learning abilities. However, prompt tuning methods for relation extraction may still fail to generalize to those rare or hard patterns. Note that the previous parametric learning paradigm can be viewed as memorization regarding training data as a book and inference as the close-book test. Those long-tailed or hard patterns can hardly be memorized in parameters given few-shot instances. To this end, we regard RE as an open-book examination and propose a new semiparametric paradigm of retrieval-enhanced prompt tuning for relation extraction.
We construct an open-book datastore for retrieval regarding prompt-based instance representations and corresponding relation labels as memorized key-value pairs. During inference, the model can infer relations by linearly interpolating the base output of PLM with the non-parametric nearest neighbor distribution over the datastore. In this way, our model not only infers relation through knowledge stored in the weights during training but also assists decision-making by unwinding and querying examples in the open-book datastore. Extensive experiments on benchmark datasets show that our method can achieve state-of-the-art in both standard supervised and few-shot settings|预训练的语言模型通过显示出显著的短镜头学习能力,对关系抽取做出了重要贡献。然而,关系抽取的快速调优方法可能仍然无法推广到那些罕见的或难以实现的模式。请注意,以前的参数学习范式可以被视为记忆的训练数据作为一本书和推论作为关闭书测试。这些长尾或硬模式几乎不能被记忆在参数中,因为实例很少。为此,我们将 RE 视为一个开放式的考试,并提出了一个新的检索半参数范式——关系抽取的增强型提示调优。我们构造了一个开放式数据存储,用于检索基于提示的实例表示和相应的关系标签作为记忆的键值对。在推理过程中,该模型可以通过对 PLM 的基本输出与数据存储上的非参数最近邻分布进行线性插值来推断关系。这样,我们的模型不仅可以通过训练过程中权重中存储的知识来推断关系,而且可以通过展开和查询开卷数据库中的实例来辅助决策。对基准数据集的大量实验表明,该方法可以在标准监督和少镜头设置下达到最先进的水平|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Relation+Extraction+as+Open-book+Examination:+Retrieval-enhanced+Prompt+Tuning)|0| +|[Relation Extraction as Open-book Examination: Retrieval-enhanced Prompt Tuning](https://doi.org/10.1145/3477495.3531746)|Xiang Chen, Lei Li, Ningyu Zhang, Chuanqi Tan, Fei Huang, Luo Si, Huajun Chen|Zhejiang University, Hangzhou, China; Zhejiang University, Hangzhou, China; Alibaba Group, Hangzhou, China|Pre-trained language models have contributed significantly to relation extraction by demonstrating remarkable few-shot learning abilities. However, prompt tuning methods for relation extraction may still fail to generalize to those rare or hard patterns. Note that the previous parametric learning paradigm can be viewed as memorization regarding training data as a book and inference as the close-book test. Those long-tailed or hard patterns can hardly be memorized in parameters given few-shot instances. To this end, we regard RE as an open-book examination and propose a new semiparametric paradigm of retrieval-enhanced prompt tuning for relation extraction. We construct an open-book datastore for retrieval regarding prompt-based instance representations and corresponding relation labels as memorized key-value pairs. During inference, the model can infer relations by linearly interpolating the base output of PLM with the non-parametric nearest neighbor distribution over the datastore. In this way, our model not only infers relation through knowledge stored in the weights during training but also assists decision-making by unwinding and querying examples in the open-book datastore.
Extensive experiments on benchmark datasets show that our method can achieve state-of-the-art in both standard supervised and few-shot settings|预训练的语言模型通过显示出显著的短镜头学习能力,对关系抽取做出了重要贡献。然而,关系抽取的快速调优方法可能仍然无法推广到那些罕见的或难以实现的模式。请注意,以前的参数学习范式可以被视为记忆的训练数据作为一本书和推论作为关闭书测试。这些长尾或硬模式几乎不能被记忆在参数中,因为实例很少。为此,我们将 RE 视为一个开放式的考试,并提出了一个新的检索半参数范式——关系抽取的增强型提示调优。我们构造了一个开放式数据存储,用于检索基于提示的实例表示和相应的关系标签作为记忆的键值对。在推理过程中,该模型可以通过对 PLM 的基本输出与数据存储上的非参数最近邻分布进行线性插值来推断关系。这样,我们的模型不仅可以通过训练过程中权重中存储的知识来推断关系,而且可以通过展开和查询开卷数据库中的实例来辅助决策。对基准数据集的大量实验表明,该方法可以在标准监督和少镜头设置下达到最先进的水平|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Relation+Extraction+as+Open-book+Examination:+Retrieval-enhanced+Prompt+Tuning)|0| |[End-to-end Distantly Supervised Information Extraction with Retrieval Augmentation](https://doi.org/10.1145/3477495.3531876)|Yue Zhang, Hongliang Fei, Ping Li|Baidu, Bellevue, WA, USA|Distant supervision (DS) has been a prevalent approach to generating labeled data for information extraction (IE) tasks. However, DS often suffers from noisy label problems, where the labels are extracted from the knowledge base (KB), regardless of the input context. Many efforts have been devoted to designing denoising mechanisms. However, most strategies are only designed for one specific task and cannot be directly adapted to other tasks. We propose a general paradigm (Dasiera) to resolve issues in KB-based DS. Labels from KB can be viewed as universal labels of a target entity or an entity pair. While the given context for an IE task may only contain partial/zero information about the target entities, or the entailed information may be vague. Hence the mismatch between the given context and KB labels, i.e., the given context has insufficient information to infer DS labels, can happen in IE training datasets. To solve the problem, during training, Dasiera leverages a retrieval-augmentation mechanism to complete missing information of the given context, where we seamlessly integrate a neural retriever and a general predictor in an end-to-end framework. During inference, we can keep/remove the retrieval component based on whether we want to predict solely on the given context. We have evaluated Dasiera on two IE tasks under the DS setting: named entity typing and relation extraction. Experimental results show Dasiera's superiority to other baselines in both tasks.|远程监控(DS)已经成为一种普遍的方法来为信息抽取(IE)任务生成标记数据。然而,DS 经常遇到噪声标签问题,这些标签是从知识库(KB)中提取出来的,与输入上下文无关。人们在设计去噪机制方面付出了很多努力。然而,大多数策略只针对一个特定任务设计,不能直接适应其他任务。我们提出了一个通用范例(Dasiera)来解决基于知识库的 DS 中的问题。来自 KB 的标签可以被视为目标实体或实体对的通用标签。IE 任务的给定上下文可能只包含关于目标实体的部分/零信息,或者所涉及的信息可能是模糊的。因此,给定上下文和知识库标签之间的不匹配,即给定上下文没有足够的信息来推断 DS 标签,可能发生在 IE 训练数据集中。为了解决这个问题,在训练期间,Dasiera 利用检索增强机制来完成给定上下文的缺失信息,在这里我们无缝地将神经检索器和通用预测器集成在一个端到端框架中。在推理过程中,我们可以保留/删除检索组件,这取决于我们是否希望仅根据给定的上下文进行预测。我们在 DS 设置下对 Dasiera 的两个 IE 任务进行了评估: 命名实体类型和关系提取。实验结果表明,Dasiera 在这两个任务中都优于其他基线。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=End-to-end+Distantly+Supervised+Information+Extraction+with+Retrieval+Augmentation)|0| -|[Assessing Scientific Research Papers with Knowledge Graphs](https://doi.org/10.1145/3477495.3531879)|Kexuan Sun, Zhiqiang Qiu, Abel Salinas, Yuzhong Huang, DongHo Lee, Daniel Benjamin, Fred Morstatter, Xiang Ren, Kristina Lerman, Jay Pujara|Nova Southeastern University, Fort Lauderdale, CA, USA; University of Southern California, Los Angeles, CA, USA|In recent decades, the growing scale of scientific research has led to numerous novel findings. Reproducing these findings is the foundation of future research. 
However, due to the complexity of experiments, manually assessing scientific research is laborious and time-intensive, especially in social and behavioral sciences. Although increasing reproducibility studies have garnered increased attention in the research community, there is still a lack of systematic ways for evaluating scientific research at scale. In this paper, we propose a novel approach towards automatically assessing scientific publications by constructing a knowledge graph (KG) that captures a holistic view of the research contributions. Specifically, during the KG construction, we combine information from two different perspectives: micro-level features that capture knowledge from published articles such as sample sizes, effect sizes, and experimental models, and macro-level features that comprise relationships between entities such as authorship and reference information. We then learn low-dimensional representations using language models and knowledge graph embeddings for entities (nodes in KGs), which are further used for the assessments. A comprehensive set of experiments on two benchmark datasets shows the usefulness of leveraging KGs for scoring scientific research.|近几十年来,科学研究的规模不断扩大,产生了许多新的发现。复制这些发现是未来研究的基础。然而,由于实验的复杂性,人工评估科学研究是费时费力的,尤其是在社会科学和行为科学领域。虽然越来越多的重复性研究已经引起了研究界越来越多的关注,但仍然缺乏系统的方法来评价科学研究的规模。在本文中,我们提出了一种新的方法来自动评估科学出版物,通过构造一个知识图(KG) ,捕获了研究贡献的整体观点。具体而言,在 KG 构建过程中,我们从两个不同的角度组合信息: 从已发表的文章(如样本量,效应量和实验模型)中捕获知识的微观层面特征,以及包含实体(如作者和参考信息)之间关系的宏观层面特征。然后,我们学习低维表示使用语言模型和知识图嵌入的实体(节点在幼稚园) ,这是进一步用于评估。在两个基准数据集上的一组综合实验表明了利用幼儿园评分科学研究的有用性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Assessing+Scientific+Research+Papers+with+Knowledge+Graphs)|0| -|[A Content Recommendation Policy for Gaining Subscribers](https://doi.org/10.1145/3477495.3531885)|Konstantinos Theocharidis, Manolis Terrovitis, Spiros Skiadopoulos, Panagiotis Karras|University of the Peloponnese & Information Management Systems Institute, Athena Research Center, Tripoli & Athens, Greece; University of the Peloponnese, Tripoli, Greece; Information Management Systems Institute, Athena Research Center, Athens, Greece; Aarhus University, Aarhus, Denmark|How can we recommend content for a brand agent to use over a series of rounds so as to gain new subscribers to its social network page? The Influence Maximization (IM) problem seeks a set of k users, and its content-aware variants seek a set of k post features, that achieve, in both cases, an objective of expected influence in a social network. However, apart from raw influence, it is also relevant to study gain in subscribers, as long-term success rests on the subscribers of a brand page; classic IM may select k users from the subscriber set, and content-aware IM starts the post's propagation from that subscriber set. In this paper, we propose a novel content recommendation policy to a brand agent for Gaining Subscribers by Messaging (GSM) over many rounds. In each round, the brand agent messages a fixed number of social network users and invites them to visit the brand page aiming to gain their subscription, while its most recently published content consists of features that intensely attract the preferences of the invited users. To solve GSM, we find, in each round, which content features to publish and which users to notify aiming to maximize the cumulative subscription gain over all rounds.
We deploy three GSM solvers, named sR, sSC, and sSU, and we experimentally evaluate their performance based on VKontakte (VK) posts by considering different user sets and feature sets. Our experimental results show that sSU provides the best solution, as it is significantly more efficient than sSC with a minor loss of efficacy and clearly more efficacious than sR with competitive efficiency.|我们怎样才能向品牌代理商推荐一系列的内容,从而吸引新的用户访问其社交网络页面?影响力最大化(IM)问题寻找一组 k 用户,其内容感知变体寻找一组 k 帖子特征,在这两种情况下,都实现了社交网络中预期影响力的目标。然而,除了原始的影响力之外,研究订阅者的收益也是相关的,因为长期的成功取决于品牌页面的订阅者; 传统的 IM 可能从订阅者集合中选择 k 用户,而内容感知的 IM 从订阅者集合中开始发布信息。在本文中,我们提出了一个新的内容推荐策略的品牌代理商获得用户的消息(GSM)多轮。在每一轮中,品牌代理人给固定数量的社交网络用户发信息,并邀请他们访问品牌页面以获得订阅,而其最近发布的内容包括强烈吸引受邀用户偏好的功能。为了解决 GSM 问题,我们发现,在每一轮中,哪些内容特性要发布,哪些用户要通知,目的是在所有轮次中最大化累积订阅收益。我们部署了三个 GSM 解决方案,分别命名为 sR、 sSC 和 sSU,通过考虑不同的用户集和特性集,实验性地评估了它们基于 VKontakte (VK)帖子的性能。我们的实验结果表明,sSU 提供了最佳的解决方案,因为它明显地比 sSC 更有效,具有较小的效率损失,并且明显地比具有竞争效率的 sR 更有效。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Content+Recommendation+Policy+for+Gaining+Subscribers)|0| -|[MM-Rec: Visiolinguistic Model Empowered Multimodal News Recommendation](https://doi.org/10.1145/3477495.3531896)|Chuhan Wu, Fangzhao Wu, Tao Qi, Chao Zhang, Yongfeng Huang, Tong Xu|University of Science and Technology of China, Hefei, China; Microsoft Research Asia, Beijing, China; Tsinghua University, Beijing, China; Shandong University, Jinan, China|News representation is critical for news recommendation. Most existing methods learn news representations only from news texts while ignoring the visual information of news. In fact, users may click news not only due to the interest in news titles but also the attraction of news images. Thus, images are useful for representing news and predicting news clicks. Pretrained visiolinguistic models are powerful in multi-modal understanding, which can represent news from both textual and visual contents. In this paper, we propose a multimodal news recommendation method that can incorporate both textual and visual information of news to learn multimodal news representations. We first extract region-of-interests (ROIs) from news images via object detection. We then use a pre-trained visiolinguistic model to encode both news texts and image ROIs and model their inherent relatedness using co-attentional Transformers. In addition, we propose a crossmodal candidate-aware attention network to select relevant historical clicked news for the accurate modeling of user interest in candidate news.
Experiments validate that incorporating multimodal news information can effectively improve the performance of news recommendation.|新闻表达对新闻推荐至关重要。现有的大多数方法只从新闻文本中学习新闻表征,而忽视了新闻的视觉信息。事实上,用户之所以会点击新闻,不仅是因为他们对新闻标题感兴趣,还因为新闻图片的吸引力。因此,图像对于表示新闻和预测新闻点击是非常有用的。预先训练的视觉语言模型在多模态理解中具有很强的表现能力,可以从文本和视觉两个方面表现新闻。在本文中,我们提出了一种多通道新闻推荐方法,它可以结合新闻的文本信息和视觉信息来学习多通道新闻表示。我们首先通过目标检测从新闻图像中提取感兴趣区域(ROI)。然后,我们使用一个预先训练的视觉语言学模型来编码新闻文本和图像 ROI,并使用共注意转换器来模拟它们之间的内在联系。此外,我们提出了一个跨模式的候选人感知注意网络来选择相关的历史点击新闻,以准确建模用户对候选人新闻的兴趣。实验证明,融合多模态新闻信息可以有效地提高新闻推荐的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=MM-Rec:+Visiolinguistic+Model+Empowered+Multimodal+News+Recommendation)|0| +|[Assessing Scientific Research Papers with Knowledge Graphs](https://doi.org/10.1145/3477495.3531879)|Kexuan Sun, Zhiqiang Qiu, Abel Salinas, Yuzhong Huang, DongHo Lee, Daniel Benjamin, Fred Morstatter, Xiang Ren, Kristina Lerman, Jay Pujara|University of Southern California, Los Angeles, CA, USA; Nova Southeastern University, Fort Lauderdale, CA, USA|In recent decades, the growing scale of scientific research has led to numerous novel findings. Reproducing these findings is the foundation of future research. However, due to the complexity of experiments, manually assessing scientific research is laborious and time-intensive, especially in social and behavioral sciences. Although increasing reproducibility studies have garnered increased attention in the research community, there is still a lack of systematic ways for evaluating scientific research at scale. In this paper, we propose a novel approach towards automatically assessing scientific publications by constructing a knowledge graph (KG) that captures a holistic view of the research contributions. Specifically, during the KG construction, we combine information from two different perspectives: micro-level features that capture knowledge from published articles such as sample sizes, effect sizes, and experimental models, and macro-level features that comprise relationships between entities such as authorship and reference information. We then learn low-dimensional representations using language models and knowledge graph embeddings for entities (nodes in KGs), which are further used for the assessments. A comprehensive set of experiments on two benchmark datasets shows the usefulness of leveraging KGs for scoring scientific research.|近几十年来,科学研究的规模不断扩大,产生了许多新的发现。复制这些发现是未来研究的基础。然而,由于实验的复杂性,人工评估科学研究是费时费力的,尤其是在社会科学和行为科学领域。虽然越来越多的重复性研究已经引起了研究界越来越多的关注,但仍然缺乏系统的方法来评价科学研究的规模。在本文中,我们提出了一种新的方法来自动评估科学出版物,通过构造一个知识图(KG) ,捕获了研究贡献的整体观点。具体而言,在 KG 构建过程中,我们从两个不同的角度组合信息: 从已发表的文章(如样本量,效应量和实验模型)中捕获知识的微观层面特征,以及包含实体(如作者和参考信息)之间关系的宏观层面特征。然后,我们学习低维表示使用语言模型和知识图嵌入的实体(节点在幼稚园) ,这是进一步用于评估。在两个基准数据集上的一组综合实验表明了利用幼儿园评分科学研究的有用性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Assessing+Scientific+Research+Papers+with+Knowledge+Graphs)|0| +|[A Content Recommendation Policy for Gaining Subscribers](https://doi.org/10.1145/3477495.3531885)|Konstantinos Theocharidis, Manolis Terrovitis, Spiros Skiadopoulos, Panagiotis Karras|Information Management Systems Institute, Athena Research Center, Athens, Greece; Aarhus University, Aarhus, Denmark; University of the Peloponnese & Information Management Systems Institute, Athena Research Center, Tripoli & Athens, Greece; University of the Peloponnese, Tripoli, Greece|How can we recommend content for a brand agent to use over a series of rounds so as to gain new subscribers to its social network page? 
The Influence Maximization (IM) problem seeks a set of k users, and its content-aware variants seek a set of k post features, that achieve, in both cases, an objective of expected influence in a social network. However, apart from raw influence, it is also relevant to study gain in subscribers, as long-term success rests on the subscribers of a brand page; classic IM may select k users from the subscriber set, and content-aware IM starts the post's propagation from that subscriber set. In this paper, we propose a novel content recommendation policy to a brand agent for Gaining Subscribers by Messaging (GSM) over many rounds. In each round, the brand agent messages a fixed number of social network users and invites them to visit the brand page aiming to gain their subscription, while its most recently published content consists of features that intensely attract the preferences of the invited users. To solve GSM, we find, in each round, which content features to publish and which users to notify aiming to maximize the cumulative subscription gain over all rounds. We deploy three GSM solvers, named sR, sSC, and sSU, and we experimentally evaluate their performance based on VKontakte (VK) posts by considering different user sets and feature sets. Our experimental results show that sSU provides the best solution, as it is significantly more efficient than sSC with a minor loss of efficacy and clearly more efficacious than sR with competitive efficiency.|我们怎样才能向品牌代理商推荐一系列的内容,从而吸引新的用户访问其社交网络页面?影响力最大化(IM)问题寻找一组 k 用户,其内容感知变体寻找一组 k 帖子特征,在这两种情况下,都实现了社交网络中预期影响力的目标。然而,除了原始的影响力之外,研究订阅者的收益也是相关的,因为长期的成功取决于品牌页面的订阅者; 传统的 IM 可能从订阅者集合中选择 k 用户,而内容感知的 IM 从订阅者集合中开始发布信息。在本文中,我们提出了一个新的内容推荐策略的品牌代理商获得用户的消息(GSM)多轮。在每一轮中,品牌代理人给固定数量的社交网络用户发信息,并邀请他们访问品牌页面以获得订阅,而其最近发布的内容包括强烈吸引受邀用户偏好的功能。为了解决 GSM 问题,我们发现,在每一轮中,哪些内容特性要发布,哪些用户要通知,目的是在所有轮次中最大化累积订阅收益。我们部署了三个 GSM 解决方案,分别命名为 sR、 sSC 和 sSU,通过考虑不同的用户集和特性集,实验性地评估了它们基于 VKontakte (VK)帖子的性能。我们的实验结果表明,sSU 提供了最佳的解决方案,因为它明显地比 sSC 更有效,具有较小的效率损失,并且明显地比具有竞争效率的 sR 更有效。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Content+Recommendation+Policy+for+Gaining+Subscribers)|0| +|[MM-Rec: Visiolinguistic Model Empowered Multimodal News Recommendation](https://doi.org/10.1145/3477495.3531896)|Chuhan Wu, Fangzhao Wu, Tao Qi, Chao Zhang, Yongfeng Huang, Tong Xu|Microsoft Research Asia, Beijing, China; Shandong University, Jinan, China; University of Science and Technology of China, Hefei, China; Tsinghua University, Beijing, China|News representation is critical for news recommendation. Most existing methods learn news representations only from news texts while ignoring the visual information of news. In fact, users may click news not only due to the interest in news titles but also the attraction of news images. Thus, images are useful for representing news and predicting news clicks. Pretrained visiolinguistic models are powerful in multi-modal understanding, which can represent news from both textual and visual contents. In this paper, we propose a multimodal news recommendation method that can incorporate both textual and visual information of news to learn multimodal news representations. We first extract region-of-interests (ROIs) from news images via object detection. We then use a pre-trained visiolinguistic model to encode both news texts and image ROIs and model their inherent relatedness using co-attentional Transformers.
In addition, we propose a crossmodal candidate-aware attention network to select relevant historical clicked news for the accurate modeling of user interest in candidate news. Experiments validate that incorporating multimodal news information can effectively improve the performance of news recommendation.|新闻表达对新闻推荐至关重要。现有的大多数方法只从新闻文本中学习新闻表征,而忽视了新闻的视觉信息。事实上,用户之所以会点击新闻,不仅是因为他们对新闻标题感兴趣,还因为新闻图片的吸引力。因此,图像对于表示新闻和预测新闻点击是非常有用的。预先训练的视觉语言模型在多模态理解中具有很强的表现能力,可以从文本和视觉两个方面表现新闻。在本文中,我们提出了一种多模态新闻推荐方法,它可以结合新闻的文本信息和视觉信息来学习多模态新闻表示。我们首先通过目标检测从新闻图像中提取感兴趣区域(ROI)。然后,我们使用一个预先训练的视觉语言模型来编码新闻文本和图像 ROI,并使用共注意转换器来模拟它们之间的内在联系。此外,我们提出了一个跨模态的候选感知注意网络来选择相关的历史点击新闻,以准确建模用户对候选新闻的兴趣。实验证明,融合多模态新闻信息可以有效地提高新闻推荐的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=MM-Rec:+Visiolinguistic+Model+Empowered+Multimodal+News+Recommendation)|0| |[Towards Personalized Bundle Creative Generation with Contrastive Non-Autoregressive Decoding](https://doi.org/10.1145/3477495.3531909)|Penghui Wei, Shaoguo Liu, Xuanhua Yang, Liang Wang, Bo Zheng|Alibaba Group, Beijing, China|Current bundle generation studies focus on generating a combination of items to improve user experience. In real-world applications, there is also a great need to produce bundle creatives that consist of mixture types of objects (e.g., items, slogans and templates) for achieving better promotion effect. We study a new problem named bundle creative generation: for given users, the goal is to generate personalized bundle creatives that the users will be interested in. To take both quality and efficiency into account, we propose a contrastive non-autoregressive model that captures user preferences with ingenious decoding objective. Experiments on large-scale real-world datasets verify that our proposed model shows significant advantages in terms of creative quality and generation speed.|当前的捆绑包生成研究侧重于生成项目的组合,以改善用户体验。在实际应用中,为了达到更好的促销效果,还需要产生由混合类型的对象(例如,项目、标语和模板)组成的捆绑创意。我们研究了一个新的问题——捆绑包创意生成: 对于给定的用户,目标是生成用户感兴趣的个性化捆绑包创意。为了同时考虑质量和效率,我们提出了一个对比的非自回归模型,通过巧妙的解码目标来捕捉用户偏好。在大规模真实世界数据集上的实验表明,我们提出的模型在创意质量和生成速度方面具有显著的优势。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Towards+Personalized+Bundle+Creative+Generation+with+Contrastive+Non-Autoregressive+Decoding)|0| |[Another Look at Information Retrieval as Statistical Translation](https://doi.org/10.1145/3477495.3531717)|Yuqi Liu, Chengcheng Hu, Jimmy Lin|University of Waterloo, Waterloo, ON, Canada|Over two decades ago, Berger and Lafferty proposed "information retrieval as statistical translation" (IRST), a simple and elegant method for ad hoc retrieval based on the noisy channel model. At the time, they lacked the large-scale human-annotated datasets necessary to properly train their models. In this paper, we ask the simple question: What if Berger and Lafferty had access to datasets such as the MS MARCO passage ranking dataset that we take for granted today? The answer to this question tells us how much of recent improvements in ranking can be solely attributed to having more data available, as opposed to improvements in models (e.g., pretrained transformers) and optimization techniques (e.g., contrastive loss). In fact, Boytsov and Kolter recently began to answer this question with a replication of Berger and Lafferty's model, and this work can be viewed as another independent replication effort, with generalizations to additional conditions not previously explored, including replacing the sum of translation probabilities with ColBERT's MaxSim operator.
We confirm that while neural models (particularly pretrained transformers) have indeed led to great advances in retrieval effectiveness, the IRST model proposed decades ago is quite effective if provided sufficient training data.|二十多年前,伯杰和拉弗蒂提出了“信息检索作为统计翻译”(IRST) ,这是一种基于噪声信道模型的简单而优雅的自组织检索方法。当时,他们缺乏必要的大规模人工注释数据集来适当地训练他们的模型。在本文中,我们提出一个简单的问题: 如果 Berger 和 Lafferty 能够访问数据集,比如我们今天认为理所当然的 MS MARCO 通道排名数据集,那会怎样?这个问题的答案告诉我们,最近排名的改善在多大程度上可以完全归因于有更多的数据可用,而不是模型(例如,预先训练的变压器)和优化技术(例如,对比损失)的改进。事实上,Boytsov 和 Kolter 最近开始通过复制 Berger 和 Lafferty 的模型来回答这个问题,这项工作可以被看作是另一个独立的复制努力,对以前没有探索过的其他条件进行推广,包括用 ColBERT 的 MaxSim 运算符替换翻译概率的总和。我们证实,虽然神经模型(特别是预先训练的变压器)确实导致了检索效率的巨大进步,几十年前提出的 IRST 模型是相当有效的,如果提供足够的训练数据。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Another+Look+at+Information+Retrieval+as+Statistical+Translation)|0| -|[ACORDAR: A Test Collection for Ad Hoc Content-Based (RDF) Dataset Retrieval](https://doi.org/10.1145/3477495.3531729)|Tengteng Lin, Qiaosheng Chen, Gong Cheng, Ahmet Soylu, Basil Ell, Ruoqi Zhao, Qing Shi, Xiaxia Wang, Yu Gu, Evgeny Kharlamov|Bosch Center for Artificial Intelligence & University of Oslo, Renningen, Germany; Nanjing University, Nanjing, China; Bielefeld University & University of Oslo, Bielefeld, Germany; OsloMet -- Oslo Metropolitan University & Norwegian University of Science and Technology, Oslo, Norway; The Ohio State University, Columbus, OH, USA|Ad hoc dataset retrieval is a trending topic in IR research. Methods and systems are evolving from metadata-based to content-based ones which exploit the data itself for improving retrieval accuracy but thus far lack a specialized test collection. In this paper, we build and release the first test collection for ad hoc content-based dataset retrieval, where content-oriented dataset queries and content-based relevance judgments are annotated by human experts who are assisted with a dashboard designed specifically for comprehensively and conveniently browsing both the metadata and data of a dataset. We conduct extensive experiments on the test collection to analyze its difficulty and provide insights into the underlying task.|自组织数据集检索是信息检索领域的一个研究热点。方法和系统正在从基于元数据向基于内容的方法和系统演变,这些方法和系统利用数据本身来提高检索的准确性,但迄今为止还缺乏专门的测试集合。本文构建并发布了第一个基于特定内容的数据集检索测试集合,其中面向内容的数据集查询和基于内容的相关性判断由人类专家进行注释,并辅以一个专门为全面方便地浏览数据集的元数据和数据而设计的仪表板。我们进行了广泛的实验测试收集,以分析其难度,并提供深入了解潜在的任务。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ACORDAR:+A+Test+Collection+for+Ad+Hoc+Content-Based+(RDF)+Dataset+Retrieval)|0| -|[RELISON: A Framework for Link Recommendation in Social Networks](https://doi.org/10.1145/3477495.3531730)|Javier SanzCruzado, Pablo Castells|University of Glasgow, Glasgow, United Kingdom; Universidad Autónoma de Madrid, Madrid, Spain|Link recommendation is an important and compelling problem at the intersection of recommender systems and online social networks. Given a user, link recommenders identify people in the platform the user might be interested in interacting with. We present RELISON, an extensible framework for running link recommendation experiments. The library provides a wide range of algorithms, along with tools for evaluating the produced recommendations. RELISON includes algorithms and metrics that consider the potential effect of recommendations on the properties of online social networks. For this reason, the library also implements network structure analysis metrics, community detection algorithms, and network diffusion simulation functionalities. 
The library code and documentation is available at https://github.com/ir-uam/RELISON.|在推荐系统和在线社交网络的交叉点上,链接推荐是一个重要且引人注目的问题。给定一个用户,链接推荐器会识别出平台中用户可能感兴趣的交互对象。我们提出了 RELISON,一个运行链路推荐实验的可扩展框架。该库提供了广泛的算法,以及用于评估生成的建议的工具。RELISON 包括一些算法和度量标准,这些算法和度量标准考虑了推荐对在线社交网络属性的潜在影响。出于这个原因,该库还实现了网络结构分析度量、社区检测算法和网络扩散模拟功能。图书馆代码及文件可于 https://github.com/ir-uam/relison 索取。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=RELISON:+A+Framework+for+Link+Recommendation+in+Social+Networks)|0| -|[The Istella22 Dataset: Bridging Traditional and Neural Learning to Rank Evaluation](https://doi.org/10.1145/3477495.3531740)|Domenico Dato, Sean MacAvaney, Franco Maria Nardini, Raffaele Perego, Nicola Tonellotto|University of Glasgow, Glasgow, United Kingdom; Istella, Milano, Italy; University of Pisa, Pisa, Italy; ISTI-CNR, Pisa, Italy|Neural approaches that use pre-trained language models are effective at various ranking tasks, such as question answering and ad-hoc document ranking. However, their effectiveness compared to feature-based Learning-to-Rank (LtR) methods has not yet been well-established. A major reason for this is because present LtR benchmarks that contain query-document feature vectors do not contain the raw query and document text needed for neural models. On the other hand, the benchmarks often used for evaluating neural models, e.g., MS MARCO, TREC Robust, etc., provide text but do not provide query-document feature vectors. In this paper, we present Istella22, a new dataset that enables such comparisons by providing both query/document text and strong query-document feature vectors used by an industrial search engine. The dataset consists of a comprehensive corpus of 8.4M web documents, a collection of query-document pairs including 220 hand-crafted features, relevance judgments on a 5-graded scale, and a set of 2,198 textual queries used for testing purposes. Istella22 enables a fair evaluation of traditional learning-to-rank and transfer ranking techniques on the same data. LtR models exploit the feature-based representations of training samples while pre-trained transformer-based neural rankers can be evaluated on the corresponding textual content of queries and documents. Through preliminary experiments on Istella22, we find that neural re-ranking approaches lag behind LtR models in terms of effectiveness. 
However, LtR models identify the scores from neural models as strong signals.|使用预训练语言模型的神经网络方法可以有效地完成各种排序任务,例如问题回答和即席文档排序。然而,与基于特征的学习到等级(LT)方法相比,它们的有效性还没有得到很好的证实。其中一个主要原因是,现有的包含查询文档特征向量的 LITR 基准测试不包含神经模型所需的原始查询和文档文本。另一方面,常用于评估神经模型的基准,如 MS MARCO、 TREC 鲁棒性等,提供文本但不提供查询文档特征向量。在本文中,我们介绍了新的数据集 Istella22,它通过提供工业搜索引擎使用的查询/文档文本和强查询-文档特征向量来实现这种比较。该数据集包括840万份网络文档的综合语料库、包括220个手工制作的功能的查询-文档对的集合、5级量表的相关性判断,以及用于测试目的的2198个文本查询。Istella22可以对传统的学习排序和转移排序技术在相同数据上进行公平的评估。LTR 模型利用训练样本的基于特征的表示,而预训练的基于变压器的神经排序器可以根据查询和文档的相应文本内容进行评估。通过在 Istella22上的初步实验,我们发现神经重新排序方法在有效性方面落后于 LTR 模型。然而,LTR 模型将神经模型的分数识别为强信号。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=The+Istella22+Dataset:+Bridging+Traditional+and+Neural+Learning+to+Rank+Evaluation)|0| +|[ACORDAR: A Test Collection for Ad Hoc Content-Based (RDF) Dataset Retrieval](https://doi.org/10.1145/3477495.3531729)|Tengteng Lin, Qiaosheng Chen, Gong Cheng, Ahmet Soylu, Basil Ell, Ruoqi Zhao, Qing Shi, Xiaxia Wang, Yu Gu, Evgeny Kharlamov|The Ohio State University, Columbus, OH, USA; Bielefeld University & University of Oslo, Bielefeld, Germany; OsloMet -- Oslo Metropolitan University & Norwegian University of Science and Technology, Oslo, Norway; Nanjing University, Nanjing, China; Bosch Center for Artificial Intelligence & University of Oslo, Renningen, Germany|Ad hoc dataset retrieval is a trending topic in IR research. Methods and systems are evolving from metadata-based to content-based ones which exploit the data itself for improving retrieval accuracy but thus far lack a specialized test collection. In this paper, we build and release the first test collection for ad hoc content-based dataset retrieval, where content-oriented dataset queries and content-based relevance judgments are annotated by human experts who are assisted with a dashboard designed specifically for comprehensively and conveniently browsing both the metadata and data of a dataset. We conduct extensive experiments on the test collection to analyze its difficulty and provide insights into the underlying task.|自组织数据集检索是信息检索领域的一个研究热点。方法和系统正在从基于元数据向基于内容的方法和系统演变,这些方法和系统利用数据本身来提高检索的准确性,但迄今为止还缺乏专门的测试集合。本文构建并发布了第一个基于特定内容的数据集检索测试集合,其中面向内容的数据集查询和基于内容的相关性判断由人类专家进行注释,并辅以一个专门为全面方便地浏览数据集的元数据和数据而设计的仪表板。我们进行了广泛的实验测试收集,以分析其难度,并提供深入了解潜在的任务。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ACORDAR:+A+Test+Collection+for+Ad+Hoc+Content-Based+(RDF)+Dataset+Retrieval)|0| +|[RELISON: A Framework for Link Recommendation in Social Networks](https://doi.org/10.1145/3477495.3531730)|Javier SanzCruzado, Pablo Castells|Universidad Autónoma de Madrid, Madrid, Spain; University of Glasgow, Glasgow, United Kingdom|Link recommendation is an important and compelling problem at the intersection of recommender systems and online social networks. Given a user, link recommenders identify people in the platform the user might be interested in interacting with. We present RELISON, an extensible framework for running link recommendation experiments. The library provides a wide range of algorithms, along with tools for evaluating the produced recommendations. RELISON includes algorithms and metrics that consider the potential effect of recommendations on the properties of online social networks. For this reason, the library also implements network structure analysis metrics, community detection algorithms, and network diffusion simulation functionalities. 
The library code and documentation is available at https://github.com/ir-uam/RELISON.|在推荐系统和在线社交网络的交叉点上,链接推荐是一个重要且引人注目的问题。给定一个用户,链接推荐器会识别出平台中用户可能感兴趣的交互对象。我们提出了 RELISON,一个运行链路推荐实验的可扩展框架。该库提供了广泛的算法,以及用于评估生成的建议的工具。RELISON 包括一些算法和度量标准,这些算法和度量标准考虑了推荐对在线社交网络属性的潜在影响。出于这个原因,该库还实现了网络结构分析度量、社区检测算法和网络扩散模拟功能。图书馆代码及文件可于 https://github.com/ir-uam/relison 索取。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=RELISON:+A+Framework+for+Link+Recommendation+in+Social+Networks)|0| +|[The Istella22 Dataset: Bridging Traditional and Neural Learning to Rank Evaluation](https://doi.org/10.1145/3477495.3531740)|Domenico Dato, Sean MacAvaney, Franco Maria Nardini, Raffaele Perego, Nicola Tonellotto|Istella, Milano, Italy; University of Pisa, Pisa, Italy; ISTI-CNR, Pisa, Italy; University of Glasgow, Glasgow, United Kingdom|Neural approaches that use pre-trained language models are effective at various ranking tasks, such as question answering and ad-hoc document ranking. However, their effectiveness compared to feature-based Learning-to-Rank (LtR) methods has not yet been well-established. A major reason for this is because present LtR benchmarks that contain query-document feature vectors do not contain the raw query and document text needed for neural models. On the other hand, the benchmarks often used for evaluating neural models, e.g., MS MARCO, TREC Robust, etc., provide text but do not provide query-document feature vectors. In this paper, we present Istella22, a new dataset that enables such comparisons by providing both query/document text and strong query-document feature vectors used by an industrial search engine. The dataset consists of a comprehensive corpus of 8.4M web documents, a collection of query-document pairs including 220 hand-crafted features, relevance judgments on a 5-graded scale, and a set of 2,198 textual queries used for testing purposes. Istella22 enables a fair evaluation of traditional learning-to-rank and transfer ranking techniques on the same data. LtR models exploit the feature-based representations of training samples while pre-trained transformer-based neural rankers can be evaluated on the corresponding textual content of queries and documents. Through preliminary experiments on Istella22, we find that neural re-ranking approaches lag behind LtR models in terms of effectiveness. However, LtR models identify the scores from neural models as strong signals.|使用预训练语言模型的神经网络方法可以有效地完成各种排序任务,例如问题回答和即席文档排序。然而,与基于特征的学习到等级(LT)方法相比,它们的有效性还没有得到很好的证实。其中一个主要原因是,现有的包含查询文档特征向量的 LITR 基准测试不包含神经模型所需的原始查询和文档文本。另一方面,常用于评估神经模型的基准,如 MS MARCO、 TREC 鲁棒性等,提供文本但不提供查询文档特征向量。在本文中,我们介绍了新的数据集 Istella22,它通过提供工业搜索引擎使用的查询/文档文本和强查询-文档特征向量来实现这种比较。该数据集包括840万份网络文档的综合语料库、包括220个手工制作的功能的查询-文档对的集合、5级量表的相关性判断,以及用于测试目的的2198个文本查询。Istella22可以对传统的学习排序和转移排序技术在相同数据上进行公平的评估。LTR 模型利用训练样本的基于特征的表示,而预训练的基于变压器的神经排序器可以根据查询和文档的相应文本内容进行评估。通过在 Istella22上的初步实验,我们发现神经重新排序方法在有效性方面落后于 LTR 模型。然而,LTR 模型将神经模型的分数识别为强信号。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=The+Istella22+Dataset:+Bridging+Traditional+and+Neural+Learning+to+Rank+Evaluation)|0| |[Axiomatic Retrieval Experimentation with ir_axioms](https://doi.org/10.1145/3477495.3531743)|Alexander Bondarenko, Maik Fröbe, Jan Heinrich Reimer, Benno Stein, Michael Völske, Matthias Hagen|Martin-Luther-Universität Halle-Wittenberg, Halle, Germany; Bauhaus-Universität Weimar, Weimar, Germany|Axiomatic approaches to information retrieval have played a key role in determining basic constraints that characterize good retrieval models. 
Beyond their importance in retrieval theory, axioms have been operationalized to improve an initial ranking, to "guide" retrieval, or to explain some model's rankings. However, recent open-source retrieval frameworks like PyTerrier and Pyserini, which made it easy to experiment with sparse and dense retrieval models, have not included any retrieval axiom support so far. To fill this gap, we propose ir_axioms, an open-source Python framework that integrates retrieval axioms with common retrieval frameworks. We include reference implementations for 25 retrieval axioms, as well as components for preference aggregation, re-ranking, and evaluation. New axioms can easily be defined by implementing an abstract data type or by intuitively combining existing axioms with Python operators or regression. Integration with PyTerrier and ir_datasets makes standard retrieval models, corpora, topics, and relevance judgments---including those used at TREC---immediately accessible for axiomatic experimentation. Our experiments on the TREC Deep Learning tracks showcase some potential research questions that ir_axioms can help to address.|公理化的信息检索方法在确定优秀检索模型的基本约束条件方面发挥了关键作用。除了在检索理论中的重要性,公理已经被用来改进初始排名、“指导”检索或解释某些模型的排名。然而,最近的开放源代码检索框架,如 PyTerrier 和 Pyserini,使得对稀疏和密集的检索模型进行试验变得容易,到目前为止还没有包含任何检索公理支持。为了填补这个空白,我们提出 ir_axioms,这是一个开放源码的 Python 框架,它将检索公理与通用检索框架集成在一起。我们包括用于25个检索公理的参考实现,以及用于偏好聚合、重新排序和评估的组件。通过实现一个抽象数据类型或直观地将现有公理与 Python 运算符或回归相结合,可以很容易地定义新公理。与 PyTerrier 和 ir_datasets 的集成使得标准检索模型、语料库、主题和相关性判断——包括 TREC 使用的那些——可以立即用于公理化实验。我们在 TREC 深度学习轨道上的实验展示了一些 ir_axioms 可以帮助解决的潜在研究问题。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Axiomatic+Retrieval+Experimentation+with+ir_axioms)|0| |[Knowledge Graph Question Answering Datasets and Their Generalizability: Are They Enough for Future Research?](https://doi.org/10.1145/3477495.3531751)|Longquan Jiang, Ricardo Usbeck|University Hamburg, Hamburg, Germany|Existing approaches on Question Answering over Knowledge Graphs (KGQA) have weak generalizability. That is often due to the standard i.i.d. assumption on the underlying dataset. Recently, three levels of generalization for KGQA were defined, namely i.i.d., compositional, zero-shot. We analyze 25 well-known KGQA datasets for 5 different Knowledge Graphs (KGs). We show that according to this definition many existing and online available KGQA datasets are either not suited to train a generalizable KGQA system or that the datasets are based on discontinued and out-dated KGs. Generating new datasets is a costly process and, thus, is not an alternative to smaller research groups and companies. In this work, we propose a mitigation method for re-splitting available KGQA datasets to enable their applicability to evaluate generalization, without any cost and manual effort. We test our hypothesis on three KGQA datasets, i.e., LC-QuAD, LC-QuAD 2.0 and QALD-9. Experiments on re-split KGQA datasets demonstrate its effectiveness towards generalizability.
The code and a unified way to access 18 available datasets is online at https://github.com/semantic-systems/KGQA-datasets as well as https://github.com/semantic-systems/KGQA-datasets-generalization.|现有的知识图问答方法具有较弱的泛化能力。这通常是由于基础数据集上的标准 i.id 假设造成的。最近,定义了 KGQA 的三个推广水平,即标识、合成和零拍。我们分析了5个不同的知识图表(KG)的25个著名的 KGQA 数据集。我们表明,根据这个定义,许多现有的和在线可用的 KGQA 数据集要么不适合训练一个可推广的 KGQA 系统,要么数据集基于不连续的和过时的 KG。生成新的数据集是一个昂贵的过程,因此,不能替代较小的研究团体和公司。在这项工作中,我们提出了一个缓解方法,重新分裂可用的 KGQA 数据集,使其适用性评估一般化,没有任何成本和人工的努力。我们在三个 KGQA 数据集上检验我们的假设,即 LC-QuAD,LC-QuAD 2.0和 QALD-9)。通过对 KGQA 数据集的重新分割实验,证明了该算法的有效性。该代码和一个统一的方式访问18个可用的数据集是在线的 https://github.com/semantic-systems/kgqa-datasets 和 https://github.com/semantic-systems/kgqa-datasets-generalization。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Knowledge+Graph+Question+Answering+Datasets+and+Their+Generalizability:+Are+They+Enough+for+Future+Research?)|0| |[Golden Retriever: A Real-Time Multi-Modal Text-Image Retrieval System with the Ability to Focus](https://doi.org/10.1145/3477495.3531666)|Florian Schneider, Chris Biemann|Universität Hamburg, Hamburg, Germany|In this work, we present the Golden Retriever, a system leveraging state-of-the-art visio-linguistic models (VLMs) for real-time text-image retrieval. The unique feature of our system is that it can focus on words contained in the textual query, i.e., locate and high-light them within retrieved images. An efficient two-stage process implements real-time capability and the ability to focus. Therefore, we first drastically reduce the number of images processed by a VLM. Then, in the second stage, we rank the images and highlight the focussed word using the outputs of a VLM. Further, we introduce a new and efficient algorithm based on the idea of TF-IDF to retrieve images for short textual queries. One of multiple use cases where we employ the Golden Retriever is a language learner scenario, where visual cues for "difficult" words within sentences are provided to improve a user's reading comprehension. However, since the backend is completely decoupled from the frontend, the system can be integrated into any other application where images must be retrieved fast. We demonstrate the Golden Retriever with screenshots of a minimalistic user interface.|在这项工作中,我们介绍了金毛寻回犬,一个利用最先进的视觉语言模型(vlm)进行实时文本图像检索的系统。我们的系统的独特之处在于它可以专注于文本查询中包含的单词,也就是说,在检索到的图像中定位并高亮显示它们。一个有效的两阶段过程实现实时能力和集中能力。因此,我们首先大幅度减少图像处理的 VLM 数量。然后,在第二阶段,我们使用 VLM 的输出对图像进行排序并突出显示聚焦词。在此基础上,提出了一种基于 TF-IDF 思想的短文本查询图像检索算法。我们使用金毛寻回犬的多个用例之一是语言学习者场景,其中提供句子中“难”词的视觉提示,以提高用户的阅读理解。但是,由于后端与前端完全解耦,因此系统可以集成到任何其他必须快速检索图像的应用程序中。我们用一个极简的用户界面截图来演示这个金毛寻回犬。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Golden+Retriever:+A+Real-Time+Multi-Modal+Text-Image+Retrieval+System+with+the+Ability+to+Focus)|0| |[ZeroMatcher: A Cost-Off Entity Matching System](https://doi.org/10.1145/3477495.3531661)|Congcong Ge, Xiaocan Zeng, Lu Chen, Yunjun Gao|Zhejiang University, Ningbo, China; Zhejiang University, Hangzhou, China|Entity Matching (EM) aims to find data instances from different sources that refer to the same real-world entity. The existing EM techniques can be either costly or tailored for a specific data type. 
We present ZeroMatcher, a cost-off entity matching system, which supports (i) handling EM tasks with different data types, including relational tables and knowledge graphs; (ii) keeping its EM performance always competitive by enabling the sub-modules to be updated in a lightweight manner, thus reducing development costs; and (iii) performing EM without human annotations to further slash the labor costs. First, ZeroMatcher automatically suggests users a set of appropriate modules for EM according to the data types of the input datasets. Users could specify the modules for the subsequent EM process according to their preferences. Alternatively, users are able to customize the modules of ZeroMatcher. Then, the system proceeds to the EM task, where users can track the entire EM process and monitor the memory usage changes in real-time. When the EM process is completed, ZeroMatcher visualizes the EM results from different aspects to ease the understanding for users. Finally, ZeroMatcher provides EM results evaluation, enabling users to compare the effectiveness among different parameter settings.|实体匹配(Entity Matching,EM)的目标是从引用相同实体的不同数据源中找到数据实例。现有的 EM 技术要么成本高昂,要么针对特定的数据类型进行量身定制。我们提出了 ZeroMatcher,一个成本实体匹配系统,它支持(i)处理不同数据类型的 EM 任务,包括关系表和知识图; (ii)保持其 EM 性能始终具有竞争力,使子模块以轻量级的方式更新,从而降低开发成本; 以及(iii)执行 EM 而不需要人工注释,以进一步降低劳动力成本。首先,ZeroMatcher 根据输入数据集的数据类型自动为用户建议一组适合 EM 的模块。用户可以根据自己的偏好为后续 EM 过程指定模块。或者,用户可以自定义 ZeroMatcher 的模块。然后,系统继续执行 EM 任务,在这个任务中,用户可以跟踪整个 EM 进程并实时监视内存使用的变化。当 EM 过程完成后,ZeroMatcher 从不同方面可视化 EM 结果,以便于用户理解。最后,ZeroMatcher 提供 EM 结果评估,使用户能够比较不同参数设置之间的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ZeroMatcher:+A+Cost-Off+Entity+Matching+System)|0| |[QFinder: A Framework for Quantity-centric Ranking](https://doi.org/10.1145/3477495.3531672)|Satya Almasian, Milena Bruseva, Michael Gertz|Heidelberg University, Heidelberg, Germany|Quantities shape our understanding of measures and values, and they are an important means to communicate the properties of objects. Often, search queries contain numbers as retrieval units, e.g., "iPhone that costs less than 800 Euros''. Yet, modern search engines lack a proper understanding of numbers and units. In queries and documents, search engines handle them as normal keywords and therefore are ignorant of relative conditions between numbers, such as greater than or less than, or, more generally, the numerical proximity of quantities. In this work, we demonstrate QFinder, our quantity-centric framework for ranking search results for queries with quantity constraints. We also open-source our new ranking method as an Elasticsearch plug-in for future use. Our demo is available at: https://qfinder.ifi.uni-heidelberg.de/|数量塑造了我们对度量和价值的理解,它们是沟通对象属性的重要手段。通常,搜索查询包含数字作为检索单位,例如,“成本低于800欧元的 iPhone”。然而,现代搜索引擎缺乏对数字和单位的正确理解。在查询和文档中,搜索引擎将它们作为普通关键字处理,因此不知道数字之间的相对条件,例如大于或小于,或者更一般地说,数量的数值接近性。在这项工作中,我们展示了 QFinder,我们的数量为中心的框架排序查询搜索结果的数量约束。我们还将我们的新排名方法作为一个 Elasticsearch 插件开源以供将来使用。我们的演示可以在 https://qfinder.ifi.uni-heidelberg.de/下载|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=QFinder:+A+Framework+for+Quantity-centric+Ranking)|0| -|[CHERCHE: A New Tool to Rapidly Implement Pipelines in Information Retrieval](https://doi.org/10.1145/3477495.3531695)|Raphaël Sourty, José G. 
Moreno, Lynda Tamine, FrançoisPaul Servant|Université Paul Sabatier, IRIT, Toulouse, France; Université Paul Sabatier, IRIT & Renault, Toulouse, France; Renault, Boulogne-Billancourt, France|In this demo paper, we present a new open-source python module for building information retrieval pipelines with transformers namely CHERCHE. Our aim is to propose an easy to plug tool capable to execute, simple but strong, state-of-the-art information retrieval models. To do so, we have integrated classical models based on lexical matching but also recent models based on semantic matching. Indeed, a large number of models available on public hubs can be now tested on information retrieval tasks with only a few lines. CHERCHE is oriented to newcomers into the neural information retrieval field that want to use transformer-based models in small collections without struggling with heavy tools. The code and documentation of CHERCHE is public available at https://github.com/raphaelsty/cherche|在本演示文件中,我们提出了一个新的开源 python 模块,用于建设带有变压器的信息检索管道,即 CHERCHE。我们的目标是提出一个容易插入的工具,能够执行,简单但强大,国家的最先进的信息检索模型。为此,我们集成了基于词汇匹配的经典模型和基于语义匹配的最新模型。事实上,在公共集线器上提供的大量模型,现在只需几行代码就可以在信息检索任务上进行测试。CHERCHE 面向的是神经信息检索领域的新手,他们希望在小型集合中使用基于变压器的模型,而无需使用笨重的工具。CHERCHE 的代码和文档可在 https://github.com/raphaelsty/CHERCHE 查阅|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=CHERCHE:+A+New+Tool+to+Rapidly+Implement+Pipelines+in+Information+Retrieval)|0| +|[CHERCHE: A New Tool to Rapidly Implement Pipelines in Information Retrieval](https://doi.org/10.1145/3477495.3531695)|Raphaël Sourty, José G. Moreno, Lynda Tamine, FrançoisPaul Servant|Renault, Boulogne-Billancourt, France; Université Paul Sabatier, IRIT & Renault, Toulouse, France; Université Paul Sabatier, IRIT, Toulouse, France|In this demo paper, we present a new open-source python module for building information retrieval pipelines with transformers namely CHERCHE. Our aim is to propose an easy to plug tool capable to execute, simple but strong, state-of-the-art information retrieval models. To do so, we have integrated classical models based on lexical matching but also recent models based on semantic matching. Indeed, a large number of models available on public hubs can be now tested on information retrieval tasks with only a few lines. CHERCHE is oriented to newcomers into the neural information retrieval field that want to use transformer-based models in small collections without struggling with heavy tools. The code and documentation of CHERCHE is public available at https://github.com/raphaelsty/cherche|在本演示文件中,我们提出了一个新的开源 python 模块,用于建设带有变压器的信息检索管道,即 CHERCHE。我们的目标是提出一个容易插入的工具,能够执行,简单但强大,国家的最先进的信息检索模型。为此,我们集成了基于词汇匹配的经典模型和基于语义匹配的最新模型。事实上,在公共集线器上提供的大量模型,现在只需几行代码就可以在信息检索任务上进行测试。CHERCHE 面向的是神经信息检索领域的新手,他们希望在小型集合中使用基于变压器的模型,而无需使用笨重的工具。CHERCHE 的代码和文档可在 https://github.com/raphaelsty/CHERCHE 查阅|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=CHERCHE:+A+New+Tool+to+Rapidly+Implement+Pipelines+in+Information+Retrieval)|0| |[Arm: Efficient Learning of Neural Retrieval Models with Desired Accuracy by Automatic Knowledge Amalgamation](https://doi.org/10.1145/3477495.3531664)|Linzhu Yu, Dawei Jiang, Ke Chen, Lidan Shou|Zhejiang University, Hangzhou, China|In recent years, there has been increasing interest in adopting published neural retrieval models learned from corpora for text retrieval. Although these models achieve excellent retrieval performance, in terms of popular accuracy metrics, on datasets they have been trained, their performance on new text data might degrade. 
To obtain the desired retrieval performance on both the data used in training and the latest data collected after training, the simple approach of learning a new model from both datasets is not always feasible since the annotated dataset used in training is often not published along with the learned model. Knowledge amalgamation (KA) is an emerging technique to deal with this problem of inaccessibility of data used in previous training. KA learns a new model (called a student model) from new data by reusing (called amalgamating) a number of trained models (called teacher models) instead of accessing the teachers' original training data. However, in order to efficiently learn an accurate student model, the classical KA approach requires manual selection of an appropriate subset of teacher models for amalgamation. This manual procedure for selecting teacher models prevents the classical KA from being scaled to retrieval tasks for which a large number of candidate teacher models are ready to be reused. This paper presents Arm, an intelligent system for efficiently learning a neural retrieval model with the desired accuracy on incoming data by automatically amalgamating a subset of teacher models (called a teacher model combination or simply combination) among a large number of teacher models. To filter combinations that fail to produce accurate student models, Arm employs Bayesian optimization to derive an accuracy prediction model based on sampled amalgamation tasks. Then, Arm uses the derived prediction model to exclude unqualified combinations without training the rest combinations. To speed up training, Arm introduces a cost model that picks the teacher model combination with the minimal training cost among all qualified teacher model combinations to produce the final student model. This paper will demonstrate the major workflow of Arm and present the produced student models to users.|近年来,从语料库中学习的已发表的神经检索模型越来越多地被用于文本检索。尽管这些模型在它们已经训练过的数据集上获得了优异的检索性能,但是它们在新文本数据上的性能可能会下降。为了获得对训练中使用的数据和训练后收集的最新数据的期望检索性能,从两个数据集中学习新模型的简单方法并不总是可行的,因为训练中使用的注释数据集通常不与学习模型一起发布。知识融合(KA)是一种新兴技术,用于解决以往训练所用数据无法访问的问题。KA 从新的数据中学习新的模型(称为学生模型) ,方法是重用(称为合并)一些经过训练的模型(称为教师模型) ,而不是访问教师的原始培训数据。然而,为了有效地学习一个准确的学生模型,经典的 KA 方法需要手动选择合适的教师模型子集进行合并。这个选择教师模型的手工程序阻止了经典的 KA 被缩放到检索任务,其中大量的候选教师模型可以被重用。本文提出了一个智能系统 Arm,它通过在大量的教师模型中自动合并一个教师模型子集(称为教师模型组合或简单组合)来有效地学习一个神经检索模型,该模型对输入数据具有期望的准确性。为了过滤掉不能产生准确学生模型的组合,Arm 使用贝叶斯优化得到一个基于采样融合任务的精度预测模型。然后,利用导出的预测模型排除不合格组合,而不对剩余组合进行训练。为了加速培训,Arm 引入了一个成本模型,在所有合格的教师模型组合中选择训练成本最小的组合,从而产生最终的学生模型。本文将演示 Arm 的主要工作流程,并将生成的学生模型提供给用户。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Arm:+Efficient+Learning+of+Neural+Retrieval+Models+with+Desired+Accuracy+by+Automatic+Knowledge+Amalgamation)|0|
In this work, we present an entity linking system that leverages a transformer-based BERT encoder (the BLINK model) to connect the product and organization type entities in business phone conversations to their corresponding Wikipedia entries. We propose a dimensionality reduction technique via utilizing an auto encoder that can effectively compress the dimension of the pre-trained BERT embeddings to 256 from the original size of 1024. This allows our entity linking system to significantly optimize the space requirement when deployed in a resource limited cloud machine while reducing the inference time along with retaining high accuracy.|实体链接系统将文本中的命名实体链接到知识库中的相应条目。近年来,利用变压器体系结构构建实体连接系统引起了人们的广泛关注。然而,在资源有限的工业生产环境中部署一个基于变压器的神经实体连接系统是一个具有挑战性的任务。在这项工作中,我们提出了一个实体链接系统,该系统利用基于转换器的 BERT 编码器(BLINK 模型)将商务电话会话中的产品和组织类型实体连接到相应的 Wikipedia 条目。我们提出了一种降维技术,通过使用自动编码器,可以有效地将预先训练的 BERT 嵌入的尺寸从原来的1024压缩到256。这使得我们的实体连接系统在部署在资源有限的云计算机中时,能够显著优化空间需求,同时减少推理时间,并保持高精度。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=An+Auto+Encoder-based+Dimensionality+Reduction+Technique+for+Efficient+Entity+Linking+in+Business+Phone+Conversations)|0| |[Applications and Future of Dense Retrieval in Industry](https://doi.org/10.1145/3477495.3536324)|Yubin Kim|Etsy, Inc., Brooklyn, NY, USA|Large-scale search engines are often designed as tiered systems with at least two layers. The L1 candidate retrieval layer efficiently generates a subset of potentially relevant documents (typically ~1000 documents) from a corpus many orders of magnitude larger in size. L1 systems emphasize efficiency and are designed to maximize recall. The L2 re-ranking layer uses a more computationally expensive, but more accurate model (e.g. learning-to-rank or neural model) to re-rank the candidates generated by L1 in order to maximize precision of the final result list. Traditionally, candidate retrieval was performed with an inverted index data structure, with exact lexical matching. Candidates are ordered by a dot-product-like scoring function f(q,d) where q and d are sparse vectors containing token weights, typically derived from the token's frequency in the document/query and corpus. The inverted index enables sub-linear ranking of the documents. Due to the sparse vector representation of the documents and queries, lexical match retrieval systems have also been called sparse retrieval. To contrast, dense retrieval represents queries and documents by embedding the text into lower dimensional dense vectors. Candidate documents are scored based on the distance between the query and document embedding vectors. Practically, the similarity computations are made efficiently with approximate k-nearest neighbours (ANN) systems.
In this panel, we bring together experts in dense retrieval across multiple industry applications, including web search, enterprise and personal search, e-commerce, and out-of-domain retrieval.|大型搜索引擎通常被设计成至少有两层的分层系统。L1 候选检索层有效地从一个大许多个数量级的语料库中生成一个可能相关的文档子集(通常约1000个文档)。L1系统强调效率,旨在最大限度地提高召回率。L2重新排序层使用一个计算更加昂贵,但更加精确的模型(例如学习到排序或神经模型)来重新排序 L1产生的候选文档,以最大限度地提高最终结果列表的精确度。传统的候选检索采用倒排索引数据结构,并且采用精确的词法匹配。候选者按照类似点乘积的评分函数 f (q,d)排序,其中 q 和 d 是包含令牌权重的稀疏向量,通常来自文档/查询和语料库中令牌的频率。倒排索引允许对文档进行次线性排序。由于文档和查询的稀疏向量表示,词汇匹配检索系统也被称为稀疏检索。相比之下,密集检索通过将文本嵌入低维密集向量来表示查询和文档。候选文档根据查询和文档嵌入向量之间的距离进行评分。在实际应用中,利用近似 k 最近邻(ANN)系统可以有效地进行相似度计算。在这个专题讨论小组中,我们汇集了跨多个行业应用的密集检索方面的专家,包括网络搜索、企业和个人搜索、电子商务和域外检索。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Applications+and+Future+of+Dense+Retrieval+in+Industry)|0| |[Flipping the Script: Inverse Information Seeking Dialogues for Market Research](https://doi.org/10.1145/3477495.3536326)|Josh Seltzer, Kathy Cheng, Shi Zong, Jimmy Lin|Nexxt Intelligence, Toronto, Canada; University of Waterloo, Waterloo, Canada|Information retrieval has traditionally been framed in terms of searching and extracting information from mostly static resources. Interactive information retrieval (IIR) has widened the scope, with interactive dialogues largely playing the role of clarifying (i.e., making explicit, and/or refining) the information search space. Informed by market research practices, we seek to reframe IIR as a process of eliciting novel information from human interlocutors, with a chatbot-inspired virtual agent playing the role of an interviewer. This reframing flips conventional IIR into what we call an inverse information seeking dialogue, wherein the virtual agent recurrently extracts information from human utterances and poses questions intended to elicit related information. In this work, we introduce and provide a formal definition of an inverse information seeking agent, outline some of its unique challenges, and propose our novel framework to tackle this problem based on techniques from natural language processing (NLP) and IIR.|传统上,信息检索的框架是从大多为静态的资源中搜索和提取信息。交互式信息检索(IIR)扩大了搜索范围,交互式对话主要扮演澄清(即明确和/或完善)信息搜索空间的角色。通过市场研究实践,我们试图将 IIR 重新定义为从人类对话者那里获取新信息的过程,由聊天机器人启发的虚拟代理扮演采访者的角色。这种重构将传统的 IIR 翻转为我们所说的反向信息寻求对话,在这种对话中,虚拟代理不断地从人类的话语中提取信息,并提出旨在引出相关信息的问题。本文首先介绍并给出了反向信息搜索代理的形式化定义,概述了它所面临的一些独特挑战,并提出了基于自然语言处理(NLP)和 IIR 技术的反向信息搜索代理框架。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Flipping+the+Script:+Inverse+Information+Seeking+Dialogues+for+Market+Research)|0| -|[Information Ecosystem Threats in Minoritized Communities: Challenges, Open Problems and Research Directions](https://doi.org/10.1145/3477495.3536327)|Shiri DoriHacohen, Scott A. Hale|University of Connecticut & AuCoDe, Storrs, CT, USA; Meedan & University of Oxford, San Francisco, CA, USA|Journalists, fact-checkers, academics, and community media are overwhelmed in their attempts to support communities suffering from gender-, race- and ethnicity-targeted information ecosystem threats, including but not limited to misinformation, hate speech, weaponized controversy and online-to-offline harassment. Yet, for a plethora of reasons, minoritized groups are underserved by current approaches to combat such threats. In this panel, we will present and discuss the challenges and open problems facing such communities and the researchers hoping to serve them.
We will also discuss the current state-of-the-art as well as the most promising future directions, both within IR specifically, across Computer Science more broadly, as well as that requiring transdisciplinary and cross-sectoral collaborations. The panel will attract both IR practitioners and researchers and include at least one panelist outside of IR, with unique expertise in this space.|记者、事实核查员、学者和社区媒体在试图支持遭受针对性别、种族和民族的信息生态系统威胁的社区时,不堪重负,这些威胁包括但不限于错误信息、仇恨言论、武器化的争议和线上到线下的骚扰。然而,由于种种原因,少数群体在目前应对这些威胁的方法中得不到充分的服务。在这个小组中,我们将介绍和讨论这些社区面临的挑战和公开的问题,以及研究人员希望为他们服务的问题。我们还将讨论当前最先进的技术,以及最有前途的未来方向,既包括在信息研究领域,也包括更广泛的计算机科学领域,还包括需要跨学科和跨部门合作的领域。该小组将吸引国际关系从业人员和研究人员,并包括至少一个国际关系以外的小组成员,在这个领域具有独特的专业知识。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Information+Ecosystem+Threats+in+Minoritized+Communities:+Challenges,+Open+Problems+and+Research+Directions)|0| +|[Information Ecosystem Threats in Minoritized Communities: Challenges, Open Problems and Research Directions](https://doi.org/10.1145/3477495.3536327)|Shiri DoriHacohen, Scott A. Hale|Meedan & University of Oxford, San Francisco, CA, USA; University of Connecticut & AuCoDe, Storrs, CT, USA|Journalists, fact-checkers, academics, and community media are overwhelmed in their attempts to support communities suffering from gender-, race- and ethnicity-targeted information ecosystem threats, including but not limited to misinformation, hate speech, weaponized controversy and online-to-offline harassment. Yet, for a plethora of reasons, minoritized groups are underserved by current approaches to combat such threats. In this panel, we will present and discuss the challenges and open problems facing such communities and the researchers hoping to serve them. We will also discuss the current state-of-the-art as well as the most promising future directions, both within IR specifically, across Computer Science more broadly, as well as that requiring transdisciplinary and cross-sectoral collaborations. The panel will attract both IR practitioners and researchers and include at least one panelist outside of IR, with unique expertise in this space.|记者、事实核查员、学者和社区媒体在试图支持遭受针对性别、种族和民族的信息生态系统威胁的社区时,不堪重负,这些威胁包括但不限于错误信息、仇恨言论、武器化的争议和线上到线下的骚扰。然而,由于种种原因,少数群体在目前应对这些威胁的方法中得不到充分的服务。在这个小组中,我们将介绍和讨论这些社区面临的挑战和公开的问题,以及研究人员希望为他们服务的问题。我们还将讨论当前最先进的技术,以及最有前途的未来方向,既包括信息检索(IR)领域,也包括更广泛的计算机科学领域,还包括需要跨学科和跨部门合作的领域。该小组将吸引信息检索(IR)从业人员和研究人员,并包括至少一位 IR 领域以外的小组成员,在这个领域具有独特的专业知识。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Information+Ecosystem+Threats+in+Minoritized+Communities:+Challenges,+Open+Problems+and+Research+Directions)|0|
These are then used in downstream biomedical work.|提取搜索已被用于创建匹配查询和语法模式的数据集,但对如何处理这些数据集的关注较少。我们提出了一个针对生物医学文本的两阶段系统。首先,它使用关键字和语法匹配的强大组合创建自定义数据集。然后,我们返回相关词汇的列表,提供语义搜索,训练一个大型语言模型,一个基于综合数据的 QA 模型,这些结果的汇总模型,等等。这些然后用于下游的生物医学工作。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Extractive+Search+for+Analysis+of+Biomedical+Texts)|0| |[Recent Advances in Retrieval-Augmented Text Generation](https://doi.org/10.1145/3477495.3532682)|Deng Cai, Yan Wang, Lemao Liu, Shuming Shi|The Chinese University of Hong Kong, Hong Kong, China; Tencent AI Lab, Shenzhen, China|Recently retrieval-augmented text generation has achieved state-of-the-art performance in many NLP tasks and has attracted increasing attention of the NLP and IR community, this tutorial thereby aims to present recent advances in retrieval-augmented text generation comprehensively and comparatively. It firstly highlights the generic paradigm of retrieval-augmented text generation, then reviews notable works for different text generation tasks including dialogue generation, machine translation, and other generation tasks, and finally points out some limitations and shortcomings to facilitate future research.|近年来,检索增强文本生成技术在许多自然语言处理任务中都取得了很好的效果,引起了自然语言处理和信息检索领域的广泛关注,本教程旨在全面、比较地介绍检索增强文本生成技术的最新进展。首先介绍了检索增强文本生成的一般范式,然后回顾了不同文本生成任务(包括对话生成、机器翻译和其他生成任务)中值得注意的工作,最后指出了一些局限性和不足,以利于今后的研究。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Recent+Advances+in+Retrieval-Augmented+Text+Generation)|0| |[Adaptive Dialogue Management for Conversational Information Elicitation](https://doi.org/10.1145/3477495.3531684)|Harshita Sahijwani|Emory University, Atlanta, GA, USA|Information elicitation conversations, for example, when a medical professional asks about a patient's history or a sales agent tries to understand their client's preferences, often start with a set of routine questions. The interviewer asks a predetermined set of questions conversationally, adapting them to the unique characteristics and context of an individual. Multiple-choice questionnaires are commonly used as a screening tool before the client sees the professional for more efficient information elicitation [5]. However, recent proof-of-concept studies show that users are more likely to report their symptoms to an embodied conversational agent (ECA) than on a pen-and-paper survey [3], and rate ECAs highly on user experience [4]. Chatbots allow the user to give free-form responses and ask clarification questions instead of having to interpret and choose from a list of given options. They can also keep the user engaged by sharing relevant information and offering empathetic acknowledgments when appropriate. However, many of the technical challenges involved in building such a conversational agent remain unsolved.|例如,当医疗专业人员询问病人的病史或销售代理人试图了解他们的客户的偏好时,信息引导谈话通常从一组常规问题开始。面试官会用对话的方式提出一系列预先设定好的问题,并根据个人的独特性格和背景进行调整。多项选择问卷通常被用作在客户见到专业人员之前的筛选工具,以便更有效地获取信息[5]。然而,最近的概念验证研究表明,用户更可能向具体的会话代理(ECA)报告他们的症状,而不是纸笔调查[3] ,并且对用户体验给予 ECA 高度评价[4]。聊天机器人允许用户给出自由形式的回答,并提出澄清问题,而不必从给定的选项列表中进行解释和选择。他们还可以通过分享相关信息和在适当的时候提供感同身受的确认来保持用户的参与度。然而,构建这样一个会话代理所涉及的许多技术挑战仍然没有得到解决。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Adaptive+Dialogue+Management+for+Conversational+Information+Elicitation)|0| |[Pre-Training for Mathematics-Aware Retrieval](https://doi.org/10.1145/3477495.3531680)|Anja Reusch|Technische Universität Dresden, Dresden, Germany|Mathematical formulas are an important tool to concisely communicate ideas in science and education, used to clarify descriptions, calculations or derivations. 
When searching in scientific literature, mathematical notation, which is often written using the LaTeX notation, therefore plays a crucial role that should not be neglected. The task of mathematics-aware information retrieval is to retrieve relevant passages given a query or question, both of which can include natural language and mathematical formulas. As in many domains that rely on Natural Language Understanding, transformer-based models are now dominating the field of information retrieval [3]. Apart from their size and the transformer-encoder architecture, pre-training is considered to be a key factor for the high performance of these models. It has also been shown that domain-adaptive pre-training improves their performance on down-stream tasks even further [2], especially when the vocabulary overlap between pre-training and in-domain data is low. This is also the case for the domain of mathematical documents.|数学公式是在科学和教育中简明地传达思想的重要工具,用于澄清描述、计算或推导。在搜索科学文献时,数学符号(通常使用 LaTeX 符号书写)起着不容忽视的关键作用。具有数学意识的信息检索的任务是检索给出查询或问题的相关段落,这些段落既可以包括自然语言,也可以包括数学公式。正如许多依赖于自然语言理解的领域一样,基于转换器的模型目前在信息检索领域占据主导地位。除了它们的大小和变压器编码器的架构,预训练被认为是这些模型的高性能的一个关键因素。研究还表明,领域自适应预训练可以进一步提高他们在下游任务中的表现,尤其是当预训练和领域内数据的词汇重叠度较低时。数学文献领域也是如此。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Pre-Training+for+Mathematics-Aware+Retrieval)|0| |[Explainable Conversational Question Answering over Heterogeneous Sources](https://doi.org/10.1145/3477495.3531688)|Philipp Christmann|Max Planck Institute for Informatics & Saarland University, Saarbrücken, Germany|State-of-the-art conversational question answering (ConvQA) operates over homogeneous sources of information: either a knowledge base (KB), or a text corpus, or a collection of tables. This inherently limits the answer coverage of ConvQA systems. Therefore, during my PhD, we would like to tap into heterogeneous sources for answering conversational questions. Further, we plan to investigate the explainability of such ConvQA systems, to identify what helps users in understanding the answer derivation process.|最先进的会话问答(ConvQA)操作于同质的信息源: 知识库(KB)、文本语料库或表格集合。这固有地限制了 ConvQA 系统的应答覆盖范围。因此,在我攻读博士学位期间,我们希望利用异构资源来回答会话中的问题。此外,我们计划调查这样的 ConvQA 系统的可解释性,以确定什么有助于用户理解答案的推导过程。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Explainable+Conversational+Question+Answering+over+Heterogeneous+Sources)|0| |[KA-Recsys: Patient Focused Knowledge Appropriate Health Recommender System](https://doi.org/10.1145/3477495.3531687)|Khushboo Thaker||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=KA-Recsys:+Patient+Focused+Knowledge+Appropriate+Health+Recommender+System)|0| -|[Bilateral Self-unbiased Learning from Biased Implicit Feedback](https://doi.org/10.1145/3477495.3531946)|Jaewoong Lee, Seongmin Park, Joonseok Lee, Jongwuk Lee|Sungkyunkwan Univ, Dept Artificial Intelligence, Seoul, South Korea; Sungkyunkwan Univ, Dept Elect & Comp Engn, Seoul, South Korea|Implicit feedback has been widely used to build commercial recommender systems. Because observed feedback represents users' click logs, there is a semantic gap between true relevance and observed feedback. More importantly, observed feedback is usually biased towards popular items, thereby overestimating the actual relevance of popular items. Although existing studies have developed unbiased learning methods using inverse propensity weighting (IPW) or causal reasoning, they solely focus on eliminating the popularity bias of items.
In this paper, we propose a novel unbiased recommender learning model, namely BIlateral SElf-unbiased Recommender (BISER), to eliminate the exposure bias of items caused by recommender models. Specifically, BISER consists of two key components: (i) self-inverse propensity weighting (SIPW) to gradually mitigate the bias of items without incurring high computational costs; and (ii) bilateral unbiased learning (BU) to bridge the gap between two complementary models in model predictions, i.e., user- and item-based autoencoders, alleviating the high variance of SIPW. Extensive experiments show that BISER consistently outperforms state-of-the-art unbiased recommender models over several datasets, including Coat, Yahoo! R3, MovieLens, and CiteULike.|隐式反馈已被广泛用于构建商业推荐系统。因为观察到的反馈代表用户的点击日志,所以在真正的相关性和观察到的反馈之间存在语义差距。更重要的是,观察到的反馈通常偏向于流行项目,从而高估了流行项目的实际相关性。虽然现有的研究已经开发出使用反倾向加权(IPW)或因果推理的无偏学习方法,但他们只关注于消除项目的流行偏差。本文提出了一种新的无偏推荐学习模型,即双边自我无偏推荐模型(BISER) ,以消除由推荐模型引起的项目暴露偏差。具体而言,BISER 由两个关键组成部分组成: (i)自逆倾向加权(SIPW) ,以逐渐减轻项目的偏差,而不会产生高计算成本; (ii)双边无偏学习(BU) ,以弥补模型预测中两个互补模型之间的差距,即用户和基于项目的自动编码器,减轻 SIPW 的高方差。大量的实验表明,BISER 始终优于几个数据集,包括 Coat,Yahoo! R3,MovieLens 和 CiteULike 的最先进的无偏推荐模型。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Bilateral+Self-unbiased+Learning+from+Biased+Implicit+Feedback)|0| -|[Why do Semantically Unrelated Categories Appear in the Same Session?: A Demand-aware Method](https://doi.org/10.1145/3477495.3531806)|Liqi Yang, Linhao Luo, Xiaofeng Zhang, Fengxin Li, Xinni Zhang, Zelin Jiang, Shuai Tang|China Merchants Securities Co., Ltd, Shenzhen, China; Harbin Institute of Technology, Shenzhen, Shenzhen, China|Session-based recommendation has recently attracted more and more research efforts. Most existing approaches are intuitively proposed to discover users' potential preferences or interests from the anonymous session data. This apparently ignores the fact that these sequential behavior data usually reflect session user's potential demand, i.e., a semantic level factor, and therefore how to estimate underlying demands from a session has become a challenging task. To tackle the aforementioned issue, this paper proposes a novel demand-aware graph neural network model. Particularly, a demand modeling component is designed to extract the underlying multiple demands of each session. Then, the demand-aware graph neural network is designed to first construct session demand graphs and then learn the demand-aware item embeddings to make the recommendation. The mutual information loss is further designed to enhance the quality of the learnt embeddings. Extensive experiments have been performed on two real-world datasets and the proposed model achieves the SOTA model performance.|近年来,基于会话的推荐引起了越来越多的研究者的关注。现有的大多数方法都是直观地从匿名会话数据中发现用户的潜在偏好或兴趣。这显然忽略了这样一个事实,即这些顺序行为数据通常反映会话用户的潜在需求,即语义级因素,因此如何估计会话的潜在需求已经成为一个具有挑战性的任务。为了解决上述问题,本文提出了一种新的需求感知图神经网络模型。特别地,需求建模组件被设计用于提取每个会话的底层多个需求。然后,设计需求感知图神经网络,首先构造会话需求图,然后学习需求感知项嵌入,进行推荐。进一步设计了互信息损失,提高了学习嵌入的质量。在两个实际数据集上进行了广泛的实验,所提出的模型达到了 SOTA 模型的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Why+do+Semantically+Unrelated+Categories+Appear+in+the+Same+Session?:+A+Demand-aware+Method)|0| +|[Bilateral Self-unbiased Learning from Biased Implicit Feedback](https://doi.org/10.1145/3477495.3531946)|Jaewoong Lee, Seongmin Park, Joonseok Lee, Jongwuk Lee|Sungkyunkwan Univ, Dept Elect & Comp Engn, Seoul, South Korea; Sungkyunkwan Univ, Dept Artificial Intelligence, Seoul, South Korea|Implicit feedback has been widely used to build commercial recommender systems. 
Because observed feedback represents users' click logs, there is a semantic gap between true relevance and observed feedback. More importantly, observed feedback is usually biased towards popular items, thereby overestimating the actual relevance of popular items. Although existing studies have developed unbiased learning methods using inverse propensity weighting (IPW) or causal reasoning, they solely focus on eliminating the popularity bias of items. In this paper, we propose a novel unbiased recommender learning model, namely BIlateral SElf-unbiased Recommender (BISER), to eliminate the exposure bias of items caused by recommender models. Specifically, BISER consists of two key components: (i) self-inverse propensity weighting (SIPW) to gradually mitigate the bias of items without incurring high computational costs; and (ii) bilateral unbiased learning (BU) to bridge the gap between two complementary models in model predictions, i.e., user- and item-based autoencoders, alleviating the high variance of SIPW. Extensive experiments show that BISER consistently outperforms state-of-the-art unbiased recommender models over several datasets, including Coat, Yahoo! R3, MovieLens, and CiteULike.|隐式反馈已被广泛用于构建商业推荐系统。因为观察到的反馈代表用户的点击日志,所以在真正的相关性和观察到的反馈之间存在语义差距。更重要的是,观察到的反馈通常偏向于流行项目,从而高估了流行项目的实际相关性。虽然现有的研究已经开发出使用反倾向加权(IPW)或因果推理的无偏学习方法,但他们只关注于消除项目的流行偏差。本文提出了一种新的无偏推荐学习模型,即双边自我无偏推荐模型(BISER) ,以消除由推荐模型引起的项目暴露偏差。具体而言,BISER 由两个关键组成部分组成: (i)自逆倾向加权(SIPW) ,以逐渐减轻项目的偏差,而不会产生高计算成本; (ii)双边无偏学习(BU) ,以弥补模型预测中两个互补模型之间的差距,即用户和基于项目的自动编码器,减轻 SIPW 的高方差。大量的实验表明,BISER 始终优于几个数据集,包括 Coat,Yahoo! R3,MovieLens 和 CiteULike 的最先进的无偏推荐模型。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Bilateral+Self-unbiased+Learning+from+Biased+Implicit+Feedback)|0| +|[Why do Semantically Unrelated Categories Appear in the Same Session?: A Demand-aware Method](https://doi.org/10.1145/3477495.3531806)|Liqi Yang, Linhao Luo, Xiaofeng Zhang, Fengxin Li, Xinni Zhang, Zelin Jiang, Shuai Tang|Harbin Institute of Technology, Shenzhen, Shenzhen, China; China Merchants Securities Co., Ltd, Shenzhen, China|Session-based recommendation has recently attracted more and more research efforts. Most existing approaches are intuitively proposed to discover users' potential preferences or interests from the anonymous session data. This apparently ignores the fact that these sequential behavior data usually reflect session user's potential demand, i.e., a semantic level factor, and therefore how to estimate underlying demands from a session has become a challenging task. To tackle the aforementioned issue, this paper proposes a novel demand-aware graph neural network model. Particularly, a demand modeling component is designed to extract the underlying multiple demands of each session. Then, the demand-aware graph neural network is designed to first construct session demand graphs and then learn the demand-aware item embeddings to make the recommendation. The mutual information loss is further designed to enhance the quality of the learnt embeddings. 
Extensive experiments have been performed on two real-world datasets and the proposed model achieves the SOTA model performance.|近年来,基于会话的推荐引起了越来越多的研究者的关注。现有的大多数方法都是直观地从匿名会话数据中发现用户的潜在偏好或兴趣。这显然忽略了这样一个事实,即这些顺序行为数据通常反映会话用户的潜在需求,即语义级因素,因此如何估计会话的潜在需求已经成为一个具有挑战性的任务。为了解决上述问题,本文提出了一种新的需求感知图神经网络模型。特别地,需求建模组件被设计用于提取每个会话的底层多个需求。然后,设计需求感知图神经网络,首先构造会话需求图,然后学习需求感知项嵌入,进行推荐。进一步设计了互信息损失,提高了学习嵌入的质量。在两个实际数据集上进行了广泛的实验,所提出的模型达到了 SOTA 模型的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Why+do+Semantically+Unrelated+Categories+Appear+in+the+Same+Session?:+A+Demand-aware+Method)|0| |[Scalable User Interface Optimization Using Combinatorial Bandits](https://doi.org/10.1145/3477495.3536325)|Ioannis Kangas, Maud Schwoerer, Lucas Bernardi|Booking.com, Amsterdam, Netherlands|The mission of major e-commerce platforms is to enable their customers to find the best products for their needs. In the common case of large inventories, complex User Interfaces (UIs) are required to allow a seamless navigation. However, as UIs often contain many widgets of different relevance, the task of constructing an optimal layout arises in order to improve the customer's experience. This is a challenging task, especially in the typical industrial setup where multiple independent teams conflict by adding and modifying UI widgets. It becomes even more challenging due to the customer preferences evolving over time, bringing the need for adaptive solutions. In a previous work [6], we addressed this task by introducing a UI governance framework powered by Machine Learning (ML) algorithms that automatically and continuously search for the optimal layout. Nevertheless, we highlighted that naive algorithmic choices exhibit several issues when implemented in the industry, such as widget dependency, combinatorial solution space and cold start problem. In this work, we demonstrate how we deal with these issues using Combinatorial Bandits, an extension of Multi-Armed Bandits (MAB) where the agent selects not only one but multiple arms at the same time. We develop two novel approaches to model combinatorial bandits, inspired by the Natural Language Processing (NLP) and the Evolutionary Algorithms (EA) fields and present their ability to enable scalable UI optimization.|主要电子商务平台的使命是使其客户能够找到满足其需要的最佳产品。在大量存货的常见情况下,需要复杂的用户界面(UI)来实现无缝导航。然而,由于 UI 通常包含许多不同相关性的小部件,因此需要构建最佳布局以改善客户体验。这是一项具有挑战性的任务,特别是在典型的工业设置中,多个独立团队通过添加和修改 UI 小部件而发生冲突。由于客户的偏好随着时间的推移而不断变化,因此对适应性解决方案的需求变得更加具有挑战性。在之前的工作[6]中,我们通过引入一个由机器学习(ML)算法支持的 UI 治理框架来解决这个问题,该框架可以自动并持续地搜索最佳布局。然而,我们强调,幼稚的算法选择表现出几个问题,如小部件依赖,组合解决方案空间和冷启动问题。在这项工作中,我们展示了如何使用组合强盗(Combinatorial Bandits)来处理这些问题,它是多臂强盗(MAB)的一种扩展,其中代理人同时选择的不是一个而是多个臂。受自然语言处理(NLP)和进化算法(EA)领域的启发,我们开发了两种新的组合强盗建模方法,并展示了它们实现可扩展 UI 优化的能力。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Scalable+User+Interface+Optimization+Using+Combinatorial+Bandits)|0| |[Users: Can't Work With Them, Can't Work Without Them?](https://doi.org/10.1145/3477495.3532787)|Alistair Moffat|The University of Melbourne, Melbourne, VIC, Australia|If we could design the ideal IR "effectiveness" experiment (as distinct from an IR "efficiency" experiment), what would it look like? It would probably be a lab-based observational study [3] involving multiple search systems masked behind a uniform interface, and with hundreds (or thousands) of users each progressing some "real" search activity they were interested in. And we'd plan to (non-intrusively, somehow) capture per-snippet, per-document, per-SERP, and per-session annotations and satisfaction responses.
The collected data could then be compared against a range of measured "task completion quality" indicators, and also against search effectiveness metric scores computed from the elements contained in the SERPs that were served by the systems. That's a tremendously big ask! So we often use offline evaluation techniques instead, employing test collections, static qrels sets, and effectiveness metrics [6]. We abstract the user into a deterministic evaluation script, supposing for pragmatic reasons that we know what query they would issue, and at the same time assuming that we can apply an effectiveness metric to calculate how much usefulness (or satisfaction) they will derive from any given SERP. The great advantage of this approach is that aside from the process of collecting the qrels, it is free of the need for users, meaning that it is repeatable. Indeed, we often do repeat, iterating to set parameters (and to rectify programming errors). Then, once metric scores have been computed, we carry out one or more paired statistical tests and draw conclusions as to relative system effectiveness.|如果我们可以设计一个理想的信息检索(IR)“有效性”实验(不同于 IR“效率”实验),它会是什么样子?这可能是一个基于实验室的观察性研究,在一个统一的界面背后隐藏着多个搜索系统,数百(或数千)用户每个人都在进行一些他们感兴趣的“真实”搜索活动。我们计划(非侵入性地,以某种方式)捕获每个片段、每个文档、每个 SERP 和每个会话的注释和满意度响应。然后,收集的数据可以与一系列测量的“任务完成质量”指标进行比较,也可以与搜索有效性指标得分进行比较,这些得分是从系统提供的 SERP 中包含的元素计算出来的。这是一个非常高的要求!因此,我们经常使用离线评估技术,使用测试集合、静态 qrel 集和有效性度量[6]。我们将用户抽象成一个确定性的评估脚本,假设出于实用原因,我们知道他们会发出什么样的查询,同时假设我们可以应用一个有效性度量来计算他们将从任何给定的 SERP 中获得多少有用性(或满意度)。这种方法的最大优点是,除了收集 qrel 的过程之外,它不需要用户,这意味着它是可重复的。实际上,我们经常重复,迭代以设置参数(并纠正编程错误)。然后,一旦计算出度量分数,我们进行一个或多个成对的统计检验,并得出相对系统有效性的结论。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Users:+Can't+Work+With+Them,+Can't+Work+Without+Them?)|0| -|[Interacting with Non-Cooperative User: A New Paradigm for Proactive Dialogue Policy](https://doi.org/10.1145/3477495.3532001)|Wenqiang Lei, Yao Zhang, Feifan Song, Hongru Liang, Jiaxin Mao, Jiancheng Lv, Zhenglu Yang, TatSeng Chua|Renmin University of China, Beijing, China; Sichuan University, Chengdu, China; Nankai University, Tianjin, China; National University of Singapore, Singapore, Singapore; Peking University, Beijing, China|Proactive dialogue system is able to lead the conversation to a goal topic and has advantaged potential in bargain, persuasion, and negotiation. Current corpus-based learning manner limits its practical application in real-world scenarios. To this end, we contribute to advancing the study of the proactive dialogue policy to a more natural and challenging setting, i.e., interacting dynamically with users. Further, we call attention to the non-cooperative user behavior - the user talks about off-path topics when he/she is not satisfied with the previous topics introduced by the agent. We argue that the targets of reaching the goal topic quickly and maintaining a high user satisfaction are not always converged, because the topics close to the goal and the topics user preferred may not be the same. Towards this issue, we propose a new solution named I-Pro that can learn Proactive policy in the Interactive setting. Specifically, we learn the trade-off via a learned goal weight, which consists of four factors (dialogue turn, goal completion difficulty, user satisfaction estimation, and cooperative degree).
The experimental results demonstrate I-Pro significantly outperforms baselines in terms of effectiveness and interpretability.|积极主动的对话系统能够将对话引向一个目标话题,并且在讨价还价、说服和谈判方面具有优势。目前基于语料库的学习方式限制了其在现实情景中的实际应用。为此,我们致力于将积极对话政策的研究推向一个更加自然和富有挑战性的环境,即与用户进行动态互动。此外,我们还提请注意非合作用户行为——当用户对代理引入的前一个主题不满意时,他/她会谈论偏离路径的主题。我们认为,快速达到目标主题和保持高用户满意度的目标并不总是趋同的,因为接近目标的主题和用户喜欢的主题可能不一样。针对这个问题,我们提出了一个新的解决方案,称为 I-Pro,可以学习积极的政策在互动的设置。具体来说,我们通过一个学习的目标权重来学习权衡,这个权重包括四个因素(对话转向、目标完成难度、用户满意度估计和合作程度)。实验结果表明,I-Pro 在有效性和可解释性方面明显优于基线。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Interacting+with+Non-Cooperative+User:+A+New+Paradigm+for+Proactive+Dialogue+Policy)|0| +|[Interacting with Non-Cooperative User: A New Paradigm for Proactive Dialogue Policy](https://doi.org/10.1145/3477495.3532001)|Wenqiang Lei, Yao Zhang, Feifan Song, Hongru Liang, Jiaxin Mao, Jiancheng Lv, Zhenglu Yang, TatSeng Chua|Renmin University of China, Beijing, China; Nankai University, Tianjin, China; National University of Singapore, Singapore, Singapore; Peking University, Beijing, China; Sichuan University, Chengdu, China|Proactive dialogue system is able to lead the conversation to a goal topic and has advantaged potential in bargain, persuasion, and negotiation. Current corpus-based learning manner limits its practical application in real-world scenarios. To this end, we contribute to advancing the study of the proactive dialogue policy to a more natural and challenging setting, i.e., interacting dynamically with users. Further, we call attention to the non-cooperative user behavior - the user talks about off-path topics when he/she is not satisfied with the previous topics introduced by the agent. We argue that the targets of reaching the goal topic quickly and maintaining a high user satisfaction are not always converged, because the topics close to the goal and the topics user preferred may not be the same. Towards this issue, we propose a new solution named I-Pro that can learn Proactive policy in the Interactive setting. Specifically, we learn the trade-off via a learned goal weight, which consists of four factors (dialogue turn, goal completion difficulty, user satisfaction estimation, and cooperative degree). The experimental results demonstrate I-Pro significantly outperforms baselines in terms of effectiveness and interpretability.|积极主动的对话系统能够将对话引向一个目标话题,并且在讨价还价、说服和谈判方面具有优势。目前基于语料库的学习方式限制了其在现实情景中的实际应用。为此,我们致力于将积极对话政策的研究推向一个更加自然和富有挑战性的环境,即与用户进行动态互动。此外,我们还提请注意非合作用户行为——当用户对代理引入的前一个主题不满意时,他/她会谈论偏离路径的主题。我们认为,快速达到目标主题和保持高用户满意度的目标并不总是趋同的,因为接近目标的主题和用户喜欢的主题可能不一样。针对这个问题,我们提出了一个新的解决方案,称为 I-Pro,可以学习积极的政策在互动的设置。具体来说,我们通过一个学习的目标权重来学习权衡,这个权重包括四个因素(对话转向、目标完成难度、用户满意度估计和合作程度)。实验结果表明,I-Pro 在有效性和可解释性方面明显优于基线。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Interacting+with+Non-Cooperative+User:+A+New+Paradigm+for+Proactive+Dialogue+Policy)|0| |[ADPL: Adversarial Prompt-based Domain Adaptation for Dialogue Summarization with Knowledge Disentanglement](https://doi.org/10.1145/3477495.3531933)|Lulu Zhao, Fujia Zheng, Weihao Zeng, Keqing He, Ruotong Geng, Huixing Jiang, Wei Wu, Weiran Xu|Meituan Group, Beijing, China; Beijing University of Posts and Telecommunications, Beijing, China|Traditional dialogue summarization models rely on a large-scale manually-labeled corpus, lacking generalization ability to new domains, and domain adaptation from a labeled source domain to an unlabeled target domain is important in practical summarization scenarios. 
However, existing domain adaptation works in dialogue summarization generally require large-scale pre-training using extensive external data. To explore the lightweight fine-tuning methods, in this paper, we propose an efficient Adversarial Disentangled Prompt Learning (ADPL) model for domain adaptation in dialogue summarization. We introduce three kinds of prompts including domain-invariant prompt (DIP), domain-specific prompt (DSP), and task-oriented prompt (TOP). DIP aims to disentangle and transfer the shared knowledge from the source domain and target domain in an adversarial way, which improves the accuracy of prediction about domain-invariant information and enhances the ability for generalization to new domains. DSP is designed to guide our model to focus on domain-specific knowledge using domain-related features. TOP aims to capture task-oriented knowledge to generate high-quality summaries. Instead of fine-tuning the whole pre-trained language model (PLM), we only update the prompt networks but keep the PLM fixed. Experimental results on the zero-shot setting show that the novel design of prompts can yield more coherent, faithful, and relevant summaries than baselines using the prefix-tuning, and perform on par with fine-tuning while being more efficient. Overall, our work introduces a prompt-based perspective to zero-shot learning for the dialogue summarization task and provides valuable findings and insights for future research.|传统的对话摘要模型依赖于大规模的人工标记语料库,缺乏对新领域的泛化能力,在实际的摘要场景中,从标记的源领域到未标记的目标领域的领域适应性是非常重要的。然而,现有的对话摘要领域适应工作一般需要使用大量的外部数据进行大规模的预训练。为了探索轻量级微调方法,本文提出了一种有效的对话摘要领域自适应对抗性分离提示学习(ADPL)模型。介绍了领域不变提示(DIP)、领域特定提示(DSP)和面向任务提示(TOP)三种提示方式。DIP 旨在对源域和目标域的共享知识进行对抗性的分离和转移,提高了领域不变信息预测的准确性,增强了对新领域的推广能力。DSP 被设计用来指导我们的模型使用领域相关的特征来关注领域特定的知识。TOP 是一种以任务为导向的知识获取方法,用于生成高质量的摘要。我们没有对整个预先训练好的语言模型(PLM)进行微调,而是只更新提示网络,但保持 PLM 不变。零样本设置的实验结果表明,新颖的提示符设计比基线前缀调整更能产生连贯、忠实和相关的总结,并且在更有效率的同时表现出与微调相当的效果。总的来说,我们的工作为对话摘要任务的零样本学习引入了一个基于提示(prompt)的视角,并为未来的研究提供了有价值的发现和见解。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ADPL:+Adversarial+Prompt-based+Domain+Adaptation+for+Dialogue+Summarization+with+Knowledge+Disentanglement)|0| |[IR Evaluation and Learning in the Presence of Forbidden Documents](https://doi.org/10.1145/3477495.3532006)|David Carmel, Nachshon Cohen, Amir Ingber, Elad Kravi|Amazon, Haifa, Israel; Pinecone Systems, Haifa, Israel|Many IR collections contain forbidden documents (F-docs), i.e. documents that should not be retrieved to the searcher. In an ideal scenario F-docs are clearly flagged, hence the ranker can filter them out, guaranteeing that no F-doc will be exposed. However, in real-world scenarios, filtering algorithms are prone to errors. Therefore, an IR evaluation system should also measure filtering quality in addition to ranking quality. Typically, filtering is considered as a classification task and is evaluated independently of the ranking quality. However, due to the mutual affinity between the two, it is desirable to evaluate ranking quality while filtering decisions are being made. In this work we propose nDCGf, a novel extension of the nDCGmin metric [14], which measures both ranking and filtering quality of the search results. We show both theoretically and empirically that while nDCGmin is not suitable for the simultaneous ranking and filtering task, nDCGf is a reliable metric in this case. We experiment with three datasets for which ranking and filtering are both required. In the PR dataset our task is to rank product reviews while filtering those marked as spam.
Similarly, in the CQA dataset our task is to rank a list of human answers per question while filtering bad answers. We also experiment with the TREC web-track datasets, where F-docs are explicitly labeled, sorting participant runs according to their ranking and filtering quality, demonstrating the stability, sensitivity, and reliability of nDCGf for this task. We propose a learning to rank and filter (LTRF) framework that is specifically designed to optimize nDCGf, by learning a ranking model and optimizing a filtering threshold used for discarding documents with lower scores. We experiment with several loss functions demonstrating their success in learning an effective LTRF model for the simultaneous learning and filtering task.|许多 IR 集合包含禁用的文档(F-docs) ,即不应该被检索给搜索者的文档。在一个理想的场景中,F-doc 被清楚地标记,因此排名者可以过滤掉它们,保证没有 F-doc 被暴露。然而,在真实场景中,过滤算法很容易出错。因此,一个信息检索评价系统除了排序质量外,还应该衡量过滤质量。通常,过滤被认为是一个分类任务,独立于排序质量进行评估。然而,由于两者之间的相互亲和性,在做出过滤决策的同时评估排名质量是可取的。在这项工作中,我们提出了 nDCGf,nDCGmin 度量的一个新的扩展[14] ,它衡量搜索结果的排名和过滤质量。我们从理论和实验两方面证明了,虽然 nDCGmin 不适合同时进行排序和过滤任务,但在这种情况下,nDCGf 是一个可靠的度量。我们对三个数据集进行了实验,这三个数据集都需要排名和过滤。在产品评论(PR)数据集中,我们的任务是对产品评论进行排序,同时过滤那些被标记为垃圾邮件的评论。类似地,在 CQA 数据集中,我们的任务是对每个问题的人工答案列表进行排序,同时过滤错误答案。我们还对 TREC 网络跟踪数据集进行了实验,其中 F-docs 被明确标记,根据其排名和过滤质量对参与者进行排序,证明了 nDCGf 用于该任务的稳定性,灵敏度和可靠性。我们提出了一个学习排序和过滤(LTRF)框架,专门设计优化 nDCGf,通过学习排序模型和优化过滤阈值用于丢弃分数较低的文档。我们用几个损失函数进行了实验,证明了它们在同时学习和过滤任务中学习一个有效的 LTRF 模型是成功的。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=IR+Evaluation+and+Learning+in+the+Presence+of+Forbidden+Documents)|0| -|[Human Preferences as Dueling Bandits](https://doi.org/10.1145/3477495.3531991)|Xinyi Yan, Chengxi Luo, Charles L. A. Clarke, Nick Craswell, Ellen M. Voorhees, Pablo Castells|National Institute of Standards and Technology, Gaithersburg, MD, USA; Microsoft, Bellevue, WA, USA; Universidad Autónoma de Madrid, Madrid, Spain; University of Waterloo, Waterloo, ON, Canada|The dramatic improvements in core information retrieval tasks engendered by neural rankers create a need for novel evaluation methods. If every ranker returns highly relevant items in the top ranks, it becomes difficult to recognize meaningful differences between them and to build reusable test collections. Several recent papers explore pairwise preference judgments as an alternative to traditional graded relevance assessments. Rather than viewing items one at a time, assessors view items side-by-side and indicate the one that provides the better response to a query, allowing fine-grained distinctions. If we employ preference judgments to identify the probably best items for each query, we can measure rankers by their ability to place these items as high as possible. We frame the problem of finding best items as a dueling bandits problem. While many papers explore dueling bandits for online ranker evaluation via interleaving, they have not been considered as a framework for offline evaluation via human preference judgments. We review the literature for possible solutions. For human preference judgments, any usable algorithm must tolerate ties, since two items may appear nearly equal to assessors, and it must minimize the number of judgments required for any specific pair, since each such comparison requires an independent assessor. Since the theoretical guarantees provided by most algorithms depend on assumptions that are not satisfied by human preference judgments, we simulate selected algorithms on representative test cases to provide insight into their practical utility.
Based on these simulations, one algorithm stands out for its potential. Our simulations suggest modifications to further improve its performance. Using the modified algorithm, we collect over 10,000 preference judgments for pools derived from submissions to the TREC 2021 Deep Learning Track, confirming its suitability. We test the idea of best-item evaluation and suggest ideas for further theoretical and practical progress.|由神经排序引起的核心信息检索任务的显著改进创造了对新的评估方法的需求。如果每个排名都返回顶级排名中高度相关的项目,那么就很难识别它们之间有意义的差异,也很难构建可重用的测试集合。最近的几篇论文探讨了成对偏好判断作为传统分级相关性评估的替代方法。评估员不是一次查看一个项,而是并排查看项,并指出哪个项能够对查询提供更好的响应,从而允许细粒度的区分。如果我们使用偏好判断来确定每个查询可能最好的项目,我们可以通过他们将这些项目放在尽可能高的位置的能力来衡量排名。我们把寻找最佳物品的问题框定为一个决斗土匪问题。尽管许多论文通过交错的方式探索了在线排名评价中的决斗强盗,但是它们并没有被认为是一个通过人类偏好判断进行离线评价的框架。我们回顾了可能的解决方案的文献。对于人类偏好判断,任何可用的算法都必须容忍关系,因为两个项目可能看起来几乎等于评估者,它必须最小化任何特定对所需的判断数量,因为每个这样的比较需要一个独立的评估者。由于大多数算法提供的理论保证依赖于人类偏好判断不能满足的假设,我们在代表性测试案例上模拟选定的算法,以深入了解它们的实际效用。基于这些模拟,一种算法因其潜力而脱颖而出。我们的模拟结果表明,改进可以进一步提高它的性能。使用修改后的算法,我们收集了超过10,000个来自 TREC 2021深度学习跟踪提交的池的偏好判断,确认了它的适用性。我们检验了最佳项目评价的思想,并为进一步的理论和实践进展提出了建议。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Human+Preferences+as+Dueling+Bandits)|0| +|[Human Preferences as Dueling Bandits](https://doi.org/10.1145/3477495.3531991)|Xinyi Yan, Chengxi Luo, Charles L. A. Clarke, Nick Craswell, Ellen M. Voorhees, Pablo Castells|National Institute of Standards and Technology, Gaithersburg, MD, USA; Universidad Autónoma de Madrid, Madrid, Spain; University of Waterloo, Waterloo, ON, Canada; Microsoft, Bellevue, WA, USA|The dramatic improvements in core information retrieval tasks engendered by neural rankers create a need for novel evaluation methods. If every ranker returns highly relevant items in the top ranks, it becomes difficult to recognize meaningful differences between them and to build reusable test collections. Several recent papers explore pairwise preference judgments as an alternative to traditional graded relevance assessments. Rather than viewing items one at a time, assessors view items side-by-side and indicate the one that provides the better response to a query, allowing fine-grained distinctions. If we employ preference judgments to identify the probably best items for each query, we can measure rankers by their ability to place these items as high as possible. We frame the problem of finding best items as a dueling bandits problem. While many papers explore dueling bandits for online ranker evaluation via interleaving, they have not been considered as a framework for offline evaluation via human preference judgments. We review the literature for possible solutions. For human preference judgments, any usable algorithm must tolerate ties, since two items may appear nearly equal to assessors, and it must minimize the number of judgments required for any specific pair, since each such comparison requires an independent assessor. Since the theoretical guarantees provided by most algorithms depend on assumptions that are not satisfied by human preference judgments, we simulate selected algorithms on representative test cases to provide insight into their practical utility. Based on these simulations, one algorithm stands out for its potential. Our simulations suggest modifications to further improve its performance. Using the modified algorithm, we collect over 10,000 preference judgments for pools derived from submissions to the TREC 2021 Deep Learning Track, confirming its suitability. 
We test the idea of best-item evaluation and suggest ideas for further theoretical and practical progress.|由神经排序引起的核心信息检索任务的显著改进创造了对新的评估方法的需求。如果每个排名都返回顶级排名中高度相关的项目,那么就很难识别它们之间有意义的差异,也很难构建可重用的测试集合。最近的几篇论文探讨了成对偏好判断作为传统分级相关性评估的替代方法。评估员不是一次查看一个项,而是并排查看项,并指出哪个项能够对查询提供更好的响应,从而允许细粒度的区分。如果我们使用偏好判断来确定每个查询可能最好的项目,我们可以通过他们将这些项目放在尽可能高的位置的能力来衡量排名。我们把寻找最佳物品的问题框定为一个决斗老虎机(dueling bandits)问题。尽管许多论文通过交错的方式探索了用于在线排名评价的决斗老虎机,但是它们并没有被认为是一个通过人类偏好判断进行离线评价的框架。我们回顾了可能的解决方案的文献。对于人类偏好判断,任何可用的算法都必须容忍平局,因为在评估者看来两个项目可能几乎相同,它必须最小化任何特定对所需的判断数量,因为每个这样的比较需要一个独立的评估者。由于大多数算法提供的理论保证依赖于人类偏好判断不能满足的假设,我们在代表性测试案例上模拟选定的算法,以深入了解它们的实际效用。基于这些模拟,一种算法因其潜力而脱颖而出。我们的模拟结果表明,改进可以进一步提高它的性能。使用修改后的算法,我们收集了超过10,000个来自 TREC 2021深度学习跟踪提交的池的偏好判断,确认了它的适用性。我们检验了最佳项目评价的思想,并为进一步的理论和实践进展提出了建议。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Human+Preferences+as+Dueling+Bandits)|0| |[IAOTP: An Interactive End-to-End Solution for Aspect-Opinion Term Pairs Extraction](https://doi.org/10.1145/3477495.3532085)|Ambreen Nazir, Yuan Rao|Xi'an Jiaotong University, Xi'an, China|Recently, the aspect-opinion term pairs (AOTP) extraction task has gained substantial importance in the domain of aspect-based sentiment analysis. It intends to extract the potential pair of each aspect term with its corresponding opinion term present in a user review. Some existing studies heavily relied on the annotated aspect terms and/or opinion terms, or adopted external knowledge/resources to figure out the task. Therefore, in this study, we propose a novel end-to-end solution, called an Interactive AOTP (IAOTP) model, for exploring AOTP. The IAOTP model first tracks the boundary of each token in given aspect-specific and opinion-specific representations through a span-based operation. Next, it generates the candidate AOTP by formulating the dyadic relations between tokens through the Biaffine transformation. Then, it computes the positioning information to capture the significant distance relationship that each candidate pair holds. And finally, it jointly models collaborative interactions and prediction of AOTP through a 2D self-attention. Besides the IAOTP model, this study also proposes an independent aspect/opinion encoding model (a RS model) that formulates relational semantics to obtain aspect-specific and opinion-specific representations that can effectively perform the extraction of aspect and opinion terms.
Detailed experiments conducted on the publicly available benchmark datasets for AOTP, aspect terms, and opinion terms extraction tasks, clearly demonstrate the significantly improved performance of our models relative to other competitive state-of-the-art baselines.|近年来,在基于方面的情绪分析领域,方面-意见词对(AOTP)提取任务得到了广泛的重视。它打算提取每个方面术语与其相应的意见术语在用户评论中出现的潜在对。一些现有的研究严重依赖于注释的方面术语和/或意见术语,或者采用外部知识/资源来完成任务。因此,在这项研究中,我们提出了一个新的端到端解决方案,称为交互式 AOTP (IAOTP)模型,用于探索 AOTP。IAOTP 模型首先通过基于跨度的操作跟踪给定方面特定表示和意见特定表示中每个令牌的边界。其次,通过双仿射变换建立令牌之间的并元关系,生成候选 AOTP。然后,通过计算定位信息来捕获每个候选对所持有的重要距离关系。最后,通过二维自我注意共同建立 AOTP 协作交互和预测模型。除了 IAOTP 模型外,本研究还提出了一个独立的方面/意见编码模型(RS 模型) ,该模型通过建立关系语义来获得方面特定和意见特定的表示,从而有效地提取方面和意见术语。在 AOTP,方面术语和意见术语提取任务的公开可用基准数据集上进行的详细实验清楚地表明,相对于其他竞争性的最先进的基线,我们的模型的性能显着改善。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=IAOTP:+An+Interactive+End-to-End+Solution+for+Aspect-Opinion+Term+Pairs+Extraction)|0| |[Exploring Heterogeneous Data Lake based on Unified Canonical Graphs](https://doi.org/10.1145/3477495.3531759)|Qin Yuan, Ye Yuan, Zhenyu Wen, He Wang, Chen Chen, Guoren Wang|Beijing Institute of Technology, Beijing, China; Zhejiang University of Technology, Hangzhou, China|A data lake is a repository for massive raw and heterogeneous data, which includes multiple data models with different data schemas and query interfaces. Keyword search can extract valuable information for users without the knowledge of underlying schemas and query languages. However, conventional keyword searches are restricted to a certain data model and cannot easily adapt to a data lake. In this paper, we study a novel keyword search. To achieve high accuracy and efficiency, we introduce canonical graphs and then integrate semantically related vertices based on vertex representations. A matching entity based keyword search algorithm is presented to find answers across multiple data sources. Finally, extensive experimental study shows the effectiveness and efficiency of our solution.|数据湖是大量原始和异构数据的存储库,其中包括具有不同数据模式和查询接口的多个数据模型。关键字搜索可以在不了解基础模式和查询语言的情况下为用户提取有价值的信息。然而,传统的关键字搜索仅限于特定的数据模型,不能很容易地适应数据湖。本文研究了一种新的关键词搜索方法。为了实现高精度和高效率,我们引入了规范图,然后基于顶点表示对语义相关的顶点进行集成。提出了一种基于匹配实体的关键字搜索算法,用于在多个数据源之间寻找答案。最后,广泛的实验研究表明了我们的解决方案的有效性和效率。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Exploring+Heterogeneous+Data+Lake+based+on+Unified+Canonical+Graphs)|0| -|[Distilling Knowledge on Text Graph for Social Media Attribute Inference](https://doi.org/10.1145/3477495.3531968)|Quan Li, Xiaoting Li, Lingwei Chen, Dinghao Wu|Visa Research, Palo Alto, CA, USA; The Pennsylvania State University, State College, PA, USA; Wright State University, Dayton, OH, USA|The popularization of social media generates a large amount of user-oriented data, where text data especially attracts researchers and speculators to infer user attributes (e.g., age, gender) for fulfilling their intents. Generally, this line of work casts attribute inference as a text classification problem, and starts to leverage graph neural networks for higher-level text representations. However, these text graphs are constructed on words, suffering from high memory consumption and ineffectiveness on few labeled texts. To address this challenge, we design a text-graph-based few-shot learning model for social media attribute inferences. Our model builds a text graph with texts as nodes and edges learned from current text representations via manifold learning and message passing. To further use unlabeled texts to improve few-shot performance, a knowledge distillation is devised to optimize the problem. 
This offers a trade-off between expressiveness and complexity. Experiments on social media datasets demonstrate the state-of-the-art performance of our model on attribute inferences with considerably fewer labeled texts.|社交媒体的普及产生了大量以用户为导向的数据,其中文本数据尤其吸引研究人员和投机者推断用户属性(如年龄、性别)以实现其意图。一般来说,这一行的工作将属性推理作为一个文本分类问题,并开始利用图神经网络进行更高级别的文本表示。然而,这些文本图形是建立在单词上的,由于高记忆消耗和对少数带标签的文本无效。为了解决这个问题,我们设计了一个基于文本图形的社会媒体属性推理的少镜头学习模型。我们的模型通过流形学习和消息传递从当前的文本表示中学习文本作为节点和边来构建一个文本图。为了进一步利用未标记文本来提高短镜头性能,设计了一种知识提取方法来优化问题。这在表达性和复杂性之间提供了一种权衡。在社会媒体数据集上的实验证明了我们的模型在标记文本相当少的情况下对属性推理的最新性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Distilling+Knowledge+on+Text+Graph+for+Social+Media+Attribute+Inference)|0| +|[Distilling Knowledge on Text Graph for Social Media Attribute Inference](https://doi.org/10.1145/3477495.3531968)|Quan Li, Xiaoting Li, Lingwei Chen, Dinghao Wu|The Pennsylvania State University, State College, PA, USA; Visa Research, Palo Alto, CA, USA; Wright State University, Dayton, OH, USA|The popularization of social media generates a large amount of user-oriented data, where text data especially attracts researchers and speculators to infer user attributes (e.g., age, gender) for fulfilling their intents. Generally, this line of work casts attribute inference as a text classification problem, and starts to leverage graph neural networks for higher-level text representations. However, these text graphs are constructed on words, suffering from high memory consumption and ineffectiveness on few labeled texts. To address this challenge, we design a text-graph-based few-shot learning model for social media attribute inferences. Our model builds a text graph with texts as nodes and edges learned from current text representations via manifold learning and message passing. To further use unlabeled texts to improve few-shot performance, a knowledge distillation is devised to optimize the problem. This offers a trade-off between expressiveness and complexity. Experiments on social media datasets demonstrate the state-of-the-art performance of our model on attribute inferences with considerably fewer labeled texts.|社交媒体的普及产生了大量以用户为导向的数据,其中文本数据尤其吸引研究人员和投机者推断用户属性(如年龄、性别)以实现其意图。一般来说,这一行的工作将属性推理作为一个文本分类问题,并开始利用图神经网络进行更高级别的文本表示。然而,这些文本图形是建立在单词上的,由于高记忆消耗和对少数带标签的文本无效。为了解决这个问题,我们设计了一个基于文本图形的社会媒体属性推理的少镜头学习模型。我们的模型通过流形学习和消息传递从当前的文本表示中学习文本作为节点和边来构建一个文本图。为了进一步利用未标记文本来提高短镜头性能,设计了一种知识提取方法来优化问题。这在表达性和复杂性之间提供了一种权衡。在社会媒体数据集上的实验证明了我们的模型在标记文本相当少的情况下对属性推理的最新性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Distilling+Knowledge+on+Text+Graph+for+Social+Media+Attribute+Inference)|0| |[A Simple Meta-learning Paradigm for Zero-shot Intent Classification with Mixture Attention Mechanism](https://doi.org/10.1145/3477495.3531803)|Han Liu, Siyang Zhao, Xiaotong Zhang, Feng Zhang, Junjie Sun, Hong Yu, Xianchao Zhang|Peking University, Beijing, China; Dalian University of Technology, Dalian, China|Zero-shot intent classification is a vital and challenging task in dialogue systems, which aims to deal with numerous fast-emerging unacquainted intents without annotated training data. To obtain more satisfactory performance, the crucial points lie in two aspects: extracting better utterance features and strengthening the model generalization ability. In this paper, we propose a simple yet effective meta-learning paradigm for zero-shot intent classification. 
To learn better semantic representations for utterances, we introduce a new mixture attention mechanism, which encodes the pertinent word occurrence patterns by leveraging the distributional signature attention and multi-layer perceptron attention simultaneously. To strengthen the transfer ability of the model from seen classes to unseen classes, we reformulate zero-shot intent classification with a meta-learning strategy, which trains the model by simulating multiple zero-shot classification tasks on seen categories, and promotes the model generalization ability with a meta-adapting procedure on mimic unseen categories. Extensive experiments on two real-world dialogue datasets in different languages show that our model outperforms other strong baselines on both standard and generalized zero-shot intent classification tasks.|零样本意图分类是对话系统中的一项重要而具有挑战性的任务,其目的是在没有注释训练数据的情况下处理大量快速出现的不熟悉意图。要获得更满意的表现,关键在于提取更好的话语特征和增强模型泛化能力两个方面。在本文中,我们提出了一个简单而有效的用于零样本意图分类的元学习范式。为了更好地学习话语的语义表征,我们引入了一种新的混合注意机制,该机制同时利用分布特征注意和多层感知器注意对相关词语出现模式进行编码。为了增强模型从可见类到不可见类的迁移能力,我们采用元学习策略重新构建了零样本意图分类模型,通过模拟可见类的多个零样本分类任务来训练模型,并通过模拟不可见类的元自适应过程来提高模型的泛化能力。在两个不同语言的真实世界对话数据集上的大量实验表明,我们的模型在标准和广义零样本意图分类任务上都优于其他强基线。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Simple+Meta-learning+Paradigm+for+Zero-shot+Intent+Classification+with+Mixture+Attention+Mechanism)|0| -|[Analyzing the Support Level for Tips Extracted from Product Reviews](https://doi.org/10.1145/3477495.3531805)|Miriam Farber, David Carmel, Lital Kuchy, Avihai Mejer|Amazon, Haifa, Israel; Facebook, Haifa, Israel|Useful tips extracted from product reviews assist customers to take a more informed purchase decision, as well as making a better, easier, and safer usage of the product. In this work we argue that extracted tips should be examined based on the amount of support and opposition they receive from all product reviews. A classifier, developed for this purpose, determines the degree to which a tip is supported or contradicted by a single review sentence. These support-levels are then aggregated over all review sentences, providing a global support score, and a global contradiction score, reflecting the support-level of all reviews to the given tip, thus improving the customer confidence in the tip validity. By analyzing a large set of tips extracted from product reviews, we propose a novel taxonomy for categorizing tips as highly-supported, highly-contradicted, controversial (supported and contradicted), and anecdotal (neither supported nor contradicted).|从产品评论中提取的有用的技巧可以帮助客户做出更明智的购买决定,以及更好、更容易和更安全地使用产品。在这项工作中,我们认为提取的技巧应该基于它们从所有产品评论中得到的支持和反对的数量进行检查。为此目的开发的分类器决定了一个评论句支持或反驳一个提示的程度。然后将这些支持级别聚合到所有评论句中,提供全局支持评分和全局矛盾评分,反映所有评论对给定提示的支持级别,从而提高客户对提示有效性的信心。通过分析从产品评论中提取的大量技巧,我们提出了一种新的分类方法,将技巧分为高支持、高矛盾、有争议(支持和矛盾)和轶事(既不支持也不矛盾)。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Analyzing+the+Support+Level+for+Tips+Extracted+from+Product+Reviews)|0| +|[Analyzing the Support Level for Tips Extracted from Product Reviews](https://doi.org/10.1145/3477495.3531805)|Miriam Farber, David Carmel, Lital Kuchy, Avihai Mejer|Facebook, Haifa, Israel; Amazon, Haifa, Israel|Useful tips extracted from product reviews assist customers to take a more informed purchase decision, as well as making a better, easier, and safer usage of the product. In this work we argue that extracted tips should be examined based on the amount of support and opposition they receive from all product reviews.
A classifier, developed for this purpose, determines the degree to which a tip is supported or contradicted by a single review sentence. These support-levels are then aggregated over all review sentences, providing a global support score, and a global contradiction score, reflecting the support-level of all reviews to the given tip, thus improving the customer confidence in the tip validity. By analyzing a large set of tips extracted from product reviews, we propose a novel taxonomy for categorizing tips as highly-supported, highly-contradicted, controversial (supported and contradicted), and anecdotal (neither supported nor contradicted).|从产品评论中提取的有用的技巧可以帮助客户做出更明智的购买决定,以及更好、更容易和更安全地使用产品。在这项工作中,我们认为提取的技巧应该基于它们从所有产品评论中得到的支持和反对的数量进行检查。为此目的开发的分类器决定了一个评论句支持或反驳一个提示的程度。然后将这些支持级别聚合到所有评论句中,提供全局支持评分和全局矛盾评分,反映所有评论对给定提示的支持级别,从而提高客户对提示有效性的信心。通过分析从产品评论中提取的大量技巧,我们提出了一种新的分类方法,将技巧分为高支持、高矛盾、有争议(支持和矛盾)和轶事(既不支持也不矛盾)。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Analyzing+the+Support+Level+for+Tips+Extracted+from+Product+Reviews)|0| |[UserBERT: Pre-training User Model with Contrastive Self-supervision](https://doi.org/10.1145/3477495.3531810)|Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang|Microsoft Research Asia, Beijing, China; Department of Electronic Engineering, Tsinghua University, Beijing, China|User modeling is critical for personalization. Existing methods usually train user models from task-specific labeled data, which may be insufficient. In fact, there are usually abundant unlabeled user behavior data that encode rich universal user information, and pre-training user models on them can empower user modeling in many downstream tasks. In this paper, we propose a user model pre-training method named UserBERT to learn universal user models on unlabeled user behavior data with two contrastive self-supervision tasks. The first one is masked behavior prediction and discrimination, aiming to model the contexts of user behaviors. The second one is behavior sequence matching, aiming to capture user interest stable in different periods. Besides, we propose a medium-hard negative sampling framework to select informative negative samples for better contrastive pre-training. Extensive experiments validate the effectiveness of UserBERT in user model pre-training.|用户建模对个性化至关重要。现有的方法通常根据特定于任务的标记数据训练用户模型,这可能是不够的。事实上,通常存在大量未标记的用户行为数据,这些数据编码了丰富的通用用户信息,并且对这些数据进行预训练的用户模型可以在许多下游任务中增强用户建模能力。本文提出了一种用户模型预训练方法 UserBERT,通过两个对比的自我监督任务来学习未标记用户行为数据的通用用户模型。第一种是掩码行为预测和识别,旨在对用户行为的上下文进行建模。第二种是行为序列匹配,旨在捕获不同时期用户兴趣的稳定性。此外,我们提出了一个中等硬度的负面抽样框架来选择信息量大的负面样本,以便更好地进行对比预训练。大量实验验证了 UserBERT 在用户模型预训练中的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=UserBERT:+Pre-training+User+Model+with+Contrastive+Self-supervision)|0| -|[Modern Baselines for SPARQL Semantic Parsing](https://doi.org/10.1145/3477495.3531841)|Debayan Banerjee, Pranav Ajit Nair, Jivat Neet Kaur, Ricardo Usbeck, Chris Biemann|Universität Hamburg, Hamburg, Germany; Microsoft Research, Bengaluru, India; Indian Institute of Technology (BHU), Varanasi, India|In this work, we focus on the task of generating SPARQL queries from natural language questions, which can then be executed on Knowledge Graphs (KGs). We assume that gold entity and relations have been provided, and the remaining task is to arrange them in the right order along with SPARQL vocabulary, and input tokens to produce the correct SPARQL query.
Pre-trained Language Models (PLMs) have not been explored in depth on this task so far, so we experiment with BART, T5 and PGNs (Pointer Generator Networks) with BERT embeddings, looking for new baselines in the PLM era for this task, on DBpedia and Wikidata KGs. We show that T5 requires special input tokenisation, but produces state of the art performance on LC-QuAD 1.0 and LC-QuAD 2.0 datasets, and outperforms task-specific models from previous works. Moreover, the methods enable semantic parsing for questions where a part of the input needs to be copied to the output query, thus enabling a new paradigm in KG semantic parsing.|在这项工作中,我们将重点放在从自然语言问题生成 SPARQL 查询的任务上,然后可以在知识图(Knowledge Graphs,KG)上执行这些查询。我们假设已经提供了 gold 实体和关系,剩下的任务是将它们与 SPARQL 词汇表一起按正确的顺序排列,并输入令牌以生成正确的 SPARQL 查询。预先训练的语言模型(PLM)到目前为止还没有被深入探索,所以我们在 BART,T5和 PGNs (指针生成器网络)中嵌入 BERT 进行试验,在 DBpedia 和 Wikidata KG 上为这项任务寻找 PLM 时代的新基线。我们展示了 T5需要特殊的输入标记,但是在 LC-QuAD 1.0和 LC-QuAD 2.0数据集上产生了最先进的性能,并且优于以前作品中的任务特定模型。此外,这些方法还支持对需要将部分输入复制到输出查询的问题进行语义解析,从而为 KG 语义解析提供了一个新的范例。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Modern+Baselines+for+SPARQL+Semantic+Parsing)|0| +|[Modern Baselines for SPARQL Semantic Parsing](https://doi.org/10.1145/3477495.3531841)|Debayan Banerjee, Pranav Ajit Nair, Jivat Neet Kaur, Ricardo Usbeck, Chris Biemann|Indian Institute of Technology (BHU), Varanasi, India; Microsoft Research, Bengaluru, India; Universität Hamburg, Hamburg, Germany|In this work, we focus on the task of generating SPARQL queries from natural language questions, which can then be executed on Knowledge Graphs (KGs). We assume that gold entity and relations have been provided, and the remaining task is to arrange them in the right order along with SPARQL vocabulary, and input tokens to produce the correct SPARQL query. Pre-trained Language Models (PLMs) have not been explored in depth on this task so far, so we experiment with BART, T5 and PGNs (Pointer Generator Networks) with BERT embeddings, looking for new baselines in the PLM era for this task, on DBpedia and Wikidata KGs. We show that T5 requires special input tokenisation, but produces state of the art performance on LC-QuAD 1.0 and LC-QuAD 2.0 datasets, and outperforms task-specific models from previous works. Moreover, the methods enable semantic parsing for questions where a part of the input needs to be copied to the output query, thus enabling a new paradigm in KG semantic parsing.|在这项工作中,我们将重点放在从自然语言问题生成 SPARQL 查询的任务上,然后可以在知识图(Knowledge Graphs,KG)上执行这些查询。我们假设已经提供了 gold 实体和关系,剩下的任务是将它们与 SPARQL 词汇表一起按正确的顺序排列,并输入令牌以生成正确的 SPARQL 查询。预先训练的语言模型(PLM)到目前为止还没有被深入探索,所以我们在 BART,T5和 PGNs (指针生成器网络)中嵌入 BERT 进行试验,在 DBpedia 和 Wikidata KG 上为这项任务寻找 PLM 时代的新基线。我们展示了 T5需要特殊的输入标记,但是在 LC-QuAD 1.0和 LC-QuAD 2.0数据集上产生了最先进的性能,并且优于以前作品中的任务特定模型。此外,这些方法还支持对需要将部分输入复制到输出查询的问题进行语义解析,从而为 KG 语义解析提供了一个新的范例。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Modern+Baselines+for+SPARQL+Semantic+Parsing)|0| |[Posterior Probability Matters: Doubly-Adaptive Calibration for Neural Predictions in Online Advertising](https://doi.org/10.1145/3477495.3531911)|Penghui Wei, Weimin Zhang, Ruijie Hou, Jinquan Liu, Shaoguo Liu, Liang Wang, Bo Zheng|Alibaba Group, Beijing, China|Predicting user response probabilities is vital for ad ranking and bidding. We hope that predictive models can produce accurate probabilistic predictions that reflect true likelihoods. Calibration techniques aims to post-process model predictions to posterior probabilities. Field-level calibration -- which performs calibration w.r.t. 
a specific field value -- is fine-grained and more practical. In this paper we propose a doubly-adaptive approach AdaCalib. It learns an isotonic function family to calibrate model predictions with the guidance of posterior statistics, and field-adaptive mechanisms are designed to ensure that the posterior is appropriate for the field value to be calibrated. Experiments verify that AdaCalib achieves significant improvement on calibration performance. It has been deployed online and beats the previous approach.|预测用户响应概率对广告排名和投标至关重要。我们希望预测模型能够产生反映真实可能性的准确概率预测。校正技术旨在对模型预测进行后期处理,以得到后验概率。字段级校准——针对特定字段值进行校准——更加细粒度且更实用。在本文中,我们提出了一个双重自适应的方法 AdaCalib。它学习了一个等调函数族,以后验统计学为指导校准模型预测,并设计了字段自适应机制,以确保后验适合于被校准的字段值。实验证明,AdaCalib 在校准性能方面取得了显著的改善。它已经在网上部署,并优于以前的方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Posterior+Probability+Matters:+Doubly-Adaptive+Calibration+for+Neural+Predictions+in+Online+Advertising)|0| |[Table Enrichment System for Machine Learning](https://doi.org/10.1145/3477495.3531678)|Yuyang Dong, Masafumi Oyamada|NEC Corporation, Kawasaki, Japan|Data scientists are constantly facing the problem of how to improve prediction accuracy with insufficient tabular data. We propose a table enrichment system that enriches a query table by adding external attributes (columns) from data lakes and improves the accuracy of machine learning predictive models. Our system has four stages, join row search, task-related table selection, row and column alignment, and feature selection and evaluation, to efficiently create an enriched table for a given query table and a specified machine learning task. We demonstrate our system with a web UI to show the use cases of table enrichment.|如何在表格数据不足的情况下提高预测的准确性,是数据科学家不断面临的问题。我们提出了一个表格增强系统,通过增加数据湖的外部属性(列)来丰富查询表,提高机器学习预测模型的准确性。我们的系统分为四个阶段: 连接行搜索、任务相关的表选择、行和列对齐以及特征选择和评估,以有效地为给定的查询表和指定的机器学习任务创建一个丰富的表。我们用一个 web UI 来演示我们的系统,以显示表格增强的用例。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Table+Enrichment+System+for+Machine+Learning)|0| |[LawNet-Viz: A Web-based System to Visually Explore Networks of Law Article References](https://doi.org/10.1145/3477495.3531668)|Lucio La Cava, Andrea Simeri, Andrea Tagarelli|University of Calabria, Rende (CS), Italy|We present LawNet-Viz, a web-based tool for the modeling, analysis and visualization of law reference networks extracted from a statute law corpus. LawNet-Viz is designed to support legal research tasks and help legal professionals as well as laymen visually exploring the article connections built upon the explicit law references detected in the article contents. To demonstrate LawNet-Viz, we show its application to the Italian Civil Code (ICC), which exploits a recent BERT-based model fine-tuned on the ICC.
LawNet-Viz is a system prototype that is planned for product development.|我们介绍了 LawNet-Viz,这是一个基于网络的工具,用于对从成文法语料库中提取的法律参考网络进行建模、分析和可视化。LawNet-Viz 旨在支持法律研究任务,帮助法律专业人士以及外行人士在文章内容中发现的明确的法律参考文献的基础上,直观地探索文章之间的联系。为了演示 LawNet-Viz,我们展示了它在意大利民法典(ICC)中的应用,该法典利用了最近在 ICC 上微调的基于 BERT 的模型。LawNet-Viz 是计划用于产品开发的系统原型。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=LawNet-Viz:+A+Web-based+System+to+Visually+Explore+Networks+of+Law+Article+References)|0| |[Quote Erat Demonstrandum: A Web Interface for Exploring the Quotebank Corpus](https://doi.org/10.1145/3477495.3531696)|Vuk Vukovic, Akhil Arora, HuanCheng Chang, Andreas Spitz, Robert West|University of Konstanz, Konstanz, Germany; EPFL, Lausanne, Switzerland|The use of attributed quotes is the most direct and least filtered pathway of information propagation in news. Consequently, quotes play a central role in the conception, reception, and analysis of news stories. Since quotes provide a more direct window into a speaker's mind than regular reporting, they are a valuable resource for journalists and researchers alike. While substantial research efforts have been devoted to methods for the automated extraction of quotes from news and their attribution to speakers, few comprehensive corpora of attributed quotes from contemporary sources are available to the public. Here, we present an adaptive web interface for searching Quotebank, a massive collection of quotes from the news, which we make available at https://quotebank.dlab.tools.|引用属性是新闻信息传播中最直接、过滤最少的途径。因此,引文在新闻故事的概念、接受和分析中起着核心作用。由于引语比常规报道提供了一个更直接的窗口来了解演讲者的思想,因此它们对记者和研究人员都是一种宝贵的资源。虽然大量的研究工作致力于从新闻中自动摘录引语及其对发言者的归属的方法,但公众很少能够获得来自当代来源的归属引语的综合语料库。在这里,我们提供了一个自适应的网络界面,用于搜索“报价银行”(Quotebank) ,这是一个来自新闻的大量引用集合,我们可以在 https://Quotebank.dlab.tools 上找到它。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Quote+Erat+Demonstrandum:+A+Web+Interface+for+Exploring+the+Quotebank+Corpus)|0| |[Unsupervised Product Offering Title Quality Scores](https://doi.org/10.1145/3477495.3536333)|Henry S. Vieira|LuizaLabs, São Paulo, Brazil|The title of a product offering is the consolidation of a product's characteristics in textual format for user consumption. The low quality of the textual content of a product's title can negatively influence the entire shopping experience. The negative experience can start with the impossibility of discovering a desired product, going from problems in identifying a product and its characteristics up to the purchase of an unwanted item. A solution to this problem is to establish an indicator that automatically describes the quality of the product title. With this assessment, it is possible to notify sellers who have registered products with poor quality titles and encourage revisions or suggest improvements. 
The focus of this work is to show how it is possible to assign a score that indicates the descriptive quality of product offers in an e-commerce marketplace environment using unsupervised methods.|产品提供的标题是以文本格式整合产品的特征以供用户使用。产品标题文字内容的低质量会对整个购物体验产生负面影响。消极的体验可以从发现想要的产品的不可能性开始,从识别产品及其特征的问题到购买不想要的产品。解决这个问题的一个办法是建立一个自动描述产品名称质量的指标。通过这种评估,可以通知注册产品质量低劣的卖方,并鼓励修改或提出改进建议。这项工作的重点是展示如何可以指定一个分数,表明在电子商务市场环境中的产品提供的描述性质量使用无监督的方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Unsupervised+Product+Offering+Title+Quality+Scores)|0| |[Few-shot Information Extraction is Here: Pre-train, Prompt and Entail](https://doi.org/10.1145/3477495.3532786)|Eneko Agirre|University of the Basque Country UPV/EHU, Donostia, Spain|Deep Learning has made tremendous progress in Natural Language Processing (NLP), where large pre-trained language models (PLM) fine-tuned on the target task have become the predominant tool. More recently, in a process called prompting, NLP tasks are rephrased as natural language text, allowing us to better exploit linguistic knowledge learned by PLMs and resulting in significant improvements. Still, PLMs have limited inference ability. In the Textual Entailment task, systems need to output whether the truth of a certain textual hypothesis follows from the given premise text. Manually annotated entailment datasets covering multiple inference phenomena have been used to infuse inference capabilities to PLMs. This talk will review these recent developments, and will present an approach that combines prompts and PLMs fine-tuned for textual entailment that yields state-of-the-art results on Information Extraction (IE) using only a small fraction of the annotations. The approach has additional benefits, like the ability to learn from different schemas and inference datasets. These developments enable a new paradigm for IE where the expert can define the domain-specific schema using natural language and directly run those specifications, annotating a handful of examples in the process. A user interface based on this new paradigm will also be presented. Beyond IE, inference capabilities could be extended, acquired and applied from other tasks, opening a new research avenue where entailment and downstream task performance improve in tandem.|深度学习在自然语言处理(NLP)领域取得了巨大的进步,其中针对目标任务进行微调的大型预训练语言模型(PLM)已成为主要的工具。最近,在一个叫做“提示”的过程中,NLP 任务被重新定义为自然语言文本,使我们能够更好地利用 PLM 学到的语言知识,从而带来显著的改进。不过,PLM 的推理能力有限。在文字蕴涵任务中,系统需要输出某个文本假设的真实性是否来自给定的前提文本。包含多种推理现象的人工注释的蕴涵数据集已经被用来为 PLM 注入推理能力。这个演讲将回顾这些最近的发展,并将提出一个方法,结合提示和 PLM 微调的文字蕴涵,产生最先进的结果在信息抽取(IE)使用只有一小部分的注释。这种方法还有其他好处,比如可以从不同的模式和推断数据集中学习。这些开发为 IE 提供了一个新的范例,在这个范例中,专家可以使用自然语言定义特定于领域的模式,并直接运行这些规范,在过程中注释一些示例。本文还将介绍一个基于这种新范例的用户界面。在 IE 之外,推理能力可以从其他任务中得到扩展、获取和应用,开辟了一条新的研究途径,其中蕴含和下游任务绩效同步提高。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Few-shot+Information+Extraction+is+Here:+Pre-train,+Prompt+and+Entail)|0| -|[Improving Implicit Alternating Least Squares with Ring-based Regularization](https://doi.org/10.1145/3477495.3531995)|Rui Fan, Jin Chen, Jin Zhang, Defu Lian, Enhong Chen|University of Electronic Science and Technology of China, Chengdu, China; University of Science and Technology of China, Hefei, China|Due to the widespread presence of implicit feedback, recommendation based on them has been a long-standing research problem in academia and industry. However, it suffers from the extremely-sparse problem, since each user only interacts with a few items. 
One well-known and good-performing method is to treat each user's all uninteracted items as negative with low confidence. The method intrinsically imposes an implicit regularization to penalize large deviation of each user's preferences for uninteracted items from a constant. However, these methods have to assume a constant-rating prior to uninteracted items, which may be questionable. In this paper, we propose a novel ring-based regularization to penalize significant differences of each user's preferences between each item and some other items. The ring structure, described by an item graph, determines which other items are selected for each item in the regularization. The regularization not only averts the introduction of the prior ratings but also implicitly penalizes the remarkable preference differences for all items according to theoretical analysis. However, optimizing the recommenders with the regularization still suffers from computational challenges, so we develop a scalable alternating least square algorithm by carefully designing gradient computation. Therefore, as long as connecting each item with a sublinear/constant number of other items in the item graph, the overall learning algorithm could be comparably efficient to the existing algorithms. The proposed regularization is extensively evaluated with several public recommendation datasets, where the results show that the regularization could lead to considerable improvements in recommendation performance.|由于内隐反馈的广泛存在,基于内隐反馈的推荐已经成为学术界和业界长期研究的课题。但是,由于每个用户只与少数几个项目交互,因此它遇到了极其稀疏的问题。一个众所周知的好方法是将每个用户的所有未交互的项目视为负面的,并且信心不足。这种方法本质上强加了一种隐式正则化,以惩罚每个用户对常量中未交互项偏好的巨大偏差。但是,这些方法必须在未交互的项目之前假设一个常量评级,这可能是值得怀疑的。在本文中,我们提出了一个新的基于环的正则化,以惩罚每个用户的偏好之间的显着差异,每个项目和其他一些项目。由项目图描述的环结构确定为正则化中的每个项目选择哪些其他项目。根据理论分析,规则化不仅避免了先验评分的引入,而且隐含地惩罚了对所有项目的显著偏好差异。然而,正则化推荐算法的优化仍然面临着计算上的挑战,因此我们通过精心设计梯度计算,发展了一种可扩展的交替最小二乘算法。因此,只要将每个项目与项目图中其他项目的次线性/常数连接起来,整个学习算法就可以比现有算法更有效。利用若干公共推荐数据集对拟议的规范化进行了广泛评估,结果表明,规范化可大大改善推荐性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Improving+Implicit+Alternating+Least+Squares+with+Ring-based+Regularization)|0| -|[Target-aware Abstractive Related Work Generation with Contrastive Learning](https://doi.org/10.1145/3477495.3532065)|Xiuying Chen, Hind Alamro, Mingzhe Li, Shen Gao, Rui Yan, Xin Gao, Xiangliang Zhang|Renmin University of China, Beijing, China; KAUST, Jeddah, China; University of Notre Dame & KAUST, South Bend, China; KAUST, Jeddah, Saudi Arabia; Peking University, Beijing, China|The related work section is an important component of a scientific paper, which highlights the contribution of the target paper in the context of the reference papers. Authors can save their time and effort by using the automatically generated related work section as a draft to complete the final related work. Most of the existing related work section generation methods rely on extracting off-the-shelf sentences to make a comparative discussion about the target work and the reference papers. However, such sentences need to be written in advance and are hard to obtain in practice. Hence, in this paper, we propose an abstractive target-aware related work generator (TAG), which can generate related work sections consisting of new sentences. Concretely, we first propose a target-aware graph encoder, which models the relationships between reference papers and the target paper with target-centered attention mechanisms. 
In the decoding process, we propose a hierarchical decoder that attends to the nodes of different levels in the graph with keyphrases as semantic indicators. Finally, to generate a more informative related work, we propose multi-level contrastive optimization objectives, which aim to maximize the mutual information between the generated related work with the references and minimize that with non-references. Extensive experiments on two public scholar datasets show that the proposed model brings substantial improvements over several strong baselines in terms of automatic and tailored human evaluations.|相关工作部分是科学论文的一个重要组成部分,它突出了目标论文在参考文件中的贡献。作者可以通过使用自动生成的相关工作部分作为草稿来完成最终的相关工作,从而节省时间和精力。现有的相关作品章节生成方法大多依赖于抽取现成的句子,对目标作品和参考文献进行比较讨论。然而,这样的句子需要事先写好,在实践中很难获得。因此,本文提出了一种抽象的目标感知相关工作生成器(TAG) ,它可以生成由新句子组成的相关工作部分。具体地说,我们首先提出了一种目标感知图形编码器,它使用以目标为中心的注意机制来模拟参考文献和目标文献之间的关系。在解码过程中,我们提出了一种以关键词作为语义指标的层次化解码器,它可以处理图中不同层次的节点。最后,为了生成信息量更大的相关作品,我们提出了多层次对比优化目标,目的是最大化生成的相关作品与参考文献之间的相互信息,最小化非参考文献之间的相互信息。对两个公共学者数据集的大量实验表明,该模型在自动化和量身定制的人类评估方面比几个强大的基线带来了实质性的改进。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Target-aware+Abstractive+Related+Work+Generation+with+Contrastive+Learning)|0| +|[Improving Implicit Alternating Least Squares with Ring-based Regularization](https://doi.org/10.1145/3477495.3531995)|Rui Fan, Jin Chen, Jin Zhang, Defu Lian, Enhong Chen|University of Science and Technology of China, Hefei, China; University of Electronic Science and Technology of China, Chengdu, China|Due to the widespread presence of implicit feedback, recommendation based on it has been a long-standing research problem in academia and industry. However, it suffers from the extremely-sparse problem, since each user only interacts with a few items. One well-known and good-performing method is to treat all of each user's uninteracted items as negatives with low confidence. The method intrinsically imposes an implicit regularization to penalize large deviation of each user's preferences for uninteracted items from a constant. However, these methods have to assume a constant-rating prior on uninteracted items, which may be questionable. In this paper, we propose a novel ring-based regularization to penalize significant differences of each user's preferences between each item and some other items. The ring structure, described by an item graph, determines which other items are selected for each item in the regularization. The regularization not only averts the introduction of the prior ratings but also implicitly penalizes the remarkable preference differences for all items according to theoretical analysis. However, optimizing the recommenders with the regularization still suffers from computational challenges, so we develop a scalable alternating least squares algorithm by carefully designing gradient computation. Therefore, as long as each item is connected to a sublinear/constant number of other items in the item graph, the overall learning algorithm can be comparably efficient to existing algorithms.
The proposed regularization is extensively evaluated with several public recommendation datasets, where the results show that the regularization could lead to considerable improvements in recommendation performance.|由于内隐反馈的广泛存在,基于内隐反馈的推荐已经成为学术界和业界长期研究的课题。但是,由于每个用户只与少数几个项目交互,因此它遇到了极其稀疏的问题。一个众所周知的好方法是将每个用户的所有未交互的项目视为负面的,并且信心不足。这种方法本质上强加了一种隐式正则化,以惩罚每个用户对常量中未交互项偏好的巨大偏差。但是,这些方法必须在未交互的项目之前假设一个常量评级,这可能是值得怀疑的。在本文中,我们提出了一个新的基于环的正则化,以惩罚每个用户的偏好之间的显着差异,每个项目和其他一些项目。由项目图描述的环结构确定为正则化中的每个项目选择哪些其他项目。根据理论分析,规则化不仅避免了先验评分的引入,而且隐含地惩罚了对所有项目的显著偏好差异。然而,正则化推荐算法的优化仍然面临着计算上的挑战,因此我们通过精心设计梯度计算,发展了一种可扩展的交替最小二乘算法。因此,只要将每个项目与项目图中其他项目的次线性/常数连接起来,整个学习算法就可以比现有算法更有效。利用若干公共推荐数据集对拟议的规范化进行了广泛评估,结果表明,规范化可大大改善推荐性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Improving+Implicit+Alternating+Least+Squares+with+Ring-based+Regularization)|0| +|[Target-aware Abstractive Related Work Generation with Contrastive Learning](https://doi.org/10.1145/3477495.3532065)|Xiuying Chen, Hind Alamro, Mingzhe Li, Shen Gao, Rui Yan, Xin Gao, Xiangliang Zhang|KAUST, Jeddah, China; KAUST, Jeddah, Saudi Arabia; Renmin University of China, Beijing, China; University of Notre Dame & KAUST, South Bend, China; Peking University, Beijing, China|The related work section is an important component of a scientific paper, which highlights the contribution of the target paper in the context of the reference papers. Authors can save their time and effort by using the automatically generated related work section as a draft to complete the final related work. Most of the existing related work section generation methods rely on extracting off-the-shelf sentences to make a comparative discussion about the target work and the reference papers. However, such sentences need to be written in advance and are hard to obtain in practice. Hence, in this paper, we propose an abstractive target-aware related work generator (TAG), which can generate related work sections consisting of new sentences. Concretely, we first propose a target-aware graph encoder, which models the relationships between reference papers and the target paper with target-centered attention mechanisms. In the decoding process, we propose a hierarchical decoder that attends to the nodes of different levels in the graph with keyphrases as semantic indicators. Finally, to generate a more informative related work, we propose multi-level contrastive optimization objectives, which aim to maximize the mutual information between the generated related work with the references and minimize that with non-references. 
Extensive experiments on two public scholar datasets show that the proposed model brings substantial improvements over several strong baselines in terms of automatic and tailored human evaluations.|相关工作部分是科学论文的一个重要组成部分,它突出了目标论文在参考文件中的贡献。作者可以通过使用自动生成的相关工作部分作为草稿来完成最终的相关工作,从而节省时间和精力。现有的相关作品章节生成方法大多依赖于抽取现成的句子,对目标作品和参考文献进行比较讨论。然而,这样的句子需要事先写好,在实践中很难获得。因此,本文提出了一种抽象的目标感知相关工作生成器(TAG) ,它可以生成由新句子组成的相关工作部分。具体地说,我们首先提出了一种目标感知图形编码器,它使用以目标为中心的注意机制来模拟参考文献和目标文献之间的关系。在解码过程中,我们提出了一种以关键词作为语义指标的层次化解码器,它可以处理图中不同层次的节点。最后,为了生成信息量更大的相关作品,我们提出了多层次对比优化目标,目的是最大化生成的相关作品与参考文献之间的相互信息,最小化非参考文献之间的相互信息。对两个公共学者数据集的大量实验表明,该模型在自动化和量身定制的人类评估方面比几个强大的基线带来了实质性的改进。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Target-aware+Abstractive+Related+Work+Generation+with+Contrastive+Learning)|0| |[Information Need Awareness: An EEG Study](https://doi.org/10.1145/3477495.3531999)|Dominika Michalkova, Mario ParraRodriguez, Yashar Moshfeghi|University of Strathclyde, Glasgow, United Kingdom|A fundamental goal of Information Retrieval (IR) is to satisfy searchers' information need (IN). Advances in neuroimaging technologies have allowed for interdisciplinary research to investigate the brain activity associated with the realisation of IN. While these studies have been informative, they were not able to capture the cognitive processes underlying the realisation of IN and the interplay between them with a high temporal resolution. This paper aims to investigate this research question by inferring the variability of brain activity based on the contrast of a state of IN with the two other (no-IN) scenarios. To do so, we employed Electroencephalography (EEG) and constructed an Event-Related Potential (ERP) analysis of the brain signals captured while the participants were experiencing the realisation of IN. In particular, the brain signals of 24 healthy participants were captured while performing a Question-Answering (Q/A) Task. Our results show a link between the early stages of processing, corresponding to awareness, and the late activity, meaning memory control mechanisms. Our findings also show that participants exhibited early N1-P2 complex indexing awareness processes and indicate, thus, that the realisation of IN is manifested in the brain before it reaches the user's consciousness. This research contributes novel insights into a better understanding of IN and informs the design of IR systems to better satisfy it.|信息检索(IR)的一个基本目标是满足用户的信息需求(IN)。神经影像技术的进步使得跨学科研究能够考察与信息需求产生相关的大脑活动。虽然这些研究提供了很多信息,但是它们并没有能够以高时间解析度捕捉到信息需求产生背后的认知过程以及这些过程之间的相互作用。本文旨在通过比较信息需求状态与其他两种(无信息需求)情境的差异来推断大脑活动的变异性,从而探讨这一研究问题。为了做到这一点,我们使用了脑电图(EEG),并对参与者在体验信息需求产生时捕捉到的大脑信号构建了事件相关电位(ERP)分析。特别是,24名健康参与者的大脑信号在进行问答(Q/A)任务时被捕获。我们的研究结果显示,对应于意识的早期加工阶段与反映记忆控制机制的晚期活动之间存在联系。我们的研究结果还表明,参与者表现出标志意识加工的早期 N1-P2 复合波,因此表明信息需求的产生在到达用户意识之前就已在大脑中有所表现。这项研究为更好地理解信息需求提供了新的见解,并为设计能更好满足信息需求的信息检索系统提供了参考。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Information+Need+Awareness:+An+EEG+Study)|0| |[Unifying Cross-lingual Summarization and Machine Translation with Compression Rate](https://doi.org/10.1145/3477495.3532071)|Yu Bai, Heyan Huang, Kai Fan, Yang Gao, Yiming Zhu, Jiaao Zhan, Zewen Chi, Boxing Chen||Cross-lingual Summarization (CLS), converting a document into a cross-lingual summary, is highly related to the Machine Translation (MT) task. However, MT resources are still underutilized for the CLS task. In this paper, we propose a novel task, Cross-lingual Summarization with Compression rate (CSC), to benefit cross-lingual summarization through a large-scale MT corpus.
Through introducing compression rate, we regard MT task as a special CLS task with the compression rate of 100%. Hence they can be trained as a unified task, sharing knowledge more effectively. Moreover, to bridge these two tasks smoothly, we propose a simple yet effective data augmentation method to produce document-summary pairs with different compression rates. The proposed method not only improves the performance of CLS task, but also provides controllability to generate summaries in desired lengths. Experiments demonstrate that our method outperforms various strong baselines.|跨语言摘要(CLS)是将文档转换为跨语言摘要的一种技术,它与机器翻译(MT)任务密切相关。然而,用于 CLS 任务的 MT 资源仍未得到充分利用。在本文中,我们提出了一个新的任务,跨语言压缩率摘要(CSC) ,以利于跨语言摘要通过大规模的机器翻译语料库。通过引入压缩率,将机器翻译任务视为一种特殊的 CLS 任务,压缩率为100% 。因此,他们可以被训练成一个统一的任务,更有效地分享知识。此外,为了平滑地连接这两个任务,我们提出了一种简单而有效的数据增强方法来产生不同压缩率的文档-摘要对。该方法不仅提高了 CLS 任务的性能,而且提供了生成所需长度摘要的可控性。实验结果表明,该方法的性能优于各种强基线。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Unifying+Cross-lingual+Summarization+and+Machine+Translation+with+Compression+Rate)|0| |[What Makes the Story Forward?: Inferring Commonsense Explanations as Prompts for Future Event Generation](https://doi.org/10.1145/3477495.3532080)|Li Lin, Yixin Cao, Lifu Huang, Shuang Li, Xuming Hu, Lijie Wen, Jianmin Wang|Virginia Tech, Blacksburg, VA, USA; Singapore Management University, Singapore, Singapore; Tsinghua University, Beijing, China|Prediction over event sequences is critical for many real-world applications in Information Retrieval and Natural Language Processing. Future Event Generation (FEG) is a challenging task in event sequence prediction because it requires not only fluent text generation but also commonsense reasoning to maintain the logical coherence of the entire event story. In this paper, we propose a novel explainable FEG framework, Coep. It highlights and integrates two types of event knowledge, sequential knowledge of direct event-event relations and inferential knowledge that reflects the intermediate character psychology between events, such as intents, causes, reactions, which intrinsically pushes the story forward. To alleviate the knowledge forgetting issue, we design two modules, IM and GM, for each type of knowledge, which are combined via prompt tuning. First, IM focuses on understanding inferential knowledge to generate commonsense explanations and provide a soft prompt vector for GM. We also design a contrastive discriminator for better generalization ability. Second, GM generates future events by modeling direct sequential knowledge with the guidance of IM. 
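As an aside, the IM-to-GM coupling described above can be pictured with a small sketch; the GRU modules below are toy stand-ins for the paper's components, and all dimensions are assumptions. IM summarizes the story context into a soft prompt, which is prepended to GM's token embeddings before decoding future-event tokens.

```python
# Toy sketch of soft-prompt conditioning: IM encodes context, GM generates.
import torch
import torch.nn as nn

class SoftPromptFEG(nn.Module):
    def __init__(self, vocab=1000, d=64, prompt_len=4):
        super().__init__()
        self.embed = nn.Embedding(vocab, d)
        self.im = nn.GRU(d, d, batch_first=True)        # stand-in inference module
        self.to_prompt = nn.Linear(d, prompt_len * d)   # context summary -> soft prompt
        self.gm = nn.GRU(d, d, batch_first=True)        # stand-in generation module
        self.lm_head = nn.Linear(d, vocab)
        self.prompt_len, self.d = prompt_len, d

    def forward(self, context_ids, event_ids):
        _, h = self.im(self.embed(context_ids))                  # (1, B, d)
        prompt = self.to_prompt(h[-1]).view(-1, self.prompt_len, self.d)
        x = torch.cat([prompt, self.embed(event_ids)], dim=1)    # prompt as prefix
        out, _ = self.gm(x)
        return self.lm_head(out[:, self.prompt_len:])            # future-event logits

logits = SoftPromptFEG()(torch.randint(0, 1000, (2, 12)), torch.randint(0, 1000, (2, 6)))
```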
Automatic and human evaluation demonstrate that our approach can generate more coherent, specific, and logical future events.|对事件序列的预测对于信息检索和自然语言处理中的许多实际应用来说是至关重要的。未来事件生成(FEG)是事件序列预测中的一个具有挑战性的任务,因为它不仅需要流畅的文本生成,而且需要常识推理来保持整个事件故事的逻辑一致性。在本文中,我们提出了一个新的可解释的 FEG 框架,Coep。它突出并整合了两种类型的事件知识,即直接事件-事件关系的序贯知识和反映事件之间的中间人物心理的推理知识,这些中间人物心理在本质上推动了故事的发展。为了解决知识遗忘问题,我们针对每种类型的知识设计了两个模块,即 IM 模块和 GM 模块,并通过快速调整将它们结合起来。首先,IM 侧重于理解推理知识,产生常识性的解释,并为 GM 提供一个软提示向量。为了提高泛化能力,我们还设计了一种对比鉴别器。其次,GM 在 IM 的指导下,通过建模直接序列知识生成未来事件。自动和人工评估表明,我们的方法可以生成更加连贯、具体和合乎逻辑的未来事件。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=What+Makes+the+Story+Forward?:+Inferring+Commonsense+Explanations+as+Prompts+for+Future+Event+Generation)|0| |[A Dual-Expert Framework for Event Argument Extraction](https://doi.org/10.1145/3477495.3531923)|Rui Li, Wenlin Zhao, Cheng Yang, Sen Su|Beijing University of Posts and Telecommunications, Beijing, China|Event argument extraction (EAE) is an important information extraction task, which aims to identify the arguments of an event described in a given text and classify the roles played by them. A key characteristic in realistic EAE data is that the instance numbers of different roles follow an obvious long-tail distribution. However, the training and evaluation paradigms of existing EAE models either prone to neglect the performance on "tail roles'', or change the role instance distribution for model training to an unrealistic uniform distribution. Though some generic methods can alleviate the class imbalance in long-tail datasets, they usually sacrifice the performance of "head classes'' as a trade-off. To address the above issues, we propose to train our model on realistic long-tail EAE datasets, and evaluate the average performance over all roles. Inspired by the Mixture of Experts (MOE), we propose a Routing-Balanced Dual Expert Framework (RBDEF), which divides all roles into "head" and "tail" two scopes and assigns the classifications of head and tail roles to two separate experts. In inference, each encoded instance will be allocated to one of the two experts by a routing mechanism. To reduce routing errors caused by the imbalance of role instances, we design a Balanced Routing Mechanism (BRM), which transfers several head roles to the tail expert to balance the load of routing, and employs a tri-filter routing strategy to reduce the misallocation of the tail expert's instances. To enable an effective learning of tail roles with scarce instances, we devise Target-Specialized Meta Learning (TSML) to train the tail expert. Different from other meta learning algorithms that only search a generic parameter initialization equally applying to infinite tasks, TSML can adaptively adjust its search path to obtain a specialized initialization for the tail expert, thereby expanding the benefits to the learning of tail roles. 
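The routing idea can be sketched as follows; the module sizes, the number of transferred roles, and the hard argmax routing are simplifying assumptions on our part rather than the paper's exact design.

```python
# Sketch of dual-expert routing with a few head roles moved to the tail expert.
import torch
import torch.nn as nn

class DualExpertEAE(nn.Module):
    def __init__(self, d=64, n_head_roles=20, n_tail_roles=80, transferred=5):
        super().__init__()
        self.router = nn.Linear(d, 2)   # decides head-scope vs tail-scope
        # transferring some head roles balances the load on the two experts
        self.head_expert = nn.Linear(d, n_head_roles - transferred)
        self.tail_expert = nn.Linear(d, n_tail_roles + transferred)

    def forward(self, h):                         # h: (B, d) encoded instances
        scope = self.router(h).argmax(dim=-1)     # 0 -> head expert, 1 -> tail expert
        return scope, self.head_expert(h), self.tail_expert(h)

scope, head_logits, tail_logits = DualExpertEAE()(torch.randn(4, 64))
```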
In experiments, RBDEF significantly outperforms the state-of-the-art EAE models and advanced methods for long-tail data.|事件参数提取(EAE)是一项重要的信息抽取任务,其目的是识别给定文本中描述的事件的参数,并对它们所扮演的角色进行分类。实际 EAE 数据的一个关键特征是不同角色的实例数量遵循明显的长尾分布。然而,现有 EAE 模型的训练和评估范式要么忽视了“尾部角色”的表现,要么将模型训练的角色实例分布改变为不现实的统一分布。虽然一些通用的方法可以缓解长尾数据集中的类不平衡,但它们通常牺牲“头类”的性能作为一种权衡。为了解决上述问题,我们建议在实际的长尾 EAE 数据集上训练我们的模型,并评估所有角色的平均性能。受专家混合模型(MOE)的启发,本文提出了一种路由平衡双专家框架(RBDEF) ,该框架将所有角色划分为“头部”和“尾部”两个范围,并将“头部”和“尾部”角色的分类分配给两个独立的专家。在推理中,每个编码实例将通过路由机制分配给两个专家中的一个。为了减少由于角色实例不平衡而引起的路由错误,设计了一种平衡路由机制(BRM) ,将多个主要角色转移给尾部专家以平衡路由负载,并采用三重过滤路由策略来减少尾部专家实例的错误分配。为了能够有效地学习尾部角色与稀少的实例,我们设计了目标专门化元学习(TSML) ,以培训尾部专家。不同于其他元学习算法,只搜索一个通用的参数初始化同样适用于无限的任务,TSML 可以自适应地调整其搜索路径,以获得专门的初始化尾部专家,从而扩大的好处,尾部角色的学习。在实验中,RBDEF 的性能明显优于最先进的 EAE 模型和先进的长尾数据处理方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Dual-Expert+Framework+for+Event+Argument+Extraction)|0| -|[CorED: Incorporating Type-level and Instance-level Correlations for Fine-grained Event Detection](https://doi.org/10.1145/3477495.3531956)|Jiawei Sheng, Rui Sun, Shu Guo, Shiyao Cui, Jiangxia Cao, Lihong Wang, Tingwen Liu, Hongbo Xu|Beihang University, Beijing, China; National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing, China; Institute of Information Engineering, Chinese Academy of Sciences & UCAS, Beijing, China|Event detection (ED) is a pivotal task for information retrieval, which aims at identifying event triggers and classifying them into pre-defined event types. In real-world applications, events are usually annotated with numerous fine-grained types, which often arises long-tail type nature and co-occurrence event nature. Existing studies explore the event correlations without full utilization, which may limit the capability of event detection. This paper simultaneously incorporates both the type-level and instance-level event correlations, and proposes a novel framework, termed as CorED. Specifically, we devise an adaptive graph-based type encoder to capture instance-level correlations, learning type representations not only from their training data but also from their relevant types, thus leading to more informative type representations especially for the low-resource types. Besides, we devise an instance interactive decoder to capture instance-level correlations, which predicts event instance types conditioned on the contextual typed event instances, leveraging co-occurrence events as remarkable evidence in prediction. We conduct experiments on two public benchmarks, MAVEN and ACE-2005 dataset. 
Empirical results demonstrate the unity of both type-level and instance-level correlations, and the model achieves effectiveness performance on both benchmarks.|事件检测是信息检索的关键任务,其目的是识别事件触发器,并将其分类为预先定义的事件类型。在实际应用程序中,事件通常用许多细粒度类型进行注释,这些类型通常具有长尾类型性质和共现事件性质。现有的研究在未充分利用事件相关性的情况下,可能会限制事件检测的能力。本文同时结合了类型级和实例级的事件相关性,提出了一种新的框架,称为 CoreED。具体来说,我们设计了一种基于自适应图形的类型编码器来捕获实例级相关性,不仅从它们的训练数据中学习类型表示,而且从它们的相关类型中学习类型表示,从而导致更多的信息类型表示,特别是对于低资源类型。此外,我们设计了一个实例交互式解码器来捕获实例级的相关性,该解码器利用共现事件作为预测中的显著证据来预测以上下文类型的事件实例为条件的事件实例类型。我们在 MAVEN 和 ACE-2005两个公共基准数据集上进行了实验。实证结果表明,该模型实现了类型级和实例级相关性的统一,达到了两个基准的有效性表现。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=CorED:+Incorporating+Type-level+and+Instance-level+Correlations+for+Fine-grained+Event+Detection)|0| -|[QUASER: Question Answering with Scalable Extractive Rationalization](https://doi.org/10.1145/3477495.3532049)|Asish Ghoshal, Srinivasan Iyer, Bhargavi Paranjape, Kushal Lakhotia, Scott Wentau Yih, Yashar Mehdad|Meta AI, Seattle, WA, USA; University of Washington, Seattle, WA, USA|Designing natural language processing (NLP) models that produce predictions by first extracting a set of relevant input sentences, i.e., rationales, is gaining importance for improving model interpretability and producing supporting evidence for users. Current unsupervised approaches are designed to extract rationales that maximize prediction accuracy, which is invariably obtained by exploiting spurious correlations in datasets, and leads to unconvincing rationales. In this paper, we introduce unsupervised generative models to extract dual-purpose rationales, which must not only be able to support a subsequent answer prediction, but also support a reproduction of the input query. We show that such models can produce more meaningful rationales, that are less influenced by dataset artifacts, and as a result, also achieve the state-of-the-art on rationale extraction metrics on four datasets from the ERASER benchmark, significantly improving upon previous unsupervised methods. Our multi-task model is scalable and enables using state-of-the-art pretrained language models to design explainable question answering systems.|设计自然语言处理(NLP)模型,通过首先提取一组相关的输入句,即基本原理,来产生预测,这对于提高模型的可解释性和为用户提供支持性证据越来越重要。目前的无监督方法旨在提取最大限度地提高预测准确性的理由,这些理由总是通过利用数据集中的虚假相关性来获得,并导致不令人信服的理由。在本文中,我们引入了无监督生成模型来提取双重用途的基本原理,它不仅要能够支持后续的答案预测,而且还要能够支持输入查询的重现。我们表明,这样的模型可以产生更有意义的基本原理,这些基本原理受数据集伪影的影响较小,因此,在 ERASER 基准的四个数据集上也实现了最先进的基本原理提取指标,显着改善了以前的无监督方法。我们的多任务模型是可扩展的,并且能够使用最先进的预先训练的语言模型来设计可解释的问答系统。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=QUASER:+Question+Answering+with+Scalable+Extractive+Rationalization)|0| +|[CorED: Incorporating Type-level and Instance-level Correlations for Fine-grained Event Detection](https://doi.org/10.1145/3477495.3531956)|Jiawei Sheng, Rui Sun, Shu Guo, Shiyao Cui, Jiangxia Cao, Lihong Wang, Tingwen Liu, Hongbo Xu|National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing, China; Beihang University, Beijing, China; Institute of Information Engineering, Chinese Academy of Sciences & UCAS, Beijing, China|Event detection (ED) is a pivotal task for information retrieval, which aims at identifying event triggers and classifying them into pre-defined event types. In real-world applications, events are usually annotated with numerous fine-grained types, which often arises long-tail type nature and co-occurrence event nature. Existing studies explore the event correlations without full utilization, which may limit the capability of event detection. 
This paper simultaneously incorporates both the type-level and instance-level event correlations, and proposes a novel framework, termed as CorED. Specifically, we devise an adaptive graph-based type encoder to capture instance-level correlations, learning type representations not only from their training data but also from their relevant types, thus leading to more informative type representations especially for the low-resource types. Besides, we devise an instance interactive decoder to capture instance-level correlations, which predicts event instance types conditioned on the contextual typed event instances, leveraging co-occurrence events as remarkable evidence in prediction. We conduct experiments on two public benchmarks, MAVEN and ACE-2005 dataset. Empirical results demonstrate the unity of both type-level and instance-level correlations, and the model achieves effectiveness performance on both benchmarks.|事件检测是信息检索的关键任务,其目的是识别事件触发器,并将其分类为预先定义的事件类型。在实际应用程序中,事件通常用许多细粒度类型进行注释,这些类型通常具有长尾类型性质和共现事件性质。现有的研究在未充分利用事件相关性的情况下,可能会限制事件检测的能力。本文同时结合了类型级和实例级的事件相关性,提出了一种新的框架,称为 CorED。具体来说,我们设计了一种基于自适应图形的类型编码器来捕获实例级相关性,不仅从它们的训练数据中学习类型表示,而且从它们的相关类型中学习类型表示,从而导致更多的信息类型表示,特别是对于低资源类型。此外,我们设计了一个实例交互式解码器来捕获实例级的相关性,该解码器利用共现事件作为预测中的显著证据来预测以上下文类型的事件实例为条件的事件实例类型。我们在 MAVEN 和 ACE-2005两个公共基准数据集上进行了实验。实证结果表明,该模型实现了类型级和实例级相关性的统一,达到了两个基准的有效性表现。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=CorED:+Incorporating+Type-level+and+Instance-level+Correlations+for+Fine-grained+Event+Detection)|0| +|[QUASER: Question Answering with Scalable Extractive Rationalization](https://doi.org/10.1145/3477495.3532049)|Asish Ghoshal, Srinivasan Iyer, Bhargavi Paranjape, Kushal Lakhotia, Scott Wentau Yih, Yashar Mehdad|University of Washington, Seattle, WA, USA; Meta AI, Seattle, WA, USA|Designing natural language processing (NLP) models that produce predictions by first extracting a set of relevant input sentences, i.e., rationales, is gaining importance for improving model interpretability and producing supporting evidence for users. Current unsupervised approaches are designed to extract rationales that maximize prediction accuracy, which is invariably obtained by exploiting spurious correlations in datasets, and leads to unconvincing rationales. In this paper, we introduce unsupervised generative models to extract dual-purpose rationales, which must not only be able to support a subsequent answer prediction, but also support a reproduction of the input query. We show that such models can produce more meaningful rationales, that are less influenced by dataset artifacts, and as a result, also achieve the state-of-the-art on rationale extraction metrics on four datasets from the ERASER benchmark, significantly improving upon previous unsupervised methods. 
Our multi-task model is scalable and enables using state-of-the-art pretrained language models to design explainable question answering systems.|设计自然语言处理(NLP)模型,通过首先提取一组相关的输入句,即基本原理,来产生预测,这对于提高模型的可解释性和为用户提供支持性证据越来越重要。目前的无监督方法旨在提取最大限度地提高预测准确性的理由,这些理由总是通过利用数据集中的虚假相关性来获得,并导致不令人信服的理由。在本文中,我们引入了无监督生成模型来提取双重用途的基本原理,它不仅要能够支持后续的答案预测,而且还要能够支持输入查询的重现。我们表明,这样的模型可以产生更有意义的基本原理,这些基本原理受数据集伪影的影响较小,因此,在 ERASER 基准的四个数据集上也实现了最先进的基本原理提取指标,显着改善了以前的无监督方法。我们的多任务模型是可扩展的,并且能够使用最先进的预先训练的语言模型来设计可解释的问答系统。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=QUASER:+Question+Answering+with+Scalable+Extractive+Rationalization)|0| |[PTAU: Prompt Tuning for Attributing Unanswerable Questions](https://doi.org/10.1145/3477495.3532048)|Jinzhi Liao, Xiang Zhao, Jianming Zheng, Xinyi Li, Fei Cai, Jiuyang Tang|National University of Defense Technology, Changsha, China|Current question answering systems are insufficient when confronting real-life scenarios, as they can hardly be aware of whether a question is answerable given its context. Hence, there is a recent pursuit of unanswerability of a question and its attribution. Attribution of unanswerability requires the system to choose an appropriate cause for an unanswerable question. As the task is sophisticated for even human beings, it is expensive to acquire labeled data, which makes it a low-data regime problem. Moreover, the causes themselves are semantically abstract and complex, and the process of attribution is heavily question- and context-dependent. Thus, a capable model has to carefully appreciate the causes, and then, judiciously contrast the question with its context, in order to cast it into the right cause. In response to the challenges, we present PTAU, which refers to and implements a high-level human reading strategy such that one reads with anticipation. In specific, PTAU leverages the recent prompt-tuning paradigm, and is further enhanced with two innovatively conceived modules: 1) a cause-oriented template module that constructs continuous templates towards certain attributing class in high dimensional vector space; and 2) a semantics-aware label module that exploits label semantics through contrastive learning to render the classes distinguishable. 
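A compact sketch of the two PTAU-style modules follows, under our own simplifying assumptions: a trainable continuous template prepended to the input, and label embeddings kept mutually distinguishable by a contrastive regularizer. None of the layer sizes below come from the paper.

```python
# Sketch: continuous prompt template + label-semantics classification head.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PromptAttributor(nn.Module):
    def __init__(self, d=64, n_classes=4, template_len=8):
        super().__init__()
        self.template = nn.Parameter(torch.randn(template_len, d) * 0.02)
        self.label_emb = nn.Parameter(torch.randn(n_classes, d) * 0.02)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True),
            num_layers=1)

    def forward(self, x):                                  # x: (B, T, d)
        t = self.template.expand(x.size(0), -1, -1)        # continuous template
        h = self.encoder(torch.cat([t, x], dim=1))[:, 0]   # read out first slot
        return h @ F.normalize(self.label_emb, dim=-1).T   # similarity to classes

def label_contrastive_reg(label_emb):
    sim = F.normalize(label_emb, dim=-1) @ F.normalize(label_emb, dim=-1).T
    return (sim - torch.diag(torch.diag(sim))).abs().mean()  # push classes apart

model = PromptAttributor()
logits, reg = model(torch.randn(2, 16, 64)), label_contrastive_reg(model.label_emb)
```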
Extensive experiments demonstrate that the proposed design better enlightens not only the attribution model, but also current question answering models, leading to superior performance.|当前的问答系统在面对现实情景时是不够的,因为它们很难意识到一个问题是否可以根据其上下文进行回答。因此,最近有一个追求一个问题及其归属无法回答。无法回答的归因要求系统为无法回答的问题选择合适的原因。由于这项任务甚至对人类来说都是复杂的,因此获取标记数据的成本很高,这使得它成为一个低数据量的系统问题。此外,原因本身具有语义抽象性和复杂性,归因过程严重依赖于问题和上下文。因此,一个有能力的模型必须仔细地鉴别原因,然后,明智地将问题与其上下文进行对比,以便将其投入正确的原因。为了应对这些挑战,我们提出了 PTAU,它提出并实施了一种高水平的人类阅读策略,使人们在阅读时带有预期性。具体而言,PTAU 利用了最近的提示调优范式,并进一步增强了两个创新构想的模块: 1)面向原因的模板模块,在高维向量空间中构建针对特定属性类的连续模板; 2)语义感知的标签模块,通过对比学习利用标签语义来使类可区分。大量实验表明,该设计不仅对归因模型有较好的启发作用,而且对现有的问答模型也有较好的启发作用,具有较好的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=PTAU:+Prompt+Tuning+for+Attributing+Unanswerable+Questions)|0| -|[DGQAN: Dual Graph Question-Answer Attention Networks for Answer Selection](https://doi.org/10.1145/3477495.3532084)|Haitian Yang, Xuan Zhao, Yan Wang, Min Li, Wei Chen, Weiqing Huang|Institute of Information Engineering, Chinese Academy of Sciences & School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China; York University, Toronto, Canada; Shanghai University of Finance and Economics, Shanghai, China|Community question answering (CQA) becomes increasingly prevalent in recent years, providing platforms for users with various backgrounds to obtain information and share knowledge. However, the redundancy and lengthiness issues of crowd-sourced answers limit the performance of answer selection, thus leading to difficulties in reading or even misunderstandings for community users. To solve these problems, we propose the dual graph question-answer attention networks (DGQAN) for answer selection task. Aims to fully understand the internal structure of the question and the corresponding answer, firstly, we construct a dual-CQA concept graph with graph convolution networks using the original question and answer text. Specifically, our CQA concept graph exploits the correlation information between question-answer pairs to construct two sub-graphs (QSubject-Answer and QBody-Answer), respectively. Further, a novel dual attention mechanism is incorporated to model both the internal and external semantic relations among questions and answers. More importantly, we conduct experiment to investigate the impact of each layer in the BERT model. The experimental results show that DGQAN model achieves state-of-the-art performance on three datasets (SemEval-2015, 2016, and 2017), outperforming all the baseline models.|近年来,社区问答越来越普遍,为不同背景的用户提供了获取信息和分享知识的平台。然而,众包答案的冗余和冗长问题限制了答案选择的性能,从而导致阅读困难,甚至对社区用户产生误解。为了解决这些问题,我们提出了双图问答注意网络(DGQAN)的答案选择任务。为了充分理解问题和相应答案的内部结构,首先利用原始问答文本构建了一个具有图卷积网络的双 CQA 概念图。具体来说,我们的 CQA 概念图利用问题-答案对之间的相关信息分别构造了两个子图(问题-答案和问题-答案)。此外,还引入了一种新的双重注意机制来建立问答之间的内部和外部语义关系模型。更重要的是,我们通过实验研究了 BERT 模型中各层的影响。实验结果表明,DGQAN 模型在三个数据集(SemEval-2015,2016和2017)上实现了最先进的性能,优于所有基线模型。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=DGQAN:+Dual+Graph+Question-Answer+Attention+Networks+for+Answer+Selection)|0| -|[Towards Event-level Causal Relation Identification](https://doi.org/10.1145/3477495.3531758)|Chuang Fan, Daoxing Liu, Libo Qin, Yue Zhang, Ruifeng Xu|Harbin Institute of Technology (Shenzhen), Shenzhen, China; WestLake University, Hangzhou, China; Harbin Institute of Technology, Harbin, China|Existing methods usually identify causal relations between events at the mention-level, which takes each event mention pair as a separate input. 
As a result, they either suffer from conflicts among causal relations predicted separately or require a set of additional constraints to resolve such conflicts. We propose to study this task in a more realistic setting, where event-level causality identification can be made. The advantage is two folds: 1) with modeling different mentions of an event as a single unit, no more conflicts among predicted results, without any extra constraints; 2) with the use of diverse knowledge sources (e.g., co-occurrence and coreference relations), a rich graph-based event structure can be induced from the document for supporting event-level causal inference. Graph convolutional network is used to encode such structural information, which aims to capture the local and non-local dependencies among nodes. Results show that our model achieves the best performance under both mention- and event-level settings, outperforming a number of strong baselines by at least 2.8% on F1 score.|现有的方法通常在提及级别上确定事件之间的因果关系,它将每个事件提及对作为一个单独的输入。因此,它们要么受到单独预测的因果关系之间的冲突的影响,要么需要一套额外的约束来解决这些冲突。我们建议在一个更现实的背景下研究这个任务,在这里可以进行事件级的因果关系识别。其优点有两个方面: 1)将事件的不同提及建模为一个单元,预测结果之间没有冲突,没有任何额外的约束; 2)利用多样化的知识来源(例如,共现关系和共参照关系) ,可以从文档中诱导出丰富的基于图的事件结构,以支持事件级因果推理。图卷积网络用于对这些结构信息进行编码,目的是捕获节点之间的局部和非局部依赖关系。结果表明,我们的模型在提及和事件级别设置下都达到了最佳性能,在 F1得分上比一些强基线至少高出2.8% 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Towards+Event-level+Causal+Relation+Identification)|0| -|[Hierarchical Task-aware Multi-Head Attention Network](https://doi.org/10.1145/3477495.3531781)|Jing Du, Lina Yao, Xianzhi Wang, Bin Guo, Zhiwen Yu|Northwestern Polytechnical University, Xi'an, Shaanxi, China; The University of New South Wales, Sydney, NSW, Australia; University of Technology Sydney, Sydney, NSW, Australia|Neural Multi-task Learning is gaining popularity as a way to learn multiple tasks jointly within a single model. While related research continues to break new ground, two major limitations still remain, including (i) poor generalization to scenarios where tasks are loosely correlated; and (ii) under-investigation on global commonality and local characteristics of tasks. Our aim is to bridge these gaps by presenting a neural multi-task learning model coined Hierarchical Task-aware Multi-headed Attention Network (HTMN). HTMN explicitly distinguishes task-specific features from task-shared features to reduce the impact caused by weak correlation between tasks. The proposed method highlights two parts: Multi-level Task-aware Experts Network that identifies task-shared global features and task-specific local features, and Hierarchical Multi-Head Attention Network that hybridizes global and local features to profile more robust and adaptive representations for each task. Afterwards, each task tower receives its hybrid task-adaptive representation to perform task-specific predictions. 
Extensive experiments on two real datasets show that HTMN consistently outperforms the compared methods on a variety of prediction tasks.|神经多任务学习作为一种在单个模型中联合学习多个任务的方法越来越受到人们的欢迎。尽管相关研究不断取得新进展,但仍然存在两个主要的局限性,包括(i)对任务松散相关的情景概括不足; 以及(ii)对任务的全局共性和局部特征调查不足。我们的目标是通过提出一个神经多任务学习模型来弥补这些差距,该模型被称为分层任务感知多头注意网络(HTMN)。HTMN 明确区分任务特定特性和任务共享特性,以减少任务之间相关性较弱所造成的影响。提出的方法突出了两个部分: 多级任务感知专家网络,识别任务共享的全局特征和任务特定的局部特征,和分层多头注意网络,混合全局和局部特征,以配置更健壮的和自适应的表示为每个任务。然后,每个任务塔接收它的混合任务自适应表示来执行任务特定的预测。在两个实际数据集上进行的大量实验表明,HTMN 在各种预测任务中的表现均优于比较方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Hierarchical+Task-aware+Multi-Head+Attention+Network)|0| +|[DGQAN: Dual Graph Question-Answer Attention Networks for Answer Selection](https://doi.org/10.1145/3477495.3532084)|Haitian Yang, Xuan Zhao, Yan Wang, Min Li, Wei Chen, Weiqing Huang|York University, Toronto, Canada; Institute of Information Engineering, Chinese Academy of Sciences & School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China; Shanghai University of Finance and Economics, Shanghai, China|Community question answering (CQA) becomes increasingly prevalent in recent years, providing platforms for users with various backgrounds to obtain information and share knowledge. However, the redundancy and lengthiness issues of crowd-sourced answers limit the performance of answer selection, thus leading to difficulties in reading or even misunderstandings for community users. To solve these problems, we propose the dual graph question-answer attention networks (DGQAN) for answer selection task. Aims to fully understand the internal structure of the question and the corresponding answer, firstly, we construct a dual-CQA concept graph with graph convolution networks using the original question and answer text. Specifically, our CQA concept graph exploits the correlation information between question-answer pairs to construct two sub-graphs (QSubject-Answer and QBody-Answer), respectively. Further, a novel dual attention mechanism is incorporated to model both the internal and external semantic relations among questions and answers. More importantly, we conduct experiment to investigate the impact of each layer in the BERT model. The experimental results show that DGQAN model achieves state-of-the-art performance on three datasets (SemEval-2015, 2016, and 2017), outperforming all the baseline models.|近年来,社区问答越来越普遍,为不同背景的用户提供了获取信息和分享知识的平台。然而,众包答案的冗余和冗长问题限制了答案选择的性能,从而导致阅读困难,甚至对社区用户产生误解。为了解决这些问题,我们提出了双图问答注意网络(DGQAN)的答案选择任务。为了充分理解问题和相应答案的内部结构,首先利用原始问答文本构建了一个具有图卷积网络的双 CQA 概念图。具体来说,我们的 CQA 概念图利用问题-答案对之间的相关信息分别构造了两个子图(问题-答案和问题-答案)。此外,还引入了一种新的双重注意机制来建立问答之间的内部和外部语义关系模型。更重要的是,我们通过实验研究了 BERT 模型中各层的影响。实验结果表明,DGQAN 模型在三个数据集(SemEval-2015,2016和2017)上实现了最先进的性能,优于所有基线模型。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=DGQAN:+Dual+Graph+Question-Answer+Attention+Networks+for+Answer+Selection)|0| +|[Towards Event-level Causal Relation Identification](https://doi.org/10.1145/3477495.3531758)|Chuang Fan, Daoxing Liu, Libo Qin, Yue Zhang, Ruifeng Xu|Harbin Institute of Technology, Harbin, China; Harbin Institute of Technology (Shenzhen), Shenzhen, China; WestLake University, Hangzhou, China|Existing methods usually identify causal relations between events at the mention-level, which takes each event mention pair as a separate input. As a result, they either suffer from conflicts among causal relations predicted separately or require a set of additional constraints to resolve such conflicts. 
We propose to study this task in a more realistic setting, where event-level causality identification can be made. The advantage is two folds: 1) with modeling different mentions of an event as a single unit, no more conflicts among predicted results, without any extra constraints; 2) with the use of diverse knowledge sources (e.g., co-occurrence and coreference relations), a rich graph-based event structure can be induced from the document for supporting event-level causal inference. Graph convolutional network is used to encode such structural information, which aims to capture the local and non-local dependencies among nodes. Results show that our model achieves the best performance under both mention- and event-level settings, outperforming a number of strong baselines by at least 2.8% on F1 score.|现有的方法通常在提及级别上确定事件之间的因果关系,它将每个事件提及对作为一个单独的输入。因此,它们要么受到单独预测的因果关系之间的冲突的影响,要么需要一套额外的约束来解决这些冲突。我们建议在一个更现实的背景下研究这个任务,在这里可以进行事件级的因果关系识别。其优点有两个方面: 1)将事件的不同提及建模为一个单元,预测结果之间没有冲突,没有任何额外的约束; 2)利用多样化的知识来源(例如,共现关系和共参照关系) ,可以从文档中诱导出丰富的基于图的事件结构,以支持事件级因果推理。图卷积网络用于对这些结构信息进行编码,目的是捕获节点之间的局部和非局部依赖关系。结果表明,我们的模型在提及和事件级别设置下都达到了最佳性能,在 F1得分上比一些强基线至少高出2.8% 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Towards+Event-level+Causal+Relation+Identification)|0| +|[Hierarchical Task-aware Multi-Head Attention Network](https://doi.org/10.1145/3477495.3531781)|Jing Du, Lina Yao, Xianzhi Wang, Bin Guo, Zhiwen Yu|The University of New South Wales, Sydney, NSW, Australia; Northwestern Polytechnical University, Xi'an, Shaanxi, China; University of Technology Sydney, Sydney, NSW, Australia|Neural Multi-task Learning is gaining popularity as a way to learn multiple tasks jointly within a single model. While related research continues to break new ground, two major limitations still remain, including (i) poor generalization to scenarios where tasks are loosely correlated; and (ii) under-investigation on global commonality and local characteristics of tasks. Our aim is to bridge these gaps by presenting a neural multi-task learning model coined Hierarchical Task-aware Multi-headed Attention Network (HTMN). HTMN explicitly distinguishes task-specific features from task-shared features to reduce the impact caused by weak correlation between tasks. The proposed method highlights two parts: Multi-level Task-aware Experts Network that identifies task-shared global features and task-specific local features, and Hierarchical Multi-Head Attention Network that hybridizes global and local features to profile more robust and adaptive representations for each task. Afterwards, each task tower receives its hybrid task-adaptive representation to perform task-specific predictions. 
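The global/local hybridization above lends itself to a short sketch; the layer shapes and the plain mean-pool after attention are illustrative assumptions, not the released architecture.

```python
# Sketch: shared (global) expert + per-task (local) experts, hybridized by
# multi-head attention before each task tower.
import torch
import torch.nn as nn

class HTMNSketch(nn.Module):
    def __init__(self, d=64, n_tasks=3):
        super().__init__()
        self.shared = nn.Linear(d, d)   # task-shared global expert
        self.locals = nn.ModuleList(nn.Linear(d, d) for _ in range(n_tasks))
        self.mix = nn.MultiheadAttention(d, num_heads=4, batch_first=True)
        self.towers = nn.ModuleList(nn.Linear(d, 1) for _ in range(n_tasks))

    def forward(self, x):                               # x: (B, d)
        g = self.shared(x)
        outs = []
        for local, tower in zip(self.locals, self.towers):
            feats = torch.stack([g, local(x)], dim=1)   # (B, 2, d)
            mixed, _ = self.mix(feats, feats, feats)    # attention hybridizes them
            outs.append(tower(mixed.mean(dim=1)))       # task-adaptive prediction
        return outs

preds = HTMNSketch()(torch.randn(5, 64))
```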
Extensive experiments on two real datasets show that HTMN consistently outperforms the compared methods on a variety of prediction tasks.|神经多任务学习作为一种在单个模型中联合学习多个任务的方法越来越受到人们的欢迎。尽管相关研究不断取得新进展,但仍然存在两个主要的局限性,包括(i)对任务松散相关的情景概括不足; 以及(ii)对任务的全局共性和局部特征调查不足。我们的目标是通过提出一个神经多任务学习模型来弥补这些差距,该模型被称为分层任务感知多头注意网络(HTMN)。HTMN 明确区分任务特定特性和任务共享特性,以减少任务之间相关性较弱所造成的影响。提出的方法突出了两个部分: 多级任务感知专家网络,识别任务共享的全局特征和任务特定的局部特征,和分层多头注意网络,混合全局和局部特征,以配置更健壮的和自适应的表示为每个任务。然后,每个任务塔接收它的混合任务自适应表示来执行任务特定的预测。在两个实际数据集上进行的大量实验表明,HTMN 在各种预测任务中的表现均优于比较方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Hierarchical+Task-aware+Multi-Head+Attention+Network)|0| |[Enhancing Event-Level Sentiment Analysis with Structured Arguments](https://doi.org/10.1145/3477495.3531784)|Qi Zhang, Jie Zhou, Qin Chen, Qingchun Bai, Liang He|Shanghai Open University, Shanghai, China; Fudan University, Shanghai, China; East China Normal University, Shanghai, China|Previous studies about event-level sentiment analysis (SA) usually model the event as a topic, a category or target terms, while the structured arguments (e.g., subject, object, time and location) that have potential effects on the sentiment are not well studied. In this paper, we redefine the task as structured event-level SA and propose an End-to-End Event-level Sentiment Analysis (E3SA) approach to solve this issue. Specifically, we explicitly extract and model the event structure information for enhancing event-level SA. Extensive experiments demonstrate the great advantages of our proposed approach over the state-of-the-art methods. Noting the lack of the dataset, we also release a large-scale real-world dataset with event arguments and sentiment labelling for promoting more researches.|以往关于事件层面情绪分析(SA)的研究通常将事件建模为一个主题、一个类别或目标术语,而对情绪有潜在影响的结构化论证(如主语、客体、时间和地点)则没有得到很好的研究。本文将任务重新定义为结构化事件级情绪分析,并提出了一种端到端事件级情绪分析(E3SA)方法来解决这一问题。具体来说,我们显式地提取和建模事件结构信息,以增强事件级 SA。大量的实验证明了我们提出的方法相对于最先进的方法的巨大优势。注意到数据集的缺乏,我们也发布了一个大规模的现实世界的数据集与事件论点和情绪标签,以促进更多的研究。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Enhancing+Event-Level+Sentiment+Analysis+with+Structured+Arguments)|0| -|[Translation-Based Implicit Annotation Projection for Zero-Shot Cross-Lingual Event Argument Extraction](https://doi.org/10.1145/3477495.3531808)|Chenwei Lou, Jun Gao, Changlong Yu, Wei Wang, Huan Zhao, Weiwei Tu, Ruifeng Xu|Harbin Institute of Technology (Shenzhen), Shenzhen, China; 4Paradigm Inc, Beijing, China; The Hong Kong University of Science and Technology, Hong Kong, Hong Kong; Tsinghua University, Beijing, China|Zero-shot cross-lingual event argument extraction (EAE) is a challenging yet practical problem in Information Extraction. Most previous works heavily rely on external structured linguistic features, which are not easily accessible in real-world scenarios. This paper investigates a translation-based method to implicitly project annotations from the source language to the target language. With the use of translation-based parallel corpora, no additional linguistic features are required during training and inference. As a result, the proposed approach is more cost effective than previous works on zero-shot cross-lingual EAE. Moreover, our implicit annotation projection approach introduces less noises and hence is more effective and robust than explicit ones. Experimental results show that our model achieves the best performance, outperforming a number of competitive baselines. 
The thorough analysis further demonstrates the effectiveness of our model compared to explicit annotation projection approaches.|零镜头跨语言事件参数提取(EAE)是一个具有挑战性的实际信息抽取问题。以往的大多数作品都严重依赖于外部结构化语言特征,而这些特征在现实世界中并不容易获得。本文研究了一种基于翻译的方法来隐式地将注释从源语言投射到目标语言。使用基于翻译的平行语料库,在训练和推理过程中不需要额外的语言特征。结果表明,本文提出的方法比以往针对零镜头跨语言 EAE 的研究更具有成本效益。此外,我们的隐式注释投影方法引入较少的噪声,因此比显式更有效和鲁棒性。实验结果表明,我们的模型达到了最佳的性能,超过了一些竞争基线。深入的分析进一步证明了我们的模型与显式注释投影方法相比的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Translation-Based+Implicit+Annotation+Projection+for+Zero-Shot+Cross-Lingual+Event+Argument+Extraction)|0| +|[Translation-Based Implicit Annotation Projection for Zero-Shot Cross-Lingual Event Argument Extraction](https://doi.org/10.1145/3477495.3531808)|Chenwei Lou, Jun Gao, Changlong Yu, Wei Wang, Huan Zhao, Weiwei Tu, Ruifeng Xu|The Hong Kong University of Science and Technology, Hong Kong, Hong Kong; Harbin Institute of Technology (Shenzhen), Shenzhen, China; Tsinghua University, Beijing, China; 4Paradigm Inc, Beijing, China|Zero-shot cross-lingual event argument extraction (EAE) is a challenging yet practical problem in Information Extraction. Most previous works heavily rely on external structured linguistic features, which are not easily accessible in real-world scenarios. This paper investigates a translation-based method to implicitly project annotations from the source language to the target language. With the use of translation-based parallel corpora, no additional linguistic features are required during training and inference. As a result, the proposed approach is more cost effective than previous works on zero-shot cross-lingual EAE. Moreover, our implicit annotation projection approach introduces less noises and hence is more effective and robust than explicit ones. Experimental results show that our model achieves the best performance, outperforming a number of competitive baselines. The thorough analysis further demonstrates the effectiveness of our model compared to explicit annotation projection approaches.|零镜头跨语言事件参数提取(EAE)是一个具有挑战性的实际信息抽取问题。以往的大多数作品都严重依赖于外部结构化语言特征,而这些特征在现实世界中并不容易获得。本文研究了一种基于翻译的方法来隐式地将注释从源语言投射到目标语言。使用基于翻译的平行语料库,在训练和推理过程中不需要额外的语言特征。结果表明,本文提出的方法比以往针对零镜头跨语言 EAE 的研究更具有成本效益。此外,我们的隐式注释投影方法引入较少的噪声,因此比显式更有效和鲁棒性。实验结果表明,我们的模型达到了最佳的性能,超过了一些竞争基线。深入的分析进一步证明了我们的模型与显式注释投影方法相比的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Translation-Based+Implicit+Annotation+Projection+for+Zero-Shot+Cross-Lingual+Event+Argument+Extraction)|0| |[Understanding Long Programming Languages with Structure-Aware Sparse Attention](https://doi.org/10.1145/3477495.3531811)|Tingting Liu, Chengyu Wang, Cen Chen, Ming Gao, Aoying Zhou|Alibaba Group, Hangzhou, China; East China Normal University, Shanghai, China|Programming-based Pre-trained Language Models (PPLMs) such as CodeBERT have achieved great success in many downstream code-related tasks. Since the memory and computational complexity of self-attention in the Transformer grow quadratically with the sequence length, PPLMs typically limit the code length to 512. However, codes in real-world applications are generally long, such as code searches, which cannot be processed efficiently by existing PPLMs. To solve this problem, in this paper, we present SASA, a Structure-Aware Sparse Attention mechanism, which reduces the complexity and improves performance for long code understanding tasks. The key components in SASA are top-k sparse attention and Abstract Syntax Tree (AST)-based structure-aware attention. 
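For readers unfamiliar with top-k sparse attention, a minimal sketch follows. It materializes the full score matrix for clarity (an efficient implementation would avoid this), omits SASA's AST-based structural attention, and the value of `topk` is an arbitrary assumption.

```python
# Sketch: each query token attends only to its top-k highest-scoring keys.
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, topk=8):
    """q, k, v: (B, T, d)."""
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)    # (B, T, T)
    kth = scores.topk(topk, dim=-1).values[..., -1:]          # k-th largest per query
    scores = scores.masked_fill(scores < kth, float("-inf"))  # drop the rest
    return F.softmax(scores, dim=-1) @ v

out = topk_sparse_attention(torch.randn(2, 32, 64), torch.randn(2, 32, 64),
                            torch.randn(2, 32, 64))
```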
With top-k sparse attention, the most crucial attention relation can be obtained with a lower computational cost. As the code structure represents the logic of the code statements, which is a complement to the code sequence characteristics, we further introduce AST structures into attention. Extensive experiments on CodeXGLUE tasks show that SASA achieves better performance than the competing baselines.|基于编程的预训练语言模型(PPLM) ,如 CodeBERT,在许多下游代码相关的任务中取得了巨大的成功。由于变压器中自注意的内存和计算复杂度随序列长度的二次增长而增长,PPLM 通常将码长限制在512。然而,实际应用中的代码通常都很长,比如代码搜索,现有的 PPLM 无法有效地处理这些代码。为了解决这一问题,本文提出了一种基于结构感知的稀疏注意机制 SASA,该机制降低了长代码理解任务的复杂度,提高了性能。SASA 的关键组成部分是 top-k 稀疏注意和基于抽象语法树(AST)的结构感知注意。使用 top-k 稀疏注意,可以以较低的计算代价得到最关键的注意关系。由于代码结构代表了代码语句的逻辑,是对代码序列特性的补充,因此我们进一步引入了 AST 结构。在 CodeXGLUE 任务上的大量实验表明,SASA 比竞争基线获得了更好的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Understanding+Long+Programming+Languages+with+Structure-Aware+Sparse+Attention)|0| |[Dialogue Topic Segmentation via Parallel Extraction Network with Neighbor Smoothing](https://doi.org/10.1145/3477495.3531817)|Jinxiong Xia, Cao Liu, Jiansong Chen, Yuchen Li, Fan Yang, Xunliang Cai, Guanglu Wan, Houfeng Wang|Peking University, Beijing, China; Meituan, Beijing, China|Dialogue topic segmentation is a challenging task in which dialogues are split into segments with pre-defined topics. Existing works on topic segmentation adopt a two-stage paradigm, including text segmentation and segment labeling. However, such methods tend to focus on the local context in segmentation, and the inter-segment dependency is not well captured. Besides, the ambiguity and labeling noise in dialogue segment bounds bring further challenges to existing models. In this work, we propose the Parallel Extraction Network with Neighbor Smoothing (PEN-NS) to address the above issues. Specifically, we propose the parallel extraction network to perform segment extractions, optimizing the bipartite matching cost of segments to capture inter-segment dependency. Furthermore, we propose neighbor smoothing to handle the segment-bound noise and ambiguity. Experiments on a dialogue-based and a document-based topic segmentation dataset show that PEN-NS outperforms state-the-of-art models significantly.|对话主题分割是一个具有挑战性的任务,其中对话分割成具有预定义的主题片段。现有的主题切分研究采用两阶段模式,包括文本切分和段标注。然而,这些方法在分割过程中往往只关注局部上下文,而且不能很好地捕获分段间的依赖关系。此外,对话段边界的模糊性和标注噪声也给现有的模型带来了进一步的挑战。针对上述问题,本文提出了一种基于邻域平滑的并行抽取网络(PEN-NS)。具体来说,我们提出并行提取网络来执行分段提取,优化分段的二部匹配代价来捕获分段间的依赖关系。此外,我们还提出了邻域平滑法来处理分段定界噪声和模糊度。在基于对话和基于文档的主题分割数据集上进行的实验表明,PEN-NS 模型的性能明显优于目前最先进的模型。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Dialogue+Topic+Segmentation+via+Parallel+Extraction+Network+with+Neighbor+Smoothing)|0| -|[Expression Syntax Information Bottleneck for Math Word Problems](https://doi.org/10.1145/3477495.3531824)|Jing Xiong, Chengming Li, Min Yang, Xiping Hu, Bin Hu|Sun Yat-sen University, Shenzhen, China; Lanzhou University, Lanzhou, China; Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China|Math Word Problems (MWP) aims to automatically solve mathematical questions given in texts. Previous studies tend to design complex models to capture additional information in the original text so as to enable the model to gain more comprehensive features. In this paper, we turn our attention in the opposite direction, and work on how to discard redundant features containing spurious correlations for MWP. 
To this end, we design an Expression Syntax Information Bottleneck method for MWP (called ESIB) based on variational information bottleneck, which extracts essential features of the expression syntax tree while filtering latent-specific redundancy containing syntax-irrelevant features. The key idea of ESIB is to encourage multiple models to predict the same expression syntax tree for different problem representations of the same problem by mutual learning so as to capture consistent information of expression syntax tree and discard latent-specific redundancy. To improve the generalization ability of the model and generate more diverse expressions, we design a self-distillation loss to encourage the model to rely more on the expression syntax information in the latent space. Experimental results on two large-scale benchmarks show that our model not only achieves state-of-the-art results but also generates more diverse solutions.|数学词汇问题(MWP)旨在自动解决课文中给出的数学问题。以往的研究倾向于设计复杂的模型来捕捉原始文本中的附加信息,从而使模型获得更全面的特征。本文从相反的角度出发,研究了如何去除含有虚假相关的冗余特征。为此,我们设计了一种基于变分信息瓶颈的 MWP 表达式语法信息瓶颈方法(称为 ESIB) ,该方法在过滤包含语法无关特征的潜在特定冗余的同时,提取表达式语法树的基本特征。ESIB 的核心思想是鼓励多个模型通过相互学习对同一问题的不同问题表示预测相同的表达式语法树,从而获取表达式语法树的一致性信息,去除潜在的特定冗余。为了提高模型的泛化能力,生成更多不同的表达式,我们设计了一个自蒸馏损失,以鼓励模型更多地依赖潜在空间中的表达式语法信息。在两个大规模基准上的实验结果表明,该模型不仅取得了最佳的结果,而且产生了更加多样化的解决方案。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Expression+Syntax+Information+Bottleneck+for+Math+Word+Problems)|0| -|[Masking and Generation: An Unsupervised Method for Sarcasm Detection](https://doi.org/10.1145/3477495.3531825)|Rui Wang, Qianlong Wang, Bin Liang, Yi Chen, Zhiyuan Wen, Bing Qin, Ruifeng Xu|Harbin Institute of Technology, Shenzhen, China; Harbin Institute of Technology, Harbin, China; Harbin Institute of Technology & Peng Cheng Laboratory, Shenzhen, China|Existing approaches for sarcasm detection are mainly based on supervised learning, in which the promising performance largely depends on a considerable amount of labeled data or extra information. In the real world scenario, however, the abundant labeled data or extra information requires high labor cost, not to mention that sufficient annotated data is unavailable in many low-resource conditions. To alleviate this dilemma, we investigate sarcasm detection from an unsupervised perspective, in which we explore a masking and generation paradigm in the context to extract the context incongruities for learning sarcastic expression. Further, to improve the feature representations of the sentences, we use unsupervised contrastive learning to improve the sentence representation based on the standard dropout. Experimental results on six perceived sarcasm detection benchmark datasets show that our approach outperforms baselines. 
Simultaneously, our unsupervised method obtains comparative performance with supervised methods for the intended sarcasm dataset.|现有的挖苦检测方法主要基于监督式学习,其有效性很大程度上取决于大量的标记数据或额外信息。然而,在现实世界的场景中,大量的标记数据或额外的信息需要很高的人工成本,更不用说在许多资源不足的情况下没有足够的注释数据了。为了缓解这一困境,我们从无监督的角度研究了讽刺语的检测问题,探索了语境中的掩蔽和生成范式,以提取语境中的不一致性,从而学习讽刺语的表达。此外,为了改善句子的特征表示,我们使用无监督对比学习来改善基于标准辍学的句子表示。实验结果表明,该方法的性能优于基准测试。同时,我们的无监督方法获得了比较性能的监督方法为预期讽刺数据集。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Masking+and+Generation:+An+Unsupervised+Method+for+Sarcasm+Detection)|0| -|[Learned Token Pruning in Contextualized Late Interaction over BERT (ColBERT)](https://doi.org/10.1145/3477495.3531835)|Carlos Lassance, Maroua Maachou, Joohee Park, Stéphane Clinchant|Naver, Seoul, Republic of Korea; Naver Labs Europe, Meylan, France|BERT-based rankers have been shown very effective as rerankers in information retrieval tasks. In order to extend these models to full-ranking scenarios, the ColBERT model has been recently proposed, which adopts a late interaction mechanism. This mechanism allows for the representation of documents to be precomputed in advance. However, the late-interaction mechanism leads to large index size, as one needs to save a representation for each token of every document. In this work, we focus on token pruning techniques in order to mitigate this problem. We test four methods, ranging from simpler ones to the use of a single layer of attention mechanism to select the tokens to keep at indexing time. Our experiments show that for the MS MARCO-passages collection, indexes can be pruned up to 70% of their original size, without a significant drop in performance. We also evaluate on the MS MARCO-documents collection and the BEIR benchmark, which reveals some challenges for the proposed mechanism.|以 BERT 为基础的排名已经被证明在信息检索任务中非常有效。为了将这些模型扩展到完全排序的场景,最近提出了 ColBERT 模型,该模型采用了一种后交互机制。这种机制允许预先计算文档的表示形式。但是,后期交互机制会导致索引大小增加,因为需要为每个文档的每个标记保存表示形式。在这项工作中,我们重点关注令牌剪枝技术,以减轻这个问题。我们测试了四种方法,从简单的方法到使用单层注意机制在索引时选择要保留的令牌。我们的实验表明,对于 MS MARCO 段收集,索引可以修剪高达原始大小的70% ,性能没有明显下降。我们还对 MS MARCO 文档集和 BEIR 基准进行了评估,揭示了该机制面临的一些挑战。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learned+Token+Pruning+in+Contextualized+Late+Interaction+over+BERT+(ColBERT))|0| +|[Expression Syntax Information Bottleneck for Math Word Problems](https://doi.org/10.1145/3477495.3531824)|Jing Xiong, Chengming Li, Min Yang, Xiping Hu, Bin Hu|Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China; Lanzhou University, Lanzhou, China; Sun Yat-sen University, Shenzhen, China|Math Word Problems (MWP) aims to automatically solve mathematical questions given in texts. Previous studies tend to design complex models to capture additional information in the original text so as to enable the model to gain more comprehensive features. In this paper, we turn our attention in the opposite direction, and work on how to discard redundant features containing spurious correlations for MWP. To this end, we design an Expression Syntax Information Bottleneck method for MWP (called ESIB) based on variational information bottleneck, which extracts essential features of the expression syntax tree while filtering latent-specific redundancy containing syntax-irrelevant features. The key idea of ESIB is to encourage multiple models to predict the same expression syntax tree for different problem representations of the same problem by mutual learning so as to capture consistent information of expression syntax tree and discard latent-specific redundancy. 
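The mutual-learning objective just described can be written down compactly; the symmetric-KL agreement term and the weighting `alpha` below are our own simplification of the idea, not the paper's exact loss.

```python
# Sketch: two peers predict the same expression tree and are pushed to agree.
import torch
import torch.nn.functional as F

def mutual_learning_loss(logits_a, logits_b, gold, alpha=0.5):
    """logits_a/b: (B, T, V) peer predictions; gold: (B, T) tree token ids."""
    ce = F.cross_entropy(logits_a.flatten(0, 1), gold.flatten()) + \
         F.cross_entropy(logits_b.flatten(0, 1), gold.flatten())
    pa, pb = F.log_softmax(logits_a, -1), F.log_softmax(logits_b, -1)
    agree = F.kl_div(pa, pb.exp(), reduction="batchmean") + \
            F.kl_div(pb, pa.exp(), reduction="batchmean")   # consistent syntax signal
    return ce + alpha * agree

loss = mutual_learning_loss(torch.randn(2, 5, 30), torch.randn(2, 5, 30),
                            torch.randint(0, 30, (2, 5)))
```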
To improve the generalization ability of the model and generate more diverse expressions, we design a self-distillation loss to encourage the model to rely more on the expression syntax information in the latent space. Experimental results on two large-scale benchmarks show that our model not only achieves state-of-the-art results but also generates more diverse solutions.|数学词汇问题(MWP)旨在自动解决课文中给出的数学问题。以往的研究倾向于设计复杂的模型来捕捉原始文本中的附加信息,从而使模型获得更全面的特征。本文从相反的角度出发,研究了如何去除含有虚假相关的冗余特征。为此,我们设计了一种基于变分信息瓶颈的 MWP 表达式语法信息瓶颈方法(称为 ESIB) ,该方法在过滤包含语法无关特征的潜在特定冗余的同时,提取表达式语法树的基本特征。ESIB 的核心思想是鼓励多个模型通过相互学习对同一问题的不同问题表示预测相同的表达式语法树,从而获取表达式语法树的一致性信息,去除潜在的特定冗余。为了提高模型的泛化能力,生成更多不同的表达式,我们设计了一个自蒸馏损失,以鼓励模型更多地依赖潜在空间中的表达式语法信息。在两个大规模基准上的实验结果表明,该模型不仅取得了最佳的结果,而且产生了更加多样化的解决方案。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Expression+Syntax+Information+Bottleneck+for+Math+Word+Problems)|0| +|[Masking and Generation: An Unsupervised Method for Sarcasm Detection](https://doi.org/10.1145/3477495.3531825)|Rui Wang, Qianlong Wang, Bin Liang, Yi Chen, Zhiyuan Wen, Bing Qin, Ruifeng Xu|Harbin Institute of Technology, Harbin, China; Harbin Institute of Technology & Peng Cheng Laboratory, Shenzhen, China; Harbin Institute of Technology, Shenzhen, China|Existing approaches for sarcasm detection are mainly based on supervised learning, in which the promising performance largely depends on a considerable amount of labeled data or extra information. In the real world scenario, however, the abundant labeled data or extra information requires high labor cost, not to mention that sufficient annotated data is unavailable in many low-resource conditions. To alleviate this dilemma, we investigate sarcasm detection from an unsupervised perspective, in which we explore a masking and generation paradigm in the context to extract the context incongruities for learning sarcastic expression. Further, to improve the feature representations of the sentences, we use unsupervised contrastive learning to improve the sentence representation based on the standard dropout. Experimental results on six perceived sarcasm detection benchmark datasets show that our approach outperforms baselines. Simultaneously, our unsupervised method obtains comparative performance with supervised methods for the intended sarcasm dataset.|现有的挖苦检测方法主要基于监督式学习,其有效性很大程度上取决于大量的标记数据或额外信息。然而,在现实世界的场景中,大量的标记数据或额外的信息需要很高的人工成本,更不用说在许多资源不足的情况下没有足够的注释数据了。为了缓解这一困境,我们从无监督的角度研究了讽刺语的检测问题,探索了语境中的掩蔽和生成范式,以提取语境中的不一致性,从而学习讽刺语的表达。此外,为了改善句子的特征表示,我们使用无监督对比学习来改善基于标准辍学的句子表示。实验结果表明,该方法的性能优于基准测试。同时,我们的无监督方法获得了比较性能的监督方法为预期讽刺数据集。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Masking+and+Generation:+An+Unsupervised+Method+for+Sarcasm+Detection)|0| +|[Learned Token Pruning in Contextualized Late Interaction over BERT (ColBERT)](https://doi.org/10.1145/3477495.3531835)|Carlos Lassance, Maroua Maachou, Joohee Park, Stéphane Clinchant|Naver Labs Europe, Meylan, France; Naver, Seoul, Republic of Korea|BERT-based rankers have been shown very effective as rerankers in information retrieval tasks. In order to extend these models to full-ranking scenarios, the ColBERT model has been recently proposed, which adopts a late interaction mechanism. This mechanism allows for the representation of documents to be precomputed in advance. However, the late-interaction mechanism leads to large index size, as one needs to save a representation for each token of every document. In this work, we focus on token pruning techniques in order to mitigate this problem. 
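One simple member of this family of pruning strategies can be sketched as follows; attention-based scoring with a fixed keep ratio is an assumption on our part, since the paper compares several scoring methods.

```python
# Sketch: keep only the document tokens that receive the most [CLS] attention,
# shrinking the per-document late-interaction index.
import torch

def prune_doc_tokens(token_embs, cls_attention, keep=0.3):
    """token_embs: (T, d) per-token embeddings; cls_attention: (T,) scores."""
    k = max(1, int(keep * token_embs.size(0)))
    idx = cls_attention.topk(k).indices.sort().values   # preserve token order
    return token_embs[idx]                              # (k, d) pruned index entry

pruned = prune_doc_tokens(torch.randn(100, 128), torch.rand(100))
```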
We test four methods, ranging from simpler ones to the use of a single layer of attention mechanism to select the tokens to keep at indexing time. Our experiments show that for the MS MARCO-passages collection, indexes can be pruned up to 70% of their original size, without a significant drop in performance. We also evaluate on the MS MARCO-documents collection and the BEIR benchmark, which reveals some challenges for the proposed mechanism.|以 BERT 为基础的排名已经被证明在信息检索任务中非常有效。为了将这些模型扩展到完全排序的场景,最近提出了 ColBERT 模型,该模型采用了一种后交互机制。这种机制允许预先计算文档的表示形式。但是,后期交互机制会导致索引大小增加,因为需要为每个文档的每个标记保存表示形式。在这项工作中,我们重点关注令牌剪枝技术,以减轻这个问题。我们测试了四种方法,从简单的方法到使用单层注意机制在索引时选择要保留的令牌。我们的实验表明,对于 MS MARCO 段收集,索引可以修剪高达原始大小的70% ,性能没有明显下降。我们还对 MS MARCO 文档集和 BEIR 基准进行了评估,揭示了该机制面临的一些挑战。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learned+Token+Pruning+in+Contextualized+Late+Interaction+over+BERT+(ColBERT))|0| |[GraFN: Semi-Supervised Node Classification on Graph with Few Labels via Non-Parametric Distribution Assignment](https://doi.org/10.1145/3477495.3531838)|Junseok Lee, Yunhak Oh, Yeonjun In, Namkyeong Lee, Dongmin Hyun, Chanyoung Park|KAIST, Daejeon, Republic of Korea; POSTECH, Pohang, Republic of Korea|Despite the success of Graph Neural Networks (GNNs) on various applications, GNNs encounter significant performance degradation when the amount of supervision signals, i.e., number of labeled nodes, is limited, which is expected as GNNs are trained solely based on the supervision obtained from the labeled nodes. On the other hand, recent self-supervised learning paradigm aims to train GNNs by solving pretext tasks that do not require any labeled nodes, and it has shown to even outperform GNNs trained with few labeled nodes. However, a major drawback of self-supervised methods is that they fall short of learning class discriminative node representations since no labeled information is utilized during training. To this end, we propose a novel semi-supervised method for graphs, GraFN, that leverages few labeled nodes to ensure nodes that belong to the same class to be grouped together, thereby achieving the best of both worlds of semi-supervised and self-supervised methods. Specifically, GraFN randomly samples support nodes from labeled nodes and anchor nodes from the entire graph. Then, it minimizes the difference between two predicted class distributions that are non-parametrically assigned by anchor-supports similarity from two differently augmented graphs. 
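The non-parametric assignment above reduces to a few lines; the temperature and the KL-based consistency term here are illustrative assumptions.

```python
# Sketch: soft-assign anchors to classes via similarity to labeled supports,
# then make the assignments from two augmented graph views agree.
import torch
import torch.nn.functional as F

def nonparametric_assign(anchors, supports, support_labels, n_classes, tau=0.1):
    """anchors: (A, d); supports: (S, d) labeled nodes; support_labels: (S,)."""
    sim = F.normalize(anchors, dim=-1) @ F.normalize(supports, dim=-1).T / tau
    w = F.softmax(sim, dim=-1)                               # (A, S) anchor-support weights
    return w @ F.one_hot(support_labels, n_classes).float()  # (A, C) class distribution

z1, z2 = torch.randn(16, 64), torch.randn(16, 64)     # anchor embeddings, two views
s, y = torch.randn(8, 64), torch.randint(0, 3, (8,))  # few labeled support nodes
p1, p2 = nonparametric_assign(z1, s, y, 3), nonparametric_assign(z2, s, y, 3)
consistency = F.kl_div(p1.clamp_min(1e-8).log(), p2, reduction="batchmean")
```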
We experimentally show that GraFN surpasses both the semi-supervised and self-supervised methods in terms of node classification on real-world graphs.|尽管图形神经网络(GNN)在各种应用中取得了成功,但是当监督信号(即标记节点的数量)受到限制时,GNN 会遇到显著的性能下降,这是预期的,因为 GNN 仅仅基于从标记节点获得的监督进行训练。另一方面,最近的自我监督学习范式旨在通过解决不需要任何标记节点的托辞任务来训练 GNN,并且已经证明它的表现甚至优于用少量标记节点训练的 GNN。然而,自监督方法的一个主要缺点是,由于在训练过程中没有使用标记信息,因此它们不能很好地表示学习类的判别节点。为此,我们提出了一种新的图的半监督方法,GraFN,它利用少量的标记节点来确保属于同一类的节点被分组在一起,从而实现了半监督和自监督方法的最佳结合。具体来说,GraFN 随机采样支持来自标记节点的节点和来自整个图的锚节点。然后,从两个不同的增广图中最小化由锚支持相似性非参数赋值的两个预测类分布之间的差异。实验结果表明,GraFN 在实际图的节点分类方面优于半监督和自监督方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=GraFN:+Semi-Supervised+Node+Classification+on+Graph+with+Few+Labels+via+Non-Parametric+Distribution+Assignment)|0| -|[Which Discriminator for Cooperative Text Generation?](https://doi.org/10.1145/3477495.3531858)|Antoine Chaffin, Thomas Scialom, Sylvain Lamprier, Jacopo Staiano, Benjamin Piwowarski, Ewa Kijak, Vincent Claveau|Université Rennes, IRISA, Rennes, France; IRISA, IMATAG, Rennes, France; reciTAL, Paris, France; ISIR - Sorbonne Université, Paris, France; ISIR - Sorbonne Université, reciTAL, Paris, France; CNRS, IRISA, Rennes, France; CNRS, ISIR - Sorbonne Université, Paris, France|Language models generate texts by successively predicting probability distributions for next tokens given past ones. A growing field of interest tries to leverage external information in the decoding process so that the generated texts have desired properties, such as being more natural, non toxic, faithful, or having a specific writing style. A solution is to use a classifier at each generation step, resulting in a cooperative environment where the classifier guides the decoding of the language model distribution towards relevant texts for the task at hand. In this paper, we examine three families of (transformer-based) discriminators for this specific task of cooperative decoding: bidirectional, left-to-right and generative ones. We evaluate the pros and cons of these different types of discriminators for cooperative generation, exploring respective accuracy on classification tasks along with their impact on the resulting sample quality and computational performances. We also provide the code of a batched implementation of the powerful cooperative decoding strategy used for our experiments, the Monte Carlo Tree Search, working with each discriminator for Natural Language Generation.|语言模型通过依次预测给定过去标记的下一个标记的概率分布来生成文本。越来越多的研究领域试图在解码过程中利用外部信息,使生成的文本具有期望的特性,如更自然、无毒、忠实或具有特定的写作风格。一个解决方案是在每个生成步骤中使用一个分类器,从而形成一个合作环境,在这个环境中,分类器将语言模型分布的解码引导到手头任务的相关文本中。在这篇论文中,我们针对这个特定的合作解码任务,研究了三类(基于变压器的)鉴别器: 双向的,从左到右的和生成的。我们评估了这些不同类型的鉴别器在合作生成中的优缺点,探讨了它们在分类任务中各自的准确性及其对所得样本质量和计算性能的影响。我们还提供了一个批处理实现的强大的合作解码策略,用于我们的实验,蒙特卡罗树搜索,与自然语言生成的每个鉴别器工作的代码。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Which+Discriminator+for+Cooperative+Text+Generation?)|0| +|[Which Discriminator for Cooperative Text Generation?](https://doi.org/10.1145/3477495.3531858)|Antoine Chaffin, Thomas Scialom, Sylvain Lamprier, Jacopo Staiano, Benjamin Piwowarski, Ewa Kijak, Vincent Claveau|Université Rennes, IRISA, Rennes, France; CNRS, IRISA, Rennes, France; reciTAL, Paris, France; CNRS, ISIR - Sorbonne Université, Paris, France; IRISA, IMATAG, Rennes, France; ISIR - Sorbonne Université, Paris, France; ISIR - Sorbonne Université, reciTAL, Paris, France|Language models generate texts by successively predicting probability distributions for next tokens given past ones. 
A growing field of interest tries to leverage external information in the decoding process so that the generated texts have desired properties, such as being more natural, non toxic, faithful, or having a specific writing style. A solution is to use a classifier at each generation step, resulting in a cooperative environment where the classifier guides the decoding of the language model distribution towards relevant texts for the task at hand. In this paper, we examine three families of (transformer-based) discriminators for this specific task of cooperative decoding: bidirectional, left-to-right and generative ones. We evaluate the pros and cons of these different types of discriminators for cooperative generation, exploring respective accuracy on classification tasks along with their impact on the resulting sample quality and computational performances. We also provide the code of a batched implementation of the powerful cooperative decoding strategy used for our experiments, the Monte Carlo Tree Search, working with each discriminator for Natural Language Generation.|语言模型通过依次预测给定过去标记的下一个标记的概率分布来生成文本。越来越多的研究领域试图在解码过程中利用外部信息,使生成的文本具有期望的特性,如更自然、无毒、忠实或具有特定的写作风格。一个解决方案是在每个生成步骤中使用一个分类器,从而形成一个合作环境,在这个环境中,分类器将语言模型分布的解码引导到手头任务的相关文本中。在这篇论文中,我们针对这个特定的合作解码任务,研究了三类(基于变压器的)鉴别器: 双向的,从左到右的和生成的。我们评估了这些不同类型的鉴别器在合作生成中的优缺点,探讨了它们在分类任务中各自的准确性及其对所得样本质量和计算性能的影响。我们还提供了一个批处理实现的强大的合作解码策略,用于我们的实验,蒙特卡罗树搜索,与自然语言生成的每个鉴别器工作的代码。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Which+Discriminator+for+Cooperative+Text+Generation?)|0| |[Topological Analysis of Contradictions in Text](https://doi.org/10.1145/3477495.3531881)|Xiangcheng Wu, Xi Niu, Ruhani Rahman|University of North Carolina at Charlotte, Charlotte, NC, USA|Automatically finding contradictions from text is a fundamental yet under-studied problem in natural language understanding and information retrieval. Recently, topology, a branch of mathematics concerned with the properties of geometric shapes, has been shown useful to understand semantics of text. This study presents a topological approach to enhancing deep learning models in detecting contradictions in text. In addition, in order to better understand contradictions, we propose a classification with six types of contradictions. Following that, the topologically enhanced models are evaluated with different contradictions types, as well as different text genres. Overall we have demonstrated the usefulness of topological features in finding contradictions, especially the more latent and more complex contradictions in text.|自动从文本中找出矛盾是自然语言理解和信息检索中一个基本但尚未得到充分研究的问题。近年来,拓扑学作为一门研究 Unicode几何图形列表性质的数学分支,在理解文本语义方面发挥了重要作用。本研究提出了一种拓扑方法来增强深度学习模型在文本矛盾检测中的应用。此外,为了更好地理解矛盾,我们提出了六种类型的矛盾分类。然后,利用不同的矛盾类型和不同的文本类型对拓扑增强模型进行评估。总的来说,我们已经证明了拓扑特征在发现矛盾,特别是在文本中更多的潜在和更复杂的矛盾方面的有用性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Topological+Analysis+of+Contradictions+in+Text)|0| -|[Dual Pseudo Supervision for Semi-Supervised Text Classification with a Reliable Teacher](https://doi.org/10.1145/3477495.3531887)|Shujie Li, Min Yang, Chengming Li, Ruifeng Xu|Harbin Institute of Technology & Peng Cheng Lab, Shenzhen, China; Sun Yat-sen University, Shenzhen, China; University of Science and Technology of China, Hefei, China; Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China|In this paper, we study the semi-supervised text classification (SSTC) by exploring both labeled and extra unlabeled data. 
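Returning to the cooperative text generation entry above: its core step, a discriminator re-weighting the language model's next-token distribution, can be sketched greedily as below. `lm_step` and `discriminator_score` are hypothetical stand-ins, and the paper's batched Monte Carlo Tree Search explores continuations far more thoroughly than this one-step re-ranking.

```python
# Sketch of one cooperative decoding step: LM probability x discriminator score.
import torch

def cooperative_step(lm_step, discriminator_score, prefix, top_n=10):
    """prefix: list of token ids; returns the chosen next token id."""
    probs = lm_step(prefix)                    # (V,) next-token distribution
    cand = probs.topk(top_n).indices           # rescore only the likely tokens
    scores = torch.tensor([discriminator_score(prefix + [t.item()]) for t in cand])
    return cand[(probs[cand] * scores).argmax()].item()

lm = lambda prefix: torch.softmax(torch.randn(50), dim=0)   # toy stand-ins
disc = lambda seq: float(torch.rand(()))
next_token = cooperative_step(lm, disc, [1, 2, 3])
```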
One of the most popular SSTC techniques is pseudo-labeling which assigns pseudo labels for unlabeled data via a teacher classifier trained on labeled data. These pseudo labeled data is then applied to train a student classifier. However, when the pseudo labels are inaccurate, the student classifier will learn from inaccurate data and get even worse performance than the teacher. To mitigate this issue, we propose a simple yet efficient pseudo-labeling framework called Dual Pseudo Supervision (DPS), which exploits the feedback signal from the student to guide the teacher to generate better pseudo labels. In particular, we alternately update the student based on the pseudo labeled data annotated by the teacher and optimize the teacher based on the student's performance via meta learning. In addition, we also design a consistency regularization term to further improve the stability of the teacher. With the above two strategies, the learned reliable teacher can provide more accurate pseudo-labels to the student and thus improve the overall performance of text classification. We conduct extensive experiments on three benchmark datasets (i.e., AG News, Yelp and Yahoo) to verify the effectiveness of our DPS method. Experimental results show that our approach achieves substantially better performance than the strong competitors. For reproducibility, we will release our code and data of this paper publicly at https://github.com/GRIT621/DPS.|本文通过对已标记和未标记数据的分析,研究了半监督文本分类算法(SSTC)。最流行的 SSTC 技术之一是伪标记技术,它通过对标记数据进行训练的教师分类器为未标记的数据分配伪标记。然后应用这些伪标记数据训练学生分类器。然而,当伪标签不准确时,学生分类器会从不准确的数据中学习,得到比教师更差的性能。为了解决这个问题,我们提出了一个简单而有效的伪标签框架,称为双伪监督(DPS) ,它利用学生的反馈信号来指导教师生成更好的伪标签。特别是,我们交替更新学生的基础上伪标记数据的教师注释和优化教师的基础上学生的表现通过元学习。此外,我们还设计了一个一致性正则项,以进一步提高教师的稳定性。通过以上两种策略,学习可靠的教师可以为学生提供更准确的伪标签,从而提高文本分类的整体性能。我们对三个基准数据集(即 AG News、 Yelp 和 Yahoo)进行了广泛的实验,以验证我们的 DPS 方法的有效性。实验结果表明,我们的方法实现了大大优于强竞争对手的性能。为确保重复性,我们会在 https://github.com/grit621/dps 公开发布本文件的代码和数据。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Dual+Pseudo+Supervision+for+Semi-Supervised+Text+Classification+with+a+Reliable+Teacher)|0| +|[Dual Pseudo Supervision for Semi-Supervised Text Classification with a Reliable Teacher](https://doi.org/10.1145/3477495.3531887)|Shujie Li, Min Yang, Chengming Li, Ruifeng Xu|Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China; University of Science and Technology of China, Hefei, China; Sun Yat-sen University, Shenzhen, China; Harbin Institute of Technology & Peng Cheng Lab, Shenzhen, China|In this paper, we study the semi-supervised text classification (SSTC) by exploring both labeled and extra unlabeled data. One of the most popular SSTC techniques is pseudo-labeling which assigns pseudo labels for unlabeled data via a teacher classifier trained on labeled data. These pseudo labeled data is then applied to train a student classifier. However, when the pseudo labels are inaccurate, the student classifier will learn from inaccurate data and get even worse performance than the teacher. To mitigate this issue, we propose a simple yet efficient pseudo-labeling framework called Dual Pseudo Supervision (DPS), which exploits the feedback signal from the student to guide the teacher to generate better pseudo labels. In particular, we alternately update the student based on the pseudo labeled data annotated by the teacher and optimize the teacher based on the student's performance via meta learning. In addition, we also design a consistency regularization term to further improve the stability of the teacher. 
With the above two strategies, the learned reliable teacher can provide more accurate pseudo-labels to the student and thus improve the overall performance of text classification. We conduct extensive experiments on three benchmark datasets (i.e., AG News, Yelp and Yahoo) to verify the effectiveness of our DPS method. Experimental results show that our approach achieves substantially better performance than the strong competitors. For reproducibility, we will release our code and data of this paper publicly at https://github.com/GRIT621/DPS.|本文通过对已标记和未标记数据的分析,研究了半监督文本分类(SSTC)。最流行的 SSTC 技术之一是伪标记技术,它通过对标记数据进行训练的教师分类器为未标记的数据分配伪标记。然后应用这些伪标记数据训练学生分类器。然而,当伪标签不准确时,学生分类器会从不准确的数据中学习,得到比教师更差的性能。为了解决这个问题,我们提出了一个简单而有效的伪标签框架,称为双伪监督(DPS),它利用学生的反馈信号来指导教师生成更好的伪标签。特别是,我们基于教师标注的伪标签数据交替更新学生,并通过元学习依据学生的表现优化教师。此外,我们还设计了一个一致性正则项,以进一步提高教师的稳定性。通过以上两种策略,学得的可靠教师可以为学生提供更准确的伪标签,从而提高文本分类的整体性能。我们对三个基准数据集(即 AG News、 Yelp 和 Yahoo)进行了广泛的实验,以验证我们的 DPS 方法的有效性。实验结果表明,我们的方法实现了大大优于强竞争对手的性能。为确保可复现性,我们会在 https://github.com/GRIT621/DPS 公开发布本文的代码和数据。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Dual+Pseudo+Supervision+for+Semi-Supervised+Text+Classification+with+a+Reliable+Teacher)|0| |[An Efficient Fusion Mechanism for Multimodal Low-resource Setting](https://doi.org/10.1145/3477495.3531900)|Dushyant Singh Chauhan, Asif Ekbal, Pushpak Bhattacharyya|Indian Institute of Technology Patna, Patna, India|The effective fusion of multiple modalities (i.e., text, acoustic, and visual) is a non-trivial task, as these modalities often carry specific and diverse information and do not contribute equally. The fusion of different modalities could even be more challenging under the low-resource setting, where we have fewer samples for training. This paper proposes a multi-representative fusion mechanism that generates diverse fusions with multiple modalities and then chooses the best fusion among them. To achieve this, we first apply convolution filters on multimodal inputs to generate different and diverse representations of modalities. We then fuse pairwise modalities with multiple representations to get the multiple fusions. Finally, we propose an attention mechanism that only selects the most appropriate fusion, which eventually helps resolve the noise problem by ignoring the noisy fusions. We evaluate our proposed approach on three low-resource multimodal sentiment analysis datasets, i.e., YouTube, MOUD, and ICT-MMMO.
Experimental results show the effectiveness of our proposed approach with the accuracies of 59.3%, 83.0%, and 84.1% for the YouTube, MOUD, and ICT-MMMO datasets, respectively.|多种模态(即文本、声学和视觉)的有效融合并非易事,因为这些模态往往携带特定的、多样化的信息,并且贡献并不均等。在资源匮乏的情况下,不同模态的融合可能更具挑战性,因为我们的训练样本较少。本文提出了一种多代表性融合机制,该机制利用多种模态生成多样的融合,然后从中选择最佳融合。为了实现这一点,我们首先在多模态输入上应用卷积滤波器,以产生不同且多样的模态表示。然后,我们将成对模态与多重表示进行融合,得到多个融合结果。最后,我们提出了一种注意机制,只选择最适当的融合,这最终有助于通过忽略噪声融合来解决噪声问题。我们在三个低资源多模态情绪分析数据集(即 YouTube、 MOUD 和 ICT-MMMO)上评估了我们提出的方法。实验结果表明,该方法对 YouTube、 MOUD 和 ICT-MMMO 数据集的准确率分别为 59.3%、83.0% 和 84.1%。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=An+Efficient+Fusion+Mechanism+for+Multimodal+Low-resource+Setting)|0| -|[PST: Measuring Skill Proficiency in Programming Exercise Process via Programming Skill Tracing](https://doi.org/10.1145/3477495.3531903)|Ruixin Li, Yu Yin, Le Dai, Shuanghong Shen, Xin Lin, Yu Su, Enhong Chen|Institute of Advanced Technology, University of Science and Technology of China & State Key Laboratory of Cognitive Intelligence, Hefei, Anhui, China; School of Computer Science and Technology, University of Science and Technology of China & State Key Laboratory of Cognitive Intelligence, Hefei, Anhui, China; Hefei Normal University & Hefei Comprehensive National Science Center, Hefei, Anhui, China; School of Data Science, University of Science and Technology of China & State Key Laboratory of Cognitive Intelligence, Hefei, Anhui, China|Programming has become an important skill for individuals nowadays. For the demand to improve personal programming skill, tracking programming skill proficiency is getting more and more important. However, few researchers pay attention to measuring the programming skill of learners. Most of existing studies on learner capability portrait only made use of the exercise results, while the rich behavioral information contained in programming exercise process remains unused. Therefore, we propose a model that measures skill proficiency in programming exercise process named Programming Skill Tracing (PST). We designed Code Information Graph (CIG) to represent the feature of learners' solution code, and Code Tracing Graph (CTG) to measure the changes between the adjacent submissions. Furthermore, we divided programming skill into programming knowledge and coding ability to get more fine-grained assessment. Finally, we conducted various experiments to verify the effectiveness and interpretability of our PST model.|编程已经成为当今个人的一项重要技能。对于提高个人编程技能的需求,跟踪编程技能熟练程度变得越来越重要。然而,很少有研究者注意测量学习者的编程技能。现有的关于学习者能力描述的研究大多只是利用了习题结果,而编程习题过程中所包含的丰富的行为信息却没有得到充分的利用。因此,我们提出了一个测量编程技能熟练程度的模型,命名为编程技能跟踪(PST)。我们设计了代码信息图(CIG)来表示学习者解决方案代码的特征,代码跟踪图(CTG)来度量相邻提交的代码之间的变化。此外,将编程技能分为编程知识和编码能力两部分,以获得更细粒度的评价。最后,我们进行了各种实验来验证我们的 PST 模型的有效性和可解释性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=PST:+Measuring+Skill+Proficiency+in+Programming+Exercise+Process+via+Programming+Skill+Tracing)|0| -|[MuchSUM: Multi-channel Graph Neural Network for Extractive Summarization](https://doi.org/10.1145/3477495.3531906)|Qianren Mao, Hongdong Zhu, Junnan Liu, Cheng Ji, Hao Peng, Jianxin Li, Lihong Wang, Zheng Wang|Beihang University, Beijing, China; University of Leeds, Leeds, West Yorkshire, United Kingdom; CNCERT, Beijing, China|Recent studies of extractive text summarization have leveraged BERT for document encoding with breakthrough performance.
However, when using a pre-trained BERT-based encoder, existing approaches for selecting representative sentences for text summarization are inadequate since the encoder is not explicitly trained for representing sentences. Simply providing the BERT-initialized sentences to cross-sentential graph-based neural networks (GNNs) to encode semantic features of the sentences is not ideal because doing so fail to integrate other summary-worthy features like sentence importance and positions. This paper presents MuchSUM, a better approach for extractive text summarization. MuchSUM is a multi-channel graph convolutional network designed to explicitly incorporate multiple salient summary-worthy features. Specifically, we introduce three specific graph channels to encode the node textual features, node centrality features, and node position features, respectively, under bipartite word-sentence heterogeneous graphs. Then, a cross-channel convolution operation is designed to distill the common graph representations shared by different channels. Finally, the sentence representations of each channel are fused for extractive summarization. We also investigate three weighted graphs in each channel to infuse edge features for graph-based summarization modeling. Experimental results demonstrate our model can achieve considerable performance compared with some BERT-initialized graph-based extractive summarization systems.|提取文本摘要的最新研究已经利用 BERT 技术实现了突破性的文档编码。然而,当使用预先训练的 BERT 编码器时,现有的选择文本摘要代表性句子的方法是不够的,因为编码器没有明确地训练代表性句子。简单地将 BERT 初始化的句子提供给基于跨句图的神经网络(GNN)来编码句子的语义特征是不理想的,因为这样做不能整合其他有总结价值的特征,如句子的重要性和位置。提出了一种更好的文本摘要提取方法 MuchSUM。MuchSUM 是一个多通道图卷积网络,旨在明确合并多个突出的值得总结的功能。具体来说,我们引入了三个特定的图通道,分别对二分词-句子异质图下的结点文本特征、结点中心特征和结点位置特征进行编码。然后,设计一个跨信道卷积运算来提取不同信道共享的公共图表示。最后,对每个通道的句子表示进行融合,进行提取摘要。我们还研究了每个通道中的三个加权图,为基于图的摘要建模注入边缘特征。实验结果表明,与一些 BERT 初始化的基于图的抽取摘要系统相比,该模型具有较好的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=MuchSUM:+Multi-channel+Graph+Neural+Network+for+Extractive+Summarization)|0| -|[Multi-label Masked Language Modeling on Zero-shot Code-switched Sentiment Analysis](https://doi.org/10.1145/3477495.3531914)|Zhi Li, Xing Gao, Ji Zhang, Yin Zhang|Alibaba Group, Hangzhou, China; Zhejiang University, Hangzhou, China|In multilingual communities, code-switching is a common phenomenon and code-switched tasks have become a crucial area of research in natural language processing (NLP) applications. Existing approaches mainly focus on supervised learning. However, it is expensive to annotate a sufficient amount of code-switched data. In this paper, we consider zero-shot setting and improve model performance on code-switched tasks via monolingual language datasets, unlabeled code-switched datasets, and semantic dictionaries. Inspired by the mechanism of code-switching itself, we propose multi-label masked language modeling and predict both the masked word and its synonyms in other languages. 
Experimental results show that compared with baselines, our method can further improve the pretrained multilingual model's performance on code-switched sentiment analysis datasets.|在多语言社区中,语码转换是一种常见的现象,语码转换任务已经成为自然语言处理(NLP)应用研究的一个重要领域。现有方法主要侧重于监督式学习。然而,对足够数量的代码切换数据进行注释是昂贵的。本文通过单语种语言数据集、未标记的语码转换数据集和语义词典,考虑了零拍设置,提高了语码转换任务的模型性能。受语码转换本身机制的启发,我们提出了多标签隐藏语言模型,并对其他语言中的隐藏词及其同义词进行了预测。实验结果表明,与基线方法相比,该方法可以进一步提高预训练多语言模型在编码切换情感分析数据集上的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Multi-label+Masked+Language+Modeling+on+Zero-shot+Code-switched+Sentiment+Analysis)|0| +|[PST: Measuring Skill Proficiency in Programming Exercise Process via Programming Skill Tracing](https://doi.org/10.1145/3477495.3531903)|Ruixin Li, Yu Yin, Le Dai, Shuanghong Shen, Xin Lin, Yu Su, Enhong Chen|Hefei Normal University & Hefei Comprehensive National Science Center, Hefei, Anhui, China; School of Data Science, University of Science and Technology of China & State Key Laboratory of Cognitive Intelligence, Hefei, Anhui, China; Institute of Advanced Technology, University of Science and Technology of China & State Key Laboratory of Cognitive Intelligence, Hefei, Anhui, China; School of Computer Science and Technology, University of Science and Technology of China & State Key Laboratory of Cognitive Intelligence, Hefei, Anhui, China|Programming has become an important skill for individuals nowadays. For the demand to improve personal programming skill, tracking programming skill proficiency is getting more and more important. However, few researchers pay attention to measuring the programming skill of learners. Most of existing studies on learner capability portrait only made use of the exercise results, while the rich behavioral information contained in programming exercise process remains unused. Therefore, we propose a model that measures skill proficiency in programming exercise process named Programming Skill Tracing (PST). We designed Code Information Graph (CIG) to represent the feature of learners' solution code, and Code Tracing Graph (CTG) to measure the changes between the adjacent submissions. Furthermore, we divided programming skill into programming knowledge and coding ability to get more fine-grained assessment. Finally, we conducted various experiments to verify the effectiveness and interpretability of our PST model.|编程已经成为当今个人的一项重要技能。对于提高个人编程技能的需求,跟踪编程技能熟练程度变得越来越重要。然而,很少有研究者注意测量学习者的编程技能。现有的关于学习者能力描述的研究大多只是利用了习题结果,而编程习题过程中所包含的丰富的行为信息却没有得到充分的利用。因此,我们提出了一个测量编程技能熟练程度的模型,命名为编程技能跟踪(PST)。我们设计了代码信息图(CIG)来表示学习者解决方案代码的特征,代码跟踪图(CTG)来度量相邻提交的代码之间的变化。此外,将编程技能分为编程知识和编码能力两部分,以获得更细粒度的评价。最后,我们进行了各种实验来验证我们的 PST 模型的有效性和可解释性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=PST:+Measuring+Skill+Proficiency+in+Programming+Exercise+Process+via+Programming+Skill+Tracing)|0| +|[MuchSUM: Multi-channel Graph Neural Network for Extractive Summarization](https://doi.org/10.1145/3477495.3531906)|Qianren Mao, Hongdong Zhu, Junnan Liu, Cheng Ji, Hao Peng, Jianxin Li, Lihong Wang, Zheng Wang|Beihang University, Beijing, China; CNCERT, Beijing, China; University of Leeds, Leeds, West Yorkshire, United Kingdom|Recent studies of extractive text summarization have leveraged BERT for document encoding with breakthrough performance. However, when using a pre-trained BERT-based encoder, existing approaches for selecting representative sentences for text summarization are inadequate since the encoder is not explicitly trained for representing sentences. 
Simply providing the BERT-initialized sentences to cross-sentential graph-based neural networks (GNNs) to encode semantic features of the sentences is not ideal because doing so fail to integrate other summary-worthy features like sentence importance and positions. This paper presents MuchSUM, a better approach for extractive text summarization. MuchSUM is a multi-channel graph convolutional network designed to explicitly incorporate multiple salient summary-worthy features. Specifically, we introduce three specific graph channels to encode the node textual features, node centrality features, and node position features, respectively, under bipartite word-sentence heterogeneous graphs. Then, a cross-channel convolution operation is designed to distill the common graph representations shared by different channels. Finally, the sentence representations of each channel are fused for extractive summarization. We also investigate three weighted graphs in each channel to infuse edge features for graph-based summarization modeling. Experimental results demonstrate our model can achieve considerable performance compared with some BERT-initialized graph-based extractive summarization systems.|提取文本摘要的最新研究已经利用 BERT 技术实现了突破性的文档编码。然而,当使用预先训练的 BERT 编码器时,现有的选择文本摘要代表性句子的方法是不够的,因为编码器没有明确地针对句子表示进行训练。简单地将 BERT 初始化的句子提供给基于跨句图的神经网络(GNN)来编码句子的语义特征是不理想的,因为这样做不能整合其他有摘要价值的特征,如句子的重要性和位置。本文提出了一种更好的抽取式文本摘要方法 MuchSUM。MuchSUM 是一个多通道图卷积网络,旨在显式融合多种显著的摘要相关特征。具体来说,我们引入了三个特定的图通道,分别对二分词-句子异质图下的结点文本特征、结点中心性特征和结点位置特征进行编码。然后,设计一个跨通道卷积运算来提取不同通道共享的公共图表示。最后,对每个通道的句子表示进行融合,进行抽取式摘要。我们还研究了每个通道中的三个加权图,为基于图的摘要建模注入边特征。实验结果表明,与一些 BERT 初始化的基于图的抽取式摘要系统相比,该模型具有相当的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=MuchSUM:+Multi-channel+Graph+Neural+Network+for+Extractive+Summarization)|0| +|[Multi-label Masked Language Modeling on Zero-shot Code-switched Sentiment Analysis](https://doi.org/10.1145/3477495.3531914)|Zhi Li, Xing Gao, Ji Zhang, Yin Zhang|Zhejiang University, Hangzhou, China; Alibaba Group, Hangzhou, China|In multilingual communities, code-switching is a common phenomenon and code-switched tasks have become a crucial area of research in natural language processing (NLP) applications. Existing approaches mainly focus on supervised learning. However, it is expensive to annotate a sufficient amount of code-switched data. In this paper, we consider zero-shot setting and improve model performance on code-switched tasks via monolingual language datasets, unlabeled code-switched datasets, and semantic dictionaries. Inspired by the mechanism of code-switching itself, we propose multi-label masked language modeling and predict both the masked word and its synonyms in other languages.
Experimental results show that compared with baselines, our method can further improve the pretrained multilingual model's performance on code-switched sentiment analysis datasets.|在多语言社区中,语码转换是一种常见的现象,语码转换任务已经成为自然语言处理(NLP)应用研究的一个重要领域。现有方法主要侧重于监督式学习。然而,对足够数量的语码转换数据进行标注是昂贵的。本文通过单语种语言数据集、未标记的语码转换数据集和语义词典,考虑了零样本设置,提高了语码转换任务的模型性能。受语码转换本身机制的启发,我们提出了多标签掩码语言建模,同时预测被掩码的词及其在其他语言中的同义词。实验结果表明,与基线方法相比,该方法可以进一步提高预训练多语言模型在语码转换情感分析数据集上的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Multi-label+Masked+Language+Modeling+on+Zero-shot+Code-switched+Sentiment+Analysis)|0| |[Extractive Elementary Discourse Units for Improving Abstractive Summarization](https://doi.org/10.1145/3477495.3531916)|Ye Xiong, Teeradaj Racharak, Minh Le Nguyen|Japan Advanced Institute of Science and Technology, Nomi, Ishikawa, Japan|Abstractive summarization focuses on generating concise and fluent text from an original document while maintaining the original intent and containing the new words that do not appear in the original document. Recent studies point out that rewriting extractive summaries help improve the performance with a more concise and comprehensible output summary, which uses a sentence as a textual unit. However, a single document sentence normally cannot supply sufficient information. In this paper, we apply elementary discourse unit (EDU) as textual unit of content selection. In order to utilize EDU for generating a high quality summary, we propose a novel summarization model that first designs an EDU selector to choose salient content. Then, the generator model rewrites the selected EDUs as the final summary. To determine the relevancy of each EDU on the entire document, we choose to apply group tag embedding, which can establish the connection between summary sentences and relevant EDUs, so that our generator does not only focus on selected EDUs, but also ingest the entire original document. Extensive experiments on the CNN/Daily Mail dataset have demonstrated the effectiveness of our model.|抽象摘要的重点是从原始文档中生成简洁流畅的文本,同时保持原始意图并包含原始文档中没有出现的新词。最近的研究指出,重写抽取式摘要有助于提升性能,并得到更简明易懂的输出摘要,其以句子作为文本单位。然而,一个单独的文档句子通常不能提供足够的信息。本文采用基本语篇单位(EDU)作为内容选择的语篇单位。为了利用 EDU 生成高质量的摘要,我们提出了一种新的摘要模型,首先设计一个 EDU 选择器来选择显著的内容。然后,生成器模型重写所选的 EDU 作为最终摘要。为了确定每个 EDU 对整个文档的相关性,我们选择使用组标签嵌入,它可以建立摘要句子和相关 EDU 之间的连接,因此我们的生成器不仅关注选定的 EDU,而且摄取整个原始文档。在 CNN/Daily Mail 数据集上的大量实验已经证明了我们模型的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Extractive+Elementary+Discourse+Units+for+Improving+Abstractive+Summarization)|0| |[LightSGCN: Powering Signed Graph Convolution Network for Link Sign Prediction with Simplified Architecture Design](https://doi.org/10.1145/3477495.3531917)|Haoxin Liu|Tsinghua University, Beijing, China|With both positive and negative links, signed graphs exist widely in the real world. Recently, signed graph neural networks (GNNs) have shown superior performance in the most common signed graph analysis task, i.e., link sign prediction. Existing signed GNNs follow the classic nonlinear-propagation paradigm in unsigned GNNs. However, several recent studies on unsigned GNNs have shown that such a paradigm increases training difficulty and even reduces performance in various unsigned graph analysis tasks. Meanwhile, most of the public real-world signed graph datasets do not provide node features. These motivate us to consider whether the existing complex model architecture is suitable. In this work, we aim to simplify the architecture of signed GNNs to make it more concise and appropriate for link sign prediction.
We propose a simplified signed graph convolution network model called LightSGCN. Specifically, LightSGCN utilizes linear propagation based on the balance theory, a widely adopted social theory. Then, the linear combination of hidden representations at each layer is used as the final representations. Moreover, we also propose a tailored prediction function. These finally yield a simple yet effective LightSGCN model, which is more interpretable, easier to implement, and more efficient to train. Experimental results on four real-world signed graphs demonstrate that such a linear method outperforms the state-of-the-art signed GNNs methods with significant improvement in the link sign prediction task and achieves more than 100X speedup over the most similar and simplest baseline.|符号图同时包含正链接和负链接,在现实世界中广泛存在。最近,符号图神经网络(GNN)在最常见的符号图分析任务,即链路符号预测中表现出了优越的性能。现有的符号 GNN 遵循无符号 GNN 中经典的非线性传播范式。然而,最近几项关于无符号 GNN 的研究表明,这种模式增加了训练难度,甚至降低了各种无符号图分析任务的性能。同时,现实世界中的大多数公共符号图数据集都没有提供节点特征。这促使我们考虑现有的复杂模型体系结构是否合适。在这项工作中,我们的目标是简化符号 GNN 的体系结构,使其更简洁,更适合链路符号预测。我们提出了一个简化的符号图卷积网络模型,称为 LightSGCN。具体来说,LightSGCN 利用基于平衡理论的线性传播,这是一种被广泛采用的社会理论。然后,每一层的隐藏表示的线性组合被用作最终的表示。此外,我们还提出了一个量身定制的预测函数。这些最终产生了一个简单而有效的 LightSGCN 模型,该模型更易于解释、更易于实现、更易于训练。在四个真实世界符号图上的实验结果表明,这种线性方法的性能优于最先进的符号 GNN 方法,在链路符号预测任务上有显著改进,并在最相似且最简单的基线上实现了超过 100 倍的加速。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=LightSGCN:+Powering+Signed+Graph+Convolution+Network+for+Link+Sign+Prediction+with+Simplified+Architecture+Design)|0| |[ir_metadata: An Extensible Metadata Schema for IR Experiments](https://doi.org/10.1145/3477495.3531738)|Timo Breuer, Jüri Keller, Philipp Schaer||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ir_metadata:+An+Extensible+Metadata+Schema+for+IR+Experiments)|0| diff --git a/papers/sigir/sigir2023.md b/papers/sigir/sigir2023.md index 998ee04c..91e2370d 100644 --- a/papers/sigir/sigir2023.md +++ b/papers/sigir/sigir2023.md @@ -183,7 +183,7 @@ |[Complex Item Set Recommendation](https://doi.org/10.1145/3539618.3594248)|Mozhdeh Ariannezhad, Ming Li, Sami Jullien, Maarten de Rijke||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Complex+Item+Set+Recommendation)|0| |[Recent Advances in the Foundations and Applications of Unbiased Learning to Rank](https://doi.org/10.1145/3539618.3594247)|Shashank Gupta, Philipp Hager, Jin Huang, Ali Vardasbi, Harrie Oosterhuis||Since its inception, the field of unbiased learning to rank (ULTR) has remained very active and has seen several impactful advancements in recent years. This tutorial provides both an introduction to the core concepts of the field and an overview of recent advancements in its foundations along with several applications of its methods. The tutorial is divided into four parts: Firstly, we give an overview of the different forms of bias that can be addressed with ULTR methods. Secondly, we present a comprehensive discussion of the latest estimation techniques in the ULTR field. Thirdly, we survey published results of ULTR in real-world applications. Fourthly, we discuss the connection between ULTR and fairness in ranking. We end by briefly reflecting on the future of ULTR research and its applications.
This tutorial is intended to benefit both researchers and industry practitioners who are interested in developing new ULTR solutions or utilizing them in real-world applications.|自成立以来,无偏学习排名(ULTR)领域一直非常活跃,近年来取得了一些有影响力的进展。本教程介绍了该领域的核心概念,概述了该领域基础方面的最新进展及其方法的若干应用。本教程分为四个部分: 首先,我们概述了不同形式的偏倚,可以用 ULTR 方法处理。其次,我们对 ULTR 领域中的最新估计技术进行了全面的讨论。第三,我们调查了已发表的 ULTR 在实际应用中的结果。第四,我们讨论了 ULTR 与排名公平性之间的关系。最后,我们简要地回顾了 ULTR 研究及其应用的未来。本教程旨在使那些对开发新的 ULTR 解决方案或在实际应用中使用它们感兴趣的研究人员和行业从业人员受益。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Recent+Advances+in+the+Foundations+and+Applications+of+Unbiased+Learning+to+Rank)|0| |[Large-Scale Data Processing for Information Retrieval Applications](https://doi.org/10.1145/3539618.3591797)|Pooya Khandel||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Large-Scale+Data+Processing+for+Information+Retrieval+Applications)|0| -|[Generative Information Retrieval](https://doi.org/10.1145/3539618.3591871)|Marc Najork|Walmart Labs, Sunnyvale, CA, USA; Instacart, San Francisco, CA, USA|ABSTRACTIn the relatively short history of machine learning, the subtle balance between engineering and theoretical progress has been proved critical at various stages. The most recent wave of AI has brought to the IR community powerful techniques, particularly for pattern recognition. While many benefits from the burst of ideas as numerous tasks become algorithmically feasible, the balance is tilting toward the application side. The existing theoretical tools in IR can no longer explain, guide, and justify the newly-established methodologies. With no choices, we have to bet our design on black-box mechanisms that we only empirically understand. The consequences can be suffering: in stark contrast to how the IR industry has envisioned modern AI making life easier, many are experiencing increased confusion and costs in data manipulation, model selection, monitoring, censoring, and decision making. This reality is not surprising: without handy theoretical tools, we often lack principled knowledge of the pattern recognition model's expressivity, optimization property, generalization guarantee, and our decision-making process has to rely on over-simplified assumptions and human judgments from time to time. Facing all the challenges, we started researching advanced theoretical tools emerging from various domains that can potentially resolve modern IR problems. We encountered many impactful ideas and made several independent publications emphasizing different pieces. Time is now to bring the community a systematic tutorial on how we successfully adapt those tools and make significant progress in understanding, designing, and eventually productionize impactful IR systems. We emphasize systematicity because IR is a comprehensive discipline that touches upon particular aspects of learning, causal inference analysis, interactive (online) decision-making, etc. It thus requires systematic calibrations to render the actual usefulness of the imported theoretical tools to serve IR problems, as they usually exhibit unique structures and definitions. 
Therefore, we plan this tutorial to systematically demonstrate our learning and successful experience of using advanced theoretical tools for understanding and designing IR systems.|在机器学习相对较短的历史中,工程与理论进步之间的微妙平衡在不同的阶段被证明是至关重要的。最近的人工智能浪潮给红外社区带来了强大的技术,特别是模式识别。虽然随着大量任务在算法上变得可行,思想的迸发带来了许多好处,但是平衡正在向应用程序方面倾斜。现有的 IR 理论工具已经不能解释、指导和证明新建立的方法论。由于别无选择,我们不得不把我们的设计押在我们只能凭经验理解的黑盒机制上。其结果可能是痛苦的: 与红外行业设想的现代人工智能如何使生活变得更容易形成鲜明对比的是,许多人在数据操作、模型选择、监控、审查和决策方面正经历着越来越多的混乱和成本。这一现实并不令人惊讶: 没有方便的理论工具,我们往往缺乏模式识别模型的表达能力、优化特性、泛化保证的原则性知识,我们的决策过程不得不依赖于过于简化的假设和人为判断。面对这些挑战,我们开始研究来自不同领域的先进理论工具,这些工具可以解决现代国际关系问题。我们遇到了许多有影响力的想法,并作出了几个独立的出版物,强调不同的作品。现在是时候给社区带来一个系统的教程,告诉他们我们如何成功地调整这些工具,并在理解、设计和最终生产有影响力的 IR 系统方面取得重大进展。我们强调系统性,因为国际关系是一个综合性的学科,涉及到学习的特定方面,因果推理分析,互动(在线)决策,等等。因此,需要进行系统的校准,以提供进口的理论工具的实际用途,以服务红外问题,因为他们通常表现出独特的结构和定义。因此,我们计划本教程系统地展示我们使用先进的理论工具来理解和设计 IR 系统的学习和成功经验。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Generative+Information+Retrieval)|0| +|[Generative Information Retrieval](https://doi.org/10.1145/3539618.3591871)|Marc Najork|Instacart, San Francisco, CA, USA; Walmart Labs, Sunnyvale, CA, USA|In the relatively short history of machine learning, the subtle balance between engineering and theoretical progress has been proved critical at various stages. The most recent wave of AI has brought to the IR community powerful techniques, particularly for pattern recognition. While many benefits from the burst of ideas as numerous tasks become algorithmically feasible, the balance is tilting toward the application side. The existing theoretical tools in IR can no longer explain, guide, and justify the newly-established methodologies. With no choices, we have to bet our design on black-box mechanisms that we only empirically understand. The consequences can be suffering: in stark contrast to how the IR industry has envisioned modern AI making life easier, many are experiencing increased confusion and costs in data manipulation, model selection, monitoring, censoring, and decision making. This reality is not surprising: without handy theoretical tools, we often lack principled knowledge of the pattern recognition model's expressivity, optimization property, generalization guarantee, and our decision-making process has to rely on over-simplified assumptions and human judgments from time to time. Facing all the challenges, we started researching advanced theoretical tools emerging from various domains that can potentially resolve modern IR problems. We encountered many impactful ideas and made several independent publications emphasizing different pieces. Time is now to bring the community a systematic tutorial on how we successfully adapt those tools and make significant progress in understanding, designing, and eventually productionize impactful IR systems. We emphasize systematicity because IR is a comprehensive discipline that touches upon particular aspects of learning, causal inference analysis, interactive (online) decision-making, etc. It thus requires systematic calibrations to render the actual usefulness of the imported theoretical tools to serve IR problems, as they usually exhibit unique structures and definitions.
Therefore, we plan this tutorial to systematically demonstrate our learning and successful experience of using advanced theoretical tools for understanding and designing IR systems.|在机器学习相对较短的历史中,工程与理论进步之间的微妙平衡在不同的阶段被证明是至关重要的。最近的人工智能浪潮给 IR 社区带来了强大的技术,特别是模式识别。虽然随着大量任务在算法上变得可行,思想的迸发带来了许多好处,但是平衡正在向应用程序方面倾斜。现有的 IR 理论工具已经不能解释、指导和证明新建立的方法论。由于别无选择,我们不得不把我们的设计押在我们只能凭经验理解的黑盒机制上。其结果可能是痛苦的: 与 IR 行业设想的现代人工智能如何使生活变得更容易形成鲜明对比的是,许多人在数据操作、模型选择、监控、审查和决策方面正经历着越来越多的混乱和成本。这一现实并不令人惊讶: 没有方便的理论工具,我们往往缺乏模式识别模型的表达能力、优化特性、泛化保证的原则性知识,我们的决策过程不得不依赖于过于简化的假设和人为判断。面对这些挑战,我们开始研究来自不同领域的先进理论工具,这些工具有望解决现代 IR 问题。我们遇到了许多有影响力的想法,并发表了几篇各有侧重的独立论文。现在是时候给社区带来一个系统的教程,告诉他们我们如何成功地调整这些工具,并在理解、设计和最终生产有影响力的 IR 系统方面取得重大进展。我们强调系统性,因为 IR 是一个综合性的学科,涉及到学习的特定方面,因果推理分析,互动(在线)决策,等等。因此,需要进行系统的校准,使引入的理论工具真正可用于 IR 问题,因为这些问题通常表现出独特的结构和定义。因此,我们计划本教程系统地展示我们使用先进的理论工具来理解和设计 IR 系统的学习和成功经验。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Generative+Information+Retrieval)|0| |[Tasks, Copilots, and the Future of Search](https://doi.org/10.1145/3539618.3593069)|Ryen W. White||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Tasks,+Copilots,+and+the+Future+of+Search)|0| |[Learning to Re-rank with Constrained Meta-Optimal Transport](https://doi.org/10.1145/3539618.3591714)|Andrés Hoyos Idrobo||Many re-ranking strategies in search systems rely on stochastic ranking policies, encoded as Doubly-Stochastic (DS) matrices, that satisfy desired ranking constraints in expectation, e.g., Fairness of Exposure (FOE). These strategies are generally two-stage pipelines: i) an offline re-ranking policy construction step and ii) an online sampling of rankings step. Building a re-ranking policy requires repeatedly solving a constrained optimization problem, one for each issued query. Thus, it is necessary to recompute the optimization procedure for any new/unseen query. Regarding sampling, the Birkhoff-von-Neumann decomposition (BvND) is the favored approach to draw rankings from any DS-based policy. However, the BvND is too costly to compute online. Hence, the BvND as a sampling solution is memory-consuming as it can grow as $O(N\,n^2)$ for $N$ queries and $n$ documents. This paper offers a novel, fast, lightweight way to predict fair stochastic re-ranking policies: Constrained Meta-Optimal Transport (CoMOT). This method fits a neural network shared across queries like a learning-to-rank system. We also introduce Gumbel-Matching Sampling (GumMS), an online sampling approach from DS-based policies. Our proposed pipeline, CoMOT + GumMS, only needs to store the parameters of a single model, and it generalizes to unseen queries. We empirically evaluated our pipeline on the TREC 2019 and 2020 datasets under FOE constraints. Our experiments show that CoMOT rapidly predicts fair re-ranking policies on held-out data, with a speed-up proportional to the average number of documents per query. It also displays fairness and ranking performance similar to the original optimization-based policy.
Furthermore, we empirically validate the effectiveness of GumMS to approximate DS-based policies in expectation.|搜索系统中的许多重排序策略都依赖于随机排序策略,这些策略被编码为双随机(DS)矩阵,在期望意义下满足所需的排序约束,例如公平曝光(FOE)。这些策略通常是两阶段的流水线: i) 离线的重排序策略构建步骤和 ii) 在线的排序抽样步骤。构建一个重排序策略需要重复求解一个受约束的最优化问题,每个发出的查询对应一个。因此,有必要对任何新的/未见过的查询重新执行优化过程。关于抽样,Birkhoff-von-Neumann 分解(BvND)是从任何基于 DS 的策略中抽取排序的最常用方法。然而,在线计算 BvND 的成本太高。因此,作为抽样解决方案的 BvND 占用内存,因为对于 $N$ 个查询和 $n$ 篇文档,它可以增长为 $O(N\,n^2)$。本文提出了一种新颖、快速、轻量级的预测公平随机重排策略的方法: 约束元最优运输(CoMOT)。该方法像学习排序系统一样拟合一个跨查询共享的神经网络。我们还介绍了 Gumbel 匹配抽样(GumMS),一种从基于 DS 的策略中在线抽样的方法。我们提出的流水线 CoMOT + GumMS 只需要存储单个模型的参数,并且它可以推广到未见过的查询。我们在 FOE 约束下对 TREC 2019 和 2020 数据集的流水线进行了实证评估。我们的实验表明,CoMOT 能够在留出数据上快速预测公平的重排序策略,其加速比与每个查询的平均文档数成正比。其公平性和排序性能也与原始的基于优化的策略相近。此外,我们还实验验证了 GumMS 在期望意义下逼近基于 DS 的策略的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+to+Re-rank+with+Constrained+Meta-Optimal+Transport)|0| |[Constructing Tree-based Index for Efficient and Effective Dense Retrieval](https://doi.org/10.1145/3539618.3591651)|Haitao Li, Qingyao Ai, Jingtao Zhan, Jiaxin Mao, Yiqun Liu, Zheng Liu, Zhao Cao||Recent studies have shown that Dense Retrieval (DR) techniques can significantly improve the performance of first-stage retrieval in IR systems. Despite its empirical effectiveness, the application of DR is still limited. In contrast to statistic retrieval models that rely on highly efficient inverted index solutions, DR models build dense embeddings that are difficult to be pre-processed with most existing search indexing systems. To avoid the expensive cost of brute-force search, the Approximate Nearest Neighbor (ANN) algorithm and corresponding indexes are widely applied to speed up the inference process of DR models. Unfortunately, while ANN can improve the efficiency of DR models, it usually comes with a significant price on retrieval performance. To solve this issue, we propose JTR, which stands for Joint optimization of TRee-based index and query encoding. Specifically, we design a new unified contrastive learning loss to train tree-based index and query encoder in an end-to-end manner. The tree-based negative sampling strategy is applied to make the tree have the maximum heap property, which supports the effectiveness of beam search well. Moreover, we treat the cluster assignment as an optimization problem to update the tree-based index that allows overlapped clustering. We evaluate JTR on numerous popular retrieval benchmarks. Experimental results show that JTR achieves better retrieval performance while retaining high system efficiency compared with widely-adopted baselines.
It provides a potential solution to balance efficiency and effectiveness in neural retrieval system designs.|近年来的研究表明,密集检索(DR)技术可以显著提高 IR 系统第一阶段检索的性能。尽管 DR 在实证研究中取得了一定的成效,但其应用仍然有限。与依赖于高效率倒排索引解决方案的统计检索模型相比,DR 模型构建了密集的嵌入,这些嵌入很难在大多数现有的搜索索引系统中进行预处理。为了避免昂贵的暴力搜索成本,近似最近邻(ANN)算法和相应的索引被广泛应用于加快 DR 模型的推理过程。遗憾的是,尽管 ANN 可以提高 DR 模型的效率,但它通常会给检索性能带来巨大的代价。为了解决这个问题,我们提出了 JTR,它代表了基于树的索引和查询编码的联合优化。具体来说,我们设计了一种新的统一对比学习损失,用于以端到端的方式训练基于树的索引和查询编码器。采用基于树的负采样策略,使树具有最大堆性质,很好地支持了波束搜索的有效性。此外,我们把簇分配当作一个最佳化问题,以更新允许重叠聚类的基于树的索引。我们在许多流行的检索基准上评估 JTR。实验结果表明,与广泛采用的基线相比,JTR 在保持较高系统效率的同时,获得了较好的检索性能。它为在神经检索系统设计中平衡效率和有效性提供了一个潜在的解决方案。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Constructing+Tree-based+Index+for+Efficient+and+Effective+Dense+Retrieval)|0| @@ -229,7 +229,7 @@ |[From Region to Patch: Attribute-Aware Foreground-Background Contrastive Learning for Fine-Grained Fashion Retrieval](https://doi.org/10.1145/3539618.3591690)|Jianfeng Dong, Xiaoman Peng, Zhe Ma, Daizong Liu, Xiaoye Qu, Xun Yang, Jixiang Zhu, Baolong Liu||Attribute-specific fashion retrieval (ASFR) is a challenging information retrieval task, which has attracted increasing attention in recent years. Different from traditional fashion retrieval which mainly focuses on optimizing holistic similarity, the ASFR task concentrates on attribute-specific similarity, resulting in more fine-grained and interpretable retrieval results. As the attribute-specific similarity typically corresponds to the specific subtle regions of images, we propose a Region-to-Patch Framework (RPF) that consists of a region-aware branch and a patch-aware branch to extract fine-grained attribute-related visual features for precise retrieval in a coarse-to-fine manner. In particular, the region-aware branch is first to be utilized to locate the potential regions related to the semantic of the given attribute. Then, considering that the located region is coarse and still contains the background visual contents, the patch-aware branch is proposed to capture patch-wise attribute-related details from the previous amplified region. Such a hybrid architecture strikes a proper balance between region localization and feature extraction. Besides, different from previous works that solely focus on discriminating the attribute-relevant foreground visual features, we argue that the attribute-irrelevant background features are also crucial for distinguishing the detailed visual contexts in a contrastive manner. Therefore, a novel E-InfoNCE loss based on the foreground and background representations is further proposed to improve the discrimination of attribute-specific representation. Extensive experiments on three datasets demonstrate the effectiveness of our proposed framework, and also show a decent generalization of our RPF on out-of-domain fashion images.
Our source code is available at https://github.com/HuiGuanLab/RPF.|特定属性的时尚检索(ASFR)是一项具有挑战性的信息检索任务,近年来受到越来越多的关注。与以优化整体相似度为核心的传统时尚检索不同,ASFR 任务集中于特定属性的相似度,使得检索结果更加细粒度和可解释。由于特定属性的相似性通常对应于图像的特定细微区域,因此我们提出了一种由区域感知分支和补丁感知分支组成的区域到补丁框架(Region-to-Patch Framework,RPF)来提取细粒度的属性相关视觉特征,以便以粗到细的方式进行精确检索。特别地,区域感知分支首先被用来定位与给定属性的语义相关的潜在区域。然后,考虑到所定位的区域比较粗糙,并且仍然包含背景视觉内容,提出了基于补丁感知的分支来从前面的放大区域中获取与补丁相关的属性细节。这种混合结构在区域定位和特征提取之间取得了适当的平衡。此外,与以往单纯侧重于识别与属性相关的前景视觉特征的工作不同,我们认为与属性无关的背景特征对于以对比的方式识别详细的视觉背景也是至关重要的。因此,本文进一步提出了一种基于前景和背景表征的新型 E-InfoNCE 损失,以改善特定属性表征的识别能力。在三个数据集上的大量实验证明了我们提出的框架的有效性,同时也显示了我们的 RPF 在域外时尚图像上的良好泛化能力。我们的源代码可以在 https://github.com/HuiGuanLab/RPF 找到。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=From+Region+to+Patch:+Attribute-Aware+Foreground-Background+Contrastive+Learning+for+Fine-Grained+Fashion+Retrieval)|0| |[Multi-view Multi-aspect Neural Networks for Next-basket Recommendation](https://doi.org/10.1145/3539618.3591738)|Zhiying Deng, Jianjun Li, Zhiqiang Guo, Wei Liu, Li Zou, Guohui Li||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Multi-view+Multi-aspect+Neural+Networks+for+Next-basket+Recommendation)|0| |[EulerNet: Adaptive Feature Interaction Learning via Euler's Formula for CTR Prediction](https://doi.org/10.1145/3539618.3591681)|Zhen Tian, Ting Bai, Wayne Xin Zhao, JiRong Wen, Zhao Cao||Learning effective high-order feature interactions is very crucial in the CTR prediction task. However, it is very time-consuming to calculate high-order feature interactions with massive features in online e-commerce platforms. Most existing methods manually design a maximal order and further filter out the useless interactions from them. Although they reduce the high computational costs caused by the exponential growth of high-order feature combinations, they still suffer from the degradation of model capability due to the suboptimal learning of the restricted feature orders. The solution to maintain the model capability and meanwhile keep it efficient is a technical challenge, which has not been adequately addressed. To address this issue, we propose an adaptive feature interaction learning model, named as EulerNet, in which the feature interactions are learned in a complex vector space by conducting space mapping according to Euler's formula. EulerNet converts the exponential powers of feature interactions into simple linear combinations of the modulus and phase of the complex features, making it possible to adaptively learn the high-order feature interactions in an efficient way. Furthermore, EulerNet incorporates the implicit and explicit feature interactions into a unified architecture, which achieves the mutual enhancement and largely boosts the model capabilities. Such a network can be fully learned from data, with no need of pre-designed form or order for feature interactions. Extensive experiments conducted on three public datasets have demonstrated the effectiveness and efficiency of our approach.
Our code is available at: https://github.com/RUCAIBox/EulerNet.|在 CTR 预测任务中,学习有效的高阶特征交互是非常关键的。然而,在线电子商务平台中计算具有大量特征的高阶特征交互是非常耗时的。大多数现有的方法手工设计一个最大阶数,并进一步从中筛选出无用的交互。虽然它们降低了高阶特征组合的指数增长所造成的高计算成本,但是由于受限特征阶数的次优学习,它们仍然受到模型能力退化的影响。保持模型能力并同时保持其有效性的解决方案是一个技术挑战,尚未得到充分解决。针对这一问题,我们提出了一种名为 EulerNet 的自适应特征交互学习模型,该模型根据欧拉公式进行空间映射,在复向量空间中学习特征交互。EulerNet 将特征交互的指数幂转化为复特征模量和相位的简单线性组合,使得高阶特征交互的自适应学习成为可能。此外,EulerNet 将隐式和显式的特征交互融合到一个统一的体系结构中,实现了相互增强,大大提高了模型的性能。这样的网络可以完全从数据中学习,无需预先设计特征交互的形式或阶数。在三个公共数据集上进行的大量实验已经证明了我们方法的有效性和效率。我们的代码可以在以下 https://github.com/RUCAIBox/EulerNet 找到。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=EulerNet:+Adaptive+Feature+Interaction+Learning+via+Euler's+Formula+for+CTR+Prediction)|0| -|[FiD-Light: Efficient and Effective Retrieval-Augmented Text Generation](https://doi.org/10.1145/3539618.3591687)|Sebastian Hofstätter, Jiecao Chen, Karthik Raman, Hamed Zamani|Google; University of Massachusetts, Amherst; Technische Universität Wien|Retrieval-augmented generation models offer many benefits over standalone language models: besides a textual answer to a given query they provide provenance items retrieved from an updateable knowledge base. However, they are also more complex systems and need to handle long inputs. In this work, we introduce FiD-Light to strongly increase the efficiency of the state-of-the-art retrieval-augmented FiD model, while maintaining the same level of effectiveness. Our FiD-Light model constrains the information flow from the encoder (which encodes passages separately) to the decoder (using concatenated encoded representations). Furthermore, we adapt FiD-Light with re-ranking capabilities through textual source pointers, to improve the top-ranked provenance precision. Our experiments on a diverse set of seven knowledge intensive tasks (KILT) show FiD-Light consistently improves the Pareto frontier between query latency and effectiveness. FiD-Light with source pointing sets substantial new state-of-the-art results on six KILT tasks for combined text generation and provenance retrieval evaluation, while maintaining reasonable efficiency.|与独立语言模型相比,检索增强生成模型提供了许多好处: 除了对给定查询的文本回答之外,它们还提供了从可更新的知识库中检索到的出处项。然而,它们也是更复杂的系统,需要处理长输入。在这项工作中,我们引入了 FiD-Light,在保持相同的效率水平的同时,强有力地提高了最先进的检索增强 FiD 模型的效率。我们的 FiD-Light 模型约束了从编码器(分别编码通道)到解码器(使用级联编码表示)的信息流。此外,我们通过文本源指针改编具有重新排序能力的 FiD-Light,以提高顶级来源精度。我们在七个不同的知识密集型任务(KILT)上的实验表明,FiD-Light 一致地改善了查询延迟和有效性之间的帕累托边界。带源点的 FiD-Light 在六个 KILT 任务上设置了大量的最新结果,用于组合文本生成和出处检索评估,同时保持合理的效率。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=FiD-Light:+Efficient+and+Effective+Retrieval-Augmented+Text+Generation)|0| +|[FiD-Light: Efficient and Effective Retrieval-Augmented Text Generation](https://doi.org/10.1145/3539618.3591687)|Sebastian Hofstätter, Jiecao Chen, Karthik Raman, Hamed Zamani|University of Massachusetts, Amherst; Technische Universität Wien; Google|Retrieval-augmented generation models offer many benefits over standalone language models: besides a textual answer to a given query they provide provenance items retrieved from an updateable knowledge base. However, they are also more complex systems and need to handle long inputs. In this work, we introduce FiD-Light to strongly increase the efficiency of the state-of-the-art retrieval-augmented FiD model, while maintaining the same level of effectiveness. Our FiD-Light model constrains the information flow from the encoder (which encodes passages separately) to the decoder (using concatenated encoded representations).
Furthermore, we adapt FiD-Light with re-ranking capabilities through textual source pointers, to improve the top-ranked provenance precision. Our experiments on a diverse set of seven knowledge intensive tasks (KILT) show FiD-Light consistently improves the Pareto frontier between query latency and effectiveness. FiD-Light with source pointing sets substantial new state-of-the-art results on six KILT tasks for combined text generation and provenance retrieval evaluation, while maintaining reasonable efficiency.|与独立语言模型相比,检索增强生成模型提供了许多好处: 除了对给定查询的文本回答之外,它们还提供了从可更新的知识库中检索到的出处项。然而,它们也是更复杂的系统,需要处理长输入。在这项工作中,我们引入了 FiD-Light,在保持相同有效性水平的同时,大幅提高了最先进的检索增强 FiD 模型的效率。我们的 FiD-Light 模型约束了从编码器(分别编码各个段落)到解码器(使用级联编码表示)的信息流。此外,我们通过文本源指针为 FiD-Light 增加了重排序能力,以提高排名靠前的出处精度。我们在七个不同的知识密集型任务(KILT)上的实验表明,FiD-Light 一致地改善了查询延迟和有效性之间的帕累托边界。在文本生成与出处检索的联合评估中,带源指针的 FiD-Light 在六个 KILT 任务上取得了显著的最新最优结果,同时保持了合理的效率。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=FiD-Light:+Efficient+and+Effective+Retrieval-Augmented+Text+Generation)|0| |[PLATE: A Prompt-Enhanced Paradigm for Multi-Scenario Recommendations](https://doi.org/10.1145/3539618.3591750)|Yuhao Wang, Xiangyu Zhao, Bo Chen, Qidong Liu, Huifeng Guo, Huanshuo Liu, Yichao Wang, Rui Zhang, Ruiming Tang||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=PLATE:+A+Prompt-Enhanced+Paradigm+for+Multi-Scenario+Recommendations)|0| |[LightGT: A Light Graph Transformer for Multimedia Recommendation](https://doi.org/10.1145/3539618.3591716)|Yinwei Wei, Wenqi Liu, Fan Liu, Xiang Wang, Liqiang Nie, TatSeng Chua||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=LightGT:+A+Light+Graph+Transformer+for+Multimedia+Recommendation)|0| |[Law Article-Enhanced Legal Case Matching: A Causal Learning Approach](https://doi.org/10.1145/3539618.3591709)|Zhongxiang Sun, Jun Xu, Xiao Zhang, Zhenhua Dong, JiRong Wen||Legal case matching, which automatically constructs a model to estimate the similarities between the source and target cases, has played an essential role in intelligent legal systems. Semantic text matching models have been applied to the task where the source and target legal cases are considered as long-form text documents. These general-purpose matching models make the predictions solely based on the texts in the legal cases, overlooking the essential role of the law articles in legal case matching. In the real world, the matching results (e.g., relevance labels) are dramatically affected by the law articles because the contents and the judgments of a legal case are radically formed on the basis of law. From the causal sense, a matching decision is affected by the mediation effect from the cited law articles by the legal cases, and the direct effect of the key circumstances (e.g., detailed fact descriptions) in the legal cases. In light of the observation, this paper proposes a model-agnostic causal learning framework called Law-Match, under which the legal case matching models are learned by respecting the corresponding law articles. Given a pair of legal cases and the related law articles, Law-Match considers the embeddings of the law articles as instrumental variables (IVs), and the embeddings of legal cases as treatments. Using IV regression, the treatments can be decomposed into law-related and law-unrelated parts, respectively reflecting the mediation and direct effects. These two parts are then combined with different weights to collectively support the final matching prediction.
We show that the framework is model-agnostic, and a number of legal case matching models can be applied as the underlying models. Comprehensive experiments show that Law-Match can outperform state-of-the-art baselines on three public datasets.|法律案件匹配自动构建模型来判断源案件与目标案件的相似性,在智能法律系统中发挥着重要作用。语义文本匹配模型已被应用于这一任务,其中源、目标法律案件被视为长文本文档。这些通用匹配模型仅根据法律案件的案文进行预测,忽视了法律条文在法律案件匹配中的重要作用。在现实世界中,由于案件的内容和判决是在法律的基础上从根本上形成的,所以匹配结果(如相关性标签)受到法律条文的巨大影响。从因果关系的角度来看,匹配决策既受法律案件所引用法律条文带来的中介效应影响,也受案件中关键情节(如详细的事实描述)的直接效应影响。基于这一观察,本文提出了一个模型无关的因果学习框架 Law-Match,在此框架下,通过尊重相应的法律条文来学习法律案例匹配模型。在给定一对法律案例和相关法律条文的情况下,Law-Match 将法律条文的嵌入视为工具变量(IV),将法律案例的嵌入视为处理变量。利用工具变量回归,可以将处理分解为与法律相关的部分和与法律无关的部分,分别反映中介效应和直接效应。然后将这两部分以不同的权重相结合,共同支持最终的匹配预测。我们表明,该框架是模型无关的,多种法律案件匹配模型都可以作为基础模型应用。综合实验表明,Law-Match 在三个公共数据集上的表现优于最先进的基线。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Law+Article-Enhanced+Legal+Case+Matching:+A+Causal+Learning+Approach)|0| diff --git a/papers/www/www2023.md b/papers/www/www2023.md index 5cfd75e4..9861e08a 100644 --- a/papers/www/www2023.md +++ b/papers/www/www2023.md @@ -2,13 +2,13 @@ |论文|作者|组织|摘要|翻译|代码|引用数| |---|---|---|---|---|---|---| -|[Automated Ontology Evaluation: Evaluating Coverage and Correctness using a Domain Corpus](https://doi.org/10.1145/3543873.3587617)|Antonio Zaitoun, Tomer Sagi, Katja Hose|Aalborg University, Denmark; Aalborg University, Denmark and TU Wien, Austria; University of Haifa, Israel|Ontologies conceptualize domains and are a crucial part of web semantics and information systems. However, re-using an existing ontology for a new task requires a detailed evaluation of the candidate ontology as it may cover only a subset of the domain concepts, contain information that is redundant or misleading, and have inaccurate relations and hierarchies between concepts. Manual evaluation of large and complex ontologies is a tedious task. Thus, a few approaches have been proposed for automated evaluation, ranging from concept coverage to ontology generation from a corpus. Existing approaches, however, are limited by their dependence on external structured knowledge sources, such as a thesaurus, as well as by their inability to evaluate semantic relationships. In this paper, we propose a novel framework to automatically evaluate the domain coverage and semantic correctness of existing ontologies based on domain information derived from text. The approach uses a domain-tuned named-entity-recognition model to extract phrasal concepts. The extracted concepts are then used as a representation of the domain against which we evaluate the candidate ontology’s concepts. We further employ a domain-tuned language model to determine the semantic correctness of the candidate ontology’s relations.
We demonstrate our automated approach on several large ontologies from the oceanographic domain and show its agreement with a manual evaluation by domain experts and its superiority over the state-of-the-art.|本体概念化领域,是网络语义和信息系统的重要组成部分。然而,在新任务中重新使用现有的本体需要对候选本体进行详细的评估,因为它可能只涉及领域概念的一个子集,包含冗余或误导的信息,并且概念之间的关系和层次结构不准确。手工评估大型和复杂的本体是一项繁琐的任务。因此,提出了一些自动评估的方法,从概念覆盖到从语料库中生成本体。然而,现有的方法受到依赖于外部结构化知识源(如同义词表)以及无法评估语义关系的限制。本文提出了一种基于文本的领域信息自动评估现有本体的领域覆盖率和语义正确性的框架。该方法使用一个域调整的命名实体识别模型来提取短语概念。然后将提取的概念用作领域的表示,我们根据这个表示来评估候选本体的概念。我们进一步使用领域调优的语言模型来确定候选本体关系的语义正确性。我们展示了我们的自动化方法从海洋学领域的几个大的本体论,并表明其与领域专家的手工评估的一致性及其优越性的最新水平。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Automated+Ontology+Evaluation:+Evaluating+Coverage+and+Correctness+using+a+Domain+Corpus)|2| +|[Automated Ontology Evaluation: Evaluating Coverage and Correctness using a Domain Corpus](https://doi.org/10.1145/3543873.3587617)|Antonio Zaitoun, Tomer Sagi, Katja Hose|Aalborg University, Denmark and TU Wien, Austria; Aalborg University, Denmark; University of Haifa, Israel|Ontologies conceptualize domains and are a crucial part of web semantics and information systems. However, re-using an existing ontology for a new task requires a detailed evaluation of the candidate ontology as it may cover only a subset of the domain concepts, contain information that is redundant or misleading, and have inaccurate relations and hierarchies between concepts. Manual evaluation of large and complex ontologies is a tedious task. Thus, a few approaches have been proposed for automated evaluation, ranging from concept coverage to ontology generation from a corpus. Existing approaches, however, are limited by their dependence on external structured knowledge sources, such as a thesaurus, as well as by their inability to evaluate semantic relationships. In this paper, we propose a novel framework to automatically evaluate the domain coverage and semantic correctness of existing ontologies based on domain information derived from text. The approach uses a domain-tuned named-entity-recognition model to extract phrasal concepts. The extracted concepts are then used as a representation of the domain against which we evaluate the candidate ontology’s concepts. We further employ a domain-tuned language model to determine the semantic correctness of the candidate ontology’s relations. 
We demonstrate our automated approach on several large ontologies from the oceanographic domain and show its agreement with a manual evaluation by domain experts and its superiority over the state-of-the-art.|本体将领域概念化,是网络语义和信息系统的重要组成部分。然而,在新任务中重新使用现有的本体需要对候选本体进行详细的评估,因为它可能只涉及领域概念的一个子集,包含冗余或误导的信息,并且概念之间的关系和层次结构不准确。手工评估大型和复杂的本体是一项繁琐的任务。因此,提出了一些自动评估的方法,从概念覆盖到从语料库中生成本体。然而,现有的方法受到依赖于外部结构化知识源(如同义词表)以及无法评估语义关系的限制。本文提出了一种基于文本的领域信息自动评估现有本体的领域覆盖率和语义正确性的框架。该方法使用一个领域微调的命名实体识别模型来提取短语概念。然后将提取的概念用作领域的表示,我们根据这个表示来评估候选本体的概念。我们进一步使用领域微调的语言模型来确定候选本体关系的语义正确性。我们在海洋学领域的几个大型本体上演示了我们的自动化方法,结果表明它与领域专家的人工评估相一致,并且优于当前最先进的方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Automated+Ontology+Evaluation:+Evaluating+Coverage+and+Correctness+using+a+Domain+Corpus)|2| |[Reinforcement Learning-based Counter-Misinformation Response Generation: A Case Study of COVID-19 Vaccine Misinformation](https://doi.org/10.1145/3543507.3583388)|Bing He, Mustaque Ahamad, Srijan Kumar||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Reinforcement+Learning-based+Counter-Misinformation+Response+Generation:+A+Case+Study+of+COVID-19+Vaccine+Misinformation)|2| -|[A Concept Knowledge Graph for User Next Intent Prediction at Alipay](https://doi.org/10.1145/3543873.3587308)|Yacheng He, Qianghuai Jia, Lin Yuan, Ruopeng Li, Yixin Ou, Ningyu Zhang|Ant Group, China; Zhejiang University, China|This paper illustrates the technologies of user next intent prediction with a concept knowledge graph. The system has been deployed on the Web at Alipay, serving more than 100 million daily active users. To explicitly characterize user intent, we propose AlipayKG, which is an offline concept knowledge graph in the Life-Service domain modeling the historical behaviors of users, the rich content interacted by users and the relations between them. We further introduce a Transformer-based model which integrates expert rules from the knowledge graph to infer the online user's next intent. Experimental results demonstrate that the proposed system can effectively enhance the performance of the downstream tasks while retaining explainability.|本文用概念知识图说明了用户下一意图预测技术。该系统已经部署在支付宝的网站上,为超过1亿日活跃用户提供服务。为了明确表征用户意图,本文提出了 AlipayKG,它是生活服务领域中的一个离线概念知识图,对用户的历史行为、用户交互的丰富内容以及用户之间的关系进行建模。我们进一步介绍了一个基于 Transformer 的模型,该模型集成了来自知识图的专家规则,以推断在线用户的下一个意图。实验结果表明,该系统能够有效地提高下游任务的性能,同时保持可解释性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Concept+Knowledge+Graph+for+User+Next+Intent+Prediction+at+Alipay)|1| -|[Interaction-level Membership Inference Attack Against Federated Recommender Systems](https://doi.org/10.1145/3543507.3583359)|Wei Yuan, Chaoqun Yang, Quoc Viet Hung Nguyen, Lizhen Cui, Tieke He, Hongzhi Yin|Shandong University, China; Griffith University, Australia; The University of Queensland, Australia; Nanjing University, China|The marriage of federated learning and recommender system (FedRec) has been widely used to address the growing data privacy concerns in personalized recommendation services. In FedRecs, users' attribute information and behavior data (i.e., user-item interaction data) are kept locally on their personal devices, therefore, it is considered a fairly secure approach to protect user privacy. As a result, the privacy issue of FedRecs is rarely explored. Unfortunately, several recent studies reveal that FedRecs are vulnerable to user attribute inference attacks, highlighting the privacy concerns of FedRecs. In this paper, we further investigate the privacy problem of user behavior data (i.e., user-item interactions) in FedRecs.
Specifically, we perform the first systematic study on interaction-level membership inference attacks on FedRecs. An interaction-level membership inference attacker is first designed, and then the classical privacy protection mechanism, Local Differential Privacy (LDP), is adopted to defend against the membership inference attack. Unfortunately, the empirical analysis shows that LDP is not effective against such new attacks unless the recommendation performance is largely compromised. To mitigate the interaction-level membership attack threats, we design a simple yet effective defense method to significantly reduce the attacker's inference accuracy without losing recommendation performance. Extensive experiments are conducted with two widely used FedRecs (Fed-NCF and Fed-LightGCN) on three real-world recommendation datasets (MovieLens-100K, Steam-200K, and Amazon Cell Phone), and the experimental results show the effectiveness of our solutions.|联邦学习与推荐系统的结合(FedRec)已被广泛用于解决个性化推荐服务中日益增长的数据隐私问题。在 FedRecs 中,用户的属性信息和行为数据(即用户项交互数据)保存在他们的个人设备上,因此,它被认为是保护用户隐私的一种相当安全的方法。因此,FedRecs 的隐私问题很少被探讨。不幸的是,最近的一些研究表明,联邦医疗记录系统容易受到用户属性推理攻击,突出了联邦医疗记录系统的隐私问题。在本文中,我们进一步研究了 FedRecs 中用户行为数据(即用户项交互)的隐私问题。具体来说,我们对 FedRecs 的交互层次成员推理攻击进行了第一次系统研究。首先设计了一个交互级别的成员推断攻击,然后采用经典的隐私保护机制,即本地差分隐私(lDP)来抵御成员推断攻击。遗憾的是,实证分析表明,除非推荐性能受到很大影响,否则 LDP 无法有效地抵抗这种新的攻击。为了减轻交互级别的成员攻击威胁,我们设计了一种简单而有效的防御方法,在不损失推荐性能的前提下显著降低攻击者的推断精度。在三个真实世界的推荐数据集(MovieLens-100K,Stream-200K 和 Amazon Cell Phone)上,用两个广泛使用的 FedRecs (Fed-NCF 和 Fed-LightGCN)进行了广泛的实验,实验结果显示了我们的解决方案的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Interaction-level+Membership+Inference+Attack+Against+Federated+Recommender+Systems)|1| +|[A Concept Knowledge Graph for User Next Intent Prediction at Alipay](https://doi.org/10.1145/3543873.3587308)|Yacheng He, Qianghuai Jia, Lin Yuan, Ruopeng Li, Yixin Ou, Ningyu Zhang|Zhejiang University, China; Ant Group, China|This paper illustrates the technologies of user next intent prediction with a concept knowledge graph. The system has been deployed on the Web at Alipay, serving more than 100 million daily active users. To explicitly characterize user intent, we propose AlipayKG, which is an offline concept knowledge graph in the Life-Service domain modeling the historical behaviors of users, the rich content interacted by users and the relations between them. We further introduce a Transformer-based model which integrates expert rules from the knowledge graph to infer the online user's next intent. Experimental results demonstrate that the proposed system can effectively enhance the performance of the downstream tasks while retaining explainability.|本文用概念知识图说明了用户下一意图预测技术。该系统已经部署在支付宝的网站上,为超过1亿日活跃用户提供服务。为了明确表征用户意图,本文提出了 AlipayKG,它是生活服务领域中的一个离线概念知识图,对用户的历史行为、用户交互的丰富内容以及用户之间的关系进行建模。我们进一步介绍了一个基于 Transformer 的模型,该模型集成了来自知识图的专家规则,以推断在线用户的下一个意图。实验结果表明,该系统能够有效地提高下游任务的性能,同时保持可解释性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Concept+Knowledge+Graph+for+User+Next+Intent+Prediction+at+Alipay)|1| +|[Interaction-level Membership Inference Attack Against Federated Recommender Systems](https://doi.org/10.1145/3543507.3583359)|Wei Yuan, Chaoqun Yang, Quoc Viet Hung Nguyen, Lizhen Cui, Tieke He, Hongzhi Yin|Griffith University, Australia; Shandong University, China; Nanjing University, China; The University of Queensland, Australia|The marriage of federated learning and recommender system (FedRec) has been widely used to address the growing data privacy concerns in personalized recommendation services. 
In FedRecs, users' attribute information and behavior data (i.e., user-item interaction data) are kept locally on their personal devices, therefore, it is considered a fairly secure approach to protect user privacy. As a result, the privacy issue of FedRecs is rarely explored. Unfortunately, several recent studies reveal that FedRecs are vulnerable to user attribute inference attacks, highlighting the privacy concerns of FedRecs. In this paper, we further investigate the privacy problem of user behavior data (i.e., user-item interactions) in FedRecs. Specifically, we perform the first systematic study on interaction-level membership inference attacks on FedRecs. An interaction-level membership inference attacker is first designed, and then the classical privacy protection mechanism, Local Differential Privacy (LDP), is adopted to defend against the membership inference attack. Unfortunately, the empirical analysis shows that LDP is not effective against such new attacks unless the recommendation performance is largely compromised. To mitigate the interaction-level membership attack threats, we design a simple yet effective defense method to significantly reduce the attacker's inference accuracy without losing recommendation performance. Extensive experiments are conducted with two widely used FedRecs (Fed-NCF and Fed-LightGCN) on three real-world recommendation datasets (MovieLens-100K, Steam-200K, and Amazon Cell Phone), and the experimental results show the effectiveness of our solutions.|联邦学习与推荐系统的结合(FedRec)已被广泛用于解决个性化推荐服务中日益增长的数据隐私问题。在 FedRecs 中,用户的属性信息和行为数据(即用户项交互数据)保存在他们的个人设备上,因此,它被认为是保护用户隐私的一种相当安全的方法。因此,FedRecs 的隐私问题很少被探讨。不幸的是,最近的一些研究表明,FedRecs 容易受到用户属性推理攻击,这突出了 FedRecs 的隐私问题。在本文中,我们进一步研究了 FedRecs 中用户行为数据(即用户项交互)的隐私问题。具体来说,我们对 FedRecs 的交互层次成员推理攻击进行了第一次系统研究。我们首先设计了一个交互级别的成员推断攻击者,然后采用经典的隐私保护机制,即本地差分隐私(LDP)来抵御成员推断攻击。遗憾的是,实证分析表明,除非推荐性能受到很大影响,否则 LDP 无法有效地抵抗这种新的攻击。为了减轻交互级别的成员攻击威胁,我们设计了一种简单而有效的防御方法,在不损失推荐性能的前提下显著降低攻击者的推断精度。在三个真实世界的推荐数据集(MovieLens-100K,Steam-200K 和 Amazon Cell Phone)上,用两个广泛使用的 FedRecs (Fed-NCF 和 Fed-LightGCN)进行了广泛的实验,实验结果显示了我们的解决方案的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Interaction-level+Membership+Inference+Attack+Against+Federated+Recommender+Systems)|1| |[Learning with Exposure Constraints in Recommendation Systems](https://doi.org/10.1145/3543507.3583320)|Omer BenPorat, Rotem Torkan|Faculty of Data and Decision Sciences, Technion - Israel Institute of Technology, Israel|Recommendation systems are dynamic economic systems that balance the needs of multiple stakeholders. A recent line of work studies incentives from the content providers' point of view. Content providers, e.g., vloggers and bloggers, contribute fresh content and rely on user engagement to create revenue and finance their operations. In this work, we propose a contextual multi-armed bandit setting to model the dependency of content providers on exposure. In our model, the system receives a user context in every round and has to select one of the arms. Every arm is a content provider who must receive a minimum number of pulls every fixed time period (e.g., a month) to remain viable in later rounds; otherwise, the arm departs and is no longer available. The system aims to maximize the users' (content consumers) welfare. To that end, it should learn which arms are vital and ensure they remain viable by subsidizing arm pulls if needed.
We develop algorithms with sub-linear regret, as well as a lower bound that demonstrates that our algorithms are optimal up to logarithmic factors.|推荐系统是平衡多个利益相关者需求的动态经济系统。最近的一项工作是从内容提供商的角度研究激励机制。内容提供商,例如,视频博客作者和博客作者,贡献新的内容,并依靠用户参与来创造收入和资助他们的业务。在这项工作中,我们提出了一个上下文多臂老虎机设置来模拟内容提供者对曝光的依赖。在我们的模型中,系统在每一轮中接收一个用户上下文,并且必须选择其中一个臂。每个臂都是一个内容提供者,它必须在每个固定的时间段(例如,一个月)内获得最少数量的拉动,才能在以后的回合中保持存活; 否则,该臂就会离开,不再可用。该系统旨在最大化用户(内容消费者)的福利。为此,系统应该了解哪些臂是至关重要的,并在必要时通过补贴臂的拉动来确保它们保持存活。我们开发了具有次线性遗憾的算法,并给出了一个下界,表明我们的算法在对数因子意义下是最优的。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+with+Exposure+Constraints+in+Recommendation+Systems)|1| -|[On How Zero-Knowledge Proof Blockchain Mixers Improve, and Worsen User Privacy](https://doi.org/10.1145/3543507.3583217)|Zhipeng Wang, Stefanos Chaliasos, Kaihua Qin, Liyi Zhou, Lifeng Gao, Pascal Berrang, Benjamin Livshits, Arthur Gervais|Imperial College London, United Kingdom; University of Birmingham, United Kingdom; UCL, United Kingdom and UC Berkeley, USA|Zero-knowledge proof (ZKP) mixers are one of the most widely-used blockchain privacy solutions, operating on top of smart contract-enabled blockchains. We find that ZKP mixers are tightly intertwined with the growing number of Decentralized Finance (DeFi) attacks and Blockchain Extractable Value (BEV) extractions. Through coin flow tracing, we discover that 205 blockchain attackers and 2,595 BEV extractors leverage mixers as their source of funds, while depositing a total attack revenue of 412.87M USD. Moreover, the US OFAC sanctions against the largest ZKP mixer, Tornado.Cash, have reduced the mixer's daily deposits by more than 80%. Further, ZKP mixers advertise their level of privacy through a so-called anonymity set size, which similarly to k-anonymity allows a user to hide among a set of k other users. Through empirical measurements, we, however, find that these anonymity set claims are mostly inaccurate. For the most popular mixers on Ethereum (ETH) and Binance Smart Chain (BSC), we show how to reduce the anonymity set size on average by 27.34% and 46.02% respectively. Our empirical evidence is also the first to suggest a differing privacy-predilection of users on ETH and BSC. State-of-the-art ZKP mixers are moreover interwoven with the DeFi ecosystem by offering anonymity mining (AM) incentives, i.e., users receive monetary rewards for mixing coins. However, contrary to the claims of related work, we find that AM does not necessarily improve the quality of a mixer's anonymity set. Our findings indicate that AM attracts privacy-ignorant users, who then do not contribute to improving the privacy of other mixer users.|零知识证明(ZKP)混频器是最广泛使用的区块链隐私解决方案之一,运行在智能合同启用的区块链之上。我们发现 ZKP 混频器与不断增加的分散金融(DeFi)攻击和区块链可提取值(BEV)提取紧密相关。通过硬币流追踪,我们发现205个区块链攻击者和2595个 BEV 提取者利用混合器作为他们的资金来源,同时存储总攻击收入为412.87万美元。此外,美国海外资产管制办公室制裁了 ZKP 最大的搅拌机龙卷风。现金,减少了搅拌机的每日存款超过80% 。此外,ZKP 混频器通过所谓的匿名集大小来宣传他们的隐私级别,这与 k 匿名类似,允许用户隐藏在 k 其他用户集中。然而,通过实证测量,我们发现这些匿名集索赔大多是不准确的。针对 Etherum (ETH)和 Binance 智能链(BSC)上最常用的混频器,我们分别给出了如何平均减少匿名集大小27.34% 和46.02% 的方法。我们的经验证明也是第一个提出不同隐私偏好的用户。最先进的 ZKP 混合器还通过提供匿名挖掘(AM)奖励与 DeFi 生态系统交织在一起,也就是说,用户通过混合硬币获得金钱奖励。然而,与相关工作的主张相反,我们发现 AM 并不一定改善混频器匿名集的质量。我们的研究结果表明,AM 吸引了那些对隐私无知的用户,而这些用户并没有帮助改善其他混频器用户的隐私。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=On+How+Zero-Knowledge+Proof+Blockchain+Mixers+Improve,+and+Worsen+User+Privacy)|1| -|[To Store or Not?
Online Data Selection for Federated Learning with Limited Storage](https://doi.org/10.1145/3543507.3583426)|Chen Gong, Zhenzhe Zheng, Fan Wu, Yunfeng Shao, Bingshuai Li, Guihai Chen||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=To+Store+or+Not?+Online+Data+Selection+for+Federated+Learning+with+Limited+Storage)|1| +|[On How Zero-Knowledge Proof Blockchain Mixers Improve, and Worsen User Privacy](https://doi.org/10.1145/3543507.3583217)|Zhipeng Wang, Stefanos Chaliasos, Kaihua Qin, Liyi Zhou, Lifeng Gao, Pascal Berrang, Benjamin Livshits, Arthur Gervais|UCL, United Kingdom and UC Berkeley, USA; University of Birmingham, United Kingdom; Imperial College London, United Kingdom|Zero-knowledge proof (ZKP) mixers are one of the most widely-used blockchain privacy solutions, operating on top of smart contract-enabled blockchains. We find that ZKP mixers are tightly intertwined with the growing number of Decentralized Finance (DeFi) attacks and Blockchain Extractable Value (BEV) extractions. Through coin flow tracing, we discover that 205 blockchain attackers and 2,595 BEV extractors leverage mixers as their source of funds, while depositing a total attack revenue of 412.87M USD. Moreover, the US OFAC sanctions against the largest ZKP mixer, Tornado.Cash, have reduced the mixer's daily deposits by more than 80%. Further, ZKP mixers advertise their level of privacy through a so-called anonymity set size, which similarly to k-anonymity allows a user to hide among a set of k other users. Through empirical measurements, we, however, find that these anonymity set claims are mostly inaccurate. For the most popular mixers on Ethereum (ETH) and Binance Smart Chain (BSC), we show how to reduce the anonymity set size on average by 27.34% and 46.02% respectively. Our empirical evidence is also the first to suggest a differing privacy-predilection of users on ETH and BSC. State-of-the-art ZKP mixers are moreover interwoven with the DeFi ecosystem by offering anonymity mining (AM) incentives, i.e., users receive monetary rewards for mixing coins. However, contrary to the claims of related work, we find that AM does not necessarily improve the quality of a mixer's anonymity set. Our findings indicate that AM attracts privacy-ignorant users, who then do not contribute to improving the privacy of other mixer users.|零知识证明(ZKP)混币器是最广泛使用的区块链隐私解决方案之一,运行在支持智能合约的区块链之上。我们发现 ZKP 混币器与不断增加的分散金融(DeFi)攻击和区块链可提取值(BEV)提取紧密相关。通过硬币流追踪,我们发现205个区块链攻击者和2595个 BEV 提取者利用混币器作为他们的资金来源,存入的攻击收入总计达4.1287亿美元。此外,美国海外资产管制办公室(OFAC)对最大的 ZKP 混币器 Tornado.Cash 实施制裁,使其每日存款减少了80% 以上。此外,ZKP 混币器通过所谓的匿名集大小来宣传他们的隐私级别,这与 k-匿名类似,允许用户隐藏在其他 k 个用户的集合中。然而,通过实证测量,我们发现这些关于匿名集的声明大多是不准确的。针对 Ethereum (ETH)和 Binance 智能链(BSC)上最常用的混币器,我们展示了如何将匿名集大小平均分别减少27.34% 和46.02% 。我们的实证证据也首次表明 ETH 和 BSC 上的用户具有不同的隐私偏好。最先进的 ZKP 混币器还通过提供匿名挖掘(AM)奖励与 DeFi 生态系统交织在一起,也就是说,用户通过混合硬币获得金钱奖励。然而,与相关工作的主张相反,我们发现 AM 并不一定改善混币器匿名集的质量。我们的研究结果表明,AM 吸引了那些对隐私无知的用户,而这些用户并没有帮助改善其他混币器用户的隐私。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=On+How+Zero-Knowledge+Proof+Blockchain+Mixers+Improve,+and+Worsen+User+Privacy)|1| +|[To Store or Not? Online Data Selection for Federated Learning with Limited Storage](https://doi.org/10.1145/3543507.3583426)|Chen Gong, Zhenzhe Zheng, Fan Wu, Yunfeng Shao, Bingshuai Li, Guihai Chen|; Department of Computer Science and Engineering, Shanghai Jiao Tong University, China|Machine learning models have been deployed in mobile networks to deal with massive data from different layers to enable automated network management and intelligence on devices.
To overcome high communication cost and severe privacy concerns of centralized machine learning, federated learning (FL) has been proposed to achieve distributed machine learning among networked devices. While the computation and communication limitation has been widely studied, the impact of on-device storage on the performance of FL is still not explored. Without an effective data selection policy to filter the massive streaming data on devices, classical FL can suffer from much longer model training time (4×) and significant inference accuracy reduction (7%), observed in our experiments. In this work, we take the first step to consider the online data selection for FL with limited on-device storage. We first define a new data valuation metric for data evaluation and selection in FL with theoretical guarantees for speeding up model convergence and enhancing final model accuracy, simultaneously. We further design ODE, a framework of Online Data sElection for FL, to coordinate networked devices to store valuable data samples. Experimental results on one industrial dataset and three public datasets show the remarkable advantages of ODE over the state-of-the-art approaches. Particularly, on the industrial dataset, ODE achieves as high as 2.5× speedup of training time and 6% increase in inference accuracy, and is robust to various factors in practical environments.|机器学习模型已经部署在移动网络中,用于处理来自不同层次的大量数据,以实现设备上的自动网络管理和智能化。为了克服集中式机器学习的高通信成本和严重的隐私问题,提出了联邦学习(FL)来实现网络设备之间的分布式机器学习。虽然计算和通信的局限性已经被广泛研究,但是在设备上存储对 FL 性能的影响还没有被探讨。在我们的实验中观察到,如果没有有效的数据选择策略来过滤设备上的大量流数据,经典的 FL 可能会遭受更长的模型训练时间(4倍)和显著的推断准确性降低(7%)。在这项工作中,我们迈出了第一步,研究设备端存储受限情况下 FL 的在线数据选择问题。我们首先定义了一个新的用于 FL 中数据评估和选择的数据估值度量,同时为加快模型收敛速度和提高最终模型精度提供了理论保证。我们进一步设计了用于 FL 的在线数据选择(Online Data sElection, ODE)框架,用于协调网络设备以存储有价值的数据样本。在一个工业数据集和三个公共数据集上的实验结果表明,ODE 方法比现有的方法具有显著的优势。特别是在工业数据集上,ODE 达到了2.5倍的训练时间加速和6% 的推理精度提高,并且对实际环境中的各种因素具有鲁棒性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=To+Store+or+Not?+Online+Data+Selection+for+Federated+Learning+with+Limited+Storage)|1| |[Chain of Explanation: New Prompting Method to Generate Quality Natural Language Explanation for Implicit Hate Speech](https://doi.org/10.1145/3543873.3587320)|Fan Huang, Haewoon Kwak, Jisun An||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Chain+of+Explanation:+New+Prompting+Method+to+Generate+Quality+Natural+Language+Explanation+for+Implicit+Hate+Speech)|1| |[NeuKron: Constant-Size Lossy Compression of Sparse Reorderable Matrices and Tensors](https://doi.org/10.1145/3543507.3583226)|Taehyung Kwon, Jihoon Ko, Jinhong Jung, Kijung Shin||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=NeuKron:+Constant-Size+Lossy+Compression+of+Sparse+Reorderable+Matrices+and+Tensors)|1| |[Hierarchical Knowledge Graph Learning Enabled Socioeconomic Indicator Prediction in Location-Based Social Network](https://doi.org/10.1145/3543507.3583239)|Zhilun Zhou, Yu Liu, Jingtao Ding, Depeng Jin, Yong Li||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Hierarchical+Knowledge+Graph+Learning+Enabled+Socioeconomic+Indicator+Prediction+in+Location-Based+Social+Network)|1|
@@ -20,235 +20,235 @@
|[Knowledge-infused Contrastive Learning for Urban Imagery-based Socioeconomic Prediction](https://doi.org/10.1145/3543507.3583876)|Yu Liu, Xin Zhang, Jingtao Ding, Yanxin Xi, Yong
Li||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Knowledge-infused+Contrastive+Learning+for+Urban+Imagery-based+Socioeconomic+Prediction)|1| |[Dynamic Embedding-based Retrieval for Personalized Item Recommendations at Instacart](https://doi.org/10.1145/3543873.3587668)|Chuanwei Ruan, Allan Stewart, Han Li, Ryan Ye, David Vengerov, Haixun Wang|Instacart, USA|Personalization is essential in e-commerce, with item recommendation as a critical task. In this paper, we describe a hybrid embedding-based retrieval system for real-time personalized item recommendations on Instacart. Our system addresses unique challenges in the multi-source retrieval system, and includes several key components to make it highly personalized and dynamic. Specifically, our system features a hybrid embedding model that includes a long-term user interests embedding model and a real-time session-based model, which are combined to capture users’ immediate intents and historical interactions. Additionally, we have developed a contextual bandit solution to dynamically adjust the number of candidates from each source and optimally allocate retrieval slots given a limited computational budget. Our modeling and system optimization efforts have enabled us to provide highly personalized item recommendations in real-time at scale to all our customers, including new and long-standing users.|个性化在电子商务中是必不可少的,项目推荐是一项关键任务。本文描述了一个基于嵌入的混合检索系统,用于 Instacart 上的实时个性化项目推荐。我们的系统解决了多源检索系统中的独特挑战,并包含了几个关键组件,使其具有高度的个性化和动态性。具体来说,我们的系统采用混合嵌入模型,包括长期用户兴趣嵌入模型和基于实时会话的嵌入模型,它们结合起来捕获用户的直接意图和历史交互。此外,我们开发了一个上下文老虎机(contextual bandit)方案来动态调整来自每个来源的候选数量,并在有限的计算预算下优化分配检索时隙。我们的建模和系统优化工作使我们能够大规模地实时为所有客户(包括新用户和长期用户)提供高度个性化的项目推荐。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Dynamic+Embedding-based+Retrieval+for+Personalized+Item+Recommendations+at+Instacart)|0| |[A Multi-Granularity Matching Attention Network for Query Intent Classification in E-commerce Retrieval](https://doi.org/10.1145/3543873.3584639)|Chunyuan Yuan, Yiming Qiu, Mingming Li, Haiqing Hu, Songlin Wang, Sulong Xu|JD.com, Beijing, China, China|Query intent classification, which aims at assisting customers to find desired products, has become an essential component of e-commerce search. Existing query intent classification models either design more exquisite models to enhance the representation learning of queries or explore label-graph and multi-task to facilitate models to learn external information. However, these models cannot capture multi-granularity matching features from queries and categories, which makes them hard to mitigate the gap in the expression between informal queries and categories. This paper proposes a Multi-granularity Matching Attention Network (MMAN), which contains three modules: a self-matching module, a char-level matching module, and a semantic-level matching module to comprehensively extract features from the query and a query-category interaction matrix. In this way, the model can eliminate the difference in expression between queries and categories for query intent classification. We conduct extensive offline and online A/B experiments, and the results show that the MMAN significantly outperforms the strong baselines, which shows the superiority and effectiveness of MMAN.
MMAN has been deployed in production and brings great commercial value for our company.|查询意图分类已经成为电子商务搜索的一个重要组成部分,其目的是帮助客户找到期望的产品。现有的查询意图分类模型要么设计更精细的模型来增强查询的表示学习,要么探索标签图和多任务来促进模型学习外部信息。然而,这些模型不能从查询和类别中捕获多粒度匹配特性,这使得它们很难缩小非正式查询和类别之间的表达差距。提出了一种多粒度匹配注意网络(MMAN)模型,该模型包括三个模块: 自匹配模块、字符级匹配模块和语义级匹配模块。通过这种方式,该模型可以消除查询和类别之间在查询意图分类方面的表达差异。我们进行了大量的离线和在线 A/B 实验,结果表明,MMAN 的性能明显优于强基线,显示了 MMAN 的优越性和有效性。MMAN 已投入生产,为公司带来了巨大的商业价值。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Multi-Granularity+Matching+Attention+Network+for+Query+Intent+Classification+in+E-commerce+Retrieval)|0| -|[Divide and Conquer: Towards Better Embedding-based Retrieval for Recommender Systems from a Multi-task Perspective](https://doi.org/10.1145/3543873.3584629)|Yuan Zhang, Xue Dong, Weijie Ding, Biao Li, Peng Jiang, Kun Gai|Unaffiliated, China; Shandong University, China; Kuaishou Technology, China|Embedding-based retrieval (EBR) methods are widely used in modern recommender systems thanks to its simplicity and effectiveness. However, along the journey of deploying and iterating on EBR in production, we still identify some fundamental issues in existing methods. First, when dealing with large corpus of candidate items, EBR models often have difficulties in balancing the performance on distinguishing highly relevant items (positives) from both irrelevant ones (easy negatives) and from somewhat related yet not competitive ones (hard negatives). Also, we have little control in the diversity and fairness of the retrieval results because of the ``greedy'' nature of nearest vector search. These issues compromise the performance of EBR methods in large-scale industrial scenarios. This paper introduces a simple and proven-in-production solution to overcome these issues. The proposed solution takes a divide-and-conquer approach: the whole set of candidate items are divided into multiple clusters and we run EBR to retrieve relevant candidates from each cluster in parallel; top candidates from each cluster are then combined by some controllable merging strategies. This approach allows our EBR models to only concentrate on discriminating positives from mostly hard negatives. It also enables further improvement from a multi-tasking learning (MTL) perspective: retrieval problems within each cluster can be regarded as individual tasks; inspired by recent successes in prompting and prefix-tuning, we propose an efficient task adaption technique further boosting the retrieval performance within each cluster with negligible overheads.|嵌入式检索方法以其简单有效的特点在现代推荐系统中得到了广泛的应用。然而,在生产中部署和迭代 EBR 的过程中,我们仍然发现了现有方法中的一些基本问题。首先,在处理大量候选项目时,EBR 模型往往难以平衡区分高度相关项目(正面)和无关项目(简单负面)以及有些相关但没有竞争性的项目(硬负面)。此外,由于最近向量搜索的“贪婪”特性,我们对检索结果的多样性和公平性几乎没有控制。这些问题影响了 EBR 方法在大规模工业场景中的性能。本文介绍了一个简单且已经在生产中得到验证的解决方案来克服这些问题。该解决方案采用分而治之的方法: 将整个候选项集划分为多个集群,并运行 EBR 并行地从每个集群中检索相关候选项; 然后通过一些可控的合并策略将每个集群中的最优候选项集合起来。这种方法允许我们的 EBR 模型只集中于区分正面和大多数硬负面。它还能从多任务学习(MTL)的角度进一步改进: 每个集群中的检索问题可以被视为单个任务; 受最近在提示和前缀调优方面的成功启发,我们提出了一种有效的任务适应技术,进一步提高了每个集群中的检索性能,开销可以忽略不计。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Divide+and+Conquer:+Towards+Better+Embedding-based+Retrieval+for+Recommender+Systems+from+a+Multi-task+Perspective)|0| -|[Expressive user embedding from churn and recommendation multi-task learning](https://doi.org/10.1145/3543873.3587306)|Huajun Bai, Davide Liu, Thomas Hirtz, Alexandre Boulenger|Genify, United Arab Emirates; Tsinghua University, China; Genify, China|In this paper, we present a Multi-Task model for Recommendation and Churn prediction (MT) in the retail banking industry. 
The model leverages a hard parameter-sharing framework and consists of a shared multi-stack encoder with multi-head self-attention and two fully connected task heads. It is trained to achieve two multi-class classification tasks: predicting product churn and identifying the next-best products (NBP) for users, individually. Our experiments demonstrate the superiority of the multi-task model compared to its single-task versions, reaching top-1 precision at 78.1% and 77.6%, for churn and NBP prediction respectively. Moreover, we find that the model learns a coherent and expressive high-level representation reflecting user intentions related to both tasks. There is a clear separation between users with acquisitions and users with churn. In addition, acquirers are more tightly clustered compared to the churners. The gradual separability of churning and acquiring users, who diverge in intent, is a desirable property. It provides a basis for model explainability, critical to industry adoption, and also enables other downstream applications. These potential additional benefits, beyond reducing customer attrition and increasing product use–two primary concerns of businesses, make such a model even more valuable.|本文提出了一个零售银行业推荐和流失预测的多任务模型。该模型利用一个硬参数共享框架,由一个具有多头自注意的共享多栈编码器和两个完全连接的任务头组成。它被训练以完成两个多类别的分类任务: 预测产品流失和为用户分别识别次优产品(NBP)。我们的实验证明了多任务模型相对于单任务模型的优越性,在流失预测和 NBP 预测方面分别达到了78.1% 和77.6% 的 Top-1精度。此外,我们发现该模型学习了一个连贯的和表达的高层次表示,反映了与两个任务相关的用户意图。并购用户和流失用户之间有明显的区别。此外,与搅拌器相比,收购者更紧密地聚集在一起。搅动用户和获取用户的逐渐可分性,这是一个可取的特性,因为用户的意图不同。它为模型的可解释性提供了基础,对于工业的采用至关重要,并且还支持其他下游应用程序。这些潜在的额外好处,除了减少客户流失和增加产品使用(企业的两个主要关注点)之外,使得这种模式更加有价值。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Expressive+user+embedding+from+churn+and+recommendation+multi-task+learning)|0| -|[Continual Transfer Learning for Cross-Domain Click-Through Rate Prediction at Taobao](https://doi.org/10.1145/3543873.3584625)|Lixin Liu, Yanling Wang, Tianming Wang, Dong Guan, Jiawei Wu, Jingxu Chen, Rong Xiao, Wenxiang Zhu, Fei Fang|Alibaba Group, China; Renmin University of China, China; Alibaba group, China|As one of the largest e-commerce platforms in the world, Taobao's recommendation systems (RSs) serve the demands of shopping for hundreds of millions of customers. Click-Through Rate (CTR) prediction is a core component of the RS. One of the biggest characteristics in CTR prediction at Taobao is that there exist multiple recommendation domains where the scales of different domains vary significantly. Therefore, it is crucial to perform cross-domain CTR prediction to transfer knowledge from large domains to small domains to alleviate the data sparsity issue. However, existing cross-domain CTR prediction methods are proposed for static knowledge transfer, ignoring that all domains in real-world RSs are continually time-evolving. In light of this, we present a necessary but novel task named Continual Transfer Learning (CTL), which transfers knowledge from a time-evolving source domain to a time-evolving target domain. In this work, we propose a simple and effective CTL model called CTNet to solve the problem of continual cross-domain CTR prediction at Taobao, and CTNet can be trained efficiently. Particularly, CTNet considers an important characteristic in the industry that models has been continually well-trained for a very long time. 
So CTNet aims to fully utilize all the well-trained model parameters in both source domain and target domain to avoid losing historically acquired knowledge, and only needs incremental target domain data for training to guarantee efficiency. Extensive offline experiments and online A/B testing at Taobao demonstrate the efficiency and effectiveness of CTNet. CTNet is now deployed online in the recommender systems of Taobao, serving the main traffic of hundreds of millions of active users.|作为世界上最大的电子商务平台之一,淘宝的推荐系统(RS)为数以亿计的顾客提供购物服务。点进率预测是遥感的核心组成部分。淘宝网点击率预测的最大特点之一是存在多个推荐域,不同域的规模差异很大。因此,进行跨域 CTR 预测,将知识从大域转移到小域,以缓解数据稀疏性问题至关重要。然而,现有的跨域 CTR 预测方法都是针对静态知识转移而提出的,忽略了现实 RSS 中的所有域都是不断时间演化的。鉴于此,我们提出了一个必要的,但新颖的任务称为连续转移学习(CTL) ,它将知识从一个时间演化的源领域转移到一个时间演化的目标领域。本文提出了一种简单有效的 CTL 模型 CTNet 来解决淘宝网连续跨域 CTR 预测问题,可以有效地训练 CTNet。特别是,CTNet 认为模特行业的一个重要特征是模特长期以来一直受到良好的培训。因此,CTNet 的目标是充分利用源域和目标域中所有训练有素的模型参数,避免丢失历史获得的知识,只需要增量的目标域数据进行训练,以保证训练效率。在淘宝上的大量离线实验和在线 A/B 测试证明了 CTNet 的效率和有效性。CTNet 现已部署在淘宝网的推荐系统中,为数亿活跃用户的主要流量提供服务。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Continual+Transfer+Learning+for+Cross-Domain+Click-Through+Rate+Prediction+at+Taobao)|0| +|[Divide and Conquer: Towards Better Embedding-based Retrieval for Recommender Systems from a Multi-task Perspective](https://doi.org/10.1145/3543873.3584629)|Yuan Zhang, Xue Dong, Weijie Ding, Biao Li, Peng Jiang, Kun Gai|Kuaishou Technology, China; Shandong University, China; Unaffiliated, China|Embedding-based retrieval (EBR) methods are widely used in modern recommender systems thanks to its simplicity and effectiveness. However, along the journey of deploying and iterating on EBR in production, we still identify some fundamental issues in existing methods. First, when dealing with large corpus of candidate items, EBR models often have difficulties in balancing the performance on distinguishing highly relevant items (positives) from both irrelevant ones (easy negatives) and from somewhat related yet not competitive ones (hard negatives). Also, we have little control in the diversity and fairness of the retrieval results because of the ``greedy'' nature of nearest vector search. These issues compromise the performance of EBR methods in large-scale industrial scenarios. This paper introduces a simple and proven-in-production solution to overcome these issues. The proposed solution takes a divide-and-conquer approach: the whole set of candidate items are divided into multiple clusters and we run EBR to retrieve relevant candidates from each cluster in parallel; top candidates from each cluster are then combined by some controllable merging strategies. This approach allows our EBR models to only concentrate on discriminating positives from mostly hard negatives. 
It also enables further improvement from a multi-task learning (MTL) perspective: retrieval problems within each cluster can be regarded as individual tasks; inspired by recent successes in prompting and prefix-tuning, we propose an efficient task adaptation technique further boosting the retrieval performance within each cluster with negligible overheads.|基于嵌入的检索(EBR)方法以其简单有效的特点在现代推荐系统中得到了广泛的应用。然而,在生产中部署和迭代 EBR 的过程中,我们仍然发现了现有方法中的一些基本问题。首先,在处理大量候选项目时,EBR 模型往往难以平衡区分高度相关项目(正面)和无关项目(简单负面)以及有些相关但没有竞争性的项目(硬负面)。此外,由于最近邻向量搜索的“贪婪”特性,我们对检索结果的多样性和公平性几乎没有控制。这些问题影响了 EBR 方法在大规模工业场景中的性能。本文介绍了一个简单且已经在生产中得到验证的解决方案来克服这些问题。该解决方案采用分而治之的方法: 将整个候选项集划分为多个集群,并运行 EBR 并行地从每个集群中检索相关候选项; 然后通过一些可控的合并策略将每个集群中的最优候选项集合起来。这种方法允许我们的 EBR 模型只集中于区分正面和大多数硬负面。它还能从多任务学习(MTL)的角度进一步改进: 每个集群中的检索问题可以被视为单个任务; 受最近在提示和前缀调优方面的成功启发,我们提出了一种有效的任务适应技术,进一步提高了每个集群中的检索性能,开销可以忽略不计。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Divide+and+Conquer:+Towards+Better+Embedding-based+Retrieval+for+Recommender+Systems+from+a+Multi-task+Perspective)|0| +|[Expressive user embedding from churn and recommendation multi-task learning](https://doi.org/10.1145/3543873.3587306)|Huajun Bai, Davide Liu, Thomas Hirtz, Alexandre Boulenger|Tsinghua University, China; Genify, United Arab Emirates; Genify, China|In this paper, we present a Multi-Task model for Recommendation and Churn prediction (MT) in the retail banking industry. The model leverages a hard parameter-sharing framework and consists of a shared multi-stack encoder with multi-head self-attention and two fully connected task heads. It is trained to achieve two multi-class classification tasks: predicting product churn and identifying the next-best products (NBP) for users, individually. Our experiments demonstrate the superiority of the multi-task model compared to its single-task versions, reaching top-1 precision at 78.1% and 77.6%, for churn and NBP prediction respectively. Moreover, we find that the model learns a coherent and expressive high-level representation reflecting user intentions related to both tasks. There is a clear separation between users with acquisitions and users with churn. In addition, acquirers are more tightly clustered compared to the churners. The gradual separability of churning and acquiring users, who diverge in intent, is a desirable property. It provides a basis for model explainability, critical to industry adoption, and also enables other downstream applications.
These potential additional benefits, beyond reducing customer attrition and increasing product use–two primary concerns of businesses, make such a model even more valuable.|本文提出了一个零售银行业推荐和流失预测的多任务模型。该模型利用一个硬参数共享框架,由一个具有多头自注意的共享多栈编码器和两个完全连接的任务头组成。它被训练以分别完成两个多类别的分类任务: 预测产品流失和为用户识别下一个最佳产品(NBP)。我们的实验证明了多任务模型相对于单任务模型的优越性,在流失预测和 NBP 预测方面分别达到了78.1% 和77.6% 的 Top-1精度。此外,我们发现该模型学习了一个连贯且富有表达力的高层次表示,反映了与两个任务相关的用户意图。有获取行为的用户和流失用户之间有明显的区别。此外,与流失用户相比,获取用户聚集得更紧密。流失用户和获取用户在意图上彼此分化,二者的逐渐可分性是一个可取的特性。它为模型的可解释性提供了基础,对于工业的采用至关重要,并且还支持其他下游应用程序。这些潜在的额外好处,除了减少客户流失和增加产品使用(企业的两个主要关注点)之外,使得这种模式更加有价值。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Expressive+user+embedding+from+churn+and+recommendation+multi-task+learning)|0| +|[Continual Transfer Learning for Cross-Domain Click-Through Rate Prediction at Taobao](https://doi.org/10.1145/3543873.3584625)|Lixin Liu, Yanling Wang, Tianming Wang, Dong Guan, Jiawei Wu, Jingxu Chen, Rong Xiao, Wenxiang Zhu, Fei Fang|Alibaba group, China; Renmin University of China, China; Alibaba Group, China|As one of the largest e-commerce platforms in the world, Taobao's recommendation systems (RSs) serve the demands of shopping for hundreds of millions of customers. Click-Through Rate (CTR) prediction is a core component of the RS. One of the biggest characteristics in CTR prediction at Taobao is that there exist multiple recommendation domains where the scales of different domains vary significantly. Therefore, it is crucial to perform cross-domain CTR prediction to transfer knowledge from large domains to small domains to alleviate the data sparsity issue. However, existing cross-domain CTR prediction methods are proposed for static knowledge transfer, ignoring that all domains in real-world RSs are continually time-evolving. In light of this, we present a necessary but novel task named Continual Transfer Learning (CTL), which transfers knowledge from a time-evolving source domain to a time-evolving target domain. In this work, we propose a simple and effective CTL model called CTNet to solve the problem of continual cross-domain CTR prediction at Taobao, and CTNet can be trained efficiently. Particularly, CTNet considers an important characteristic in the industry that models have been continually well-trained for a very long time. So CTNet aims to fully utilize all the well-trained model parameters in both source domain and target domain to avoid losing historically acquired knowledge, and only needs incremental target domain data for training to guarantee efficiency. Extensive offline experiments and online A/B testing at Taobao demonstrate the efficiency and effectiveness of CTNet.
CTNet is now deployed online in the recommender systems of Taobao, serving the main traffic of hundreds of millions of active users.|作为世界上最大的电子商务平台之一,淘宝的推荐系统(RS)为数以亿计的顾客提供购物服务。点击率(CTR)预测是推荐系统的核心组成部分。淘宝网点击率预测的最大特点之一是存在多个推荐域,不同域的规模差异很大。因此,进行跨域 CTR 预测,将知识从大域转移到小域,以缓解数据稀疏性问题至关重要。然而,现有的跨域 CTR 预测方法都是针对静态知识转移而提出的,忽略了现实世界推荐系统中的所有域都在不断地随时间演化。鉴于此,我们提出了一个必要但新颖的任务,称为连续迁移学习(CTL),它将知识从一个时间演化的源领域转移到一个时间演化的目标领域。本文提出了一种简单有效的 CTL 模型 CTNet 来解决淘宝网连续跨域 CTR 预测问题,可以有效地训练 CTNet。特别是,CTNet 考虑了工业界的一个重要特点,即模型长期以来一直在被持续、充分地训练。因此,CTNet 的目标是充分利用源域和目标域中所有训练有素的模型参数,避免丢失历史获得的知识,只需要增量的目标域数据进行训练,以保证训练效率。在淘宝上的大量离线实验和在线 A/B 测试证明了 CTNet 的效率和有效性。CTNet 现已部署在淘宝网的推荐系统中,为数亿活跃用户的主要流量提供服务。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Continual+Transfer+Learning+for+Cross-Domain+Click-Through+Rate+Prediction+at+Taobao)|0| |[MAKE: Vision-Language Pre-training based Product Retrieval in Taobao Search](https://doi.org/10.1145/3543873.3584627)|Xiaoyang Zheng, Zilong Wang, Sen Li, Ke Xu, Tao Zhuang, Qingwen Liu, Xiaoyi Zeng|; Alibaba Group, China|Taobao Search consists of two phases: the retrieval phase and the ranking phase. Given a user query, the retrieval phase returns a subset of candidate products for the following ranking phase. Recently, the paradigm of pre-training and fine-tuning has shown its potential in incorporating visual clues into retrieval tasks. In this paper, we focus on solving the problem of text-to-multimodal retrieval in Taobao Search. We consider that users' attention on titles or images varies on products. Hence, we propose a novel Modal Adaptation module for cross-modal fusion, which helps assign appropriate weights on texts and images across products. Furthermore, in e-commerce search, user queries tend to be brief and thus lead to significant semantic imbalance between user queries and product titles. Therefore, we design a separate text encoder and a Keyword Enhancement mechanism to enrich the query representations and improve text-to-multimodal matching. To this end, we present a novel vision-language (V+L) pre-training method to exploit the multimodal information of (user query, product title, product image). Extensive experiments demonstrate that our retrieval-specific pre-training model (referred to as MAKE) outperforms existing V+L pre-training methods on the text-to-multimodal retrieval task. MAKE has been deployed online and brings major improvements on the retrieval system of Taobao Search.|淘宝搜索包括两个阶段: 检索阶段和排名阶段。给定一个用户查询,检索阶段返回下一个排序阶段的候选产品的子集。最近,预先训练和微调的范式已经显示了其在将视觉线索纳入检索任务方面的潜力。本文主要研究淘宝搜索中文本到多模式检索的问题。我们认为用户对标题或图片的关注因产品而异。因此,我们提出了一个新的用于跨模态融合的模态适应模块,它有助于在不同产品上为文本和图像分配适当的权重。此外,在电子商务搜索中,用户查询往往是简短的,从而导致用户查询和产品标题之间的语义严重失衡。因此,我们设计了一个单独的文本编码器和一个关键字增强机制,以丰富查询表示和改善文本到多模式匹配。为此,我们提出了一种新的视觉语言(V + L)预训练方法来利用多模态信息(用户查询、产品标题、产品图像)。大量的实验表明,我们的检索特定的预训练模型(简称 MAKE)在文本到多模态检索任务上优于现有的 V + L 预训练方法。MAKE 已经在线部署,并对淘宝搜索的检索系统进行了重大改进。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=MAKE:+Vision-Language+Pre-training+based+Product+Retrieval+in+Taobao+Search)|0| -|[HAPENS: Hardness-Personalized Negative Sampling for Implicit Collaborative Filtering](https://doi.org/10.1145/3543873.3584631)|Haoxin Liu, Pu Zhao, Si Qin, Yong Shi, Mirror Xu, Qingwei Lin, Dongmei Zhang|Microsoft Bing, China; Microsoft Research, China|For training implicit collaborative filtering (ICF) models, hard negative sampling (HNS) has become a state-of-the-art solution for obtaining negative signals from massive uninteracted items.
However, selecting appropriate hardness levels for personalized recommendations remains a fundamental, yet underexplored, problem. Previous HNS works have primarily adjusted the hardness level by tuning a single hyperparameter. However, applying the same hardness level to each user is unsuitable due to varying user behavioral characteristics, the quantity and quality of user records, and different consistencies of models’ inductive biases. Moreover, increasing the number of hyperparameters is not practical due to the massive number of users. To address this important and challenging problem, we propose a model-agnostic and practical approach called hardness-personalized negative sampling (HAPENS). HAPENS uses a two-stage approach: in stage one, it trains the ICF model with a customized objective function that optimizes its worst performance on each user’s interacted item set. In stage two, it utilizes these worst performances as personalized hardness levels with a well-designed sampling distribution, and trains the final model with the same architecture. We evaluated HAPENS on the collected Bing advertising dataset and one public dataset, and the comprehensive experimental results demonstrate its robustness and superiority. Moreover, HAPENS has delivered significant benefits to the Bing advertising system. To the best of our knowledge, we are the first to study this important and challenging problem.|对于训练内隐协同过滤模型(ICF) ,硬负采样(hNS)已成为从大量未交互项目中获取负信号的最新解决方案。然而,为个性化推荐选择合适的硬度水平仍然是一个基本的、尚未得到充分探索的问题。以往的 HNS 工作主要是通过调整单个超参数来调整硬度水平。然而,由于不同的用户行为特征、用户记录的数量和质量以及模型归纳偏差的不同一致性,对每个用户应用相同的硬度水平是不合适的。此外,由于用户数量庞大,增加超参数的数量是不切实际的。为了解决这一重要而具有挑战性的问题,我们提出了一种模型不可知的实用方法,称为硬度个性化阴性采样(HAPENS)。HAPENS 使用两阶段的方法: 在第一阶段,它使用一个定制的目标函数来训练 ICF 模型,该目标函数在每个用户的交互项集上优化其最差的性能。在第二阶段,它利用这些最差的性能作为个性化的硬度水平,具有设计良好的采样分布,并训练最终模型具有相同的架构。我们对搜集到的 Bing 广告数据集和一个公共数据集进行了 HAPENS 评估,综合实验结果表明了 HAPENS 的鲁棒性和优越性。此外,HAPENS 为必应广告系统带来了巨大的好处。据我们所知,我们是第一个研究这个重要而富有挑战性的问题的。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=HAPENS:+Hardness-Personalized+Negative+Sampling+for+Implicit+Collaborative+Filtering)|0| +|[HAPENS: Hardness-Personalized Negative Sampling for Implicit Collaborative Filtering](https://doi.org/10.1145/3543873.3584631)|Haoxin Liu, Pu Zhao, Si Qin, Yong Shi, Mirror Xu, Qingwei Lin, Dongmei Zhang|Microsoft Research, China; Microsoft Bing, China|For training implicit collaborative filtering (ICF) models, hard negative sampling (HNS) has become a state-of-the-art solution for obtaining negative signals from massive uninteracted items. However, selecting appropriate hardness levels for personalized recommendations remains a fundamental, yet underexplored, problem. Previous HNS works have primarily adjusted the hardness level by tuning a single hyperparameter. However, applying the same hardness level to each user is unsuitable due to varying user behavioral characteristics, the quantity and quality of user records, and different consistencies of models’ inductive biases. Moreover, increasing the number of hyperparameters is not practical due to the massive number of users. To address this important and challenging problem, we propose a model-agnostic and practical approach called hardness-personalized negative sampling (HAPENS). HAPENS uses a two-stage approach: in stage one, it trains the ICF model with a customized objective function that optimizes its worst performance on each user’s interacted item set. 
In stage two, it utilizes these worst performances as personalized hardness levels with a well-designed sampling distribution, and trains the final model with the same architecture. We evaluated HAPENS on the collected Bing advertising dataset and one public dataset, and the comprehensive experimental results demonstrate its robustness and superiority. Moreover, HAPENS has delivered significant benefits to the Bing advertising system. To the best of our knowledge, we are the first to study this important and challenging problem.|对于训练隐式协同过滤(ICF)模型,硬负采样(HNS)已成为从大量未交互项目中获取负信号的最新解决方案。然而,为个性化推荐选择合适的硬度水平仍然是一个基本的、尚未得到充分探索的问题。以往的 HNS 工作主要是通过调整单个超参数来调整硬度水平。然而,由于不同的用户行为特征、用户记录的数量和质量以及模型归纳偏差的不同一致性,对每个用户应用相同的硬度水平是不合适的。此外,由于用户数量庞大,增加超参数的数量是不切实际的。为了解决这一重要而具有挑战性的问题,我们提出了一种模型不可知的实用方法,称为硬度个性化负采样(HAPENS)。HAPENS 使用两阶段的方法: 在第一阶段,它使用一个定制的目标函数来训练 ICF 模型,该目标函数在每个用户的交互项集上优化其最差的性能。在第二阶段,它将这些最差性能作为个性化的硬度水平,配合精心设计的采样分布,训练具有相同架构的最终模型。我们对搜集到的 Bing 广告数据集和一个公共数据集进行了 HAPENS 评估,综合实验结果表明了 HAPENS 的鲁棒性和优越性。此外,HAPENS 为必应广告系统带来了巨大的好处。据我们所知,我们是第一个研究这个重要而富有挑战性的问题的。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=HAPENS:+Hardness-Personalized+Negative+Sampling+for+Implicit+Collaborative+Filtering)|0| |[Que2Engage: Embedding-based Retrieval for Relevant and Engaging Products at Facebook Marketplace](https://doi.org/10.1145/3543873.3584633)|Yunzhong He, Yuxin Tian, Mengjiao Wang, Feier Chen, Licheng Yu, Maolong Tang, Congcong Chen, Ning Zhang, Bin Kuang, Arul Prakash|University of California, Merced, USA; Meta, USA|Embedding-based Retrieval (EBR) in e-commerce search is a powerful search retrieval technique to address semantic matches between search queries and products. However, commercial search engines like Facebook Marketplace Search are complex multi-stage systems optimized for multiple business objectives. At Facebook Marketplace, search retrieval focuses on matching search queries with relevant products, while search ranking puts more emphasis on contextual signals to up-rank the more engaging products. As a result, the end-to-end searcher experience is a function of both relevance and engagement, and the interaction between different stages of the system. This presents challenges to EBR systems in order to optimize for better searcher experiences. In this paper we present Que2Engage, a search EBR system built towards bridging the gap between retrieval and ranking for end-to-end optimizations. Que2Engage takes a multimodal & multitask approach to infuse contextual information into the retrieval stage and to balance different business objectives. We show the effectiveness of our approach via a multitask evaluation framework and thorough baseline comparisons and ablation studies.
Que2Engage is deployed on Facebook Marketplace Search and shows significant improvements in searcher engagement in two weeks of A/B testing.|电子商务搜索中的嵌入式检索(EBR)是解决搜索查询与产品之间语义匹配的一种强有力的检索技术。然而,像 Facebook Marketplace Search 这样的商业搜索引擎是为多个业务目标而优化的复杂的多阶段系统。在 Facebook Marketplace,搜索检索侧重于将搜索查询与相关产品进行匹配,而搜索排名更侧重于上下文信号,以提升更具吸引力的产品的排名。因此,端到端的搜索体验是相关性和参与度的函数,以及系统不同阶段之间的相互作用。这对 EBR 系统提出了挑战,以便优化更好的搜索体验。本文介绍了 Que2Engage,这是一个搜索 EBR 系统,旨在弥合检索和排序之间的差距,以实现端到端的优化。Que2Engage 采用多模态和多任务的方法将上下文信息注入检索阶段,并平衡不同的业务目标。我们通过一个多任务评估框架和彻底的基线比较和消融研究来展示我们的方法的有效性。Que2Engage 部署在 Facebook Marketplace Search 上,并在两周的 A/B 测试中显示出搜索者参与度的显著改善。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Que2Engage:+Embedding-based+Retrieval+for+Relevant+and+Engaging+Products+at+Facebook+Marketplace)|0| |[Learning Multi-Stage Multi-Grained Semantic Embeddings for E-Commerce Search](https://doi.org/10.1145/3543873.3584638)|Binbin Wang, Mingming Li, Zhixiong Zeng, Jingwei Zhuo, Songlin Wang, Sulong Xu, Bo Long, Weipeng Yan|JD.com, China|Retrieving relevant items that match users' queries from billion-scale corpus forms the core of industrial e-commerce search systems, in which embedding-based retrieval (EBR) methods are prevailing. These methods adopt a two-tower framework to learn embedding vectors for query and item separately and thus leverage efficient approximate nearest neighbor (ANN) search to retrieve relevant items. However, existing EBR methods usually ignore inconsistent user behaviors in industrial multi-stage search systems, resulting in insufficient retrieval efficiency with a low commercial return. To tackle this challenge, we propose to improve EBR methods by learning Multi-level Multi-Grained Semantic Embeddings(MMSE). We propose the multi-stage information mining to exploit the ordered, clicked, unclicked and random sampled items in practical user behavior data, and then capture query-item similarity via a post-fusion strategy. We then propose multi-grained learning objectives that integrate the retrieval loss with global comparison ability and the ranking loss with local comparison ability to generate semantic embeddings. Both experiments on a real-world billion-scale dataset and online A/B tests verify the effectiveness of MMSE in achieving significant performance improvements on metrics such as offline recall and online conversion rate (CVR).|基于嵌入式检索(EBR)方法是工业电子商务搜索系统的核心,它可以从数十亿规模的语料库中检索出与用户查询相匹配的相关项目。这些方法采用双塔架构,分别学习查询和项目的嵌入向量,从而利用有效的近似最近邻(ANN)搜索来检索相关项目。然而,现有的 EBR 方法往往忽略了工业多阶段搜索系统中不一致的用户行为,导致检索效率不足,商业收益较低。为了解决这一问题,我们提出通过学习多级多粒度语义嵌入(MMSE)来改进 EBR 方法。提出了一种基于多阶段信息挖掘的方法,利用实际用户行为数据中的有序、点击、未点击和随机抽样条目,通过后融合策略获取查询条目的相似性。然后提出多粒度学习目标,将检索损失与全局比较能力、排序损失与局部比较能力相结合,生成语义嵌入。在真实世界的十亿级数据集上的实验和在线 A/B 测试都验证了 MMSE 在离线召回率和在线转换率(CVR)等指标上实现显著性能改进的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+Multi-Stage+Multi-Grained+Semantic+Embeddings+for+E-Commerce+Search)|0| |[CAM2: Conformity-Aware Multi-Task Ranking Model for Large-Scale Recommender Systems](https://doi.org/10.1145/3543873.3584657)|Ameya Raul, Amey Porobo Dharwadker, Brad Schumitsch|Meta Inc., USA|Learning large-scale industrial recommender system models by fitting them to historical user interaction data makes them vulnerable to conformity bias. This may be due to a number of factors, including the fact that user interests may be difficult to determine and that many items are often interacted with based on ecosystem factors other than their relevance to the individual user. 
In this work, we introduce CAM2, a conformity-aware multi-task ranking model to serve relevant items to users on one of the largest industrial recommendation platforms. CAM2 addresses these challenges systematically by leveraging causal modeling to disentangle users' conformity to popular items from their true interests. This framework is generalizable and can be scaled to support multiple representations of conformity and user relevance in any large-scale recommender system. We provide deeper practical insights and demonstrate the effectiveness of the proposed model through improvements in offline evaluation metrics compared to our production multi-task ranking model. We also show through online experiments that the CAM2 model results in a significant 0.50% increase in aggregated user engagement, coupled with a 0.21% increase in daily active users on Facebook Watch, a popular video discovery and sharing platform serving billions of users.|通过将大规模工业推荐系统模型与历史用户交互数据进行拟合,使其容易受到一致性偏差的影响。这可能是由于若干因素,包括用户的兴趣可能难以确定,而且许多项目往往基于生态系统因素而不是它们与个别用户的相关性进行交互。在这项工作中,我们介绍了 CAM2,一个一致性感知(conformity-aware)的多任务排序模型,用于在一个最大的工业推荐平台上为用户提供相关项目。CAM2通过利用因果建模系统地解决这些挑战,从用户的真实兴趣中分离出用户对流行项目的一致性。这个框架是可以推广的,可以扩展到支持任何大规模推荐系统中的多种一致性和用户相关性的表示。我们提供了更深入的实践见解,并通过与我们生产环境中的多任务排序模型相比在离线评估指标上的提升,证明了该模型的有效性。我们还通过在线实验表明,CAM2模型显著增加了0.50% 的聚合用户参与度,同时 Facebook Watch 的日常活跃用户增加了0.21% 。 Facebook Watch 是一个流行的视频发现和分享平台,服务于数十亿用户。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=CAM2:+Conformity-Aware+Multi-Task+Ranking+Model+for+Large-Scale+Recommender+Systems)|0| |[A Deep Behavior Path Matching Network for Click-Through Rate Prediction](https://doi.org/10.1145/3543873.3584662)|Jian Dong, Yisong Yu, Yapeng Zhang, Yimin Lv, Shuli Wang, Beihong Jin, Yongkang Wang, Xingxing Wang, Dong Wang|Meituan Ltd., China; Institute of Software, Chinese Academy of Sciences, China and University of Chinese Academy of Sciences, China; Meituan Ltd, China|User behaviors on an e-commerce app not only contain different kinds of feedback on items but also sometimes imply the cognitive clue of the user's decision-making. For understanding the psychological procedure behind user decisions, we present the behavior path and propose to match the user's current behavior path with historical behavior paths to predict user behaviors on the app. Further, we design a deep neural network for behavior path matching and solve three difficulties in modeling behavior paths: sparsity, noise interference, and accurate matching of behavior paths. In particular, we leverage contrastive learning to augment user behavior paths, provide behavior path self-activation to alleviate the effect of noise, and adopt a two-level matching mechanism to identify the most appropriate candidate. Our model shows excellent performance on two real-world datasets, outperforming the state-of-the-art CTR model.
Moreover, our model has been deployed on the Meituan food delivery platform and has accumulated 1.6% improvement in CTR and 1.8% improvement in advertising revenue.|用户在电子商务应用程序上的行为不仅包含对项目的不同类型的反馈,而且有时还意味着用户决策的认知线索。为了理解用户决策背后的心理过程,我们提出了行为路径,并建议匹配用户的当前行为路径和历史行为路径,以预测用户在应用程序上的行为。进一步,我们设计了一个用于行为路径匹配的深层神经网络,解决了行为路径建模中的三个难点: 稀疏性、噪声干扰和行为路径的精确匹配。特别地,我们利用对比学习来增强用户的行为路径,提供行为路径自激活来减轻噪声的影响,并采用两级匹配机制来确定最合适的候选者。我们的模型在两个真实世界的数据集上显示了出色的性能,优于最先进的 CTR 模型。此外,我们的模型已经部署在美团食品配送平台上,点击率累计提高了1.6% ,广告收入累计提高了1.8% 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Deep+Behavior+Path+Matching+Network+for+Click-Through+Rate+Prediction)|0| |[Cross-lingual Search for e-Commerce based on Query Translatability and Mixed-Domain Fine-Tuning](https://doi.org/10.1145/3543873.3587660)|Jesus PerezMartin, Jorge GomezRobles, Asier GutiérrezFandiño, Pankaj Adsul, Sravanthi Rajanala, Leonardo Lezcano||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Cross-lingual+Search+for+e-Commerce+based+on+Query+Translatability+and+Mixed-Domain+Fine-Tuning)|0| -|[Enhancing User Personalization in Conversational Recommenders](https://doi.org/10.1145/3543507.3583192)|Allen Lin, Ziwei Zhu, Jianling Wang, James Caverlee|Texas A&M University, USA; George Mason University, USA|Conversational recommenders are emerging as a powerful tool to personalize a user's recommendation experience. Through a back-and-forth dialogue, users can quickly hone in on just the right items. Many approaches to conversational recommendation, however, only partially explore the user preference space and make limiting assumptions about how user feedback can be best incorporated, resulting in long dialogues and poor recommendation performance. In this paper, we propose a novel conversational recommendation framework with two unique features: (i) a greedy NDCG attribute selector, to enhance user personalization in the interactive preference elicitation process by prioritizing attributes that most effectively represent the actual preference space of the user; and (ii) a user representation refiner, to effectively fuse together the user preferences collected from the interactive elicitation process to obtain a more personalized understanding of the user. Through extensive experiments on four frequently used datasets, we find the proposed framework not only outperforms all the state-of-the-art conversational recommenders (in terms of both recommendation performance and conversation efficiency), but also provides a more personalized experience for the user under the proposed multi-groundtruth multi-round conversational recommendation setting.|对话式推荐正在成为个性化用户推荐体验的强大工具。通过反复的对话,用户可以快速找到正确的项目。然而,许多会话推荐方法只是部分地探索了用户偏好空间,并对如何最好地整合用户反馈进行了有限的假设,导致了冗长的对话和糟糕的推荐性能。本文提出了一种新的会话推荐框架,该框架具有两个独特的特征: (1)贪婪的 NDCG 属性选择器,通过对最有效地表示用户实际偏好空间的属性进行优先排序,增强交互式偏好启发过程中的用户个性化; (2)用户表示细化器,有效地融合交互式启发过程中收集到的用户偏好,以获得对用户更个性化的理解。通过对四个常用数据集的大量实验,我们发现该框架不仅在推荐性能和会话效率方面优于所有最先进的会话推荐器,而且在提出的多地面真相多轮会话推荐设置下为用户提供了更加个性化的体验。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Enhancing+User+Personalization+in+Conversational+Recommenders)|0| -|[Dual-interest Factorization-heads Attention for Sequential Recommendation](https://doi.org/10.1145/3543507.3583278)|Guanyu Lin, Chen Gao, Yu Zheng, Jianxin Chang, Yanan Niu, Yang Song, Zhiheng Li, Depeng Jin, Yong Li|Department of Electronic Engineering, Tsinghua University, China; kuaishou, China; Tsinghua University, China|Accurate user interest modeling is vital for recommendation scenarios. 
One of the effective solutions is the sequential recommendation that relies on click behaviors, but this is not elegant in the video feed recommendation where users are passive in receiving the streaming contents and return skip or no-skip behaviors. Here skip and no-skip behaviors can be treated as negative and positive feedback, respectively. With the mixture of positive and negative feedback, it is challenging to capture the transition pattern of behavioral sequence. To do so, FeedRec has exploited a shared vanilla Transformer, which may be inelegant because head interaction of multi-heads attention does not consider different types of feedback. In this paper, we propose Dual-interest Factorization-heads Attention for Sequential Recommendation (short for DFAR) consisting of feedback-aware encoding layer, dual-interest disentangling layer and prediction layer. In the feedback-aware encoding layer, we first suppose each head of multi-heads attention can capture specific feedback relations. Then we further propose factorization-heads attention which can mask specific head interaction and inject feedback information so as to factorize the relation between different types of feedback. Additionally, we propose a dual-interest disentangling layer to decouple positive and negative interests before performing disentanglement on their representations. Finally, we evolve the positive and negative interests by corresponding towers whose outputs are contrastive by BPR loss. Experiments on two real-world datasets show the superiority of our proposed method against state-of-the-art baselines. Further ablation study and visualization also sustain its effectiveness. We release the source code here: https://github.com/tsinghua-fib-lab/WWW2023-DFAR.|准确的用户兴趣建模对于推荐场景至关重要。其中一个有效的解决方案是依赖于点击行为的顺序推荐,但是在视频提要推荐中这并不优雅,因为用户在接收流内容和返回跳过或不跳过行为时是被动的。在这里,跳过和不跳过行为可以分别视为负反馈和正反馈。由于正反馈和负反馈的混合,捕捉行为序列的转换模式具有挑战性。为此,FeedRec 利用了一个共享的香草变压器,这可能是不雅的,因为多头注意的头部交互没有考虑不同类型的反馈。本文提出了由反馈感知编码层、双兴趣分解层和预测层组成的双兴趣分解顺序推荐系统。在反馈感知编码层,我们首先假设多头注意的每个头都能捕获特定的反馈关系。然后进一步提出因子分解-头注意,它可以掩盖特定的头交互,并注入反馈信息,从而对不同类型的反馈之间的关系进行因子分解。此外,我们提出了一个双利益解缠层,以解耦正面和负面的利益之前,执行解缠的表示。最后,我们通过相应的塔进行正负利益演化,其输出由于业务流程重组损失而具有对比性。在两个实际数据集上的实验表明了我们提出的方法对最先进的基线的优越性。进一步的消融研究和可视化也支持其有效性。我们在这里发布源代码: https://github.com/tsinghua-fib-lab/www2023-dfar。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Dual-interest+Factorization-heads+Attention+for+Sequential+Recommendation)|0| -|[A Cross-Media Retrieval System for Web-SNS-Map Using Suggested Keywords Generating and Ranking Method Based on Search Characteristics](https://doi.org/10.1145/3543873.3587344)|Da Li, Masaki Sugihashi, Tadahiko Kumamoto, Yukiko Kawai|Chiba Institute of Technology, Japan; Fukuoka University, Japan; Kyoto Sangyo University, Japan|The research on multimedia retrieval has lasted for several decades. However, past efforts generally focused on single-media retrieval, where the queries and retrieval results belong to the same media (platform) type, such as social media platforms or search engines. In single-media retrieval, users have to select search media or options based on search characteristics such as contents, time, or spatial distance, they might be unable to retrieve correct results mixed in other media if they carelessly forget to select. In this study, we propose a cross-media retrieval system using suggestion generation methods to integrate three search characteristics of the Web (textual content-based retrieval), SNS (timeliness), and map (spatial distance-aware retrieval). 
In our previous research, we attempted to improve search efficiency using clustering methods to provide search results to users through related terms, etc. In this paper, we focus on the search efficiency of multiple search media. We utilize Google search engine to obtain the retrieval content from the Web, Twitter to obtain timely information from SNSs, and Google Maps to get geographical information from maps. We apply the obtained retrieval results to analyze the similarities between them by clustering. Then, we generate relevant suggestions and provide them to users. Moreover, we validate the effectiveness of the search results generated by our proposed system.|多媒体检索的研究已经持续了几十年。然而,过去的努力通常集中在单媒体检索,其中查询和检索结果属于相同的媒体(平台)类型,如社会媒体平台或搜索引擎。在单媒体检索中,用户必须根据内容、时间或空间距离等搜索特征选择搜索媒体或选项,如果不小心忘记选择,可能无法检索混合在其他媒体中的正确结果。在本研究中,我们提出一个跨媒体检索系统,利用建议产生的方法来整合网页(文本内容检索)、 SNS (及时性)和地图(空间距离感知检索)的三个搜索特性。在我们以前的研究中,我们尝试使用聚类方法来提高搜索效率,通过相关词汇等为用户提供搜索结果。本文主要研究多种搜索媒体的搜索效率。我们利用谷歌搜索引擎从网络中获取检索内容,利用 Twitter 从 SNS 中获取及时信息,利用谷歌地图从地图中获取地理信息。我们应用所得到的检索结果,通过聚类分析它们之间的相似性。然后,我们生成相关的建议并提供给用户。此外,我们还验证了该系统所产生的搜索结果的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Cross-Media+Retrieval+System+for+Web-SNS-Map+Using+Suggested+Keywords+Generating+and+Ranking+Method+Based+on+Search+Characteristics)|0| -|[A Knowledge Enhanced Hierarchical Fusion Network for CTR Prediction under Account Search Scenario in WeChat](https://doi.org/10.1145/3543873.3584650)|Yuanzhou Yao, Zhao Zhang, Kaijia Yang, Huasheng Liang, Qiang Yan, Fuzheng Zhuang, Yongjun Xu, Boyu Diao, Chao Li|WeChat, Tencent, China; Institute of Computing Technology, Chinese Academy of Sciences, China; Institute of Artificial Intelligence, Beihang University, China; Zhejiang Lab, China|Click-through rate (CTR) estimation plays as a pivotal function module in various online services. Previous studies mainly apply CTR models to the field of recommendation or online advertisement. Indeed, CTR is also critical in information retrieval, since the CTR probability can serve as a valuable feature for a query-document pair. In this paper, we study the CTR task under account search scenario in WeChat, where users search official accounts or mini programs corresponding to an organization. Despite the large number of CTR models, directly applying them to our task is inappropriate since the account retrieval task has a number of specific characteristics. E.g., different from traditional user-centric CTR models, in our task, CTR prediction is query-centric and does not model user information. In addition, queries and accounts are short texts, and heavily rely on prior knowledge and semantic understanding. These characteristics require us to specially design a CTR model for the task. To this end, we propose a novel CTR prediction model named Knowledge eNhanced hIerarchical Fusion nEtwork (KNIFE). Specifically, to tackle the prior information problem, we mine the knowledge graph of accounts as side information; to enhance the representations of queries, we construct a bipartite graph for queries and accounts. In addition, a hierarchical network structure is proposed to fuse the representations of different information in a fine-grained manner. Finally, the representations of queries and accounts are obtained from this hierarchical network and fed into the CTR model together with other features for prediction. We conduct extensive experiments against 12 existing models across two industrial datasets. 
Both offline and online A/B test results indicate the effectiveness of KNIFE.|在各种网上服务中,点进率评估是一个关键的功能模块。以往的研究主要将点击率模型应用于推荐或在线广告领域。实际上,点击率在信息检索中也很关键,因为点击率可以作为查询-文档对的一个有价值的特性。本文研究了微信中用户搜索官方账号或与组织对应的小程序的帐号搜索情景下的点击率任务。尽管有大量的点击率检索模型,但由于账户检索任务具有许多特殊性,直接将其应用于我们的任务是不合适的。例如,与传统的以用户为中心的 CTR 模型不同,在我们的任务中,CTR 预测是以查询为中心的,不对用户信息建模。此外,查询和帐户是简短的文本,并且严重依赖于先前的知识和语义理解。这些特性要求我们为任务专门设计一个 CTR 模型。为此,我们提出了一种新的 CTR 预测模型——知识增强分层融合网络(KNIFE)。具体来说,为了解决先验信息问题,我们挖掘帐户的知识图作为边信息; 为了增强查询的表示,我们为查询和帐户构造一个二分图。此外,提出了一种分层网络结构,以细粒度的方式融合不同信息的表示。最后,从这个层次网络中获得查询和帐户的表示,并将其与其他用于预测的特征一起反馈到 CTR 模型中。我们对两个工业数据集中的12个现有模型进行了广泛的实验。离线和在线 A/B 测试结果均表明了 KNIFE 的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Knowledge+Enhanced+Hierarchical+Fusion+Network+for+CTR+Prediction+under+Account+Search+Scenario+in+WeChat)|0| +|[Enhancing User Personalization in Conversational Recommenders](https://doi.org/10.1145/3543507.3583192)|Allen Lin, Ziwei Zhu, Jianling Wang, James Caverlee|George Mason University, USA; Texas A&M University, USA|Conversational recommenders are emerging as a powerful tool to personalize a user's recommendation experience. Through a back-and-forth dialogue, users can quickly hone in on just the right items. Many approaches to conversational recommendation, however, only partially explore the user preference space and make limiting assumptions about how user feedback can be best incorporated, resulting in long dialogues and poor recommendation performance. In this paper, we propose a novel conversational recommendation framework with two unique features: (i) a greedy NDCG attribute selector, to enhance user personalization in the interactive preference elicitation process by prioritizing attributes that most effectively represent the actual preference space of the user; and (ii) a user representation refiner, to effectively fuse together the user preferences collected from the interactive elicitation process to obtain a more personalized understanding of the user. Through extensive experiments on four frequently used datasets, we find the proposed framework not only outperforms all the state-of-the-art conversational recommenders (in terms of both recommendation performance and conversation efficiency), but also provides a more personalized experience for the user under the proposed multi-groundtruth multi-round conversational recommendation setting.|对话式推荐正在成为个性化用户推荐体验的强大工具。通过反复的对话,用户可以快速找到正确的项目。然而,许多会话推荐方法只是部分地探索了用户偏好空间,并对如何最好地整合用户反馈进行了有限的假设,导致了冗长的对话和糟糕的推荐性能。本文提出了一种新的会话推荐框架,该框架具有两个独特的特征: (1)贪婪的 NDCG 属性选择器,通过对最有效地表示用户实际偏好空间的属性进行优先排序,增强交互式偏好启发过程中的用户个性化; (2)用户表示细化器,有效地融合交互式启发过程中收集到的用户偏好,以获得对用户更个性化的理解。通过对四个常用数据集的大量实验,我们发现该框架不仅在推荐性能和会话效率方面优于所有最先进的会话推荐器,而且在提出的多地面真相多轮会话推荐设置下为用户提供了更加个性化的体验。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Enhancing+User+Personalization+in+Conversational+Recommenders)|0| +|[Dual-interest Factorization-heads Attention for Sequential Recommendation](https://doi.org/10.1145/3543507.3583278)|Guanyu Lin, Chen Gao, Yu Zheng, Jianxin Chang, Yanan Niu, Yang Song, Zhiheng Li, Depeng Jin, Yong Li|Tsinghua University, China; kuaishou, China; Department of Electronic Engineering, Tsinghua University, China|Accurate user interest modeling is vital for recommendation scenarios. One of the effective solutions is the sequential recommendation that relies on click behaviors, but this is not elegant in the video feed recommendation where users are passive in receiving the streaming contents and return skip or no-skip behaviors. 
Here skip and no-skip behaviors can be treated as negative and positive feedback, respectively. With the mixture of positive and negative feedback, it is challenging to capture the transition pattern of the behavioral sequence. To do so, FeedRec has exploited a shared vanilla Transformer, which may be inelegant because the head interaction of multi-heads attention does not consider different types of feedback. In this paper, we propose Dual-interest Factorization-heads Attention for Sequential Recommendation (short for DFAR) consisting of a feedback-aware encoding layer, a dual-interest disentangling layer and a prediction layer. In the feedback-aware encoding layer, we first suppose each head of multi-heads attention can capture specific feedback relations. Then we further propose factorization-heads attention which can mask specific head interaction and inject feedback information so as to factorize the relation between different types of feedback. Additionally, we propose a dual-interest disentangling layer to decouple positive and negative interests before performing disentanglement on their representations. Finally, we evolve the positive and negative interests by corresponding towers whose outputs are contrasted via a BPR loss. Experiments on two real-world datasets show the superiority of our proposed method against state-of-the-art baselines. A further ablation study and visualization also support its effectiveness. We release the source code here: https://github.com/tsinghua-fib-lab/WWW2023-DFAR.|准确的用户兴趣建模对于推荐场景至关重要。其中一个有效的解决方案是依赖于点击行为的顺序推荐,但是在视频提要推荐中这并不优雅,因为用户在接收流内容和返回跳过或不跳过行为时是被动的。在这里,跳过和不跳过行为可以分别视为负反馈和正反馈。由于正反馈和负反馈的混合,捕捉行为序列的转换模式具有挑战性。为此,FeedRec 利用了一个共享的香草变压器,这可能是不雅的,因为多头注意的头部交互没有考虑不同类型的反馈。本文提出了由反馈感知编码层、双兴趣分解层和预测层组成的双兴趣分解顺序推荐系统。在反馈感知编码层,我们首先假设多头注意的每个头都能捕获特定的反馈关系。然后进一步提出因子分解-头注意,它可以掩盖特定的头交互,并注入反馈信息,从而对不同类型的反馈之间的关系进行因子分解。此外,我们提出了一个双利益解缠层,以解耦正面和负面的利益之前,执行解缠的表示。最后,我们通过相应的塔进行正负利益演化,其输出由于业务流程重组损失而具有对比性。在两个实际数据集上的实验表明了我们提出的方法对最先进的基线的优越性。进一步的消融研究和可视化也支持其有效性。我们在这里发布源代码: https://github.com/tsinghua-fib-lab/www2023-dfar。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Dual-interest+Factorization-heads+Attention+for+Sequential+Recommendation)|0| +|[A Cross-Media Retrieval System for Web-SNS-Map Using Suggested Keywords Generating and Ranking Method Based on Search Characteristics](https://doi.org/10.1145/3543873.3587344)|Da Li, Masaki Sugihashi, Tadahiko Kumamoto, Yukiko Kawai|Kyoto Sangyo University, Japan; Chiba Institute of Technology, Japan; Fukuoka University, Japan|The research on multimedia retrieval has lasted for several decades. However, past efforts generally focused on single-media retrieval, where the queries and retrieval results belong to the same media (platform) type, such as social media platforms or search engines. In single-media retrieval, users have to select search media or options based on search characteristics such as contents, time, or spatial distance; if they carelessly forget to select, they might be unable to retrieve correct results mixed in other media. In this study, we propose a cross-media retrieval system using suggestion generation methods to integrate three search characteristics of the Web (textual content-based retrieval), SNS (timeliness), and map (spatial distance-aware retrieval). In our previous research, we attempted to improve search efficiency using clustering methods to provide search results to users through related terms, etc. In this paper, we focus on the search efficiency of multiple search media.
We utilize the Google search engine to obtain the retrieval content from the Web, Twitter to obtain timely information from SNSs, and Google Maps to get geographical information from maps. We apply the obtained retrieval results to analyze the similarities between them by clustering. Then, we generate relevant suggestions and provide them to users. Moreover, we validate the effectiveness of the search results generated by our proposed system.|多媒体检索的研究已经持续了几十年。然而,过去的努力通常集中在单媒体检索,其中查询和检索结果属于相同的媒体(平台)类型,如社会媒体平台或搜索引擎。在单媒体检索中,用户必须根据内容、时间或空间距离等搜索特征选择搜索媒体或选项,如果不小心忘记选择,可能无法检索混合在其他媒体中的正确结果。在本研究中,我们提出一个跨媒体检索系统,利用建议产生的方法来整合网页(文本内容检索)、 SNS (及时性)和地图(空间距离感知检索)的三个搜索特性。在我们以前的研究中,我们尝试使用聚类方法来提高搜索效率,通过相关词汇等为用户提供搜索结果。本文主要研究多种搜索媒体的搜索效率。我们利用谷歌搜索引擎从网络中获取检索内容,利用 Twitter 从 SNS 中获取及时信息,利用谷歌地图从地图中获取地理信息。我们应用所得到的检索结果,通过聚类分析它们之间的相似性。然后,我们生成相关的建议并提供给用户。此外,我们还验证了该系统所产生的搜索结果的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Cross-Media+Retrieval+System+for+Web-SNS-Map+Using+Suggested+Keywords+Generating+and+Ranking+Method+Based+on+Search+Characteristics)|0| +|[A Knowledge Enhanced Hierarchical Fusion Network for CTR Prediction under Account Search Scenario in WeChat](https://doi.org/10.1145/3543873.3584650)|Yuanzhou Yao, Zhao Zhang, Kaijia Yang, Huasheng Liang, Qiang Yan, Fuzheng Zhuang, Yongjun Xu, Boyu Diao, Chao Li|WeChat, Tencent, China; Institute of Computing Technology, Chinese Academy of Sciences, China; Zhejiang Lab, China; Institute of Artificial Intelligence, Beihang University, China|Click-through rate (CTR) estimation serves as a pivotal function module in various online services. Previous studies mainly apply CTR models to the field of recommendation or online advertisement. Indeed, CTR is also critical in information retrieval, since the CTR probability can serve as a valuable feature for a query-document pair. In this paper, we study the CTR task under the account search scenario in WeChat, where users search official accounts or mini programs corresponding to an organization. Despite the large number of CTR models, directly applying them to our task is inappropriate since the account retrieval task has a number of specific characteristics. E.g., different from traditional user-centric CTR models, in our task, CTR prediction is query-centric and does not model user information. In addition, queries and accounts are short texts, and heavily rely on prior knowledge and semantic understanding. These characteristics require us to specially design a CTR model for the task. To this end, we propose a novel CTR prediction model named Knowledge eNhanced hIerarchical Fusion nEtwork (KNIFE). Specifically, to tackle the prior information problem, we mine the knowledge graph of accounts as side information; to enhance the representations of queries, we construct a bipartite graph for queries and accounts. In addition, a hierarchical network structure is proposed to fuse the representations of different information in a fine-grained manner. Finally, the representations of queries and accounts are obtained from this hierarchical network and fed into the CTR model together with other features for prediction. We conduct extensive experiments against 12 existing models across two industrial datasets.
Both offline and online A/B test results indicate the effectiveness of KNIFE.|在各种网上服务中,点进率评估是一个关键的功能模块。以往的研究主要将点击率模型应用于推荐或在线广告领域。实际上,点击率在信息检索中也很关键,因为点击率可以作为查询-文档对的一个有价值的特性。本文研究了微信中用户搜索官方账号或与组织对应的小程序的帐号搜索情景下的点击率任务。尽管有大量的点击率检索模型,但由于账户检索任务具有许多特殊性,直接将其应用于我们的任务是不合适的。例如,与传统的以用户为中心的 CTR 模型不同,在我们的任务中,CTR 预测是以查询为中心的,不对用户信息建模。此外,查询和帐户是简短的文本,并且严重依赖于先前的知识和语义理解。这些特性要求我们为任务专门设计一个 CTR 模型。为此,我们提出了一种新的 CTR 预测模型——知识增强分层融合网络(KNIFE)。具体来说,为了解决先验信息问题,我们挖掘帐户的知识图作为边信息; 为了增强查询的表示,我们为查询和帐户构造一个二分图。此外,提出了一种分层网络结构,以细粒度的方式融合不同信息的表示。最后,从这个层次网络中获得查询和帐户的表示,并将其与其他用于预测的特征一起反馈到 CTR 模型中。我们对两个工业数据集中的12个现有模型进行了广泛的实验。离线和在线 A/B 测试结果均表明了 KNIFE 的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Knowledge+Enhanced+Hierarchical+Fusion+Network+for+CTR+Prediction+under+Account+Search+Scenario+in+WeChat)|0| |[Multi-Objective Ranking to Boost Navigational Suggestions in eCommerce AutoComplete](https://doi.org/10.1145/3543873.3584649)|Sonali Singh, Sachin Farfade, Prakash Mandayam Comar|Amazon, India|Query AutoComplete (QAC) helps customers complete their search queries quickly by suggesting completed queries. QAC on eCommerce sites usually employs Learning to Rank (LTR) approaches based on customer behaviour signals such as clicks and conversion rates to optimize business metrics. However, they do not exclusively optimize for the quality of suggested queries, which results in a lack of navigational suggestions like product categories and attributes, e.g., "sports shoes" and "white shoes" for the query "shoes". We propose to improve the quality of query suggestions by introducing navigational suggestions without impacting the business metrics. For this purpose, we augment the customer behaviour (CB) based objective with a Query-Quality (QQ) objective and assemble them with trainable mixture weights to define a multi-objective optimization function. We propose to optimize this multi-objective function by implementing the ALMO algorithm to obtain a model robust against any mixture weight. We show that this formulation improves query relevance on an eCommerce QAC dataset by at least 13% over the baseline Deep Pairwise LTR (DeepPLTR) with minimal impact on MRR, and results in a lift of 0.26% in GMV in an online A/B test. We also evaluated our approach on public search log datasets and improved query relevance by using query coherence as the QQ objective.|QueryAutoComplete (QAC)通过建议已完成的查询,帮助客户快速完成搜索查询。电子商务网站上的 QAC 通常采用基于客户行为信号(如点击率和转换率)的学习排名(LTR)方法来优化业务指标。然而,它们并不专门针对建议查询的质量进行优化,这会导致缺乏像产品类别和属性这样的导航建议,例如,“运动鞋”和查询“鞋子”的“白鞋子”。我们建议通过引入导航建议而不影响业务度量来提高查询建议的质量。为此,我们将基于顾客行为(CB)的目标与查询质量(QQ)目标相结合,并用可训练的混合权重组合它们来定义多目标优化函数。我们提出通过实现 ALMO 算法来优化这个多目标函数,以获得对任意混合权重的鲁棒模型。我们表明,这种制定方法使电子商务 QAC 数据集的查询相关性比基线 Deep Pairwise LTR (DeepPLTR)至少提高了13% ,对 MRR 的影响最小,并且在线 A/B 测试中导致 GMV 升高0.26% 。对公共检索日志数据集的检索方法进行了评估,并以查询一致性为 QQ 目标,提高了查询相关性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Multi-Objective+Ranking+to+Boost+Navigational+Suggestions+in+eCommerce+AutoComplete)|0| |[Personalization and Recommendations in Search](https://doi.org/10.1145/3543873.3589749)|Sudarshan Lamkhede, Anlei Dong, Moumita Bhattacharya, Hongning Wang|Dept. of Computer Science, University of Virginia, USA; Microsoft Bing, USA; Netflix Research, USA|The utility of a search system for its users can be further enhanced by providing personalized results and recommendations within the search context. However, the research discussions around these aspects of search remain fragmented across different conferences and workshops.
Hence, this workshop aims to bring together researchers and practitioners from industry and academia to engage in the discussions of algorithmic and system challenges in search personalization and effectively recommending within search context.|通过在搜索上下文中提供个性化的结果和建议,可以进一步加强搜索系统对用户的效用。然而,围绕搜索这些方面的研究讨论在不同的会议和研讨会上仍然支离破碎。因此,这个研讨会的目的是聚集业界和学术界的研究人员和从业人员,参与讨论在搜索个性化和有效推荐搜索背景下的算法和系统挑战。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Personalization+and+Recommendations+in+Search)|0| -|[Cooperative Retriever and Ranker in Deep Recommenders](https://doi.org/10.1145/3543507.3583422)|Xu Huang, Defu Lian, Jin Chen, Liu Zheng, Xing Xie, Enhong Chen|; University of Electronic Science and Technology of China, China; Microsoft Research Asia, China; University of Science and Technology of China, China|Deep recommender systems (DRS) are intensively applied in modern web services. To deal with the massive web contents, DRS employs a two-stage workflow: retrieval and ranking, to generate its recommendation results. The retriever aims to select a small set of relevant candidates from the entire items with high efficiency; while the ranker, usually more precise but time-consuming, is supposed to further refine the best items from the retrieved candidates. Traditionally, the two components are trained either independently or within a simple cascading pipeline, which is prone to poor collaboration effect. Though some latest works suggested to train retriever and ranker jointly, there still exist many severe limitations: item distribution shift between training and inference, false negative, and misalignment of ranking order. As such, it remains to explore effective collaborations between retriever and ranker.|深度推荐系统(DRS)在现代 Web 服务中得到了广泛的应用。为了处理海量的网络内容,DRS 采用了两个阶段的工作流程: 检索和排名,以生成其推荐结果。检索器的目标是从整个项目中高效地选择一小部分相关候选项; 而排名器通常更精确但更耗时,应该从检索到的候选项中进一步提炼出最好的项目。传统上,这两个组件要么单独训练,要么在一个简单的级联管道中训练,这样容易产生较差的协作效果。虽然最新的一些研究提出了联合训练检索器和排序器,但仍然存在很多严重的局限性: 训练和推理之间的项目分布转移、错误否定和排序顺序不一致。因此,仍然需要探索检索器和排名器之间的有效协作。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Cooperative+Retriever+and+Ranker+in+Deep+Recommenders)|0| -|[Modeling Temporal Positive and Negative Excitation for Sequential Recommendation](https://doi.org/10.1145/3543507.3583463)|Chengkai Huang, Shoujin Wang, Xianzhi Wang, Lina Yao|CSIRO's Data 61, Australia and The University of New South Wales, Australia; University of Technology Sydney, Australia; The University of New South Wales, Australia|Sequential recommendation aims to predict the next item which interests users via modeling their interest in items over time. Most of the existing works on sequential recommendation model users’ dynamic interest in specific items while overlooking users’ static interest revealed by some static attribute information of items, e.g., category, brand. Moreover, existing works often only consider the positive excitation of a user’s historical interactions on his/her next choice on candidate items while ignoring the commonly existing negative excitation, resulting in insufficiently modeling dynamic interest. The overlook of static interest and negative excitation will lead to incomplete interest modeling and thus impedes the recommendation performance. To this end, in this paper, we propose modeling both static interest and negative excitation for dynamic interest to further improve the recommendation performance. 
Accordingly, we design a novel Static-Dynamic Interest Learning (SDIL) framework featured with a novel Temporal Positive and Negative Excitation Modeling (TPNE) module for accurate sequential recommendation. TPNE is specially designed for comprehensively modeling dynamic interest based on temporal positive and negative excitation learning. Extensive experiments on three real-world datasets show that SDIL can effectively capture both static and dynamic interest and outperforms state-of-the-art baselines.|序贯推荐旨在通过建立用户对项目的兴趣模型来预测下一个用户感兴趣的项目。现有的序贯推荐模型大多是建立在用户对特定项目的动态兴趣的基础上,忽略了项目的静态属性信息(如类别、品牌等)所揭示的用户的静态兴趣。此外,现有的作品往往只考虑用户的历史交互作用对他/她的下一个选择的候选项的正激励,而忽略了普遍存在的负激励,导致不足的建模动态兴趣。忽视静态兴趣和负激励会导致兴趣建模的不完整,从而影响推荐性能。为此,本文提出了静态兴趣模型和动态兴趣的负激励模型,以进一步提高推荐性能。因此,我们设计了一个新颖的静态-动态兴趣学习(SDIL)框架,该框架具有一个新颖的时态正负激励建模(TPNE)模块,用于准确的顺序推荐。TPNE 是一种基于时间正负激励学习的动态兴趣综合建模方法。在三个实际数据集上的大量实验表明,SDIL 能够有效地捕获静态和动态兴趣,并且性能优于最先进的基线。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Modeling+Temporal+Positive+and+Negative+Excitation+for+Sequential+Recommendation)|0| +|[Cooperative Retriever and Ranker in Deep Recommenders](https://doi.org/10.1145/3543507.3583422)|Xu Huang, Defu Lian, Jin Chen, Liu Zheng, Xing Xie, Enhong Chen|University of Electronic Science and Technology of China, China; ; Microsoft Research Asia, China; University of Science and Technology of China, China|Deep recommender systems (DRS) are intensively applied in modern web services. To deal with the massive web contents, DRS employs a two-stage workflow: retrieval and ranking, to generate its recommendation results. The retriever aims to select a small set of relevant candidates from the entire items with high efficiency; while the ranker, usually more precise but time-consuming, is supposed to further refine the best items from the retrieved candidates. Traditionally, the two components are trained either independently or within a simple cascading pipeline, which is prone to poor collaboration effect. Though some latest works suggested to train retriever and ranker jointly, there still exist many severe limitations: item distribution shift between training and inference, false negative, and misalignment of ranking order. As such, it remains to explore effective collaborations between retriever and ranker.|深度推荐系统(DRS)在现代 Web 服务中得到了广泛的应用。为了处理海量的网络内容,DRS 采用了两个阶段的工作流程: 检索和排名,以生成其推荐结果。检索器的目标是从整个项目中高效地选择一小部分相关候选项; 而排名器通常更精确但更耗时,应该从检索到的候选项中进一步提炼出最好的项目。传统上,这两个组件要么单独训练,要么在一个简单的级联管道中训练,这样容易产生较差的协作效果。虽然最新的一些研究提出了联合训练检索器和排序器,但仍然存在很多严重的局限性: 训练和推理之间的项目分布转移、错误否定和排序顺序不一致。因此,仍然需要探索检索器和排名器之间的有效协作。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Cooperative+Retriever+and+Ranker+in+Deep+Recommenders)|0| +|[Modeling Temporal Positive and Negative Excitation for Sequential Recommendation](https://doi.org/10.1145/3543507.3583463)|Chengkai Huang, Shoujin Wang, Xianzhi Wang, Lina Yao|The University of New South Wales, Australia; CSIRO's Data 61, Australia and The University of New South Wales, Australia; University of Technology Sydney, Australia|Sequential recommendation aims to predict the next item which interests users via modeling their interest in items over time. Most of the existing works on sequential recommendation model users’ dynamic interest in specific items while overlooking users’ static interest revealed by some static attribute information of items, e.g., category, brand. 
Moreover, existing works often only consider the positive excitation of a user’s historical interactions on his/her next choice on candidate items while ignoring the commonly existing negative excitation, resulting in insufficient modeling of dynamic interest. Overlooking static interest and negative excitation leads to incomplete interest modeling and thus impedes recommendation performance. To this end, in this paper, we propose modeling both static interest and negative excitation for dynamic interest to further improve the recommendation performance. Accordingly, we design a novel Static-Dynamic Interest Learning (SDIL) framework featuring a novel Temporal Positive and Negative Excitation Modeling (TPNE) module for accurate sequential recommendation. TPNE is specially designed for comprehensively modeling dynamic interest based on temporal positive and negative excitation learning. Extensive experiments on three real-world datasets show that SDIL can effectively capture both static and dynamic interest and outperforms state-of-the-art baselines.|序贯推荐旨在通过建立用户对项目的兴趣模型来预测下一个用户感兴趣的项目。现有的序贯推荐模型大多是建立在用户对特定项目的动态兴趣的基础上,忽略了项目的静态属性信息(如类别、品牌等)所揭示的用户的静态兴趣。此外,现有的作品往往只考虑用户的历史交互作用对他/她的下一个选择的候选项的正激励,而忽略了普遍存在的负激励,导致不足的建模动态兴趣。忽视静态兴趣和负激励会导致兴趣建模的不完整,从而影响推荐性能。为此,本文提出了静态兴趣模型和动态兴趣的负激励模型,以进一步提高推荐性能。因此,我们设计了一个新颖的静态-动态兴趣学习(SDIL)框架,该框架具有一个新颖的时态正负激励建模(TPNE)模块,用于准确的顺序推荐。TPNE 是一种基于时间正负激励学习的动态兴趣综合建模方法。在三个实际数据集上的大量实验表明,SDIL 能够有效地捕获静态和动态兴趣,并且性能优于最先进的基线。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Modeling+Temporal+Positive+and+Negative+Excitation+for+Sequential+Recommendation)|0| |[Beyond Two-Tower: Attribute Guided Representation Learning for Candidate Retrieval](https://doi.org/10.1145/3543507.3583254)|Hongyu Shan, Qishen Zhang, Zhongyi Liu, Guannan Zhang, Chenliang Li|Wuhan University, China; antgroup, China|Candidate retrieval is a key part of modern search engines, whose goal is to find candidate items that are semantically related to the query from a large item pool. The core difference from the later ranking stage is the requirement of low latency. Hence, a two-tower structure with two parallel yet independent encoders for the query and the item is prevalent in many systems. In these efforts, the semantic information of a query and a candidate item is fed into the corresponding encoders, and their representations are then used for retrieval. With the popularity of pre-trained semantic models, the state of the art for semantic retrieval tasks has achieved significant performance gains. However, the capacity of learning relevance signals is still limited by the isolation between the query and the item. The interaction-based modeling between the query and the item has been widely validated to be useful for the ranking stage, where more computation cost is affordable. Here, we are quite interested in a demanding question: how to exploit query-item interaction-based learning to enhance candidate retrieval while still maintaining a low computation cost. Note that an item usually contains various heterogeneous attributes which could help us understand the item characteristics more precisely. To this end, we propose a novel attribute guided representation learning framework (named AGREE) to enhance candidate retrieval by exploiting query-attribute relevance. The key idea is to couple the query and item representation learning together during the training phase, while also enabling easy decoupling for efficient inference.
Specifically, we introduce an attribute fusion layer on the item side to identify the most relevant item features for item representation. On the query side, an attribute-aware learning process is introduced to better infer the search intent from these attributes. After model training, we then decouple the attribute information from the query encoder, which guarantees low latency in the inference phase. Extensive experiments over two real-world large-scale datasets demonstrate the superiority of the proposed AGREE against several state-of-the-art technical alternatives. Further online A/B tests on the AliPay search service also show that AGREE achieves substantial performance gains on four business metrics. Currently, the proposed AGREE has been deployed online in AliPay, serving major traffic.|候选检索是现代搜索引擎的一个关键部分,其目标是从一个大的项目池中查找与查询语义相关的候选项。与后期排名阶段的核心区别在于对低延迟的要求。因此,双塔结构的两个并行但独立的编码器的查询和项目是普遍存在的许多系统。在这些工作中,查询和候选项的语义信息被输入到相应的编码器中,然后使用它们的表示进行检索。随着预训练语义模型的普及,语义检索任务的性能得到了显著提高。然而,相关信号的学习能力仍然受到查询与项目之间隔离的限制。基于交互的查询和项目之间的建模已被广泛验证是有用的排名阶段,其中更多的计算成本是负担得起的。如何利用基于查询项交互的学习来提高候选检索的效率,同时保持较低的计算成本,是本文研究的热点问题。注意,项目通常包含各种异构属性,这些属性可以帮助我们更精确地理解项目特征。为此,我们提出了一种新的属性引导表示学习框架(AGREE) ,利用查询-属性相关性来增强候选检索。其核心思想是在训练阶段将查询和项目表示学习耦合在一起,同时也为有效的推理提供了简单的解耦。具体来说,我们在项目端引入一个属性融合层来识别项目表示中最相关的项目特征。在查询方面,引入了一个感知属性的学习过程,以更好地从这些属性中推断出搜索意图。经过模型训练后,将属性信息与查询编码器解耦,保证了推理阶段的低延迟。通过两个现实世界大规模数据集的大量实验证明了所提议的 AGREE 相对于几种最先进的技术选择的优越性。支付宝搜索服务的进一步在线 A/B 测试也表明,AGREE 在四个业务指标上取得了显著的性能提升。目前,拟议的《支付宝协议》已在支付宝网上部署,以服务主要流量。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Beyond+Two-Tower:+Attribute+Guided+Representation+Learning+for+Candidate+Retrieval)|0| -|[Improving Content Retrievability in Search with Controllable Query Generation](https://doi.org/10.1145/3543507.3583261)|Gustavo Penha, Enrico Palumbo, Maryam Aziz, Alice Wang, Hugues Bouchard|Spotify, Spain; Spotify, USA; Spotify, Netherlands; Spotify, Italy|An important goal of online platforms is to enable content discovery, i.e. allow users to find a catalog entity they were not familiar with. A pre-requisite to discover an entity, e.g. a book, with a search engine is that the entity is retrievable, i.e. there are queries for which the system will surface such entity in the top results. However, machine-learned search engines have a high retrievability bias, where the majority of the queries return the same entities. This happens partly due to the predominance of narrow intent queries, where users create queries using the title of an already known entity, e.g. in book search 'harry potter'. The amount of broad queries where users want to discover new entities, e.g. in music search 'chill lyrical electronica with an atmospheric feeling to it', and have a higher tolerance to what they might find, is small in comparison. We focus here on two factors that have a negative impact on the retrievability of the entities (I) the training data used for dense retrieval models and (II) the distribution of narrow and broad intent queries issued in the system. We propose CtrlQGen, a method that generates queries for a chosen underlying intent-narrow or broad. We can use CtrlQGen to improve factor (I) by generating training data for dense retrieval models comprised of diverse synthetic queries. CtrlQGen can also be used to deal with factor (II) by suggesting queries with broader intents to users. Our results on datasets from the domains of music, podcasts, and books reveal that we can significantly decrease the retrievability bias of a dense retrieval model when using CtrlQGen.
First, by using the generated queries as training data for dense models we make 9% of the entities retrievable (go from zero to non-zero retrievability). Second, by suggesting broader queries to users, we can make 12% of the entities retrievable in the best case.|在线平台的一个重要目标是支持内容发现,即允许用户找到他们不熟悉的目录实体。使用搜索引擎发现一个实体(例如一本书)的先决条件是该实体是可检索的,也就是说,有一些查询系统将在顶部结果中显示该实体。然而,机器学习搜索引擎有很高的可检索性偏差,其中大多数查询返回相同的实体。这部分是由于狭义意图查询的优势,用户使用已知实体的标题创建查询,例如在图书搜索“哈利波特”。用户希望发现新实体的广泛查询的数量,例如在音乐搜索“寒冷的抒情电子乐与大气的感觉”,并有一个更高的容忍度,他们可能会发现,相比之下是小的。这里我们重点讨论对实体的可检索性有负面影响的两个因素(I)用于密集检索模型的训练数据和(II)系统中发出的狭义和广义意图查询的分布。我们提出了 CtrlQGen,一种为选定的底层意图生成查询的方法——狭义的或者广义的。我们可以使用 CtrlQGen 通过为由不同合成查询组成的密集检索模型生成训练数据来改进 factor (I)。CtrlQGen 还可以通过向用户建议具有更广泛意图的查询来处理 factor (II)。我们对音乐、播客和书籍领域的数据集的研究结果表明,当使用 CtrlQGen 时,我们可以显著降低密集检索模型的可检索性偏差。首先,通过使用生成的查询作为密集模型的训练数据,我们使9% 的实体可检索(从零到非零可检索性)。其次,通过向用户建议更广泛的查询,我们可以使12% 的实体在最佳情况下可检索。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Improving+Content+Retrievability+in+Search+with+Controllable+Query+Generation)|0| +|[Improving Content Retrievability in Search with Controllable Query Generation](https://doi.org/10.1145/3543507.3583261)|Gustavo Penha, Enrico Palumbo, Maryam Aziz, Alice Wang, Hugues Bouchard|Spotify, Netherlands; Spotify, Spain; Spotify, USA; Spotify, Italy|An important goal of online platforms is to enable content discovery, i.e. allow users to find a catalog entity they were not familiar with. A pre-requisite to discover an entity, e.g. a book, with a search engine is that the entity is retrievable, i.e. there are queries for which the system will surface such entity in the top results. However, machine-learned search engines have a high retrievability bias, where the majority of the queries return the same entities. This happens partly due to the predominance of narrow intent queries, where users create queries using the title of an already known entity, e.g. in book search 'harry potter'. The amount of broad queries where users want to discover new entities, e.g. in music search 'chill lyrical electronica with an atmospheric feeling to it', and have a higher tolerance to what they might find, is small in comparison. We focus here on two factors that have a negative impact on the retrievability of the entities (I) the training data used for dense retrieval models and (II) the distribution of narrow and broad intent queries issued in the system. We propose CtrlQGen, a method that generates queries for a chosen underlying intent-narrow or broad. We can use CtrlQGen to improve factor (I) by generating training data for dense retrieval models comprised of diverse synthetic queries. CtrlQGen can also be used to deal with factor (II) by suggesting queries with broader intents to users. Our results on datasets from the domains of music, podcasts, and books reveal that we can significantly decrease the retrievability bias of a dense retrieval model when using CtrlQGen. First, by using the generated queries as training data for dense models we make 9% of the entities retrievable (go from zero to non-zero retrievability). 
Second, by suggesting broader queries to users, we can make 12% of the entities retrievable in the best case.|在线平台的一个重要目标是支持内容发现,即允许用户找到他们不熟悉的目录实体。使用搜索引擎发现一个实体(例如一本书)的先决条件是该实体是可检索的,也就是说,有一些查询系统将在顶部结果中显示该实体。然而,机器学习搜索引擎有很高的可检索性偏差,其中大多数查询返回相同的实体。这部分是由于狭义意图查询的优势,用户使用已知实体的标题创建查询,例如在图书搜索“哈利波特”。用户希望发现新实体的广泛查询的数量,例如在音乐搜索“寒冷的抒情电子乐与大气的感觉”,并有一个更高的容忍度,他们可能会发现,相比之下是小的。这里我们重点讨论对实体的可检索性有负面影响的两个因素(I)用于密集检索模型的训练数据和(II)系统中发出的狭义和广义意图查询的分布。我们提出了 CtrlQGen,一种为选定的底层意图生成查询的方法——狭义的或者广义的。我们可以使用 CtrlQGen 通过为由不同合成查询组成的密集检索模型生成训练数据来改进 factor (I)。CtrlQGen 还可以通过向用户建议具有更广泛意图的查询来处理 factor (II)。我们对音乐、播客和书籍领域的数据集的研究结果表明,当使用 CtrlQGen 时,我们可以显著降低密集检索模型的可检索性偏差。首先,通过使用生成的查询作为密集模型的训练数据,我们使9% 的实体可检索(从零到非零可检索性)。其次,通过向用户建议更广泛的查询,我们可以使12% 的实体在最佳情况下可检索。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Improving+Content+Retrievability+in+Search+with+Controllable+Query+Generation)|0| |[PIE: Personalized Interest Exploration for Large-Scale Recommender Systems](https://doi.org/10.1145/3543873.3584656)|Khushhall Chandra Mahajan, Amey Porobo Dharwadker, Romil Shah, Simeng Qu, Gaurav Bang, Brad Schumitsch|Meta Inc., USA|Recommender systems are increasingly successful in recommending personalized content to users. However, these systems often capitalize on popular content. There is also a continuous evolution of user interests that need to be captured, but there is no direct way to systematically explore users' interests. This also tends to affect the overall quality of the recommendation pipeline as training data is generated from the candidates presented to the user. In this paper, we present a framework for exploration in large-scale recommender systems to address these challenges. It consists of three parts, first the user-creator exploration which focuses on identifying the best creators that users are interested in, second the online exploration framework and third a feed composition mechanism that balances explore and exploit to ensure optimal prevalence of exploratory videos. Our methodology can be easily integrated into an existing large-scale recommender system with minimal modifications. We also analyze the value of exploration by defining relevant metrics around user-creator connections and understanding how this helps the overall recommendation pipeline with strong online gains in creator and ecosystem value. In contrast to the regression on user engagement metrics generally seen while exploring, our method is able to achieve significant improvements of 3.50% in strong creator connections and 0.85% increase in novel creator connections. 
Moreover, our work has been deployed in production on Facebook Watch, a popular video discovery and sharing platform serving billions of users.|推荐系统在向用户推荐个性化内容方面越来越成功。然而,这些系统往往利用流行的内容。用户兴趣的不断演变也需要被捕捉,但是没有直接的方法来系统地探索用户的兴趣。这也往往影响推荐管道的总体质量,因为培训数据是从向用户提供的候选人中产生的。在本文中,我们提出了一个大规模推荐系统的探索框架,以解决这些挑战。它由三部分组成,第一部分是用户创建者探索,侧重于确定用户感兴趣的最佳创建者,第二部分是在线探索框架,第三部分是平衡探索和利用的馈送组合机制,以确保探索性视频的最佳流行。我们的方法可以很容易地集成到一个现有的大规模推荐系统中,只需要做很少的修改。我们还通过定义用户-创建者连接的相关度量来分析探索的价值,并了解这如何帮助整个推荐流水线在创建者和生态系统价值方面获得强大的在线收益。与探索过程中常见的用户参与度指标的回归相比,我们的方法能够在强创作者关系中获得3.50% 的显著提高,在新创作者关系中获得0.85% 的显著提高。此外,我们的工作已经部署在 Facebook Watch 上,这是一个流行的视频发现和分享平台,为数十亿用户提供服务。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=PIE:+Personalized+Interest+Exploration+for+Large-Scale+Recommender+Systems)|0| |[Improving Product Search with Season-Aware Query-Product Semantic Similarity](https://doi.org/10.1145/3543873.3587625)|Haoming Chen, Yetian Chen, Jingjing Meng, Yang Jiao, Yikai Ni, Yan Gao, Michinari Momma, Yi Sun|Harvard University, USA; Amazon.com, USA|Product search for online shopping should be season-aware, i.e., presenting seasonally relevant products to customers. In this paper, we propose a simple yet effective solution to improve seasonal relevance in product search by incorporating seasonality into language models for semantic matching. We first identify seasonal queries and products by analyzing implicit seasonal contexts through time-series analysis over the past year. Then we introduce explicit seasonal contexts by enhancing the query representation with a season token according to when the query is issued. A new season-enhanced BERT model (SE-BERT) is also proposed to learn the semantic similarity between the resulting seasonal queries and products. SE-BERT utilizes Multi-modal Adaption Gate (MAG) to augment the season-enhanced semantic embedding with other contextual information such as product price and review counts for robust relevance prediction. To better align with the ranking objective, a listwise loss function (neural NDCG) is used to regularize learning. Experimental results validate the effectiveness of the proposed method, which outperforms existing solutions for query-product relevance prediction in terms of NDCG and Price Weighted Purchases (PWP).|网上购物的产品搜寻应具有季节性,即向顾客展示季节性相关的产品。在本文中,我们提出了一个简单而有效的解决方案,以改善产品搜索的季节性相关性,将季节性纳入语义匹配的语言模型。我们首先通过对过去一年的时间序列分析,分析隐含的季节性背景,识别出季节性查询和产品。然后根据查询发出的时间,使用季节标记增强查询表示,从而引入明确的季节上下文。提出了一种新的季节增强 BERT 模型(SE-BERT) ,用于学习产生的季节查询与产品之间的语义相似性。该算法利用多模态自适应门(MAG)增强季节增强语义嵌入,并结合产品价格和评论计数等上下文信息进行鲁棒相关性预测。为了更好地与排名目标保持一致,一个列表损失函数(神经 NDCG)被用来规范学习。实验结果验证了该方法的有效性,在 NDCG 和价格加权购买(PWP)方面优于现有的查询产品相关性预测方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Improving+Product+Search+with+Season-Aware+Query-Product+Semantic+Similarity)|0| -|[Blend and Match: Distilling Semantic Search Models with Different Inductive Biases and Model Architectures](https://doi.org/10.1145/3543873.3587629)|Hamed Bonab, Ashutosh Joshi, Ravi Bhatia, Ankit Gandhi, Vijay Huddar, Juhi Naik, Mutasem AlDarabsah, Choon Hui Teo, Jonathan May, Tarun Agarwal, Vaclav Petricek|Amazon, India; Amazon, USA and USC Information Sciences Institute, USA; Amazon, USA|Commercial search engines use different semantic models to augment lexical matches. These models provide candidate items for a user’s query from a target space of millions to billions of items. Models with different inductive biases provide relatively different predictions, making it desirable to launch multiple semantic models in production. 
However, latency and resource constraints make simultaneously deploying multiple models impractical. In this paper, we introduce a distillation approach, called Blend and Match (BM), to unify two different semantic search models into a single model. We use a Bi-encoder semantic matching model as our primary model and propose a novel loss function to incorporate eXtreme Multi-label Classification (XMC) predictions as the secondary model. Our experiments conducted on two large-scale datasets, collected from a popular e-commerce store, show that our proposed approach significantly improves the recall of the primary Bi-encoder model by 11% to 17% with a minimal loss in precision. We show that traditional knowledge distillation approaches result in a sub-optimal performance for our problem setting, and our BM approach yields comparable rankings with strong Rank Fusion (RF) methods used only if one could deploy multiple models.|商业搜索引擎使用不同的语义模型来增加词汇匹配。这些模型为用户的查询提供从数百万到数十亿的候选项。具有不同归纳偏差的模型提供了相对不同的预测,因此在生产环境中启动多个语义模型是可取的。然而,延迟和资源限制使得同时部署多个模型不切实际。在本文中,我们引入了一种称为“混合与匹配”(Blend and Match,BM)的提取方法,将两个不同的语义搜索模型统一到一个单一的模型中。我们使用一个双编码器语义匹配模型作为我们的主要模型,并提出了一个新的损失函数合并 eXtreme 多标签分类(XMC)预测作为次要模型。我们在两个大规模数据集上进行的实验,从一个流行的电子商务商店收集,表明我们提出的方法显着提高了11% 至17% 的主要双编码器模型的召回率,最小的精度损失。我们表明,传统的知识提取方法导致次优的性能为我们的问题设置,和我们的 BM 方法产生可比的排名与强秩融合(RF)方法只有当一个人可以部署多个模型使用。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Blend+and+Match:+Distilling+Semantic+Search+Models+with+Different+Inductive+Biases+and+Model+Architectures)|0| +|[Blend and Match: Distilling Semantic Search Models with Different Inductive Biases and Model Architectures](https://doi.org/10.1145/3543873.3587629)|Hamed Bonab, Ashutosh Joshi, Ravi Bhatia, Ankit Gandhi, Vijay Huddar, Juhi Naik, Mutasem AlDarabsah, Choon Hui Teo, Jonathan May, Tarun Agarwal, Vaclav Petricek|Amazon, India; Amazon, USA; Amazon, USA and USC Information Sciences Institute, USA|Commercial search engines use different semantic models to augment lexical matches. These models provide candidate items for a user’s query from a target space of millions to billions of items. Models with different inductive biases provide relatively different predictions, making it desirable to launch multiple semantic models in production. However, latency and resource constraints make simultaneously deploying multiple models impractical. In this paper, we introduce a distillation approach, called Blend and Match (BM), to unify two different semantic search models into a single model. We use a Bi-encoder semantic matching model as our primary model and propose a novel loss function to incorporate eXtreme Multi-label Classification (XMC) predictions as the secondary model. Our experiments conducted on two large-scale datasets, collected from a popular e-commerce store, show that our proposed approach significantly improves the recall of the primary Bi-encoder model by 11% to 17% with a minimal loss in precision. 
We show that traditional knowledge distillation approaches result in sub-optimal performance for our problem setting, and our BM approach yields rankings comparable to strong Rank Fusion (RF) methods, which can be used only if one deploys multiple models.|商业搜索引擎使用不同的语义模型来增加词汇匹配。这些模型为用户的查询提供从数百万到数十亿的候选项。具有不同归纳偏差的模型提供了相对不同的预测,因此在生产环境中启动多个语义模型是可取的。然而,延迟和资源限制使得同时部署多个模型不切实际。在本文中,我们引入了一种称为“混合与匹配”(Blend and Match,BM)的提取方法,将两个不同的语义搜索模型统一到一个单一的模型中。我们使用一个双编码器语义匹配模型作为我们的主要模型,并提出了一个新的损失函数合并 eXtreme 多标签分类(XMC)预测作为次要模型。我们在两个大规模数据集上进行的实验,从一个流行的电子商务商店收集,表明我们提出的方法显着提高了11% 至17% 的主要双编码器模型的召回率,最小的精度损失。我们表明,传统的知识提取方法导致次优的性能为我们的问题设置,和我们的 BM 方法产生可比的排名与强秩融合(RF)方法只有当一个人可以部署多个模型使用。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Blend+and+Match:+Distilling+Semantic+Search+Models+with+Different+Inductive+Biases+and+Model+Architectures)|0| |[Joint Internal Multi-Interest Exploration and External Domain Alignment for Cross Domain Sequential Recommendation](https://doi.org/10.1145/3543507.3583366)|Weiming Liu, Xiaolin Zheng, Chaochao Chen, Jiajie Su, Xinting Liao, Mengling Hu, Yanchao Tan|Zhejiang University, China; Fuzhou University, China|Sequential Cross-Domain Recommendation (CDR) has been popularly studied to utilize different domain knowledge and users’ historical behaviors for next-item prediction. In this paper, we focus on the cross-domain sequential recommendation problem. This commonly existing problem is rather challenging from two perspectives: the implicit user historical rating sequences are difficult to model, and the users/items on different domains are mostly non-overlapping. Most previous sequential CDR approaches cannot solve the cross-domain sequential recommendation problem well, since (1) they cannot sufficiently depict the users’ actual preferences, and (2) they cannot leverage and transfer useful knowledge across domains. To tackle the above issues, we propose joint Internal multi-interest exploration and External domain alignment for cross domain Sequential Recommendation model (IESRec). IESRec includes two main modules, i.e., an internal multi-interest exploration module and an external domain alignment module. To reflect the users’ diverse characteristics with multi-interests evolution, we first propose an internal temporal optimal transport method in the internal multi-interest exploration module. We further propose an external alignment optimal transport method in the external domain alignment module to reduce domain discrepancy for the item embeddings. Our empirical studies on Amazon datasets demonstrate that IESRec significantly outperforms the state-of-the-art models.|序贯跨域推荐(CDR)是一种利用不同领域知识和用户历史行为进行下一个项目预测的方法。本文主要研究跨域序列推荐问题。这个常见的问题从两个方面来看都是相当具有挑战性的,即隐式用户历史评分序列难以建模,而且不同领域的用户/项目大多是非重叠的。以往的顺序 CDR 方法不能很好地解决跨域顺序推荐问题,因为(1)它们不能充分描述用户的实际偏好,(2)它们不能利用和跨域传递有用的知识。为了解决上述问题,我们提出了跨域序列推荐模型(IESRec)的内部多利益探索和外部域对齐的联合方法。IESRec 主要包括两个模块,即内部多兴趣探索模块和外部域对齐模块。为了反映用户的多样性特征和多种利益的演化,我们首先在内部多种利益探索模块中提出了内部时间最优传输方法。进一步提出了外域对齐模块中的外域对齐最优传输方法,以减少项目嵌入时的域差异。我们对亚马逊数据集的实证研究表明,IESRec 明显优于最先进的模型。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Joint+Internal+Multi-Interest+Exploration+and+External+Domain+Alignment+for+Cross+Domain+Sequential+Recommendation)|0| |[Latent User Intent Modeling for Sequential Recommenders](https://doi.org/10.1145/3543873.3584641)|Bo Chang, Alexandros Karatzoglou, Yuyan Wang, Can Xu, Ed H. Chi, Minmin Chen|Google, USA|Sequential recommender models are essential components of modern industrial recommender systems.
These models learn to predict the next items a user is likely to interact with based on his/her interaction history on the platform. Most sequential recommenders however lack a higher-level understanding of user intents, which often drive user behaviors online. Intent modeling is thus critical for understanding users and optimizing long-term user experience. We propose a probabilistic modeling approach and formulate user intent as latent variables, which are inferred based on user behavior signals using variational autoencoders (VAE). The recommendation policy is then adjusted accordingly given the inferred user intent. We demonstrate the effectiveness of the latent user intent modeling via offline analyses as well as live experiments on a large-scale industrial recommendation platform.|序贯推荐模型是现代工业推荐系统的重要组成部分。这些模型学习根据用户在平台上的交互历史来预测用户可能与之交互的下一个项目。然而,大多数顺序推荐系统缺乏对用户意图的更高层次的理解,这往往会驱动用户在线行为。因此,意图建模对于理解用户和优化长期用户体验至关重要。提出了一种基于变分自动编码器(VAE)的基于用户行为信号的概率建模方法,并将用户意图表示为潜变量。然后根据推断出的用户意图相应地调整推荐策略。通过离线分析以及在大规模工业推荐平台上的实验,验证了潜在用户意图建模的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Latent+User+Intent+Modeling+for+Sequential+Recommenders)|0| -|[Deep Neural Network with LinUCB: A Contextual Bandit Approach for Personalized Recommendation](https://doi.org/10.1145/3543873.3587684)|Qicai Shi, Feng Xiao, Douglas Pickard, Inga Chen, Liang Chen|Disneystreaming, USA; Disneystreaming, China|Recommender systems are widely used in many Web applications to recommend items which are relevant to a user’s preferences. However, focusing on exploiting user preferences while ignoring exploration will lead to biased feedback and hurt the user’s experience in the long term. The Mutli-Armed Bandit (MAB) is introduced to balance the tradeoff between exploitation and exploration. By utilizing context information in the reward function, contextual bandit algorithms lead to better performance compared to context-free bandit algorithms. However, existing contextual bandit algorithms either assume a linear relation between the expected reward and context features, whose representation power gets limited, or use a deep neural network in the reward function which is impractical in implementation. In this paper, we propose a new contextual bandit algorithm, DeepLinUCB, which leverages the representation power of deep neural network to transform the raw context features in the reward function. Specifically, this deep neural network is dedicated to the recommender system, which is efficient and practical in real-world applications. 
Furthermore, we conduct extensive experiments in our online recommender system using requests from real-world scenarios and show that DeepLinUCB is efficient and outperforms other bandit algorithms.|在许多 Web 应用程序中,推荐系统被广泛用于推荐与用户首选项相关的项目。然而,只关注用户偏好而忽视探索将导致偏见的反馈,从长远来看会损害用户的体验。为了平衡开发与勘探之间的权衡,引进了多臂匪。通过在奖励函数中利用上下文信息,上下文盗贼算法比无上下文盗贼算法具有更好的性能。然而,现有的上下文盗贼算法要么假定期望奖励与上下文特征之间存在线性关系,其表示能力受到限制,要么在奖励函数中使用深度神经网络,这在实现上是不切实际的。本文提出了一种新的上下文盗贼算法 DeepLinUCB,该算法利用深层神经网络的表示能力来转换奖励函数中的原始上下文特征。具体来说,这种深层神经网络专门用于推荐系统,在实际应用中非常有效和实用。此外,我们使用来自现实场景的请求,在我们的在线推荐系统中进行了大量的实验,结果表明 DeepLinUCB 是高效的,并且优于其他盗贼算法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Deep+Neural+Network+with+LinUCB:+A+Contextual+Bandit+Approach+for+Personalized+Recommendation)|0| +|[Deep Neural Network with LinUCB: A Contextual Bandit Approach for Personalized Recommendation](https://doi.org/10.1145/3543873.3587684)|Qicai Shi, Feng Xiao, Douglas Pickard, Inga Chen, Liang Chen|Disneystreaming, China; Disneystreaming, USA|Recommender systems are widely used in many Web applications to recommend items which are relevant to a user’s preferences. However, focusing on exploiting user preferences while ignoring exploration will lead to biased feedback and hurt the user’s experience in the long term. The Multi-Armed Bandit (MAB) is introduced to balance the tradeoff between exploitation and exploration. By utilizing context information in the reward function, contextual bandit algorithms lead to better performance compared to context-free bandit algorithms. However, existing contextual bandit algorithms either assume a linear relation between the expected reward and context features, whose representation power gets limited, or use a deep neural network in the reward function, which is impractical in implementation. In this paper, we propose a new contextual bandit algorithm, DeepLinUCB, which leverages the representation power of a deep neural network to transform the raw context features in the reward function. Specifically, this deep neural network is dedicated to the recommender system, which is efficient and practical in real-world applications. Furthermore, we conduct extensive experiments in our online recommender system using requests from real-world scenarios and show that DeepLinUCB is efficient and outperforms other bandit algorithms.|在许多 Web 应用程序中,推荐系统被广泛用于推荐与用户首选项相关的项目。然而,只关注用户偏好而忽视探索将导致偏见的反馈,从长远来看会损害用户的体验。为了平衡开发与勘探之间的权衡,引进了多臂匪。通过在奖励函数中利用上下文信息,上下文盗贼算法比无上下文盗贼算法具有更好的性能。然而,现有的上下文盗贼算法要么假定期望奖励与上下文特征之间存在线性关系,其表示能力受到限制,要么在奖励函数中使用深度神经网络,这在实现上是不切实际的。本文提出了一种新的上下文盗贼算法 DeepLinUCB,该算法利用深层神经网络的表示能力来转换奖励函数中的原始上下文特征。具体来说,这种深层神经网络专门用于推荐系统,在实际应用中非常有效和实用。此外,我们使用来自现实场景的请求,在我们的在线推荐系统中进行了大量的实验,结果表明 DeepLinUCB 是高效的,并且优于其他盗贼算法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Deep+Neural+Network+with+LinUCB:+A+Contextual+Bandit+Approach+for+Personalized+Recommendation)|0| |[Contrastive Collaborative Filtering for Cold-Start Item Recommendation](https://doi.org/10.1145/3543507.3583286)|Zhihui Zhou, Lilin Zhang, Ning Yang|Sichuan University, China|The cold-start problem is a long-standing challenge in recommender systems. As a promising solution, content-based generative models usually project a cold-start item's content onto a warm-start item embedding to capture collaborative signals from item content so that collaborative filtering can be applied.
However, since the training of the cold-start recommendation models is conducted on warm datasets, the existing methods face the issue that the collaborative embeddings of items will be blurred, which significantly degrades the performance of cold-start item recommendation. To address this issue, we propose a novel model called Contrastive Collaborative Filtering for Cold-start item Recommendation (CCFCRec), which capitalizes on the co-occurrence collaborative signals in warm training data to alleviate the issue of blurry collaborative embeddings for cold-start item recommendation. In particular, we devise a contrastive collaborative filtering (CF) framework, consisting of a content CF module and a co-occurrence CF module to generate the content-based collaborative embedding and the co-occurrence collaborative embedding for a training item, respectively. During the joint training of the two CF modules, we apply contrastive learning between the two collaborative embeddings, by which the knowledge about the co-occurrence signals can be indirectly transferred to the content CF module, so that the blurry collaborative embeddings can be rectified implicitly by the memorized co-occurrence collaborative signals during the application phase. Together with a sound theoretical analysis, the extensive experiments conducted on real datasets demonstrate the superiority of the proposed model. The codes and datasets are available on https://github.com/zzhin/CCFCRec.|在推荐系统中,冷启动问题是一个长期存在的挑战。作为一种有前途的解决方案,基于内容的生成模型通常将一个冷启动项目的内容投射到一个嵌入的热启动项目上,以从项目内容中捕获协作信号,从而可以应用协同过滤。然而,由于冷启动推荐模型的训练是在暖数据集上进行的,现有的方法面临着项目协同嵌入模糊的问题,这严重影响了冷启动项目推荐的性能。为了解决这个问题,我们提出了一个新的模型,称为冷启动项目推荐对比协同过滤(CCFCrec) ,它利用共现协作信号在暖培训数据,以减轻问题模糊的协作嵌入冷启动项目推荐。特别地,我们设计了一个对比协同过滤(CF)框架,由一个内容 CF 模块和一个共现 CF 模块组成,分别为一个培训项目生成基于内容的协同嵌入和共现协同嵌入。在两个 CF 模块的联合训练中,我们对两个协同嵌入进行了对比学习,通过对比学习可以将关于共现信号的知识间接转移到内容 CF 模块中,从而在应用阶段可以通过记忆共现协同信号来隐式纠正模糊的协同嵌入。通过在实际数据集上的大量实验,结合理论分析,证明了该模型的优越性。代码和数据集可在 https://github.com/zzhin/ccfcrec 上获得。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Contrastive+Collaborative+Filtering+for+Cold-Start+Item+Recommendation)|0| -|[ColdNAS: Search to Modulate for User Cold-Start Recommendation](https://doi.org/10.1145/3543507.3583344)|Shiguang Wu, Yaqing Wang, Qinghe Jing, Daxiang Dong, Dejing Dou, Quanming Yao|Baidu Inc., China; Electronic Engineering, Tsinghua University, China|Making personalized recommendations for cold-start users, who only have a few interaction histories, is a challenging problem in recommendation systems. Recent works leverage hypernetworks to directly map user interaction histories to user-specific parameters, which are then used to modulate the predictor by a feature-wise linear modulation function. These works obtain state-of-the-art performance. However, the physical meaning of scaling and shifting in recommendation data is unclear. Instead of using a fixed modulation function and deciding the modulation position by expertise, we propose a modulation framework called ColdNAS for the user cold-start problem, where we look for a proper modulation structure, including function and position, via neural architecture search. We design a search space which covers broad models and theoretically prove that this search space can be transformed to a much smaller space, enabling an efficient and robust one-shot search algorithm. Extensive experimental results on benchmark datasets show that ColdNAS consistently performs the best.
We observe that different modulation functions lead to the best performance on different datasets, which validates the necessity of designing a search-based method. Codes are available at https://github.com/LARS-research/ColdNAS.|在推荐系统中,为只有少量交互历史的冷启动用户进行个性化推荐是一个具有挑战性的问题。最近的研究利用超网络将用户交互历史直接映射到用户特定的参数,然后用特征线性调制函数对预测器进行调制。这些作品获得了最先进的表演水平。然而,推荐数据的缩放和转移的物理意义尚不清楚。为了解决用户冷启动问题,我们提出了一种称为 ColdNAS 的调制框架,该框架通过神经结构搜索寻找合适的调制结构,包括功能和位置,而不是使用固定的调制函数来确定调制位置。我们设计了一个覆盖广泛模型的搜索空间,并从理论上证明了这个搜索空间可以转换成更小的空间,从而实现了一种高效、鲁棒的一次性搜索算法。在基准数据集上的大量实验结果表明,ColdNAS 始终表现最好。我们观察到不同的调制函数对不同的数据集产生最佳的性能,这验证了设计一种基于搜索的方法的必要性。密码可在 https://github.com/lars-research/coldnas 索取。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ColdNAS:+Search+to+Modulate+for+User+Cold-Start+Recommendation)|0| -|[Improving the Relevance of Product Search for Queries with Negations](https://doi.org/10.1145/3543873.3587319)|Felice Antonio Merra, Omar Zaidan, Fabricio de Sousa Nascimento|Amazon, Japan; Amazon, Germany|Product search engines (PSEs) play an essential role in retail websites as they make it easier for users to retrieve relevant products within large catalogs. Despite the continuous progress that has led to increasingly accurate search engines, a limited focus has been given to their performance on queries with negations. Indeed, while we would expect to retrieve different products for the queries “iPhone 13 cover with ring” and “iPhone 13 cover without ring”, this does not happen in popular PSEs, with the latter query containing results with the unwanted ring component. The limitation of modern PSEs in understanding negations motivates the need for further investigation. In this work, we start by defining the negation intent in users' queries. Then, we design a transformer-based model, named Negation Detector for Queries (ND4Q), which reaches optimal performance in negation detection (+95% on accuracy metrics). Finally, having built the first negation detector for product search queries, we propose a negation-aware filtering strategy, named Filtering Irrelevant Products (FIP). The promising experimental results in improving PSE relevance with FIP (+9.41% on [email protected] for queries where the negation starts with "without") pave the way for further research on negation-aware PSEs.|产品搜索引擎(PSE)在零售网站中发挥着重要作用,因为它们使用户更容易在大型目录中检索相关产品。尽管不断取得进展,导致搜索引擎越来越准确,但对否定查询的性能关注有限。事实上,虽然我们期望检索不同的产品的查询“ iPhone13盖有戒指”和“ iPhone13盖无戒指”,这不会发生在流行的 PSE 与后者的查询包含不想要的戒指组件的结果。现代 PSE 在理解否定方面的局限性促使了进一步研究的必要性。在这项工作中,我们首先定义用户查询中的否定意图。然后,我们设计了一个基于变压器的模型,称为查询否定检测器(ND4Q) ,它在否定检测中达到了最佳的性能(在准确性指标上 + 95%)。最后,在构建了产品搜索查询的第一个否定检测器的基础上,提出了一种基于否定感知的过滤策略——过滤不相关产品(FIP)。使用 FIP (对于否定以“无”开头的查询,[ email protected ]增加9.41%)改善 PSE 相关性的有希望的实验结果为针对具有否定意识的 PSE 的额外研究努力铺平了道路。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Improving+the+Relevance+of+Product+Search+for+Queries+with+Negations)|0| -|[Movie Ticket, Popcorn, and Another Movie Next Weekend: Time-Aware Service Sequential Recommendation for User Retention](https://doi.org/10.1145/3543873.3584628)|Xiaoyan Yang, Dong Wang, Binbin Hu, Dan Yang, Yue Shen, Jinjie Gu, Zhiqiang Zhang, Shiwei Lyu, Haipeng Zhang, Guannan Zhang|Ant Group, China; ShanghaiTech University, China|When a customer sees a movie recommendation, she may buy the ticket right away, which is the immediate feedback that helps improve the recommender system. Alternatively, she may choose to come back later, and this long-term feedback is also modeled to promote user retention.
However, the long-term feedback comes with non-trivial challenges in understanding user retention: the complicated correlation between current demands and follow-up demands, coupled with the periodicity of services. For instance, before the movie, the customer buys popcorn through the App, which temporally correlates with the initial movie recommendation. Days later, she checks the App for new movies, as a weekly routine. To address this complexity with more fine-grained revisit modeling, we propose Time Aware Service Sequential Recommendation (TASSR) for user retention, which is equipped with a multi-task design and an In-category TimeSeqBlock module. Large-scale online and offline experiments demonstrate its significant advantages over competitive baselines.|当顾客看到一部电影的推荐信时,她可能会马上买票,这是一种即时的反馈,有助于提高推荐系统。或者,她可以选择以后再来,这种长期的反馈也被建模以促进用户保留。然而,长期的反馈在理解用户保留方面带来了重大挑战: 当前需求和后续需求之间的复杂关系,以及服务的周期性。例如,在看电影之前,客户通过 App 购买爆米花,这在时间上与最初的电影推荐相关。几天后,她每周例行检查应用程序是否有新电影。为了在更细粒度的再访问建模中解决这一复杂性,我们提出了用于用户保持的时间感知服务序列推荐(TASSR) ,该推荐配备了多任务设计和同类 TimeSeqBlock 模块。大规模的在线和离线实验证明了它相对于竞争基线的显著优势。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Movie+Ticket,+Popcorn,+and+Another+Movie+Next+Weekend:+Time-Aware+Service+Sequential+Recommendation+for+User+Retention)|0|
-|[Unified Vision-Language Representation Modeling for E-Commerce Same-style Products Retrieval](https://doi.org/10.1145/3543873.3584632)|Ben Chen, Linbo Jin, Xinxin Wang, Dehong Gao, Wen Jiang, Wei Ning|Alibaba Group, China|Same-style products retrieval plays an important role in e-commerce platforms, aiming to identify the same products which may have different text descriptions or images. It can be used for similar-product retrieval across different suppliers or duplicate-product detection for one supplier. Common methods use the image as the detected object, but they only consider the visual features and overlook the attribute information contained in the textual descriptions, and perform weakly for products in industries where images are less informative, such as machinery, hardware tools, and electronic components, even if an additional text matching module is added. In this paper, we propose a unified vision-language modeling method for e-commerce same-style products retrieval, which is designed to represent one product with its textual descriptions and visual contents. It contains a sampling scheme that collects positive pairs from user click logs under category and relevance constraints, and a novel contrastive loss unit that models the image, text, and image+text representations in one joint embedding space. It is capable of cross-modal product-to-product retrieval, as well as style transfer and user-interactive search. Offline evaluations on annotated data demonstrate its superior retrieval performance, and online testing shows it can attract more clicks and conversions. 
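As a concrete illustration of the contrastive objective sketched in the abstract above, the following is a hedged, minimal InfoNCE-style loss for pulling two views of the same product (e.g., its image and text embeddings) together against in-batch negatives. The batch construction and temperature are illustrative assumptions, not the paper's exact loss unit.

```python
import numpy as np

def info_nce_loss(anchors: np.ndarray, positives: np.ndarray, temperature: float = 0.1) -> float:
    """InfoNCE-style contrastive loss over a batch of L2-normalized embeddings.

    anchors[i] and positives[i] are two views of the same product (e.g., its
    image and its text); all other rows in the batch serve as negatives.
    """
    logits = anchors @ positives.T / temperature        # (B, B) pairwise similarities
    logits -= logits.max(axis=1, keepdims=True)         # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    idx = np.arange(len(anchors))
    return float(-log_prob[idx, idx].mean())            # diagonal entries are the positives

rng = np.random.default_rng(0)
img = rng.normal(size=(8, 16))
img /= np.linalg.norm(img, axis=1, keepdims=True)
txt = img + 0.05 * rng.normal(size=img.shape)           # text view close to its image view
txt /= np.linalg.norm(txt, axis=1, keepdims=True)
print(info_nce_loss(img, txt))
```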
Moreover, this model has already been deployed online for similar-product retrieval on alibaba.com, the largest B2B e-commerce platform in the world.|同类产品检索在电子商务平台中起着重要作用,其目的是识别具有不同文本描述或图像的同类产品。它可用于从不同供应商检索相似产品或检测一个供应商的重复产品。一般的检测方法都是以图像作为检测对象,但它们只考虑视觉特征,忽略了文本描述中的属性信息,对于机械、硬件工具和电子元件等图像不太重要的行业的产品,即使增加了额外的文本匹配模块,检测效果也很差。本文提出了一种统一的电子商务同类产品检索的视觉语言建模方法。它包含一种采样技巧,用于从类别和相关性受限的用户点击日志中收集正对,以及一种新的对比度损失单元,用于将图像、文本和图像 + 文本表示建模为一个联合嵌入空间。它能够进行跨模式的产品对产品检索,以及样式转移和用户交互式搜索。对注释数据的离线评估表明它具有优越的检索性能,在线测试表明它可以吸引更多的点击和转换。此外,该模型已经在全球最大的 B2B 电子商务平台阿里巴巴网站(alibaba.com)的类似产品检索中得到应用。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Unified+Vision-Language+Representation+Modeling+for+E-Commerce+Same-style+Products+Retrieval)|0|
-|[Task Adaptive Multi-learner Network for Joint CTR and CVR Estimation](https://doi.org/10.1145/3543873.3584653)|Xiaofan Liu, Qinglin Jia, Chuhan Wu, Jingjie Li, Quanyu Dai, Lin Bo, Rui Zhang, Ruiming Tang|Huawei Noah's Ark Lab, China; Beijing University of Posts and Telecommunications, China; ruizhang.info, China; Renmin University of China, China|CTR and CVR are critical factors in personalized applications, and many methods jointly estimate them via multi-task learning to alleviate the ultra-sparsity of conversion behaviors. However, it is still difficult to predict CVR accurately and robustly due to the limited and even biased knowledge extracted by a single model tower optimized on insufficient conversion samples. In this paper, we propose a task adaptive multi-learner (TAML) framework for joint CTR and CVR prediction. We design a hierarchical task adaptive knowledge representation module with different experts to capture knowledge at different granularities, which can effectively exploit the commonalities between CTR and CVR estimation tasks while keeping their unique characteristics. We apply multiple learners to extract data knowledge from various views and fuse their predictions to obtain accurate and robust scores. To facilitate knowledge sharing across learners, we further perform self-distillation that uses the fused scores to teach different learners. Thorough offline and online experiments show the superiority of TAML in different ad ranking tasks, and we have deployed it in Huawei’s online advertising platform to serve the main traffic.|CTR 和 CVR 是个性化应用中的关键因素,多种方法通过多任务学习来联合估计它们,以减轻转换行为的超稀疏性。然而,由于单模型塔在转换样本不足的情况下进行了优化,提取的知识有限,甚至有偏差,因此仍然难以准确、稳健地预测 CVR。本文提出了一个任务自适应多学习器(TAML)框架,用于联合 CTR 和 CVR 预测。设计了一个分层任务自适应知识表示模块,采用不同的专家来获取不同粒度的知识,有效地利用了 CTR 和 CVR 估计任务的共性,同时保持了它们的独特性。我们应用多个学习者从不同的角度提取数据知识,并融合他们的预测,以获得准确和稳健的分数。为了促进学习者之间的知识共享,我们进一步使用融合分数来教授不同的学习者。通过线下和线上的实验,我们发现了 TAML 在不同广告排名任务中的优势,并将其应用于华为的在线广告平台,为主要流量提供服务。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Task+Adaptive+Multi-learner+Network+for+Joint+CTR+and+CVR+Estimation)|0|
-|[Deep Intention-Aware Network for Click-Through Rate Prediction](https://doi.org/10.1145/3543873.3584661)|Yaxian Xia, Yi Cao, Sihao Hu, Tong Liu, Lingling Lu|Zhejiang University, China; Georgia Institute of Technology, USA; Alibaba Group, China|E-commerce platforms provide entrances for customers to enter mini-apps that can meet their specific shopping requirements. Trigger items displayed on entrance icons can attract more entries. However, conventional Click-Through-Rate (CTR) prediction models, which ignore users' instant interest in the trigger item, cannot be applied to the new recommendation scenario dubbed Trigger-Induced Recommendation in Mini-Apps (TIRA). 
Moreover, due to the high stickiness of customers to mini-apps, we argue that existing trigger-based methods, which over-emphasize the importance of trigger items, are undesirable for TIRA, since a large portion of customer entries stem from routine shopping habits rather than from triggers. We identify that the key to TIRA is to extract customers' personalized entering intention and weigh the impact of triggers based on this intention. To achieve this goal, we convert CTR prediction for TIRA into a separate estimation form, and present the Deep Intention-Aware Network (DIAN) with three key elements: 1) an Intent Net that estimates the user's entering intention, i.e., whether he/she is affected by the trigger or by habit; 2) a Trigger-Aware Net and 3) a Trigger-Free Net that estimate CTRs given that the user's intention is the trigger item or the mini-app, respectively. Through joint learning, DIAN can both accurately predict user intention and dynamically balance the results of trigger-free and trigger-based recommendations based on the estimated intention. Experiments show that DIAN advances state-of-the-art performance on a large real-world dataset, and brings a 9.39% lift in online Item Page View and a 4.74% CTR lift for Juhuasuan, a famous mini-app of Taobao.|电子商务平台为客户提供了进入迷你应用程序,可以满足他们的具体购物需求。触发项目显示在入口图标可以吸引更多的进入。然而,传统的点击率(Click-Through-Rate,CTR)预测模型忽略了用户对触发条目的即时兴趣,无法应用于被称为微型应用程序中的触发诱导推荐(Trigger-)的新推荐场景。此外,由于客户对迷你应用程序的高粘性,我们认为,现有的基于触发器的方法,过分强调触发项目的重要性,是不希望 TIRA,因为大部分客户进入是因为他们的日常购物习惯,而不是触发器。我们认为,TIRA 的关键是提取顾客的个性化进入意图,并根据这一意图权衡触发因素的影响。为了实现这一目标,我们将 TIRA 的 CTR 预测转化为一个单独的估计形式,并提出深度意图感知网络(DIAN)的三个关键要素: 1)意图网络,估计用户的进入意图,即他/她是否受到触发器或习惯的影响; 2)触发感知网络和3)无触发网络,估计给定用户意图的 CTR 分别是触发项目和迷你应用程序。DIAN 采用联合学习的方法,既能准确预测用户意图,又能根据预测意图动态平衡无触发和基于触发的推荐结果。实验表明,DIAN 在一个大型现实数据集中提升了最先进的性能,使在线项目页面查看率提高了9.39% ,淘宝著名小应用聚花酸的点击率提高了4.74% 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Deep+Intention-Aware+Network+for+Click-Through+Rate+Prediction)|0|
+|[ColdNAS: Search to Modulate for User Cold-Start Recommendation](https://doi.org/10.1145/3543507.3583344)|Shiguang Wu, Yaqing Wang, Qinghe Jing, Daxiang Dong, Dejing Dou, Quanming Yao|Electronic Engineering, Tsinghua University, China; Baidu Inc., China|Making personalized recommendations for cold-start users, who only have a few interaction histories, is a challenging problem in recommendation systems. Recent works leverage hypernetworks to directly map user interaction histories to user-specific parameters, which are then used to modulate the predictor via a feature-wise linear modulation function. These works obtain state-of-the-art performance. However, the physical meaning of scaling and shifting in recommendation data is unclear. Instead of using a fixed modulation function and deciding the modulation position by expertise, we propose a modulation framework called ColdNAS for the user cold-start problem, where we look for a proper modulation structure, including function and position, via neural architecture search. We design a search space that covers a broad class of models and theoretically prove that this search space can be transformed to a much smaller space, enabling an efficient and robust one-shot search algorithm. Extensive experimental results on benchmark datasets show that ColdNAS consistently performs the best. We observe that different modulation functions lead to the best performance on different datasets, which validates the necessity of designing a search-based method. 
Code is available at https://github.com/LARS-research/ColdNAS.|在推荐系统中,为只有少量交互历史的冷启动用户进行个性化推荐是一个具有挑战性的问题。最近的研究利用超网络将用户交互历史直接映射到用户特定的参数,然后用特征线性调制函数对预测器进行调制。这些作品获得了最先进的表演水平。然而,推荐数据的缩放和转移的物理意义尚不清楚。为了解决用户冷启动问题,我们提出了一种称为 ColdNAS 的调制框架,该框架通过神经结构搜索寻找合适的调制结构,包括功能和位置,而不是使用固定的调制函数来确定调制位置。我们设计了一个覆盖广泛模型的搜索空间,并从理论上证明了这个搜索空间可以转换成更小的空间,从而实现了一种高效、鲁棒的一次性搜索算法。在基准数据集上的大量实验结果表明,ColdNAS 始终表现最好。我们观察到不同的调制函数对不同的数据集产生最佳的性能,这验证了设计一种基于搜索的方法的必要性。密码可在 https://github.com/lars-research/coldnas 索取。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ColdNAS:+Search+to+Modulate+for+User+Cold-Start+Recommendation)|0|
+|[Improving the Relevance of Product Search for Queries with Negations](https://doi.org/10.1145/3543873.3587319)|Felice Antonio Merra, Omar Zaidan, Fabricio de Sousa Nascimento|Amazon, Germany; Amazon, Japan|Product search engines (PSEs) play an essential role in retail websites as they make it easier for users to retrieve relevant products within large catalogs. Despite the continuous progress that has led to increasingly accurate search engines, a limited focus has been given to their performance on queries with negations. Indeed, while we would expect to retrieve different products for the queries “iPhone 13 cover with ring” and “iPhone 13 cover without ring”, this does not happen in popular PSEs, with the latter query returning results that contain the unwanted ring component. The limitation of modern PSEs in understanding negations motivates the need for further investigation. In this work, we start by defining the negation intent in users' queries. Then, we design a transformer-based model, named Negation Detector for Queries (ND4Q), that reaches optimal performance in negation detection (+95% on accuracy metrics). Finally, having built the first negation detector for product search queries, we propose a negation-aware filtering strategy, named Filtering Irrelevant Products (FIP). The promising experimental results in improving PSE relevance using FIP (+9.41% on [email protected] for queries where the negation starts with "without") pave the way for further research toward negation-aware PSEs.|产品搜索引擎(PSE)在零售网站中发挥着重要作用,因为它们使用户更容易在大型目录中检索相关产品。尽管不断取得进展,导致搜索引擎越来越准确,但对否定查询的性能关注有限。事实上,虽然我们期望检索不同的产品的查询“ iPhone13盖有戒指”和“ iPhone13盖无戒指”,这不会发生在流行的 PSE 与后者的查询包含不想要的戒指组件的结果。现代 PSE 在理解否定方面的局限性促使了进一步研究的必要性。在这项工作中,我们首先定义用户查询中的否定意图。然后,我们设计了一个基于变压器的模型,称为查询否定检测器(ND4Q) ,它在否定检测中达到了最佳的性能(在准确性指标上 + 95%)。最后,在构建了产品搜索查询的第一个否定检测器的基础上,提出了一种基于否定感知的过滤策略——过滤不相关产品(FIP)。使用 FIP (对于否定以“无”开头的查询,[ email protected ]增加9.41%)改善 PSE 相关性的有希望的实验结果为针对具有否定意识的 PSE 的额外研究努力铺平了道路。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Improving+the+Relevance+of+Product+Search+for+Queries+with+Negations)|0|
+|[Movie Ticket, Popcorn, and Another Movie Next Weekend: Time-Aware Service Sequential Recommendation for User Retention](https://doi.org/10.1145/3543873.3584628)|Xiaoyan Yang, Dong Wang, Binbin Hu, Dan Yang, Yue Shen, Jinjie Gu, Zhiqiang Zhang, Shiwei Lyu, Haipeng Zhang, Guannan Zhang|ShanghaiTech University, China; Ant Group, China|When a customer sees a movie recommendation, she may buy the ticket right away, which is the immediate feedback that helps improve the recommender system. Alternatively, she may choose to come back later and this long-term feedback is also modeled to promote user retention. However, the long-term feedback comes with non-trivial challenges in understanding user retention: the complicated correlation between current demands and follow-up demands, coupled with the periodicity of services. 
For instance, before the movie, the customer buys popcorn through the App, which temporally correlates with the initial movie recommendation. Days later, she checks the App for new movies, as a weekly routine. To address this complexity with more fine-grained revisit modeling, we propose Time Aware Service Sequential Recommendation (TASSR) for user retention, which is equipped with a multi-task design and an In-category TimeSeqBlock module. Large-scale online and offline experiments demonstrate its significant advantages over competitive baselines.|当顾客看到一部电影的推荐信时,她可能会马上买票,这是一种即时的反馈,有助于提高推荐系统。或者,她可以选择以后再来,这种长期的反馈也被建模以促进用户保留。然而,长期的反馈在理解用户保留方面带来了重大挑战: 当前需求和后续需求之间的复杂关系,以及服务的周期性。例如,在看电影之前,客户通过 App 购买爆米花,这在时间上与最初的电影推荐相关。几天后,她每周例行检查应用程序是否有新电影。为了在更细粒度的再访问建模中解决这一复杂性,我们提出了用于用户保持的时间感知服务序列推荐(TASSR) ,该推荐配备了多任务设计和同类 TimeSeqBlock 模块。大规模的在线和离线实验证明了它相对于竞争基线的显著优势。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Movie+Ticket,+Popcorn,+and+Another+Movie+Next+Weekend:+Time-Aware+Service+Sequential+Recommendation+for+User+Retention)|0|
+|[Unified Vision-Language Representation Modeling for E-Commerce Same-style Products Retrieval](https://doi.org/10.1145/3543873.3584632)|Ben Chen, Linbo Jin, Xinxin Wang, Dehong Gao, Wen Jiang, Wei Ning|Alibaba Group, China|Same-style products retrieval plays an important role in e-commerce platforms, aiming to identify the same products which may have different text descriptions or images. It can be used for similar-product retrieval across different suppliers or duplicate-product detection for one supplier. Common methods use the image as the detected object, but they only consider the visual features and overlook the attribute information contained in the textual descriptions, and perform weakly for products in industries where images are less informative, such as machinery, hardware tools, and electronic components, even if an additional text matching module is added. In this paper, we propose a unified vision-language modeling method for e-commerce same-style products retrieval, which is designed to represent one product with its textual descriptions and visual contents. It contains a sampling scheme that collects positive pairs from user click logs under category and relevance constraints, and a novel contrastive loss unit that models the image, text, and image+text representations in one joint embedding space. It is capable of cross-modal product-to-product retrieval, as well as style transfer and user-interactive search. Offline evaluations on annotated data demonstrate its superior retrieval performance, and online testing shows it can attract more clicks and conversions. 
Moreover, this model has already been deployed online for similar-product retrieval on alibaba.com, the largest B2B e-commerce platform in the world.|同类产品检索在电子商务平台中起着重要作用,其目的是识别具有不同文本描述或图像的同类产品。它可用于从不同供应商检索相似产品或检测一个供应商的重复产品。一般的检测方法都是以图像作为检测对象,但它们只考虑视觉特征,忽略了文本描述中的属性信息,对于机械、硬件工具和电子元件等图像不太重要的行业的产品,即使增加了额外的文本匹配模块,检测效果也很差。本文提出了一种统一的电子商务同类产品检索的视觉语言建模方法。它包含一种采样技巧,用于从类别和相关性受限的用户点击日志中收集正对,以及一种新的对比度损失单元,用于将图像、文本和图像 + 文本表示建模为一个联合嵌入空间。它能够进行跨模式的产品对产品检索,以及样式转移和用户交互式搜索。对注释数据的离线评估表明它具有优越的检索性能,在线测试表明它可以吸引更多的点击和转换。此外,该模型已经在全球最大的 B2B 电子商务平台阿里巴巴网站(alibaba.com)的类似产品检索中得到应用。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Unified+Vision-Language+Representation+Modeling+for+E-Commerce+Same-style+Products+Retrieval)|0|
+|[Task Adaptive Multi-learner Network for Joint CTR and CVR Estimation](https://doi.org/10.1145/3543873.3584653)|Xiaofan Liu, Qinglin Jia, Chuhan Wu, Jingjie Li, Quanyu Dai, Lin Bo, Rui Zhang, Ruiming Tang|ruizhang.info, China; Beijing University of Posts and Telecommunications, China; Renmin University of China, China; Huawei Noah's Ark Lab, China|CTR and CVR are critical factors in personalized applications, and many methods jointly estimate them via multi-task learning to alleviate the ultra-sparsity of conversion behaviors. However, it is still difficult to predict CVR accurately and robustly due to the limited and even biased knowledge extracted by a single model tower optimized on insufficient conversion samples. In this paper, we propose a task adaptive multi-learner (TAML) framework for joint CTR and CVR prediction. We design a hierarchical task adaptive knowledge representation module with different experts to capture knowledge at different granularities, which can effectively exploit the commonalities between CTR and CVR estimation tasks while keeping their unique characteristics. We apply multiple learners to extract data knowledge from various views and fuse their predictions to obtain accurate and robust scores. To facilitate knowledge sharing across learners, we further perform self-distillation that uses the fused scores to teach different learners. Thorough offline and online experiments show the superiority of TAML in different ad ranking tasks, and we have deployed it in Huawei’s online advertising platform to serve the main traffic.|CTR 和 CVR 是个性化应用中的关键因素,多种方法通过多任务学习来联合估计它们,以减轻转换行为的超稀疏性。然而,由于单模型塔在转换样本不足的情况下进行了优化,提取的知识有限,甚至有偏差,因此仍然难以准确、稳健地预测 CVR。本文提出了一个任务自适应多学习器(TAML)框架,用于联合 CTR 和 CVR 预测。设计了一个分层任务自适应知识表示模块,采用不同的专家来获取不同粒度的知识,有效地利用了 CTR 和 CVR 估计任务的共性,同时保持了它们的独特性。我们应用多个学习者从不同的角度提取数据知识,并融合他们的预测,以获得准确和稳健的分数。为了促进学习者之间的知识共享,我们进一步使用融合分数来教授不同的学习者。通过线下和线上的实验,我们发现了 TAML 在不同广告排名任务中的优势,并将其应用于华为的在线广告平台,为主要流量提供服务。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Task+Adaptive+Multi-learner+Network+for+Joint+CTR+and+CVR+Estimation)|0|
+|[Deep Intention-Aware Network for Click-Through Rate Prediction](https://doi.org/10.1145/3543873.3584661)|Yaxian Xia, Yi Cao, Sihao Hu, Tong Liu, Lingling Lu|Georgia Institute of Technology, USA; Zhejiang University, China; Alibaba Group, China|E-commerce platforms provide entrances for customers to enter mini-apps that can meet their specific shopping requirements. Trigger items displayed on entrance icons can attract more entries. However, conventional Click-Through-Rate (CTR) prediction models, which ignore users' instant interest in the trigger item, cannot be applied to the new recommendation scenario dubbed Trigger-Induced Recommendation in Mini-Apps (TIRA). 
Moreover, due to the high stickiness of customers to mini-apps, we argue that existing trigger-based methods, which over-emphasize the importance of trigger items, are undesirable for TIRA, since a large portion of customer entries stem from routine shopping habits rather than from triggers. We identify that the key to TIRA is to extract customers' personalized entering intention and weigh the impact of triggers based on this intention. To achieve this goal, we convert CTR prediction for TIRA into a separate estimation form, and present the Deep Intention-Aware Network (DIAN) with three key elements: 1) an Intent Net that estimates the user's entering intention, i.e., whether he/she is affected by the trigger or by habit; 2) a Trigger-Aware Net and 3) a Trigger-Free Net that estimate CTRs given that the user's intention is the trigger item or the mini-app, respectively. Through joint learning, DIAN can both accurately predict user intention and dynamically balance the results of trigger-free and trigger-based recommendations based on the estimated intention. Experiments show that DIAN advances state-of-the-art performance on a large real-world dataset, and brings a 9.39% lift in online Item Page View and a 4.74% CTR lift for Juhuasuan, a famous mini-app of Taobao.|电子商务平台为客户提供了进入迷你应用程序,可以满足他们的具体购物需求。触发项目显示在入口图标可以吸引更多的进入。然而,传统的点击率(Click-Through-Rate,CTR)预测模型忽略了用户对触发条目的即时兴趣,无法应用于被称为微型应用程序中的触发诱导推荐(Trigger-)的新推荐场景。此外,由于客户对迷你应用程序的高粘性,我们认为,现有的基于触发器的方法,过分强调触发项目的重要性,是不希望 TIRA,因为大部分客户进入是因为他们的日常购物习惯,而不是触发器。我们认为,TIRA 的关键是提取顾客的个性化进入意图,并根据这一意图权衡触发因素的影响。为了实现这一目标,我们将 TIRA 的 CTR 预测转化为一个单独的估计形式,并提出深度意图感知网络(DIAN)的三个关键要素: 1)意图网络,估计用户的进入意图,即他/她是否受到触发器或习惯的影响; 2)触发感知网络和3)无触发网络,估计给定用户意图的 CTR 分别是触发项目和迷你应用程序。DIAN 采用联合学习的方法,既能准确预测用户意图,又能根据预测意图动态平衡无触发和基于触发的推荐结果。实验表明,DIAN 在一个大型现实数据集中提升了最先进的性能,使在线项目页面查看率提高了9.39% ,淘宝著名小应用聚花酸的点击率提高了4.74% 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Deep+Intention-Aware+Network+for+Click-Through+Rate+Prediction)|0|
|[Search Personalization at Netflix](https://doi.org/10.1145/3543873.3587675)|Vito Ostuni, Christoph Kofler, Manjesh Nilange, Sudarshan Lamkhede, Dan Zylberglejd|Netflix Inc., USA|At Netflix, personalization plays a key role in several aspects of our user experience, from ranking titles to constructing an optimal Homepage. Although personalization is a well-established research field, its application to search presents unique problems and opportunities. In this paper, we describe the evolution of Search personalization at Netflix, its unique challenges, and provide a high-level overview of relevant solutions.|在 Netflix,个性化在我们的用户体验的几个方面起着关键作用,从排名标题到建立一个最佳的主页。虽然个性化是一个成熟的研究领域,但是它在搜索中的应用却带来了独特的问题和机遇。在本文中,我们描述了在 Netflix 搜索个性化的演变,其独特的挑战,并提供了相关解决方案的高层次概述。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Search+Personalization+at+Netflix)|0|
-|[Pretrained Embeddings for E-commerce Machine Learning: When it Fails and Why?](https://doi.org/10.1145/3543873.3587669)|Da Xu, Bo Yang|LinkedIn, USA; Amazon, USA|The use of pretrained embeddings has become widespread in modern e-commerce machine learning (ML) systems. In practice, however, we have encountered several key issues when using pretrained embeddings in a real-world production system, many of which cannot be fully explained by current knowledge. Unfortunately, we find that there is a lack of thorough understanding of how pre-trained embeddings work, especially their intrinsic properties and interactions with downstream tasks. 
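Relating this to the DIAN abstract above: one plausible reading of its "separate estimation form" is a mixture of the two intention-conditioned CTR estimates, weighted by the Intent Net's predicted probability that the user entered because of the trigger. The sketch below reflects that reading only; the paper's exact combination may differ.

```python
def dian_style_ctr(p_trigger_intent: float, ctr_trigger_aware: float, ctr_trigger_free: float) -> float:
    """Blend trigger-aware and trigger-free CTR estimates by the predicted
    probability that the entry was caused by the trigger (an assumption
    about how DIAN's three networks are combined)."""
    return p_trigger_intent * ctr_trigger_aware + (1.0 - p_trigger_intent) * ctr_trigger_free

# A habitual visitor (low trigger intent) is scored mostly by the trigger-free net.
print(dian_style_ctr(0.2, ctr_trigger_aware=0.12, ctr_trigger_free=0.05))
```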
Consequently, it becomes challenging to make interactive and scalable decisions regarding the use of pre-trained embeddings in practice. Our investigation leads to two significant discoveries about using pretrained embeddings in e-commerce applications. Firstly, we find that the design of the pretraining and downstream models, particularly how they encode and decode information via embedding vectors, can have a profound impact. Secondly, we establish a principled perspective on pre-trained embeddings via the lens of kernel analysis, which can be used to evaluate their predictability, interactively and scalably. These findings help to address the practical challenges we faced and offer valuable guidance for the successful adoption of pretrained embeddings in real-world production. Our conclusions are backed by solid theoretical reasoning, benchmark experiments, and online tests.|在现代电子商务机器学习(ML)系统中,预训练嵌入技术已经得到了广泛的应用。然而,在实践中,我们遇到了几个关键问题,当使用预训练嵌入在一个真实的生产系统,其中许多不能完全解释现有的知识。不幸的是,我们发现缺乏对预先训练的嵌入如何工作的透彻理解,特别是它们的内在属性和与下游任务的交互。因此,在实践中使用预先训练的嵌入方法时,做出交互式和可扩展的决策变得具有挑战性。我们的调查导致两个重要的发现,使用预训练嵌入在电子商务应用程序。首先,我们发现预训练和下游模型的设计,特别是它们如何通过嵌入向量对信息进行编码和解码,会产生深远的影响。其次,通过核分析的视角,建立了预训练嵌入的原则性视角,可以用来评估预训练嵌入的可预测性、交互性和可扩展性。这些发现有助于解决我们面临的实际挑战,并为在现实生产中成功采用预先培训的嵌入提供了宝贵的指导。我们的结论得到了可靠的理论推理、基准实验以及在线测试的支持。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Pretrained+Embeddings+for+E-commerce+Machine+Learning:+When+it+Fails+and+Why?)|0|
+|[Pretrained Embeddings for E-commerce Machine Learning: When it Fails and Why?](https://doi.org/10.1145/3543873.3587669)|Da Xu, Bo Yang|Amazon, USA; LinkedIn, USA|The use of pretrained embeddings has become widespread in modern e-commerce machine learning (ML) systems. In practice, however, we have encountered several key issues when using pretrained embeddings in a real-world production system, many of which cannot be fully explained by current knowledge. Unfortunately, we find that there is a lack of thorough understanding of how pre-trained embeddings work, especially their intrinsic properties and interactions with downstream tasks. Consequently, it becomes challenging to make interactive and scalable decisions regarding the use of pre-trained embeddings in practice. Our investigation leads to two significant discoveries about using pretrained embeddings in e-commerce applications. Firstly, we find that the design of the pretraining and downstream models, particularly how they encode and decode information via embedding vectors, can have a profound impact. Secondly, we establish a principled perspective on pre-trained embeddings via the lens of kernel analysis, which can be used to evaluate their predictability, interactively and scalably. These findings help to address the practical challenges we faced and offer valuable guidance for the successful adoption of pretrained embeddings in real-world production. 
Our conclusions are backed by solid theoretical reasoning, benchmark experiments, and online tests.|在现代电子商务机器学习(ML)系统中,预训练嵌入技术已经得到了广泛的应用。然而,在实践中,我们遇到了几个关键问题,当使用预训练嵌入在一个真实的生产系统,其中许多不能完全解释现有的知识。不幸的是,我们发现缺乏对预先训练的嵌入如何工作的透彻理解,特别是它们的内在属性和与下游任务的交互。因此,在实践中使用预先训练的嵌入方法时,做出交互式和可扩展的决策变得具有挑战性。我们的调查导致两个重要的发现,使用预训练嵌入在电子商务应用程序。首先,我们发现预训练和下游模型的设计,特别是它们如何通过嵌入向量对信息进行编码和解码,会产生深远的影响。其次,通过核分析的视角,建立了预训练嵌入的原则性视角,可以用来评估预训练嵌入的可预测性、交互性和可扩展性。这些发现有助于解决我们面临的实际挑战,并为在现实生产中成功采用预先培训的嵌入提供了宝贵的指导。我们的结论得到了可靠的理论推理、基准实验以及在线测试的支持。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Pretrained+Embeddings+for+E-commerce+Machine+Learning:+When+it+Fails+and+Why?)|0|
|[GELTOR: A Graph Embedding Method based on Listwise Learning to Rank](https://doi.org/10.1145/3543507.3583193)|Masoud Reyhani Hamedani, JinSu Ryu, SangWook Kim|Hanyang University, Republic of Korea|Similarity-based embedding methods have introduced a new perspective on graph embedding by conforming the similarity distribution of latent vectors in the embedding space to that of nodes in the graph; they show significant effectiveness over conventional embedding methods in various machine learning tasks. In this paper, we first point out three drawbacks of existing similarity-based embedding methods: inaccurate similarity computation, conflicting optimization goals, and impaired in/out-degree distributions. Then, motivated by these drawbacks, we propose AdaSim*, a novel similarity measure for graphs that is conducive to similarity-based graph embedding. We finally propose GELTOR, an effective embedding method that employs AdaSim* as a node similarity measure and the concept of learning-to-rank in the embedding process. Contrary to existing methods, GELTOR does not learn the similarity score distribution; instead, for any target node, GELTOR conforms the ranks of its top-t similar nodes in the embedding space to their original ranks based on AdaSim* scores. We conduct extensive experiments with six real-world datasets to evaluate the effectiveness of GELTOR in graph reconstruction, link prediction, and node classification tasks. 
Our experimental results show that (1) AdaSim* outperforms AdaSim, RWR, and MCT in computing node similarity in graphs, and (2) our GELTOR outperforms existing state-of-the-art and conventional embedding methods in most cases of the above machine learning tasks, thereby implying that learning-to-rank is beneficial to graph embedding.|基于相似性的嵌入方法通过调整嵌入空间中潜在向量与图中节点的相似性分布,为图的嵌入提供了一个新的视角,它们在各种机器学习任务中显示出比传统的嵌入方法更为有效的效果。本文首先指出了现有的基于相似度的嵌入方法存在的三个缺点: 相似度计算不准确、优化目标冲突和损伤内外度分布。然后,基于这些缺点,我们提出了 AdaSim * ,这是一种新的图的相似性度量,有利于基于相似性的图嵌入。最后提出了一种有效的嵌入方法 GELTOR,该方法采用 AdaSim * 作为节点相似性度量,并在嵌入过程中引入了学习排序的概念。与现有的方法相反,GELTOR 不学习相似度分数分布; 相反,对于任何目标节点,GELTOR 根据 AdaSim * 分数将其嵌入空间中的顶部 -t 相似节点的排名与其原始排名保持一致。我们使用六个真实世界的数据集进行了广泛的实验,以评估 GELTOR 在图重建、链路预测和节点分类任务中的有效性。我们的实验结果表明: (1) AdaSim * 在计算图中节点相似度方面优于 AdaSim,RWR 和 MCT; (2)在上述机器学习任务的大多数情况下,我们的 GELTOR 优于现有的最先进的和传统的嵌入方法,从而意味着学习排序有利于图嵌入。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=GELTOR:+A+Graph+Embedding+Method+based+on+Listwise+Learning+to+Rank)|0|
|[On the Theories Behind Hard Negative Sampling for Recommendation](https://doi.org/10.1145/3543507.3583223)|Wentao Shi, Jiawei Chen, Fuli Feng, Jizhi Zhang, Junkang Wu, Chongming Gao, Xiangnan He|Zhejiang University, China; University of Science and Technology of China, China|Negative sampling has been heavily used to train recommender models on large-scale data, wherein sampling hard examples usually not only accelerates the convergence but also improves the model accuracy. Nevertheless, the reasons for the effectiveness of Hard Negative Sampling (HNS) have not been revealed yet. In this work, we fill the research gap by conducting thorough theoretical analyses on HNS. Firstly, we prove that employing HNS on the Bayesian Personalized Ranking (BPR) learner is equivalent to optimizing One-way Partial AUC (OPAUC). Concretely, BPR equipped with Dynamic Negative Sampling (DNS) is an exact estimator, while BPR with softmax-based sampling is a soft estimator. Secondly, we prove that OPAUC has a stronger connection with Top-K evaluation metrics than AUC and verify it with simulation experiments. These analyses establish the theoretical foundation of HNS in optimizing Top-K recommendation performance for the first time. On this basis, we offer two insightful guidelines for effective usage of HNS: 1) the sampling hardness should be controllable, e.g., via pre-defined hyper-parameters, to adapt to different Top-K metrics and datasets; 2) the smaller the $K$ we emphasize in Top-K evaluation metrics, the harder the negative samples we should draw. 
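To make guideline (1) concrete, here is a hedged sketch of Dynamic Negative Sampling (DNS) as it is commonly described: draw a small pool of random candidates and keep the one the current model scores highest, with the pool size acting as the controllable hardness knob. Names and sizes are illustrative.

```python
import numpy as np

def dynamic_negative_sample(model_scores: np.ndarray, m: int, rng: np.random.Generator) -> int:
    """DNS: sample m candidate negatives uniformly, return the hardest one.

    Larger m yields harder negatives, matching the guideline that hardness
    should be controllable via a pre-defined hyper-parameter.
    """
    pool = rng.choice(len(model_scores), size=m, replace=False)  # random candidate items
    return int(pool[np.argmax(model_scores[pool])])              # highest-scored candidate

rng = np.random.default_rng(0)
scores = rng.normal(size=1000)   # current model scores for unobserved items
print(dynamic_negative_sample(scores, m=10, rng=rng))
```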
Extensive experiments on three real-world benchmarks verify the two guidelines.|负抽样已经被广泛用于大规模数据的推荐模型训练,而硬实例抽样不仅可以加快模型的收敛速度,而且可以提高模型的精度。然而,硬性负样本(HNS)有效性的原因尚未被揭示。本文通过对 HNS 进行深入的理论分析,填补了研究空白。首先,我们证明了对贝叶斯个性化排序(BPR)学习者使用 HNS 等价于优化单向部分 AUC (OPAUC)。具体来说,装有动态负抽样(DNS)的 BPR 是一个精确估计量,而基于软最大抽样的 BPR 是一个软估计量。其次,我们证明了 OPAUC 与 Top-K 评价指标之间的联系比 AUC 更强,并通过仿真实验进行了验证。这些分析首次为 HNS 优化 Top-K 推荐性能奠定了理论基础。在此基础上,我们为有效使用 HNS 提供了两个有见地的指导方针: 1)抽样硬度应该是可控的,例如,通过预定义的超参数,以适应不同的 Top-K 指标和数据集; 2)我们在 Top-K 评估指标中强调的 $K $越小,我们应该抽取的负面样本就越难。在三个真实世界的基准上进行的大量实验验证了这两条准则。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=On+the+Theories+Behind+Hard+Negative+Sampling+for+Recommendation)|0| -|[A Counterfactual Collaborative Session-based Recommender System](https://doi.org/10.1145/3543507.3583321)|Wenzhuo Song, Shoujin Wang, Yan Wang, Kunpeng Liu, Xueyan Liu, Minghao Yin|Macquarie University, Australia; University of Technology Sydney, Australia; Northeast Normal University, China; Jilin University, China; Portland State University, USA|Most session-based recommender systems (SBRSs) focus on extracting information from the observed items in the current session of a user to predict a next item, ignoring the causes outside the session (called outer-session causes, OSCs) that influence the user's selection of items. However, these causes widely exist in the real world, and few studies have investigated their role in SBRSs. In this work, we analyze the causalities and correlations of the OSCs in SBRSs from the perspective of causal inference. We find that the OSCs are essentially the confounders in SBRSs, which leads to spurious correlations in the data used to train SBRS models. To address this problem, we propose a novel SBRS framework named COCO-SBRS (COunterfactual COllaborative Session-Based Recommender Systems) to learn the causality between OSCs and user-item interactions in SBRSs. COCO-SBRS first adopts a self-supervised approach to pre-train a recommendation model by designing pseudo-labels of causes for each user's selection of the item in data to guide the training process. Next, COCO-SBRS adopts counterfactual inference to recommend items based on the outputs of the pre-trained recommendation model considering the causalities to alleviate the data sparsity problem. As a result, COCO-SBRS can learn the causalities in data, preventing the model from learning spurious correlations. 
Extensive experiments conducted on three real-world datasets demonstrate the superiority of our proposed framework over ten representative SBRSs.|大多数基于会话的推荐系统(SBS)专注于从用户当前会话中观察到的项目中提取信息来预测下一个项目,而忽略了会话之外影响用户选择项目的原因(称为外部会话原因,OSC)。然而,这些原因在现实世界中普遍存在,很少有研究探讨它们在 SBRS 中的作用。本文从因果推理的角度分析了 SBRS 中 OSCs 的因果关系及其相关性。我们发现 OSC 本质上是 SBRS 中的混杂因素,这导致了用于训练 SBRS 模型的数据中存在虚假的相关性。为了解决这一问题,我们提出了一种新的 SBRS 框架 COCO-SBRS (COCO-SBRS,非事实协作的基于会话的推荐系统)来了解在 SBRS 中 OSC 和用户项目交互之间的因果关系。COCO-SBRS 首先采用自我监督的方法对推荐模型进行预训练,为每个用户选择数据中项目的原因设计伪标签,以指导训练过程。其次,COCO-SBRS 采用反事实推理方法,根据预训练推荐模型的输出结果进行推荐,考虑因果关系,以缓解数据稀疏问题。因此,COCO-SBRS 模型可以学习数据中的因果关系,防止模型学习虚假的相关性。我们在三个实际数据集上进行的大量实验结果表明,我们提出的框架优于十个具有代表性的 SBRS。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Counterfactual+Collaborative+Session-based+Recommender+System)|0|
+|[A Counterfactual Collaborative Session-based Recommender System](https://doi.org/10.1145/3543507.3583321)|Wenzhuo Song, Shoujin Wang, Yan Wang, Kunpeng Liu, Xueyan Liu, Minghao Yin|Macquarie University, Australia; University of Technology Sydney, Australia; Portland State University, USA; Jilin University, China; Northeast Normal University, China|Most session-based recommender systems (SBRSs) focus on extracting information from the observed items in the current session of a user to predict a next item, ignoring the causes outside the session (called outer-session causes, OSCs) that influence the user's selection of items. However, these causes widely exist in the real world, and few studies have investigated their role in SBRSs. In this work, we analyze the causalities and correlations of the OSCs in SBRSs from the perspective of causal inference. We find that the OSCs are essentially the confounders in SBRSs, which leads to spurious correlations in the data used to train SBRS models. To address this problem, we propose a novel SBRS framework named COCO-SBRS (COunterfactual COllaborative Session-Based Recommender Systems) to learn the causality between OSCs and user-item interactions in SBRSs. COCO-SBRS first adopts a self-supervised approach to pre-train a recommendation model by designing pseudo-labels of causes for each user's selection of the item in data to guide the training process. Next, COCO-SBRS adopts counterfactual inference to recommend items based on the outputs of the pre-trained recommendation model considering the causalities to alleviate the data sparsity problem. As a result, COCO-SBRS can learn the causalities in data, preventing the model from learning spurious correlations. 
Extensive experiments conducted on three real-world datasets demonstrate the superiority of our proposed framework over ten representative SBRSs.|大多数基于会话的推荐系统(SBS)专注于从用户当前会话中观察到的项目中提取信息来预测下一个项目,而忽略了会话之外影响用户选择项目的原因(称为外部会话原因,OSC)。然而,这些原因在现实世界中普遍存在,很少有研究探讨它们在 SBRS 中的作用。本文从因果推理的角度分析了 SBRS 中 OSCs 的因果关系及其相关性。我们发现 OSC 本质上是 SBRS 中的混杂因素,这导致了用于训练 SBRS 模型的数据中存在虚假的相关性。为了解决这一问题,我们提出了一种新的 SBRS 框架 COCO-SBRS (COCO-SBRS,非事实协作的基于会话的推荐系统)来了解在 SBRS 中 OSC 和用户项目交互之间的因果关系。COCO-SBRS 首先采用自我监督的方法对推荐模型进行预训练,为每个用户选择数据中项目的原因设计伪标签,以指导训练过程。其次,COCO-SBRS 采用反事实推理方法,根据预训练推荐模型的输出结果进行推荐,考虑因果关系,以缓解数据稀疏问题。因此,COCO-SBRS 模型可以学习数据中的因果关系,防止模型学习虚假的相关性。我们在三个实际数据集上进行的大量实验结果表明,我们提出的框架优于十个具有代表性的 SBRS。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Counterfactual+Collaborative+Session-based+Recommender+System)|0|
|[Debiased Contrastive Learning for Sequential Recommendation](https://doi.org/10.1145/3543507.3583361)|Yuhao Yang, Chao Huang, Lianghao Xia, Chunzhen Huang, Da Luo, Kangyi Lin||Current sequential recommender systems tackle dynamic user preference learning with various neural techniques, such as Transformer and Graph Neural Networks (GNNs). However, inference from the highly sparse user behavior data may hinder the representation ability of sequential pattern encoding. To address the label shortage issue, contrastive learning (CL) methods have recently been proposed to perform data augmentation in two fashions: (i) randomly corrupting the sequence data (e.g. stochastic masking, reordering); (ii) aligning representations across pre-defined contrastive views. Although effective, we argue that current CL-based methods have limitations in addressing popularity bias and in disentangling user conformity from real interest. In this paper, we propose a new Debiased Contrastive learning paradigm for Recommendation (DCRec) that unifies sequential pattern encoding with global collaborative relation modeling through adaptive conformity-aware augmentation. This solution is designed to tackle the popularity bias issue in recommendation systems. Our debiased contrastive learning framework effectively captures both the patterns of item transitions within sequences and the dependencies between users across sequences. Our experiments on various real-world datasets have demonstrated that DCRec significantly outperforms state-of-the-art baselines, indicating its efficacy for recommendation. To facilitate reproducibility of our results, we make our implementation of DCRec publicly available at: https://github.com/HKUDS/DCRec.|目前的顺序推荐系统主要采用变压器和图形神经网络(GNN)等多种神经网络技术来解决动态用户偏好学习问题。然而,从高度稀疏的用户行为数据中进行推断可能会阻碍序列模式编码的表示能力。为了解决标签短缺问题,最近提出了对比学习(CL)方法,以两种方式进行数据增强: (i)随机破坏序列数据(例如随机掩蔽,重新排序) ; (ii)跨预定义的对比视图对齐表示。虽然有效,但我们认为目前基于 CL 的方法在解决流行偏差和用户一致性与真实兴趣的分离方面存在局限性。本文提出了一种新的无偏对比推荐学习范式,它通过自适应整合意识增强将序列模式编码与全局协作关系建模相结合。该解决方案旨在解决推荐系统中的流行偏差问题。我们的去偏差对比学习框架有效地捕获了序列中的项目转换模式和用户之间的依赖关系。我们在各种真实世界数据集上的实验表明,DCREc 显著优于最先进的基线,表明其推荐功效。为了便于重复我们的结果,我们将我们的 DCRec 的实现公布于以下 https://github.com/hkuds/DCRec。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Debiased+Contrastive+Learning+for+Sequential+Recommendation)|0|
-|[Learning Vector-Quantized Item Representation for Transferable Sequential Recommenders](https://doi.org/10.1145/3543507.3583434)|Yupeng Hou, Zhankui He, Julian J. 
McAuley, Wayne Xin Zhao|Beijing Key Laboratory of Big Data Management and Analysis Methods, Renmin University of China, China; UC San Diego, USA; Renmin University of China, China|Recently, the generality of natural language text has been leveraged to develop transferable recommender systems. The basic idea is to employ pre-trained language models~(PLM) to encode item text into item representations. Despite the promising transferability, the binding between item text and item representations might be too tight, leading to potential problems such as over-emphasizing the effect of text features and exaggerating the negative impact of domain gap. To address this issue, this paper proposes VQ-Rec, a novel approach to learning Vector-Quantized item representations for transferable sequential Recommenders. The main novelty of our approach lies in the new item representation scheme: it first maps item text into a vector of discrete indices (called item code), and then employs these indices to lookup the code embedding table for deriving item representations. Such a scheme can be denoted as "text $\Longrightarrow$ code $\Longrightarrow$ representation". Based on this representation scheme, we further propose an enhanced contrastive pre-training approach, using semi-synthetic and mixed-domain code representations as hard negatives. Furthermore, we design a new cross-domain fine-tuning method based on a differentiable permutation-based network. Extensive experiments conducted on six public benchmarks demonstrate the effectiveness of the proposed approach, in both cross-domain and cross-platform settings. Code and pre-trained model are available at: https://github.com/RUCAIBox/VQ-Rec.|近年来,人们利用自然语言文本的通用性来开发可转移的推荐系统。其基本思想是使用预先训练好的语言模型 ~ (PLM)将项目文本编码成项目表示。项目文本与项目表征之间的联系过于紧密,可能导致过分强调文本特征的作用,夸大领域差距的负面影响等问题。为了解决这一问题,本文提出了一种新的学习矢量量化项目表示的方法 VQ-Rec。该方法的主要创新点在于新的项目表示方案: 它首先将项目文本映射到一个离散索引的向量(称为项目代码) ,然后使用这些索引查找代码嵌入表以获得项目表示。这样的方案可以表示为“ text $Longrightarrow $code $Longrightarrow $代表”。基于这种表示方案,我们进一步提出了一种增强的对比预训练方法,使用半合成和混合域代码表示作为硬负数。在此基础上,设计了一种新的基于可微置换网络的跨域微调方法。在六个公共基准上进行的大量实验证明了该方法在跨领域和跨平台环境中的有效性。代码和预先训练的模型可在以下 https://github.com/rucaibox/vq-rec 找到:。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+Vector-Quantized+Item+Representation+for+Transferable+Sequential+Recommenders)|0| -|[KAE-Informer: A Knowledge Auto-Embedding Informer for Forecasting Long-Term Workloads of Microservices](https://doi.org/10.1145/3543507.3583288)|Qin Hua, Dingyu Yang, Shiyou Qian, Hanwen Hu, Jian Cao, Guangtao Xue|Alibaba Group, China; Shanghai Jiao Tong University, China|Accurately forecasting workloads in terms of throughput that is quantified as queries per second (QPS) is essential for microservices to elastically adjust their resource allocations. However, long-term QPS prediction is challenging in two aspects: 1) generality across various services with different temporal patterns, 2) characterization of intricate QPS sequences which are entangled by multiple components. In this paper, we propose a knowledge auto-embedding Informer network (KAE-Informer) for forecasting the long-term QPS sequences of microservices. By analyzing a large number of microservice traces, we discover that there are two main decomposable and predictable components in QPS sequences, namely global trend & dominant periodicity (TP) and low-frequency residual patterns with long-range dependencies. These two components are important for accurately forecasting long-term QPS. 
First, KAE-Informer embeds the knowledge of TP components through mathematical modeling. Second, KAE-Informer designs a convolution ProbSparse self-attention mechanism and a multi-layer event discrimination scheme to extract and embed the knowledge of local context awareness and event regression effect implied in residual components, respectively. We conduct experiments based on three real datasets including a QPS dataset collected from 40 microservices. The experiment results show that KAE-Informer achieves a reduction of MAPE, MAE and RMSE by about 16.6%, 17.6% and 23.1% respectively, compared to the state-of-the-art models.|根据每秒查询(QPS)量化的吞吐量准确预测工作负载对于微服务弹性调整其资源分配至关重要。然而,长期的 QPS 预测在两个方面具有挑战性: 1)不同时间模式的服务之间的一般性,2)被多个组件纠缠在一起的复杂的 QPS 序列的角色塑造。本文提出了一种基于知识自动嵌入的信息网络(KAE-Informer)来预测微服务的长期 QPS 序列。通过对大量微服务跟踪的分析,发现 QPS 序列中存在两个主要的可分解和可预测成分,即全局趋势和主周期(TP)和具有长程依赖性的低频残差模式。这两个组成部分是准确预测长期 QPS 的重要组成部分。首先,KAE-Informer 通过数学建模嵌入 TP 元件的知识。其次,KAE-Informer 分别设计了一种卷积 Probse 自注意机制和一种多层次事件识别方案来提取和嵌入残差分量中隐含的局部上下文感知和事件回归效应的知识。我们基于三个实际数据集进行实验,其中包括从40个微服务中收集的 QPS 数据集。实验结果表明,与现有的模型相比,KAE-Informer 的 MAPE、 MAE 和 RMSE 分别降低了约16.6% 、17.6% 和23.1% 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=KAE-Informer:+A+Knowledge+Auto-Embedding+Informer+for+Forecasting+Long-Term+Workloads+of+Microservices)|0| +|[Learning Vector-Quantized Item Representation for Transferable Sequential Recommenders](https://doi.org/10.1145/3543507.3583434)|Yupeng Hou, Zhankui He, Julian J. McAuley, Wayne Xin Zhao|UC San Diego, USA; Renmin University of China, China; Beijing Key Laboratory of Big Data Management and Analysis Methods, Renmin University of China, China|Recently, the generality of natural language text has been leveraged to develop transferable recommender systems. The basic idea is to employ pre-trained language models~(PLM) to encode item text into item representations. Despite the promising transferability, the binding between item text and item representations might be too tight, leading to potential problems such as over-emphasizing the effect of text features and exaggerating the negative impact of domain gap. To address this issue, this paper proposes VQ-Rec, a novel approach to learning Vector-Quantized item representations for transferable sequential Recommenders. The main novelty of our approach lies in the new item representation scheme: it first maps item text into a vector of discrete indices (called item code), and then employs these indices to lookup the code embedding table for deriving item representations. Such a scheme can be denoted as "text $\Longrightarrow$ code $\Longrightarrow$ representation". Based on this representation scheme, we further propose an enhanced contrastive pre-training approach, using semi-synthetic and mixed-domain code representations as hard negatives. Furthermore, we design a new cross-domain fine-tuning method based on a differentiable permutation-based network. Extensive experiments conducted on six public benchmarks demonstrate the effectiveness of the proposed approach, in both cross-domain and cross-platform settings. 
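As an illustration of the "text => code => representation" scheme described above, here is a minimal sketch in which discrete item-code indices look up a learnable codebook and are pooled into an item representation. The codebook size, embedding dimension, and mean-pooling are illustrative assumptions, not VQ-Rec's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
codebook = rng.normal(size=(256, 32))        # 256 learnable code embeddings, 32-dim each

def item_representation(item_code: np.ndarray) -> np.ndarray:
    """Map an item code (a vector of discrete indices, e.g., obtained by
    quantizing a PLM text embedding) to an item representation by looking
    up and pooling the corresponding code embeddings."""
    return codebook[item_code].mean(axis=0)   # mean-pooling is an assumption

code = np.array([3, 87, 200, 41])             # a toy 4-index item code
print(item_representation(code).shape)        # -> (32,)
```

Because downstream training updates the codebook rather than the text encoder's output directly, the binding between item text and item representation is looser, which is the transferability argument made above.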
Code and pre-trained model are available at: https://github.com/RUCAIBox/VQ-Rec.|近年来,人们利用自然语言文本的通用性来开发可转移的推荐系统。其基本思想是使用预先训练好的语言模型 ~ (PLM)将项目文本编码成项目表示。项目文本与项目表征之间的联系过于紧密,可能导致过分强调文本特征的作用,夸大领域差距的负面影响等问题。为了解决这一问题,本文提出了一种新的学习矢量量化项目表示的方法 VQ-Rec。该方法的主要创新点在于新的项目表示方案: 它首先将项目文本映射到一个离散索引的向量(称为项目代码) ,然后使用这些索引查找代码嵌入表以获得项目表示。这样的方案可以表示为“ text $Longrightarrow $code $Longrightarrow $代表”。基于这种表示方案,我们进一步提出了一种增强的对比预训练方法,使用半合成和混合域代码表示作为硬负数。在此基础上,设计了一种新的基于可微置换网络的跨域微调方法。在六个公共基准上进行的大量实验证明了该方法在跨领域和跨平台环境中的有效性。代码和预先训练的模型可在以下 https://github.com/rucaibox/vq-rec 找到:。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+Vector-Quantized+Item+Representation+for+Transferable+Sequential+Recommenders)|0| +|[KAE-Informer: A Knowledge Auto-Embedding Informer for Forecasting Long-Term Workloads of Microservices](https://doi.org/10.1145/3543507.3583288)|Qin Hua, Dingyu Yang, Shiyou Qian, Hanwen Hu, Jian Cao, Guangtao Xue|Shanghai Jiao Tong University, China; Alibaba Group, China|Accurately forecasting workloads in terms of throughput that is quantified as queries per second (QPS) is essential for microservices to elastically adjust their resource allocations. However, long-term QPS prediction is challenging in two aspects: 1) generality across various services with different temporal patterns, 2) characterization of intricate QPS sequences which are entangled by multiple components. In this paper, we propose a knowledge auto-embedding Informer network (KAE-Informer) for forecasting the long-term QPS sequences of microservices. By analyzing a large number of microservice traces, we discover that there are two main decomposable and predictable components in QPS sequences, namely global trend & dominant periodicity (TP) and low-frequency residual patterns with long-range dependencies. These two components are important for accurately forecasting long-term QPS. First, KAE-Informer embeds the knowledge of TP components through mathematical modeling. Second, KAE-Informer designs a convolution ProbSparse self-attention mechanism and a multi-layer event discrimination scheme to extract and embed the knowledge of local context awareness and event regression effect implied in residual components, respectively. We conduct experiments based on three real datasets including a QPS dataset collected from 40 microservices. The experiment results show that KAE-Informer achieves a reduction of MAPE, MAE and RMSE by about 16.6%, 17.6% and 23.1% respectively, compared to the state-of-the-art models.|根据每秒查询(QPS)量化的吞吐量准确预测工作负载对于微服务弹性调整其资源分配至关重要。然而,长期的 QPS 预测在两个方面具有挑战性: 1)不同时间模式的服务之间的一般性,2)被多个组件纠缠在一起的复杂的 QPS 序列的角色塑造。本文提出了一种基于知识自动嵌入的信息网络(KAE-Informer)来预测微服务的长期 QPS 序列。通过对大量微服务跟踪的分析,发现 QPS 序列中存在两个主要的可分解和可预测成分,即全局趋势和主周期(TP)和具有长程依赖性的低频残差模式。这两个组成部分是准确预测长期 QPS 的重要组成部分。首先,KAE-Informer 通过数学建模嵌入 TP 元件的知识。其次,KAE-Informer 分别设计了一种卷积 Probse 自注意机制和一种多层次事件识别方案来提取和嵌入残差分量中隐含的局部上下文感知和事件回归效应的知识。我们基于三个实际数据集进行实验,其中包括从40个微服务中收集的 QPS 数据集。实验结果表明,与现有的模型相比,KAE-Informer 的 MAPE、 MAE 和 RMSE 分别降低了约16.6% 、17.6% 和23.1% 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=KAE-Informer:+A+Knowledge+Auto-Embedding+Informer+for+Forecasting+Long-Term+Workloads+of+Microservices)|0| |[Propaganda Política Pagada: Exploring U.S. Political Facebook Ads en Español](https://doi.org/10.1145/3543507.3583425)|Bruno Coelho, Tobias Lauinger, Laura Edelson, Ian Goldstein, Damon McCoy|New York University, USA|In 2021, the U.S. Hispanic population totaled 62.5 million people, 68% of whom spoke Spanish in their homes. 
To date, it is unclear which political advertisers address this audience in their preferred language, and whether they do so differently than for English-speaking audiences. In this work, we study differences between political Facebook ads in English and Spanish during 2020, the most recent U.S. presidential election. Political advertisers spent $1.48B in English, but only $28.8M in Spanish, disproportionately little compared to the share of Spanish speakers in the population. We further find a lower proportion of election-related advertisers (which additionally are more liberal-leaning than in the English set), and a higher proportion of government agencies in the set of Spanish ads. We perform multilingual topic classification, finding that the most common ad topics in English were also present in Spanish, but to a different extent, and with a different composition of advertisers. Thus, Spanish speakers are served different types of ads from different types of advertisers than English speakers, and in lower amounts; these results raise the question of whether political communication through Facebook ads may be inequitable and effectively disadvantaging the sizeable minority of Spanish speakers in the U.S. population.|2021年,美国西班牙裔人口总数为6250万,其中68% 的人在家里说西班牙语。到目前为止,还不清楚哪些政治广告主用自己喜欢的语言向这些受众发表演讲,以及他们的做法是否与英语受众不同。在这项工作中,我们研究了2020年美国总统大选期间 Facebook 上英语和西班牙语的政治广告之间的差异。政治广告客户在英语广告上花费了14.8亿美元,但在西班牙语广告上只花费了2880万美元,与说西班牙语的人口比例相比,这个数字不成比例。我们进一步发现,与选举有关的广告客户比例较低(此外,这些广告客户比英语广告客户更倾向于自由派) ,而在西班牙语广告客户中,政府机构的比例较高。我们进行了多语言话题分类,发现英语中最常见的广告话题也出现在西班牙语中,但程度不同,广告主的构成也不同。因此,说西班牙语的人比说英语的人得到了不同类型的广告,而且数量较少; 这些结果提出了一个问题: 通过 Facebook 广告进行的政治交流是否不公平,是否有效地损害了美国人口中说西班牙语的少数人的利益。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Propaganda+Política+Pagada:+Exploring+U.S.+Political+Facebook+Ads+en+Español)|0|
-|[Learning Denoised and Interpretable Session Representation for Conversational Search](https://doi.org/10.1145/3543507.3583265)|Kelong Mao, Hongjin Qian, Fengran Mo, Zhicheng Dou, Bang Liu, Xiaohua Cheng, Zhao Cao|Université de Montréal, Canada; RALI & Mila, Université de Montréal, Canada; Huawei Poisson Lab, China; Renmin University of China, China|Conversational search supports multi-turn user-system interactions to solve complex information needs. Compared with traditional single-turn ad-hoc search, conversational search faces a more complex search intent understanding problem because a conversational search session is much longer and contains many noisy tokens. However, existing conversational dense retrieval solutions simply fine-tune the pre-trained ad-hoc query encoder on limited conversational search data, which makes it hard to achieve satisfactory performance in such a complex conversational search scenario. Meanwhile, the learned latent representation also lacks interpretability: people cannot perceive how the model understands the session. To tackle the above drawbacks, we propose a sparse Lexical-based Conversational REtriever (LeCoRE), which extends the SPLADE model with two well-matched multi-level denoising methods, uniformly based on knowledge distillation and external query rewrites, to generate denoised and interpretable lexical session representations. 
Extensive experiments on four public conversational search datasets in both normal and zero-shot evaluation settings demonstrate the strong performance of LeCoRE towards more effective and interpretable conversational search.|会话搜索支持多回合的用户-系统交互,以解决复杂的信息需求。与传统的单向自组织搜索相比,会话搜索面临着更复杂的搜索意图理解问题,因为会话搜索会话更长,且包含大量噪声标记。然而,现有的会话密集检索解决方案只是在有限的会话搜索数据上对预先训练好的自组织查询编码器进行微调,难以在如此复杂的会话搜索场景中获得令人满意的性能。同时,习得的潜在表征也缺乏可解释性,人们无法感知模型是如何理解会话的。针对上述缺点,本文提出了一种基于稀疏词汇的会话检索(Conversational REtriever,LeCoRE)算法,该算法扩展了 SPLADE 模型,采用基于知识提取和外部查询重写的两种匹配性较好的多级去噪方法,均匀地生成去噪和可解释的词汇会话表示。对四个公共会话搜索数据集在正常和零拍评估环境下的大量实验表明,LeCoRE 在更有效和可解释的会话搜索方面具有很强的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+Denoised+and+Interpretable+Session+Representation+for+Conversational+Search)|0|
+|[Learning Denoised and Interpretable Session Representation for Conversational Search](https://doi.org/10.1145/3543507.3583265)|Kelong Mao, Hongjin Qian, Fengran Mo, Zhicheng Dou, Bang Liu, Xiaohua Cheng, Zhao Cao|RALI & Mila, Université de Montréal, Canada; Huawei Poisson Lab, China; Renmin University of China, China; Université de Montréal, Canada|Conversational search supports multi-turn user-system interactions to solve complex information needs. Compared with traditional single-turn ad-hoc search, conversational search faces a more complex search intent understanding problem because a conversational search session is much longer and contains many noisy tokens. However, existing conversational dense retrieval solutions simply fine-tune the pre-trained ad-hoc query encoder on limited conversational search data, which makes it hard to achieve satisfactory performance in such a complex conversational search scenario. Meanwhile, the learned latent representation also lacks interpretability: people cannot perceive how the model understands the session. To tackle the above drawbacks, we propose a sparse Lexical-based Conversational REtriever (LeCoRE), which extends the SPLADE model with two well-matched multi-level denoising methods, uniformly based on knowledge distillation and external query rewrites, to generate denoised and interpretable lexical session representations. Extensive experiments on four public conversational search datasets in both normal and zero-shot evaluation settings demonstrate the strong performance of LeCoRE towards more effective and interpretable conversational search.|会话搜索支持多回合的用户-系统交互,以解决复杂的信息需求。与传统的单向自组织搜索相比,会话搜索面临着更复杂的搜索意图理解问题,因为会话搜索会话更长,且包含大量噪声标记。然而,现有的会话密集检索解决方案只是在有限的会话搜索数据上对预先训练好的自组织查询编码器进行微调,难以在如此复杂的会话搜索场景中获得令人满意的性能。同时,习得的潜在表征也缺乏可解释性,人们无法感知模型是如何理解会话的。针对上述缺点,本文提出了一种基于稀疏词汇的会话检索(Conversational REtriever,LeCoRE)算法,该算法扩展了 SPLADE 模型,采用基于知识提取和外部查询重写的两种匹配性较好的多级去噪方法,均匀地生成去噪和可解释的词汇会话表示。对四个公共会话搜索数据集在正常和零拍评估环境下的大量实验表明,LeCoRE 在更有效和可解释的会话搜索方面具有很强的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+Denoised+and+Interpretable+Session+Representation+for+Conversational+Search)|0|
|[Fairly Adaptive Negative Sampling for Recommendations](https://doi.org/10.1145/3543507.3583355)|Xiao Chen, Wenqi Fan, Jingfan Chen, Haochen Liu, Zitao Liu, Zhaoxiang Zhang, Qing Li||Pairwise learning strategies are prevalent for optimizing recommendation models on implicit feedback data, which usually learn user preference by discriminating between positive (i.e., clicked by a user) and negative items (i.e., obtained by negative sampling). However, the sizes of different item groups (specified by item attributes) are usually unevenly distributed. 
We empirically find that the commonly used uniform negative sampling strategy for pairwise algorithms (e.g., BPR) can inherit such data bias and oversample the majority item group as negative instances, severely countering group fairness on the item side. In this paper, we propose a Fairly adaptive Negative sampling approach (FairNeg), which improves item group fairness by adaptively adjusting the group-level negative sampling distribution during training. In particular, it first perceives the model's unfairness status at each step and then adjusts the group-wise sampling distribution with an adaptive momentum update strategy to better facilitate fairness optimization. Moreover, a negative sampling distribution Mixup mechanism is proposed, which gracefully incorporates existing importance-aware sampling techniques intended for mining informative negative samples, thus serving multiple optimization purposes. Extensive experiments on four public datasets show our proposed method's superiority in group fairness enhancement and in the fairness-utility tradeoff.|成对学习策略普遍用于优化隐性反馈数据的推荐模型,它通常通过区分正面(即用户点击)和负面(即通过负面抽样获得)来学习用户偏好。但是,不同项目组(由项目属性指定)的大小通常是不均匀分布的。实证结果表明,成对算法中常用的一致负抽样策略(如 BPR)会继承这种数据偏差,并将多数项目组作为负实例过度抽样,严重影响项目方的群体公平性。本文提出了一种公平自适应负抽样方法(FairNeg) ,该方法通过在训练过程中自适应调整组级负抽样分布来提高项目组的公平性。特别地,它首先在每个步骤中感知模型的不公平状态,然后利用自适应动量更新策略调整分组抽样分布,以更好地促进公平性优化。此外,提出了负抽样分布混合机制,它优雅地结合了现有的重要性感知抽样技术,旨在挖掘信息负样本,从而实现多种优化目的。在四个公共数据集上的大量实验表明,该方法在增强群体公平性和公平-效用权衡方面具有优越性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Fairly+Adaptive+Negative+Sampling+for+Recommendations)|0|
|[CNSVRE: A Query Reformulated Search System with Explainable Summarization for Virtual Research Environment](https://doi.org/10.1145/3543873.3587360)|Na Li, Yangjun Zhang, Zhiming Zhao|University of Amsterdam, Netherlands|Computational notebook environments have drawn broad attention in data-centric research applications, e.g., virtual research environments, for exploratory data analysis and algorithm prototyping. Vanilla computational notebook search solutions have been proposed, but they pay little attention to the information needs of scientific researchers. Previous studies either treat computational notebook search as a code search problem or focus on content-based computational notebook search. The queries considered are neither research-oriented nor diverse, whereas researchers’ information needs are highly specialized and complex. Moreover, relevance evaluation for computational notebooks is tricky and unreliable, since computational notebooks contain fragments of text and code and are usually poorly organized. To solve the above challenges, we propose a computational notebook search system for virtual research environments (VREs), i.e., CNSVRE, with scientific query reformulation and computational notebook summarization. 
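Returning to the FairNeg abstract above: its core step, adaptively adjusting the group-level negative-sampling distribution with a momentum update, can be sketched as follows. The unfairness signal and update rule here are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def update_group_sampling_dist(p, velocity, unfairness, momentum=0.9, lr=0.1):
    """Shift per-group negative-sampling probabilities toward item groups the
    model currently treats less fairly, smoothed by a momentum term.

    `unfairness` is a per-group unfairness measure observed at this step;
    FairNeg's actual signal and update may differ.
    """
    signal = unfairness - unfairness.mean()        # relative unfairness per group
    velocity = momentum * velocity + lr * signal   # momentum-smoothed adjustment
    p = np.clip(p + velocity, 1e-6, None)          # keep probabilities positive
    return p / p.sum(), velocity                   # renormalize to a distribution

p = np.full(3, 1 / 3)                              # start uniform over 3 item groups
v = np.zeros(3)
p, v = update_group_sampling_dist(p, v, unfairness=np.array([0.5, 0.1, 0.1]))
print(p)                                           # mass shifts toward group 0
```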
We conduct a user study to demonstrate the effectiveness, efficiency, and satisfaction with the system.|计算机笔记本环境在以数据为中心的研究应用中引起了广泛的关注,例如用于探索性数据分析和算法原型的虚拟研究环境。香草计算笔记本搜索解决方案已经提出,但他们没有太多的关注科学研究人员的信息需求。以往的研究要么将计算笔记本搜索视为一个代码搜索问题,要么将重点放在基于内容的计算笔记本搜索上。被考虑的查询既不涉及研究,也不多样化,而研究人员的信息需求是高度专业化和复杂化的。此外,计算笔记本的相关性评估是棘手和不可靠的,因为计算笔记本包含文本和代码片段,通常组织不良。为了解决上述挑战,我们提出了一个虚拟研究环境(即 CNSVRE)的计算笔记本搜索系统,该系统具有科学的查询重构和计算笔记本摘要。我们进行了用户研究,以证明系统的有效性、效率和满意度。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=CNSVRE:+A+Query+Reformulated+Search+System+with+Explainable+Summarization+for+Virtual+Research+Environment)|0| |[Personalized style recommendation via reinforcement learning](https://doi.org/10.1145/3543873.3587367)|Jiyun Luo, Kurchi Subhra Hazra, Wenyu Huo, Rui Li, Abhijit Mahabal|Pinterest Inc., USA|Pinterest fashion and home decor searchers often have different style tastes. Some existing work adopts users’ past engagement to infer style preference. These methods cannot help users discover new styles. Other work requires users to provide text or visual signals to describe their style preference, but users often are not familiar with style terms and do not have the right image to start with. In this paper, we propose a reinforcement learning (RL) method to help users explore and exploit style space without requiring extra user input. Experimental results show that our method improves the success rate of Pinterest fashion and home decor searches by 34.8%.|Pinterest 时尚和家居装饰搜索往往有不同的风格品味。现有的一些工作采用用户过去的接触来推断风格偏好。这些方法不能帮助用户发现新样式。其他工作需要用户提供文本或视觉信号来描述他们的风格偏好,但用户往往不熟悉风格术语,并没有正确的图像开始。在这篇文章中,我们提出了一个强化学习(RL)方法来帮助用户探索和开发样式空间,而不需要额外的用户输入。实验结果表明,该方法提高了 Pinterest 时装和家居装饰搜索的成功率34.8% 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Personalized+style+recommendation+via+reinforcement+learning)|0| -|[HierCat: Hierarchical Query Categorization from Weakly Supervised Data at Facebook Marketplace](https://doi.org/10.1145/3543873.3584622)|Yunzhong He, Cong Zhang, Ruoyan Kong, Chaitanya Kulkarni, Qing Liu, Ashish Gandhe, Amit Nithianandan, Arul Prakash|Meta, USA; University of Minnesota Twin Cities, USA|Query categorization at customer-to-customer e-commerce platforms like Facebook Marketplace is challenging due to the vagueness of search intent, noise in real-world data, and imbalanced training data across languages. Its deployment also needs to consider challenges in scalability and downstream integration in order to translate modeling advances into better search result relevance. In this paper we present HierCat, the query categorization system at Facebook Marketplace. HierCat addresses these challenges by leveraging multi-task pre-training of dual-encoder architectures with a hierarchical inference step to effectively learn from weakly supervised training data mined from searcher engagement. 
We show that HierCat not only outperforms popular methods in offline experiments, but also leads to 1.4% improvement in NDCG and 4.3% increase in searcher engagement at Facebook Marketplace Search in online A/B testing.|像 Facebook Marketplace 这样的客户对客户的电子商务平台,由于搜索意图的模糊性、现实世界数据中的噪音以及跨语言的不平衡训练数据,查询分类是一个挑战。它的部署还需要考虑可伸缩性和下游集成方面的挑战,以便将建模进展转化为更好的搜索结果相关性。本文介绍了 Facebook Marketplace 的查询分类系统 HierCat。HierCat 通过利用双重编码器架构的多任务预训练和分层推理步骤来解决这些挑战,以有效地学习从搜索引擎参与中挖掘的弱监督训练数据。我们发现 HierCat 不仅在离线实验中表现优于流行的方法,而且在线 A/B 测试中导致 NDCG 改善1.4% ,Facebook Marketplace Search 的搜索者参与度提高4.3% 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=HierCat:+Hierarchical+Query+Categorization+from+Weakly+Supervised+Data+at+Facebook+Marketplace)|0| -|[Search-based Recommendation: the Case for Difficult Predictions](https://doi.org/10.1145/3543873.3587374)|Ghazaleh Haratinezhad Torbati, Gerhard Weikum, Andrew Yates|University of Amsterdam, Netherlands; Max Planck Institute for Informatics, Germany|Recommender systems have achieved impressive results on benchmark datasets. However, the numbers are often influenced by assumptions made on the data and evaluation mode. This work questions and revises these assumptions, to study and improve the quality, particularly for the difficult case of search-based recommendations. Users start with a personally liked item as a query and look for similar items that match their tastes. User satisfaction requires discovering truly unknown items: new authors of books rather than merely more books of known writers. We propose a unified system architecture that combines interaction-based and content-based signals and leverages language models for Transformer-powered predictions. We present new techniques for selecting negative training samples, and investigate their performance in the underexplored search-based evaluation mode.|推荐系统在基准数据集上取得了令人印象深刻的成果。然而,这些数字往往受到对数据和评估模式的假设的影响。本文对这些假设进行了质疑和修正,以研究和提高质量,特别是针对困难案例的基于搜索的推荐。用户从一个个人喜欢的项目开始查询,然后寻找与他们口味相符的类似项目。用户满意度要求发现真正未知的项目: 书籍的新作者,而不仅仅是知名作家的书籍。我们提出了一个统一的系统体系结构,它结合了基于交互和基于内容的信号,并利用语言模型进行基于 Transformer 的预测。我们提出了选择负训练样本的新技术,并研究了它们在基于搜索的评估模式中的表现。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Search-based+Recommendation:+the+Case+for+Difficult+Predictions)|0| +|[HierCat: Hierarchical Query Categorization from Weakly Supervised Data at Facebook Marketplace](https://doi.org/10.1145/3543873.3584622)|Yunzhong He, Cong Zhang, Ruoyan Kong, Chaitanya Kulkarni, Qing Liu, Ashish Gandhe, Amit Nithianandan, Arul Prakash|University of Minnesota Twin Cities, USA; Meta, USA|Query categorization at customer-to-customer e-commerce platforms like Facebook Marketplace is challenging due to the vagueness of search intent, noise in real-world data, and imbalanced training data across languages. Its deployment also needs to consider challenges in scalability and downstream integration in order to translate modeling advances into better search result relevance. In this paper we present HierCat, the query categorization system at Facebook Marketplace. HierCat addresses these challenges by leveraging multi-task pre-training of dual-encoder architectures with a hierarchical inference step to effectively learn from weakly supervised training data mined from searcher engagement. 
We show that HierCat not only outperforms popular methods in offline experiments, but also leads to 1.4% improvement in NDCG and 4.3% increase in searcher engagement at Facebook Marketplace Search in online A/B testing.|像 Facebook Marketplace 这样的客户对客户的电子商务平台,由于搜索意图的模糊性、现实世界数据中的噪音以及跨语言的不平衡训练数据,查询分类是一个挑战。它的部署还需要考虑可伸缩性和下游集成方面的挑战,以便将建模进展转化为更好的搜索结果相关性。本文介绍了 Facebook Marketplace 的查询分类系统 HierCat。HierCat 通过利用双重编码器架构的多任务预训练和分层推理步骤来解决这些挑战,以有效地学习从搜索引擎参与中挖掘的弱监督训练数据。我们发现 HierCat 不仅在离线实验中表现优于流行的方法,而且在线 A/B 测试中导致 NDCG 改善1.4% ,Facebook Marketplace Search 的搜索者参与度提高4.3% 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=HierCat:+Hierarchical+Query+Categorization+from+Weakly+Supervised+Data+at+Facebook+Marketplace)|0| +|[Search-based Recommendation: the Case for Difficult Predictions](https://doi.org/10.1145/3543873.3587374)|Ghazaleh Haratinezhad Torbati, Gerhard Weikum, Andrew Yates|Max Planck Institute for Informatics, Germany; University of Amsterdam, Netherlands|Recommender systems have achieved impressive results on benchmark datasets. However, the numbers are often influenced by assumptions made on the data and evaluation mode. This work questions and revises these assumptions, to study and improve the quality, particularly for the difficult case of search-based recommendations. Users start with a personally liked item as a query and look for similar items that match their tastes. User satisfaction requires discovering truly unknown items: new authors of books rather than merely more books of known writers. We propose a unified system architecture that combines interaction-based and content-based signals and leverages language models for Transformer-powered predictions. We present new techniques for selecting negative training samples, and investigate their performance in the underexplored search-based evaluation mode.|推荐系统在基准数据集上取得了令人印象深刻的成果。然而,这些数字往往受到对数据和评估模式的假设的影响。本文对这些假设进行了质疑和修正,以研究和提高质量,特别是针对困难案例的基于搜索的推荐。用户从一个个人喜欢的项目开始查询,然后寻找与他们口味相符的类似项目。用户满意度要求发现真正未知的项目: 书籍的新作者,而不仅仅是知名作家的书籍。我们提出了一个统一的系统体系结构,它结合了基于交互和基于内容的信号,并利用语言模型进行基于 Transformer 的预测。我们提出了选择负训练样本的新技术,并研究了它们在基于搜索的评估模式中的表现。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Search-based+Recommendation:+the+Case+for+Difficult+Predictions)|0| |[Reweighting Clicks with Dwell Time in Recommendation](https://doi.org/10.1145/3543873.3584624)|Ruobing Xie, Lin Ma, Shaoliang Zhang, Feng Xia, Leyu Lin|WeChat, Tencent, China|The click behavior is the most widely-used user positive feedback in recommendation. However, simply considering each click equally in training may suffer from clickbaits and title-content mismatching, and thus fail to precisely capture users' real satisfaction on items. Dwell time could be viewed as a high-quality quantitative indicator of user preferences on each click, while existing recommendation models do not fully explore the modeling of dwell time. In this work, we focus on reweighting clicks with dwell time in recommendation. Precisely, we first define a new behavior named valid read, which helps to select high-quality click instances for different users and items via dwell time. Next, we propose a normalized dwell time function to reweight click signals in training for recommendation. 
The Click reweighting model achieves significant improvements on both offline and online evaluations in real-world systems.|点击行为是推荐中使用最广泛的用户正面反馈。然而,在培训中仅仅考虑每一次点击的平等性可能会遭受点击诱惑和标题内容不匹配的问题,因此不能准确地捕捉用户对项目的真正满意度。停留时间可以被视为每次点击时用户偏好的高质量定量指标,而现有的推荐模型并没有充分探索停留时间的建模。在这项工作中,我们将重点放在用推荐中的停留时间重新加权点击。确切地说,我们首先定义一个名为有效读的新行为,它有助于通过停留时间为不同的用户和项目选择高质量的单击实例。接下来,我们提出了一个规范化的停留时间函数来重新加权点击信号的训练推荐。Click 重新加权模型在现实世界系统的离线和在线评估方面都取得了显著的改进。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Reweighting+Clicks+with+Dwell+Time+in+Recommendation)|0| -|[Disentangled Causal Embedding With Contrastive Learning For Recommender System](https://doi.org/10.1145/3543873.3584637)|Weiqi Zhao, Dian Tang, Xin Chen, Dawei Lv, Daoli Ou, Biao Li, Peng Jiang, Kun Gai|Unaffiliated, China; Kuaishou Technology, China|Recommender systems usually rely on observed user interaction data to build personalized recommendation models, assuming that the observed data reflect user interest. However, user interacting with an item may also due to conformity, the need to follow popular items. Most previous studies neglect user's conformity and entangle interest with it, which may cause the recommender systems fail to provide satisfying results. Therefore, from the cause-effect view, disentangling these interaction causes is a crucial issue. It also contributes to OOD problems, where training and test data are out-of-distribution. Nevertheless, it is quite challenging as we lack the signal to differentiate interest and conformity. The data sparsity of pure cause and the items' long-tail problem hinder disentangled causal embedding. In this paper, we propose DCCL, a framework that adopts contrastive learning to disentangle these two causes by sample augmentation for interest and conformity respectively. Futhermore, DCCL is model-agnostic, which can be easily deployed in any industrial online system. Extensive experiments are conducted over two real-world datasets and DCCL outperforms state-of-the-art baselines on top of various backbone models in various OOD environments. We also demonstrate the performance improvements by online A/B testing on Kuaishou, a billion-user scale short-video recommender system.|推荐系统通常依赖于观察到的用户交互数据来建立个性化的推荐模型,假设观察到的数据反映了用户的兴趣。然而,用户与一个项目的互动也可能是由于一致性,需要遵循流行的项目。以往的大多数研究忽视了用户的一致性,并与之产生利益纠葛,这可能导致推荐系统不能提供令人满意的结果。因此,从因果观点来看,解开这些相互作用的原因是一个至关重要的问题。它还会导致面向对象设计(OOD)问题,即培训和测试数据分布不均。然而,这是相当具有挑战性的,因为我们缺乏区分兴趣和一致性的信号。纯因果关系的数据稀疏性和项目的长尾问题阻碍了因果关系的解纠缠嵌入。在本文中,我们提出了 DCCL,一个采用对比学习的框架,分别通过兴趣和从众的样本增加来解决这两个原因。此外,DCCL 是模型无关的,可以很容易地部署在任何工业在线系统。在两个真实世界的数据集上进行了广泛的实验,DCCL 在各种面向对象设计(OOD)环境中的各种骨干模型上的表现优于最先进的基线。我们还通过在 Kuaishou 的在线 A/B 测试展示了性能改进,这是一个拥有10亿用户规模的短视频推荐系统。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Disentangled+Causal+Embedding+With+Contrastive+Learning+For+Recommender+System)|0| +|[Disentangled Causal Embedding With Contrastive Learning For Recommender System](https://doi.org/10.1145/3543873.3584637)|Weiqi Zhao, Dian Tang, Xin Chen, Dawei Lv, Daoli Ou, Biao Li, Peng Jiang, Kun Gai|Kuaishou Technology, China; Unaffiliated, China|Recommender systems usually rely on observed user interaction data to build personalized recommendation models, assuming that the observed data reflect user interest. However, user interacting with an item may also due to conformity, the need to follow popular items. Most previous studies neglect user's conformity and entangle interest with it, which may cause the recommender systems fail to provide satisfying results. 
Therefore, from the cause-effect view, disentangling these interaction causes is a crucial issue. It also contributes to OOD problems, where training and test data are out-of-distribution. Nevertheless, it is quite challenging as we lack the signal to differentiate interest and conformity. The data sparsity of pure cause and the items' long-tail problem hinder disentangled causal embedding. In this paper, we propose DCCL, a framework that adopts contrastive learning to disentangle these two causes by sample augmentation for interest and conformity respectively. Furthermore, DCCL is model-agnostic, which can be easily deployed in any industrial online system. Extensive experiments are conducted over two real-world datasets and DCCL outperforms state-of-the-art baselines on top of various backbone models in various OOD environments. We also demonstrate the performance improvements by online A/B testing on Kuaishou, a billion-user scale short-video recommender system.|推荐系统通常依赖于观察到的用户交互数据来建立个性化的推荐模型,假设观察到的数据反映了用户的兴趣。然而,用户与一个项目的互动也可能是由于一致性,需要遵循流行的项目。以往的大多数研究忽视了用户的一致性,并与之产生利益纠葛,这可能导致推荐系统不能提供令人满意的结果。因此,从因果观点来看,解开这些相互作用的原因是一个至关重要的问题。它还会导致面向对象设计(OOD)问题,即培训和测试数据分布不均。然而,这是相当具有挑战性的,因为我们缺乏区分兴趣和一致性的信号。纯因果关系的数据稀疏性和项目的长尾问题阻碍了因果关系的解纠缠嵌入。在本文中,我们提出了 DCCL,一个采用对比学习的框架,分别通过兴趣和从众的样本增加来解决这两个原因。此外,DCCL 是模型无关的,可以很容易地部署在任何工业在线系统。在两个真实世界的数据集上进行了广泛的实验,DCCL 在各种面向对象设计(OOD)环境中的各种骨干模型上的表现优于最先进的基线。我们还通过在 Kuaishou 的在线 A/B 测试展示了性能改进,这是一个拥有10亿用户规模的短视频推荐系统。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Disentangled+Causal+Embedding+With+Contrastive+Learning+For+Recommender+System)|0| |[Confidence Ranking for CTR Prediction](https://doi.org/10.1145/3543873.3584643)|Jian Zhu, Congcong Liu, Pei Wang, Xiwei Zhao, Zhangang Lin, Jingping Shao|JD.com, China|Model evolution and data updating are two common phenomena in large-scale real-world machine learning applications, e.g. ads and recommendation systems. To adapt, the real-world system typically retrains with all available data and learns online with recently available data to update the models periodically with the goal of better serving performance. In this paper, we propose a novel framework, named Confidence Ranking, which designs the optimization objective as a ranking function with two different models. Our confidence ranking loss allows direct optimization of the logits output for different convex surrogate functions of metrics, e.g. AUC and Accuracy depending on the target task and dataset. Armed with our proposed methods, our experiments show that the introduction of confidence ranking loss can outperform all baselines on the CTR prediction tasks of public and industrial datasets. This framework has been deployed in the advertisement system of JD.com to serve the main traffic in the fine-rank stage.|模型演化和数据更新是广告和推荐系统等大规模真实世界机器学习应用中的两种常见现象。为了适应这种情况,现实世界中的系统通常使用所有可用的数据进行再培训,并使用最近可用的数据进行在线学习,以便定期更新模型,从而更好地服务于性能。在本文中,我们提出了一个新的框架,称为置信排序,设计的优化目标为一个排序函数与两个不同的模型。我们的置信度排序损失允许直接优化不同凸性度量替代函数的 logit 输出,例如 AUC 和精度取决于目标任务和数据集。实验结果表明,在公共数据集和工业数据集的 CTR 预测任务中,置信度排序损失的引入能够优于所有基线。该框架已经部署在京东的广告系统中,服务于精品阶段的主流流量。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Confidence+Ranking+for+CTR+Prediction)|0| -|[Personalised Search in E-Comm Groceries](https://doi.org/10.1145/3543873.3587588)|Ramprabhu Murugesan, Anuja Sharan|Walmart labs, India; Walmart Labs, India|Personalized Search(henceforth called P10d Search) focuses to deliver user-specific search results based on the previous purchases. Search engine retrieves the result based on the defined relevancy algorithm.
When a user searches a keyword, search engine constructs the search query based on the defined searchable fields/attributes along with configured relevancy algorithm. Position of the item retrieved in search results is determined by the search algorithm based on the search term. The results are further refined or ranked based on different click stream signals, product features, market data to provide much relevant results. Personalisation provides the ranked the list of items for a given user based on past purchases. Personalisation is agnostic of search query and takes user id, cart additions, site taxonomy and user’s shopping history as input signals. In summary, search engine queries data based on relevancy and personalisation engine retrieves based purely on purchases. Goal of personalised search is to enhance the search results by adding personalised results without affecting the search relevance.|个性化检索(以下简称 P10d 搜索)的重点是提供基于以前购买的特定用户的搜索结果。搜索引擎根据定义的相关算法检索结果。当用户搜索关键字时,搜索引擎根据定义的可搜索字段/属性以及配置的相关性算法构造搜索查询。在搜索结果中检索到的项的位置由基于搜索项的搜索算法确定。根据不同的点击流信号、产品特点、市场数据对结果进行进一步细化或排序,以提供更多相关的结果。个性化为给定用户提供了基于过去购买的商品的排名列表。个性化是不可知的搜索查询,并采取用户 ID,购物车添加,网站分类和用户的购物历史作为输入信号。总之,搜索引擎查询数据的相关性和个性化引擎检索纯粹基于购买。个性化搜索的目标是在不影响搜索相关性的情况下,通过添加个性化搜索结果来提高搜索结果。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Personalised+Search+in+E-Comm+Groceries)|0| +|[Personalised Search in E-Comm Groceries](https://doi.org/10.1145/3543873.3587588)|Ramprabhu Murugesan, Anuja Sharan|Walmart Labs, India; Walmart labs, India|Personalized Search (henceforth called P10d Search) focuses on delivering user-specific search results based on previous purchases. The search engine retrieves results based on the defined relevancy algorithm. When a user searches a keyword, the search engine constructs the search query based on the defined searchable fields/attributes along with the configured relevancy algorithm. The position of an item retrieved in the search results is determined by the search algorithm based on the search term. The results are further refined or ranked based on different click stream signals, product features, and market data to provide more relevant results. Personalisation provides a ranked list of items for a given user based on past purchases. Personalisation is agnostic of the search query and takes user id, cart additions, site taxonomy and the user’s shopping history as input signals. In summary, the search engine queries data based on relevancy, while the personalisation engine retrieves purely based on purchases. The goal of personalised search is to enhance the search results by adding personalised results without affecting the search relevance.|个性化检索(以下简称 P10d 搜索)的重点是提供基于以前购买的特定用户的搜索结果。搜索引擎根据定义的相关算法检索结果。当用户搜索关键字时,搜索引擎根据定义的可搜索字段/属性以及配置的相关性算法构造搜索查询。在搜索结果中检索到的项的位置由基于搜索项的搜索算法确定。根据不同的点击流信号、产品特点、市场数据对结果进行进一步细化或排序,以提供更多相关的结果。个性化为给定用户提供了基于过去购买的商品的排名列表。个性化是不可知的搜索查询,并采取用户 ID,购物车添加,网站分类和用户的购物历史作为输入信号。总之,搜索引擎查询数据的相关性和个性化引擎检索纯粹基于购买。个性化搜索的目标是在不影响搜索相关性的情况下,通过添加个性化搜索结果来提高搜索结果。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Personalised+Search+in+E-Comm+Groceries)|0| |[Graph Embedding for Mapping Interdisciplinary Research Networks](https://doi.org/10.1145/3543873.3587570)|Eoghan Cunningham, Derek Greene|University College Dublin, Ireland|Representation learning is the first step in automating tasks such as research paper recommendation, classification, and retrieval.
Due to the accelerating rate of research publication, together with the recognised benefits of interdisciplinary research, systems that facilitate researchers in discovering and understanding relevant works from beyond their immediate school of knowledge are vital. This work explores different methods of research paper representation (or document embedding), to identify those methods that are capable of preserving the interdisciplinary implications of research papers in their embeddings. In addition to evaluating state of the art methods of document embedding in a interdisciplinary citation prediction task, we propose a novel Graph Neural Network architecture designed to preserve the key interdisciplinary implications of research articles in citation network node embeddings. Our proposed method outperforms other GNN-based methods in interdisciplinary citation prediction, without compromising overall citation prediction performance.|表示学习是研究论文推荐、分类和检索等任务自动化的第一步。由于研究发表的速度加快,再加上科际整合的公认好处,有助研究人员发现和理解其直接学校知识以外的相关著作的系统是至关重要的。本文探讨了研究论文表示(或文档嵌入)的不同方法,以确定哪些方法能够在其嵌入过程中保持研究论文的跨学科含义。除了评估在跨学科引文预测任务中嵌入文档的最新方法之外,我们还提出了一种新的图形神经网络体系结构,旨在保存引文网络节点嵌入中研究论文的关键跨学科含义。我们提出的方法在跨学科引文预测方面优于其他基于 GNN 的方法,而不影响整体的引文预测性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Graph+Embedding+for+Mapping+Interdisciplinary+Research+Networks)|0| |[Deep Passage Retrieval in E-Commerce](https://doi.org/10.1145/3543873.3587624)|Vinay Rao Dandin, Ozan Ersoy, Kyung Hyuk Kim|Flipkart US R&D Center, USA|We have developed a conversational assistant called the Decision Assistant (DA) to help customers make purchase decisions. To answer customer queries successfully, we use a question and answering (QnA) system that retrieves data on product pages and extracts answers. With various data sources available on the product pages, we deal with unique challenges such as different terminologies and data formats for successful answer retrieval. In this paper, we propose two different bi-encoder architectures for retrieving data from each of the two data sources considered – product descriptions and specifications. The proposed architectures beat the baseline approaches while maintaining a high recall and low latency in production. We envision that the proposed approaches can be widely applicable to other e-commerce QnA systems.|我们已经开发了一个称为决策助理(DA)的会话助理来帮助客户做出购买决策。为了成功地回答客户的查询,我们使用一个问答(QnA)系统来检索产品页面上的数据并提取答案。随着各种数据源可在产品页面,我们处理独特的挑战,如不同的术语和数据格式,以成功的答案检索。在本文中,我们提出了两种不同的双编码器体系结构来检索数据从每个两个数据源考虑-产品描述和规格。所提出的体系结构打破了基线方法,同时在生产中保持了较高的召回率和较低的延迟。我们设想所提出的方法可以广泛应用于其他电子商务 QnA 系统。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Deep+Passage+Retrieval+in+E-Commerce)|0| |[Quantize Sequential Recommenders Without Private Data](https://doi.org/10.1145/3543507.3583351)|Lingfeng Shi, Yuang Liu, Jun Wang, Wei Zhang|East China Normal University, China|Deep neural networks have achieved great success in sequential recommendation systems. While maintaining high competence in user modeling and next-item recommendation, these models have long been plagued by the numerous parameters and computation, which inhibit them to be deployed on resource-constrained mobile devices. Model quantization, as one of the main paradigms for compression techniques, converts float parameters to low-bit values to reduce parameter redundancy and accelerate inference. To avoid drastic performance degradation, it usually requests a fine-tuning phase with an original dataset. 
However, the training set of user-item interactions is not always available due to transmission limits or privacy concerns. In this paper, we propose a novel framework to quantize sequential recommenders without access to any real private data. A generator is employed in the framework to synthesize fake sequence samples to feed the quantized sequential recommendation model and minimize the gap with a full-precision sequential recommendation model. The generator and the quantized model are optimized with a min-max game — alternating discrepancy estimation and knowledge transfer. Moreover, we devise a two-level discrepancy modeling strategy to transfer information between the quantized model and the full-precision model. The extensive experiments of various recommendation networks on three public datasets demonstrate the effectiveness of the proposed framework.|深层神经网络在序贯推荐系统中取得了巨大的成功。尽管这些模型在用户建模和下一个项目推荐方面保持了很高的能力,但长期以来,这些模型一直受到众多参数和计算的困扰,这些参数和计算阻碍了它们被部署到资源受限的移动设备上。模型量化作为压缩技术的主要范式之一,将浮点参数转换为低位值,以减少参数冗余,加速推理。为了避免严重的性能下降,它通常要求对原始数据集进行微调。然而,由于传输限制或隐私问题,用户项交互的训练集并不总是可用的。在本文中,我们提出了一个新的框架,量化顺序推荐没有访问任何真正的私有数据。该框架采用生成器对伪序列样本进行合成,以满足量化序列推荐模型的要求,同时采用全精度序列推荐模型使推荐间隔最小化。利用最小-最大对策-交替差异估计和知识转移对生成器和量化模型进行优化。此外,我们还设计了一个两层差异建模策略来传递量化模型和全精度模型之间的信息。在三个公共数据集上对各种推荐网络进行了广泛的实验,证明了该框架的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Quantize+Sequential+Recommenders+Without+Private+Data)|0| |[Adap-τ : Adaptively Modulating Embedding Magnitude for Recommendation](https://doi.org/10.1145/3543507.3583363)|Jiawei Chen, Junkang Wu, Jiancan Wu, Xuezhi Cao, Sheng Zhou, Xiangnan He||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Adap-τ+:+Adaptively+Modulating+Embedding+Magnitude+for+Recommendation)|0| -|[Clustered Embedding Learning for Recommender Systems](https://doi.org/10.1145/3543507.3583362)|Yizhou Chen, Guangda Huzhang, Anxiang Zeng, Qingtao Yu, Hui Sun, HengYi Li, Jingyi Li, Yabo Ni, Han Yu, Zhiming Zhou|SCSE, Nanyang Technological University, Singapore; Shanghai University of Finance and Economics, China; Shopee Pte Ltd., Singapore|In recent years, recommender systems have advanced rapidly, where embedding learning for users and items plays a critical role. A standard method learns a unique embedding vector for each user and item. However, such a method has two important limitations in real-world applications: 1) it is hard to learn embeddings that generalize well for users and items with rare interactions on their own; and 2) it may incur unbearably high memory costs when the number of users and items scales up. Existing approaches either can only address one of the limitations or have flawed overall performances. In this paper, we propose Clustered Embedding Learning (CEL) as an integrated solution to these two problems. CEL is a plug-and-play embedding learning framework that can be combined with any differentiable feature interaction model. It is capable of achieving improved performance, especially for cold users and items, with reduced memory cost. CEL enables automatic and dynamic clustering of users and items in a top-down fashion, where clustered entities jointly learn a shared embedding. The accelerated version of CEL has an optimal time complexity, which supports efficient online updates. Theoretically, we prove the identifiability and the existence of a unique optimal number of clusters for CEL in the context of nonnegative matrix factorization. 
Empirically, we validate the effectiveness of CEL on three public datasets and one business dataset, showing its consistently superior performance against current state-of-the-art methods. In particular, when incorporating CEL into the business model, it brings an improvement of $+0.6\%$ in AUC, which translates into a significant revenue gain; meanwhile, the size of the embedding table gets $2650$ times smaller.|近年来,推荐系统发展迅速,其中用户和项目的嵌入式学习起着至关重要的作用。标准方法为每个用户和项学习唯一的嵌入向量。然而,这种方法在实际应用中有两个重要的局限性: 1)很难学习嵌入式技术,这种技术可以很好地适用于用户和具有罕见交互的项目; 2)当用户和项目的数量增加时,它可能会产生难以忍受的高内存成本。现有的方法要么只能解决其中的一个限制,要么具有有缺陷的整体性能。在本文中,我们提出了集群嵌入式学习(CEL)作为这两个问题的综合解决方案。CEL 是一个即插即用的嵌入式学习框架,可以与任何可微的特征交互模型相结合。它能够提高性能,特别是对于冷用户和项目,同时降低内存成本。CEL 以自顶向下的方式支持用户和项目的自动和动态集群,集群实体可以在这种方式下联合学习共享嵌入。CEL 的加速版本具有最佳的时间复杂度,支持高效的在线更新。理论上,我们证明了在非负矩阵分解的情况下,CEL 的可识别性和唯一最优簇数的存在性。通过实验,我们验证了 CEL 在三个公共数据集和一个业务数据集上的有效性,显示了与当前最先进的方法相比,CEL 始终具有优越的性能。特别是,当将 CEL 融入到商业模式中时,它在 AUC 中带来了 $+ 0.6% 的改进,这意味着显著的收入增长; 与此同时,嵌入表的大小变小了2650美元。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Clustered+Embedding+Learning+for+Recommender+Systems)|0| -|[MMMLP: Multi-modal Multilayer Perceptron for Sequential Recommendations](https://doi.org/10.1145/3543507.3583378)|Jiahao Liang, Xiangyu Zhao, Muyang Li, Zijian Zhang, Wanyu Wang, Haochen Liu, Zitao Liu|Jilin University, China and City University of Hong Kong, Hong Kong; Michigan State University, USA; University of Sydney, Australia; Guangdong Institute of Smart Education, Jinan University, China; City University of Hong Kong, Hong Kong|Sequential recommendation aims to offer potentially interesting products to users by capturing their historical sequence of interacted items. Although it has facilitated extensive physical scenarios, sequential recommendation for multi-modal sequences has long been neglected. Multi-modal data that depicts a user’s historical interactions exists ubiquitously, such as product pictures, textual descriptions, and interacted item sequences, providing semantic information from multiple perspectives that comprehensively describe a user’s preferences. However, existing sequential recommendation methods either fail to directly handle multi-modality or suffer from high computational complexity. To address this, we propose a novel Multi-Modal Multi-Layer Perceptron (MMMLP) for maintaining multi-modal sequences for sequential recommendation. MMMLP is a purely MLP-based architecture that consists of three modules - the Feature Mixer Layer, Fusion Mixer Layer, and Prediction Layer - and has an edge on both efficacy and efficiency. Extensive experiments show that MMMLP achieves state-of-the-art performance with linear complexity. We also conduct ablating analysis to verify the contribution of each component. Furthermore, compatible experiments are devised, and the results show that the multi-modal representation learned by our proposed model generally benefits other recommendation models, emphasizing our model’s ability to handle multi-modal information. 
We have made our code available online to ease reproducibility1.|顺序推荐旨在通过获取用户交互项的历史顺序,为用户提供潜在有趣的产品。虽然它促进了广泛的物理场景,多模态序列的顺序推荐长期以来被忽视。描述用户历史交互的多模态数据无处不在,比如产品图片、文本描述和交互式项目序列,从多个角度提供语义信息,全面描述用户的偏好。然而,现有的顺序推荐方法要么不能直接处理多模态问题,要么计算复杂度较高。为了解决这个问题,我们提出了一种新的多模态多层感知器(MMMLP)来维护多模态序列的顺序推荐。MMLP 是一个纯粹基于 MLP 的架构,它由三个模块组成——特征混合层、融合混合层和预测层——并且在功效和效率方面都有优势。大量的实验表明,MMLP 在线性复杂度方面达到了最先进的性能。我们还进行了烧蚀分析,以验证每个组分的贡献。此外,设计了相容性实验,结果表明,我们提出的模型学习的多模态表示一般有利于其他推荐模型,强调我们的模型的能力,处理多模态信息。我们已经在网上提供了我们的代码,以便于重现。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=MMMLP:+Multi-modal+Multilayer+Perceptron+for+Sequential+Recommendations)|0| -|[AutoMLP: Automated MLP for Sequential Recommendations](https://doi.org/10.1145/3543507.3583440)|Muyang Li, Zijian Zhang, Xiangyu Zhao, Wanyu Wang, Minghao Zhao, Runze Wu, Ruocheng Guo|City University of Hong Kong, Hong Kong and University of Sydney, Australia; Bytedance AI Lab UK, United Kingdom; Fuxi AI Lab, NetEase, China; City University of Hong Kong, Hong Kong and Jilin University, China; City University of Hong Kong, Hong Kong|Sequential recommender systems aim to predict users' next interested item given their historical interactions. However, a long-standing issue is how to distinguish between users' long/short-term interests, which may be heterogeneous and contribute differently to the next recommendation. Existing approaches usually set pre-defined short-term interest length by exhaustive search or empirical experience, which is either highly inefficient or yields subpar results. The recent advanced transformer-based models can achieve state-of-the-art performances despite the aforementioned issue, but they have a quadratic computational complexity to the length of the input sequence. To this end, this paper proposes a novel sequential recommender system, AutoMLP, aiming for better modeling users' long/short-term interests from their historical interactions. In addition, we design an automated and adaptive search algorithm for preferable short-term interest length via end-to-end optimization. Through extensive experiments, we show that AutoMLP has competitive performance against state-of-the-art methods, while maintaining linear computational complexity.|顺序推荐系统的目的是预测用户的下一个感兴趣的项目给予他们的历史交互。然而,一个长期存在的问题是如何区分用户的长期和短期利益,这可能是不同的,并作出不同的贡献下一个建议。现有方法通常通过穷举搜索或实证经验来设定预先确定的短期利率长度,这种方法要么效率极低,要么效果不佳。尽管存在上述问题,最近的先进的基于变压器的模型能够实现最先进的性能,但是它们对于输入序列的长度具有二次计算复杂度。为此,本文提出了一种新的顺序推荐系统—— AutoMLP,旨在从用户的历史交互中更好地建立用户的长期/短期兴趣模型。此外,我们设计了一个自动化和自适应的搜索算法,通过端到端优化较好的短期兴趣长度。通过大量的实验,我们发现 AutoMLP 在保持线性计算复杂度的同时,具有与最先进的方法相竞争的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=AutoMLP:+Automated+MLP+for+Sequential+Recommendations)|0| -|[NASRec: Weight Sharing Neural Architecture Search for Recommender Systems](https://doi.org/10.1145/3543507.3583446)|Tunhou Zhang, Dehua Cheng, Yuchen He, Zhengxing Chen, Xiaoliang Dai, Liang Xiong, Feng Yan, Hai Li, Yiran Chen, Wei Wen|University of Houston, USA; Meta AI, USA; Duke University, USA|The rise of deep neural networks offers new opportunities in optimizing recommender systems. However, optimizing recommender systems using deep neural networks requires delicate architecture fabrication. We propose NASRec, a paradigm that trains a single supernet and efficiently produces abundant models/sub-architectures by weight sharing. To overcome the data multi-modality and architecture heterogeneity challenges in the recommendation domain, NASRec establishes a large supernet (i.e., search space) to search the full architectures. 
The supernet incorporates versatile choice of operators and dense connectivity to minimize human efforts for finding priors. The scale and heterogeneity in NASRec impose several challenges, such as training inefficiency, operator-imbalance, and degraded rank correlation. We tackle these challenges by proposing single-operator any-connection sampling, operator-balancing interaction modules, and post-training fine-tuning. Our crafted models, NASRecNet, show promising results on three Click-Through Rates (CTR) prediction benchmarks, indicating that NASRec outperforms both manually designed models and existing NAS methods with state-of-the-art performance. Our work is publicly available at https://github.com/facebookresearch/NasRec.|深层神经网络的兴起为优化推荐系统提供了新的机会。然而,使用深层神经网络优化推荐系统需要精细的架构制作。我们提出 NASRec,一个训练单个超级网络并通过权重分享有效地产生丰富的模型/子架构的范例。为了克服推荐域中的数据多态性和体系结构异构性挑战,NASRec 建立了一个大型超网(即搜索空间)来搜索完整的体系结构。超级网结合了多种操作员的选择和密集的连接,以最大限度地减少人的努力,找到前科。NASRec 的规模和异质性带来了一些挑战,如培训效率低下、操作员失衡和等级相关性降低。我们通过提出单操作者任意连接采样、操作者平衡交互模块和训练后微调来应对这些挑战。我们精心设计的模型 NASRecNet 在三个点击率(Click-Through Rate,CTR)预测基准上显示出有希望的结果,表明 NASRecc 的性能优于手工设计的模型和现有的 NAS 方法,具有最先进的性能。我们的工作 https://github.com/facebookresearch/nasrec 公开。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=NASRec:+Weight+Sharing+Neural+Architecture+Search+for+Recommender+Systems)|0| +|[Clustered Embedding Learning for Recommender Systems](https://doi.org/10.1145/3543507.3583362)|Yizhou Chen, Guangda Huzhang, Anxiang Zeng, Qingtao Yu, Hui Sun, HengYi Li, Jingyi Li, Yabo Ni, Han Yu, Zhiming Zhou|Shanghai University of Finance and Economics, China; Shopee Pte Ltd., Singapore; SCSE, Nanyang Technological University, Singapore|In recent years, recommender systems have advanced rapidly, where embedding learning for users and items plays a critical role. A standard method learns a unique embedding vector for each user and item. However, such a method has two important limitations in real-world applications: 1) it is hard to learn embeddings that generalize well for users and items with rare interactions on their own; and 2) it may incur unbearably high memory costs when the number of users and items scales up. Existing approaches either can only address one of the limitations or have flawed overall performances. In this paper, we propose Clustered Embedding Learning (CEL) as an integrated solution to these two problems. CEL is a plug-and-play embedding learning framework that can be combined with any differentiable feature interaction model. It is capable of achieving improved performance, especially for cold users and items, with reduced memory cost. CEL enables automatic and dynamic clustering of users and items in a top-down fashion, where clustered entities jointly learn a shared embedding. The accelerated version of CEL has an optimal time complexity, which supports efficient online updates. Theoretically, we prove the identifiability and the existence of a unique optimal number of clusters for CEL in the context of nonnegative matrix factorization. Empirically, we validate the effectiveness of CEL on three public datasets and one business dataset, showing its consistently superior performance against current state-of-the-art methods. 
In particular, when incorporating CEL into the business model, it brings an improvement of $+0.6\%$ in AUC, which translates into a significant revenue gain; meanwhile, the size of the embedding table gets $2650$ times smaller.|近年来,推荐系统发展迅速,其中用户和项目的嵌入式学习起着至关重要的作用。标准方法为每个用户和项学习唯一的嵌入向量。然而,这种方法在实际应用中有两个重要的局限性: 1)很难学习嵌入式技术,这种技术可以很好地适用于用户和具有罕见交互的项目; 2)当用户和项目的数量增加时,它可能会产生难以忍受的高内存成本。现有的方法要么只能解决其中的一个限制,要么具有有缺陷的整体性能。在本文中,我们提出了集群嵌入式学习(CEL)作为这两个问题的综合解决方案。CEL 是一个即插即用的嵌入式学习框架,可以与任何可微的特征交互模型相结合。它能够提高性能,特别是对于冷用户和项目,同时降低内存成本。CEL 以自顶向下的方式支持用户和项目的自动和动态集群,集群实体可以在这种方式下联合学习共享嵌入。CEL 的加速版本具有最佳的时间复杂度,支持高效的在线更新。理论上,我们证明了在非负矩阵分解的情况下,CEL 的可识别性和唯一最优簇数的存在性。通过实验,我们验证了 CEL 在三个公共数据集和一个业务数据集上的有效性,显示了与当前最先进的方法相比,CEL 始终具有优越的性能。特别是,当将 CEL 融入到商业模式中时,它在 AUC 中带来了 $+ 0.6% 的改进,这意味着显著的收入增长; 与此同时,嵌入表的大小变小了2650美元。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Clustered+Embedding+Learning+for+Recommender+Systems)|0| +|[MMMLP: Multi-modal Multilayer Perceptron for Sequential Recommendations](https://doi.org/10.1145/3543507.3583378)|Jiahao Liang, Xiangyu Zhao, Muyang Li, Zijian Zhang, Wanyu Wang, Haochen Liu, Zitao Liu|City University of Hong Kong, Hong Kong; Jilin University, China and City University of Hong Kong, Hong Kong; University of Sydney, Australia; Michigan State University, USA; Guangdong Institute of Smart Education, Jinan University, China|Sequential recommendation aims to offer potentially interesting products to users by capturing their historical sequence of interacted items. Although it has facilitated extensive physical scenarios, sequential recommendation for multi-modal sequences has long been neglected. Multi-modal data that depicts a user’s historical interactions exists ubiquitously, such as product pictures, textual descriptions, and interacted item sequences, providing semantic information from multiple perspectives that comprehensively describe a user’s preferences. However, existing sequential recommendation methods either fail to directly handle multi-modality or suffer from high computational complexity. To address this, we propose a novel Multi-Modal Multi-Layer Perceptron (MMMLP) for maintaining multi-modal sequences for sequential recommendation. MMMLP is a purely MLP-based architecture that consists of three modules - the Feature Mixer Layer, Fusion Mixer Layer, and Prediction Layer - and has an edge on both efficacy and efficiency. Extensive experiments show that MMMLP achieves state-of-the-art performance with linear complexity. We also conduct ablating analysis to verify the contribution of each component. Furthermore, compatible experiments are devised, and the results show that the multi-modal representation learned by our proposed model generally benefits other recommendation models, emphasizing our model’s ability to handle multi-modal information. 
We have made our code available online to ease reproducibility1.|顺序推荐旨在通过获取用户交互项的历史顺序,为用户提供潜在有趣的产品。虽然它促进了广泛的物理场景,多模态序列的顺序推荐长期以来被忽视。描述用户历史交互的多模态数据无处不在,比如产品图片、文本描述和交互式项目序列,从多个角度提供语义信息,全面描述用户的偏好。然而,现有的顺序推荐方法要么不能直接处理多模态问题,要么计算复杂度较高。为了解决这个问题,我们提出了一种新的多模态多层感知器(MMMLP)来维护多模态序列的顺序推荐。MMLP 是一个纯粹基于 MLP 的架构,它由三个模块组成——特征混合层、融合混合层和预测层——并且在功效和效率方面都有优势。大量的实验表明,MMLP 在线性复杂度方面达到了最先进的性能。我们还进行了烧蚀分析,以验证每个组分的贡献。此外,设计了相容性实验,结果表明,我们提出的模型学习的多模态表示一般有利于其他推荐模型,强调我们的模型的能力,处理多模态信息。我们已经在网上提供了我们的代码,以便于重现。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=MMMLP:+Multi-modal+Multilayer+Perceptron+for+Sequential+Recommendations)|0| +|[AutoMLP: Automated MLP for Sequential Recommendations](https://doi.org/10.1145/3543507.3583440)|Muyang Li, Zijian Zhang, Xiangyu Zhao, Wanyu Wang, Minghao Zhao, Runze Wu, Ruocheng Guo|City University of Hong Kong, Hong Kong; Bytedance AI Lab UK, United Kingdom; Fuxi AI Lab, NetEase, China; City University of Hong Kong, Hong Kong and Jilin University, China; City University of Hong Kong, Hong Kong and University of Sydney, Australia|Sequential recommender systems aim to predict users' next interested item given their historical interactions. However, a long-standing issue is how to distinguish between users' long/short-term interests, which may be heterogeneous and contribute differently to the next recommendation. Existing approaches usually set pre-defined short-term interest length by exhaustive search or empirical experience, which is either highly inefficient or yields subpar results. The recent advanced transformer-based models can achieve state-of-the-art performances despite the aforementioned issue, but they have a quadratic computational complexity to the length of the input sequence. To this end, this paper proposes a novel sequential recommender system, AutoMLP, aiming for better modeling users' long/short-term interests from their historical interactions. In addition, we design an automated and adaptive search algorithm for preferable short-term interest length via end-to-end optimization. Through extensive experiments, we show that AutoMLP has competitive performance against state-of-the-art methods, while maintaining linear computational complexity.|顺序推荐系统的目的是预测用户的下一个感兴趣的项目给予他们的历史交互。然而,一个长期存在的问题是如何区分用户的长期和短期利益,这可能是不同的,并作出不同的贡献下一个建议。现有方法通常通过穷举搜索或实证经验来设定预先确定的短期利率长度,这种方法要么效率极低,要么效果不佳。尽管存在上述问题,最近的先进的基于变压器的模型能够实现最先进的性能,但是它们对于输入序列的长度具有二次计算复杂度。为此,本文提出了一种新的顺序推荐系统—— AutoMLP,旨在从用户的历史交互中更好地建立用户的长期/短期兴趣模型。此外,我们设计了一个自动化和自适应的搜索算法,通过端到端优化较好的短期兴趣长度。通过大量的实验,我们发现 AutoMLP 在保持线性计算复杂度的同时,具有与最先进的方法相竞争的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=AutoMLP:+Automated+MLP+for+Sequential+Recommendations)|0| +|[NASRec: Weight Sharing Neural Architecture Search for Recommender Systems](https://doi.org/10.1145/3543507.3583446)|Tunhou Zhang, Dehua Cheng, Yuchen He, Zhengxing Chen, Xiaoliang Dai, Liang Xiong, Feng Yan, Hai Li, Yiran Chen, Wei Wen|Duke University, USA; Meta AI, USA; University of Houston, USA|The rise of deep neural networks offers new opportunities in optimizing recommender systems. However, optimizing recommender systems using deep neural networks requires delicate architecture fabrication. We propose NASRec, a paradigm that trains a single supernet and efficiently produces abundant models/sub-architectures by weight sharing. To overcome the data multi-modality and architecture heterogeneity challenges in the recommendation domain, NASRec establishes a large supernet (i.e., search space) to search the full architectures. 
The supernet incorporates versatile choice of operators and dense connectivity to minimize human efforts for finding priors. The scale and heterogeneity in NASRec impose several challenges, such as training inefficiency, operator-imbalance, and degraded rank correlation. We tackle these challenges by proposing single-operator any-connection sampling, operator-balancing interaction modules, and post-training fine-tuning. Our crafted models, NASRecNet, show promising results on three Click-Through Rates (CTR) prediction benchmarks, indicating that NASRec outperforms both manually designed models and existing NAS methods with state-of-the-art performance. Our work is publicly available at https://github.com/facebookresearch/NasRec.|深层神经网络的兴起为优化推荐系统提供了新的机会。然而,使用深层神经网络优化推荐系统需要精细的架构制作。我们提出 NASRec,一个训练单个超级网络并通过权重分享有效地产生丰富的模型/子架构的范例。为了克服推荐域中的数据多态性和体系结构异构性挑战,NASRec 建立了一个大型超网(即搜索空间)来搜索完整的体系结构。超级网结合了多种操作员的选择和密集的连接,以最大限度地减少人的努力,找到前科。NASRec 的规模和异质性带来了一些挑战,如培训效率低下、操作员失衡和等级相关性降低。我们通过提出单操作者任意连接采样、操作者平衡交互模块和训练后微调来应对这些挑战。我们精心设计的模型 NASRecNet 在三个点击率(Click-Through Rate,CTR)预测基准上显示出有希望的结果,表明 NASRecc 的性能优于手工设计的模型和现有的 NAS 方法,具有最先进的性能。我们的工作 https://github.com/facebookresearch/nasrec 公开。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=NASRec:+Weight+Sharing+Neural+Architecture+Search+for+Recommender+Systems)|0| |[Membership Inference Attacks Against Sequential Recommender Systems](https://doi.org/10.1145/3543507.3583447)|Zhihao Zhu, Chenwang Wu, Rui Fan, Defu Lian, Enhong Chen|University of Science and Technology of China, China|Recent studies have demonstrated the vulnerability of recommender systems to membership inference attacks, which determine whether a user’s historical data was utilized for model training, posing serious privacy leakage issues. Existing works assumed that member and non-member users follow different recommendation modes, and then infer membership based on the difference vector between the user’s historical behaviors and the recommendation list. The previous frameworks are invalid against inductive recommendations, such as sequential recommendations, since the disparities of difference vectors constructed by the recommendations between members and non-members become imperceptible. This motivates us to dig deeper into the target model. In addition, most MIA frameworks assume that they can obtain some in-distribution data from the same distribution of the target data, which is hard to gain in recommender system. To address these difficulties, we propose a Membership Inference Attack framework against sequential recommenders based on Model Extraction(ME-MIA). Specifically, we train a surrogate model to simulate the target model based on two universal loss functions. For a given behavior sequence, the loss functions ensure the recommended items and corresponding rank of the surrogate model are consistent with the target model’s recommendation. Due to the special training mode of the surrogate model, it is hard to judge which user is its member(non-member). Therefore, we establish a shadow model and use shadow model’s members(non-members) to train the attack model later. Next, we build a user feature generator to construct representative feature vectors from the shadow(surrogate) model. The crafting feature vectors are finally input into the attack model to identify users’ membership. Furthermore, to tackle the high cost of obtaining in-distribution data, we develop two variants of ME-MIA, realizing data-efficient and even data-free MIA by fabricating authentic in-distribution data. 
Notably, the latter is impossible in the previous works. Finally, we evaluate ME-MIA against multiple sequential recommendation models on three real-world datasets. Experimental results show that ME-MIA and its variants can achieve efficient extraction and outperform state-of-the-art algorithms in terms of attack performance.|最近的研究表明,推荐系统容易受到成员推断攻击,这决定了用户的历史数据是否被用于模型训练,造成严重的隐私泄露问题。现有的研究假设成员用户和非成员用户遵循不同的推荐模式,然后根据用户历史行为和推荐列表之间的差异向量推断成员关系。以前的框架对于归纳推荐(如顺序推荐)是无效的,因为成员和非成员之间由推荐构造的差异向量的差异变得不可察觉。这促使我们更深入地研究目标模型。此外,大多数 MIA 框架都假定它们可以从目标数据的同一分布中获得一些分布内数据,而这在推荐系统中是很难获得的。为了解决这些问题,我们提出了一个基于模型提取(ME-MIA)的针对顺序推荐的成员推理攻击框架。具体来说,我们训练了一个代理模型来模拟目标模型基于两个通用的损失函数。对于给定的行为序列,损失函数保证代理模型的推荐项和相应的等级与目标模型的推荐一致。由于代理模型的特殊训练模式,很难判断哪个用户是它的成员(非成员)。因此,我们建立了一个阴影模型,然后利用阴影模型的成员(非成员)来训练攻击模型。接下来,我们构建一个用户特征生成器来从阴影(代理)模型中构造具有代表性的特征向量。最后将特征向量输入到攻击模型中,识别用户的隶属关系。此外,为了解决获取内部分发数据的高成本问题,我们开发了两种不同的 ME-MIA,通过制作真实的内部分发数据来实现数据高效甚至无数据的 MIA。值得注意的是,后者在前面的作品中是不可能的。最后,我们在三个实际数据集上对多个顺序推荐模型进行 ME-MIA 评估。实验结果表明,ME-MIA 算法及其变体能够实现有效的提取,并且在攻击性能方面优于目前最先进的算法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Membership+Inference+Attacks+Against+Sequential+Recommender+Systems)|0| |[Communicative MARL-based Relevance Discerning Network for Repetition-Aware Recommendation](https://doi.org/10.1145/3543507.3583459)|Kaiyuan Li, Pengfei Wang, Haitao Wang, Qiang Liu, Xingxing Wang, Dong Wang, Shangguang Wang|Beijing University of Posts and Telecommunications, China; Meituan, China|The repeated user-item interaction now is becoming a common phenomenon in the e-commerce scenario. Due to its potential economic profit, various models are emerging to predict which item will be re-interacted based on the user-item interactions. In this specific scenario, item relevance is a critical factor that needs to be concerned, which tends to have different effects on the succeeding re-interacted one (i.e., stimulating or delaying its emergence). It is necessary to make a detailed discernment of item relevance for a better repetition-aware recommendation. Unfortunately, existing works usually mixed all these types, which may disturb the learning process and result in poor performance. In this paper, we introduce a novel Communicative MARL-based Relevance Discerning Network (CARD for short) to automatically discern the item relevance for a better repetition-aware recommendation. Specifically, CARD formalizes the item relevance discerning problem into a communication selection process in MARL. CARD treats each unique interacted item as an agent and defines three different communication types over agents, which are stimulative, inhibitive, and noisy respectively. After this, CARD utilizes a Gumbel-enhanced classifier to distinguish the communication types among agents, and an attention-based Reactive Point Process is further designed to transmit the well-discerned stimulative and inhibitive incentives separately among all agents to make an effective collaboration for repetition decisions. Experimental results on two real-world e-commerce datasets show that our proposed method outperforms the state-of-the-art recommendation methods in terms of both sequential and repetition-aware recommenders.
Furthermore, CARD is also deployed in the online sponsored search advertising system in Meituan, obtaining a performance improvement of over 1.5% and 1.2% in CTR and effective Cost Per Mille (eCPM) respectively, which is significant to the business.|重复的用户-项目交互现在正在成为电子商务场景中的一个普遍现象。由于其潜在的经济利益,各种模型正在出现,以预测哪些项目将重新交互的基础上,用户项目的交互。在这个特定的场景中,项目相关性是一个需要关注的关键因素,它往往对后续的重新相互作用有不同的影响(即,刺激或延迟其出现)。有必要对项目的相关性进行详细的识别,以便更好地提出有重复意识的建议。不幸的是,现有的作品往往混合了所有这些类型,这可能会干扰学习过程,导致较差的表现。本文介绍了一种新的基于交际 MARL 的关联识别网络(CARD) ,该网络可以自动识别项目的相关性,从而获得更好的重复感知推荐。特别地,CARD 将项目相关性识别问题形式化为 MARL 中的通信选择过程。CARD 将每个独特的交互项目视为一个代理,并定义了代理上的三种不同的通信类型,分别是刺激性、抑制性和噪声性。此后,CARD 利用 Gumbel 增强的分类器来区分代理人之间的通信类型,并进一步设计基于注意力的反应点过程,以在所有代理人之间分别传递明确的刺激和抑制激励,以便为重复决策进行有效的协作。在两个实际电子商务数据集上的实验结果表明,该方法在顺序推荐和重复推荐方面都优于目前最先进的推荐方法。此外,CARD 还部署在在线赞助的搜索广告系统中,美团点击率和有效每公里成本(eCPM)分别提高了1.5% 和1.2% ,这对业务具有重要意义。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Communicative+MARL-based+Relevance+Discerning+Network+for+Repetition-Aware+Recommendation)|0| -|[Personalized Graph Signal Processing for Collaborative Filtering](https://doi.org/10.1145/3543507.3583466)|Jiahao Liu, Dongsheng Li, Hansu Gu, Tun Lu, Peng Zhang, Li Shang, Ning Gu|School of Computer Science, Fudan University, China and Shanghai Key Laboratory of Data Science, Fudan University, China; Amazon, USA; Microsoft Research Asia, China|The collaborative filtering (CF) problem with only user-item interaction information can be solved by graph signal processing (GSP), which uses low-pass filters to smooth the observed interaction signals on the similarity graph to obtain the prediction signals. However, the interaction signal may not be sufficient to accurately characterize user interests and the low-pass filters may ignore the useful information contained in the high-frequency component of the observed signals, resulting in suboptimal accuracy. To this end, we propose a personalized graph signal processing (PGSP) method for collaborative filtering. Firstly, we design the personalized graph signal containing richer user information and construct an augmented similarity graph containing more graph topology information, to more effectively characterize user interests. Secondly, we devise a mixed-frequency graph filter to introduce useful information in the high-frequency components of the observed signals by combining an ideal low-pass filter that smooths signals globally and a linear low-pass filter that smooths signals locally. Finally, we combine the personalized graph signal, the augmented similarity graph and the mixed-frequency graph filter by proposing a pipeline consisting of three key steps: pre-processing, graph convolution and post-processing.
Extensive experiments show that PGSP can achieve superior accuracy compared with state-of-the-art CF methods and, as a nonparametric method, PGSP has very high training efficiency.|图形信号处理(gSP)可以解决只有用户-项目交互信息的协同过滤(CF)问题,它使用低通滤波器平滑相似图上观察到的交互信号,以获得预测信号。然而,交互信号可能不足以准确地表征用户的兴趣,而且低通滤波器可能会忽略观测信号的高频分量中包含的有用信息,从而导致次优精度。为此,我们提出了一个个性化的图形信号处理(PgSP)方法来处理协同过滤。首先,设计了包含更丰富用户信息的个性化图形信号,构造了包含更多图形拓扑信息的增广相似度图,以更有效地刻画用户兴趣。其次,我们设计了一个混合频率图形滤波器,通过结合理想的低通滤波器对信号进行全局平滑和线性低通滤波器对信号进行局部平滑,从而在观测信号的高频成分中引入有用的信息。最后,结合个性化图形信号、增强相似图和混合频率图滤波,提出了一种由预处理、图卷积和后处理三个关键步骤组成的流水线。大量的实验表明,与现有的 CF 方法相比,PGSP 具有更高的精度,并且作为一种非参数方法,PGSP 具有很高的训练效率。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Personalized+Graph+Signal+Processing+for+Collaborative+Filtering)|0| +|[Personalized Graph Signal Processing for Collaborative Filtering](https://doi.org/10.1145/3543507.3583466)|Jiahao Liu, Dongsheng Li, Hansu Gu, Tun Lu, Peng Zhang, Li Shang, Ning Gu|Microsoft Research Asia, China; Amazon, USA; School of Computer Science, Fudan University, China and Shanghai Key Laboratory of Data Science, Fudan University, China|The collaborative filtering (CF) problem with only user-item interaction information can be solved by graph signal processing (GSP), which uses low-pass filters to smooth the observed interaction signals on the similarity graph to obtain the prediction signals. However, the interaction signal may not be sufficient to accurately characterize user interests and the low-pass filters may ignore the useful information contained in the high-frequency component of the observed signals, resulting in suboptimal accuracy. To this end, we propose a personalized graph signal processing (PGSP) method for collaborative filtering. Firstly, we design the personalized graph signal containing richer user information and construct an augmented similarity graph containing more graph topology information, to more effectively characterize user interests. Secondly, we devise a mixed-frequency graph filter to introduce useful information in the high-frequency components of the observed signals by combining an ideal low-pass filter that smooths signals globally and a linear low-pass filter that smooths signals locally. Finally, we combine the personalized graph signal, the augmented similarity graph and the mixed-frequency graph filter by proposing a pipeline consisting of three key steps: pre-processing, graph convolution and post-processing. 
Extensive experiments show that PGSP can achieve superior accuracy compared with state-of-the-art CF methods and, as a nonparametric method, PGSP has very high training efficiency.|图形信号处理(gSP)可以解决只有用户-项目交互信息的协同过滤(CF)问题,它使用低通滤波器平滑相似图上观察到的交互信号,以获得预测信号。然而,交互信号可能不足以准确地表征用户的兴趣,而且低通滤波器可能会忽略观测信号的高频分量中包含的有用信息,从而导致次优精度。为此,我们提出了一个个性化的图形信号处理(PgSP)方法来处理协同过滤。首先,设计了包含更丰富用户信息的个性化图形信号,构造了包含更多图形拓扑信息的增广相似度图,以更有效地刻画用户兴趣。其次,我们设计了一个混合频率图形滤波器,通过结合理想的低通滤波器对信号进行全局平滑和线性低通滤波器对信号进行局部平滑,从而在观测信号的高频成分中引入有用的信息。最后,结合个性化图形信号、增强相似图和混合频率图滤波,提出了一种由预处理、图卷积和后处理三个关键步骤组成的流水线。大量的实验表明,与现有的 CF 方法相比,PGSP 具有更高的精度,并且作为一种非参数方法,PGSP 具有很高的训练效率。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Personalized+Graph+Signal+Processing+for+Collaborative+Filtering)|0| |[Multi-Task Recommendations with Reinforcement Learning](https://doi.org/10.1145/3543507.3583467)|Ziru Liu, Jiejie Tian, Qingpeng Cai, Xiangyu Zhao, Jingtong Gao, Shuchang Liu, Dayou Chen, Tonghao He, Dong Zheng, Peng Jiang, Kun Gai|City University of Hong Kong, China; Kuaishou, China; Unaffiliated, China|In recent years, Multi-task Learning (MTL) has yielded immense success in Recommender System (RS) applications. However, current MTL-based recommendation models tend to disregard the session-wise patterns of user-item interactions because they are predominantly constructed based on item-wise datasets. Moreover, balancing multiple objectives has always been a challenge in this field, which is typically avoided via linear estimations in existing works. To address these issues, in this paper, we propose a Reinforcement Learning (RL) enhanced MTL framework, namely RMTL, to combine the losses of different recommendation tasks using dynamic weights. To be specific, the RMTL structure can address the two aforementioned issues by (i) constructing an MTL environment from session-wise interactions and (ii) training multi-task actor-critic network structure, which is compatible with most existing MTL-based recommendation models, and (iii) optimizing and fine-tuning the MTL loss function using the weights generated by critic networks. Experiments on two real-world public datasets demonstrate the effectiveness of RMTL with a higher AUC against state-of-the-art MTL-based recommendation models. Additionally, we evaluate and validate RMTL's compatibility and transferability across various MTL models.|近年来,多任务学习在推荐系统应用方面取得了巨大的成功。然而,目前基于 MTL 的推荐模型倾向于忽略用户项目交互的会话模式,因为它们主要是基于项目数据集构建的。此外,平衡多个目标一直是这个领域的一个挑战,这通常是通过现有工作中的线性估计来避免的。为了解决这些问题,在本文中,我们提出了一个强化学习增强的 MTL 框架,即 RMTL,它使用动态权重来组合不同推荐任务的丢失。具体来说,RMTL 结构可以解决上述两个问题: (1)通过会话交互构建 MTL 环境; (2)训练与大多数基于 MTL 的推荐模型兼容的多任务参与者-评论者网络结构; (3)利用评论者网络生成的权重优化和微调 MTL 损失函数。在两个真实世界的公共数据集上的实验证明了具有较高 AUC 的 RMTL 对基于最新 MTL 的推荐模型的有效性。此外,我们评估和验证 RMTL 的兼容性和跨各种 MTL 模型的可转移性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Multi-Task+Recommendations+with+Reinforcement+Learning)|0| -|[A Self-Correcting Sequential Recommender](https://doi.org/10.1145/3543507.3583479)|Yujie Lin, Chenyang Wang, Zhumin Chen, Zhaochun Ren, Xin Xin, Qiang Yan, Maarten de Rijke, Xiuzhen Cheng, Pengjie Ren|WeChat, Tencent, China; University of Amsterdam, Netherlands; Shandong University, China|Sequential recommendations aim to capture users' preferences from their historical interactions so as to predict the next item that they will interact with. Sequential recommendation methods usually assume that all items in a user's historical interactions reflect her/his preferences and transition patterns between items. 
However, real-world interaction data is imperfect in that (i) users might erroneously click on items, i.e., so-called misclicks on irrelevant items, and (ii) users might miss items, i.e., unexposed relevant items due to inaccurate recommendations. To tackle the two issues listed above, we propose STEAM, a Self-correcTing sEquentiAl recoMmender. STEAM first corrects an input item sequence by adjusting the misclicked and/or missed items. It then uses the corrected item sequence to train a recommender and make the next item prediction.We design an item-wise corrector that can adaptively select one type of operation for each item in the sequence. The operation types are 'keep', 'delete' and 'insert.' In order to train the item-wise corrector without requiring additional labeling, we design two self-supervised learning mechanisms: (i) deletion correction (i.e., deleting randomly inserted items), and (ii) insertion correction (i.e., predicting randomly deleted items). We integrate the corrector with the recommender by sharing the encoder and by training them jointly. We conduct extensive experiments on three real-world datasets and the experimental results demonstrate that STEAM outperforms state-of-the-art sequential recommendation baselines. Our in-depth analyses confirm that STEAM benefits from learning to correct the raw item sequences.|序贯推荐旨在从用户的历史交互中获取他们的偏好,从而预测他们将要交互的下一个项目。顺序推荐方法通常假设用户历史交互中的所有项目都反映了用户的偏好和项目之间的转换模式。然而,真实世界的交互数据是不完美的,因为(i)用户可能会错误地点击项目,即所谓的不相关项目的错误点击,以及(ii)用户可能会错过项目,即由于不准确的推荐而未公开的相关项目。为了解决上面列出的两个问题,我们提出 STEAM,一个自我修正的 sEquentiAl 推荐器。STEAM 首先通过调整错误点击和/或错过的项目来更正输入项目序列。然后使用校正后的项目序列来训练推荐者并对下一个项目进行预测。我们设计了一个项目校正器,它可以自适应地为序列中的每个项目选择一种操作类型。操作类型为“ keep”、“ delete”和“ insert”。为了训练项目校正器而不需要额外的标签,我们设计了两个自我监督学习机制: (i)删除校正(即删除随机插入的项目)和(ii)插入校正(即预测随机删除的项目)。我们通过共享编码器和共同训练,将校正器和推荐器结合起来。我们在三个真实世界的数据集上进行了广泛的实验,实验结果表明 STEAM 的性能优于最先进的顺序推荐基线。我们的深入分析证实,STEAM 受益于学习纠正原始项目序列。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Self-Correcting+Sequential+Recommender)|0| +|[A Self-Correcting Sequential Recommender](https://doi.org/10.1145/3543507.3583479)|Yujie Lin, Chenyang Wang, Zhumin Chen, Zhaochun Ren, Xin Xin, Qiang Yan, Maarten de Rijke, Xiuzhen Cheng, Pengjie Ren|WeChat, Tencent, China; Shandong University, China; University of Amsterdam, Netherlands|Sequential recommendations aim to capture users' preferences from their historical interactions so as to predict the next item that they will interact with. Sequential recommendation methods usually assume that all items in a user's historical interactions reflect her/his preferences and transition patterns between items. However, real-world interaction data is imperfect in that (i) users might erroneously click on items, i.e., so-called misclicks on irrelevant items, and (ii) users might miss items, i.e., unexposed relevant items due to inaccurate recommendations. To tackle the two issues listed above, we propose STEAM, a Self-correcTing sEquentiAl recoMmender. STEAM first corrects an input item sequence by adjusting the misclicked and/or missed items. It then uses the corrected item sequence to train a recommender and make the next item prediction.We design an item-wise corrector that can adaptively select one type of operation for each item in the sequence. The operation types are 'keep', 'delete' and 'insert.' 
In order to train the item-wise corrector without requiring additional labeling, we design two self-supervised learning mechanisms: (i) deletion correction (i.e., deleting randomly inserted items), and (ii) insertion correction (i.e., predicting randomly deleted items). We integrate the corrector with the recommender by sharing the encoder and by training them jointly. We conduct extensive experiments on three real-world datasets and the experimental results demonstrate that STEAM outperforms state-of-the-art sequential recommendation baselines. Our in-depth analyses confirm that STEAM benefits from learning to correct the raw item sequences.|序贯推荐旨在从用户的历史交互中获取他们的偏好,从而预测他们将要交互的下一个项目。顺序推荐方法通常假设用户历史交互中的所有项目都反映了用户的偏好和项目之间的转换模式。然而,真实世界的交互数据是不完美的,因为(i)用户可能会错误地点击项目,即所谓的不相关项目的错误点击,以及(ii)用户可能会错过项目,即由于不准确的推荐而未公开的相关项目。为了解决上面列出的两个问题,我们提出 STEAM,一个自我修正的 sEquentiAl 推荐器。STEAM 首先通过调整错误点击和/或错过的项目来更正输入项目序列。然后使用校正后的项目序列来训练推荐者并对下一个项目进行预测。我们设计了一个项目校正器,它可以自适应地为序列中的每个项目选择一种操作类型。操作类型为“ keep”、“ delete”和“ insert”。为了训练项目校正器而不需要额外的标签,我们设计了两个自我监督学习机制: (i)删除校正(即删除随机插入的项目)和(ii)插入校正(即预测随机删除的项目)。我们通过共享编码器和共同训练,将校正器和推荐器结合起来。我们在三个真实世界的数据集上进行了广泛的实验,实验结果表明 STEAM 的性能优于最先进的顺序推荐基线。我们的深入分析证实,STEAM 受益于学习纠正原始项目序列。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Self-Correcting+Sequential+Recommender)|0| |[Confident Action Decision via Hierarchical Policy Learning for Conversational Recommendation](https://doi.org/10.1145/3543507.3583536)|Heeseon Kim, Hyeongjun Yang, KyongHo Lee|Department of Computer Science, Yonsei University, Republic of Korea|Conversational recommender systems (CRS) aim to acquire a user’s dynamic interests for a successful recommendation. By asking about his/her preferences, CRS explore current needs of a user and recommend items of interest. However, previous works may not determine a proper action in a timely manner which leads to the insufficient information gathering and the waste of conversation turns. Since they learn a single decision policy, it is difficult for them to address the general decision problems in CRS. Besides, existing methods do not distinguish whether the past behaviors inferred from the historical interactions are closely related to the user’s current preference. To address these issues, we propose a novel Hierarchical policy learning based Conversational Recommendation framework (HiCR). HiCR formulates the multi-round decision making process as a hierarchical policy learning scheme, which consists of both a high-level policy and a low-level policy. In detail, the high-level policy aims to determine what type of action to take, such as a recommendation or a query, by observing the comprehensive conversation information. According to the decided action type, the low-level policy selects a specific action, such as which attribute to ask or which item to recommend. The hierarchical conversation policy enables CRS to decide an optimal action, resulting in reducing the unnecessary consumption of conversation turns and the continuous failure of recommendations. Furthermore, in order to filter out the unnecessary historical information when enriching the current user preference, we extract and utilize the informative past behaviors that are attentive to the current needs. 
Empirical experiments on four real-world datasets show the superiority of our approach against the current state-of-the-art methods.|会话推荐系统(CRS)的目标是获取用户的动态兴趣,从而实现成功的推荐。通过询问用户的偏好,CRS 探索用户当前的需求并推荐感兴趣的项目。然而,以往的作品不能及时确定适当的行动,导致信息收集不足和谈话的浪费。由于他们只学习单一的决策策略,因此很难解决 CRS 中的一般决策问题。此外,现有的方法不能区分从历史交互中推断出的过去行为是否与用户当前的偏好密切相关。为了解决这些问题,我们提出了一种新的基于层次策略学习的会话推荐框架(HiCR)。HiCR 将多轮决策过程描述为一个分层的决策学习方案,该方案由高层决策和低层决策两部分组成。具体来说,高级策略旨在通过观察全面的会话信息来确定采取何种类型的操作,比如推荐或查询。根据确定的操作类型,底层策略选择一个特定的操作,比如询问哪个属性或推荐哪个项目。分层对话策略使 CRS 能够决定一个最优的操作,从而减少不必要的话轮消耗和建议的持续失败。此外,为了在丰富当前用户偏好时过滤掉不必要的历史信息,我们提取并利用了关注当前需求的信息性过去行为。在四个真实世界数据集上的实验表明了我们的方法相对于当前最先进的方法的优越性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Confident+Action+Decision+via+Hierarchical+Policy+Learning+for+Conversational+Recommendation)|0| |[Mutual Wasserstein Discrepancy Minimization for Sequential Recommendation](https://doi.org/10.1145/3543507.3583529)|Ziwei Fan, Zhiwei Liu, Hao Peng, Philip S. Yu|University of Illinois Chicago, USA; Beihang University, USA; Salesforce AI Research, USA|Self-supervised sequential recommendation significantly improves recommendation performance by maximizing mutual information with well-designed data augmentations. However, the mutual information estimation is based on the calculation of Kullback Leibler divergence with several limitations, including asymmetrical estimation, the exponential need of the sample size, and training instability. Also, existing data augmentations are mostly stochastic and can potentially break sequential correlations with random modifications. These two issues motivate us to investigate an alternative robust mutual information measurement capable of modeling uncertainty and alleviating KL divergence limitations. To this end, we propose a novel self-supervised learning framework based on Mutual WasserStein discrepancy minimization MStein for the sequential recommendation. We propose the Wasserstein Discrepancy Measurement to measure the mutual information between augmented sequences. Wasserstein Discrepancy Measurement builds upon the 2-Wasserstein distance, which is more robust, more efficient in small batch sizes, and able to model the uncertainty of stochastic augmentation processes. We also propose a novel contrastive learning loss based on Wasserstein Discrepancy Measurement. Extensive experiments on four benchmark datasets demonstrate the effectiveness of MStein over baselines. More quantitative analyses show the robustness against perturbations and training efficiency in batch size. Finally, improvements analysis indicates better representations of popular users or items with significant uncertainty. 
The source code is at https://github.com/zfan20/MStein.|自监督顺序推荐通过设计良好的数据增强最大化互信息,显著提高了推荐性能。然而,互信息估计是基于 Kullback Leibler 散度的计算,具有不对称估计、样本量的指数需求和训练不稳定性等局限性。此外,现有的数据扩充大多是随机的,并可能打破随机修改顺序相关性。这两个问题促使我们研究一种可替代的鲁棒互信息测量方法,该方法能够对不确定性进行建模并减轻 KL 发散的限制。为此,我们提出了一种新的基于 Mutual WasserStein 差异最小化 MStein 的自监督学习框架,用于顺序推荐。我们提出了 Wasserstein 差异度量来度量增广序列之间的互信息。Wasserstein 误差测度建立在2-Wasserstein 距离的基础上,该距离在小批量情况下具有更强的鲁棒性和更高的效率,能够对随机增量过程的不确定性进行建模。我们还提出了一种新的基于 Wasserstein 差异度量的对比学习损失。在四个基准数据集上的大量实验证明了 MStein 在基线上的有效性。进一步的定量分析表明,该算法具有较强的抗干扰能力,并且在批量情况下具有较高的训练效率。最后,改进分析表明,流行用户或具有显著不确定性的项目的表示更好。源代码在 https://github.com/zfan20/mstein。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Mutual+Wasserstein+Discrepancy+Minimization+for+Sequential+Recommendation)|0| |[Automatic Feature Selection By One-Shot Neural Architecture Search In Recommendation Systems](https://doi.org/10.1145/3543507.3583444)|He Wei, Yuekui Yang, Haiyang Wu, Yangyang Tang, Meixi Liu, Jianfeng Li|Machine learning platform department, TEG, Tencent, China and Department of Computer Science and Technology, Beijing National Research Center for Information Science and Technology, Tsinghua University, China; Machine learning platform department, TEG, Tencent, China|Feature selection is crucial in large-scale recommendation systems, as it can not only reduce the computational cost, but also improve the recommendation efficiency. Most existing works rank the features and then select the top-k ones as the final feature subset. However, they assess feature importance individually and ignore the interrelationship between features. Consequently, multiple features with high relevance may be selected simultaneously, resulting in sub-optimal results. In this work, we solve this problem by proposing an AutoML-based feature selection framework that can automatically search for the optimal feature subset. Specifically, we first embed the search space into a weight-sharing Supernet. Then, a two-stage neural architecture search method is employed to evaluate the feature quality. In the first stage, a well-designed sampling method considering feature convergence fairness is applied to train the Supernet. In the second stage, a reinforcement learning method is used to search for the optimal feature subset efficiently. Experimental results on two real datasets demonstrate the superior performance of the new framework over other solutions. Our proposed method obtains a significant improvement with a 20% reduction in the number of features on the Criteo dataset. 
More validation experiments demonstrate the ability and robustness of the framework.|特征选择是大规模推荐系统的关键,它不仅可以降低计算量,而且可以提高推荐效率。大多数已有的作品对特征进行排序,然后选择最上面的 k 个特征作为最终的特征子集。然而,他们单独评估特征的重要性,而忽略了特征之间的相互关系。因此,可以同时选择多个高相关性的特征,从而导致次优结果。针对这一问题,本文提出了一种基于 AutoML 的特征选择框架,该框架可以自动搜索最优特征子集。具体来说,我们首先将搜索空间嵌入到一个权重共享的超级网络中。然后,采用两阶段神经网络结构搜索方法对特征质量进行评价。在第一阶段,采用一种设计良好的考虑特征收敛公平性的抽样方法对超网进行训练。在第二阶段,使用强化学习方法有效地搜索最优特征子集。在两个实际数据集上的实验结果表明,新框架的性能优于其他解决方案。我们提出的方法获得了显着的改进,在标准的数量减少了20% 的特征。更多的验证实验证明了该框架的能力和鲁棒性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Automatic+Feature+Selection+By+One-Shot+Neural+Architecture+Search+In+Recommendation+Systems)|0| -|[Catch: Collaborative Feature Set Search for Automated Feature Engineering](https://doi.org/10.1145/3543507.3583527)|Guoshan Lu, Haobo Wang, Saisai Yang, Jing Yuan, Guozheng Yang, Cheng Zang, Gang Chen, Junbo Zhao|Zhejiang University, China; Zheshang Bank Co., Ltd., China; Institute of Computing Innovation, Zhejiang University, China|Feature engineering often plays a crucial role in building mining systems for tabular data, which traditionally requires experienced human experts to perform. Thanks to the rapid advances in reinforcement learning, it has offered an automated alternative, i.e. automated feature engineering (AutoFE). In this work, through scrutiny of the prior AutoFE methods, we characterize several research challenges that remained in this regime, concerning system-wide efficiency, efficacy, and practicality toward production. We then propose Catch, a full-fledged new AutoFE framework that comprehensively addresses the aforementioned challenges. The core to Catch composes a hierarchical-policy reinforcement learning scheme that manifests a collaborative feature engineering exploration and exploitation grounded on the granularity of the whole feature set. At a higher level of the hierarchy, a decision-making module controls the post-processing of the attained feature engineering transformation. We extensively experiment with Catch on 26 academic standardized tabular datasets and 9 industrialized real-world datasets. Measured by numerous metrics and analyses, Catch establishes a new state-of-the-art, from perspectives performance, latency as well as its practicality towards production. Source code1 can be found at https://github.com/1171000709/Catch.|在构建表格数据挖掘系统时,特征工程往往起着至关重要的作用,表格数据挖掘传统上需要有经验的人类专家来完成。由于强化学习的快速发展,它提供了一种自动化的替代方案,即自动化特征工程(AutoFE)。在这项工作中,通过审查以前的自动有限元方法,我们描述了几个研究挑战,仍然在这个制度,关于系统的效率,效率和实用性的生产。然后,我们建议使用 Catch,这是一个成熟的新的 AutoFE 框架,可以全面解决上述挑战。Core to Catch 组成了一个层次化的策略强化学习方案,体现了基于整个特性集粒度的协同特性工程探索和开发。在层次结构的更高层次上,决策模块控制所获得的特征工程变换的后处理。我们对26个学术标准化表格数据集和9个工业化真实世界数据集进行了广泛的实验。通过大量的度量和分析,Catch 从性能、延迟以及对生产的实用性的角度建立了一个新的最先进的状态。源代码1可在 https://github.com/1171000709/catch 找到。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Catch:+Collaborative+Feature+Set+Search+for+Automated+Feature+Engineering)|0| -|[The Hitchhiker's Guide to Facebook Web Tracking with Invisible Pixels and Click IDs](https://doi.org/10.1145/3543507.3583311)|Paschalis Bekos, Panagiotis Papadopoulos, Evangelos P. Markatos, Nicolas Kourtellis|FORTH, Greece; University of Crete/FORTH, Greece; Telefonica Research, Spain|Over the past years, advertisement companies have used various tracking methods to persistently track users across the web. Such tracking methods usually include first and third-party cookies, cookie synchronization, as well as a variety of fingerprinting mechanisms. 
Facebook (FB) recently introduced a new tagging mechanism that attaches a one-time tag as a URL parameter (FBCLID) on outgoing links to other websites. Although such a tag does not seem to have enough information to persistently track users, we demonstrate that despite its ephemeral nature, when combined with FB Pixel, it can aid in persistently monitoring user browsing behavior across i) different websites, ii) different actions on each website, iii) time, i.e., both in the past as well as in the future. We refer to this online monitoring of users as FB web tracking. We find that FB Pixel tracks a wide range of user activities on websites with alarming detail, especially on websites classified as sensitive categories under GDPR. Also, we show how the FBCLID tag can be used to match, and thus de-anonymize, activities of online users performed in the distant past (even before those users had a FB account) tracked by FB Pixel. In fact, by combining this tag with cookies that have rolling expiration dates, FB can also keep track of users' browsing activities in the future as well. Our experimental results suggest that 23% of the 10k most popular websites have adopted this technology, and can contribute to this activity tracking on the web. Furthermore, our longitudinal study shows that this type of user activity tracking can go as far back as 2015. Simply said, if a user creates for the first time a FB account today, FB could, under some conditions, match their anonymously collected past web browsing activity to their newly created FB profile, from as far back as 2015 and continue tracking their activity in the future.|在过去的几年里,广告公司使用了各种各样的跟踪方法来持续跟踪网络上的用户。这种跟踪方法通常包括第一方和第三方 cookie、 cookie 同步以及各种指纹识别机制。Facebook (FB)最近推出了一种新的标签机制,它将一次性标签作为 URL 参数(FBCLID)附加到其他网站的外向链接上。虽然这样的标签似乎没有足够的信息来持续跟踪用户,我们证明,尽管它的短暂性质,当结合 FB 像素,它可以帮助持续监测用户浏览行为在 i)不同的网站,ii)不同的行动在每个网站,iii)时间,即在过去和未来。我们把这种对用户的在线监控称为 FB 网络跟踪。我们发现,FB 像素跟踪广泛的用户活动的网站具有惊人的细节,特别是在网站归类为敏感类别下的 GDPR。此外,我们展示了如何使用 FBCLID 标签来匹配,从而去匿名,在线用户的活动执行在遥远的过去(甚至在那些用户有一个 FB 帐户之前)由 FB 像素跟踪。事实上,通过将这个标签与具有滚动过期日期的 cookie 相结合,FB 还可以跟踪用户未来的浏览活动。我们的实验结果表明,23% 的10k 最受欢迎的网站已经采用了这项技术,并可以有助于在网上跟踪这项活动。此外,我们的追踪研究显示,这种类型的用户活动跟踪可以追溯到2015年。简单地说,如果一个用户今天第一次创建一个 FB 帐户,FB 可以,在某些条件下,匹配他们的匿名收集过去的网页浏览活动和他们新创建的 FB 配置文件,从2015年开始,并在未来继续跟踪他们的活动。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=The+Hitchhiker's+Guide+to+Facebook+Web+Tracking+with+Invisible+Pixels+and+Click+IDs)|0| -|[Atrapos: Real-time Evaluation of Metapath Query Workloads](https://doi.org/10.1145/3543507.3583322)|Serafeim Chatzopoulos, Thanasis Vergoulis, Dimitrios Skoutas, Theodore Dalamagas, Christos Tryfonopoulos, Panagiotis Karras|Aarhus University, Denmark; University of the Peloponnese, Greece; University of the Peloponnese, Greece and IMSI, Athena RC, Greece; IMSI, Athena RC, Greece|Heterogeneous information networks (HINs) represent different types of entities and relationships between them. Exploring and mining HINs relies on metapath queries that identify pairs of entities connected by relationships of diverse semantics. While the real-time evaluation of metapath query workloads on large, web-scale HINs is highly demanding in computational cost, current approaches do not exploit interrelationships among the queries. In this paper, we present Atrapos, a new approach for the real-time evaluation of metapath query workloads that leverages a combination of efficient sparse matrix multiplication and intermediate result caching. 
Atrapos selects intermediate results to cache and reuse by detecting frequent sub-metapaths among workload queries in real time, using a tailor-made data structure, the Overlap Tree, and an associated caching policy. Our experimental study on real data shows that Atrapos  accelerates exploratory data analysis and mining on HINs, outperforming off-the-shelf caching approaches and state-of-the-art research prototypes in all examined scenarios.|异构信息网络(HIN)表示不同类型的实体以及它们之间的关系。探索和挖掘 HIN 依赖于元路径查询,这些查询标识由不同语义关系连接的实体对。尽管对大型 Web 规模 HIN 上的元路径查询工作负载进行实时评估对计算成本要求很高,但目前的方法没有利用查询之间的相互关系。在这篇文章中,我们介绍了一种新的实时评估元路径查询工作负载的方法—— Arapos,它结合了高效的稀疏矩阵乘法和中间结果缓存。Apapos 通过实时检测工作负载查询之间频繁的子元路径,使用量身定制的数据结构、重叠树和相关的缓存策略来选择缓存和重用的中间结果。我们对真实数据的实验研究表明,在所有经过检验的场景中,阿特波斯加速了 HIN 的探索性数据分析和挖掘,表现优于现成的缓存方法和最先进的研究原型。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Atrapos:+Real-time+Evaluation+of+Metapath+Query+Workloads)|0| +|[Catch: Collaborative Feature Set Search for Automated Feature Engineering](https://doi.org/10.1145/3543507.3583527)|Guoshan Lu, Haobo Wang, Saisai Yang, Jing Yuan, Guozheng Yang, Cheng Zang, Gang Chen, Junbo Zhao|Zheshang Bank Co., Ltd., China; Institute of Computing Innovation, Zhejiang University, China; Zhejiang University, China|Feature engineering often plays a crucial role in building mining systems for tabular data, which traditionally requires experienced human experts to perform. Thanks to the rapid advances in reinforcement learning, it has offered an automated alternative, i.e. automated feature engineering (AutoFE). In this work, through scrutiny of the prior AutoFE methods, we characterize several research challenges that remained in this regime, concerning system-wide efficiency, efficacy, and practicality toward production. We then propose Catch, a full-fledged new AutoFE framework that comprehensively addresses the aforementioned challenges. The core to Catch composes a hierarchical-policy reinforcement learning scheme that manifests a collaborative feature engineering exploration and exploitation grounded on the granularity of the whole feature set. At a higher level of the hierarchy, a decision-making module controls the post-processing of the attained feature engineering transformation. We extensively experiment with Catch on 26 academic standardized tabular datasets and 9 industrialized real-world datasets. Measured by numerous metrics and analyses, Catch establishes a new state-of-the-art, from perspectives performance, latency as well as its practicality towards production. Source code1 can be found at https://github.com/1171000709/Catch.|在构建表格数据挖掘系统时,特征工程往往起着至关重要的作用,表格数据挖掘传统上需要有经验的人类专家来完成。由于强化学习的快速发展,它提供了一种自动化的替代方案,即自动化特征工程(AutoFE)。在这项工作中,通过审查以前的自动有限元方法,我们描述了几个研究挑战,仍然在这个制度,关于系统的效率,效率和实用性的生产。然后,我们建议使用 Catch,这是一个成熟的新的 AutoFE 框架,可以全面解决上述挑战。Core to Catch 组成了一个层次化的策略强化学习方案,体现了基于整个特性集粒度的协同特性工程探索和开发。在层次结构的更高层次上,决策模块控制所获得的特征工程变换的后处理。我们对26个学术标准化表格数据集和9个工业化真实世界数据集进行了广泛的实验。通过大量的度量和分析,Catch 从性能、延迟以及对生产的实用性的角度建立了一个新的最先进的状态。源代码1可在 https://github.com/1171000709/catch 找到。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Catch:+Collaborative+Feature+Set+Search+for+Automated+Feature+Engineering)|0| +|[The Hitchhiker's Guide to Facebook Web Tracking with Invisible Pixels and Click IDs](https://doi.org/10.1145/3543507.3583311)|Paschalis Bekos, Panagiotis Papadopoulos, Evangelos P. 
Markatos, Nicolas Kourtellis|University of Crete/FORTH, Greece; Telefonica Research, Spain; FORTH, Greece|Over the past years, advertisement companies have used various tracking methods to persistently track users across the web. Such tracking methods usually include first and third-party cookies, cookie synchronization, as well as a variety of fingerprinting mechanisms. Facebook (FB) recently introduced a new tagging mechanism that attaches a one-time tag as a URL parameter (FBCLID) on outgoing links to other websites. Although such a tag does not seem to have enough information to persistently track users, we demonstrate that despite its ephemeral nature, when combined with FB Pixel, it can aid in persistently monitoring user browsing behavior across i) different websites, ii) different actions on each website, iii) time, i.e., both in the past as well as in the future. We refer to this online monitoring of users as FB web tracking. We find that FB Pixel tracks a wide range of user activities on websites with alarming detail, especially on websites classified as sensitive categories under GDPR. Also, we show how the FBCLID tag can be used to match, and thus de-anonymize, activities of online users performed in the distant past (even before those users had a FB account) tracked by FB Pixel. In fact, by combining this tag with cookies that have rolling expiration dates, FB can also keep track of users' browsing activities in the future as well. Our experimental results suggest that 23% of the 10k most popular websites have adopted this technology, and can contribute to this activity tracking on the web. Furthermore, our longitudinal study shows that this type of user activity tracking can go as far back as 2015. Simply said, if a user creates for the first time a FB account today, FB could, under some conditions, match their anonymously collected past web browsing activity to their newly created FB profile, from as far back as 2015 and continue tracking their activity in the future.|在过去的几年里,广告公司使用了各种各样的跟踪方法来持续跟踪网络上的用户。这种跟踪方法通常包括第一方和第三方 cookie、 cookie 同步以及各种指纹识别机制。Facebook (FB)最近推出了一种新的标签机制,它将一次性标签作为 URL 参数(FBCLID)附加到其他网站的外向链接上。虽然这样的标签似乎没有足够的信息来持续跟踪用户,我们证明,尽管它的短暂性质,当结合 FB 像素,它可以帮助持续监测用户浏览行为在 i)不同的网站,ii)不同的行动在每个网站,iii)时间,即在过去和未来。我们把这种对用户的在线监控称为 FB 网络跟踪。我们发现,FB 像素跟踪广泛的用户活动的网站具有惊人的细节,特别是在网站归类为敏感类别下的 GDPR。此外,我们展示了如何使用 FBCLID 标签来匹配,从而去匿名,在线用户的活动执行在遥远的过去(甚至在那些用户有一个 FB 帐户之前)由 FB 像素跟踪。事实上,通过将这个标签与具有滚动过期日期的 cookie 相结合,FB 还可以跟踪用户未来的浏览活动。我们的实验结果表明,23% 的10k 最受欢迎的网站已经采用了这项技术,并可以有助于在网上跟踪这项活动。此外,我们的追踪研究显示,这种类型的用户活动跟踪可以追溯到2015年。简单地说,如果一个用户今天第一次创建一个 FB 帐户,FB 可以,在某些条件下,匹配他们的匿名收集过去的网页浏览活动和他们新创建的 FB 配置文件,从2015年开始,并在未来继续跟踪他们的活动。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=The+Hitchhiker's+Guide+to+Facebook+Web+Tracking+with+Invisible+Pixels+and+Click+IDs)|0| +|[Atrapos: Real-time Evaluation of Metapath Query Workloads](https://doi.org/10.1145/3543507.3583322)|Serafeim Chatzopoulos, Thanasis Vergoulis, Dimitrios Skoutas, Theodore Dalamagas, Christos Tryfonopoulos, Panagiotis Karras|Aarhus University, Denmark; University of the Peloponnese, Greece and IMSI, Athena RC, Greece; IMSI, Athena RC, Greece; University of the Peloponnese, Greece|Heterogeneous information networks (HINs) represent different types of entities and relationships between them. Exploring and mining HINs relies on metapath queries that identify pairs of entities connected by relationships of diverse semantics. 
While the real-time evaluation of metapath query workloads on large, web-scale HINs is highly demanding in computational cost, current approaches do not exploit interrelationships among the queries. In this paper, we present Atrapos, a new approach for the real-time evaluation of metapath query workloads that leverages a combination of efficient sparse matrix multiplication and intermediate result caching. Atrapos selects intermediate results to cache and reuse by detecting frequent sub-metapaths among workload queries in real time, using a tailor-made data structure, the Overlap Tree, and an associated caching policy. Our experimental study on real data shows that Atrapos  accelerates exploratory data analysis and mining on HINs, outperforming off-the-shelf caching approaches and state-of-the-art research prototypes in all examined scenarios.|异构信息网络(HIN)表示不同类型的实体以及它们之间的关系。探索和挖掘 HIN 依赖于元路径查询,这些查询标识由不同语义关系连接的实体对。尽管对大型 Web 规模 HIN 上的元路径查询工作负载进行实时评估对计算成本要求很高,但目前的方法没有利用查询之间的相互关系。在这篇文章中,我们介绍了一种新的实时评估元路径查询工作负载的方法—— Arapos,它结合了高效的稀疏矩阵乘法和中间结果缓存。Apapos 通过实时检测工作负载查询之间频繁的子元路径,使用量身定制的数据结构、重叠树和相关的缓存策略来选择缓存和重用的中间结果。我们对真实数据的实验研究表明,在所有经过检验的场景中,阿特波斯加速了 HIN 的探索性数据分析和挖掘,表现优于现成的缓存方法和最先进的研究原型。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Atrapos:+Real-time+Evaluation+of+Metapath+Query+Workloads)|0| |[TRAVERS: A Diversity-Based Dynamic Approach to Iterative Relevance Search over Knowledge Graphs](https://doi.org/10.1145/3543507.3583429)|Ziyang Li, Yu Gu, Yulin Shen, Wei Hu, Gong Cheng|State Key Laboratory for Novel Software Technology, Nanjing University, China; Ohio State University, USA|Relevance search over knowledge graphs seeks top-ranked answer entities that are most relevant to a query entity. Since the semantics of relevance varies with the user need and its formalization is difficult for non-experts, existing methods infer semantics from user-provided example answer entities. However, a user may provide very few examples, even none at the beginning of interaction, thereby limiting the effectiveness of such methods. In this paper, we vision a more practical scenario called labeling-based iterative relevance search: instead of effortfully inputting example answer entities, the user effortlessly (e.g., implicitly) labels current answer entities, and is rewarded with improved answer entities in the next iteration. To realize the scenario, our approach TRAVERS incorporates two rankers: a diversity-oriented ranker for supporting cold start and avoiding converging to sub-optimum caused by noisy labels, and a relevance-oriented ranker capable of handling unbalanced labels. Moreover, the two rankers and their combination dynamically evolve over iterations. 
TRAVERS outperformed a variety of baselines in experiments with simulated and real user behavior.|基于知识图的相关性搜索寻找与查询实体最相关的排名最高的答案实体。由于相关性的语义随用户需求而变化,而且对于非专家来说,相关性的形式化很困难,现有的方法都是从用户提供的示例答案实体中推断语义。然而,用户可能只提供很少的例子,甚至在交互开始时没有例子,从而限制了这些方法的有效性。在本文中,我们设想了一个更实际的场景,叫做基于标签的迭代相关性搜索: 用户不必费力地输入示例答案实体,而是毫不费力地(例如,隐式地)标记当前答案实体,并在下一次迭代中得到改进的答案实体。为了实现该方案,我们的方法 TRAVERS 包含两个排序器: 一个面向多样性的排序器支持冷启动,避免收敛到次优由于噪声标签,和一个相关性导向的排序器能够处理不平衡的标签。此外,这两个排名及其组合在迭代中动态演化。在模拟和真实用户行为的实验中,TRAVERS 的表现优于各种基线。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=TRAVERS:+A+Diversity-Based+Dynamic+Approach+to+Iterative+Relevance+Search+over+Knowledge+Graphs)|0| |[Message Function Search for Knowledge Graph Embedding](https://doi.org/10.1145/3543507.3583546)|Shimin Di, Lei Chen|The Hong Kong University of Science and Technology, China; The Hong Kong University of Science and Technology (Guangzhou), China|Recently, many promising embedding models have been proposed to embed knowledge graphs (KGs) and their more general forms, such as n-ary relational data (NRD) and hyper-relational KG (HKG). To promote the data adaptability and performance of embedding models, KG searching methods propose to search for suitable models for a given KG data set. But they are restricted to a single KG form, and the searched models are restricted to a single type of embedding model. To tackle such issues, we propose to build a search space for the message function in graph neural networks (GNNs). However, it is a non-trivial task. Existing message function designs fix the structures and operators, which makes them difficult to handle different KG forms and data sets. Therefore, we first design a novel message function space, which enables both structures and operators to be searched for the given KG form (including KG, NRD, and HKG) and data. The proposed space can flexibly take different KG forms as inputs and is expressive to search for different types of embedding models. Especially, some existing message function designs and some classic KG embedding models can be instantiated as special cases of our space. We empirically show that the searched message functions are data-dependent, and can achieve leading performance on benchmark KGs, NRD, and HKGs.|近年来,人们提出了许多有前途的嵌入模型来嵌入知识图(KG)及其更一般的形式,如 n 元关系数据(NRD)和超关系 KG (HKG)。为了提高嵌入模型的数据适应性和性能,KG 搜索方法提出为给定的 KG 数据集寻找合适的模型。但是它们仅限于单一的 KG 形式,所搜索的模型仅限于单一类型的嵌入模型。为了解决这些问题,我们提出在图神经网络(GNN)中建立一个消息函数的搜索空间。然而,这并非一项简单的任务。现有的消息功能设计固定了结构和操作符,使得它们难以处理不同的 KG 表单和数据集。因此,我们首先设计一个新的消息函数空间,它允许搜索给定的 KG 表单(包括 KG、 NRD 和 HKG)和数据的结构和操作符。该空间可以灵活地采用不同的 KG 形式作为输入,具有表达性,可以搜索不同类型的嵌入模型。特别是,现有的一些消息函数设计和一些经典的 KG 嵌入模型可以作为我们空间的特例进行实例化。实验结果表明,搜索消息函数具有数据依赖性,可以在基准 KG、NRD 和 HKG 数据集上取得领先的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Message+Function+Search+for+Knowledge+Graph+Embedding)|0| |[FINGER: Fast Inference for Graph-based Approximate Nearest Neighbor Search](https://doi.org/10.1145/3543507.3583318)|Patrick H. Chen, WeiCheng Chang, JyunYu Jiang, HsiangFu Yu, Inderjit S. Dhillon, ChoJui Hsieh||Approximate K-Nearest Neighbor Search (AKNNS) has now become ubiquitous in modern applications, for example, as a fast search procedure with two-tower deep learning models. Graph-based methods for AKNNS in particular have received great attention due to their superior performance. These methods rely on greedy graph search to traverse the data points as embedding vectors in a database. 
Under this greedy search scheme, we make a key observation: many distance computations do not influence search updates so these computations can be approximated without hurting performance. As a result, we propose FINGER, a fast inference method to achieve efficient graph search. FINGER approximates the distance function by estimating angles between neighboring residual vectors with low-rank bases and distribution matching. The approximated distance can be used to bypass unnecessary computations, which leads to faster searches. Empirically, accelerating a popular graph-based method named HNSW by FINGER is shown to outperform existing graph-based methods by 20%-60% across different benchmark datasets.|近似 K 最近邻搜索(AKNNS)已经成为现代应用中普遍存在的问题,例如,作为一种具有两个塔式深度学习模型的快速搜索过程。基于图的 AKNNS 方法由于其优越的性能而受到了广泛的关注。这些方法依赖于贪婪图搜索,以嵌入向量的形式遍历数据库中的数据点。在这种贪婪的搜索方案下,我们做了一个关键的观察: 许多距离计算不影响搜索更新,所以这些计算可以近似而不损害性能。因此,我们提出了 FINGER,一种快速的推理方法来实现有效的图搜索。FINGER 通过估计低秩基相邻残差向量之间的夹角和分布匹配来逼近距离函数。近似距离可以用来绕过不必要的计算,从而导致更快的搜索。经验表明,通过 FINGER 加速一种流行的基于图的方法 HNSW,在不同的基准数据集上比现有的基于图的方法的性能提高了20% -60% 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=FINGER:+Fast+Inference+for+Graph-based+Approximate+Nearest+Neighbor+Search)|0| |[Match4Match: Enhancing Text-Video Retrieval by Maximum Flow with Minimum Cost](https://doi.org/10.1145/3543507.3583365)|Zhongjie Duan, Chengyu Wang, Cen Chen, Wenmeng Zhou, Jun Huang, Weining Qian|East China Normal University, China; Alibaba Group, China|With the explosive growth of video and text data on the web, text-video retrieval has become a vital task for online video platforms. Recently, text-video retrieval methods based on pre-trained models have attracted a lot of attention. However, existing methods cannot effectively capture the fine-grained information in videos, and typically suffer from the hubness problem where a collection of similar videos are retrieved by a large number of different queries. In this paper, we propose Match4Match, a new text-video retrieval method based on CLIP (Contrastive Language-Image Pretraining) and graph optimization theories. To balance calculation efficiency and model accuracy, Match4Match seamlessly supports three inference modes for different application scenarios. In fast vector retrieval mode, we embed texts and videos in the same space and employ a vector retrieval engine to obtain the top K videos. In fine-grained alignment mode, our method fully utilizes the pre-trained knowledge of the CLIP model to align words with corresponding video frames, and uses the fine-grained information to compute text-video similarity more accurately. In flow-style matching mode, to alleviate the detrimental impact of the hubness problem, we model the retrieval problem as a combinatorial optimization problem and solve it using maximum flow with minimum cost algorithm. To demonstrate the effectiveness of our method, we conduct experiments on five public text-video datasets. The overall performance of our proposed method outperforms state-of-the-art methods. Additionally, we evaluate the computational efficiency of Match4Match. 
Benefiting from the three flexible inference modes, Match4Match can respond to a large number of query requests with low latency or achieve high recall with acceptable time consumption.|随着网络视频和文本数据的爆炸式增长,文本视频检索已经成为在线视频平台的一项重要任务。近年来,基于预训练模型的文本视频检索方法引起了人们的广泛关注。然而,现有的方法不能有效地捕获视频中的细粒度信息,通常会遇到集线器问题,即大量不同的查询检索相似的视频集合。本文提出了一种基于对比语言-图像预训练(CLIP)和图形优化理论的文本-视频检索方法 Match4Match。为了平衡计算效率和模型精度,Match4Match 无缝支持针对不同应用场景的三种推理模式。在快速矢量检索模式下,我们将文本和视频嵌入到同一空间中,并使用矢量检索引擎获取最高 K 视频。在细粒度对齐模式下,该方法充分利用 CLIP 模型的预训练知识对相应的视频帧进行单词对齐,并利用细粒度信息更准确地计算文本-视频的相似度。在流式匹配模式下,为了减轻中继问题的不利影响,我们将检索问题建模为一个组合优化问题,并使用最大流和最小成本算法解决该问题。为了验证该方法的有效性,我们在五个公共文本视频数据集上进行了实验。我们提出的方法的总体性能优于最先进的方法。此外,我们还评估了 Match4Match 的计算效率。Match4Match 得益于这三种灵活的推理模式,可以以较低的延迟响应大量查询请求,或者以可接受的时间消耗实现高召回率。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Match4Match:+Enhancing+Text-Video+Retrieval+by+Maximum+Flow+with+Minimum+Cost)|0| -|[Zero-shot Clarifying Question Generation for Conversational Search](https://doi.org/10.1145/3543507.3583420)|Zhenduo Wang, Yuancheng Tu, Corby Rosset, Nick Craswell, Ming Wu, Qingyao Ai|Microsoft Corp, USA; GitHub Inc, USA; Tsinghua University, China; University of Utah, USA|A long-standing challenge for search and conversational assistants is query intention detection in ambiguous queries. Asking clarifying questions in conversational search has been widely studied and considered an effective solution to resolve query ambiguity. Existing work has explored various approaches for clarifying question ranking and generation. However, due to the lack of real conversational search data, they have to use artificial datasets for training, which limits their generalizability to real-world search scenarios. As a result, the industry has shown reluctance to implement them in reality, further suspending the availability of real conversational search interaction data. The above dilemma can be formulated as a cold start problem of clarifying question generation and conversational search in general. Furthermore, even if we do have large-scale conversational logs, it is not realistic to gather training data that can comprehensively cover all possible queries and topics in open-domain search scenarios. The risk of fitting bias when training a clarifying question retrieval/generation model on an incomprehensive dataset is thus another important challenge. In this work, we innovatively explore generating clarifying questions in a zero-shot setting to overcome the cold start problem and we propose a constrained clarifying question generation system which uses both question templates and query facets to guide the effective and precise question generation. The experiment results show that our method outperforms existing state-of-the-art zero-shot baselines by a large margin. 
Human annotations to our model outputs also indicate our method generates 25.2% more natural questions, 18.1% more useful questions, 6.1% fewer unnatural questions and 4% fewer useless questions.|模糊查询中的查询意图检测一直是搜索和会话助手面临的挑战。在会话搜索中提出澄清问题被广泛研究,被认为是解决查询歧义的有效方法。现有的工作已经探索了各种方法来澄清问题的排序和生成。然而,由于缺乏真实的会话搜索数据,他们不得不使用人工数据集进行训练,这限制了他们对真实世界搜索场景的普遍性。结果,业界表现出不愿意在现实中实现它们,进一步暂停了真正的会话搜索交互数据的可用性。上述困境可以概括为澄清问题生成和一般会话搜索的冷启动问题。此外,即使我们有大规模的会话日志,收集能够全面涵盖开放域搜索场景中所有可能的查询和主题的培训数据也是不现实的。因此,在不全面的数据集上训练澄清问题检索/生成模型时,拟合偏差的风险是另一个重要的挑战。在这项工作中,我们创新性地探索了在零拍环境下生成澄清问题来克服冷启动问题,并提出了一个有约束的澄清问题生成系统,该系统使用问题模板和查询面来指导有效和准确的问题生成。实验结果表明,我们的方法比现有的最先进的零拍摄基线有很大的优势。我们的模型输出的人工注释也表明我们的方法产生了25.2% 更自然的问题,18.1% 更有用的问题,6.1% 更少的非自然的和4% 更少的无用的问题。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Zero-shot+Clarifying+Question+Generation+for+Conversational+Search)|0| +|[Zero-shot Clarifying Question Generation for Conversational Search](https://doi.org/10.1145/3543507.3583420)|Zhenduo Wang, Yuancheng Tu, Corby Rosset, Nick Craswell, Ming Wu, Qingyao Ai|Tsinghua University, China; GitHub Inc, USA; Microsoft Corp, USA; University of Utah, USA|A long-standing challenge for search and conversational assistants is query intention detection in ambiguous queries. Asking clarifying questions in conversational search has been widely studied and considered an effective solution to resolve query ambiguity. Existing work has explored various approaches for clarifying question ranking and generation. However, due to the lack of real conversational search data, they have to use artificial datasets for training, which limits their generalizability to real-world search scenarios. As a result, the industry has shown reluctance to implement them in reality, further suspending the availability of real conversational search interaction data. The above dilemma can be formulated as a cold start problem of clarifying question generation and conversational search in general. Furthermore, even if we do have large-scale conversational logs, it is not realistic to gather training data that can comprehensively cover all possible queries and topics in open-domain search scenarios. The risk of fitting bias when training a clarifying question retrieval/generation model on an incomprehensive dataset is thus another important challenge. In this work, we innovatively explore generating clarifying questions in a zero-shot setting to overcome the cold start problem and we propose a constrained clarifying question generation system which uses both question templates and query facets to guide the effective and precise question generation. The experiment results show that our method outperforms existing state-of-the-art zero-shot baselines by a large margin. 
Human annotations to our model outputs also indicate our method generates 25.2% more natural questions, 18.1% more useful questions, 6.1% fewer unnatural questions and 4% fewer useless questions.|模糊查询中的查询意图检测一直是搜索和会话助手面临的挑战。在会话搜索中提出澄清问题被广泛研究,被认为是解决查询歧义的有效方法。现有的工作已经探索了各种方法来澄清问题的排序和生成。然而,由于缺乏真实的会话搜索数据,他们不得不使用人工数据集进行训练,这限制了他们对真实世界搜索场景的普遍性。结果,业界表现出不愿意在现实中实现它们,进一步暂停了真正的会话搜索交互数据的可用性。上述困境可以概括为澄清问题生成和一般会话搜索的冷启动问题。此外,即使我们有大规模的会话日志,收集能够全面涵盖开放域搜索场景中所有可能的查询和主题的培训数据也是不现实的。因此,在不全面的数据集上训练澄清问题检索/生成模型时,拟合偏差的风险是另一个重要的挑战。在这项工作中,我们创新性地探索了在零拍环境下生成澄清问题来克服冷启动问题,并提出了一个有约束的澄清问题生成系统,该系统使用问题模板和查询面来指导有效和准确的问题生成。实验结果表明,我们的方法比现有的最先进的零拍摄基线有很大的优势。我们的模型输出的人工注释也表明我们的方法产生了25.2% 更自然的问题,18.1% 更有用的问题,6.1% 更少的非自然的和4% 更少的无用的问题。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Zero-shot+Clarifying+Question+Generation+for+Conversational+Search)|0| |[Everything Evolves in Personalized PageRank](https://doi.org/10.1145/3543507.3583474)|Zihao Li, Dongqi Fu, Jingrui He|University of Illinois at Urbana-Champaign, USA|Personalized PageRank, as a graphical model, has been proven as an effective solution in many applications such as web page search, recommendation, etc. However, in the real world, the setting of personalized PageRank is usually dynamic like the evolving World Wide Web. On the one hand, the outdated PageRank solution can be sub-optimal for ignoring the evolution pattern. On the other hand, solving the solution from scratch at each timestamp causes costly computation complexity. Hence, in this paper, we aim to solve the Personalized PageRank effectively and efficiently in a fully dynamic setting, i.e., every component in the Personalized PageRank formula is dependent on time. To this end, we propose the EvePPR method that can track the exact personalized PageRank solution at each timestamp in the fully dynamic setting, and we theoretically and empirically prove the accuracy and time complexity of EvePPR. Moreover, we apply EvePPR to solve the dynamic knowledge graph alignment task, where a fully dynamic setting is necessary but complex. The experiments show that EvePPR outperforms the state-of-the-art baselines for similar nodes retrieval across graphs.|个性化 PageRank 作为一种图形化模型,已被证明是网页搜索、推荐等应用中的一种有效解决方案。然而,在现实世界中,个性化 PageRank 的设置通常是动态的,就像不断发展的万维网一样。一方面,过时的 PageRank 解决方案可能是次优的,因为它忽略了进化模式。另一方面,在每个时间戳从零开始求解解决方案会导致昂贵的计算复杂度。因此,本文的目标是在一个完全动态的环境下有效地解决个性化 PageRank 问题,也就是说,个性化 PageRank 公式中的每个组成部分都依赖于时间。为此,我们提出了 EvePPR 方法,该方法可以在完全动态的环境下精确跟踪每个时间戳的个性化 PageRank 解,并从理论和实验上证明了 EvePPR 方法的准确性和时间复杂度。此外,我们应用 EvePPR 来解决动态知识图对齐任务,其中一个完全动态的设置是必要的,但是复杂的。实验表明,EvePPR 在跨图检索相似节点时优于最新的基线。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Everything+Evolves+in+Personalized+PageRank)|0| -|[Incorporating Explicit Subtopics in Personalized Search](https://doi.org/10.1145/3543507.3583488)|Shuting Wang, Zhicheng Dou, Jing Yao, Yujia Zhou, JiRong Wen|Renmin University of China, China and Engineering Research Center of Next-Generation Intelligent Search and Recommendation, Ministry of Education, China; Social Computing Group, Microsoft Research Asia, China; Renmin University of China, China|The key to personalized search is modeling user intents to tailor returned results for different users. Existing personalized methods mainly focus on learning implicit user interest vectors. In this paper, we propose ExpliPS, a personalized search model that explicitly incorporates query subtopics into personalization. 
It models the user’s current intent by estimating the user’s preference over the subtopics of the current query and personalizes the results over the weighted subtopics. We think that in such a way, personalized search could be more explainable and stable. Specifically, we first employ a semantic encoder to learn the representations of the user’s historical behaviours. Then with the historical behaviour representations, a subtopic preference encoder is devised to predict the user’s subtopic preferences on the current query. Finally, we rerank the candidates via a subtopic-aware ranker that prioritizes the documents relevant to the user-preferred subtopics. Experimental results show our model ExpliPS outperforms the state-of-the-art personalized web search models with explainable and stable results.|个性化检索的关键是建立用户意图模型,为不同的用户定制返回的结果。现有的个性化方法主要侧重于学习隐式用户兴趣向量。在这篇文章中,我们提出了 expliPS,一个明确地将查询子主题合并到个性化中的个性化检索模型。它通过估计用户对当前查询的子主题的偏好来建模用户的当前意图,并对加权子主题的结果进行个性化处理。我们认为,通过这种方式,个性化检索可以更容易解释,也更稳定。具体来说,我们首先使用一个语义编码器来学习用户历史行为的表示。然后结合历史行为表示,设计了一种子主题偏好编码器来预测用户对当前查询的子主题偏好。最后,我们通过一个子主题感知排名器对候选人进行重新排名,该排名器对与用户首选子主题相关的文档进行优先排序。实验结果表明,该模型的性能优于目前最先进的个性化网络搜索模型,结果具有可解释性和稳定性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Incorporating+Explicit+Subtopics+in+Personalized+Search)|0| -|[Optimizing Feature Set for Click-Through Rate Prediction](https://doi.org/10.1145/3543507.3583545)|Fuyuan Lyu, Xing Tang, Dugang Liu, Liang Chen, Xiuqiang He, Xue Liu|FiT, Tencent, China; McGill University, Canada; Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), China|Click-through prediction (CTR) models transform features into latent vectors and enumerate possible feature interactions to improve performance based on the input feature set. Therefore, when selecting an optimal feature set, we should consider the influence of both feature and its interaction. However, most previous works focus on either feature field selection or only select feature interaction based on the fixed feature set to produce the feature set. The former restricts search space to the feature field, which is too coarse to determine subtle features. They also do not filter useless feature interactions, leading to higher computation costs and degraded model performance. The latter identifies useful feature interaction from all available features, resulting in many redundant features in the feature set. In this paper, we propose a novel method named OptFS to address these problems. To unify the selection of feature and its interaction, we decompose the selection of each feature interaction into the selection of two correlated features. Such a decomposition makes the model end-to-end trainable given various feature interaction operations. By adopting feature-level search space, we set a learnable gate to determine whether each feature should be within the feature set. Because of the large-scale search space, we develop a learning-by-continuation training scheme to learn such gates. Hence, OptFS generates the feature set only containing features which improve the final prediction results. 
Experimentally, we evaluate OptFS on three public datasets, demonstrating OptFS can optimize feature sets which enhance the model performance and further reduce both the storage and computational cost.|点击预测(CTR)模型将特征转换为潜在向量,并列举可能的特征交互,以提高基于输入特征集的性能。因此,在选择最优特征集时,应同时考虑特征及其相互作用的影响。然而,以往的工作主要集中在特征字段的选择或者仅仅基于固定特征集选择特征交互来产生特征集。前者将搜索空间限制在特征域内,特征域太粗,无法确定细微的特征。它们也不过滤无用的特征交互,导致更高的计算成本和降低模型性能。后者从所有可用的特征中识别出有用的特征交互,从而导致特征集中的许多冗余特征。在本文中,我们提出了一种新的方法称为 OptFS 来解决这些问题。为了统一特征选择和特征交互,将每个特征交互的选择分解为两个相关特征的选择。这样的分解使得模型在给定各种特征交互操作的情况下可以进行端到端的训练。通过采用特征级搜索空间,我们设置了一个可学习的门来确定每个特征是否应该在特征集中。由于搜索空间较大,我们提出了一种基于连续学习的训练方案来学习这类门。因此,OptFS 生成的特征集仅包含改善最终预测结果的特征。在实验上,我们对三个公共数据集上的 OptFS 进行了评估,结果表明 OptFS 可以优化特征集,从而提高模型的性能,进一步降低存储和计算成本。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Optimizing+Feature+Set+for+Click-Through+Rate+Prediction)|0| -|[Filtered-DiskANN: Graph Algorithms for Approximate Nearest Neighbor Search with Filters](https://doi.org/10.1145/3543507.3583552)|Siddharth Gollapudi, Neel Karia, Varun Sivashankar, Ravishankar Krishnaswamy, Nikit Begwani, Swapnil Raz, Yiyong Lin, Yin Zhang, Neelam Mahapatro, Premkumar Srinivasan, Amit Singh, Harsha Vardhan Simhadri|Microsoft Research, USA; Microsoft, India; Columbia University, USA; Microsoft, USA; Microsoft Research, India|As Approximate Nearest Neighbor Search (ANNS)-based dense retrieval becomes ubiquitous for search and recommendation scenarios, efficiently answering filtered ANNS queries has become a critical requirement. Filtered ANNS queries ask for the nearest neighbors of a query’s embedding from the points in the index that match the query’s labels such as date, price range, language. There has been little prior work on algorithms that use label metadata associated with vector data to build efficient indices for filtered ANNS queries. Consequently, current indices have high search latency or low recall which is not practical in interactive web-scenarios. We present two algorithms with native support for faster and more accurate filtered ANNS queries: one with streaming support, and another based on batch construction. Central to our algorithms is the construction of a graph-structured index which forms connections not only based on the geometry of the vector data, but also the associated label set. On real-world data with natural labels, both algorithms are an order of magnitude or more efficient for filtered queries than the current state of the art algorithms. 
The generated indices can also be queried from an SSD and support thousands of queries per second at over 90% recall@10.|随着基于近似最近邻搜索(ANNS)的密集检索在搜索和推荐场景中的普及,有效地回答经过滤的 ANNS 查询已成为一个关键要求。经过过滤的 ANNS 查询要求查询嵌入的最近邻居从索引中匹配查询标签的点,如日期,价格范围,语言。使用与矢量数据相关联的标签元数据为经过过滤的 ANNS 查询构建高效索引的算法之前几乎没有研究。因此,当前的索引具有较高的搜索延迟或较低的召回率,这在交互式网络场景中是不实用的。我们提出了两个算法与本地支持更快,更准确的过滤 ANNS 查询: 一个流支持,另一个基于批量构造。我们算法的核心是构造一个图结构索引,它不仅根据矢量数据的几何形状,而且根据相关的标签集形成连接。对于带有自然标签的真实世界数据,这两种算法对于过滤查询来说都是一种数量级,或者比目前最先进的算法效率更高。生成的索引也可以从 SSD 查询,并支持每秒数千次查询,recall@10 超过90%。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Filtered-DiskANN:+Graph+Algorithms+for+Approximate+Nearest+Neighbor+Search+with+Filters)|0| +|[Incorporating Explicit Subtopics in Personalized Search](https://doi.org/10.1145/3543507.3583488)|Shuting Wang, Zhicheng Dou, Jing Yao, Yujia Zhou, JiRong Wen|Social Computing Group, Microsoft Research Asia, China; Renmin University of China, China and Engineering Research Center of Next-Generation Intelligent Search and Recommendation, Ministry of Education, China; Renmin University of China, China|The key to personalized search is modeling user intents to tailor returned results for different users. Existing personalized methods mainly focus on learning implicit user interest vectors. In this paper, we propose ExpliPS, a personalized search model that explicitly incorporates query subtopics into personalization. It models the user’s current intent by estimating the user’s preference over the subtopics of the current query and personalizes the results over the weighted subtopics. We think that in such a way, personalized search could be more explainable and stable. Specifically, we first employ a semantic encoder to learn the representations of the user’s historical behaviours. Then with the historical behaviour representations, a subtopic preference encoder is devised to predict the user’s subtopic preferences on the current query. Finally, we rerank the candidates via a subtopic-aware ranker that prioritizes the documents relevant to the user-preferred subtopics. Experimental results show our model ExpliPS outperforms the state-of-the-art personalized web search models with explainable and stable results.|个性化检索的关键是建立用户意图模型,为不同的用户定制返回的结果。现有的个性化方法主要侧重于学习隐式用户兴趣向量。在这篇文章中,我们提出了 expliPS,一个明确地将查询子主题合并到个性化中的个性化检索模型。它通过估计用户对当前查询的子主题的偏好来建模用户的当前意图,并对加权子主题的结果进行个性化处理。我们认为,通过这种方式,个性化检索可以更容易解释,也更稳定。具体来说,我们首先使用一个语义编码器来学习用户历史行为的表示。然后结合历史行为表示,设计了一种子主题偏好编码器来预测用户对当前查询的子主题偏好。最后,我们通过一个子主题感知排名器对候选人进行重新排名,该排名器对与用户首选子主题相关的文档进行优先排序。实验结果表明,该模型的性能优于目前最先进的个性化网络搜索模型,结果具有可解释性和稳定性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Incorporating+Explicit+Subtopics+in+Personalized+Search)|0| +|[Optimizing Feature Set for Click-Through Rate Prediction](https://doi.org/10.1145/3543507.3583545)|Fuyuan Lyu, Xing Tang, Dugang Liu, Liang Chen, Xiuqiang He, Xue Liu|Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), China; McGill University, Canada; FiT, Tencent, China|Click-through prediction (CTR) models transform features into latent vectors and enumerate possible feature interactions to improve performance based on the input feature set. Therefore, when selecting an optimal feature set, we should consider the influence of both feature and its interaction. However, most previous works focus on either feature field selection or only select feature interaction based on the fixed feature set to produce the feature set. 
The former restricts search space to the feature field, which is too coarse to determine subtle features. They also do not filter useless feature interactions, leading to higher computation costs and degraded model performance. The latter identifies useful feature interaction from all available features, resulting in many redundant features in the feature set. In this paper, we propose a novel method named OptFS to address these problems. To unify the selection of feature and its interaction, we decompose the selection of each feature interaction into the selection of two correlated features. Such a decomposition makes the model end-to-end trainable given various feature interaction operations. By adopting feature-level search space, we set a learnable gate to determine whether each feature should be within the feature set. Because of the large-scale search space, we develop a learning-by-continuation training scheme to learn such gates. Hence, OptFS generates the feature set only containing features which improve the final prediction results. Experimentally, we evaluate OptFS on three public datasets, demonstrating OptFS can optimize feature sets which enhance the model performance and further reduce both the storage and computational cost.|点击预测(CTR)模型将特征转换为潜在向量,并列举可能的特征交互,以提高基于输入特征集的性能。因此,在选择最优特征集时,应同时考虑特征及其相互作用的影响。然而,以往的工作主要集中在特征字段的选择或者仅仅基于固定特征集选择特征交互来产生特征集。前者将搜索空间限制在特征域内,特征域太粗,无法确定细微的特征。它们也不过滤无用的特征交互,导致更高的计算成本和降低模型性能。后者从所有可用的特征中识别出有用的特征交互,从而导致特征集中的许多冗余特征。在本文中,我们提出了一种新的方法称为 OptFS 来解决这些问题。为了统一特征选择和特征交互,将每个特征交互的选择分解为两个相关特征的选择。这样的分解使得模型在给定各种特征交互操作的情况下可以进行端到端的训练。通过采用特征级搜索空间,我们设置了一个可学习的门来确定每个特征是否应该在特征集中。由于搜索空间较大,我们提出了一种基于连续学习的训练方案来学习这类门。因此,OptFS 生成的特征集仅包含改善最终预测结果的特征。在实验上,我们对三个公共数据集上的 OptFS 进行了评估,结果表明 OptFS 可以优化特征集,从而提高模型的性能,进一步降低存储和计算成本。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Optimizing+Feature+Set+for+Click-Through+Rate+Prediction)|0| +|[Filtered-DiskANN: Graph Algorithms for Approximate Nearest Neighbor Search with Filters](https://doi.org/10.1145/3543507.3583552)|Siddharth Gollapudi, Neel Karia, Varun Sivashankar, Ravishankar Krishnaswamy, Nikit Begwani, Swapnil Raz, Yiyong Lin, Yin Zhang, Neelam Mahapatro, Premkumar Srinivasan, Amit Singh, Harsha Vardhan Simhadri|Microsoft, India; Microsoft Research, USA; Columbia University, USA; Microsoft Research, India; Microsoft, USA|As Approximate Nearest Neighbor Search (ANNS)-based dense retrieval becomes ubiquitous for search and recommendation scenarios, efficiently answering filtered ANNS queries has become a critical requirement. Filtered ANNS queries ask for the nearest neighbors of a query’s embedding from the points in the index that match the query’s labels such as date, price range, language. There has been little prior work on algorithms that use label metadata associated with vector data to build efficient indices for filtered ANNS queries. Consequently, current indices have high search latency or low recall which is not practical in interactive web-scenarios. We present two algorithms with native support for faster and more accurate filtered ANNS queries: one with streaming support, and another based on batch construction. Central to our algorithms is the construction of a graph-structured index which forms connections not only based on the geometry of the vector data, but also the associated label set. On real-world data with natural labels, both algorithms are an order of magnitude or more efficient for filtered queries than the current state of the art algorithms. 
The generated indices can also be queried from an SSD and support thousands of queries per second at over 90% recall@10.|随着基于近似最近邻搜索(ANNS)的密集检索在搜索和推荐场景中的普及,有效地回答经过滤的 ANNS 查询已成为一个关键要求。经过过滤的 ANNS 查询要求查询嵌入的最近邻居从索引中匹配查询标签的点,如日期,价格范围,语言。使用与矢量数据相关联的标签元数据为经过过滤的 ANNS 查询构建高效索引的算法之前几乎没有研究。因此,当前的索引具有较高的搜索延迟或较低的召回率,这在交互式网络场景中是不实用的。我们提出了两个算法与本地支持更快,更准确的过滤 ANNS 查询: 一个流支持,另一个基于批量构造。我们算法的核心是构造一个图结构索引,它不仅根据矢量数据的几何形状,而且根据相关的标签集形成连接。对于带有自然标签的真实世界数据,这两种算法对于过滤查询来说都是一种数量级,或者比目前最先进的算法效率更高。生成的索引也可以从 SSD 查询,并支持每秒数千次查询,recall@10 超过90%。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Filtered-DiskANN:+Graph+Algorithms+for+Approximate+Nearest+Neighbor+Search+with+Filters)|0| |[P-MMF: Provider Max-min Fairness Re-ranking in Recommender System](https://doi.org/10.1145/3543507.3583296)|Chen Xu, Sirui Chen, Jun Xu, Weiran Shen, Xiao Zhang, Gang Wang, Zhenhua Dong||In this paper, we address the issue of recommending fairly from the aspect of providers, which has become increasingly essential in multistakeholder recommender systems. Existing studies on provider fairness usually focused on designing proportion fairness (PF) metrics that first consider systematic fairness. However, sociological research shows that to make the market more stable, max-min fairness (MMF) is a better metric. The main reason is that MMF aims to improve the utility of the worst ones preferentially, guiding the system to support the providers in weak market positions. When applying MMF to recommender systems, how to balance user preferences and provider fairness in an online recommendation scenario is still a challenging problem. In this paper, we proposed an online re-ranking model named Provider Max-min Fairness Re-ranking (P-MMF) to tackle the problem. Specifically, P-MMF formulates provider fair recommendation as a resource allocation problem, where the exposure slots are considered the resources to be allocated to providers and the max-min fairness is used as the regularizer during the process. We show that the problem can be further represented as a regularized online optimizing problem and solved efficiently in its dual space. During the online re-ranking phase, a momentum gradient descent method is designed to conduct the dynamic re-ranking. Theoretical analysis showed that the regret of P-MMF can be bounded. Experimental results on four public recommender datasets demonstrated that P-MMF can outperform the state-of-the-art baselines. 
Experimental results also show that P-MMF incurs only small computational costs on corpora with a large number of items.|在本文中,我们从提供者的角度讨论了公平推荐的问题,这在多利益相关者推荐系统中已经变得越来越重要。现有的关于提供者公平性的研究通常集中于设计首先考虑系统性公平的比例公平(PF)指标。然而,社会学研究表明,为了使市场更加稳定,极大极小公平(MMF)是一个更好的衡量标准。主要原因在于,MMF 旨在优先提高境况最差者的效用,引导系统扶持处于弱势市场地位的提供者。在将 MMF 应用于推荐系统时,如何在在线推荐场景中平衡用户偏好和提供者公平性仍然是一个具有挑战性的问题。在这篇文章中,我们提出了一个在线重新排序模型——提供者极大极小公平重新排序(P-MMF)来解决这个问题。具体来说,P-MMF 将提供商公平推荐制定为一个资源分配问题,其中曝光位被视为将分配给提供商的资源,而极大极小公平则在该过程中被用作正则项。证明了该问题可以进一步表示为正则化在线优化问题,并在其对偶空间中有效地求解。在在线重新排名阶段,设计了一种动量梯度下降方法来进行动态重新排名。理论分析表明,P-MMF 的遗憾(regret)是有界的。对四个公共推荐数据集的实验结果表明,P-MMF 能够优于最先进的基线。实验结果还表明,P-MMF 能够在项目数量较多的语料库上保持较小的计算开销。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=P-MMF:+Provider+Max-min+Fairness+Re-ranking+in+Recommender+System)|0| -|[Dual Intent Enhanced Graph Neural Network for Session-based New Item Recommendation](https://doi.org/10.1145/3543507.3583526)|Di Jin, Luzhi Wang, Yizhen Zheng, Guojie Song, Fei Jiang, Xiang Li, Wei Lin, Shirui Pan|School of Intelligence Science and Technology, Peking University, China; College of Intelligence and Computing, Tianjin University, China; Meituan, China; Department of Data Science and AI, Faculty of IT, Monash University, Australia; Professional, China; School of Information and Communication Technology, Griffith University, Australia|Recommender systems are essential to various fields, e.g., e-commerce, e-learning, and streaming media. At present, graph neural networks (GNNs) for session-based recommendations normally can only recommend items existing in users' historical sessions. As a result, these GNNs have difficulty recommending items that users have never interacted with (new items), which leads to a phenomenon of information cocoon. Therefore, it is necessary to recommend new items to users. As there is no interaction between new items and users, we cannot include new items when building session graphs for GNN session-based recommender systems. Thus, it is challenging to recommend new items for users when using GNN-based methods. We regard this challenge as '\textbf{G}NN \textbf{S}ession-based \textbf{N}ew \textbf{I}tem \textbf{R}ecommendation (GSNIR)'. To solve this problem, we propose a dual-intent enhanced graph neural network for it. Due to the fact that new items are not tied to historical sessions, the users' intent is difficult to predict. We design a dual-intent network to learn user intent from an attention mechanism and the distribution of historical data respectively, which can simulate users' decision-making process in interacting with a new item. To solve the challenge that new items cannot be learned by GNNs, inspired by zero-shot learning (ZSL), we infer the new item representation in GNN space by using their attributes. By outputting new item probabilities, which contain recommendation scores of the corresponding items, the new items with higher scores are recommended to users. Experiments on two representative real-world datasets show the superiority of our proposed method. The case study from the real-world verifies interpretability benefits brought by the dual-intent module and the new item reasoning module. 
The code is available at Github: https://github.com/Ee1s/NirGNN|推荐系统对于电子商务、电子学习和流媒体等各个领域都是必不可少的。目前,基于会话推荐的图神经网络(GNN)通常只能推荐用户历史会话中存在的项目。因此,这些 GNN 很难推荐用户从未接触过的项目(新项目),从而导致信息茧现象。因此,有必要向用户推荐新项目。由于新项目和用户之间没有交互,所以在为基于 GNN 会话的推荐系统构建会话图时,我们不能包含新项目。因此,在使用基于 GNN 的方法时,向用户推荐新项目是一个挑战。我们将此挑战称为“基于 GNN 会话的新项目推荐(GSNIR)”。为了解决这一问题,我们提出了一种双意图增强的图神经网络。由于新条目不与历史会话相关联,因此很难预测用户的意图。设计了一个双意图网络,分别从注意机制和历史数据分布中学习用户意图,模拟用户在与新项目交互时的决策过程。为了解决 GNN 无法学习新项目的问题,受零样本学习(ZSL)的启发,我们利用新项目的属性来推断其在 GNN 空间中的表示。通过输出包含相应项目推荐分数的新项目概率,将得分较高的新项目推荐给用户。在两个具有代表性的实际数据集上的实验表明了该方法的优越性。通过实际案例分析,验证了双意图模块和新的项目推理模块所带来的可解释性优势。代码可以在 Github: https://github.com/Ee1s/NirGNN 上找到|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Dual+Intent+Enhanced+Graph+Neural+Network+for+Session-based+New+Item+Recommendation)|0| -|[Cross-domain recommendation via user interest alignment](https://doi.org/10.1145/3543507.3583263)|Chuang Zhao, Hongke Zhao, Ming HE, Jian Zhang, Jianping Fan|AI Lab at Lenovo Research, China; College of Management and Economics, Tianjin University, China; College of Management and Economics, Tianjin University, China and AI Lab at Lenovo Research, China; School of Cyberspace Security, Hangzhou Dianzi University, China|Cross-domain recommendation aims to leverage knowledge from multiple domains to alleviate the data sparsity and cold-start problems in traditional recommender systems. One popular paradigm is to employ overlapping user representations to establish domain connections, thereby improving recommendation performance in all scenarios. Nevertheless, the general practice of this approach is to train user embeddings in each domain separately and then aggregate them in a plain manner, often ignoring potential cross-domain similarities between users and items. Furthermore, considering that their training objective is recommendation task-oriented without specific regularizations, the optimized embeddings disregard the interest alignment among user's views, and even violate the user's original interest distribution. To address these challenges, we propose a novel cross-domain recommendation framework, namely COAST, to improve recommendation performance on dual domains by perceiving the cross-domain similarity between entities and aligning user interests. Specifically, we first construct a unified cross-domain heterogeneous graph and redefine the message passing mechanism of graph convolutional networks to capture high-order similarity of users and items across domains. Targeted at user interest alignment, we develop deep insights from two more fine-grained perspectives of user-user and user-item interest invariance across domains by virtue of affluent unsupervised and semantic signals. We conduct intensive experiments on multiple tasks, constructed from two large recommendation data sets. 
Extensive results show COAST consistently and significantly outperforms state-of-the-art cross-domain recommendation algorithms as well as classic single-domain recommendation methods.|跨域推荐的目的是利用来自多个域的知识来缓解传统推荐系统中的数据稀疏和冷启动问题。一个流行的范例是使用重叠的用户表示来建立域连接,从而在所有场景中提高推荐性能。然而,这种方法的一般实践是分别训练每个域中的用户嵌入,然后以简单的方式聚合它们,通常忽略用户和项目之间潜在的跨域相似性。此外,考虑到其训练目标是面向推荐任务的,没有具体的规范化,优化嵌入无视用户视图之间的兴趣一致性,甚至违背了用户原有的兴趣分布。为了应对这些挑战,我们提出了一种新的跨域推荐框架,即 COAST,通过感知实体之间的跨域相似性和调整用户兴趣来提高双域推荐的性能。具体来说,我们首先构建一个统一的跨域异构图,重新定义图卷积网络的消息传递机制,以获取跨域用户和项目的高阶相似性。针对用户兴趣对齐,我们从用户-用户和用户项目兴趣不变性的两个更细粒度的角度,通过丰富的无监督和语义信号,开发深刻的见解。我们对两个大型推荐数据集构建的多个任务进行了深入的实验。广泛的结果表明,COAST 始终如一地显著优于最先进的跨域推荐算法和经典的单域推荐方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Cross-domain+recommendation+via+user+interest+alignment)|0| +|[Dual Intent Enhanced Graph Neural Network for Session-based New Item Recommendation](https://doi.org/10.1145/3543507.3583526)|Di Jin, Luzhi Wang, Yizhen Zheng, Guojie Song, Fei Jiang, Xiang Li, Wei Lin, Shirui Pan|School of Intelligence Science and Technology, Peking University, China; Department of Data Science and AI, Faculty of IT, Monash University, Australia; College of Intelligence and Computing, Tianjin University, China; School of Information and Communication Technology, Griffith University, Australia; Professional, China; Meituan, China|Recommender systems are essential to various fields, e.g., e-commerce, e-learning, and streaming media. At present, graph neural networks (GNNs) for session-based recommendations normally can only recommend items existing in users' historical sessions. As a result, these GNNs have difficulty recommending items that users have never interacted with (new items), which leads to a phenomenon of information cocoon. Therefore, it is necessary to recommend new items to users. As there is no interaction between new items and users, we cannot include new items when building session graphs for GNN session-based recommender systems. Thus, it is challenging to recommend new items for users when using GNN-based methods. We regard this challenge as '\textbf{G}NN \textbf{S}ession-based \textbf{N}ew \textbf{I}tem \textbf{R}ecommendation (GSNIR)'. To solve this problem, we propose a dual-intent enhanced graph neural network for it. Due to the fact that new items are not tied to historical sessions, the users' intent is difficult to predict. We design a dual-intent network to learn user intent from an attention mechanism and the distribution of historical data respectively, which can simulate users' decision-making process in interacting with a new item. To solve the challenge that new items cannot be learned by GNNs, inspired by zero-shot learning (ZSL), we infer the new item representation in GNN space by using their attributes. By outputting new item probabilities, which contain recommendation scores of the corresponding items, the new items with higher scores are recommended to users. Experiments on two representative real-world datasets show the superiority of our proposed method. The case study from the real-world verifies interpretability benefits brought by the dual-intent module and the new item reasoning module. 
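The zero-shot step in the dual-intent abstract above (inferring a new item's representation in the GNN space from its attributes) can be read roughly as a learned attribute-to-embedding map; the sketch below is a guess at that mechanism with illustrative shapes and names, not the NirGNN implementation:

```python
import torch
import torch.nn as nn

class AttrToEmbedding(nn.Module):
    """Map item attributes into the GNN embedding space so that new items
    (never seen in any session graph) can still be scored against user intent."""
    def __init__(self, attr_dim: int, emb_dim: int):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(attr_dim, emb_dim), nn.ReLU(),
                                  nn.Linear(emb_dim, emb_dim))

    def forward(self, attrs: torch.Tensor) -> torch.Tensor:
        return self.proj(attrs)

# scoring a batch of cold items against a user's fused dual-intent vector
attr_dim, emb_dim = 32, 64
model = AttrToEmbedding(attr_dim, emb_dim)
new_item_attrs = torch.randn(5, attr_dim)      # 5 cold items, attribute features
user_intent = torch.randn(emb_dim)             # fused attention + distribution intent
scores = model(new_item_attrs) @ user_intent   # higher score -> recommend first
print(scores.softmax(dim=0))                   # "new item probabilities"
```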
The code is available at Github: https://github.com/Ee1s/NirGNN|推荐系统对于电子商务、电子学习和流媒体等各个领域都是必不可少的。目前,基于会话推荐的图神经网络(GNN)通常只能推荐用户历史会话中存在的项目。因此,这些 GNN 很难推荐用户从未接触过的项目(新项目),从而导致信息茧现象。因此,有必要向用户推荐新项目。由于新项目和用户之间没有交互,所以在为基于 GNN 会话的推荐系统构建会话图时,我们不能包含新项目。因此,在使用基于 GNN 的方法时,向用户推荐新项目是一个挑战。我们将此挑战称为“基于 GNN 会话的新项目推荐(GSNIR)”。为了解决这一问题,我们提出了一种双意图增强的图神经网络。由于新条目不与历史会话相关联,因此很难预测用户的意图。设计了一个双意图网络,分别从注意机制和历史数据分布中学习用户意图,模拟用户在与新项目交互时的决策过程。为了解决 GNN 无法学习新项目的问题,受零样本学习(ZSL)的启发,我们利用新项目的属性来推断其在 GNN 空间中的表示。通过输出包含相应项目推荐分数的新项目概率,将得分较高的新项目推荐给用户。在两个具有代表性的实际数据集上的实验表明了该方法的优越性。通过实际案例分析,验证了双意图模块和新的项目推理模块所带来的可解释性优势。代码可以在 Github: https://github.com/Ee1s/NirGNN 上找到|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Dual+Intent+Enhanced+Graph+Neural+Network+for+Session-based+New+Item+Recommendation)|0| +|[Cross-domain recommendation via user interest alignment](https://doi.org/10.1145/3543507.3583263)|Chuang Zhao, Hongke Zhao, Ming HE, Jian Zhang, Jianping Fan|School of Cyberspace Security, Hangzhou Dianzi University, China; College of Management and Economics, Tianjin University, China; AI Lab at Lenovo Research, China; College of Management and Economics, Tianjin University, China and AI Lab at Lenovo Research, China|Cross-domain recommendation aims to leverage knowledge from multiple domains to alleviate the data sparsity and cold-start problems in traditional recommender systems. One popular paradigm is to employ overlapping user representations to establish domain connections, thereby improving recommendation performance in all scenarios. Nevertheless, the general practice of this approach is to train user embeddings in each domain separately and then aggregate them in a plain manner, often ignoring potential cross-domain similarities between users and items. Furthermore, considering that their training objective is recommendation task-oriented without specific regularizations, the optimized embeddings disregard the interest alignment among user's views, and even violate the user's original interest distribution. To address these challenges, we propose a novel cross-domain recommendation framework, namely COAST, to improve recommendation performance on dual domains by perceiving the cross-domain similarity between entities and aligning user interests. Specifically, we first construct a unified cross-domain heterogeneous graph and redefine the message passing mechanism of graph convolutional networks to capture high-order similarity of users and items across domains. Targeted at user interest alignment, we develop deep insights from two more fine-grained perspectives of user-user and user-item interest invariance across domains by virtue of affluent unsupervised and semantic signals. We conduct intensive experiments on multiple tasks, constructed from two large recommendation data sets. 
Extensive results show COAST consistently and significantly outperforms state-of-the-art cross-domain recommendation algorithms as well as classic single-domain recommendation methods.|跨域推荐的目的是利用来自多个域的知识来缓解传统推荐系统中的数据稀疏和冷启动问题。一个流行的范例是使用重叠的用户表示来建立域连接,从而在所有场景中提高推荐性能。然而,这种方法的一般实践是分别训练每个域中的用户嵌入,然后以简单的方式聚合它们,通常忽略用户和项目之间潜在的跨域相似性。此外,考虑到其训练目标是面向推荐任务的,没有具体的规范化,优化嵌入无视用户视图之间的兴趣一致性,甚至违背了用户原有的兴趣分布。为了应对这些挑战,我们提出了一种新的跨域推荐框架,即 COAST,通过感知实体之间的跨域相似性和调整用户兴趣来提高双域推荐的性能。具体来说,我们首先构建一个统一的跨域异构图,重新定义图卷积网络的消息传递机制,以获取跨域用户和项目的高阶相似性。针对用户兴趣对齐,我们从用户-用户和用户-项目兴趣不变性的两个更细粒度的角度,通过丰富的无监督和语义信号,开发深刻的见解。我们对两个大型推荐数据集构建的多个任务进行了深入的实验。广泛的结果表明,COAST 始终如一地显著优于最先进的跨域推荐算法和经典的单域推荐方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Cross-domain+recommendation+via+user+interest+alignment)|0| |[A Semantic Partitioning Method for Large-Scale Training of Knowledge Graph Embeddings](https://doi.org/10.1145/3543873.3587537)|Yuhe Bai|Sorbonne University, France|In recent years, knowledge graph embeddings have achieved great success. Many methods have been proposed and achieved state-of-the-art results in various tasks. However, most of the current methods present one or more of the following problems: (i) They only consider fact triplets, while ignoring the ontology information of knowledge graphs. (ii) The obtained embeddings do not contain much semantic information. Therefore, using these embeddings for semantic tasks is problematic. (iii) They do not enable large-scale training. In this paper, we propose a new algorithm that incorporates the ontology of knowledge graphs and partitions the knowledge graph based on classes to include more semantic information for parallel training of large-scale knowledge graph embeddings. Our preliminary results show that our algorithm performs well on several popular benchmarks.|近年来,知识图嵌入技术取得了很大的成功。已经提出了许多方法,并在各种任务中取得了最新的成果。然而,目前的大多数方法都存在以下一个或多个问题: (i)它们只考虑事实三元组,而忽略了知识图的本体信息。(ii)所得到的嵌入包含的语义信息不多。因此,将这些嵌入用于语义任务是有问题的。(iii)不支持大规模训练。在本文中,我们提出了一个新的算法,它结合知识图的本体并基于类对知识图进行划分,以纳入更多语义信息,用于大规模知识图嵌入的并行训练。我们的初步结果表明,我们的算法在几个流行的基准测试中表现良好。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Semantic+Partitioning+Method+for+Large-Scale+Training+of+Knowledge+Graph+Embeddings)|0| |[Intent-Aware Propensity Estimation via Click Pattern Stratification](https://doi.org/10.1145/3543873.3587610)|Ehsan Ebrahimzadeh, Alex Cozzi, Abraham Bagherjeiran|Search Ranking and Monetization, eBay, USA|Counterfactual learning to rank via inverse propensity weighting is the most popular approach to train ranking models using biased implicit user feedback from logged search data. Standard click propensity estimation techniques rely on simple models of user browsing behavior that primarily account for the attributes of the presentation context that affect whether the relevance of an item to the search context is observed. Most notably, the inherent effect of the listwise presentation of the items on users’ propensity for engagement is captured in the position of the presented items on the search result page. In this work, we enrich this position bias based click propensity model by proposing an observation model that further incorporates the underlying search intent, as reflected in the user’s click pattern in the search context. Our approach does not require an intent prediction model based on the content of the search context. Instead, we rely on a simple, yet effective, non-causal estimate of the user’s browsing intent from the number of click events in the search context. 
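The stratification idea above, where the click count of a session serves as a non-causal intent estimate and propensities are fit per (intent class, rank), might look like the following toy estimator; the logs and the final propensity formula are deliberately naive stand-ins for a real debiased estimator:

```python
from collections import defaultdict

# logged impressions: (session_id, rank, clicked)
logs = [(1, 1, 1), (1, 3, 0), (2, 1, 1), (2, 2, 1), (2, 4, 1), (3, 1, 0), (3, 2, 1)]

clicks_per_session = defaultdict(int)
for sid, rank, clicked in logs:
    clicks_per_session[sid] += clicked

def intent_class(sid):                 # sparse-click vs. multi-click sessions
    return "sparse" if clicks_per_session[sid] <= 1 else "dense"

shown = defaultdict(int)
clicked_at = defaultdict(int)
for sid, rank, clicked in logs:
    key = (intent_class(sid), rank)
    shown[key] += 1
    clicked_at[key] += clicked

# naive per-(intent, rank) propensity estimate; production estimators would
# debias this (e.g. via randomization or EM) rather than use raw click rates
propensity = {k: max(clicked_at[k] / shown[k], 0.1) for k in shown}
ipw = {k: 1.0 / p for k, p in propensity.items()}   # inverse-propensity weights
print(ipw)
```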
We empirically characterize the distinct rank decay patterns of the estimated click propensities in the characterized intent classes. In particular, we demonstrate a sharper decay of click propensities in top ranks for the intent class identified by sparse user clicks and the higher likelihood of observing clicks in lower ranks for the intent class identified by higher number of user clicks. We show that the proposed intent-aware propensity estimation technique helps with training ranking models with more effective personalization and generalization power through empirical results for a ranking task in a major e-commerce platform.|基于逆倾向加权的反事实排序学习,是利用搜索日志中有偏的隐式用户反馈来训练排序模型的最流行方法。标准的点击倾向估计技术依赖于用户浏览行为的简单模型,这些模型主要刻画了影响“项目与搜索上下文的相关性能否被观察到”的呈现上下文属性。最值得注意的是,项目列表式呈现对用户参与倾向的内在影响,体现在项目在搜索结果页面上的位置上。在这项工作中,我们提出一个进一步结合潜在搜索意图(体现为用户在搜索上下文中的点击模式)的观察模型,以丰富这一基于位置偏差的点击倾向模型。我们的方法不需要基于搜索上下文内容的意图预测模型。相反,我们依赖于从搜索上下文中的点击事件数量对用户的浏览意图进行简单而有效的非因果估计。我们实证刻画了各意图类别中估计点击倾向的不同位次衰减模式。特别地,我们发现由稀疏用户点击识别的意图类别,其点击倾向在靠前位次衰减更陡;而由较多用户点击识别的意图类别,在靠后位次观察到点击的可能性更高。通过对一个大型电子商务平台的排序任务进行实证分析,我们发现提出的意图感知倾向估计技术有助于训练具有更有效个性化和泛化能力的排序模型。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Intent-Aware+Propensity+Estimation+via+Click+Pattern+Stratification)|0| -|[Disentangling Degree-related Biases and Interest for Out-of-Distribution Generalized Directed Network Embedding](https://doi.org/10.1145/3543507.3583271)|Hyunsik Yoo, YeonChang Lee, Kijung Shin, SangWook Kim|Korea Advanced Institute of Science and Technology, Republic of Korea; Georgia Institute of Technology, USA; Hanyang University, Republic of Korea|The goal of directed network embedding is to represent the nodes in a given directed network as embeddings that preserve the asymmetric relationships between nodes. While a number of directed network embedding methods have been proposed, we empirically show that the existing methods lack out-of-distribution generalization abilities against degree-related distributional shifts. To mitigate this problem, we propose ODIN (Out-of-Distribution Generalized Directed Network Embedding), a new directed NE method where we model multiple factors in the formation of directed edges. Then, for each node, ODIN learns multiple embeddings, each of which preserves its corresponding factor, by disentangling interest factors and biases related to in- and out-degrees of nodes. Our experiments on four real-world directed networks demonstrate that disentangling multiple factors enables ODIN to yield out-of-distribution generalized embeddings that are consistently effective under various degrees of shifts in degree distributions. Specifically, ODIN universally outperforms 9 state-of-the-art competitors in 2 LP tasks on 4 real-world datasets under both identical distribution (ID) and non-ID settings. 
The code is available at https://github.com/hsyoo32/odin.|有向网络嵌入的目的是将给定有向网络中的节点表示为保持节点间不对称关系的嵌入。虽然已经提出了一些有向网络嵌入方法,但是实验表明,现有的方法缺乏对度相关分布偏移的分布外泛化能力。为了解决这一问题,我们提出了一种新的有向网络嵌入方法 ODIN (Out-of-Distribution Generalized Directed Network Embedding),该方法对有向边的形成过程中的多个因素进行建模。然后,对于每个节点,ODIN 通过分离与节点内外度相关的兴趣因子和偏差来学习多个嵌入,每个嵌入保留相应的因子。我们在四个真实世界的有向网络上的实验表明,解开多个因素使 ODIN 能够产生分布外的广义嵌入,在度分布的不同程度的转移下一致有效。具体而言,ODIN 在4个真实世界数据集的2个 LP 任务中,在相同的分布(ID)和非 ID 设置下,普遍优于9个最先进的竞争对手。代码可在 https://github.com/hsyoo32/odin 查阅。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Disentangling+Degree-related+Biases+and+Interest+for+Out-of-Distribution+Generalized+Directed+Network+Embedding)|0| -|[Fine-tuning Partition-aware Item Similarities for Efficient and Scalable Recommendation](https://doi.org/10.1145/3543507.3583240)|Tianjun Wei, Jianghong Ma, Tommy W. S. Chow|Harbin Institute of Technology, China; City University of Hong Kong, Hong Kong|Collaborative filtering (CF) is widely studied in recommendation with various types of solutions. Recent success of Graph Convolution Networks (GCN) in CF demonstrates the effectiveness of modeling high-order relationships through graphs, while repetitive graph convolution and iterative batch optimization limit their efficiency. Instead, item similarity models attempt to construct direct relationships through efficient interaction encoding. Despite their great performance, the growing item numbers result in quadratic growth in similarity modeling process, posing critical scalability problems. In this paper, we investigate the graph sampling strategy adopted in latest GCN model for efficiency improving, and identify the potential item group structure in the sampled graph. Based on this, we propose a novel item similarity model which introduces graph partitioning to restrict the item similarity modeling within each partition. Specifically, we show that the spectral information of the original graph is well in preserving global-level information. Then, it is added to fine-tune local item similarities with a new data augmentation strategy acted as partition-aware prior knowledge, jointly to cope with the information loss brought by partitioning. 
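To see why the partitioning in the abstract above helps scalability: a full item-item similarity model is quadratic in the number of items, while restricting it to within-partition blocks shrinks it by roughly the number of partitions. A minimal numpy sketch, with hash-based partitions standing in for real graph partitioning and no spectral fine-tuning:

```python
import numpy as np

rng = np.random.default_rng(1)
n_items, n_parts = 200, 4
X = (rng.random((500, n_items)) < 0.02).astype(float)  # toy user-item interactions

# crude partition: hash items into blocks (the paper uses graph partitioning)
part = np.arange(n_items) % n_parts

# a full item-item similarity would be O(n_items^2); model it per partition instead
sims = {}
for p in range(n_parts):
    idx = np.where(part == p)[0]
    G = X[:, idx]
    sims[p] = G.T @ G                          # within-partition Gram block

block_params = sum(s.size for s in sims.values())
print(block_params, "vs", n_items * n_items)   # ~1/n_parts of the full model
```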
Experiments carried out on 4 datasets show that the proposed model outperforms state-of-the-art GCN models with 10x speed-up and item similarity models with 95% parameter storage savings.|协同过滤(CF)在推荐领域得到了广泛研究,存在各种类型的解决方案。最近,图卷积网络(GCN)在 CF 中的成功证明了通过图建立高阶关系的有效性,而重复图卷积和迭代批处理优化限制了它们的效率。相反,项目相似性模型试图通过有效的交互编码来构建直接关系。尽管它们具有很好的性能,但是项目数量的增长会导致相似性建模过程的二次增长,从而产生关键的可扩展性问题。本文研究了最新 GCN 模型中为提高效率而采用的图抽样策略,并识别了抽样图中潜在的项目组结构。在此基础上,提出了一种新的项目相似度模型,该模型引入图划分,将项目相似度建模限制在每个分区内。具体地说,我们证明了原始图的谱信息能够很好地保留全局层面的信息。然后加入一种新的数据增强策略作为分区感知的先验知识,对局部项目相似性进行微调,共同应对分区带来的信息丢失。在4个数据集上进行的实验表明,该模型以10倍的加速优于最先进的 GCN 模型,并在节省95% 参数存储的同时优于项目相似度模型。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Fine-tuning+Partition-aware+Item+Similarities+for+Efficient+and+Scalable+Recommendation)|0| -|[Multi-Behavior Recommendation with Cascading Graph Convolution Networks](https://doi.org/10.1145/3543507.3583439)|Zhiyong Cheng, Sai Han, Fan Liu, Lei Zhu, Zan Gao, Yuxin Peng|Shandong Artificial Intelligence Institute, Qilu University of Technology (Shandong Academy of Sciences), China; School of Information Science and Engineering, Shandong Normal University, China; School of Computing, National University of Singapore, Singapore; Wangxuan Institute of Computer Technology, Peking University, China and Peng Cheng Laboratory, China|Multi-behavior recommendation, which exploits auxiliary behaviors (e.g., click and cart) to help predict users' potential interactions on the target behavior (e.g., buy), is regarded as an effective way to alleviate the data sparsity or cold-start issues in recommendation. Multi-behaviors are often taken in certain orders in real-world applications (e.g., click>cart>buy). In a behavior chain, a latter behavior usually exhibits a stronger signal of user preference than the former one does. Most existing multi-behavior models fail to capture such dependencies in a behavior chain for embedding learning. In this work, we propose a novel multi-behavior recommendation model with cascading graph convolution networks (named MB-CGCN). In MB-CGCN, the embeddings learned from one behavior are used as the input features for the next behavior's embedding learning after a feature transformation operation. In this way, our model explicitly utilizes the behavior dependencies in embedding learning. Experiments on two benchmark datasets demonstrate the effectiveness of our model on exploiting multi-behavior data. 
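The cascading design of MB-CGCN described above, where embeddings learned on one behavior graph pass through a feature transformation and seed the next behavior's convolution, can be sketched roughly as follows; the one-hop mean aggregation is a placeholder for the paper's actual graph convolution, and all sizes are invented:

```python
import torch
import torch.nn as nn

def propagate(adj: torch.Tensor, emb: torch.Tensor) -> torch.Tensor:
    """One hop of mean-style graph convolution on a dense toy adjacency."""
    deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
    return adj @ emb / deg

n_nodes, dim = 50, 16
behaviors = ["click", "cart", "buy"]                 # behavior chain order
adjs = {b: (torch.rand(n_nodes, n_nodes) < 0.05).float() for b in behaviors}
transforms = nn.ModuleDict({b: nn.Linear(dim, dim) for b in behaviors})

emb = torch.randn(n_nodes, dim)                      # shared initial embeddings
for b in behaviors:                                  # cascade: click -> cart -> buy
    emb = propagate(adjs[b], emb)                    # convolve on this behavior graph
    emb = torch.relu(transforms[b](emb))             # feature transformation
# `emb` now carries the whole behavior chain for prediction on the target behavior
```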
It outperforms the best baseline by 33.7% and 35.9% on average over the two datasets in terms of Recall@10 and NDCG@10, respectively.|多行为推荐利用辅助行为(如点击和购物车)来帮助预测用户在目标行为(如购买)上的潜在交互,被认为是缓解推荐中数据稀疏或冷启动问题的有效方法。在实际应用程序中,多行为通常按照特定的顺序执行(例如,单击 > 购物车 > 购买)。在行为链中,后一种行为通常比前一种行为表现出更强的用户偏好信号。大多数现有的多行为模型无法在嵌入式学习的行为链中捕获这种依赖关系。提出了一种新的具有级联图卷积网络的多行为推荐模型(MB-CGCN)。在 MB-CGCN 中,从一个行为中学习的嵌入作为特征转换操作后下一个行为的嵌入学习的输入特征。通过这种方式,我们的模型明确地利用了嵌入式学习中的行为依赖。在两个基准数据集上的实验证明了该模型对多行为数据的有效性。以 Recall@10和 NDCG@10计算,该方法比最佳基线的平均值分别高出33.7% 和35.9% 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Multi-Behavior+Recommendation+with+Cascading+Graph+Convolution+Networks)|0| -|[Cross-domain Recommendation with Behavioral Importance Perception](https://doi.org/10.1145/3543507.3583494)|Hong Chen, Xin Wang, Ruobing Xie, Yuwei Zhou, Wenwu Zhu|WeChat Search Application Department, Tencent, China; Department of Computer Science and Technology, Tsinghua University, China|Cross-domain recommendation (CDR) aims to leverage the source domain information to provide better recommendation for the target domain, which is widely adopted in recommender systems to alleviate the data sparsity and cold-start problems. However, existing CDR methods mostly focus on designing effective model architectures to transfer the source domain knowledge, ignoring the behavior-level effect during the loss optimization process, where behaviors regarding different aspects in the source domain may have different importance for the CDR model optimization. The ignorance of the behavior-level effect will cause the carefully designed model architectures ending up with sub-optimal parameters, which limits the recommendation performance. To tackle the problem, we propose a generic behavioral importance-aware optimization framework for cross-domain recommendation (BIAO). Specifically, we propose a behavioral perceptron which predicts the importance of each source behavior according to the corresponding item’s global impact and local user-specific impact. The joint optimization process of the CDR model and the behavioral perceptron is formulated as a bi-level optimization problem. In the lower optimization, only the CDR model is updated with weighted source behavior loss and the target domain loss, while in the upper optimization, the behavioral perceptron is updated with implicit gradient from a developing dataset obtained through the proposed reorder-and-reuse strategy. 
Extensive experiments show that our proposed optimization framework consistently improves the performance of different cross-domain recommendation models in 7 cross-domain scenarios, demonstrating that our method can serve as a generic and powerful tool for cross-domain recommendation.|跨域推荐(CDR)是利用源域信息为目标域提供更好的推荐,在推荐系统中被广泛采用以缓解数据稀疏和冷启动问题。然而,现有的 CDR 方法大多侧重于设计有效的模型结构来传递源域知识,忽略了损失优化过程中的行为级效应,其中源域中不同方面的行为对 CDR 模型优化的重要性不同。对行为级效应的忽视将导致精心设计的模型结构最终得到次优参数,从而限制了推荐性能。为了解决这个问题,我们提出了一个通用的跨域推荐的行为重要性感知优化框架(BIAO)。具体来说,我们提出了一种行为感知器,它根据相应项目的全局影响和局部用户特定影响来预测每个源行为的重要性。CDR 模型和行为感知器的联合优化过程被表述为一个双层优化问题。在下层优化中,仅使用加权的源行为损失和目标域损失来更新 CDR 模型;而在上层优化中,行为感知器利用在通过所提出的重排序-重用策略获得的开发数据集上计算的隐式梯度进行更新。大量的实验表明,我们提出的优化框架在7个跨领域场景中始终如一地提高了不同跨领域推荐模型的性能,表明我们的方法可以作为跨领域推荐的一个通用和强大的工具。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Cross-domain+Recommendation+with+Behavioral+Importance+Perception)|0| +|[Disentangling Degree-related Biases and Interest for Out-of-Distribution Generalized Directed Network Embedding](https://doi.org/10.1145/3543507.3583271)|Hyunsik Yoo, YeonChang Lee, Kijung Shin, SangWook Kim|Georgia Institute of Technology, USA; Korea Advanced Institute of Science and Technology, Republic of Korea; Hanyang University, Republic of Korea|The goal of directed network embedding is to represent the nodes in a given directed network as embeddings that preserve the asymmetric relationships between nodes. While a number of directed network embedding methods have been proposed, we empirically show that the existing methods lack out-of-distribution generalization abilities against degree-related distributional shifts. To mitigate this problem, we propose ODIN (Out-of-Distribution Generalized Directed Network Embedding), a new directed NE method where we model multiple factors in the formation of directed edges. Then, for each node, ODIN learns multiple embeddings, each of which preserves its corresponding factor, by disentangling interest factors and biases related to in- and out-degrees of nodes. Our experiments on four real-world directed networks demonstrate that disentangling multiple factors enables ODIN to yield out-of-distribution generalized embeddings that are consistently effective under various degrees of shifts in degree distributions. Specifically, ODIN universally outperforms 9 state-of-the-art competitors in 2 LP tasks on 4 real-world datasets under both identical distribution (ID) and non-ID settings. The code is available at https://github.com/hsyoo32/odin.|有向网络嵌入的目的是将给定有向网络中的节点表示为保持节点间不对称关系的嵌入。虽然已经提出了一些有向网络嵌入方法,但是实验表明,现有的方法缺乏对度相关分布偏移的分布外泛化能力。为了解决这一问题,我们提出了一种新的有向网络嵌入方法 ODIN (Out-of-Distribution Generalized Directed Network Embedding),该方法对有向边的形成过程中的多个因素进行建模。然后,对于每个节点,ODIN 通过分离与节点内外度相关的兴趣因子和偏差来学习多个嵌入,每个嵌入保留相应的因子。我们在四个真实世界的有向网络上的实验表明,解开多个因素使 ODIN 能够产生分布外的广义嵌入,在度分布的不同程度的转移下一致有效。具体而言,ODIN 在4个真实世界数据集的2个 LP 任务中,在相同的分布(ID)和非 ID 设置下,普遍优于9个最先进的竞争对手。代码可在 https://github.com/hsyoo32/odin 查阅。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Disentangling+Degree-related+Biases+and+Interest+for+Out-of-Distribution+Generalized+Directed+Network+Embedding)|0| +|[Fine-tuning Partition-aware Item Similarities for Efficient and Scalable Recommendation](https://doi.org/10.1145/3543507.3583240)|Tianjun Wei, Jianghong Ma, Tommy W. S. Chow|City University of Hong Kong, Hong Kong; Harbin Institute of Technology, China|Collaborative filtering (CF) is widely studied in recommendation with various types of solutions. 
Recent success of Graph Convolution Networks (GCN) in CF demonstrates the effectiveness of modeling high-order relationships through graphs, while repetitive graph convolution and iterative batch optimization limit their efficiency. Instead, item similarity models attempt to construct direct relationships through efficient interaction encoding. Despite their great performance, the growing item numbers result in quadratic growth in similarity modeling process, posing critical scalability problems. In this paper, we investigate the graph sampling strategy adopted in latest GCN model for efficiency improving, and identify the potential item group structure in the sampled graph. Based on this, we propose a novel item similarity model which introduces graph partitioning to restrict the item similarity modeling within each partition. Specifically, we show that the spectral information of the original graph is well in preserving global-level information. Then, it is added to fine-tune local item similarities with a new data augmentation strategy acted as partition-aware prior knowledge, jointly to cope with the information loss brought by partitioning. Experiments carried out on 4 datasets show that the proposed model outperforms state-of-the-art GCN models with 10x speed-up and item similarity models with 95% parameter storage savings.|协同过滤(CF)在推荐领域得到了广泛研究,存在各种类型的解决方案。最近,图卷积网络(GCN)在 CF 中的成功证明了通过图建立高阶关系的有效性,而重复图卷积和迭代批处理优化限制了它们的效率。相反,项目相似性模型试图通过有效的交互编码来构建直接关系。尽管它们具有很好的性能,但是项目数量的增长会导致相似性建模过程的二次增长,从而产生关键的可扩展性问题。本文研究了最新 GCN 模型中为提高效率而采用的图抽样策略,并识别了抽样图中潜在的项目组结构。在此基础上,提出了一种新的项目相似度模型,该模型引入图划分,将项目相似度建模限制在每个分区内。具体地说,我们证明了原始图的谱信息能够很好地保留全局层面的信息。然后加入一种新的数据增强策略作为分区感知的先验知识,对局部项目相似性进行微调,共同应对分区带来的信息丢失。在4个数据集上进行的实验表明,该模型以10倍的加速优于最先进的 GCN 模型,并在节省95% 参数存储的同时优于项目相似度模型。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Fine-tuning+Partition-aware+Item+Similarities+for+Efficient+and+Scalable+Recommendation)|0| +|[Multi-Behavior Recommendation with Cascading Graph Convolution Networks](https://doi.org/10.1145/3543507.3583439)|Zhiyong Cheng, Sai Han, Fan Liu, Lei Zhu, Zan Gao, Yuxin Peng|School of Computing, National University of Singapore, Singapore; Wangxuan Institute of Computer Technology, Peking University, China and Peng Cheng Laboratory, China; Shandong Artificial Intelligence Institute, Qilu University of Technology (Shandong Academy of Sciences), China; School of Information Science and Engineering, Shandong Normal University, China|Multi-behavior recommendation, which exploits auxiliary behaviors (e.g., click and cart) to help predict users' potential interactions on the target behavior (e.g., buy), is regarded as an effective way to alleviate the data sparsity or cold-start issues in recommendation. Multi-behaviors are often taken in certain orders in real-world applications (e.g., click>cart>buy). In a behavior chain, a latter behavior usually exhibits a stronger signal of user preference than the former one does. Most existing multi-behavior models fail to capture such dependencies in a behavior chain for embedding learning. In this work, we propose a novel multi-behavior recommendation model with cascading graph convolution networks (named MB-CGCN). In MB-CGCN, the embeddings learned from one behavior are used as the input features for the next behavior's embedding learning after a feature transformation operation. In this way, our model explicitly utilizes the behavior dependencies in embedding learning. 
Experiments on two benchmark datasets demonstrate the effectiveness of our model on exploiting multi-behavior data. It outperforms the best baseline by 33.7% and 35.9% on average over the two datasets in terms of Recall@10 and NDCG@10, respectively.|多行为推荐利用辅助行为(如点击和购物车)来帮助预测用户在目标行为(如购买)上的潜在交互,被认为是缓解推荐中数据稀疏或冷启动问题的有效方法。在实际应用程序中,多行为通常按照特定的顺序执行(例如,单击 > 购物车 > 购买)。在行为链中,后一种行为通常比前一种行为表现出更强的用户偏好信号。大多数现有的多行为模型无法在嵌入式学习的行为链中捕获这种依赖关系。提出了一种新的具有级联图卷积网络的多行为推荐模型(MB-CGCN)。在 MB-CGCN 中,从一个行为中学习的嵌入作为特征转换操作后下一个行为的嵌入学习的输入特征。通过这种方式,我们的模型明确地利用了嵌入式学习中的行为依赖。在两个基准数据集上的实验证明了该模型对多行为数据的有效性。以 Recall@10和 NDCG@10计算,该方法比最佳基线的平均值分别高出33.7% 和35.9% 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Multi-Behavior+Recommendation+with+Cascading+Graph+Convolution+Networks)|0| +|[Cross-domain Recommendation with Behavioral Importance Perception](https://doi.org/10.1145/3543507.3583494)|Hong Chen, Xin Wang, Ruobing Xie, Yuwei Zhou, Wenwu Zhu|Department of Computer Science and Technology, Tsinghua University, China; WeChat Search Application Department, Tencent, China|Cross-domain recommendation (CDR) aims to leverage the source domain information to provide better recommendation for the target domain, which is widely adopted in recommender systems to alleviate the data sparsity and cold-start problems. However, existing CDR methods mostly focus on designing effective model architectures to transfer the source domain knowledge, ignoring the behavior-level effect during the loss optimization process, where behaviors regarding different aspects in the source domain may have different importance for the CDR model optimization. The ignorance of the behavior-level effect will cause the carefully designed model architectures ending up with sub-optimal parameters, which limits the recommendation performance. To tackle the problem, we propose a generic behavioral importance-aware optimization framework for cross-domain recommendation (BIAO). Specifically, we propose a behavioral perceptron which predicts the importance of each source behavior according to the corresponding item’s global impact and local user-specific impact. The joint optimization process of the CDR model and the behavioral perceptron is formulated as a bi-level optimization problem. In the lower optimization, only the CDR model is updated with weighted source behavior loss and the target domain loss, while in the upper optimization, the behavioral perceptron is updated with implicit gradient from a developing dataset obtained through the proposed reorder-and-reuse strategy. Extensive experiments show that our proposed optimization framework consistently improves the performance of different cross-domain recommendation models in 7 cross-domain scenarios, demonstrating that our method can serve as a generic and powerful tool for cross-domain recommendation.|跨域推荐(CDR)是利用源域信息为目标域提供更好的推荐,在推荐系统中被广泛采用以缓解数据稀疏和冷启动问题。然而,现有的 CDR 方法大多侧重于设计有效的模型结构来传递源域知识,忽略了损失优化过程中的行为级效应,其中源域中不同方面的行为对 CDR 模型优化的重要性不同。对行为级效应的忽视将导致精心设计的模型结构最终得到次优参数,从而限制了推荐性能。为了解决这个问题,我们提出了一个通用的跨域推荐的行为重要性感知优化框架(BIAO)。具体来说,我们提出了一种行为感知器,它根据相应项目的全局影响和局部用户特定影响来预测每个源行为的重要性。CDR 模型和行为感知器的联合优化过程被表述为一个双层优化问题。在下层优化中,仅使用加权的源行为损失和目标域损失来更新 CDR 模型;而在上层优化中,行为感知器利用在通过所提出的重排序-重用策略获得的开发数据集上计算的隐式梯度进行更新。大量的实验表明,我们提出的优化框架在7个跨领域场景中始终如一地提高了不同跨领域推荐模型的性能,表明我们的方法可以作为跨领域推荐的一个通用和强大的工具。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Cross-domain+Recommendation+with+Behavioral+Importance+Perception)|0| |[Multi-Lingual Multi-Partite Product Title Matching](https://doi.org/10.1145/3543873.3587322)|HuanLin Tay, WeiJie Tay, Hady W. 
Lauw|Singapore Management University, Singapore|In a globalized marketplace, one could access products or services from almost anywhere. However, resolving which product in one language corresponds to another product in a different language remains an under-explored problem. We explore this from two perspectives. First, given two products of different languages, how to assess their similarity that could signal a potential match. Second, given products from various languages, how to arrive at a multi-partite clustering that respects cardinality constraints efficiently. We describe algorithms for each perspective and integrate them into a promising solution validated on real-world datasets.|在一个全球化的市场中,人们几乎可以从任何地方获得产品或服务。然而,解决一种语言中的哪种产品对应于另一种语言中的另一种产品仍然是一个尚未得到充分探讨的问题。我们从两个角度来探讨这个问题。首先,给定两种不同语言的产品,如何评估它们的相似性,这可能标志着潜在的匹配。其次,给定来自不同语言的产品,如何有效地得到一个尊重基数约束的多部分聚类。我们描述每个视角的算法,并将它们集成到一个在真实世界数据集上验证的有希望的解决方案中。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Multi-Lingual+Multi-Partite+Product+Title+Matching)|0| |[Multi-interest Recommendation on Shopping for Others](https://doi.org/10.1145/3543873.3587341)|Shuang Li, Yaokun Liu, Xiaowang Zhang, Yuexian Hou, Zhiyong Feng|Tianjin University, China; Tianjin University, China, China|Existing recommendation methods based on multi-interest frameworks effectively model users from multiple aspects to represent complex user interests. However, more research still needs to be done on the behavior of users shopping for others. We propose a Multi-Demander Recommendation (MDR) model to learn different people’s interests from a sequence of actions. We first decouple the feature embeddings of items to learn the static preferences of different demanders. Next, a weighted directed global graph is constructed to model the associations among item categories. We partition short sequences by time intervals and look up category embeddings from the graph to capture dynamic intents. Finally, preferences and intentions are combined with learning the interests of different demanders. The conducted experiments demonstrate that our model improves the accuracy of recommendations.|现有的基于多兴趣框架的推荐方法能够有效地从多个方面对用户进行建模,以表达复杂的用户兴趣。然而,还需要对用户为他人购物的行为进行更多的研究。我们提出了一个多需求推荐(MDR)模型,从一系列的行动中了解不同人的兴趣。首先解耦项目的特征嵌入,学习不同需求者的静态偏好。然后,构造一个加权有向全局图来模拟项目类别之间的关联。我们根据时间间隔对短序列进行划分,并从图中查找类别嵌入以捕获动态意图。最后,将偏好和意图与学习不同需求者的兴趣结合起来。实验表明,该模型提高了推荐的准确性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Multi-interest+Recommendation+on+Shopping+for+Others)|0| -|[Explicit and Implicit Semantic Ranking Framework](https://doi.org/10.1145/3543873.3584621)|Xiaofeng Zhu, Thomas Lin, Vishal Anand, Matthew Calderwood, Eric ClausenBrown, Gord Lueck, Wenwai Yim, Cheng Wu|Nuance Communications, USA; Microsoft Corporation, USA|The core challenge in numerous real-world applications is to match an inquiry to the best document from a mutable and finite set of candidates. Existing industry solutions, especially latency-constrained services, often rely on similarity algorithms that sacrifice quality for speed. In this paper we introduce a generic semantic learning-to-rank framework, Self-training Semantic Cross-attention Ranking (sRank). This transformer-based framework uses linear pairwise loss with mutable training batch sizes and achieves quality gains and high efficiency, and has been applied effectively to show gains on two industry tasks at Microsoft over real-world large-scale data sets: Smart Reply (SR) and Ambient Clinical Intelligence (ACI). 
In Smart Reply, sRank assists live customers with technical support by selecting the best reply from predefined solutions based on consumer and support agent messages. It achieves 11.7% gain in offline top-one accuracy on the SR task over the previous system, and has enabled 38.7% time reduction in composing messages in telemetry recorded since its general release in January 2021. In the ACI task, sRank selects relevant historical physician templates that serve as guidance for a text summarization model to generate higher quality medical notes. It achieves 35.5% top-one accuracy gain, along with 46% relative ROUGE-L gain in generated medical notes.|在许多实际应用程序中,核心挑战是将查询与一组可变且有限的候选文档中的最佳文档相匹配。现有的行业解决方案,尤其是延迟受限的服务,通常依赖于牺牲质量以提高速度的相似性算法。本文介绍了一个通用的语义学习排序框架——自训练语义交叉注意排序(sRank)。这种基于 Transformer 的框架使用线性成对损失和可变的训练批量大小,实现了质量增益和高效率,并已在微软的两个基于真实世界大规模数据集的行业任务——智能应答(SR)和环境临床智能(ACI)——上有效地展示了收益。在智能答复中,sRank 通过从基于消费者和支持代理消息的预定义解决方案中选择最佳答复,为现场客户提供技术支持。与之前的系统相比,它在 SR 任务上的离线 top-1 准确率提升了 11.7%,并且根据自 2021 年 1 月正式发布以来记录的遥测数据,撰写消息的时间减少了 38.7%。在 ACI 任务中,sRank 选择相关的历史医生模板作为文本摘要模型的指导,以生成更高质量的医疗笔记。在生成的医疗笔记上,它实现了 35.5% 的 top-1 准确率增益,以及 46% 的相对 ROUGE-L 增益。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Explicit+and+Implicit+Semantic+Ranking+Framework)|0| +|[Explicit and Implicit Semantic Ranking Framework](https://doi.org/10.1145/3543873.3584621)|Xiaofeng Zhu, Thomas Lin, Vishal Anand, Matthew Calderwood, Eric ClausenBrown, Gord Lueck, Wenwai Yim, Cheng Wu|Microsoft Corporation, USA; Nuance Communications, USA|The core challenge in numerous real-world applications is to match an inquiry to the best document from a mutable and finite set of candidates. Existing industry solutions, especially latency-constrained services, often rely on similarity algorithms that sacrifice quality for speed. In this paper we introduce a generic semantic learning-to-rank framework, Self-training Semantic Cross-attention Ranking (sRank). This transformer-based framework uses linear pairwise loss with mutable training batch sizes and achieves quality gains and high efficiency, and has been applied effectively to show gains on two industry tasks at Microsoft over real-world large-scale data sets: Smart Reply (SR) and Ambient Clinical Intelligence (ACI). In Smart Reply, sRank assists live customers with technical support by selecting the best reply from predefined solutions based on consumer and support agent messages. It achieves 11.7% gain in offline top-one accuracy on the SR task over the previous system, and has enabled 38.7% time reduction in composing messages in telemetry recorded since its general release in January 2021. In the ACI task, sRank selects relevant historical physician templates that serve as guidance for a text summarization model to generate higher quality medical notes. 
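The "linear pairwise loss with mutable training batch sizes" that sRank is said to use above is not spelled out in the abstract; one plausible, hinge-style reading of it is sketched below, with random scores standing in for the cross-attention scorer:

```python
import torch
import torch.nn.functional as F

# scores a cross-attention scorer would assign to (inquiry, candidate) pairs;
# random leaf tensors stand in for the model here
pos = torch.randn(8, requires_grad=True)   # selected / clicked candidates
neg = torch.randn(8, requires_grad=True)   # rejected candidates from the same batch
loss = F.relu(1.0 - (pos - neg)).mean()    # piecewise-linear pairwise margin loss
loss.backward()                            # the batch size can vary freely per step
```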
It achieves 35.5% top-one accuracy gain, along with 46% relative ROUGE-L gain in generated medical notes.|在许多实际应用程序中,核心挑战是将查询与一组可变且有限的候选文档中的最佳文档相匹配。现有的行业解决方案,尤其是延迟受限的服务,通常依赖于牺牲质量以提高速度的相似性算法。本文介绍了一个通用的语义学习排序框架——自训练语义交叉注意排序(sRank)。这种基于 Transformer 的框架使用线性成对损失和可变的训练批量大小,实现了质量增益和高效率,并已在微软的两个基于真实世界大规模数据集的行业任务——智能应答(SR)和环境临床智能(ACI)——上有效地展示了收益。在智能答复中,sRank 通过从基于消费者和支持代理消息的预定义解决方案中选择最佳答复,为现场客户提供技术支持。与之前的系统相比,它在 SR 任务上的离线 top-1 准确率提升了 11.7%,并且根据自 2021 年 1 月正式发布以来记录的遥测数据,撰写消息的时间减少了 38.7%。在 ACI 任务中,sRank 选择相关的历史医生模板作为文本摘要模型的指导,以生成更高质量的医疗笔记。在生成的医疗笔记上,它实现了 35.5% 的 top-1 准确率增益,以及 46% 的相对 ROUGE-L 增益。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Explicit+and+Implicit+Semantic+Ranking+Framework)|0| |[MPKGAC: Multimodal Product Attribute Completion in E-commerce](https://doi.org/10.1145/3543873.3584623)|Kai Wang, Jianzhi Shao, Tao Zhang, Qijin Chen, Chengfu Huo|Alibaba Group, China|Product attributes can display the selling points of products, helping users find their desired products in search results. However, product attributes are typically incomplete. In e-commerce, products have multimodal features, including original attributes, images, and texts. How to make full use of the multimodal data to complete the missing attributes is the key challenge. To this end, we propose MPKGAC, a powerful three-stream framework that handles multimodal product data for attribute completion. We build a multimodal product knowledge graph (KG) from the multimodal features, and then convert the attribute completion problem into a multimodal KG completion task. MPKGAC encodes each modality separately, fuses them adaptively, and integrates multimodal decoders for prediction. Experiments show that MPKGAC outperforms the best baseline by 6.2%. MPKGAC is employed to enrich selling points of the women’s clothing industry at Alibaba.com.cn and improves the click-through rate (CTR) by a relative 2.14%.|产品属性可以展示产品的卖点,帮助用户在搜索结果中找到他们想要的产品。但是,产品属性通常是不完整的。在电子商务中,产品具有多模态特征,包括原始属性、图像和文本。如何充分利用多模态数据来补全缺失的属性是关键挑战。为此,我们提出了 MPKGAC,一个处理多模态产品数据以进行属性补全的强大三流框架。我们首先根据多模态特征构造多模态产品知识图谱(KG),然后将属性补全问题转化为多模态 KG 补全任务。MPKGAC 对每种模态分别进行编码,自适应地进行融合,并集成多模态解码器进行预测。实验表明,MPKGAC 比最佳基线高出 6.2%。MPKGAC 已被用于丰富 Alibaba.com.cn 上女装行业的卖点,并将点击率(CTR)相对提升了 2.14%。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=MPKGAC:+Multimodal+Product+Attribute+Completion+in+E-commerce)|0| |[Bootstrapping Contrastive Learning Enhanced Music Cold-Start Matching](https://doi.org/10.1145/3543873.3584626)|Xinping Zhao, Ying Zhang, Qiang Xiao, Yuming Ren, Yingchun Yang|NetEase Cloud Music, NetEase Inc., China; Zhejiang University, China; Zhejiang University, China and NetEase Cloud Music, NetEase Inc., China|We study a particular matching task we call Music Cold-Start Matching. In short, given a cold-start song request, we expect to retrieve songs with similar audiences and then quickly push the cold-start song to the audiences of the retrieved songs to warm it up. However, there are hardly any studies done on this task. Therefore, in this paper, we formalize the problem of Music Cold-Start Matching in detail and give a scheme. During the offline training, we attempt to learn high-quality song representations based on song content features. But, we find supervision signals typically follow power-law distribution causing skewed representation learning. To address this issue, we propose a novel contrastive learning paradigm named Bootstrapping Contrastive Learning (BCL) to enhance the quality of learned representations by exerting contrastive regularization. 
During the online serving, to locate the target audiences more accurately, we propose Clustering-based Audience Targeting (CAT) that clusters audience representations to acquire a few cluster centroids and then locate the target audiences by measuring the relevance between the audience representations and the cluster centroids. Extensive experiments on the offline dataset and online system demonstrate the effectiveness and efficiency of our method. Currently, we have deployed it on NetEase Cloud Music, affecting millions of users.|我们研究一个特殊的匹配任务,我们称之为音乐冷启动匹配。简而言之,给定一个冷启动歌曲请求,我们期望检索具有相似受众的歌曲,然后快速将冷启动歌曲推送给被检索歌曲的受众进行预热。然而,几乎没有任何关于这项任务的研究。因此,本文将音乐冷启动匹配问题进行了详细的形式化描述,并给出了一个解决方案。在离线训练中,我们尝试根据歌曲的内容特征来学习高质量的歌曲表现。但是,我们发现监督信号具有典型的幂律分布特征,从而导致了偏态表征学习。为了解决这一问题,我们提出了一种新的对比学习范式——自举对比学习(BCL) ,通过运用对比正则化来提高学习表征的质量。在在线服务过程中,为了更准确地定位目标受众,我们提出了基于聚类的受众定位(CAT)方法,即通过聚类获取受众表征的几个聚类中心,然后通过测量受众表征与聚类中心之间的相关性来定位目标受众。在离线数据集和在线系统上的大量实验证明了该方法的有效性和高效性。目前,我们已经在网易云音乐上部署了它,影响了数百万用户。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Bootstrapping+Contrastive+Learning+Enhanced+Music+Cold-Start+Matching)|0| -|[Reinforcing User Retention in a Billion Scale Short Video Recommender System](https://doi.org/10.1145/3543873.3584640)|Qingpeng Cai, Shuchang Liu, Xueliang Wang, Tianyou Zuo, Wentao Xie, Bin Yang, Dong Zheng, Peng Jiang, Kun Gai|Unaffiliated, China; Kuaishou Technology, China|Recently, short video platforms have achieved rapid user growth by recommending interesting content to users. The objective of the recommendation is to optimize user retention, thereby driving the growth of DAU (Daily Active Users). Retention is a long-term feedback after multiple interactions of users and the system, and it is hard to decompose retention reward to each item or a list of items. Thus traditional point-wise and list-wise models are not able to optimize retention. In this paper, we choose reinforcement learning methods to optimize the retention as they are designed to maximize the long-term performance. We formulate the problem as an infinite-horizon request-based Markov Decision Process, and our objective is to minimize the accumulated time interval of multiple sessions, which is equal to improving the app open frequency and user retention. However, current reinforcement learning algorithms can not be directly applied in this setting due to uncertainty, bias, and long delay time incurred by the properties of user retention. We propose a novel method, dubbed RLUR, to address the aforementioned challenges. Both offline and live experiments show that RLUR can significantly improve user retention. 
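The retention objective in the RLUR abstract above, minimizing the accumulated time interval between sessions over an infinite-horizon request-based MDP, can be made concrete with a toy reward definition; everything below (the session-log shape, the discounting) is an illustrative assumption, not the paper's formulation:

```python
# toy: turn session timestamps into per-transition RL rewards for retention
session_starts = [0.0, 5.0, 7.0, 30.0]        # hours at which the user returned

def retention_rewards(starts, scale=1.0):
    """Reward each session transition by the negative gap to the next return:
    shorter gaps (faster return) mean higher reward."""
    gaps = [b - a for a, b in zip(starts, starts[1:])]
    return [-scale * g for g in gaps]

rewards = retention_rewards(session_starts)    # [-5.0, -2.0, -23.0]
ret = 0.0
for r in reversed(rewards):                    # discounted return the policy maximizes
    ret = r + 0.99 * ret
print(rewards, round(ret, 2))
```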
RLUR has been fully launched in Kuaishou app for a long time, and achieves consistent performance improvement on user retention and DAU.|最近,短视频平台通过向用户推荐有趣的内容实现了用户的快速增长。推荐的目的是优化用户保持率,从而推动 DAU (每日活跃用户)的增长。保留是用户和系统进行多次交互后的长期反馈,很难将保留奖励分解为每个项目或一个项目列表。因此,传统的点模型和列表模型不能优化保留。在本文中,我们选择强化学习方法来优化保留,因为它们旨在最大限度地提高长期绩效。我们把这个问题表述为一个基于无限期请求的马可夫决策过程,我们的目标是最小化多个会话的累积时间间隔,这相当于提高应用程序的开放频率和用户保持率。然而,由于用户保持特性的不确定性、偏差和长时间延迟,现有的强化学习算法不能直接应用于这种设置。我们提出了一种新的方法,称为 RLUR,以解决上述挑战。离线和现场实验都表明,RLUR 可以显著提高用户保持率。RLUR 已经在 Kuaishou 应用程序中全面推出很长时间了,并且在用户保留和 DAU 方面取得了持续的性能改进。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Reinforcing+User+Retention+in+a+Billion+Scale+Short+Video+Recommender+System)|0| +|[Reinforcing User Retention in a Billion Scale Short Video Recommender System](https://doi.org/10.1145/3543873.3584640)|Qingpeng Cai, Shuchang Liu, Xueliang Wang, Tianyou Zuo, Wentao Xie, Bin Yang, Dong Zheng, Peng Jiang, Kun Gai|Kuaishou Technology, China; Unaffiliated, China|Recently, short video platforms have achieved rapid user growth by recommending interesting content to users. The objective of the recommendation is to optimize user retention, thereby driving the growth of DAU (Daily Active Users). Retention is a long-term feedback after multiple interactions of users and the system, and it is hard to decompose retention reward to each item or a list of items. Thus traditional point-wise and list-wise models are not able to optimize retention. In this paper, we choose reinforcement learning methods to optimize the retention as they are designed to maximize the long-term performance. We formulate the problem as an infinite-horizon request-based Markov Decision Process, and our objective is to minimize the accumulated time interval of multiple sessions, which is equal to improving the app open frequency and user retention. However, current reinforcement learning algorithms can not be directly applied in this setting due to uncertainty, bias, and long delay time incurred by the properties of user retention. We propose a novel method, dubbed RLUR, to address the aforementioned challenges. Both offline and live experiments show that RLUR can significantly improve user retention. RLUR has been fully launched in Kuaishou app for a long time, and achieves consistent performance improvement on user retention and DAU.|最近,短视频平台通过向用户推荐有趣的内容实现了用户的快速增长。推荐的目的是优化用户保持率,从而推动 DAU (每日活跃用户)的增长。保留是用户和系统进行多次交互后的长期反馈,很难将保留奖励分解为每个项目或一个项目列表。因此,传统的点模型和列表模型不能优化保留。在本文中,我们选择强化学习方法来优化保留,因为它们旨在最大限度地提高长期绩效。我们把这个问题表述为一个基于无限期请求的马可夫决策过程,我们的目标是最小化多个会话的累积时间间隔,这相当于提高应用程序的开放频率和用户保持率。然而,由于用户保持特性的不确定性、偏差和长时间延迟,现有的强化学习算法不能直接应用于这种设置。我们提出了一种新的方法,称为 RLUR,以解决上述挑战。离线和现场实验都表明,RLUR 可以显著提高用户保持率。RLUR 已经在 Kuaishou 应用程序中全面推出很长时间了,并且在用户保留和 DAU 方面取得了持续的性能改进。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Reinforcing+User+Retention+in+a+Billion+Scale+Short+Video+Recommender+System)|0| |[Jointly modeling products and resource pages for task-oriented recommendation](https://doi.org/10.1145/3543873.3584642)|Brendan Duncan, Surya Kallumadi, Taylor BergKirkpatrick, Julian J. McAuley|UC San Diego, USA; Lowe's Companies, Inc., USA|Modeling high-level user intent in recommender systems can improve performance, although it is often difficult to obtain a ground truth measure of this intent. In this paper, we investigate a novel way to obtain such an intent signal by leveraging resource pages associated with a particular task. 
We jointly model product interactions and resource page interactions to create a system which can recommend both products and resource pages to users. Our experiments consider the domain of home improvement product recommendation, where resource pages are DIY (do-it-yourself) project pages from Lowes.com. Each DIY page provides a list of tools, materials, and step-by-step instructions to complete a DIY project, such as building a deck, installing cabinets, and fixing a leaking pipe. We use this data as an indicator of the intended project, which is a natural high-level intent signal for home improvement shoppers. We then extend a state-of-the-art system to incorporate this new intent data, and show a significant improvement in the ability of the system to recommend products. We further demonstrate that our system can be used to successfully recommend DIY project pages to users. We have taken initial steps towards deploying our method for project recommendation in production on the Lowe’s website and for recommendations through marketing emails.|在推荐系统中建模高层次的用户意图可以提高性能,尽管通常很难获得这种意图的真实(ground-truth)度量。在本文中,我们研究了一种通过利用与特定任务相关联的资源页来获得这种意图信号的新方法。我们联合对产品交互和资源页交互进行建模,以创建一个可以向用户推荐产品和资源页的系统。我们的实验考虑了家装产品推荐领域,其中的资源页面是来自 Lowes.com 的 DIY (do-it-yourself)项目页面。每一个 DIY 页面都提供了一系列的工具、材料和一步一步的指导来完成一个 DIY 项目,例如建造一个甲板、安装橱柜和修复一个泄漏的管道。我们使用这些数据作为预期项目的指标,这对于家装购物者来说是一个自然的高层次的意图信号。然后,我们扩展了一个最先进的系统来合并这些新的意图数据,并显示系统推荐产品的能力有了显著的提高。我们进一步演示了我们的系统可以用来成功地向用户推荐 DIY 项目页面。我们已经迈出初步步骤,在 Lowe's 网站的生产环境中部署我们的项目推荐方法,并将其用于营销邮件中的推荐。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Jointly+modeling+products+and+resource+pages+for+task-oriented+recommendation)|0| -|[Meta-Generator Enhanced Multi-Domain Recommendation](https://doi.org/10.1145/3543873.3584652)|Yingyi Zhang, Xianneng Li, Yahe Yu, Jian Tang, Huanfang Deng, Junya Lu, Yeyin Zhang, Qiancheng Jiang, Yunsen Xian, Liqian Yu, Han Liu|Meituan, China; Meituan-Dianping Group, China; Dalian University of Technology, China|Large-scale e-commercial platforms usually contain multiple business fields, which require industrial algorithms to characterize user intents across multiple domains. Numerous efforts have been made in user multi-domain intent modeling to achieve state-of-the-art performance. However, existing methods mainly focus on the domains having rich user information, which makes implementation to domains with sparse or rare user behavior meet with mixed success. Hence, in this paper, we propose a novel method named Meta-generator enhanced multi-Domain model (MetaDomain) to address the above issue. MetaDomain mainly includes two steps, 1) users’ multi-domain intent representation and 2) users’ multi-domain intent fusion. Specifically, in users’ multi-domain intent representation, we use the gradient information from a domain intent extractor to train the domain intent meta-generator, where the domain intent extractor has the input of users’ sequence feature and domain meta-generator has the input of users’ basic feature, hence the capability of generating users’ intent with sparse behavior. Afterward, in users’ multi-domain intent fusion, a domain graph is used to represent the high-order multi-domain connectivity. Extensive experiments have been carried out under a real-world industrial platform named Meituan. Both offline and rigorous online A/B tests under the billion-level data scale demonstrate the superiority of the proposed MetaDomain method over the state-of-the-art baselines. 
Furthermore, compared with the method using multi-domain sequence features, MetaDomain can reduce the serving latency by 20%. Currently, MetaDomain has been deployed in Meituan, one of the largest worldwide Online-to-Offline (O2O) platforms.|大型电子商务平台通常包含多个业务领域,这需要工业算法来描述跨多个域的用户意图。在用户多领域意图建模方面做了大量工作,以实现最先进的性能。然而,现有的方法主要集中在具有丰富用户信息的域上,这使得对于用户行为稀疏或罕见的域的实现成败参半。为此,本文提出了一种新的元生成器增强型多域模型(MetaDomain)来解决上述问题。元域主要包括两个步骤: 1)用户的多域意图表示和2)用户的多域意图融合。具体来说,在用户的多领域意图表示中,我们利用领域意图提取器的梯度信息来训练领域意图元生成器,其中领域意图提取器输入用户的序列特征,领域元生成器输入用户的基本特征,从而具有生成稀疏行为的用户意图的能力。然后,在用户的多域意图融合中,使用一个域图来表示高阶多域连通性。在一个名为“美团”的真实工业平台下,人们进行了广泛的实验。在十亿级数据规模下的离线和严格的在线 A/B 测试都证明了提出的 MetaDomain 方法相对于最先进的基线的优越性。此外,与采用多域序列特征的方法相比,元域可以减少20% 的服务延迟。目前,MetaDomain 已经部署在全球最大的在线到离线(O2O)平台之一的美团上。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Meta-Generator+Enhanced+Multi-Domain+Recommendation)|0| +|[Meta-Generator Enhanced Multi-Domain Recommendation](https://doi.org/10.1145/3543873.3584652)|Yingyi Zhang, Xianneng Li, Yahe Yu, Jian Tang, Huanfang Deng, Junya Lu, Yeyin Zhang, Qiancheng Jiang, Yunsen Xian, Liqian Yu, Han Liu|Meituan-Dianping Group, China; Meituan, China; Dalian University of Technology, China|Large-scale e-commercial platforms usually contain multiple business fields, which require industrial algorithms to characterize user intents across multiple domains. Numerous efforts have been made in user multi-domain intent modeling to achieve state-of-the-art performance. However, existing methods mainly focus on the domains having rich user information, which makes implementation to domains with sparse or rare user behavior meet with mixed success. Hence, in this paper, we propose a novel method named Meta-generator enhanced multi-Domain model (MetaDomain) to address the above issue. MetaDomain mainly includes two steps, 1) users’ multi-domain intent representation and 2) users’ multi-domain intent fusion. Specifically, in users’ multi-domain intent representation, we use the gradient information from a domain intent extractor to train the domain intent meta-generator, where the domain intent extractor has the input of users’ sequence feature and domain meta-generator has the input of users’ basic feature, hence the capability of generating users’ intent with sparse behavior. Afterward, in users’ multi-domain intent fusion, a domain graph is used to represent the high-order multi-domain connectivity. Extensive experiments have been carried out under a real-world industrial platform named Meituan. Both offline and rigorous online A/B tests under the billion-level data scale demonstrate the superiority of the proposed MetaDomain method over the state-of-the-art baselines. Furthermore, compared with the method using multi-domain sequence features, MetaDomain can reduce the serving latency by 20%. 
Currently, MetaDomain has been deployed in Meituan one of the largest worldwide Online-to-Offline(O2O) platforms.|大型电子商务平台通常包含多个业务字段,这需要工业算法来描述跨多个域的用户意图。在用户多领域意图建模方面做了大量工作,以实现最先进的性能。然而,现有的方法主要集中在具有丰富用户信息的域上,这使得对于用户行为稀疏或罕见的域的实现成败参半。为此,本文提出了一种新的元生成器增强型多域模型(MetaDomain)来解决上述问题。元域主要包括两个步骤: 1)用户的多域意图表示和2)用户的多域意图融合。具体来说,在用户的多领域意图表示中,我们利用领域意图提取器的梯度信息来训练领域意图元生成器,其中领域意图提取器输入用户的序列特征,领域元生成器输入用户的基本特征,从而具有生成稀疏行为的用户意图的能力。然后,在用户的多域意图融合中,使用一个域图来表示高阶多域连通性。在一个名为“美团”的真实工业平台下,人们进行了广泛的实验。在十亿级数据规模下的离线和严格的在线 A/B 测试都证明了提出的 MetaDomain 方法相对于最先进的基线的优越性。此外,与采用多域序列特征的方法相比,元域可以减少20% 的服务延迟。目前,MetaDomain 已经部署在全球最大的在线到离线(o2O)平台之一的美团上。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Meta-Generator+Enhanced+Multi-Domain+Recommendation)|0| |[Integrated Ranking for News Feed with Reinforcement Learning](https://doi.org/10.1145/3543873.3584651)|Menghui Zhu, Wei Xia, Weiwen Liu, Yifan Liu, Ruiming Tang, Weinan Zhang|Shanghai Jiao Tong University, China; Huawei Noah?s Ark Lab, China|With the development of recommender systems, it becomes an increasingly common need to mix multiple item sequences from different sources. Therefore, the integrated ranking stage is proposed to be responsible for this task with re-ranking models. However, existing methods ignore the relation between the sequences, thus resulting in local optimum over the interaction session. To resolve this challenge, in this paper, we propose a new model named NFIRank (News Feed Integrated Ranking with reinforcement learning) and formulate the whole interaction session as a MDP (Markov Decision Process). Sufficient offline experiments are provided to verify the effectiveness of our model. In addition, we deployed our model on Huawei Browser and gained 1.58% improvements in CTR compared with the baseline in online A/B test. Code will be available at https://gitee.com/mindspore/models/tree/master/research/recommend/NFIRank.|随着推荐系统的发展,混合来自不同来源的多个项目序列的需求变得越来越普遍。因此,提出综合排序阶段负责这一任务的重新排序模型。然而,现有的方法忽略了序列之间的关系,从而导致局部最优的交互会话。为了解决这个问题,在本文中,我们提出了一个新的模型 NFIRank (带强化学习的新闻源综合排名) ,并将整个交互会话表示为一个 MDP (马可夫决策过程)。通过离线实验验证了模型的有效性。此外,我们在华为浏览器上部署了我们的模型,与在线 A/B 测试的基线相比,点击率提高了1.58% 。密码将在 https://gitee.com/mindspore/models/tree/master/research/recommend/nfirank 公布。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Integrated+Ranking+for+News+Feed+with+Reinforcement+Learning)|0| |[Measuring e-Commerce Metric Changes in Online Experiments](https://doi.org/10.1145/3543873.3584654)|C. H. Bryan Liu, Emma J. McCoy|ASOS.com, United Kingdom and Imperial College London, United Kingdom; London School of Economics and Political Science, United Kingdom|Digital technology organizations routinely use online experiments (e.g. A/B tests) to guide their product and business decisions. In e-commerce, we often measure changes to transaction- or item-based business metrics such as Average Basket Value (ABV), Average Basket Size (ABS), and Average Selling Price (ASP); yet it remains a common pitfall to ignore the dependency between the value/size of transactions/items during experiment design and analysis. We present empirical evidence on such dependency, its impact on measurement uncertainty, and practical implications on A/B test outcomes if left unmitigated. By making the evidence available, we hope to drive awareness of the pitfall among experimenters in e-commerce and hence encourage the adoption of established mitigation approaches. 
We also share lessons learned when incorporating selected mitigation approaches into our experimentation analysis platform currently in production.|数字技术组织经常使用在线实验(例如 A/B 测试)来指导他们的产品和商业决策。在电子商务中,我们经常衡量基于交易或项目的业务指标的变化,如平均篮子价值(ABV)、平均篮子大小(ABS)和平均销售价格(ASP) ; 然而,在实验设计和分析过程中忽视交易/项目的价值/大小之间的依赖性仍然是一个常见的陷阱。我们介绍了这种依赖性的经验证明,它对测量不确定性的影响,以及如果不加以缓解的话对 A/B 测试结果的实际影响。通过提供证据,我们希望提高电子商务实验者对这一陷阱的认识,从而鼓励采用既定的缓解办法。我们还分享了将选定的缓解方法纳入我们目前正在生产的实验分析平台的经验教训。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Measuring+e-Commerce+Metric+Changes+in+Online+Experiments)|0| |[Improve retrieval of regional titles in streaming services with dense retrieval](https://doi.org/10.1145/3543873.3587619)|Bhargav Upadhyay, Tejas Khairnar, Anup Kotalwar|Amazon, India|Customers search for movie and series titles released across the world on streaming services like primevideo.com (PV), netflix.com (Netflix). In non-English speaking countries like India, Nepal and many others, the regional titles are transliterated from native language to English and are being searched in English. Given that there can be multiple transliterations possible for almost all the titles, searching for a regional title can be a very frustrating customer experience if these nuances are not handled correctly by the search system. Typing errors make the problem even more challenging. Streaming services uses spell correction and auto-suggestions/auto-complete features to address this issue up to certain extent. Auto-suggest fails when user searches keywords not in scope of the auto-suggest. Spell correction is effective at correcting common typing errors but as these titles doesn’t follow strict grammar rules and new titles constantly added to the catalog, spell correction have limited success. With recent progress in deep learning (DL), embedding vectors based dense retrieval is being used extensively to retrieve semantically relevant documents for a given query. In this work, we have used dense retrieval to address the noise introduced by transliteration variations and typing errors to improve retrieval of regional media titles. In the absent of any relevant dataset to test our hypothesis, we created a new dataset of 40K query title pairs from PV search logs. We also created a baseline by bench-marking PV’s performance on test data. We present an extensive study on the impact of 1. pre-training, 2. data augmentation, 3. positive to negative sample ratio, and 4. choice of loss function on retrieval performance. 
Our best model has shown 51.24% improvement in [email protected] over PV baseline.|客户可以通过流媒体服务搜索世界各地发行的电影和剧集,比如 primeideo.com (PV)、 Netflix.com (Netflix)。在非英语国家,如印度、尼泊尔和许多其他国家,地区标题被从母语音译成英语,并用英语进行搜索。鉴于几乎所有标题都可能有多种音译,如果搜索系统不能正确处理这些细微差别,那么搜索地区标题可能是一种非常令人沮丧的客户体验。键入错误使问题更具挑战性。流媒体服务使用拼写修正和自动建议/自动完成功能在一定程度上解决了这个问题。当用户搜索不在自动建议范围内的关键字时,自动建议失败。拼写纠正在纠正常见的打字错误方面是有效的,但是由于这些标题没有遵循严格的语法规则,而且新标题不断地添加到目录中,拼写纠正的成功有限。随着深度学习(DL)技术的发展,基于嵌入向量的密集检索技术被广泛应用于给定查询的语义相关文档检索。在本研究中,我们利用密集检索来解决音译变异和打字错误所引起的噪音问题,以提高地区性媒体标题的检索效率。在缺乏相关数据集来检验我们的假设的情况下,我们从 PV 搜索日志中创建了一个40K 查询标题对的新数据集。我们还通过在测试数据上标记 PV 的性能来创建基线。我们提出了一个广泛的研究影响1。训练前2分钟。数据增强,3。正负样本比率,以及4。损失函数对检索性能的选择。我们最好的模型已经显示了51.24% 的改善[电子邮件保护]超过 PV 基线。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Improve+retrieval+of+regional+titles+in+streaming+services+with+dense+retrieval)|0| |[hp-frac: An index to determine Awarded Researchers](https://doi.org/10.1145/3543873.3587597)|Aashay Singhal, Kamalakar Karlapalem|International Institute of Information Technology, Hyderabad, India|In order to advance academic research, it is important to assess and evaluate the academic influence of researchers and the findings they produce. Citation metrics are universally used methods to evaluate researchers. Amongst the several variations of citation metrics, the h-index proposed by Hirsch has become the leading measure. Recent work shows that h-index is not an effective measure to determine scientific impact - due to changing authorship patterns. This can be mitigated by using h-index of a paper to compute h-index of an author. We show that using fractional allocation of h-index gives better results. In this work, we reapply two indices based on the h-index of a single paper. The indices are referred to as: hp-index and hp-frac-index. We run large-scale experiments in three different fields with about a million publications and 3,000 authors. Our experiments show that hp-frac-index provides a unique ranking when compared to h-index. It also performs better than h-index in providing higher ranks to the awarded researcher.|为了推进学术研究,评估和评估研究人员的学术影响力和他们的发现是非常重要的。引文指标是评价科研人员的普遍方法。在引文指标的众多变化中,赫希提出的 h 指标已经成为引文指标的主导指标。最近的研究表明,由于作者模式的改变,h 指数不是确定科学影响的有效方法。这可以通过使用论文的 h- 索引来计算作者的 h- 索引来减轻。我们表明,使用分数分配的 h 指标给出了更好的结果。在这项工作中,我们重新应用两个指标的基础上的单一文件的 h-索引。这些指数被称为: hp-index 和 hp-frac-index。我们在三个不同的领域进行了大规模的实验,发表了大约一百万篇论文,有3000名作者。我们的实验表明,与 h 指数相比,hp-frac 指数提供了一个唯一的排名。在为获奖研究人员提供更高的排名方面,它也比 h-index 表现得更好。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=hp-frac:+An+index+to+determine+Awarded+Researchers)|0| -|[Application of an ontology for model cards to generate computable artifacts for linking machine learning information from biomedical research](https://doi.org/10.1145/3543873.3587601)|Muhammad Amith, Licong Cui, Kirk Roberts, Cui Tao|School of Biomedical Informatics, The University of Texas Health Science Center at Houston, USA; School of Biomedical Informatics, University of Texas Health Science Center at Houston, USA; Department of Information Science, University of North Texas, USA|Model card reports provide a transparent description of machine learning models which includes information about their evaluation, limitations, intended use, etc. Federal health agencies have expressed an interest in model cards report for research studies using machine-learning based AI. Previously, we have developed an ontology model for model card reports to structure and formalize these reports. 
In this paper, we demonstrate a Java-based library (OWL API, FaCT++) that leverages our ontology to publish computable model card reports. We discuss future directions and other use cases that highlight applicability and feasibility of ontology-driven systems to support FAIR challenges.|模型卡片报告提供了机器学习模型的透明描述,包括它们的评估、限制、预期用途等信息。联邦卫生机构对使用基于机器学习的人工智能的研究报告模型卡表示了兴趣。在此之前,我们已经开发了一个模型卡片报告的本体模型来构造和形式化这些报告。在本文中,我们演示了一个基于 Java 的库(OWL API,FaCT + +) ,它利用我们的本体来发布可计算模型卡报告。我们讨论未来的方向和其他用例,突出本体驱动的系统的适用性和可行性,以支持 FAIR 的挑战。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Application+of+an+ontology+for+model+cards+to+generate+computable+artifacts+for+linking+machine+learning+information+from+biomedical+research)|0|
+|[Application of an ontology for model cards to generate computable artifacts for linking machine learning information from biomedical research](https://doi.org/10.1145/3543873.3587601)|Muhammad Amith, Licong Cui, Kirk Roberts, Cui Tao|Department of Information Science, University of North Texas, USA; School of Biomedical Informatics, The University of Texas Health Science Center at Houston, USA; School of Biomedical Informatics, University of Texas Health Science Center at Houston, USA|Model card reports provide a transparent description of machine learning models which includes information about their evaluation, limitations, intended use, etc. Federal health agencies have expressed an interest in model card reports for research studies using machine-learning-based AI. Previously, we have developed an ontology model for model card reports to structure and formalize these reports. In this paper, we demonstrate a Java-based library (OWL API, FaCT++) that leverages our ontology to publish computable model card reports. We discuss future directions and other use cases that highlight applicability and feasibility of ontology-driven systems to support FAIR challenges.|模型卡片报告提供了机器学习模型的透明描述,包括它们的评估、限制、预期用途等信息。联邦卫生机构对使用基于机器学习的人工智能的研究报告模型卡表示了兴趣。在此之前,我们已经开发了一个模型卡片报告的本体模型来构造和形式化这些报告。在本文中,我们演示了一个基于 Java 的库(OWL API,FaCT + +) ,它利用我们的本体来发布可计算模型卡报告。我们讨论未来的方向和其他用例,突出本体驱动的系统的适用性和可行性,以支持 FAIR 的挑战。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Application+of+an+ontology+for+model+cards+to+generate+computable+artifacts+for+linking+machine+learning+information+from+biomedical+research)|0|
|[Stance Inference in Twitter through Graph Convolutional Collaborative Filtering Networks with Minimal Supervision](https://doi.org/10.1145/3543873.3587640)|Zhiwei Zhou, Erick Elejalde|Leibniz Universität Hannover, L3S Research Center, Germany|Social Media (SM) has become a stage for people to share thoughts, emotions, opinions, and almost every other aspect of their daily lives. This abundance of human interaction makes SM particularly attractive for social sensing. Especially during polarizing events such as political elections or referendums, users post information and encourage others to support their side, using symbols such as hashtags to represent their attitudes. However, many users choose not to attach hashtags to their messages, use a different language, or show their position only indirectly. Thus, automatically identifying their opinions becomes a more challenging task. To uncover these implicit perspectives, we propose a collaborative filtering model based on Graph Convolutional Networks that exploits the textual content in messages and the rich connections between users and topics. Moreover, our approach only requires a small annotation effort compared to state-of-the-art solutions. 
Nevertheless, the proposed model achieves competitive performance in predicting individuals' stances. We analyze users' attitudes ahead of two constitutional referendums in Chile in 2020 and 2022. Using two large Twitter datasets, our model achieves improvements of 3.4% in recall and 3.6% in accuracy over the baselines.|社交媒体(SM)已经成为人们分享思想、情感、观点以及日常生活中几乎所有其他方面的一个舞台。这种丰富的人际互动使 SM 对社会感知特别有吸引力。特别是在政治选举或公民投票等两极分化的活动中,用户发布信息并鼓励其他人支持他们的立场,使用标签等符号来表达他们的态度。但是,许多用户选择不在消息中附加标签,不使用其他语言,或者只间接显示自己的位置。因此,自动识别他们的观点成为一项更具挑战性的任务。为了揭示这些隐含的观点,我们提出了一个基于图形卷积网络的协同过滤模型,该模型利用消息中的文本内容以及用户和主题之间的丰富联系。此外,与最先进的解决方案相比,我们的方法只需要很少的注释工作。然而,所提出的模型在预测个体的立场方面达到了竞争性的表现。我们分析了2020年和2022年智利两次宪法公投前用户的态度。使用两个大型的 Twitter 数据集,我们的模型比基线数据集提高了3.4% 的召回率和3.6% 的准确率。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Stance+Inference+in+Twitter+through+Graph+Convolutional+Collaborative+Filtering+Networks+with+Minimal+Supervision)|0| -|[Retrieving false claims on Twitter during the Russia-Ukraine conflict](https://doi.org/10.1145/3543873.3587571)|Valerio La Gatta, Chiyu Wei, Luca Luceri, Francesco Pierri, Emilio Ferrara|Information Sciences Institute, University of Southern California, USA; Information Sciences Institute, University of Southern California, USA and University of Naples Federico II, Italy; Politecnico di Milano, Italy and Information Sciences Institute, University of Southern California, USA|Nowadays, false and unverified information on social media sway individuals' perceptions during major geo-political events and threaten the quality of the whole digital information ecosystem. Since the Russian invasion of Ukraine, several fact-checking organizations have been actively involved in verifying stories related to the conflict that circulated online. In this paper, we leverage a public repository of fact-checked claims to build a methodological framework for automatically identifying false and unsubstantiated claims spreading on Twitter in February 2022. Our framework consists of two sequential models: First, the claim detection model identifies whether tweets incorporate a (false) claim among those considered in our collection. Then, the claim retrieval model matches the tweets with fact-checked information by ranking verified claims according to their relevance with the input tweet. Both models are based on pre-trained language models and fine-tuned to perform a text classification task and an information retrieval task, respectively. In particular, to validate the effectiveness of our methodology, we consider 83 verified false claims that spread on Twitter during the first week of the invasion, and manually annotate 5,872 tweets according to the claim(s) they report. Our experiments show that our proposed methodology outperforms standard baselines for both claim detection and claim retrieval. 
Overall, our results highlight how social media providers could effectively leverage semi-automated approaches to identify, track, and eventually moderate false information that spreads on their platforms.|如今,社交媒体上的虚假和未经证实的信息在重大地缘政治事件中左右着人们的看法,威胁着整个数字信息生态系统的质量。自从俄罗斯入侵乌克兰以来,几个事实核查组织一直积极参与核实在网上流传的与冲突有关的故事。在本文中,我们利用一个事实核查索赔的公共数据库,建立一个方法框架,自动识别2022年2月在 Twitter 上传播的虚假和未经证实的索赔。我们的框架由两个顺序模型组成: 首先,索赔检测模型确定 tweet 是否在我们的集合中考虑的索赔中包含(虚假)索赔。然后,索赔检索模型根据索赔与输入索赔的相关性对索赔进行排序,从而将索赔与事实核查信息进行匹配。这两种模型都是基于预先训练好的语言模型,经过微调后分别执行文本分类任务和信息检索分类任务。特别是,为了验证我们的方法的有效性,我们考虑了在入侵的第一周在 Twitter 上传播的83个经过验证的虚假声明,并根据他们报告的声明手动注释了5,872条推文。我们的实验表明,我们提出的方法在索赔检测和索赔检索方面都优于标准基线。总的来说,我们的研究结果强调了社交媒体提供商如何有效地利用半自动化方法来识别、跟踪并最终控制在他们的平台上传播的虚假信息。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Retrieving+false+claims+on+Twitter+during+the+Russia-Ukraine+conflict)|0| +|[Retrieving false claims on Twitter during the Russia-Ukraine conflict](https://doi.org/10.1145/3543873.3587571)|Valerio La Gatta, Chiyu Wei, Luca Luceri, Francesco Pierri, Emilio Ferrara|Politecnico di Milano, Italy and Information Sciences Institute, University of Southern California, USA; Information Sciences Institute, University of Southern California, USA; Information Sciences Institute, University of Southern California, USA and University of Naples Federico II, Italy|Nowadays, false and unverified information on social media sway individuals' perceptions during major geo-political events and threaten the quality of the whole digital information ecosystem. Since the Russian invasion of Ukraine, several fact-checking organizations have been actively involved in verifying stories related to the conflict that circulated online. In this paper, we leverage a public repository of fact-checked claims to build a methodological framework for automatically identifying false and unsubstantiated claims spreading on Twitter in February 2022. Our framework consists of two sequential models: First, the claim detection model identifies whether tweets incorporate a (false) claim among those considered in our collection. Then, the claim retrieval model matches the tweets with fact-checked information by ranking verified claims according to their relevance with the input tweet. Both models are based on pre-trained language models and fine-tuned to perform a text classification task and an information retrieval task, respectively. In particular, to validate the effectiveness of our methodology, we consider 83 verified false claims that spread on Twitter during the first week of the invasion, and manually annotate 5,872 tweets according to the claim(s) they report. Our experiments show that our proposed methodology outperforms standard baselines for both claim detection and claim retrieval. 
Overall, our results highlight how social media providers could effectively leverage semi-automated approaches to identify, track, and eventually moderate false information that spreads on their platforms.|如今,社交媒体上的虚假和未经证实的信息在重大地缘政治事件中左右着人们的看法,威胁着整个数字信息生态系统的质量。自从俄罗斯入侵乌克兰以来,几个事实核查组织一直积极参与核实在网上流传的与冲突有关的故事。在本文中,我们利用一个事实核查索赔的公共数据库,建立一个方法框架,自动识别2022年2月在 Twitter 上传播的虚假和未经证实的索赔。我们的框架由两个顺序模型组成: 首先,索赔检测模型确定 tweet 是否在我们的集合中考虑的索赔中包含(虚假)索赔。然后,索赔检索模型根据索赔与输入索赔的相关性对索赔进行排序,从而将索赔与事实核查信息进行匹配。这两种模型都是基于预先训练好的语言模型,经过微调后分别执行文本分类任务和信息检索分类任务。特别是,为了验证我们的方法的有效性,我们考虑了在入侵的第一周在 Twitter 上传播的83个经过验证的虚假声明,并根据他们报告的声明手动注释了5,872条推文。我们的实验表明,我们提出的方法在索赔检测和索赔检索方面都优于标准基线。总的来说,我们的研究结果强调了社交媒体提供商如何有效地利用半自动化方法来识别、跟踪并最终控制在他们的平台上传播的虚假信息。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Retrieving+false+claims+on+Twitter+during+the+Russia-Ukraine+conflict)|0| |[Enhancing Hierarchy-Aware Graph Networks with Deep Dual Clustering for Session-based Recommendation](https://doi.org/10.1145/3543507.3583247)|Jiajie Su, Chaochao Chen, Weiming Liu, Fei Wu, Xiaolin Zheng, Haoming Lyu|Zhejiang University, China|Session-based Recommendation aims at predicting the next interacted item based on short anonymous behavior sessions. However, existing solutions neglect to model two inherent properties of sequential representing distributions, i.e., hierarchy structures resulted from item popularity and collaborations existing in both intra- and inter-session. Tackling with these two factors at the same time is challenging. On the one hand, traditional Euclidean space utilized in previous studies fails to capture hierarchy structures due to a restricted representation ability. On the other hand, the intuitive apply of hyperbolic geometry could extract hierarchical patterns but more emphasis on degree distribution weakens intra- and inter-session collaborations. To address the challenges, we propose a Hierarchy-Aware Dual Clustering Graph Network (HADCG) model for session-based recommendation. Towards the first challenge, we design the hierarchy-aware graph modeling module which converts sessions into hyperbolic session graphs, adopting hyperbolic geometry in propagation and attention mechanism so as to integrate chronological and hierarchical information. As for the second challenge, we introduce the deep dual clustering module which develops a two-level clustering strategy, i.e., information regularizer for intra-session clustering and contrastive learner for inter-session clustering, to enhance hyperbolic representation learning from collaborative perspectives and further promote recommendation performance. 
Extensive experiments on three real-world datasets demonstrate the effectiveness of the proposed HADCG.|基于会话的推荐是基于短的匿名行为会话来预测下一个交互项。然而,现有的解决方案忽视了顺序表示分布的两个固有属性,即由于项目流行和会话内和会话间存在的协作而产生的层次结构。同时处理这两个因素是具有挑战性的。一方面,传统的欧氏空间由于表示能力的限制,无法捕捉层次结构;。另一方面,双曲几何的直观应用可以提取等级模式,但更强调学位分配会削弱会话内部和会话间的协作。针对这一挑战,我们提出了一种基于会话的层次感知双聚类图网络(HADCG)模型。针对第一个挑战,我们设计了层次感知的图形建模模块,它将会话转换为双曲会话图,在传播和注意机制中采用双曲几何,以便整合时间和层次信息。针对第二个挑战,我们引入了深度双聚类模型,提出了一种两级聚类策略,即会话内聚类的信息调整器和会话间聚类的对比学习器,从协作的角度提高双曲表示学习,进一步提高推荐性能。在三个实际数据集上的大量实验证明了所提出的 HADCG 算法的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Enhancing+Hierarchy-Aware+Graph+Networks+with+Deep+Dual+Clustering+for+Session-based+Recommendation)|0| -|[Intra and Inter Domain HyperGraph Convolutional Network for Cross-Domain Recommendation](https://doi.org/10.1145/3543507.3583402)|Zhongxuan Han, Xiaolin Zheng, Chaochao Chen, Wenjie Cheng, Yang Yao|Zhejiang Lab, China; Zhejiang University, China|Cross-Domain Recommendation (CDR) aims to solve the data sparsity problem by integrating the strengths of different domains. Though researchers have proposed various CDR methods to effectively transfer knowledge across domains, they fail to address the following key issues, i.e., (1) they cannot model high-order correlations among users and items in every single domain to obtain more accurate representations; (2) they cannot model the correlations among items across different domains. To tackle the above issues, we propose a novel Intra and Inter Domain HyperGraph Convolutional Network (II-HGCN) framework, which includes two main layers in the modeling process, i.e., the intra-domain layer and the inter-domain layer. In the intra-domain layer, we design a user hypergraph and an item hypergraph to model high-order correlations inside every single domain. Thus we can address the data sparsity problem better and learn high-quality representations of users and items. In the inter-domain layer, we propose an inter-domain hypergraph structure to explore correlations among items from different domains based on their interactions with common users. Therefore we can not only transfer the knowledge of users but also combine embeddings of items across domains. Comprehensive experiments on three widely used benchmark datasets demonstrate that II-HGCN outperforms other state-of-the-art methods, especially when datasets are extremely sparse.|跨域推荐(CDR)旨在通过整合不同域的优势来解决数据稀疏性问题。尽管研究人员已经提出了各种 CDR 方法来有效地跨领域传递知识,但他们未能解决以下关键问题,即: (1)他们不能模拟每个领域中用户和项目之间的高阶相关性以获得更准确的表示; (2)他们不能模拟不同领域中项目之间的相关性。为了解决上述问题,我们提出了一种新的域内和域间超图卷积网络(II-HGCN)框架,它包括建模过程中的两个主要层次,即域内层和域间层。在域内层,我们设计了一个用户超图和一个项目超图来模拟每个域内的高阶相关性。因此,我们可以更好地解决数据稀疏问题,并学习用户和项目的高质量表示。在域间层,我们提出了一个域间超图结构来探索来自不同领域的项目之间的相关性,基于它们与公共用户的交互。因此,不仅可以实现用户知识的传递,还可以实现跨域嵌入的组合。在三个广泛使用的基准数据集上的综合实验表明,II-HGCN 优于其他最先进的方法,特别是在数据集极其稀疏的情况下。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Intra+and+Inter+Domain+HyperGraph+Convolutional+Network+for+Cross-Domain+Recommendation)|0| -|[Generating Counterfactual Hard Negative Samples for Graph Contrastive Learning](https://doi.org/10.1145/3543507.3583499)|Haoran Yang, Hongxu Chen, Sixiao Zhang, Xiangguo Sun, Qian Li, Xiangyu Zhao, Guandong Xu|Curtin University, Australia; The Chinese University of Hong Kong, Hong Kong; University of Technology Sydney, Australia; City University of Hong Kong, Hong Kong|Graph contrastive learning has emerged as a powerful tool for unsupervised graph representation learning. 
The key to the success of graph contrastive learning is to acquire high-quality positive and negative samples as contrasting pairs for the purpose of learning underlying structural semantics of the input graph. Recent works usually sample negative samples from the same training batch with the positive samples, or from an external irrelevant graph. However, a significant limitation lies in such strategies, which is the unavoidable problem of sampling false negative samples. In this paper, we propose a novel method to utilize \textbf{C}ounterfactual mechanism to generate artificial hard negative samples for \textbf{G}raph \textbf{C}ontrastive learning, namely \textbf{CGC}, which has a different perspective compared to those sampling-based strategies. We utilize counterfactual mechanism to produce hard negative samples, which ensures that the generated samples are similar to, but have labels that different from the positive sample. The proposed method achieves satisfying results on several datasets compared to some traditional unsupervised graph learning methods and some SOTA graph contrastive learning methods. We also conduct some supplementary experiments to give an extensive illustration of the proposed method, including the performances of CGC with different hard negative samples and evaluations for hard negative samples generated with different similarity measurements.|图形对比学习已经成为无监督图形表示学习的有力工具。图形对比学习成功的关键是获取高质量的正负样本作为对比对,从而学习输入图的结构语义。最近的作品通常从同一训练批次的正样本中抽取负样本,或者从一个外部不相关的图中抽取负样本。然而,这种策略存在一个显著的局限性,这就是不可避免的采样假阴性样本的问题。本文提出了一种新的利用 textbf { C }反事实机制生成 textbf { G } raph textbf { C }对比学习人工硬负样本的方法,即 textbf { CGC }。我们利用反事实机制生成硬负样本,确保所生成的样本与正样本相似,但有不同于正样本的标签。与传统的无监督图形学习方法和 SOTA 图形对比学习方法相比,该方法在多个数据集上取得了令人满意的效果。我们还进行了一些补充实验,对所提出的方法进行了广泛的说明,包括对不同硬负样本的 CGC 性能和对不同相似度测量产生的硬负样本的评价。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Generating+Counterfactual+Hard+Negative+Samples+for+Graph+Contrastive+Learning)|0| +|[Intra and Inter Domain HyperGraph Convolutional Network for Cross-Domain Recommendation](https://doi.org/10.1145/3543507.3583402)|Zhongxuan Han, Xiaolin Zheng, Chaochao Chen, Wenjie Cheng, Yang Yao|Zhejiang University, China; Zhejiang Lab, China|Cross-Domain Recommendation (CDR) aims to solve the data sparsity problem by integrating the strengths of different domains. Though researchers have proposed various CDR methods to effectively transfer knowledge across domains, they fail to address the following key issues, i.e., (1) they cannot model high-order correlations among users and items in every single domain to obtain more accurate representations; (2) they cannot model the correlations among items across different domains. To tackle the above issues, we propose a novel Intra and Inter Domain HyperGraph Convolutional Network (II-HGCN) framework, which includes two main layers in the modeling process, i.e., the intra-domain layer and the inter-domain layer. In the intra-domain layer, we design a user hypergraph and an item hypergraph to model high-order correlations inside every single domain. Thus we can address the data sparsity problem better and learn high-quality representations of users and items. In the inter-domain layer, we propose an inter-domain hypergraph structure to explore correlations among items from different domains based on their interactions with common users. Therefore we can not only transfer the knowledge of users but also combine embeddings of items across domains. 
Comprehensive experiments on three widely used benchmark datasets demonstrate that II-HGCN outperforms other state-of-the-art methods, especially when datasets are extremely sparse.|跨域推荐(CDR)旨在通过整合不同域的优势来解决数据稀疏性问题。尽管研究人员已经提出了各种 CDR 方法来有效地跨领域传递知识,但他们未能解决以下关键问题,即: (1)他们不能模拟每个领域中用户和项目之间的高阶相关性以获得更准确的表示; (2)他们不能模拟不同领域中项目之间的相关性。为了解决上述问题,我们提出了一种新的域内和域间超图卷积网络(II-HGCN)框架,它包括建模过程中的两个主要层次,即域内层和域间层。在域内层,我们设计了一个用户超图和一个项目超图来模拟每个域内的高阶相关性。因此,我们可以更好地解决数据稀疏问题,并学习用户和项目的高质量表示。在域间层,我们提出了一个域间超图结构来探索来自不同领域的项目之间的相关性,基于它们与公共用户的交互。因此,不仅可以实现用户知识的传递,还可以实现跨域嵌入的组合。在三个广泛使用的基准数据集上的综合实验表明,II-HGCN 优于其他最先进的方法,特别是在数据集极其稀疏的情况下。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Intra+and+Inter+Domain+HyperGraph+Convolutional+Network+for+Cross-Domain+Recommendation)|0|
+|[Generating Counterfactual Hard Negative Samples for Graph Contrastive Learning](https://doi.org/10.1145/3543507.3583499)|Haoran Yang, Hongxu Chen, Sixiao Zhang, Xiangguo Sun, Qian Li, Xiangyu Zhao, Guandong Xu|University of Technology Sydney, Australia; The Chinese University of Hong Kong, Hong Kong; City University of Hong Kong, Hong Kong; Curtin University, Australia|Graph contrastive learning has emerged as a powerful tool for unsupervised graph representation learning. The key to the success of graph contrastive learning is to acquire high-quality positive and negative samples as contrasting pairs for the purpose of learning underlying structural semantics of the input graph. Recent works usually sample negative samples from the same training batch with the positive samples, or from an external irrelevant graph. However, a significant limitation lies in such strategies, which is the unavoidable problem of sampling false negative samples. In this paper, we propose a novel method to utilize the **C**ounterfactual mechanism to generate artificial hard negative samples for **G**raph **C**ontrastive learning, namely **CGC**, which has a different perspective compared to those sampling-based strategies. We utilize the counterfactual mechanism to produce hard negative samples, which ensures that the generated samples are similar to, but have labels that differ from, the positive sample. The proposed method achieves satisfying results on several datasets compared to some traditional unsupervised graph learning methods and some SOTA graph contrastive learning methods. We also conduct some supplementary experiments to give an extensive illustration of the proposed method, including the performances of CGC with different hard negative samples and evaluations for hard negative samples generated with different similarity measurements.|图形对比学习已经成为无监督图形表示学习的有力工具。图形对比学习成功的关键是获取高质量的正负样本作为对比对,从而学习输入图的结构语义。最近的作品通常从同一训练批次的正样本中抽取负样本,或者从一个外部不相关的图中抽取负样本。然而,这种策略存在一个显著的局限性,这就是不可避免的采样假阴性样本的问题。本文提出了一种新的利用反事实机制为图对比学习生成人工硬负样本的方法,即 CGC。我们利用反事实机制生成硬负样本,确保所生成的样本与正样本相似,但有不同于正样本的标签。与传统的无监督图形学习方法和 SOTA 图形对比学习方法相比,该方法在多个数据集上取得了令人满意的效果。我们还进行了一些补充实验,对所提出的方法进行了广泛的说明,包括对不同硬负样本的 CGC 性能和对不同相似度测量产生的硬负样本的评价。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Generating+Counterfactual+Hard+Negative+Samples+for+Graph+Contrastive+Learning)|0|
|[Toward Degree Bias in Embedding-Based Knowledge Graph Completion](https://doi.org/10.1145/3543507.3583544)|Harry Shomer, Wei Jin, Wentao Wang, Jiliang Tang|Computer Science, Michigan State University, USA|A fundamental task for knowledge graphs (KGs) is knowledge graph completion (KGC). 
It aims to predict unseen edges by learning representations for all the entities and relations in a KG. A common concern when learning representations on traditional graphs is degree bias. It can affect graph algorithms by learning poor representations for lower-degree nodes, often leading to low performance on such nodes. However, there has been limited research on whether there exists degree bias for embedding-based KGC and how such bias affects the performance of KGC. In this paper, we validate the existence of degree bias in embedding-based KGC and identify the key factor to degree bias. We then introduce a novel data augmentation method, KG-Mixup, to generate synthetic triples to mitigate such bias. Extensive experiments have demonstrated that our method can improve various embedding-based KGC methods and outperform other methods tackling the bias problem on multiple benchmark datasets.|知识图的一个基本任务是知识图的完成。它的目的是通过学习 KG 中所有实体和关系的表示来预测看不见的边。学习传统图表示的一个常见问题是度偏差。它通过学习低度节点的差表示来影响图算法,经常导致低度节点的性能下降。然而,关于嵌入式 KGC 是否存在程度偏差以及这种偏差如何影响 KGC 性能的研究还很有限。在本文中,我们验证了基于嵌入的 KGC 中存在程度偏差,并找出了影响程度偏差的关键因素。然后,我们引入一种新的数据增强方法,KG 混合,产生合成三元组,以减轻这种偏差。大量的实验表明,该方法可以改进各种基于嵌入的 KGC 方法,并优于其他处理多基准数据集偏差问题的方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Toward+Degree+Bias+in+Embedding-Based+Knowledge+Graph+Completion)|0| -|[LINet: A Location and Intention-Aware Neural Network for Hotel Group Recommendation](https://doi.org/10.1145/3543507.3583202)|Ruitao Zhu, Detao Lv, Yao Yu, Ruihao Zhu, Zhenzhe Zheng, Ke Bu, Quan Lu, Fan Wu|Alibaba Group, China; Cornell University, USA; Shanghai Jiao Tong University, China|Motivated by the collaboration with Fliggy1, a leading Online Travel Platform (OTP), we investigate an important but less explored research topic about optimizing the quality of hotel supply, namely selecting potential profitable hotels in advance to build up adequate room inventory. We formulate a WWW problem, i.e., within a specific time period (When) and potential travel area (Where), which hotels should be recommended to a certain group of users with similar travel intentions (Why). We identify three critical challenges in solving the WWW problem: user groups generation, travel data sparsity and utilization of hotel recommendation information (e.g., period, location and intention). To this end, we propose LINet, a Location and Intention-aware neural Network for hotel group recommendation. Specifically, LINet first identifies user travel intentions for user groups generalization, and then characterizes the group preferences by jointly considering historical user-hotel interaction and spatio-temporal features of hotels. For data sparsity, we develop a graph neural network, which employs long-term data, and further design an auxiliary loss function of location that efficiently exploits data within the same and across different locations. Both offline and online experiments demonstrate the effectiveness of LINet when compared with state-of-the-art methods. 
LINet has been successfully deployed on Fliggy to retrieve high quality hotels for business development, serving hundreds of hotel operation scenarios and thousands of hotel operators.|受到与领先的在线旅游平台(OTP) Fliggy1合作的启发,我们研究了一个重要但探索较少的优化酒店供应质量的研究课题,即提前选择潜在的盈利酒店,以建立足够的客房库存。我们制定了一个 WWW 问题,即在一个特定的时间段(何时)和潜在的旅游区域(何地) ,哪些酒店应该被推荐给具有相似旅游意图的特定用户群体(为什么)。我们确定了解决 WWW 问题的三个关键挑战: 用户群生成、旅游数据稀疏和酒店推荐信息的利用(例如,时间、地点和意图)。为此,我们提出了 LINet,一个位置和意图感知的神经网络,用于酒店集团推荐。具体来说,LINet 首先通过用户群的概括来识别用户的旅游意图,然后联合考虑历史上用户与酒店的交互和酒店的时空特征来表征用户的群体偏好。针对数据稀疏性问题,提出了一种基于长期数据的图形神经网络,并进一步设计了位置辅助损失函数,有效地利用同一位置和不同位置的数据。离线和在线实验都证明了与最先进的方法相比,LINet 的有效性。LINet 已经成功地部署在 Fliggy 上,为业务发展检索高质量的酒店,为数百家酒店运营方案和数千家酒店运营商提供服务。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=LINet:+A+Location+and+Intention-Aware+Neural+Network+for+Hotel+Group+Recommendation)|0| -|[Distillation from Heterogeneous Models for Top-K Recommendation](https://doi.org/10.1145/3543507.3583209)|SeongKu Kang, Wonbin Kweon, Dongha Lee, Jianxun Lian, Xing Xie, Hwanjo Yu|Yonsei University, Republic of Korea; Pohang University of Science and Technology, Republic of Korea; Microsoft Research Asia, China|Recent recommender systems have shown remarkable performance by using an ensemble of heterogeneous models. However, it is exceedingly costly because it requires resources and inference latency proportional to the number of models, which remains the bottleneck for production. Our work aims to transfer the ensemble knowledge of heterogeneous teachers to a lightweight student model using knowledge distillation (KD), to reduce the huge inference costs while retaining high accuracy. Through an empirical study, we find that the efficacy of distillation severely drops when transferring knowledge from heterogeneous teachers. Nevertheless, we show that an important signal to ease the difficulty can be obtained from the teacher's training trajectory. This paper proposes a new KD framework, named HetComp, that guides the student model by transferring easy-to-hard sequences of knowledge generated from the teachers' trajectories. To provide guidance according to the student's learning state, HetComp uses dynamic knowledge construction to provide progressively difficult ranking knowledge and adaptive knowledge transfer to gradually transfer finer-grained ranking information. Our comprehensive experiments show that HetComp significantly improves the distillation quality and the generalization of the student model.|最近的推荐系统通过使用一系列异构模型显示了显著的性能。然而,它的成本非常高,因为它需要的资源和推理延迟与模型的数量成正比,这仍然是生产的瓶颈。我们的工作旨在利用知识精馏(KD)将异构教师的集成知识转移到一个轻量级的学生模型,以减少庞大的推理成本,同时保持较高的推理精度。通过实证研究发现,异质型教师传授知识时,蒸馏效果严重下降。然而,我们表明,一个重要的信号,缓解困难可以从教师的培训轨迹。提出了一种新的知识发现框架 HetComp,该框架通过传递由教师轨迹生成的易于生成的知识序列来指导学生模型。为了根据学生的学习状态提供指导,HetComp 使用动态知识结构来提供逐步难以排序的知识,并使用自适应知识转移来逐步传递更细粒度的排序信息。我们的综合实验表明,HetComp 显著提高了蒸馏质量和学生模型的推广。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Distillation+from+Heterogeneous+Models+for+Top-K+Recommendation)|0| -|[Exploration and Regularization of the Latent Action Space in Recommendation](https://doi.org/10.1145/3543507.3583244)|Shuchang Liu, Qingpeng Cai, Bowen Sun, Yuhao Wang, Ji Jiang, Dong Zheng, Peng Jiang, Kun Gai, Xiangyu Zhao, Yongfeng Zhang|; City University of Hong Kong, China; Peking University, China; Rutgers University, USA; Kuaishou Technology, China|In recommender systems, reinforcement learning solutions have effectively boosted recommendation performance because of their ability to capture long-term user-system interaction. 
However, the action space of the recommendation policy is a list of items, which could be extremely large with a dynamic candidate item pool. To overcome this challenge, we propose a hyper-actor and critic learning framework where the policy decomposes the item list generation process into a hyper-action inference step and an effect-action selection step. The first step maps the given state space into a vectorized hyper-action space, and the second step selects the item list based on the hyper-action. In order to regulate the discrepancy between the two action spaces, we design an alignment module along with a kernel mapping function for items to ensure inference accuracy and include a supervision module to stabilize the learning process. We build simulated environments on public datasets and empirically show that our framework is superior in recommendation compared to standard RL baselines.|在推荐系统中,强化学习解决方案有效地提高了推荐性能,因为它们能够捕捉长期的用户系统交互。但是,推荐策略的操作空间是一个项目列表,对于动态候选项目池,这个列表可能非常大。为了克服这一挑战,我们提出了一个超行为者和批评者学习框架,其中策略将项目表生成过程分解为一个超行为推理步骤和一个效果-行为选择步骤。第一步将给定的状态空间映射到向量化的超动作空间,第二步根据超动作选择项目列表。为了调节两个动作空间之间的差异,我们设计了一个对齐模块和一个项目的核映射函数来保证推理的准确性,并包括一个监督模块来稳定学习过程。我们在公共数据集上建立了模拟环境,并且经验表明我们的框架在推荐方面优于标准 RL 基线。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Exploration+and+Regularization+of+the+Latent+Action+Space+in+Recommendation)|0| -|[Compressed Interaction Graph based Framework for Multi-behavior Recommendation](https://doi.org/10.1145/3543507.3583312)|Wei Guo, Chang Meng, Enming Yuan, Zhicheng He, Huifeng Guo, Yingxue Zhang, Bo Chen, Yaochen Hu, Ruiming Tang, Xiu Li, Rui Zhang|Shenzhen International Graduate School, Tsinghua University, China; Huawei Noah's Ark Lab, China; ruizhang.info, China; Huawei Technologies, Canada; Institute for Interdisciplinary Information Sciences, Tsinghua University, China|Multi-types of user behavior data (e.g., clicking, adding to cart, and purchasing) are recorded in most real-world recommendation scenarios, which can help to learn users' multi-faceted preferences. However, it is challenging to explore multi-behavior data due to the unbalanced data distribution and sparse target behavior, which lead to the inadequate modeling of high-order relations when treating multi-behavior data ''as features'' and gradient conflict in multitask learning when treating multi-behavior data ''as labels''. In this paper, we propose CIGF, a Compressed Interaction Graph based Framework, to overcome the above limitations. Specifically, we design a novel Compressed Interaction Graph Convolution Network (CIGCN) to model instance-level high-order relations explicitly. To alleviate the potential gradient conflict when treating multi-behavior data ''as labels'', we propose a Multi-Expert with Separate Input (MESI) network with separate input on the top of CIGCN for multi-task learning. Comprehensive experiments on three large-scale real-world datasets demonstrate the superiority of CIGF. Ablation studies and in-depth analysis further validate the effectiveness of our proposed model in capturing high-order relations and alleviating gradient conflict. 
The source code and datasets are available at https://github.com/MC-CV/CIGF.|多种类型的用户行为数据(例如,点击、添加到购物车和购买)记录在大多数真实世界的推荐场景中,这有助于了解用户的多方面偏好。然而,由于多行为数据分布不均衡,目标行为稀疏,导致多任务学习中将多行为数据“作为特征”的高阶关系建模不足,将多行为数据“作为标签”的多任务学习中存在梯度冲突。本文提出了一种基于压缩交互图的 CIGF 框架,以克服上述局限性。具体来说,我们设计了一个新的压缩交互图卷积网络(CIGCN)来显式地建模实例级的高阶关系。为了缓解多行为数据“作为标签”时潜在的梯度冲突,本文提出了一种在 CIGCN 顶部具有独立输入的多专家网络,用于多任务学习。通过对三个大规模实际数据集的综合实验,验证了 CIGF 算法的优越性。烧蚀研究和深入分析进一步验证了我们提出的模型在捕获高阶关系和缓解梯度冲突方面的有效性。源代码和数据集可在 https://github.com/mc-cv/cigf 下载。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Compressed+Interaction+Graph+based+Framework+for+Multi-behavior+Recommendation)|0| +|[LINet: A Location and Intention-Aware Neural Network for Hotel Group Recommendation](https://doi.org/10.1145/3543507.3583202)|Ruitao Zhu, Detao Lv, Yao Yu, Ruihao Zhu, Zhenzhe Zheng, Ke Bu, Quan Lu, Fan Wu|Shanghai Jiao Tong University, China; Cornell University, USA; Alibaba Group, China|Motivated by the collaboration with Fliggy1, a leading Online Travel Platform (OTP), we investigate an important but less explored research topic about optimizing the quality of hotel supply, namely selecting potential profitable hotels in advance to build up adequate room inventory. We formulate a WWW problem, i.e., within a specific time period (When) and potential travel area (Where), which hotels should be recommended to a certain group of users with similar travel intentions (Why). We identify three critical challenges in solving the WWW problem: user groups generation, travel data sparsity and utilization of hotel recommendation information (e.g., period, location and intention). To this end, we propose LINet, a Location and Intention-aware neural Network for hotel group recommendation. Specifically, LINet first identifies user travel intentions for user groups generalization, and then characterizes the group preferences by jointly considering historical user-hotel interaction and spatio-temporal features of hotels. For data sparsity, we develop a graph neural network, which employs long-term data, and further design an auxiliary loss function of location that efficiently exploits data within the same and across different locations. Both offline and online experiments demonstrate the effectiveness of LINet when compared with state-of-the-art methods. LINet has been successfully deployed on Fliggy to retrieve high quality hotels for business development, serving hundreds of hotel operation scenarios and thousands of hotel operators.|受到与领先的在线旅游平台(OTP) Fliggy1合作的启发,我们研究了一个重要但探索较少的优化酒店供应质量的研究课题,即提前选择潜在的盈利酒店,以建立足够的客房库存。我们制定了一个 WWW 问题,即在一个特定的时间段(何时)和潜在的旅游区域(何地) ,哪些酒店应该被推荐给具有相似旅游意图的特定用户群体(为什么)。我们确定了解决 WWW 问题的三个关键挑战: 用户群生成、旅游数据稀疏和酒店推荐信息的利用(例如,时间、地点和意图)。为此,我们提出了 LINet,一个位置和意图感知的神经网络,用于酒店集团推荐。具体来说,LINet 首先通过用户群的概括来识别用户的旅游意图,然后联合考虑历史上用户与酒店的交互和酒店的时空特征来表征用户的群体偏好。针对数据稀疏性问题,提出了一种基于长期数据的图形神经网络,并进一步设计了位置辅助损失函数,有效地利用同一位置和不同位置的数据。离线和在线实验都证明了与最先进的方法相比,LINet 的有效性。LINet 已经成功地部署在 Fliggy 上,为业务发展检索高质量的酒店,为数百家酒店运营方案和数千家酒店运营商提供服务。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=LINet:+A+Location+and+Intention-Aware+Neural+Network+for+Hotel+Group+Recommendation)|0| +|[Distillation from Heterogeneous Models for Top-K Recommendation](https://doi.org/10.1145/3543507.3583209)|SeongKu Kang, Wonbin Kweon, Dongha Lee, Jianxun Lian, Xing Xie, Hwanjo Yu|Microsoft Research Asia, China; Yonsei University, Republic of Korea; Pohang University of Science and Technology, Republic of Korea|Recent recommender systems have shown remarkable performance by using an ensemble of heterogeneous models. 
However, it is exceedingly costly because it requires resources and inference latency proportional to the number of models, which remains the bottleneck for production. Our work aims to transfer the ensemble knowledge of heterogeneous teachers to a lightweight student model using knowledge distillation (KD), to reduce the huge inference costs while retaining high accuracy. Through an empirical study, we find that the efficacy of distillation severely drops when transferring knowledge from heterogeneous teachers. Nevertheless, we show that an important signal to ease the difficulty can be obtained from the teacher's training trajectory. This paper proposes a new KD framework, named HetComp, that guides the student model by transferring easy-to-hard sequences of knowledge generated from the teachers' trajectories. To provide guidance according to the student's learning state, HetComp uses dynamic knowledge construction to provide progressively difficult ranking knowledge and adaptive knowledge transfer to gradually transfer finer-grained ranking information. Our comprehensive experiments show that HetComp significantly improves the distillation quality and the generalization of the student model.|最近的推荐系统通过使用一系列异构模型显示了显著的性能。然而,它的成本非常高,因为它需要的资源和推理延迟与模型的数量成正比,这仍然是生产的瓶颈。我们的工作旨在利用知识精馏(KD)将异构教师的集成知识转移到一个轻量级的学生模型,以减少庞大的推理成本,同时保持较高的推理精度。通过实证研究发现,异质型教师传授知识时,蒸馏效果严重下降。然而,我们表明,一个重要的信号,缓解困难可以从教师的培训轨迹。提出了一种新的知识发现框架 HetComp,该框架通过传递由教师轨迹生成的易于生成的知识序列来指导学生模型。为了根据学生的学习状态提供指导,HetComp 使用动态知识结构来提供逐步难以排序的知识,并使用自适应知识转移来逐步传递更细粒度的排序信息。我们的综合实验表明,HetComp 显著提高了蒸馏质量和学生模型的推广。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Distillation+from+Heterogeneous+Models+for+Top-K+Recommendation)|0| +|[Exploration and Regularization of the Latent Action Space in Recommendation](https://doi.org/10.1145/3543507.3583244)|Shuchang Liu, Qingpeng Cai, Bowen Sun, Yuhao Wang, Ji Jiang, Dong Zheng, Peng Jiang, Kun Gai, Xiangyu Zhao, Yongfeng Zhang|; Rutgers University, USA; Peking University, China; Kuaishou Technology, China; City University of Hong Kong, China|In recommender systems, reinforcement learning solutions have effectively boosted recommendation performance because of their ability to capture long-term user-system interaction. However, the action space of the recommendation policy is a list of items, which could be extremely large with a dynamic candidate item pool. To overcome this challenge, we propose a hyper-actor and critic learning framework where the policy decomposes the item list generation process into a hyper-action inference step and an effect-action selection step. The first step maps the given state space into a vectorized hyper-action space, and the second step selects the item list based on the hyper-action. In order to regulate the discrepancy between the two action spaces, we design an alignment module along with a kernel mapping function for items to ensure inference accuracy and include a supervision module to stabilize the learning process. 
We build simulated environments on public datasets and empirically show that our framework is superior in recommendation compared to standard RL baselines.|在推荐系统中,强化学习解决方案有效地提高了推荐性能,因为它们能够捕捉长期的用户系统交互。但是,推荐策略的操作空间是一个项目列表,对于动态候选项目池,这个列表可能非常大。为了克服这一挑战,我们提出了一个超行为者和批评者学习框架,其中策略将项目表生成过程分解为一个超行为推理步骤和一个效果-行为选择步骤。第一步将给定的状态空间映射到向量化的超动作空间,第二步根据超动作选择项目列表。为了调节两个动作空间之间的差异,我们设计了一个对齐模块和一个项目的核映射函数来保证推理的准确性,并包括一个监督模块来稳定学习过程。我们在公共数据集上建立了模拟环境,并且经验表明我们的框架在推荐方面优于标准 RL 基线。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Exploration+and+Regularization+of+the+Latent+Action+Space+in+Recommendation)|0| +|[Compressed Interaction Graph based Framework for Multi-behavior Recommendation](https://doi.org/10.1145/3543507.3583312)|Wei Guo, Chang Meng, Enming Yuan, Zhicheng He, Huifeng Guo, Yingxue Zhang, Bo Chen, Yaochen Hu, Ruiming Tang, Xiu Li, Rui Zhang|ruizhang.info, China; Huawei Noah's Ark Lab, China; Shenzhen International Graduate School, Tsinghua University, China; Institute for Interdisciplinary Information Sciences, Tsinghua University, China; Huawei Technologies, Canada|Multi-types of user behavior data (e.g., clicking, adding to cart, and purchasing) are recorded in most real-world recommendation scenarios, which can help to learn users' multi-faceted preferences. However, it is challenging to explore multi-behavior data due to the unbalanced data distribution and sparse target behavior, which lead to the inadequate modeling of high-order relations when treating multi-behavior data ''as features'' and gradient conflict in multitask learning when treating multi-behavior data ''as labels''. In this paper, we propose CIGF, a Compressed Interaction Graph based Framework, to overcome the above limitations. Specifically, we design a novel Compressed Interaction Graph Convolution Network (CIGCN) to model instance-level high-order relations explicitly. To alleviate the potential gradient conflict when treating multi-behavior data ''as labels'', we propose a Multi-Expert with Separate Input (MESI) network with separate input on the top of CIGCN for multi-task learning. Comprehensive experiments on three large-scale real-world datasets demonstrate the superiority of CIGF. Ablation studies and in-depth analysis further validate the effectiveness of our proposed model in capturing high-order relations and alleviating gradient conflict. The source code and datasets are available at https://github.com/MC-CV/CIGF.|多种类型的用户行为数据(例如,点击、添加到购物车和购买)记录在大多数真实世界的推荐场景中,这有助于了解用户的多方面偏好。然而,由于多行为数据分布不均衡,目标行为稀疏,导致多任务学习中将多行为数据“作为特征”的高阶关系建模不足,将多行为数据“作为标签”的多任务学习中存在梯度冲突。本文提出了一种基于压缩交互图的 CIGF 框架,以克服上述局限性。具体来说,我们设计了一个新的压缩交互图卷积网络(CIGCN)来显式地建模实例级的高阶关系。为了缓解多行为数据“作为标签”时潜在的梯度冲突,本文提出了一种在 CIGCN 顶部具有独立输入的多专家网络,用于多任务学习。通过对三个大规模实际数据集的综合实验,验证了 CIGF 算法的优越性。烧蚀研究和深入分析进一步验证了我们提出的模型在捕获高阶关系和缓解梯度冲突方面的有效性。源代码和数据集可在 https://github.com/mc-cv/cigf 下载。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Compressed+Interaction+Graph+based+Framework+for+Multi-behavior+Recommendation)|0| |[Correlative Preference Transfer with Hierarchical Hypergraph Network for Multi-Domain Recommendation](https://doi.org/10.1145/3543507.3583331)|Zixuan Xu, Penghui Wei, Shaoguo Liu, Weimin Zhang, Liang Wang, Bo Zheng|Alibaba Group, China|Advanced recommender systems usually involve multiple domains (such as scenarios or categories) for various marketing strategies, and users interact with them to satisfy diverse demands. The goal of multi-domain recommendation (MDR) is to improve the recommendation performance of all domains simultaneously. 
Conventional graph neural network based methods usually deal with each domain separately, or train a shared model to serve all domains. The former fails to leverage users' cross-domain behaviors, making the behavior sparseness issue a great obstacle. The latter learns a shared user representation with respect to all domains, which neglects users' domain-specific preferences. In this paper, we propose $\mathsf{H^3Trans}$, a hierarchical hypergraph network based correlative preference transfer framework for MDR, which merges multi-domain user-item interactions into a unified graph to help preference transfer. $\mathsf{H^3Trans}$ incorporates two hyperedge-based modules, namely dynamic item transfer (Hyper-I) and adaptive user aggregation (Hyper-U). Hyper-I extracts correlative information from multi-domain user-item feedback for eliminating domain discrepancy of item representations. Hyper-U aggregates users' scattered preferences in multiple domains and further exploits the high-order (not only pair-wise) connections to improve user representations. Experiments on both public and production datasets verify the superiority of $\mathsf{H^3Trans}$ for MDR.|高级推荐系统通常涉及多个领域(如场景或类别)的不同营销策略,用户与他们互动,以满足不同的需求。多域推荐(MDR)的目标是同时提高所有域的推荐性能。传统的基于图神经网络的方法通常分别处理各个领域,或者训练一个共享模型来服务于所有领域。前者未能充分利用用户的跨域行为,使得行为稀疏性问题成为一大障碍。后者学习所有域的共享用户表示,这忽略了用户特定域的首选项。本文提出了一种基于层次超图网络的 MDR 相关偏好传递框架 H^3Trans,它将多领域用户-项目交互表示为一个统一的图,以帮助偏好传递。H^3Trans 包含两个基于超边界的模块,即动态项传输(Hyper-I)和自适应用户聚合(Hyper-U)。Hyper-I 从多领域用户项目反馈中提取相关信息,消除项目表示的领域差异。Hyper-U 聚合用户在多个域中的分散偏好,并进一步利用高阶(不仅仅是成对)连接来改善用户表示。在公共数据集和生产数据集上的实验验证了 H^3Trans 用于 MDR 的优越性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Correlative+Preference+Transfer+with+Hierarchical+Hypergraph+Network+for+Multi-Domain+Recommendation)|0|
-|[User Retention-oriented Recommendation with Decision Transformer](https://doi.org/10.1145/3543507.3583418)|Kesen Zhao, Lixin Zou, Xiangyu Zhao, Maolin Wang, Dawei Yin|Wuhan University, China; Baidu Inc., China; City University of Hong Kong, Hong Kong|Improving user retention with reinforcement learning~(RL) has attracted increasing attention due to its significant importance in boosting user engagement. However, training the RL policy from scratch without hurting users' experience is unavoidable due to the requirement of trial-and-error searches. Furthermore, the offline methods, which aim to optimize the policy without online interactions, suffer from the notorious stability problem in value estimation or unbounded variance in counterfactual policy evaluation. To this end, we propose optimizing user retention with Decision Transformer~(DT), which avoids the offline difficulty by translating the RL as an autoregressive problem. However, deploying the DT in recommendation is a non-trivial problem because of the following challenges: (1) deficiency in modeling the numerical reward value; (2) data discrepancy between the policy learning and recommendation generation; (3) unreliable offline performance evaluation. In this work, we, therefore, contribute a series of strategies for tackling the exposed issues. We first articulate an efficient reward prompt by weighted aggregation of meta embeddings for informative reward embedding. Then, we endow a weighted contrastive learning method to solve the discrepancy between training and inference. Furthermore, we design two robust offline metrics to measure user retention. 
Finally, the significant improvement in the benchmark datasets demonstrates the superiority of the proposed method.|使用强化学习 ~ (RL)提高用户保持率已经引起了越来越多的关注,因为它在提高用户参与度方面具有重要意义。然而,由于试错检索的要求,在不损害用户体验的情况下从头开始训练 RL 策略是不可避免的。此外,离线方法的目的是优化政策没有在线交互,受到臭名昭著的稳定性问题的价值估计或无界方差的反事实政策评估。为此,我们提出了利用决策转换器 ~ (DT)来优化用户保留,通过将 RL 转换为一个自回归问题来避免离线困难。然而,在推荐中部署 DT 是一个非常重要的问题,因为它面临以下挑战: (1)数值奖励值建模不足; (2)策略学习和推荐生成之间的数据差异; (3)不可靠的离线性能评估。在这项工作中,我们,因此,贡献了一系列的战略,以解决暴露的问题。我们首先通过元嵌入的加权聚合提出了一个有效的信息嵌入奖励提示。然后,我们提出了一种加权对比学习方法来解决训练和推理之间的差异。此外,我们还设计了两个健壮的离线度量来衡量用户保持率。最后,基准数据集的显著改进证明了该方法的优越性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=User+Retention-oriented+Recommendation+with+Decision+Transformer)|0| -|[Balancing Unobserved Confounding with a Few Unbiased Ratings in Debiased Recommendations](https://doi.org/10.1145/3543507.3583495)|Haoxuan Li, Yanghao Xiao, Chunyuan Zheng, Peng Wu|Peking University, China; University of Chinese Academy of Sciences, China; Beijing Technology and Business University, China; University of California, San Diego, USA|Recommender systems are seen as an effective tool to address information overload, but it is widely known that the presence of various biases makes direct training on large-scale observational data result in sub-optimal prediction performance. In contrast, unbiased ratings obtained from randomized controlled trials or A/B tests are considered to be the golden standard, but are costly and small in scale in reality. To exploit both types of data, recent works proposed to use unbiased ratings to correct the parameters of the propensity or imputation models trained on the biased dataset. However, the existing methods fail to obtain accurate predictions in the presence of unobserved confounding or model misspecification. In this paper, we propose a theoretically guaranteed model-agnostic balancing approach that can be applied to any existing debiasing method with the aim of combating unobserved confounding and model misspecification. The proposed approach makes full use of unbiased data by alternatively correcting model parameters learned with biased data, and adaptively learning balance coefficients of biased samples for further debiasing. Extensive real-world experiments are conducted along with the deployment of our proposal on four representative debiasing methods to demonstrate the effectiveness.|推荐系统被视为解决信息超载问题的有效工具,但众所周知,各种偏差的存在使得对大规模观测数据的直接训练导致次优预测性能。相比之下,通过随机对照试验或 A/B 测试获得的无偏评分被认为是黄金标准,但实际上成本高,规模小。为了利用这两种类型的数据,最近的工作建议使用无偏评级来修正倾向或插补模型的参数训练偏向的数据集。然而,现有的方法不能获得准确的预测存在未观察到的混杂或模型错误说明。本文提出了一种理论保证的模型无关平衡方法,该方法可以应用于任何现有的去偏方法,以消除未观察到的混淆和模型不确定性。该方法充分利用无偏数据,通过交替校正有偏数据学习的模型参数,以及有偏样本的自适应学习平衡系数进一步消除偏差。随着我们的建议在四个有代表性的去偏方法上的部署,广泛的现实世界的实验被进行,以证明有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Balancing+Unobserved+Confounding+with+a+Few+Unbiased+Ratings+in+Debiased+Recommendations)|0| -|[Denoising and Prompt-Tuning for Multi-Behavior Recommendation](https://doi.org/10.1145/3543507.3583513)|Chi Zhang, Rui Chen, Xiangyu Zhao, Qilong Han, Li Li|Harbin Engineering University, China; University of Delaware, USA; City University of Hong Kong, Hong Kong|In practical recommendation scenarios, users often interact with items under multi-typed behaviors (e.g., click, add-to-cart, and purchase). Traditional collaborative filtering techniques typically assume that users only have a single type of behavior with items, making it insufficient to utilize complex collaborative signals to learn informative representations and infer actual user preferences. 
Consequently, some pioneer studies explore modeling multi-behavior heterogeneity to learn better representations and boost the performance of recommendations for a target behavior. However, a large number of auxiliary behaviors (i.e., click and add-to-cart) could introduce irrelevant information to recommenders, which could mislead the target behavior (i.e., purchase) recommendation, rendering two critical challenges: (i) denoising auxiliary behaviors and (ii) bridging the semantic gap between auxiliary and target behaviors. Motivated by the above observation, we propose a novel framework-Denoising and Prompt-Tuning (DPT) with a three-stage learning paradigm to solve the aforementioned challenges. In particular, DPT is equipped with a pattern-enhanced graph encoder in the first stage to learn complex patterns as prior knowledge in a data-driven manner to guide learning informative representation and pinpointing reliable noise for subsequent stages. Accordingly, we adopt different lightweight tuning approaches with effectiveness and efficiency in the following stages to further attenuate the influence of noise and alleviate the semantic gap among multi-typed behaviors. Extensive experiments on two real-world datasets demonstrate the superiority of DPT over a wide range of state-of-the-art methods. The implementation code is available online at https://github.com/zc-97/DPT.|在实际的推荐场景中,用户经常在多类型行为下与项目交互(例如,单击、添加到购物车和购买)。传统的协同过滤技术通常假设用户只有单一类型的行为与项目,使其不足以利用复杂的协作信号来学习信息表示和推断实际的用户偏好。因此,一些先驱研究探索建立多行为异质性模型,以学习更好的表示方法,并提高针对目标行为的建议的性能。然而,大量的辅助行为(即点击和添加到购物车)可能会向推荐者引入不相关的信息,这可能会误导目标行为(即购买)推荐,造成两个关键的挑战: (i)去噪辅助行为和(ii)桥接辅助行为和目标行为之间的语义差距。基于上述观察,我们提出了一个新的框架-去噪和及时调整(DPT)与三个阶段的学习范式,以解决上述挑战。特别是,DPT 在第一阶段配备了模式增强图形编码器,以数据驱动的方式学习复杂的模式作为先验知识,以指导学习信息表示和确定后续阶段的可靠噪声。相应地,我们在接下来的阶段采用了不同的轻量化方法,以进一步减小噪声的影响,缓解多类型行为之间的语义鸿沟。在两个实际数据集上的大量实验证明了 DPT 相对于一系列最先进的方法的优越性。实施守则可于网上 https://github.com/zc-97/dpt 下载。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Denoising+and+Prompt-Tuning+for+Multi-Behavior+Recommendation)|0| -|[CAMUS: Attribute-Aware Counterfactual Augmentation for Minority Users in Recommendation](https://doi.org/10.1145/3543507.3583538)|Yuxin Ying, Fuzhen Zhuang, Yongchun Zhu, Deqing Wang, Hongwei Zheng|School of Computer Science and Engineering, Beihang University, China; Institute of Artificial Intelligence, Beihang University, China; Institute of Computing Technology, Chinese Academy of Sciences, China; Beijing Academy of Blockchain and Edge Computing, China|Embedding-based methods currently achieved impressive success in recommender systems. However, such methods are more likely to suffer from bias in data distribution, especially the attribute bias problem. For example, when a certain type of user, like the elderly, occupies the mainstream, the recommendation results of minority users would be seriously affected by the mainstream users’ attributes. To address this problem, most existing methods are proposed from the perspective of fairness, which focuses on eliminating unfairness but deteriorates the recommendation performance. Unlike these methods, in this paper, we focus on improving the recommendation performance for minority users of biased attributes. Along this line, we propose a novel attribute-aware Counterfactual Augmentation framework for Minority Users(CAMUS). Specifically, the CAMUS consists of a counterfactual augmenter, a confidence estimator, and a recommender. 
The counterfactual augmenter conducts data augmentation for the minority group by utilizing the interactions of mainstream users based on a universal counterfactual assumption. Besides, a tri-training-based confidence estimator is applied to ensure the effectiveness of augmentation. Extensive experiments on three real-world datasets have demonstrated the superior performance of the proposed methods. Further case studies verify the universality of the proposed CAMUS framework on different data sparsity, attributes, and models.|基于嵌入的方法目前在推荐系统中取得了令人印象深刻的成功。然而,这些方法更容易受到数据分布偏差的影响,特别是属性偏差问题。例如,当某种类型的用户,如老年人,占据主流时,少数用户的推荐结果会受到主流用户属性的严重影响。为了解决这个问题,现有的方法大多是从公平的角度出发,着重于消除不公平性,但是会降低推荐性能。与这些方法不同的是,本文主要研究如何提高偏向属性的少数用户的推荐性能。在此基础上,我们提出了一种新的面向少数用户的属性感知反事实增强框架(CAMUS)。具体来说,CAMUS 由一个反事实增强器、一个置信度估计器和一个推荐器组成。反事实增强器利用主流用户基于普遍反事实假设的交互作用对少数群体进行数据增强。此外,采用基于三训练的置信估计来保证增广的有效性。在三个实际数据集上的大量实验证明了该方法的优越性能。进一步的案例研究验证了所提出的 CAMUS 框架在不同的数据稀疏性、属性和模型上的通用性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=CAMUS:+Attribute-Aware+Counterfactual+Augmentation+for+Minority+Users+in+Recommendation)|0| -|[Dynamically Expandable Graph Convolution for Streaming Recommendation](https://doi.org/10.1145/3543507.3583237)|Bowei He, Xu He, Yingxue Zhang, Ruiming Tang, Chen Ma|Huawei Noah's Ark Lab, China; Department of Computer Science, City University of Hong Kong, Hong Kong; Huawei Noah's Ark Lab Montreal, Canada; City University of Hong Kong, Hong Kong|Personalized recommender systems have been widely studied and deployed to reduce information overload and satisfy users' diverse needs. However, conventional recommendation models solely conduct a one-time training-test fashion and can hardly adapt to evolving demands, considering user preference shifts and ever-increasing users and items in the real world. To tackle such challenges, the streaming recommendation is proposed and has attracted great attention recently. Among these, continual graph learning is widely regarded as a promising approach for the streaming recommendation by academia and industry. However, existing methods either rely on the historical data replay which is often not practical under increasingly strict data regulations, or can seldom solve the \textit{over-stability} issue. To overcome these difficulties, we propose a novel \textbf{D}ynamically \textbf{E}xpandable \textbf{G}raph \textbf{C}onvolution (DEGC) algorithm from a \textit{model isolation} perspective for the streaming recommendation which is orthogonal to previous methods. Based on the motivation of disentangling outdated short-term preferences from useful long-term preferences, we design a sequence of operations including graph convolution pruning, refining, and expanding to only preserve beneficial long-term preference-related parameters and extract fresh short-term preferences. Moreover, we model the temporal user preference, which is utilized as user embedding initialization, for better capturing the individual-level preference shifts. 
Extensive experiments on the three most representative GCN-based recommendation models and four industrial datasets demonstrate the effectiveness and robustness of our method.|个性化推荐系统已被广泛研究和部署,以减少信息超载和满足用户的不同需求。然而,考虑到用户偏好的变化以及现实世界中不断增加的用户和项目,传统的推荐模型只能进行一次性的训练测试,很难适应不断变化的需求。为了应对这些挑战,流媒体推荐被提出并引起了人们的广泛关注。其中,连续图学习被学术界和工业界广泛认为是一种很有前途的流推荐方法。然而,现有的方法要么依赖于历史数据的重放,这在日益严格的数据规则下往往是不切实际的,要么很少能解决过稳定性问题。为了克服这些困难,我们从模型隔离的角度提出了一种新的动态可扩展图卷积(DEGC)算法用于流式推荐,该算法与以前的方法是正交的。基于将过时的短期偏好与有用的长期偏好分离的动机,我们设计了一系列操作,包括图卷积修剪,细化和扩展,以仅保留有益的长期偏好相关参数并提取新的短期偏好。此外,为了更好地捕获个体层次的偏好变化,我们建立了时态用户偏好模型,并将其用于用户嵌入初始化。在三个最具代表性的基于 GCN 的推荐模型和四个工业数据集上的大量实验证明了该方法的有效性和鲁棒性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Dynamically+Expandable+Graph+Convolution+for+Streaming+Recommendation)|0|
-|[CTRLStruct: Dialogue Structure Learning for Open-Domain Response Generation](https://doi.org/10.1145/3543507.3583285)|Congchi Yin, Piji Li, Zhaochun Ren|Nanjing University of Aeronautics and Astronautics, China; Shandong University, China|Dialogue structure discovery is essential in dialogue generation. Well-structured topic flow can leverage background information and predict future topics to help generate controllable and explainable responses. However, most previous work focused on dialogue structure learning in task-oriented dialogue other than open-domain dialogue which is more complicated and challenging. In this paper, we present a new framework CTRLStruct for dialogue structure learning to effectively explore topic-level dialogue clusters as well as their transitions with unlabelled information. Precisely, dialogue utterances encoded by bi-directional Transformer are further trained through a special designed contrastive learning task to improve representation. Then we perform clustering to utterance-level representations and form topic-level clusters that can be considered as vertices in dialogue structure graph. The edges in the graph indicating transition probability between vertices are calculated by mimicking expert behavior in datasets. Finally, dialogue structure graph is integrated into dialogue model to perform controlled response generation. Experiments on two popular open-domain dialogue datasets show our model can generate more coherent responses compared to some excellent dialogue models, as well as outperform some typical sentence embedding methods in dialogue utterance representation. Code is available in GitHub.|对话结构的发现是对话生成的基础。结构良好的主题流可以利用背景信息并预测未来的主题,从而帮助产生可控的和可解释的反应。然而,以往的研究大多侧重于面向任务的对话中的对话结构学习,而开放领域的对话更为复杂和具有挑战性。本文提出了一种新的对话结构学习框架 CTRLstruct,以有效地探索话题层次的对话群及其与未标记信息的过渡。准确地说,双向变压器编码的对话话语通过特别设计的对比学习任务进一步训练,以提高表征能力。然后对话语层面的表征进行聚类,形成话题层面的聚类,这些聚类可以看作是对话结构图中的顶点。通过模拟数据集中的专家行为,计算图中表示顶点间转移概率的边。最后,将对话结构图集成到对话模型中进行受控响应的生成。实验结果表明,与一些优秀的对话模型相比,该模型能够产生更加连贯的对话响应,并且在对话话语表征中优于一些典型的句子嵌入方法。代码可在 GitHub 中获得。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=CTRLStruct:+Dialogue+Structure+Learning+for+Open-Domain+Response+Generation)|0|
-|[BlinkViz: Fast and Scalable Approximate Visualization on Very Large Datasets using Neural-Enhanced Mixed Sum-Product Networks](https://doi.org/10.1145/3543507.3583411)|Yimeng Qiao, Yinan Jing, Hanbing Zhang, Zhenying He, Kai Zhang, X. Sean Wang|Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University, China; Shanghai Key Laboratory of Data Science, School of Software, Fudan University, China|Web-based online interactive visual analytics enjoys popularity in recent years. 
Traditionally, visualizations are produced directly from querying the underlying data. However, for a very large dataset, this way is so time-consuming that it cannot meet the low-latency requirements of interactive visual analytics. In this paper, we propose a learning-based visualization approach called BlinkViz, which uses a learned model to produce approximate visualizations by leveraging mixed sum-product networks to learn the distribution of the original data. In such a way, it makes visualization faster and more scalable by decoupling visualization and data. In addition, to improve the accuracy of approximate visualizations, we propose an enhanced model by incorporating a neural network with residual structures, which can refine prediction results, especially for visual requests with low selectivity. Extensive experiments show that BlinkViz is extremely fast even on a large dataset with hundreds of millions of data records (over 30GB), responding in sub-seconds (from 2ms to less than 500ms for different requests) while keeping a low error rate. Furthermore, our approach remains scalable on latency and memory footprint size regardless of data size.|基于 Web 的在线交互式可视化分析近年来很受欢迎。传统上,可视化是通过查询底层数据直接生成的。然而,对于一个非常大的数据集,这种方法非常耗时,不能满足交互式可视化分析的低延迟要求。本文提出了一种基于学习的可视化方法 BlinkViz,该方法利用学习模型,通过混合和积网络来学习原始数据的分布情况,从而产生近似可视化效果。通过这种方式,可视化和数据解耦,使得可视化更快、更具可伸缩性。此外,为了提高近似可视化的准确性,我们提出了一个增强的模型,通过结合残差结构的神经网络,可以细化预测结果,特别是对低选择性的可视化请求。大量的实验表明,BlinkViz 即使在拥有数亿条数据记录(超过30GB)的大型数据集上也是极其快速的,响应时间在亚秒(对于不同的请求,响应时间从2毫秒到不到500毫秒) ,同时保持较低的错误率。此外,无论数据大小如何,我们的方法在延迟和内存占用大小上都是可伸缩的。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=BlinkViz:+Fast+and+Scalable+Approximate+Visualization+on+Very+Large+Datasets+using+Neural-Enhanced+Mixed+Sum-Product+Networks)|0| +|[User Retention-oriented Recommendation with Decision Transformer](https://doi.org/10.1145/3543507.3583418)|Kesen Zhao, Lixin Zou, Xiangyu Zhao, Maolin Wang, Dawei Yin|Wuhan University, China; City University of Hong Kong, Hong Kong; Baidu Inc., China|Improving user retention with reinforcement learning~(RL) has attracted increasing attention due to its significant importance in boosting user engagement. However, training the RL policy from scratch without hurting users' experience is unavoidable due to the requirement of trial-and-error searches. Furthermore, the offline methods, which aim to optimize the policy without online interactions, suffer from the notorious stability problem in value estimation or unbounded variance in counterfactual policy evaluation. To this end, we propose optimizing user retention with Decision Transformer~(DT), which avoids the offline difficulty by translating the RL as an autoregressive problem. However, deploying the DT in recommendation is a non-trivial problem because of the following challenges: (1) deficiency in modeling the numerical reward value; (2) data discrepancy between the policy learning and recommendation generation; (3) unreliable offline performance evaluation. In this work, we, therefore, contribute a series of strategies for tackling the exposed issues. We first articulate an efficient reward prompt by weighted aggregation of meta embeddings for informative reward embedding. Then, we endow a weighted contrastive learning method to solve the discrepancy between training and inference. Furthermore, we design two robust offline metrics to measure user retention. 
Finally, the significant improvement in the benchmark datasets demonstrates the superiority of the proposed method.|使用强化学习 ~ (RL)提高用户保持率已经引起了越来越多的关注,因为它在提高用户参与度方面具有重要意义。然而,由于试错检索的要求,在不损害用户体验的情况下从头开始训练 RL 策略是不可避免的。此外,离线方法的目的是优化政策没有在线交互,受到臭名昭著的稳定性问题的价值估计或无界方差的反事实政策评估。为此,我们提出了利用决策转换器 ~ (DT)来优化用户保留,通过将 RL 转换为一个自回归问题来避免离线困难。然而,在推荐中部署 DT 是一个非常重要的问题,因为它面临以下挑战: (1)数值奖励值建模不足; (2)策略学习和推荐生成之间的数据差异; (3)不可靠的离线性能评估。在这项工作中,我们,因此,贡献了一系列的战略,以解决暴露的问题。我们首先通过元嵌入的加权聚合提出了一个有效的信息嵌入奖励提示。然后,我们提出了一种加权对比学习方法来解决训练和推理之间的差异。此外,我们还设计了两个健壮的离线度量来衡量用户保持率。最后,基准数据集的显著改进证明了该方法的优越性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=User+Retention-oriented+Recommendation+with+Decision+Transformer)|0| +|[Balancing Unobserved Confounding with a Few Unbiased Ratings in Debiased Recommendations](https://doi.org/10.1145/3543507.3583495)|Haoxuan Li, Yanghao Xiao, Chunyuan Zheng, Peng Wu|University of California, San Diego, USA; Beijing Technology and Business University, China; Peking University, China; University of Chinese Academy of Sciences, China|Recommender systems are seen as an effective tool to address information overload, but it is widely known that the presence of various biases makes direct training on large-scale observational data result in sub-optimal prediction performance. In contrast, unbiased ratings obtained from randomized controlled trials or A/B tests are considered to be the golden standard, but are costly and small in scale in reality. To exploit both types of data, recent works proposed to use unbiased ratings to correct the parameters of the propensity or imputation models trained on the biased dataset. However, the existing methods fail to obtain accurate predictions in the presence of unobserved confounding or model misspecification. In this paper, we propose a theoretically guaranteed model-agnostic balancing approach that can be applied to any existing debiasing method with the aim of combating unobserved confounding and model misspecification. The proposed approach makes full use of unbiased data by alternatively correcting model parameters learned with biased data, and adaptively learning balance coefficients of biased samples for further debiasing. Extensive real-world experiments are conducted along with the deployment of our proposal on four representative debiasing methods to demonstrate the effectiveness.|推荐系统被视为解决信息超载问题的有效工具,但众所周知,各种偏差的存在使得对大规模观测数据的直接训练导致次优预测性能。相比之下,通过随机对照试验或 A/B 测试获得的无偏评分被认为是黄金标准,但实际上成本高,规模小。为了利用这两种类型的数据,最近的工作建议使用无偏评级来修正倾向或插补模型的参数训练偏向的数据集。然而,现有的方法不能获得准确的预测存在未观察到的混杂或模型错误说明。本文提出了一种理论保证的模型无关平衡方法,该方法可以应用于任何现有的去偏方法,以消除未观察到的混淆和模型不确定性。该方法充分利用无偏数据,通过交替校正有偏数据学习的模型参数,以及有偏样本的自适应学习平衡系数进一步消除偏差。随着我们的建议在四个有代表性的去偏方法上的部署,广泛的现实世界的实验被进行,以证明有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Balancing+Unobserved+Confounding+with+a+Few+Unbiased+Ratings+in+Debiased+Recommendations)|0| +|[Denoising and Prompt-Tuning for Multi-Behavior Recommendation](https://doi.org/10.1145/3543507.3583513)|Chi Zhang, Rui Chen, Xiangyu Zhao, Qilong Han, Li Li|University of Delaware, USA; City University of Hong Kong, Hong Kong; Harbin Engineering University, China|In practical recommendation scenarios, users often interact with items under multi-typed behaviors (e.g., click, add-to-cart, and purchase). Traditional collaborative filtering techniques typically assume that users only have a single type of behavior with items, making it insufficient to utilize complex collaborative signals to learn informative representations and infer actual user preferences. 
Consequently, some pioneer studies explore modeling multi-behavior heterogeneity to learn better representations and boost the performance of recommendations for a target behavior. However, a large number of auxiliary behaviors (i.e., click and add-to-cart) could introduce irrelevant information to recommenders, which could mislead the target behavior (i.e., purchase) recommendation, rendering two critical challenges: (i) denoising auxiliary behaviors and (ii) bridging the semantic gap between auxiliary and target behaviors. Motivated by the above observation, we propose a novel framework-Denoising and Prompt-Tuning (DPT) with a three-stage learning paradigm to solve the aforementioned challenges. In particular, DPT is equipped with a pattern-enhanced graph encoder in the first stage to learn complex patterns as prior knowledge in a data-driven manner to guide learning informative representation and pinpointing reliable noise for subsequent stages. Accordingly, we adopt different lightweight tuning approaches with effectiveness and efficiency in the following stages to further attenuate the influence of noise and alleviate the semantic gap among multi-typed behaviors. Extensive experiments on two real-world datasets demonstrate the superiority of DPT over a wide range of state-of-the-art methods. The implementation code is available online at https://github.com/zc-97/DPT.|在实际的推荐场景中,用户经常在多类型行为下与项目交互(例如,单击、添加到购物车和购买)。传统的协同过滤技术通常假设用户只有单一类型的行为与项目,使其不足以利用复杂的协作信号来学习信息表示和推断实际的用户偏好。因此,一些先驱研究探索建立多行为异质性模型,以学习更好的表示方法,并提高针对目标行为的建议的性能。然而,大量的辅助行为(即点击和添加到购物车)可能会向推荐者引入不相关的信息,这可能会误导目标行为(即购买)推荐,造成两个关键的挑战: (i)去噪辅助行为和(ii)桥接辅助行为和目标行为之间的语义差距。基于上述观察,我们提出了一个新的框架-去噪和及时调整(DPT)与三个阶段的学习范式,以解决上述挑战。特别是,DPT 在第一阶段配备了模式增强图形编码器,以数据驱动的方式学习复杂的模式作为先验知识,以指导学习信息表示和确定后续阶段的可靠噪声。相应地,我们在接下来的阶段采用了不同的轻量化方法,以进一步减小噪声的影响,缓解多类型行为之间的语义鸿沟。在两个实际数据集上的大量实验证明了 DPT 相对于一系列最先进的方法的优越性。实施守则可于网上 https://github.com/zc-97/dpt 下载。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Denoising+and+Prompt-Tuning+for+Multi-Behavior+Recommendation)|0| +|[CAMUS: Attribute-Aware Counterfactual Augmentation for Minority Users in Recommendation](https://doi.org/10.1145/3543507.3583538)|Yuxin Ying, Fuzhen Zhuang, Yongchun Zhu, Deqing Wang, Hongwei Zheng|Beijing Academy of Blockchain and Edge Computing, China; Institute of Computing Technology, Chinese Academy of Sciences, China; School of Computer Science and Engineering, Beihang University, China; Institute of Artificial Intelligence, Beihang University, China|Embedding-based methods currently achieved impressive success in recommender systems. However, such methods are more likely to suffer from bias in data distribution, especially the attribute bias problem. For example, when a certain type of user, like the elderly, occupies the mainstream, the recommendation results of minority users would be seriously affected by the mainstream users’ attributes. To address this problem, most existing methods are proposed from the perspective of fairness, which focuses on eliminating unfairness but deteriorates the recommendation performance. Unlike these methods, in this paper, we focus on improving the recommendation performance for minority users of biased attributes. Along this line, we propose a novel attribute-aware Counterfactual Augmentation framework for Minority Users(CAMUS). Specifically, the CAMUS consists of a counterfactual augmenter, a confidence estimator, and a recommender. 
The counterfactual augmenter conducts data augmentation for the minority group by utilizing the interactions of mainstream users based on a universal counterfactual assumption. Besides, a tri-training-based confidence estimator is applied to ensure the effectiveness of augmentation. Extensive experiments on three real-world datasets have demonstrated the superior performance of the proposed methods. Further case studies verify the universality of the proposed CAMUS framework on different data sparsity, attributes, and models.|基于嵌入的方法目前在推荐系统中取得了令人印象深刻的成功。然而,这些方法更容易受到数据分布偏差的影响,特别是属性偏差问题。例如,当某种类型的用户,如老年人,占据主流时,少数用户的推荐结果会受到主流用户属性的严重影响。为了解决这个问题,现有的方法大多是从公平的角度出发,着重于消除不公平性,但是会降低推荐性能。与这些方法不同的是,本文主要研究如何提高偏向属性的少数用户的推荐性能。在此基础上,我们提出了一种新的面向少数用户的属性感知反事实增强框架(CAMUS)。具体来说,CAMUS 由一个反事实增强器、一个置信度估计器和一个推荐器组成。反事实增强器利用主流用户基于普遍反事实假设的交互作用对少数群体进行数据增强。此外,采用基于三训练的置信估计来保证增广的有效性。在三个实际数据集上的大量实验证明了该方法的优越性能。进一步的案例研究验证了所提出的 CAMUS 框架在不同的数据稀疏性、属性和模型上的通用性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=CAMUS:+Attribute-Aware+Counterfactual+Augmentation+for+Minority+Users+in+Recommendation)|0| +|[Dynamically Expandable Graph Convolution for Streaming Recommendation](https://doi.org/10.1145/3543507.3583237)|Bowei He, Xu He, Yingxue Zhang, Ruiming Tang, Chen Ma|Huawei Noah's Ark Lab Montreal, Canada; City University of Hong Kong, Hong Kong; Department of Computer Science, City University of Hong Kong, Hong Kong; Huawei Noah's Ark Lab, China|Personalized recommender systems have been widely studied and deployed to reduce information overload and satisfy users' diverse needs. However, conventional recommendation models solely conduct a one-time training-test fashion and can hardly adapt to evolving demands, considering user preference shifts and ever-increasing users and items in the real world. To tackle such challenges, the streaming recommendation is proposed and has attracted great attention recently. Among these, continual graph learning is widely regarded as a promising approach for the streaming recommendation by academia and industry. However, existing methods either rely on the historical data replay which is often not practical under increasingly strict data regulations, or can seldom solve the \textit{over-stability} issue. To overcome these difficulties, we propose a novel \textbf{D}ynamically \textbf{E}xpandable \textbf{G}raph \textbf{C}onvolution (DEGC) algorithm from a \textit{model isolation} perspective for the streaming recommendation which is orthogonal to previous methods. Based on the motivation of disentangling outdated short-term preferences from useful long-term preferences, we design a sequence of operations including graph convolution pruning, refining, and expanding to only preserve beneficial long-term preference-related parameters and extract fresh short-term preferences. Moreover, we model the temporal user preference, which is utilized as user embedding initialization, for better capturing the individual-level preference shifts. 
Extensive experiments on the three most representative GCN-based recommendation models and four industrial datasets demonstrate the effectiveness and robustness of our method.|个性化推荐系统已被广泛研究和部署,以减少信息超载和满足用户的不同需求。然而,考虑到用户偏好的变化以及现实世界中不断增加的用户和项目,传统的推荐模型只能进行一次性的训练测试,很难适应不断变化的需求。为了应对这些挑战,流媒体推荐被提出并引起了人们的广泛关注。其中,连续图学习被学术界和工业界广泛认为是一种很有前途的流推荐方法。然而,现有的方法要么依赖于历史数据的重放,这在日益严格的数据规则下往往是不切实际的,要么很少能解决过稳定性问题。为了克服这些困难,我们从模型隔离的角度提出了一种新的动态可扩展图卷积(DEGC)算法用于流式推荐,该算法与以前的方法是正交的。基于将过时的短期偏好与有用的长期偏好分离的动机,我们设计了一系列操作,包括图卷积修剪,细化和扩展,以仅保留有益的长期偏好相关参数并提取新的短期偏好。此外,为了更好地捕获个体层次的偏好变化,我们建立了时态用户偏好模型,并将其用于用户嵌入初始化。在三个最具代表性的基于 GCN 的推荐模型和四个工业数据集上的大量实验证明了该方法的有效性和鲁棒性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Dynamically+Expandable+Graph+Convolution+for+Streaming+Recommendation)|0|
+|[CTRLStruct: Dialogue Structure Learning for Open-Domain Response Generation](https://doi.org/10.1145/3543507.3583285)|Congchi Yin, Piji Li, Zhaochun Ren|Shandong University, China; Nanjing University of Aeronautics and Astronautics, China|Dialogue structure discovery is essential in dialogue generation. Well-structured topic flow can leverage background information and predict future topics to help generate controllable and explainable responses. However, most previous work focused on dialogue structure learning in task-oriented dialogue other than open-domain dialogue which is more complicated and challenging. In this paper, we present a new framework CTRLStruct for dialogue structure learning to effectively explore topic-level dialogue clusters as well as their transitions with unlabelled information. Precisely, dialogue utterances encoded by bi-directional Transformer are further trained through a special designed contrastive learning task to improve representation. Then we perform clustering to utterance-level representations and form topic-level clusters that can be considered as vertices in dialogue structure graph. The edges in the graph indicating transition probability between vertices are calculated by mimicking expert behavior in datasets. Finally, dialogue structure graph is integrated into dialogue model to perform controlled response generation. Experiments on two popular open-domain dialogue datasets show our model can generate more coherent responses compared to some excellent dialogue models, as well as outperform some typical sentence embedding methods in dialogue utterance representation. Code is available in GitHub.|对话结构的发现是对话生成的基础。结构良好的主题流可以利用背景信息并预测未来的主题,从而帮助产生可控的和可解释的反应。然而,以往的研究大多侧重于面向任务的对话中的对话结构学习,而开放领域的对话更为复杂和具有挑战性。本文提出了一种新的对话结构学习框架 CTRLstruct,以有效地探索话题层次的对话群及其与未标记信息的过渡。准确地说,双向变压器编码的对话话语通过特别设计的对比学习任务进一步训练,以提高表征能力。然后对话语层面的表征进行聚类,形成话题层面的聚类,这些聚类可以看作是对话结构图中的顶点。通过模拟数据集中的专家行为,计算图中表示顶点间转移概率的边。最后,将对话结构图集成到对话模型中进行受控响应的生成。实验结果表明,与一些优秀的对话模型相比,该模型能够产生更加连贯的对话响应,并且在对话话语表征中优于一些典型的句子嵌入方法。代码可在 GitHub 中获得。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=CTRLStruct:+Dialogue+Structure+Learning+for+Open-Domain+Response+Generation)|0|
+|[BlinkViz: Fast and Scalable Approximate Visualization on Very Large Datasets using Neural-Enhanced Mixed Sum-Product Networks](https://doi.org/10.1145/3543507.3583411)|Yimeng Qiao, Yinan Jing, Hanbing Zhang, Zhenying He, Kai Zhang, X. Sean Wang|Shanghai Key Laboratory of Data Science, School of Software, Fudan University, China; Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University, China|Web-based online interactive visual analytics enjoys popularity in recent years. 
Traditionally, visualizations are produced directly from querying the underlying data. However, for a very large dataset, this way is so time-consuming that it cannot meet the low-latency requirements of interactive visual analytics. In this paper, we propose a learning-based visualization approach called BlinkViz, which uses a learned model to produce approximate visualizations by leveraging mixed sum-product networks to learn the distribution of the original data. In such a way, it makes visualization faster and more scalable by decoupling visualization and data. In addition, to improve the accuracy of approximate visualizations, we propose an enhanced model by incorporating a neural network with residual structures, which can refine prediction results, especially for visual requests with low selectivity. Extensive experiments show that BlinkViz is extremely fast even on a large dataset with hundreds of millions of data records (over 30GB), responding in sub-seconds (from 2ms to less than 500ms for different requests) while keeping a low error rate. Furthermore, our approach remains scalable on latency and memory footprint size regardless of data size.|基于 Web 的在线交互式可视化分析近年来很受欢迎。传统上,可视化是通过查询底层数据直接生成的。然而,对于一个非常大的数据集,这种方法非常耗时,不能满足交互式可视化分析的低延迟要求。本文提出了一种基于学习的可视化方法 BlinkViz,该方法利用学习模型,通过混合和积网络来学习原始数据的分布情况,从而产生近似可视化效果。通过这种方式,可视化和数据解耦,使得可视化更快、更具可伸缩性。此外,为了提高近似可视化的准确性,我们提出了一个增强的模型,通过结合残差结构的神经网络,可以细化预测结果,特别是对低选择性的可视化请求。大量的实验表明,BlinkViz 即使在拥有数亿条数据记录(超过30GB)的大型数据集上也是极其快速的,响应时间在亚秒(对于不同的请求,响应时间从2毫秒到不到500毫秒) ,同时保持较低的错误率。此外,无论数据大小如何,我们的方法在延迟和内存占用大小上都是可伸缩的。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=BlinkViz:+Fast+and+Scalable+Approximate+Visualization+on+Very+Large+Datasets+using+Neural-Enhanced+Mixed+Sum-Product+Networks)|0| |[Semi-supervised Adversarial Learning for Complementary Item Recommendation](https://doi.org/10.1145/3543507.3583462)|Koby Bibas, Oren Sar Shalom, Dietmar Jannach|Amazon, Israel; University of Klagenfurt, Austria; Meta, Israel|Complementary item recommendations are a ubiquitous feature of modern e-commerce sites. Such recommendations are highly effective when they are based on collaborative signals like co-purchase statistics. In certain online marketplaces, however, e.g., on online auction sites, constantly new items are added to the catalog. In such cases, complementary item recommendations are often based on item side-information due to a lack of interaction data. In this work, we propose a novel approach that can leverage both item side-information and labeled complementary item pairs to generate effective complementary recommendations for cold items, i.e., for items for which no co-purchase statistics yet exist. Given that complementary items typically have to be of a different category than the seed item, we technically maintain a latent space for each item category. Simultaneously, we learn to project distributed item representations into these category spaces to determine suitable recommendations. The main learning process in our architecture utilizes labeled pairs of complementary items. In addition, we adopt ideas from Cycle Generative Adversarial Networks (CycleGAN) to leverage available item information even in case no labeled data exists for a given item and category. 
Experiments on three e-commerce datasets show that our method is highly effective.|互补商品推荐是现代电子商务网站的一个普遍特征。当这些建议基于合作信号(如共同购买统计数据)时,它们是非常有效的。然而,在某些在线市场,例如在线拍卖网站,不断有新物品被添加到目录中。在这种情况下,由于缺乏交互数据,补充项目推荐通常基于项目侧信息。在这项工作中,我们提出了一个新颖的方法,可以利用项目侧信息和标记的互补项目对,以产生有效的补充建议,冷的项目,即项目,共同购买统计数据尚不存在。鉴于补充项目通常必须是一个与种子项目不同的类别,我们在技术上为每个项目类别保持一个潜在的空间。同时,我们学习将分布式项表示投射到这些类别空间中,以确定合适的建议。在我们的建筑中,主要的学习过程是使用标记成对的互补项目。此外,我们采用循环生成对抗网络(CycleGAN)的思想来利用可用的项目信息,即使在给定的项目和类别没有标记数据存在的情况下。在三个电子商务数据集上的实验表明,该方法是高效的。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Semi-supervised+Adversarial+Learning+for+Complementary+Item+Recommendation)|0| |[MaSS: Model-agnostic, Semantic and Stealthy Data Poisoning Attack on Knowledge Graph Embedding](https://doi.org/10.1145/3543507.3583203)|Xiaoyu You, Beina Sheng, Daizong Ding, Mi Zhang, Xudong Pan, Min Yang, Fuli Feng|Fudan University, School of Computer Science, China; University of Science and Technology of China, CCCD Key Lab of Ministry of Culture and Tourism, China|Open-source knowledge graphs are attracting increasing attention. Nevertheless, the openness also raises the concern of data poisoning attacks, that is, the attacker could submit malicious facts to bias the prediction of knowledge graph embedding (KGE) models. Existing studies on such attacks adopt a clear-box setting and neglect the semantic information of the generated facts, making them fail to attack in real-world scenarios. In this work, we consider a more rigorous setting and propose a model-agnostic, semantic, and stealthy data poisoning attack on KGE models from a practical perspective. The main design of our work is to inject indicative paths to make the infected model predict certain malicious facts. With the aid of the proposed opaque-box path injection theory, we theoretically reveal that the attack success rate under the opaque-box setting is determined by the plausibility of triplets on the indicative path. Based on this, we develop a novel and efficient algorithm to search paths that maximize the attack goal, satisfy certain semantic constraints, and preserve certain stealthiness, i.e., the normal functionality of the target KGE will not be influenced although it predicts wrong facts given certain queries. Through extensive evaluation of benchmark datasets and 6 typical knowledge graph embedding models as the victims, we validate the effectiveness in terms of attack success rate (ASR) under opaque-box setting and stealthiness. 
For example, on FB15k-237, our attack achieves a high ASR on DeepPath, with a high average ASR when attacking various KGE models under the opaque-box setting.|开源知识图表越来越受到人们的关注。然而,这种开放性也引起了人们对数据中毒攻击的担忧,即攻击者可能会提交恶意事实来偏向知识图嵌入(KGE)模型的预测。现有的关于此类攻击的研究采用了清晰框设置,忽视了所生成事实的语义信息,使得它们无法在现实世界中进行攻击。在这项工作中,我们考虑了一个更严格的设置,并提出了一个模型无关,语义,隐秘的数据中毒攻击的 KGE 模型从实用的角度。我们工作的主要设计是注入指示性路径,使被感染的模型能够预测某些恶意事实。借助所提出的不透明盒路径注入理论,我们从理论上揭示了在不透明盒设置下的攻击成功率取决于指示路径上三联体的合理性。在此基础上,提出了一种新的高效的路径搜索算法,该算法能够使攻击目标最大化,满足一定的语义约束,并保持一定的隐蔽性。通过对基准数据集和6种典型的知识图嵌入模型作为受害者的广泛评估,验证了在不透明框设置和隐蔽性条件下的攻击成功率(ASR)的有效性。例如,在 FB15k-237上,我们的攻击在 DeepPath 上达到较高的 ASR,在不透明盒子设置下攻击各种 KGE 模型时达到较高的平均 ASR。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=MaSS:+Model-agnostic,+Semantic+and+Stealthy+Data+Poisoning+Attack+on+Knowledge+Graph+Embedding)|0|
-|[TaxoComplete: Self-Supervised Taxonomy Completion Leveraging Position-Enhanced Semantic Matching](https://doi.org/10.1145/3543507.3583342)|Ines Arous, Ljiljana Dolamic, Philippe Cudré-Mauroux|University of Fribourg, Switzerland; armasuisse, Switzerland|Taxonomies are used to organize knowledge in many applications, including recommender systems, content browsing, or web search. With the emergence of new concepts, static taxonomies become obsolete as they fail to capture up-to-date knowledge. Several approaches have been proposed to address the problem of maintaining taxonomies automatically. These approaches typically rely on a limited set of neighbors to represent a given node in the taxonomy. However, considering distant nodes could improve the representation of some portions of the taxonomy, especially for those nodes situated in the periphery or in sparse regions of the taxonomy. In this work, we propose TaxoComplete, a self-supervised taxonomy completion framework that learns the representation of nodes leveraging their position in the taxonomy. TaxoComplete uses a self-supervision generation process that selects some nodes and associates each of them with an anchor set, which is a set composed of nodes in the close and distant neighborhood of the selected node. Using self-supervision data, TaxoComplete learns a position-enhanced node representation using two components: (1) a query-anchor semantic matching mechanism, which encodes pairs of nodes and matches their semantic distance to their graph distance, such that nodes that are close in the taxonomy are placed closely in the shared embedding space while distant nodes are placed further apart; (2) a direction-aware propagation module, which embeds the direction of edges in node representation, such that we discriminate ⟨node, parent⟩ relations from other taxonomic relations. Our approach allows the representation of nodes to encapsulate information from a large neighborhood while being aware of the distance separating pairs of nodes in the taxonomy. 
Extensive experiments on four real-world and large-scale datasets show that TaxoComplete is substantially more effective than state-of-the-art methods (2x more effective in terms of [email protected] ).|分类法用于在许多应用程序中组织知识,包括推荐系统、内容浏览或网络搜索。随着新概念的出现,静态分类法变得过时,因为它们无法捕获最新的知识。已经提出了几种方法来解决自动维护分类法的问题。这些方法通常依赖于一组有限的邻居来表示分类法中的给定节点。然而,考虑远处的节点可以改善分类学的某些部分的表示,特别是对于那些位于分类学的边缘或稀疏区域的节点。在这项工作中,我们提出了 TaxoComplete,一个自我监督的分类完成框架,它学习利用节点在分类中的位置来表示节点。TaxoComplete 使用一个自我监督的生成过程,它选择一些节点,并将它们与锚集相关联,锚集是由所选节点的近邻和远邻的节点组成的集合。TaxoComplete 利用自我监督数据学习位置增强的节点表示,它使用两个组件: (1)查询锚语义匹配机制,对节点对进行编码,并将它们的语义距离匹配到它们的图形距离上,这样在分类学中相近的节点被紧密地放置在共享嵌入空间中,而远处的节点被放置得更远; (2)方向感知传播模块,这种模块在节点表示中嵌入边的方向,这样我们可以区分 < 节点,父节点 > 关系和其他分类关系。我们的方法允许节点的表示来封装来自大邻居的信息,同时知道分类法中节点对之间的距离。在四个真实世界和大规模数据集上的大量实验表明,TaxoComplete 实质上比最先进的方法更有效(在[ email protected ]方面效率高出2倍)。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=TaxoComplete:+Self-Supervised+Taxonomy+Completion+Leveraging+Position-Enhanced+Semantic+Matching)|0|
+|[TaxoComplete: Self-Supervised Taxonomy Completion Leveraging Position-Enhanced Semantic Matching](https://doi.org/10.1145/3543507.3583342)|Ines Arous, Ljiljana Dolamic, Philippe Cudré-Mauroux|armasuisse, Switzerland; University of Fribourg, Switzerland|Taxonomies are used to organize knowledge in many applications, including recommender systems, content browsing, or web search. With the emergence of new concepts, static taxonomies become obsolete as they fail to capture up-to-date knowledge. Several approaches have been proposed to address the problem of maintaining taxonomies automatically. These approaches typically rely on a limited set of neighbors to represent a given node in the taxonomy. However, considering distant nodes could improve the representation of some portions of the taxonomy, especially for those nodes situated in the periphery or in sparse regions of the taxonomy. In this work, we propose TaxoComplete, a self-supervised taxonomy completion framework that learns the representation of nodes leveraging their position in the taxonomy. TaxoComplete uses a self-supervision generation process that selects some nodes and associates each of them with an anchor set, which is a set composed of nodes in the close and distant neighborhood of the selected node. Using self-supervision data, TaxoComplete learns a position-enhanced node representation using two components: (1) a query-anchor semantic matching mechanism, which encodes pairs of nodes and matches their semantic distance to their graph distance, such that nodes that are close in the taxonomy are placed closely in the shared embedding space while distant nodes are placed further apart; (2) a direction-aware propagation module, which embeds the direction of edges in node representation, such that we discriminate ⟨node, parent⟩ relations from other taxonomic relations. Our approach allows the representation of nodes to encapsulate information from a large neighborhood while being aware of the distance separating pairs of nodes in the taxonomy. 
Extensive experiments on four real-world and large-scale datasets show that TaxoComplete is substantially more effective than state-of-the-art methods (2x more effective in terms of [email protected] ).|分类法用于在许多应用程序中组织知识,包括推荐系统、内容浏览或网络搜索。随着新概念的出现,静态分类法变得过时,因为它们无法捕获最新的知识。已经提出了几种方法来解决自动维护分类法的问题。这些方法通常依赖于一组有限的邻居来表示分类法中的给定节点。然而,考虑远处的节点可以改善分类学的某些部分的表示,特别是对于那些位于分类学的边缘或稀疏区域的节点。在这项工作中,我们提出了 TaxoComplete,一个自我监督的分类完成框架,它学习利用节点在分类中的位置来表示节点。TaxoComplete 使用一个自我监督的生成过程,它选择一些节点,并将它们与锚集相关联,锚集是由所选节点的近邻和远邻的节点组成的集合。TaxoComplete 利用自我监督数据学习位置增强的节点表示,它使用两个组件: (1)查询锚语义匹配机制,对节点对进行编码,并将它们的语义距离匹配到它们的图形距离上,这样在分类学中相近的节点被紧密地放置在共享嵌入空间中,而远处的节点被放置得更远; (2)方向感知传播模块,这种模块在节点表示中嵌入边的方向,这样我们可以区分 < 节点,父节点 > 关系和其他分类关系。我们的方法允许节点的表示来封装来自大邻居的信息,同时知道分类法中节点对之间的距离。在四个真实世界和大规模数据集上的大量实验表明,TaxoComplete 实质上比最先进的方法更有效(在[ email protected ]方面效率高出2倍)。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=TaxoComplete:+Self-Supervised+Taxonomy+Completion+Leveraging+Position-Enhanced+Semantic+Matching)|0| |[Bipartite Graph Convolutional Hashing for Effective and Efficient Top-N Search in Hamming Space](https://doi.org/10.1145/3543507.3583219)|Yankai Chen, Yixiang Fang, Yifei Zhang, Irwin King|The Chinese University of Hong Kong, Hong Kong; The Chinese University of Hong Kong, Shenzhen, China|Searching on bipartite graphs is basal and versatile to many real-world Web applications, e.g., online recommendation, database retrieval, and query-document searching. Given a query node, the conventional approaches rely on the similarity matching with the vectorized node embeddings in the continuous Euclidean space. To efficiently manage intensive similarity computation, developing hashing techniques for graph structured data has recently become an emerging research direction. Despite the retrieval efficiency in Hamming space, prior work is however confronted with catastrophic performance decay. In this work, we investigate the problem of hashing with Graph Convolutional Network on bipartite graphs for effective Top-N search. We propose an end-to-end Bipartite Graph Convolutional Hashing approach, namely BGCH, which consists of three novel and effective modules: (1) adaptive graph convolutional hashing, (2) latent feature dispersion, and (3) Fourier serialized gradient estimation. Specifically, the former two modules achieve the substantial retention of the structural information against the inevitable information loss in hash encoding; the last module develops Fourier Series decomposition to the hashing function in the frequency domain mainly for more accurate gradient estimation. 
The extensive experiments on six real-world datasets not only show the performance superiority over the competing hashing-based counterparts, but also demonstrate the effectiveness of all proposed model components contained therein.|对于许多实际的 Web 应用程序,如在线推荐、数据库检索和查询文档搜索,二部图搜索是基础的和通用的。给定一个查询节点,传统的方法依赖于与连续欧氏空间中向量化节点嵌入的相似性匹配。为了有效地管理密集型相似度计算,开发图结构化数据的哈希技术已成为一个新兴的研究方向。尽管在汉明空间中反演效率很高,但是先前的工作却面临着灾难性的性能衰减。在本文中,我们研究了用图卷积网络对二部图进行散列以获得有效的 Top-N 搜索的问题。提出了一种端到端的二部图卷积哈希方法,即 BGCH,它由三个新颖有效的模块组成: (1)自适应图卷积哈希,(2)潜在特征分散,(3)傅里叶序列化梯度估计。具体来说,前两个模块针对哈希编码中不可避免的信息损失实现了结构信息的实质性保留,最后一个模块对频域中的哈希函数进行了傅里叶级数分解,以便更准确地进行梯度估计。在六个实际数据集上进行的大量实验不仅表明了该模型的性能优于竞争对手的散列模型,而且证明了其中所有模型组件的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Bipartite+Graph+Convolutional+Hashing+for+Effective+and+Efficient+Top-N+Search+in+Hamming+Space)|0| -|[LED: Lexicon-Enlightened Dense Retriever for Large-Scale Retrieval](https://doi.org/10.1145/3543507.3583294)|Kai Zhang, Chongyang Tao, Tao Shen, Can Xu, Xiubo Geng, Binxing Jiao, Daxin Jiang|University of Technology Sydney, Australia; The Ohio State University, USA; Microsoft, China|Retrieval models based on dense representations in semantic space have become an indispensable branch for first-stage retrieval. These retrievers benefit from surging advances in representation learning towards compressive global sequence-level embeddings. However, they are prone to overlook local salient phrases and entity mentions in texts, which usually play pivot roles in first-stage retrieval. To mitigate this weakness, we propose to make a dense retriever align a well-performing lexicon-aware representation model. The alignment is achieved by weakened knowledge distillations to enlighten the retriever via two aspects -- 1) a lexicon-augmented contrastive objective to challenge the dense encoder and 2) a pair-wise rank-consistent regularization to make dense model's behavior incline to the other. We evaluate our model on three public benchmarks, which shows that with a comparable lexicon-aware retriever as the teacher, our proposed dense one can bring consistent and significant improvements, and even outdo its teacher. In addition, we found our improvement on the dense retriever is complementary to the standard ranker distillation, which can further lift state-of-the-art performance.|基于语义空间密集表示的检索模型已经成为第一阶段检索不可或缺的分支。这些检索器受益于表示学习向压缩全局序列级嵌入方向的飞速发展。然而,它们往往忽视文本中的局部显著短语和实体提及,而这些短语和实体提及在第一阶段的检索中起着关键作用。为了减轻这个弱点,我们建议使一个密集检索对齐一个良好的表现词典感知表示模型。该方法通过弱化知识提取,从两个方面启发检索者: 1)词典增强对比目标,挑战密集编码器; 2)成对秩一致正则化,使密集模型的行为倾向于另一方。我们在三个公共基准上评估了我们的模型,结果表明,与一个具有可比性的词汇感知检索器作为教师,我们提出的密集型可以带来一致和重大的改进,甚至超过它的老师。此外,我们发现我们对稠密检索器的改进是对标准等级精馏的补充,它可以进一步提高最先进的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=LED:+Lexicon-Enlightened+Dense+Retriever+for+Large-Scale+Retrieval)|0| -|[A Passage-Level Reading Behavior Model for Mobile Search](https://doi.org/10.1145/3543507.3583343)|Zhijing Wu, Jiaxin Mao, Kedi Xu, Dandan Song, Heyan Huang|Gaoling School of Artificial Intelligence, Renmin University of China, China; School of Computer Science and Technology, Beijing Institute of Technology, China; School of Computer Science, Carnegie Mellon University, USA|Reading is a vital and complex cognitive activity during users’ information-seeking process. Several studies have focused on understanding users’ reading behavior in desktop search. Their findings greatly contribute to the design of information retrieval models. 
However, little is known about how users read a result in mobile search, although search currently happens more frequently in mobile scenarios. In this paper, we conduct a lab-based user study to investigate users’ fine-grained reading behavior patterns in mobile search. We find that users’ reading attention allocation is strongly affected by several behavior biases, such as position and selection biases. Inspired by these findings, we propose a probabilistic generative model, the Passage-level Reading behavior Model (PRM), to model users’ reading behavior in mobile search. The PRM utilizes observable passage-level exposure and viewport duration events to infer users’ unobserved skimming event, reading event, and satisfaction perception during the reading process. Besides fitting the passage-level reading behavior, we utilize the fitted parameters of PRM to estimate the passage-level and document-level relevance. Experimental results show that PRM outperforms existing unsupervised relevance estimation models. PRM has strong interpretability and provides valuable insights into the understanding of how users seek and perceive useful information in mobile search.|阅读是用户信息搜索过程中一种重要而复杂的认知活动。一些研究集中在了解用户在桌面搜索中的阅读行为。他们的发现极大地促进了信息检索模型的设计。然而,尽管目前搜索在移动场景中出现的频率更高,但人们对用户在移动搜索中如何阅读结果知之甚少。本文以实验室为基础,对移动搜索中用户的细粒度阅读行为模式进行了研究。研究发现,位置偏差、选择偏差等行为偏差对用户的阅读注意分配有显著影响。受这些发现的启发,我们提出了一个概率生成模型——短文水平阅读行为模型(PRM) ,来模拟用户在移动搜索中的阅读行为。PRM 利用可观察到的文章水平暴露和视窗持续时间事件来推断用户在阅读过程中未观察到的略读事件、阅读事件和满意度感知。除了拟合文章阅读行为外,我们还利用 PRM 的拟合参数来估计文章阅读水平和文档阅读水平的相关性。实验结果表明,PRM 算法的性能优于现有的无监督相关估计模型。PRM 具有很强的可解释性,为理解用户在移动搜索中如何寻找和感知有用信息提供了有价值的见解。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Passage-Level+Reading+Behavior+Model+for+Mobile+Search)|0| -|[PROD: Progressive Distillation for Dense Retrieval](https://doi.org/10.1145/3543507.3583421)|Zhenghao Lin, Yeyun Gong, Xiao Liu, Hang Zhang, Chen Lin, Anlei Dong, Jian Jiao, Jingwen Lu, Daxin Jiang, Rangan Majumder, Nan Duan|Microsoft, USA; Microsoft Research Asia, China; School of Informatics, Xiamen University, China; Microsoft, China|Knowledge distillation is an effective way to transfer knowledge from a strong teacher to an efficient student model. Ideally, we expect the better the teacher is, the better the student. However, this expectation does not always come true. It is common that a better teacher model results in a bad student via distillation due to the nonnegligible gap between teacher and student. To bridge the gap, we propose PROD, a PROgressive Distillation method, for dense retrieval. PROD consists of a teacher progressive distillation and a data progressive distillation to gradually improve the student. We conduct extensive experiments on five widely-used benchmarks, MS MARCO Passage, TREC Passage 19, TREC Document 19, MS MARCO Document and Natural Questions, where PROD achieves the state-of-the-art within the distillation methods for dense retrieval. 
The code and models will be released.|知识提取是将知识从一个强有力的教师转化为一个有效的学生模型的有效途径。理想情况下,我们期望老师越好,学生越好。然而,这种期望并不总是能够实现。由于教师与学生之间存在着不可忽视的差距,一个较好的教师模型常常通过精馏的方法导致不好的学生。为了填补这一空白,我们提出了一种逐步精馏方法 PROD,用于密集提取。PROD 由一个教师的逐步升华和一个数据的逐步升华组成,以逐步提高学生的素质。我们对五个广泛使用的基准进行了广泛的实验,即 MS MARCO Passage,TREC Passage 19,TREC Document 19,MS MARCO Document and Natural Questions,PROD 在这些基准上实现了用于密集检索的蒸馏方法的最先进水平。代码和模型将被发布。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=PROD:+Progressive+Distillation+for+Dense+Retrieval)|0| -|[Ad Auction Design with Coupon-Dependent Conversion Rate in the Auto-bidding World](https://doi.org/10.1145/3543507.3583230)|Bonan Ni, Xun Wang, Qi Zhang, Pingzhong Tang, Zhourong Chen, Tianjiu Yin, Liangni Lu, Xiaobing Liu, Kewu Sun, Zhe Ma|Institute for Interdisciplinary Information Sciences, Tsinghua University, China; Intelligent Science & Technology Academy of CASIC, China and Scientific Research Key Laboratory of Aerospace Defence Intelligent Systems and Technology, China; TuringSense, China and Institute for Interdisciplinary Information Sciences, Tsinghua University, China; ByteDance, China|Online advertising has become a dominant source of revenue of the Internet. In classic auction theory, only the auctioneer (i.e., the platform) and buyers (i.e., the advertisers) are involved, while the advertising audiences are ignored. For ecommerce advertising, however, the platform can provide coupons for the advertising audiences and nudge them into purchasing more products at lower prices (e.g., 2 dollars off the regular price). Such promotions can lead to an increase in amount and value of purchases. In this paper, we jointly design the coupon value computation, slot allocation, and payment of online advertising in an auto-bidding world. Firstly, we propose the auction mechanism, named CFA-auction (i.e., Coupon-For-the-Audiences-auction), which takes advertising audiences into account in the auction design. We prove the existence of pacing equilibrium, and show that CFA-auction satisfies the IC (incentive compatibility), IR (individual rationality) constraints. Then, we study the optimality of CFA-auction, and prove it can maintain an approximation of the optimal. Finally, experimental evaluation results on both offline dataset as well as online A/B test demonstrate the effectiveness of CFA-auction.|在线广告已成为互联网收入的主要来源。在传统的拍卖理论中,只有拍卖商(即平台)和买家(即广告商)参与,而广告受众被忽略。然而,对于电子商务广告来说,这个平台可以为广告受众提供优惠券,推动他们以更低的价格购买更多的产品(例如,比正常价格低2美元)。此类促销活动可能导致购买数量和价值的增加。在这篇论文中,我们共同设计了自动竞价世界中的优惠券价值计算、时段分配和在线广告支付。首先,我们提出了一种拍卖机制,即 CFA 拍卖(即为受众提供优惠券的拍卖) ,该机制在拍卖设计中考虑了广告受众。我们证明了节奏均衡的存在,并且证明了 CFA 拍卖满足集成激励相容(IC)、个人理性(IR)约束。然后,我们研究了 CFA 拍卖的最优性,并证明了它可以保持最优的近似。最后,通过对离线数据集和在线 A/B 测试的实验结果验证了 CFA 拍卖的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Ad+Auction+Design+with+Coupon-Dependent+Conversion+Rate+in+the+Auto-bidding+World)|0| -|[A Reference-Dependent Model for Web Search Evaluation: Understanding and Measuring the Experience of Boundedly Rational Users](https://doi.org/10.1145/3543507.3583551)|Nuo Chen, Jiqun Liu, Tetsuya Sakai|Waseda University, Japan; The University of Oklahoma, USA|Previous researches demonstrate that users’ actions in search interaction are associated with relative gains and losses to reference points, known as the reference dependence effect. However, this widely confirmed effect is not represented in most user models underpinning existing search evaluation metrics. 
In this study, we propose a new evaluation metric framework, namely Reference Dependent Metric (ReDeM), for assessing query-level search by incorporating the effect of reference dependence into the modelling of user search behavior. To test the overall effectiveness of the proposed framework, (1) we evaluate the performance, in terms of correlation with user satisfaction, of ReDeMs built upon different reference points against that of the widely-used metrics on three search datasets; (2) we examine the performance of ReDeMs under different task states, like task difficulty and task urgency; and (3) we analyze the statistical reliability of ReDeMs in terms of discriminative power. Experimental results indicate that: (1) ReDeMs integrated with a proper reference point achieve better correlations with user satisfaction than most of the existing metrics, like Discounted Cumulative Gain (DCG) and Rank-Biased Precision (RBP), even though their parameters have already been well-tuned; (2) ReDeMs reach relatively better performance compared to existing metrics when the task triggers a high-level cognitive load; (3) the discriminative power of ReDeMs is far stronger than Expected Reciprocal Rank (ERR), slightly stronger than Precision and similar to DCG, RBP and INST. To our knowledge, this study is the first to explicitly incorporate the reference dependence effect into the user browsing model and offline evaluation metrics. Our work illustrates a promising approach to leveraging the insights about user biases from cognitive psychology in better evaluating user search experience and enhancing user models.|以往的研究表明,用户在搜索交互中的行为与参考点的相对收益和相对损失有关,称为参考依赖效应。然而,这种被广泛证实的效果在大多数支持现有搜索评估指标的用户模型中并没有体现出来。在这项研究中,我们提出了一个新的评估度量框架,即参考依赖度量(ReDeM) ,通过将参考依赖的影响纳入用户搜索行为的建模来评估查询级搜索。为了测试提出的框架的整体有效性,(1)我们评估建立在不同参考点上的 ReDeM 与用户满意度的相关性,与三个搜索数据集上广泛使用的指标的相关性; (2)我们检查 ReDeM 在不同任务状态下的表现,如任务难度和任务紧迫性; (3)我们分析 ReDeM 在区分能力方面的统计可靠性。实验结果表明: (1)与合适的参考点相结合的 ReDeMs 与用户满意度的相关性优于大多数现有指标,如 DCG 和 RBP,尽管它们的参数已经得到了很好的调整; (2)当任务触发高水平的认知负荷时,与现有指标相比,ReDeMs 获得了相对更好的性能; (3) ReDeMs 的区分能力远远强于期望互惠秩序(ERR) ,略强于精度,类似于 DCG、 RBP 和 INST。据我们所知,这项研究是第一个明确地将参考依赖效应纳入用户浏览模型和离线评价指标。我们的工作说明了一种有前途的方法,利用认知心理学对用户偏见的洞察力,更好地评估用户搜索体验和增强用户模型。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Reference-Dependent+Model+for+Web+Search+Evaluation:+Understanding+and+Measuring+the+Experience+of+Boundedly+Rational+Users)|0| +|[LED: Lexicon-Enlightened Dense Retriever for Large-Scale Retrieval](https://doi.org/10.1145/3543507.3583294)|Kai Zhang, Chongyang Tao, Tao Shen, Can Xu, Xiubo Geng, Binxing Jiao, Daxin Jiang|University of Technology Sydney, Australia; Microsoft, China; The Ohio State University, USA|Retrieval models based on dense representations in semantic space have become an indispensable branch for first-stage retrieval. These retrievers benefit from surging advances in representation learning towards compressive global sequence-level embeddings. However, they are prone to overlook local salient phrases and entity mentions in texts, which usually play pivot roles in first-stage retrieval. To mitigate this weakness, we propose to make a dense retriever align a well-performing lexicon-aware representation model. The alignment is achieved by weakened knowledge distillations to enlighten the retriever via two aspects -- 1) a lexicon-augmented contrastive objective to challenge the dense encoder and 2) a pair-wise rank-consistent regularization to make dense model's behavior incline to the other. 
We evaluate our model on three public benchmarks, which shows that with a comparable lexicon-aware retriever as the teacher, our proposed dense one can bring consistent and significant improvements, and even outdo its teacher. In addition, we found our improvement on the dense retriever is complementary to the standard ranker distillation, which can further lift state-of-the-art performance.|基于语义空间密集表示的检索模型已经成为第一阶段检索不可或缺的分支。这些检索器受益于表示学习向压缩全局序列级嵌入方向的飞速发展。然而,它们往往忽视文本中的局部显著短语和实体提及,而这些短语和实体提及在第一阶段的检索中起着关键作用。为了减轻这个弱点,我们建议使一个密集检索对齐一个良好的表现词典感知表示模型。该方法通过弱化知识提取,从两个方面启发检索者: 1)词典增强对比目标,挑战密集编码器; 2)成对秩一致正则化,使密集模型的行为倾向于另一方。我们在三个公共基准上评估了我们的模型,结果表明,与一个具有可比性的词汇感知检索器作为教师,我们提出的密集型可以带来一致和重大的改进,甚至超过它的老师。此外,我们发现我们对稠密检索器的改进是对标准等级精馏的补充,它可以进一步提高最先进的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=LED:+Lexicon-Enlightened+Dense+Retriever+for+Large-Scale+Retrieval)|0| +|[A Passage-Level Reading Behavior Model for Mobile Search](https://doi.org/10.1145/3543507.3583343)|Zhijing Wu, Jiaxin Mao, Kedi Xu, Dandan Song, Heyan Huang|School of Computer Science, Carnegie Mellon University, USA; School of Computer Science and Technology, Beijing Institute of Technology, China; Gaoling School of Artificial Intelligence, Renmin University of China, China|Reading is a vital and complex cognitive activity during users’ information-seeking process. Several studies have focused on understanding users’ reading behavior in desktop search. Their findings greatly contribute to the design of information retrieval models. However, little is known about how users read a result in mobile search, although search currently happens more frequently in mobile scenarios. In this paper, we conduct a lab-based user study to investigate users’ fine-grained reading behavior patterns in mobile search. We find that users’ reading attention allocation is strongly affected by several behavior biases, such as position and selection biases. Inspired by these findings, we propose a probabilistic generative model, the Passage-level Reading behavior Model (PRM), to model users’ reading behavior in mobile search. The PRM utilizes observable passage-level exposure and viewport duration events to infer users’ unobserved skimming event, reading event, and satisfaction perception during the reading process. Besides fitting the passage-level reading behavior, we utilize the fitted parameters of PRM to estimate the passage-level and document-level relevance. Experimental results show that PRM outperforms existing unsupervised relevance estimation models. 
PRM has strong interpretability and provides valuable insights into the understanding of how users seek and perceive useful information in mobile search.|阅读是用户信息搜索过程中一种重要而复杂的认知活动。一些研究集中在了解用户在桌面搜索中的阅读行为。他们的发现极大地促进了信息检索模型的设计。然而,尽管目前搜索在移动场景中出现的频率更高,但人们对用户在移动搜索中如何阅读结果知之甚少。本文以实验室为基础,对移动搜索中用户的细粒度阅读行为模式进行了研究。研究发现,位置偏差、选择偏差等行为偏差对用户的阅读注意分配有显著影响。受这些发现的启发,我们提出了一个概率生成模型——短文水平阅读行为模型(PRM) ,来模拟用户在移动搜索中的阅读行为。PRM 利用可观察到的文章水平暴露和视窗持续时间事件来推断用户在阅读过程中未观察到的略读事件、阅读事件和满意度感知。除了拟合文章阅读行为外,我们还利用 PRM 的拟合参数来估计文章阅读水平和文档阅读水平的相关性。实验结果表明,PRM 算法的性能优于现有的无监督相关估计模型。PRM 具有很强的可解释性,为理解用户在移动搜索中如何寻找和感知有用信息提供了有价值的见解。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Passage-Level+Reading+Behavior+Model+for+Mobile+Search)|0| +|[PROD: Progressive Distillation for Dense Retrieval](https://doi.org/10.1145/3543507.3583421)|Zhenghao Lin, Yeyun Gong, Xiao Liu, Hang Zhang, Chen Lin, Anlei Dong, Jian Jiao, Jingwen Lu, Daxin Jiang, Rangan Majumder, Nan Duan|Microsoft Research Asia, China; Microsoft, China; Microsoft, USA; School of Informatics, Xiamen University, China|Knowledge distillation is an effective way to transfer knowledge from a strong teacher to an efficient student model. Ideally, we expect the better the teacher is, the better the student. However, this expectation does not always come true. It is common that a better teacher model results in a bad student via distillation due to the nonnegligible gap between teacher and student. To bridge the gap, we propose PROD, a PROgressive Distillation method, for dense retrieval. PROD consists of a teacher progressive distillation and a data progressive distillation to gradually improve the student. We conduct extensive experiments on five widely-used benchmarks, MS MARCO Passage, TREC Passage 19, TREC Document 19, MS MARCO Document and Natural Questions, where PROD achieves the state-of-the-art within the distillation methods for dense retrieval. The code and models will be released.|知识提取是将知识从一个强有力的教师转化为一个有效的学生模型的有效途径。理想情况下,我们期望老师越好,学生越好。然而,这种期望并不总是能够实现。由于教师与学生之间存在着不可忽视的差距,一个较好的教师模型常常通过精馏的方法导致不好的学生。为了填补这一空白,我们提出了一种逐步精馏方法 PROD,用于密集提取。PROD 由一个教师的逐步升华和一个数据的逐步升华组成,以逐步提高学生的素质。我们对五个广泛使用的基准进行了广泛的实验,即 MS MARCO Passage,TREC Passage 19,TREC Document 19,MS MARCO Document and Natural Questions,PROD 在这些基准上实现了用于密集检索的蒸馏方法的最先进水平。代码和模型将被发布。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=PROD:+Progressive+Distillation+for+Dense+Retrieval)|0| +|[Ad Auction Design with Coupon-Dependent Conversion Rate in the Auto-bidding World](https://doi.org/10.1145/3543507.3583230)|Bonan Ni, Xun Wang, Qi Zhang, Pingzhong Tang, Zhourong Chen, Tianjiu Yin, Liangni Lu, Xiaobing Liu, Kewu Sun, Zhe Ma|Intelligent Science & Technology Academy of CASIC, China and Scientific Research Key Laboratory of Aerospace Defence Intelligent Systems and Technology, China; TuringSense, China and Institute for Interdisciplinary Information Sciences, Tsinghua University, China; ByteDance, China; Institute for Interdisciplinary Information Sciences, Tsinghua University, China|Online advertising has become a dominant source of revenue of the Internet. In classic auction theory, only the auctioneer (i.e., the platform) and buyers (i.e., the advertisers) are involved, while the advertising audiences are ignored. For ecommerce advertising, however, the platform can provide coupons for the advertising audiences and nudge them into purchasing more products at lower prices (e.g., 2 dollars off the regular price). Such promotions can lead to an increase in amount and value of purchases. 
In this paper, we jointly design the coupon value computation, slot allocation, and payment of online advertising in an auto-bidding world. Firstly, we propose the auction mechanism, named CFA-auction (i.e., Coupon-For-the-Audiences-auction), which takes advertising audiences into account in the auction design. We prove the existence of pacing equilibrium, and show that CFA-auction satisfies the IC (incentive compatibility) and IR (individual rationality) constraints. Then, we study the optimality of CFA-auction, and prove it can maintain an approximation of the optimum. Finally, experimental evaluation results on both an offline dataset and an online A/B test demonstrate the effectiveness of CFA-auction.|在线广告已成为互联网收入的主要来源。在传统的拍卖理论中,只有拍卖商(即平台)和买家(即广告商)参与,而广告受众被忽略。然而,对于电子商务广告来说,这个平台可以为广告受众提供优惠券,推动他们以更低的价格购买更多的产品(例如,比正常价格低2美元)。此类促销活动可能导致购买数量和价值的增加。在这篇论文中,我们共同设计了自动竞价世界中的优惠券价值计算、时段分配和在线广告支付。首先,我们提出了一种拍卖机制,即 CFA 拍卖(即为受众提供优惠券的拍卖) ,该机制在拍卖设计中考虑了广告受众。我们证明了节奏均衡的存在,并且证明了 CFA 拍卖满足集成激励相容(IC)、个人理性(IR)约束。然后,我们研究了 CFA 拍卖的最优性,并证明了它可以保持最优的近似。最后,通过对离线数据集和在线 A/B 测试的实验结果验证了 CFA 拍卖的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Ad+Auction+Design+with+Coupon-Dependent+Conversion+Rate+in+the+Auto-bidding+World)|0| +|[A Reference-Dependent Model for Web Search Evaluation: Understanding and Measuring the Experience of Boundedly Rational Users](https://doi.org/10.1145/3543507.3583551)|Nuo Chen, Jiqun Liu, Tetsuya Sakai|The University of Oklahoma, USA; Waseda University, Japan|Previous research demonstrates that users’ actions in search interaction are associated with relative gains and losses to reference points, known as the reference dependence effect. However, this widely confirmed effect is not represented in most user models underpinning existing search evaluation metrics. In this study, we propose a new evaluation metric framework, namely Reference Dependent Metric (ReDeM), for assessing query-level search by incorporating the effect of reference dependence into the modelling of user search behavior. To test the overall effectiveness of the proposed framework, (1) we evaluate the performance, in terms of correlation with user satisfaction, of ReDeMs built upon different reference points against that of the widely-used metrics on three search datasets; (2) we examine the performance of ReDeMs under different task states, like task difficulty and task urgency; and (3) we analyze the statistical reliability of ReDeMs in terms of discriminative power. Experimental results indicate that: (1) ReDeMs integrated with a proper reference point achieve better correlations with user satisfaction than most of the existing metrics, like Discounted Cumulative Gain (DCG) and Rank-Biased Precision (RBP), even though their parameters have already been well-tuned; (2) ReDeMs reach relatively better performance compared to existing metrics when the task triggers a high-level cognitive load; (3) the discriminative power of ReDeMs is far stronger than Expected Reciprocal Rank (ERR), slightly stronger than Precision and similar to DCG, RBP and INST. To our knowledge, this study is the first to explicitly incorporate the reference dependence effect into the user browsing model and offline evaluation metrics. 
Our work illustrates a promising approach to leveraging the insights about user biases from cognitive psychology in better evaluating user search experience and enhancing user models.|以往的研究表明,用户在搜索交互中的行为与参考点的相对收益和相对损失有关,称为参考依赖效应。然而,这种被广泛证实的效果在大多数支持现有搜索评估指标的用户模型中并没有体现出来。在这项研究中,我们提出了一个新的评估度量框架,即参考依赖度量(ReDeM) ,通过将参考依赖的影响纳入用户搜索行为的建模来评估查询级搜索。为了测试提出的框架的整体有效性,(1)我们评估建立在不同参考点上的 ReDeM 与用户满意度的相关性,与三个搜索数据集上广泛使用的指标的相关性; (2)我们检查 ReDeM 在不同任务状态下的表现,如任务难度和任务紧迫性; (3)我们分析 ReDeM 在区分能力方面的统计可靠性。实验结果表明: (1)与合适的参考点相结合的 ReDeMs 与用户满意度的相关性优于大多数现有指标,如 DCG 和 RBP,尽管它们的参数已经得到了很好的调整; (2)当任务触发高水平的认知负荷时,与现有指标相比,ReDeMs 获得了相对更好的性能; (3) ReDeMs 的区分能力远远强于期望互惠秩序(ERR) ,略强于精度,类似于 DCG、 RBP 和 INST。据我们所知,这项研究是第一个明确地将参考依赖效应纳入用户浏览模型和离线评价指标。我们的工作说明了一种有前途的方法,利用认知心理学对用户偏见的洞察力,更好地评估用户搜索体验和增强用户模型。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Reference-Dependent+Model+for+Web+Search+Evaluation:+Understanding+and+Measuring+the+Experience+of+Boundedly+Rational+Users)|0| |[Maximizing Submodular Functions for Recommendation in the Presence of Biases](https://doi.org/10.1145/3543507.3583195)|Anay Mehrotra, Nisheeth K. Vishnoi||Subset selection tasks arise in recommendation systems and search engines and ask to select a subset of items that maximizes the value for the user. The values of subsets often display diminishing returns, and hence, submodular functions have been used to model them. If the inputs defining the submodular function are known, then existing algorithms can be used. In many applications, however, inputs have been observed to have social biases that reduce the utility of the output subset. Hence, interventions to improve the utility are desired. Prior works focus on maximizing linear functions -- a special case of submodular functions -- and show that fairness constraint-based interventions can not only ensure proportional representation but also achieve near-optimal utility in the presence of biases. We study the maximization of a family of submodular functions that capture functions arising in the aforementioned applications. Our first result is that, unlike linear functions, constraint-based interventions cannot guarantee any constant fraction of the optimal utility for this family of submodular functions. Our second result is an algorithm for submodular maximization. The algorithm provably outputs subsets that have near-optimal utility for this family under mild assumptions and that proportionally represent items from each group. In empirical evaluation, with both synthetic and real-world data, we observe that this algorithm improves the utility of the output subset for this family of submodular functions over baselines.|子集选择任务,出现在推荐系统和搜索引擎,并要求选择一个子集的项目,最大限度地为用户的价值。子集的值经常显示报酬递减,因此,子模块函数被用来建模它们。如果定义子模函数的输入是已知的,那么可以使用现有的算法。然而,在许多应用中,输入已被观察到具有社会偏差,从而降低了输出子集的效用。因此,需要采取干预措施来改善效用。先前的研究集中在最大化线性函数(次模函数的一个特例) ,并且表明基于公平约束的干预不仅可以确保比例代表制,而且在存在偏差的情况下还可以实现接近最优的效用。我们研究了一类子模函数的最大化问题,这类子模函数捕获上述应用中出现的函数。我们的第一个结果是,与线性函数不同,基于约束的干预不能保证这个子模函数族的最优效用的任何常数部分。我们的第二个结果是一个次模最大化算法。该算法可证明输出的子集具有接近最优的效用为这个家庭在温和的假设和比例代表项目从每组。在实证评价中,我们观察到,无论是合成的还是真实的数据,这个算法都提高了这个子模函数族的输出子集在基线上的效用。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Maximizing+Submodular+Functions+for+Recommendation+in+the+Presence+of+Biases)|0| |[Facility Relocation Search For Good: When Facility Exposure Meets User Convenience](https://doi.org/10.1145/3543507.3583859)|Hui Luo, Zhifeng Bao, J. 
Shane Culpepper, Mingzhao Li, Yanchang Zhao|RMIT University, Australia; CSIRO, Australia|In this paper, we propose a novel facility relocation problem where facilities (and their services) are portable, which is a combinatorial search problem with many practical applications. Given a set of users, a set of existing facilities, and a set of potential sites, we decide which of the existing facilities to relocate to potential sites, such that two factors are satisfied: (1) facility exposure: facilities after relocation have balanced exposure, namely serving equivalent numbers of users; (2) user convenience: it is convenient for users to access the nearest facility, which provides services with shorter travel distance. This problem is motivated by applications such as dynamically redistributing vaccine resources to align supply with demand for different vaccination centers, and relocating the bike sharing sites daily to improve the transportation efficiency. We first prove that this problem is NP-hard, and then we propose two algorithms: a non-learning best response algorithm () and a reinforcement learning algorithm (). In particular, the best response algorithm finds a Nash equilibrium to balance the facility-related and the user-related goals. To avoid being confined to only one Nash equilibrium, as found in the method, we also propose the reinforcement learning algorithm for long-term benefits, where each facility is an agent and we determine whether a facility needs to be relocated or not. To verify the effectiveness of our methods, we adopt multiple metrics to evaluate not only our objective, but also several other facility exposure equity and user convenience metrics to understand the benefits after facility relocation. Finally, comprehensive experiments using real-world datasets provide insights into the effectiveness of the two algorithms in practice.|在本文中,我们提出了一个新的设施搬迁问题,其中设施(及其服务)是可移植的,这是一个组合搜索问题与许多实际应用。根据一组使用者、一组现有设施及一组可供选择的用地,我们会决定哪些现有设施须迁往可供选择的用地,以满足以下两个因素: (1)设施接触量: 迁移后的设施接触量均衡,即服务相等数目的使用者; (2)使用者方便: 使用者可方便地前往最近的设施,而该设施提供的服务距离较短。这个问题的动机是应用程序,如动态重新分配疫苗资源,以调整供应与不同的疫苗接种中心的需求,并重新安置自行车共享站点,以提高运输效率每天。我们首先证明了这个问题是 NP 难的,然后我们提出了两个算法: 非学习最佳响应算法()和强化学习算法()。特别是,最佳响应算法会找到一个平衡设施相关目标和用户相关目标的纳什均衡点。为了避免只局限于一个纳什均衡点,正如方法中所发现的那样,我们还提出了长期利益的强化学习算法,即每个设施都是一个代理人,我们决定是否需要重新安置一个设施。为了验证我们的方法的有效性,我们不仅采用了多个指标来评估我们的目标,而且还采用了其他一些设施暴露公平性和用户便利性指标来了解设施搬迁后的好处。最后,利用真实世界数据集进行综合实验,验证了这两种算法在实际应用中的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Facility+Relocation+Search+For+Good:+When+Facility+Exposure+Meets+User+Convenience)|0| |[Detecting and Limiting Negative User Experiences in Social Media Platforms](https://doi.org/10.1145/3543507.3583883)|Lluís Garcia Pueyo, Vinodh Kumar Sunkara, Prathyusha Senthil Kumar, Mohit Diwan, Qian Ge, Behrang Javaherian, Vasilis Verroios|Meta Platforms, Inc., USA|Item ranking is important to a social media platform’s success. The order in which posts, videos, messages, comments, ads, used products, notifications are presented to a user greatly affects the time spent on the platform, how often they visit it, how much they interact with each other, and the quantity and quality of the content they post. To this end, item ranking algorithms use models that predict the likelihood of different events, e.g., the user liking, sharing, commenting on a video, clicking/converting on an ad, or opening the platform’s app from a notification. 
Unfortunately, by solely relying on such event-prediction models, social media platforms tend to over-optimize for short-term objectives and ignore the long-term effects. In this paper, we propose an approach that aims at improving the long-term impact of item ranking. The approach primarily relies on an ML model that predicts negative user experiences. The model utilizes all available UI events: the details of an action can reveal how positive or negative the user experience has been; for example, a user writing a lengthy report asking for a given video to be taken down likely had a very negative experience. Furthermore, the model takes into account detected integrity (e.g., hostile speech or graphic violence) and quality (e.g., click or engagement bait) issues with the content. Note that those issues can be perceived very differently by different users. Therefore, developing a personalized model, where a prediction refers to a specific user for a specific piece of content at a specific point in time, is a fundamental design choice in our approach. Besides the personalized ML model, our approach consists of two more pieces: (a) the way the personalized model is integrated with an item ranking algorithm and (b) the metrics, methodology, and success criteria for the long-term impact of detecting and limiting negative user experiences. Our evaluation process uses extensive A/B testing on the Facebook platform: we compare the impact of our approach in treatment groups against production control groups. The A/B test results indicate a 5% to 50% reduction in hides, reports, and submitted feedback. Furthermore, we compare against a baseline that does not include some of the crucial elements of our approach: the comparison shows our approach has a 100x to 30x lower False Positive Ratio than the baseline. Lastly, we present the results from a large-scale survey, where we observe a statistically significant improvement of 3 to 6 percent in users’ sentiment regarding content suffering from nudity, clickbait, false / misleading, witnessing-hate, and violence issues.|项目排名对社交媒体平台的成功至关重要。发布、视频、信息、评论、广告、二手产品、通知的顺序对用户在平台上花费的时间、访问频率、互动程度以及发布内容的数量和质量有很大影响。为此,项目排名算法使用模型来预测不同事件的可能性,例如,用户喜欢,分享,评论视频,点击/转换广告,或从通知中打开平台的应用程序。不幸的是,仅仅依靠这种事件预测模型,社交媒体平台往往会过度优化短期目标而忽视长期影响。在本文中,我们提出了一种方法,旨在提高项目排名的长期影响。这种方法主要依靠机器学习模型来预测负面的用户体验。该模型利用了所有可用的 UI 事件: 一个动作的细节可以揭示用户体验的积极或消极程度; 例如,一个用户写了一份长篇报告,要求删除一个给定的视频,很可能有一个非常消极的体验。此外,该模型还考虑了检测到的完整性(例如,敌意言论或暴力画面)和质量(例如,点击或参与诱饵)问题。请注意,这些问题可以从不同的用户看到非常不同。因此,开发个性化的模型,其中预测指的是在特定时间点的特定内容的特定用户,是我们方法中的一个基本设计选择。除了个性化机器学习模型,我们的方法还包括两个部分: (a)个性化模型与项目排序算法的整合方式; (b)检测和限制负面用户体验的长期影响的指标、方法和成功标准。我们的评估过程在 Facebook 平台上使用了广泛的 A/B 测试: 我们比较我们在治疗组和生产控制组中的方法的影响。AB 测试结果表明,皮革、报告和提交的反馈减少了5% 到50% 。此外,我们比较了不包括我们方法的一些关键要素的基线: 比较显示我们的方法比基线低100到30倍的错误阳性率。最后,我们展示了一项大规模调查的结果,我们观察到用户对于遭受裸露、点击诱饵、虚假/误导、目击仇恨和暴力问题的内容的情绪有3% 到6% 的统计显著改善。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Detecting+and+Limiting+Negative+User+Experiences+in+Social+Media+Platforms)|0| |[On Detecting Policy-Related Political Ads: An Exploratory Analysis of Meta Ads in 2022 French Election](https://doi.org/10.1145/3543873.3583875)|Vera Sosnovik, Romaissa Kessi, Maximin Coavoux, Oana Goga|CNRS, France and LIG, Université Grenoble Alpes, Grenoble INP, France; CNRS, France and LIX, Inria, Ecole Polytechnique, Institut Polytechnique de Paris, France|Online political advertising has become the cornerstone of political campaigns. The budget spent solely on political advertising in the U.S. 
has increased by more than 100% from \$700 million during the 2017-2018 U.S. election cycle to \$1.6 billion during the 2020 U.S. presidential elections. Naturally, the capacity offered by online platforms to micro-target ads with political content has been worrying lawmakers, journalists, and online platforms, especially after the 2016 U.S. presidential election, where Cambridge Analytica targeted voters with political ads congruent with their personality. To curb such risks, both online platforms and regulators (through the DSA act proposed by the European Commission) have agreed that researchers, journalists, and civil society need to be able to scrutinize the political ads running on large online platforms. Consequently, online platforms such as Meta and Google have implemented Ad Libraries that contain information about all political ads running on their platforms. This is the first step on a long path. Due to the volume of available data, it is impossible to go through these ads manually, and we now need automated methods and tools to assist in the scrutiny of political ads. In this paper, we focus on political ads that are related to policy. Understanding which policies politicians or organizations promote and to whom is essential in determining dishonest representations. This paper proposes automated methods based on pre-trained models to classify ads into 14 main policy groups identified by the Comparative Agenda Project (CAP). We discuss several inherent challenges that arise. Finally, we analyze policy-related ads featured on Meta platforms during the 2022 French presidential elections period.|在线政治广告已经成为政治运动的基石。美国政治广告预算从2017-2018年美国大选期间的7亿美元增加到2020年美国总统大选期间的16亿美元,增幅超过100% 。自然,在线平台提供的针对政治内容的微观广告的能力一直令立法者、记者和在线平台感到担忧,尤其是在2016年美国总统大选之后,剑桥分析公司(Cambridge Analytica)针对选民的政治广告符合他们的个性。为了遏制这种风险,在线平台和监管机构(通过欧盟委员会提出的 DSA 法案)已经同意,研究人员、记者和公民社会需要能够审查在大型在线平台上运行的政治广告。因此,像 Meta 和 Google 这样的在线平台已经实现了广告库,其中包含了在其平台上运行的所有政治广告的信息。这是漫长道路上的第一步。由于可获得的数据量很大,手动浏览这些广告是不可能的,我们现在需要自动化的方法和工具来协助审查政治广告。本文主要研究与政策相关的政治广告。了解哪些政治家或组织提倡哪些政策,以及向谁提出这些政策,对于确定不诚实陈述至关重要。本文提出了一种基于预训练模型的广告自动分类方法,用于比较议程项目(CAP)确定的14个主要政策组的广告分类。我们将讨论出现的几个内在挑战。最后,我们分析了2022年法国总统大选期间 Meta 平台上的政策相关广告。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=On+Detecting+Policy-Related+Political+Ads:+An+Exploratory+Analysis+of+Meta+Ads+in+2022+French+Election)|0| |[A ML-based Approach for HTML-based Style Recommendation](https://doi.org/10.1145/3543873.3587300)|Ryan Aponte, Ryan A. Rossi, Shunan Guo, Jane Hoffswell, Nedim Lipka, Chang Xiao, Gromit YeukYin Chan, Eunyee Koh, Nesreen K. Ahmed|CMU, USA; Intel Labs, USA; Adobe, USA; Adobe Research, USA|Given a large corpus of HTML-based emails (or websites, posters, documents) collected from the web, how can we train a model capable of learning from such rich heterogeneous data for HTML-based style recommendation tasks such as recommending useful design styles or suggesting alternative HTML designs? To address this new learning task, we first decompose each HTML document in the corpus into a sequence of smaller HTML fragments where each fragment may consist of a set of HTML entities such as buttons, images, textual content (titles, paragraphs) and stylistic entities such as background-style, font-style, button-style, among others. From these HTML fragments, we then derive a single large heterogeneous hypergraph that captures the higher-order dependencies between HTML fragments and entities in such fragments, both within the same HTML document as well as across the HTML documents in the corpus. 
We then formulate this new HTML style recommendation task as a hypergraph representation learning problem and propose an approach to solve it. Our approach is able to learn effective low-dimensional representations of the higher-order fragments that consist of sets of heterogeneous entities as well as low-dimensional representations of the individual entities themselves. We demonstrate the effectiveness of the approach across several design style recommendation tasks. To the best of our knowledge, this work is the first to develop an ML-based model for the task of HTML-based email style recommendation.|鉴于从网上收集的大量基于 HTML 的电子邮件(或网站、海报、文档) ,我们如何才能培养一个模型,使其能够从这些丰富的异构数据中学习基于 HTML 的风格推荐任务,如推荐有用的设计风格或建议替代 HTML 设计?为了解决这个新的学习任务,我们首先将语料库中的每个 HTML 文档分解为一系列较小的 HTML 片段,其中每个片段可能包含一组 HTML 实体,如按钮、图像、文本内容(标题、段落)和风格实体,如背景风格、字体风格、按钮风格等。然后,从这些 HTML 片段中,我们得到一个单一的大型异构超图,它捕获这些片段中 HTML 片段和实体之间的高阶依赖关系,这些依赖关系既存在于同一 HTML 文档中,也存在于语料库中的 HTML 文档之间。然后将这个新的 HTML 风格推荐任务表示为一个超图表示学习问题,并提出了一种解决方法。我们的方法能够学习高阶片段的有效低维表示,这些片段由异构实体集合以及单个实体本身的低维表示组成。我们在几个设计风格的推荐任务中演示了该方法的有效性。据我们所知,这项工作是第一个开发基于机器学习的任务的基于 HTML 的电子邮件样式推荐模型。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+ML-based+Approach+for+HTML-based+Style+Recommendation)|0| -|[Graph-Level Embedding for Time-Evolving Graphs](https://doi.org/10.1145/3543873.3587299)|Lili Wang, Chenghan Huang, Xinyuan Cao, Weicheng Ma, Soroush Vosoughi|Georgia Institute of Technology, USA; Dartmouth College, USA; Jefferies Financial Group LLC, USA|Graph representation learning (also known as network embedding) has been extensively researched with varying levels of granularity, ranging from nodes to graphs. While most prior work in this area focuses on node-level representation, limited research has been conducted on graph-level embedding, particularly for dynamic or temporal networks. However, learning low-dimensional graph-level representations for dynamic networks is critical for various downstream graph retrieval tasks such as temporal graph similarity ranking, temporal graph isomorphism, and anomaly detection. In this paper, we present a novel method for temporal graph-level embedding that addresses this gap. Our approach involves constructing a multilayer graph and using a modified random walk with temporal backtracking to generate temporal contexts for the graph’s nodes. We then train a “document-level’’ language model on these contexts to generate graph-level embeddings. We evaluate our proposed model on five publicly available datasets for the task of temporal graph similarity ranking, and our model outperforms baseline methods. Our experimental results demonstrate the effectiveness of our method in generating graph-level embeddings for dynamic networks.|图表示学习(也称为网络嵌入)已经被广泛研究与不同的粒度级别,从节点到图。虽然该领域的大多数工作集中在节点级表示,但是对图级嵌入的研究很有限,特别是对于动态或时态网络。然而,学习动态网络的低维图级表示对于各种下游图检索任务(如时间图相似性排序、时间图同构和异常检测)至关重要。在本文中,我们提出了一种新的时间图级嵌入方法来解决这一问题。我们的方法包括构造一个多层图,并使用一个修改过的随机游走和时间回溯来为图的节点生成时间上下文。然后,我们在这些上下文上训练一个“文档级”语言模型来生成图级嵌入。我们评估了我们提出的模型在五个公开可用的数据集的时间图相似性排序的任务,我们的模型优于基线方法。实验结果表明了该方法在动态网络图级嵌入生成中的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Graph-Level+Embedding+for+Time-Evolving+Graphs)|0| -|[SpotLight: Visual Insight Recommendation](https://doi.org/10.1145/3543873.3587302)|Camille Harris, Ryan A. 
Rossi, Sana Malik, Jane Hoffswell, Fan Du, Tak Yeon Lee, Eunyee Koh, Handong Zhao|KAIST, Republic of Korea; Adobe Research, USA; Georgia Tech, USA|Visualization recommendation systems make understanding data more accessible to users of all skill levels by automatically generating visualizations for users to explore. However, most existing visualization recommendation systems focus on ranking all possible visualizations based on the attributes or encodings, which makes it difficult to find the most relevant insights. We therefore introduce a novel class of insight-based visualization recommendation systems that automatically rank and recommend groups of related insights as well as the most important insights within each group. Our approach combines results from different learning-based methods to discover insights automatically and generalizes to a variety of attribute types (e.g., categorical, numerical, and temporal), including non-trivial combinations of these attribute types. To demonstrate the utility of this approach, we implemented a insight-centric visualization recommendation system, SpotLight, and conducted a user study with twelve participants, which showed that users are able to quickly find and understand relevant insights in unfamiliar data.|可视化推荐系统通过自动生成供用户探索的可视化,使所有技能水平的用户更容易理解数据。然而,现有的可视化推荐系统大多侧重于基于属性或编码对所有可能的可视化进行排序,这使得很难找到最相关的见解。因此,我们引入了一类新颖的基于洞察力的可视化推荐系统,该系统可以自动对相关洞察力以及每个组内最重要的洞察力进行排名和推荐。我们的方法结合了来自不同的基于学习的方法的结果,自动发现见解,并推广到各种属性类型(例如,分类,数字和时间) ,包括这些属性类型的非平凡组合。为了证明这种方法的实用性,我们实施了一个以洞察力为中心的可视化推荐系统 SpotLight,并对12名参与者进行了用户研究,结果显示用户能够快速找到并理解不熟悉数据中的相关见解。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=SpotLight:+Visual+Insight+Recommendation)|0| +|[Graph-Level Embedding for Time-Evolving Graphs](https://doi.org/10.1145/3543873.3587299)|Lili Wang, Chenghan Huang, Xinyuan Cao, Weicheng Ma, Soroush Vosoughi|Jefferies Financial Group LLC, USA; Dartmouth College, USA; Georgia Institute of Technology, USA|Graph representation learning (also known as network embedding) has been extensively researched with varying levels of granularity, ranging from nodes to graphs. While most prior work in this area focuses on node-level representation, limited research has been conducted on graph-level embedding, particularly for dynamic or temporal networks. However, learning low-dimensional graph-level representations for dynamic networks is critical for various downstream graph retrieval tasks such as temporal graph similarity ranking, temporal graph isomorphism, and anomaly detection. In this paper, we present a novel method for temporal graph-level embedding that addresses this gap. Our approach involves constructing a multilayer graph and using a modified random walk with temporal backtracking to generate temporal contexts for the graph’s nodes. We then train a “document-level’’ language model on these contexts to generate graph-level embeddings. We evaluate our proposed model on five publicly available datasets for the task of temporal graph similarity ranking, and our model outperforms baseline methods. 
Our experimental results demonstrate the effectiveness of our method in generating graph-level embeddings for dynamic networks.|图表示学习(也称为网络嵌入)已经被广泛研究与不同的粒度级别,从节点到图。虽然该领域的大多数工作集中在节点级表示,但是对图级嵌入的研究很有限,特别是对于动态或时态网络。然而,学习动态网络的低维图级表示对于各种下游图检索任务(如时间图相似性排序、时间图同构和异常检测)至关重要。在本文中,我们提出了一种新的时间图级嵌入方法来解决这一问题。我们的方法包括构造一个多层图,并使用一个修改过的随机游走和时间回溯来为图的节点生成时间上下文。然后,我们在这些上下文上训练一个“文档级”语言模型来生成图级嵌入。我们评估了我们提出的模型在五个公开可用的数据集的时间图相似性排序的任务,我们的模型优于基线方法。实验结果表明了该方法在动态网络图级嵌入生成中的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Graph-Level+Embedding+for+Time-Evolving+Graphs)|0| +|[SpotLight: Visual Insight Recommendation](https://doi.org/10.1145/3543873.3587302)|Camille Harris, Ryan A. Rossi, Sana Malik, Jane Hoffswell, Fan Du, Tak Yeon Lee, Eunyee Koh, Handong Zhao|KAIST, Republic of Korea; Georgia Tech, USA; Adobe Research, USA|Visualization recommendation systems make understanding data more accessible to users of all skill levels by automatically generating visualizations for users to explore. However, most existing visualization recommendation systems focus on ranking all possible visualizations based on the attributes or encodings, which makes it difficult to find the most relevant insights. We therefore introduce a novel class of insight-based visualization recommendation systems that automatically rank and recommend groups of related insights as well as the most important insights within each group. Our approach combines results from different learning-based methods to discover insights automatically and generalizes to a variety of attribute types (e.g., categorical, numerical, and temporal), including non-trivial combinations of these attribute types. To demonstrate the utility of this approach, we implemented an insight-centric visualization recommendation system, SpotLight, and conducted a user study with twelve participants, which showed that users are able to quickly find and understand relevant insights in unfamiliar data.|可视化推荐系统通过自动生成供用户探索的可视化,使所有技能水平的用户更容易理解数据。然而,现有的可视化推荐系统大多侧重于基于属性或编码对所有可能的可视化进行排序,这使得很难找到最相关的见解。因此,我们引入了一类新颖的基于洞察力的可视化推荐系统,该系统可以自动对相关洞察力以及每个组内最重要的洞察力进行排名和推荐。我们的方法结合了来自不同的基于学习的方法的结果,自动发现见解,并推广到各种属性类型(例如,分类,数字和时间) ,包括这些属性类型的非平凡组合。为了证明这种方法的实用性,我们实施了一个以洞察力为中心的可视化推荐系统 SpotLight,并对12名参与者进行了用户研究,结果显示用户能够快速找到并理解不熟悉数据中的相关见解。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=SpotLight:+Visual+Insight+Recommendation)|0| |[DataExpo: A One-Stop Dataset Service for Open Science Research](https://doi.org/10.1145/3543873.3587305)|Bin Lu, Lyuwen Wu, Lina Yang, Chenxing Sun, Wei Liu, Xiaoying Gan, Shiyu Liang, Luoyi Fu, Xinbing Wang, Chenghu Zhou|Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, China; Shanghai Jiao Tong University, China|The large volumes of data on the Internet provide new opportunities for scientific discovery, especially promoting data-driven open science research. However, due to the lack of accurate semantic markups, finding relevant data is still difficult. To address this problem, we develop a one-stop dataset service called DataExpo and propose a deep learning method for automatic metadata ingestion. In this demo paper, we describe the system architecture, and how DataExpo facilitates dataset discovery, search and recommendation. Up till now, DataExpo has indexed over 960,000 datasets from more than 27,000 repositories in the context of the Deep-time Digital Earth Program. 
Demo visitors can explore our service via https://dataexpo.acemap.info.|互联网上的大量数据为科学发现提供了新的机会,特别是促进了数据驱动的开放科学研究。然而,由于缺乏准确的语义标记,找到相关数据仍然很困难。为了解决这个问题,我们开发了一个名为 DataExpo 的一站式数据集服务,并提出了一种自动元数据摄取的深度学习方法。在本演示文章中,我们描述了系统的体系结构,以及 DataExpo 如何促进数据集的发现、搜索和推荐。到目前为止,数据博览会已经从超过27,000个数据库中索引了超过960,000个数据集。示范观众可透过 https://dataexpo.acemap.info 探索我们的服务。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=DataExpo:+A+One-Stop+Dataset+Service+for+Open+Science+Research)|0| -|[Mirror: A Natural Language Interface for Data Querying, Summarization, and Visualization](https://doi.org/10.1145/3543873.3587309)|Canwen Xu, Julian J. McAuley, Penghan Wang|Cisco, USA; UC San Diego, USA|We present Mirror, an open-source platform for data exploration and analysis powered by large language models. Mirror offers an intuitive natural language interface for querying databases, and automatically generates executable SQL commands to retrieve relevant data and summarize it in natural language. In addition, users can preview and manually edit the generated SQL commands to ensure the accuracy of their queries. Mirror also generates visualizations to facilitate understanding of the data. Designed with flexibility and human input in mind, Mirror is suitable for both experienced data analysts and non-technical professionals looking to gain insights from their data.|我们介绍了一个基于大型语言模型的开源数据探索和分析平台—— Mirror。Mirror 为查询数据库提供了一个直观的自然语言界面,并自动生成可执行的 SQL 命令来检索相关数据并用自然语言对其进行汇总。此外,用户还可以预览和手动编辑生成的 SQL 命令,以确保查询的准确性。Mirror 还生成可视化,以便于理解数据。设计具有灵活性和人工输入的头脑,镜子是适合于有经验的数据分析师和非技术专业人士寻求获得洞察力从他们的数据。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Mirror:+A+Natural+Language+Interface+for+Data+Querying,+Summarization,+and+Visualization)|0| +|[Mirror: A Natural Language Interface for Data Querying, Summarization, and Visualization](https://doi.org/10.1145/3543873.3587309)|Canwen Xu, Julian J. McAuley, Penghan Wang|UC San Diego, USA; Cisco, USA|We present Mirror, an open-source platform for data exploration and analysis powered by large language models. Mirror offers an intuitive natural language interface for querying databases, and automatically generates executable SQL commands to retrieve relevant data and summarize it in natural language. In addition, users can preview and manually edit the generated SQL commands to ensure the accuracy of their queries. Mirror also generates visualizations to facilitate understanding of the data. Designed with flexibility and human input in mind, Mirror is suitable for both experienced data analysts and non-technical professionals looking to gain insights from their data.|我们介绍了一个基于大型语言模型的开源数据探索和分析平台—— Mirror。Mirror 为查询数据库提供了一个直观的自然语言界面,并自动生成可执行的 SQL 命令来检索相关数据并用自然语言对其进行汇总。此外,用户还可以预览和手动编辑生成的 SQL 命令,以确保查询的准确性。Mirror 还生成可视化,以便于理解数据。设计具有灵活性和人工输入的头脑,镜子是适合于有经验的数据分析师和非技术专业人士寻求获得洞察力从他们的数据。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Mirror:+A+Natural+Language+Interface+for+Data+Querying,+Summarization,+and+Visualization)|0| |[Is the Impression Log Beneficial to Effective Model Training in News Recommender Systems? 
No, It's NOT](https://doi.org/10.1145/3543873.3587312)|Jeewon Ahn, HongKyun Bae, SangWook Kim||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Is+the+Impression+Log+Beneficial+to+Effective+Model+Training+in+News+Recommender+Systems?+No,+It's+NOT)|0| -|[Incorporating Embedding to Topic Modeling for More Effective Short Text Analysis](https://doi.org/10.1145/3543873.3587316)|Junaid Rashid, Jungeun Kim, Usman Naseem|School of Computer Science, The University of Sydney, Sydney, Australia, Australia; Department of Software, Kongju National University, Cheonan, Republic of Korea, Republic of Korea; Department of Data Science, Sejong University, Seoul, Republic of Korea, Republic of Korea|With the growing abundance of short text content on websites, analyzing and comprehending these short texts has become a crucial task. Topic modeling is a widely used technique for analyzing short text documents and uncovering the underlying topics. However, traditional topic models face difficulties in accurately extracting topics from short texts due to limited content and their sparse nature. To address these issues, we propose an Embedding-based topic modeling (EmTM) approach that incorporates word embedding and hierarchical clustering to identify significant topics. Experimental results demonstrate the effectiveness of EmTM on two datasets comprising web short texts, Snippet and News. The results indicate a superiority of EmTM over baseline topic models by its exceptional performance in both classification accuracy and topic coherence metrics.|随着网站上短文内容的日益丰富,分析和理解这些短文已经成为一项重要的任务。主题建模是一种广泛使用的分析短文本文档和揭示潜在主题的技术。然而,传统的话题模型由于内容有限和稀疏的特点,很难准确地从短文中提取话题。为了解决这些问题,我们提出了一种基于嵌入的主题建模(EmTM)方法,该方法结合了单词嵌入和层次聚类来识别重要的主题。实验结果表明,该方法能够有效地处理包括网络短文本、片段和新闻在内的两个数据集。结果表明,与基线主题模型相比,EmTM 在分类精度和主题一致性度量方面具有优越性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Incorporating+Embedding+to+Topic+Modeling+for+More+Effective+Short+Text+Analysis)|0| +|[Incorporating Embedding to Topic Modeling for More Effective Short Text Analysis](https://doi.org/10.1145/3543873.3587316)|Junaid Rashid, Jungeun Kim, Usman Naseem|Department of Data Science, Sejong University, Seoul, Republic of Korea, Republic of Korea; School of Computer Science, The University of Sydney, Sydney, Australia, Australia; Department of Software, Kongju National University, Cheonan, Republic of Korea, Republic of Korea|With the growing abundance of short text content on websites, analyzing and comprehending these short texts has become a crucial task. Topic modeling is a widely used technique for analyzing short text documents and uncovering the underlying topics. However, traditional topic models face difficulties in accurately extracting topics from short texts due to limited content and their sparse nature. To address these issues, we propose an Embedding-based topic modeling (EmTM) approach that incorporates word embedding and hierarchical clustering to identify significant topics. Experimental results demonstrate the effectiveness of EmTM on two datasets comprising web short texts, Snippet and News. 
The results indicate the superiority of EmTM over baseline topic models through its exceptional performance in both classification accuracy and topic coherence metrics.|随着网站上短文内容的日益丰富,分析和理解这些短文已经成为一项重要的任务。主题建模是一种广泛使用的分析短文本文档和揭示潜在主题的技术。然而,传统的话题模型由于内容有限和稀疏的特点,很难准确地从短文中提取话题。为了解决这些问题,我们提出了一种基于嵌入的主题建模(EmTM)方法,该方法结合了单词嵌入和层次聚类来识别重要的主题。实验结果表明,该方法能够有效地处理包括网络短文本、片段和新闻在内的两个数据集。结果表明,与基线主题模型相比,EmTM 在分类精度和主题一致性度量方面具有优越性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Incorporating+Embedding+to+Topic+Modeling+for+More+Effective+Short+Text+Analysis)|0| |[EnhancE: Enhanced Entity and Relation Embedding for Knowledge Hypergraph Link Prediction](https://doi.org/10.1145/3543873.3587326)|Chenxu Wang, Zhao Li, Xin Wang, Zirui Chen|Tianjin University, China|Knowledge Hypergraphs, as the generalization of knowledge graphs, have attracted increasingly widespread attention due to their friendly compatibility with real-world facts. However, link prediction in knowledge hypergraphs is still an underexplored field despite the ubiquity of n-ary facts in the real world. Several recent representative embedding-based knowledge hypergraph link prediction methods have proven to be effective in a series of benchmarks; however, they only consider the position (or role) information, ignoring the neighborhood structure among entities and rich semantic information within each fact. To this end, we propose a model named EnhancE for effective link prediction in knowledge hypergraphs. On the one hand, a more expressive entity representation is obtained with both position and neighborhood information added to the initial embedding. On the other hand, rich semantic information of the involved entities within each tuple is incorporated into relation embedding for enhanced representation. Extensive experimental results over real datasets of both knowledge hypergraph and knowledge graph demonstrate the excellent performance of EnhancE compared with a variety of state-of-the-art baselines.|知识超图作为知识图的一种推广,由于其与现实世界事实的友好兼容性而引起了人们越来越广泛的关注。然而,尽管在现实世界中 n 元事实的普遍存在,知识超图中的链接预测仍然是一个未被充分探索的领域。最近几种有代表性的嵌入式知识超图链接预测方法已经被证明在一系列的基准测试中是有效的,但是它们只考虑位置(或角色)信息,忽略了实体间的邻域结构和每个事实中丰富的语义信息。为此,我们提出了一种基于知识超图的有效链接预测模型——增强 E。一方面,在初始嵌入的基础上加入位置信息和邻域信息,得到更具表现力的实体表示。另一方面,每个元组中所涉及的实体的丰富语义信息被合并到关系嵌入中以增强表示。在知识超图和知识图的实际数据集上进行的大量实验结果表明,与各种最先进的基线相比,增强 E 具有优异的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=EnhancE:+Enhanced+Entity+and+Relation+Embedding+for+Knowledge+Hypergraph+Link+Prediction)|0| |[An Analogical Reasoning Method Based on Multi-task Learning with Relational Clustering](https://doi.org/10.1145/3543873.3587333)|Shuyi Li, Shaojuan Wu, Xiaowang Zhang, Zhiyong Feng|College of Intelligence and Computing, Tianjin University, China|The analogical QA task is a challenging natural language processing problem. When two word pairs are similar in their relationships, we refer to their relations as analogous. Although analogy methods based on word embeddings are well developed, analogical reasoning goes far beyond this scope. At present, the methods based on pre-trained language models have explored only the tip of the iceberg. In this paper, we propose a multi-task learning method for the analogical QA task. First, we obtain word-pair representations by leveraging the output embeddings of the [MASK] token in the pre-trained language model. The representations are prepared for two tasks. The first task aims to train an analogical classifier by supervised learning. 
The second task is an auxiliary task based on relation clustering to generate relation pseudo-labels for word pairs and train a relation classifier. Our method guides the model to analyze the relation similarity in analogical reasoning without relation labels. The experiments show that our method achieves excellent performance on four analogical reasoning datasets without the help of external corpora and knowledge. On the most difficult dataset, E-KAR, it improves by at least 4%.|类比 QA 任务是一个具有挑战性的自然语言处理问题。当两个词对在关系上相似时,我们把它们的关系称为相似。基于嵌入词的类比推理方法虽然已经得到了很好的发展,但是类比推理远远超出了这个范围。目前,基于预训练语言模型的方法仅仅探索了冰山一角。本文提出了一种类比 QA 任务的多任务学习方法。首先,我们利用预训练语言模型中[ MASK ]令牌的输出嵌入获得词对表示。这些表示准备用于两个任务。第一个任务是通过监督式学习训练一个类比分类器。第二个任务是一个基于关系聚类的辅助任务,用于生成词对的关系伪标签和训练关系分类器。该方法引导模型分析无关联标签的类比推理中的关联相似性。实验结果表明,该方法在不借助外部语料库和知识的情况下,对四个类比推理数据集取得了良好的性能。在最困难的数据集 E-KAR 中,它至少增加了4% 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=An+Analogical+Reasoning+Method+Based+on+Multi-task+Learning+with+Relational+Clustering)|0| -|[Templet: A Collaborative System for Knowledge Graph Question Answering over Wikidata](https://doi.org/10.1145/3543873.3587335)|Francisca Suárez, Aidan Hogan|DCC, Universidad de Chile, Chile; DCC, Universidad de Chile, Chile and Instituto Milenio Fundamentos de los Datos (IMFD), Chile|We present Templet: an online question answering (QA) system for Wikidata. Templet is based on the collaboratively-edited repository QAWiki, which collects questions in multiple natural languages along with their corresponding structured queries. Templet generates templates from question–query pairs on QAWiki by replacing key entities with identifiers. Using autocompletion, the user can type a question in natural language, select a template, and again using autocompletion, select the entities they wish to insert into the template’s placeholders, generating a concrete question, query and results. The main objectives of Templet are: (i) to enable users to answer potentially complex questions over Wikidata using natural language templates and autocompletion; (ii) to encourage users to collaboratively create new templates via QAWiki, which in turn can benefit not only Templet, but other QA systems.|我们提出的模板: 一个在线问题回答(QA)系统的 Wikidata。Templet 基于协作编辑的存储库 QAWiki,该存储库用多种自然语言收集问题以及相应的结构化查询。Templet 通过使用标识符替换关键实体,从 QAWiki 上的问题-查询对生成模板。使用自动补全,用户可以用自然语言键入一个问题,选择一个模板,然后再次使用自动补全,选择他们希望插入到模板占位符中的实体,生成一个具体的问题、查询和结果。Templet 的主要目标是: (i)使用户能够通过使用自然语言模板和自动完成来回答 Wikidata 上潜在的复杂问题; (ii)鼓励用户通过 QAWiki 合作创建新的模板,这反过来不仅可以使 Templet 受益,还可以使其他 QA 系统受益。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Templet:+A+Collaborative+System+for+Knowledge+Graph+Question+Answering+over+Wikidata)|0| -|[OptiRef: Query Optimization for Knowledge Bases](https://doi.org/10.1145/3543873.3587342)|Wafaa El Husseini, Cheikh Brahim El Vaigh, François Goasdoué, Hélène Jaudoin|Univ. Rennes, France; Univ. Bourgogne, France|Ontology-mediated query answering (OMQA) consists in asking database queries on a knowledge base (KB); a KB is a set of facts, the KB’s database, described by domain knowledge, the KB’s ontology. FOL-rewritability is the main OMQA technique: it reformulates a query w.r.t. the KB’s ontology so that the evaluation of the reformulated query on the KB’s database computes the correct answers. However, because this technique embeds the domain knowledge relevant to the query into the reformulated query, a reformulated query may be complex and its optimization is the crux of efficiency. 
We showcase OptiRef that implements a novel, general optimization framework for efficient query answering on datalog ±, description logic, existential rules, OWL and RDF/S KBs. OptiRef optimizes reformulated queries by rapidly computing, based on a KB’s database summary, simpler (contained) queries with the same answers. We demonstrate OptiRef’s effectiveness on well-established benchmarks: performance is significantly improved in general, up to several orders of magnitude in the best cases!|本体介导的查询回答(OMQA)包括在知识库(KB)上询问数据库查询; 知识库是一组事实,知识库的数据库,由领域知识描述,知识库的本体。FOL 可重写性是主要的 OMQA 技术: 它重新规范查询 w.r.t. 知识库的本体,以便在知识库的数据库上计算重新规范查询的正确答案。然而,由于这种技术将与查询相关的领域知识嵌入到重构查询中,因此重构查询可能比较复杂,其优化是提高查询效率的关键。我们展示了 OptiRef,它实现了一个新颖的、通用的优化框架,用于在数据目录 ± 、描述逻辑、存在规则、 OWL 和 RDF/S 知识库上进行有效的查询应答。OptiRef 通过快速计算优化重新配置的查询,基于知识库的数据库摘要,使用具有相同答案的更简单(包含)的查询。我们展示了 OptiRef 在完善的基准测试上的有效性: 性能总体上得到了显著的改善,在最好的情况下达到了几个数量级!|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=OptiRef:+Query+Optimization+for+Knowledge+Bases)|0| -|[Learning Topical Structured Interfaces from Medical Research Literature](https://doi.org/10.1145/3543873.3587353)|Maitry Chauhan, Anna Pyayt, Michael N. Gubanov|University of South Florida, USA; Florida State University, USA|Accessing large-scale structured datasets such as WDC or CORD-191 is very challenging. Even if one topic (e.g. COVID-19 vaccine efficacy) is of interest, all topical tables in different sources/papers have hundreds of different schemas, depending on the authors, which significantly complicates both finding and querying them. Here we demonstrate a scalable Meta-profiler system, capable of constructing a structured standardized interface to a topic of interest in large-scale (semi-)structured datasets. This interface, that we call Meta-profile represents a multi-dimensional meta-data summary for a selected topic of interest, accumulating all differently structured representations of the topical tables in the dataset. Such Meta-profiles can be used as a rich visualization as well as a robust structural query interface simplifying access to large-scale (semi-)structured data for different user segments, such as data scientists and end users.|访问大规模的结构化数据集,如 WDC 或 CORD-191是非常具有挑战性的。即使一个主题(例如2019冠状病毒疾病疫苗效力)是有趣的,不同来源/论文中的所有主题表格都有数百种不同的模式,这取决于作者,这使得查找和查询这些模式变得非常复杂。在这里,我们演示了一个可伸缩的元分析器系统,它能够为大规模(半)结构化数据集中感兴趣的主题构建一个结构化的标准化接口。我们称之为 Meta-profile 的这个接口表示一个选定主题的多维元数据摘要,它积累了数据集中主题表的所有不同结构的表示。这样的元概要文件可以用作丰富的可视化以及健壮的结构化查询界面,简化不同用户段(如数据科学家和最终用户)对大规模(半)结构化数据的访问。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+Topical+Structured+Interfaces+from+Medical+Research+Literature)|0| -|[DGBCT: A Scalable Distributed Gradient Boosting Causal Tree at Alipay](https://doi.org/10.1145/3543873.3584645)|Jun Zhou, Caizhi Tang, Qing Cui, Yi Ding, Longfei Li, Fei Wu|Ant Group, China; College of Computer Science and Technology, Zhejiang University, China; College of Computer Science and Technology, Zhejiang University, China and Ant Group, China|Causal effect estimation has been increasingly emphasized in the past few years. To handle this problem, tree-based causal methods have been widely used due to their robustness and explainability. However, most of the existing methods are limited to running on a single machine, making it difficult to scale up to hundreds of millions of data in typical industrial scenarios. This paper proposes DGBCT, a Distributed Gradient Boosting Causal Tree to tackle such problem, and the contribution of this paper is three folds. 
First, we extend the original GBCT method to a multi-treatment setting and take the monotonic constraints into consideration, so that more typical industrial necessities can be resolved with our framework. Moreover, we implement DGBCT based on the ‘Controller-Coordinator-Worker’ framework, in which dual failover mechanism is achieved, and commendable flexibility is ensured. In addition, empirical results show that DGBCT significantly outperforms the state-of-the-art causal trees, and has a near-linear speedup as the number of workers grows. The system is currently deployed in Alipay1 to support the daily business tasks that involve hundreds of millions of users.|因果效应估计在过去的几年中越来越受到重视。为了解决这个问题,基于树的因果关系方法由于其鲁棒性和可解释性而得到了广泛的应用。然而,现有的大多数方法仅限于在单台机器上运行,因此在典型的工业场景中很难扩展到数亿个数据。本文提出了分布式梯度提升因果树 DGBCT 来解决这个问题,本文的贡献有三个方面。首先,我们将原来的 GBCT 方法扩展到一个多处理环境,并且考虑了单调约束,使得我们的框架能够解决更多的典型工业需求。此外,我们在“控制器-协调器-工作者”框架的基础上实现了 DGBCT,实现了双重故障转移机制,并保证了值得称赞的灵活性。此外,实证结果显示,DGBCT 的表现明显优于最先进的因果树,并且随着工作人员数量的增加有近线性的加速效应。该系统目前部署在支付宝1中,以支持涉及数亿用户的日常业务任务。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=DGBCT:+A+Scalable+Distributed+Gradient+Boosting+Causal+Tree+at+Alipay)|0| -|[What Image do You Need? A Two-stage Framework for Image Selection in E-commerce](https://doi.org/10.1145/3543873.3584646)|Sheng You, Chao Wang, Baohua Wu, Jingping Liu, Quan Lu, Guanzhou Han, Yanghua Xiao|Fudan University, China; East China University of Science and Technology, China; Alibaba Group, China; Shanghai University, China|In e-commerce, images are widely used to display more intuitive information about items. Image selection significantly affects the user’s click-through rate (CTR). Most existing work considers the CTR as the target to find an appropriate image. However, these methods are challenging to deploy online efficiently. Also, the selected images may not relate to the item but are profitable to CTR, resulting in the undesirable phenomenon of enticing users to click on the item. To address these issues, we propose a novel two-stage pipeline method with content-based recall model and CTR-based ranking model. The first is realized as a joint method based on the title-image matching model and multi-modal knowledge graph embedding learning model. The second is a CTR-based visually aware scoring model, incorporating the entity textual information and entity images. Experimental results show the effectiveness and efficiency of our method in offline evaluations. 
After a month of online A/B testing on a travel platform Fliggy, the relative improvement of our method is 5% with respect to seller selection on CTCVR in the searching scenario, and our method further improves pCTR from 3.48% of human pick to 3.53% in the recommendation scenario.|在电子商务中,图像被广泛用于显示更直观的商品信息。图像选择会显著影响用户的点进率。大多数现有的工作都将 CTR 作为寻找合适图像的目标。然而,这些方法在线有效部署是具有挑战性的。此外,所选图像可能与该项目无关,但有利于点击率,导致不良现象的诱惑用户点击该项目。为了解决这些问题,我们提出了一种新的基于内容的召回模型和基于点击率的排序模型的两阶段流水线方法。第一种是基于标题-图像匹配模型和多模态知识图嵌入学习模型的联合方法。第二种是基于 CTR 的视觉感知评分模型,结合了实体文本信息和实体图像。实验结果表明了该方法在离线评估中的有效性和有效性。在旅游平台 Fliggy 上进行了一个月的在线 A/B 测试后,在搜索场景中,我们的方法相对于 CTCVR 上的卖方选择的相对改善为5% ,并且我们的方法进一步将 pCTR 从人类选择的3.48% 提高到推荐场景中的3.53% 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=What+Image+do+You+Need?+A+Two-stage+Framework+for+Image+Selection+in+E-commerce)|0| +|[Templet: A Collaborative System for Knowledge Graph Question Answering over Wikidata](https://doi.org/10.1145/3543873.3587335)|Francisca Suárez, Aidan Hogan|DCC, Universidad de Chile, Chile and Instituto Milenio Fundamentos de los Datos (IMFD), Chile; DCC, Universidad de Chile, Chile|We present Templet: an online question answering (QA) system for Wikidata. Templet is based on the collaboratively-edited repository QAWiki, which collects questions in multiple natural languages along with their corresponding structured queries. Templet generates templates from question–query pairs on QAWiki by replacing key entities with identifiers. Using autocompletion, the user can type a question in natural language, select a template, and again using autocompletion, select the entities they wish to insert into the template’s placeholders, generating a concrete question, query and results. The main objectives of Templet are: (i) to enable users to answer potentially complex questions over Wikidata using natural language templates and autocompletion; (ii) to encourage users to collaboratively create new templates via QAWiki, which in turn can benefit not only Templet, but other QA systems.|我们提出的模板: 一个在线问题回答(QA)系统的 Wikidata。Templet 基于协作编辑的存储库 QAWiki,该存储库用多种自然语言收集问题以及相应的结构化查询。Templet 通过使用标识符替换关键实体,从 QAWiki 上的问题-查询对生成模板。使用自动补全,用户可以用自然语言键入一个问题,选择一个模板,然后再次使用自动补全,选择他们希望插入到模板占位符中的实体,生成一个具体的问题、查询和结果。Templet 的主要目标是: (i)使用户能够通过使用自然语言模板和自动完成来回答 Wikidata 上潜在的复杂问题; (ii)鼓励用户通过 QAWiki 合作创建新的模板,这反过来不仅可以使 Templet 受益,还可以使其他 QA 系统受益。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Templet:+A+Collaborative+System+for+Knowledge+Graph+Question+Answering+over+Wikidata)|0| +|[OptiRef: Query Optimization for Knowledge Bases](https://doi.org/10.1145/3543873.3587342)|Wafaa El Husseini, Cheikh Brahim El Vaigh, François Goasdoué, Hélène Jaudoin|Univ. Bourgogne, France; Univ. Rennes, France|Ontology-mediated query answering (OMQA) consists in asking database queries on a knowledge base (KB); a KB is a set of facts, the KB’s database, described by domain knowledge, the KB’s ontology. FOL-rewritability is the main OMQA technique: it reformulates a query w.r.t. the KB’s ontology so that the evaluation of the reformulated query on the KB’s database computes the correct answers. However, because this technique embeds the domain knowledge relevant to the query into the reformulated query, a reformulated query may be complex and its optimization is the crux of efficiency. We showcase OptiRef that implements a novel, general optimization framework for efficient query answering on datalog ±, description logic, existential rules, OWL and RDF/S KBs. 
OptiRef optimizes reformulated queries by rapidly computing, based on a KB’s database summary, simpler (contained) queries with the same answers. We demonstrate OptiRef’s effectiveness on well-established benchmarks: performance is significantly improved in general, up to several orders of magnitude in the best cases!|本体介导的查询回答(OMQA)包括在知识库(KB)上询问数据库查询; 知识库是一组事实,知识库的数据库,由领域知识描述,知识库的本体。FOL 可重写性是主要的 OMQA 技术: 它重新规范查询 w.r.t. 知识库的本体,以便在知识库的数据库上计算重新规范查询的正确答案。然而,由于这种技术将与查询相关的领域知识嵌入到重构查询中,因此重构查询可能比较复杂,其优化是提高查询效率的关键。我们展示了 OptiRef,它实现了一个新颖的、通用的优化框架,用于在数据目录 ± 、描述逻辑、存在规则、 OWL 和 RDF/S 知识库上进行有效的查询应答。OptiRef 通过快速计算优化重新配置的查询,基于知识库的数据库摘要,使用具有相同答案的更简单(包含)的查询。我们展示了 OptiRef 在完善的基准测试上的有效性: 性能总体上得到了显著的改善,在最好的情况下达到了几个数量级!|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=OptiRef:+Query+Optimization+for+Knowledge+Bases)|0| +|[Learning Topical Structured Interfaces from Medical Research Literature](https://doi.org/10.1145/3543873.3587353)|Maitry Chauhan, Anna Pyayt, Michael N. Gubanov|Florida State University, USA; University of South Florida, USA|Accessing large-scale structured datasets such as WDC or CORD-19 is very challenging. Even if one topic (e.g. COVID-19 vaccine efficacy) is of interest, all topical tables in different sources/papers have hundreds of different schemas, depending on the authors, which significantly complicates both finding and querying them. Here we demonstrate a scalable Meta-profiler system, capable of constructing a structured standardized interface to a topic of interest in large-scale (semi-)structured datasets. This interface, which we call Meta-profile, represents a multi-dimensional meta-data summary for a selected topic of interest, accumulating all differently structured representations of the topical tables in the dataset. Such Meta-profiles can be used as a rich visualization as well as a robust structural query interface simplifying access to large-scale (semi-)structured data for different user segments, such as data scientists and end users.|访问大规模的结构化数据集,如 WDC 或 CORD-19是非常具有挑战性的。即使一个主题(例如2019冠状病毒疾病疫苗效力)是有趣的,不同来源/论文中的所有主题表格都有数百种不同的模式,这取决于作者,这使得查找和查询这些模式变得非常复杂。在这里,我们演示了一个可伸缩的元分析器系统,它能够为大规模(半)结构化数据集中感兴趣的主题构建一个结构化的标准化接口。我们称之为 Meta-profile 的这个接口表示一个选定主题的多维元数据摘要,它积累了数据集中主题表的所有不同结构的表示。这样的元概要文件可以用作丰富的可视化以及健壮的结构化查询界面,简化不同用户段(如数据科学家和最终用户)对大规模(半)结构化数据的访问。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+Topical+Structured+Interfaces+from+Medical+Research+Literature)|0| +|[DGBCT: A Scalable Distributed Gradient Boosting Causal Tree at Alipay](https://doi.org/10.1145/3543873.3584645)|Jun Zhou, Caizhi Tang, Qing Cui, Yi Ding, Longfei Li, Fei Wu|College of Computer Science and Technology, Zhejiang University, China; College of Computer Science and Technology, Zhejiang University, China and Ant Group, China; Ant Group, China|Causal effect estimation has been increasingly emphasized in the past few years. To handle this problem, tree-based causal methods have been widely used due to their robustness and explainability. However, most of the existing methods are limited to running on a single machine, making it difficult to scale up to hundreds of millions of data in typical industrial scenarios. This paper proposes DGBCT, a Distributed Gradient Boosting Causal Tree to tackle such a problem, and the contribution of this paper is threefold. First, we extend the original GBCT method to a multi-treatment setting and take the monotonic constraints into consideration, so that more typical industrial necessities can be resolved with our framework. 
Moreover, we implement DGBCT based on the ‘Controller-Coordinator-Worker’ framework, in which a dual failover mechanism is achieved, and commendable flexibility is ensured. In addition, empirical results show that DGBCT significantly outperforms the state-of-the-art causal trees, and has a near-linear speedup as the number of workers grows. The system is currently deployed in Alipay to support the daily business tasks that involve hundreds of millions of users.|因果效应估计在过去的几年中越来越受到重视。为了解决这个问题,基于树的因果关系方法由于其鲁棒性和可解释性而得到了广泛的应用。然而,现有的大多数方法仅限于在单台机器上运行,因此在典型的工业场景中很难扩展到数亿个数据。本文提出了分布式梯度提升因果树 DGBCT 来解决这个问题,本文的贡献有三个方面。首先,我们将原来的 GBCT 方法扩展到一个多处理环境,并且考虑了单调约束,使得我们的框架能够解决更多的典型工业需求。此外,我们在“控制器-协调器-工作者”框架的基础上实现了 DGBCT,实现了双重故障转移机制,并保证了值得称赞的灵活性。此外,实证结果显示,DGBCT 的表现明显优于最先进的因果树,并且随着工作人员数量的增加有近线性的加速效应。该系统目前部署在支付宝中,以支持涉及数亿用户的日常业务任务。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=DGBCT:+A+Scalable+Distributed+Gradient+Boosting+Causal+Tree+at+Alipay)|0| +|[What Image do You Need? A Two-stage Framework for Image Selection in E-commerce](https://doi.org/10.1145/3543873.3584646)|Sheng You, Chao Wang, Baohua Wu, Jingping Liu, Quan Lu, Guanzhou Han, Yanghua Xiao|East China University of Science and Technology, China; Shanghai University, China; Alibaba Group, China; Fudan University, China|In e-commerce, images are widely used to display more intuitive information about items. Image selection significantly affects the user’s click-through rate (CTR). Most existing work considers the CTR as the target to find an appropriate image. However, these methods are challenging to deploy online efficiently. Also, the selected images may not relate to the item but are profitable to CTR, resulting in the undesirable phenomenon of enticing users to click on the item. To address these issues, we propose a novel two-stage pipeline method with a content-based recall model and a CTR-based ranking model. The first is realized as a joint method based on the title-image matching model and multi-modal knowledge graph embedding learning model. The second is a CTR-based visually aware scoring model, incorporating the entity textual information and entity images. Experimental results show the effectiveness and efficiency of our method in offline evaluations. After a month of online A/B testing on the travel platform Fliggy, the relative improvement of our method is 5% with respect to seller selection on CTCVR in the searching scenario, and our method further improves pCTR from 3.48% of human pick to 3.53% in the recommendation scenario.|在电子商务中,图像被广泛用于显示更直观的商品信息。图像选择会显著影响用户的点进率。大多数现有的工作都将 CTR 作为寻找合适图像的目标。然而,这些方法在线有效部署是具有挑战性的。此外,所选图像可能与该项目无关,但有利于点击率,导致不良现象的诱惑用户点击该项目。为了解决这些问题,我们提出了一种新的基于内容的召回模型和基于点击率的排序模型的两阶段流水线方法。第一种是基于标题-图像匹配模型和多模态知识图嵌入学习模型的联合方法。第二种是基于 CTR 的视觉感知评分模型,结合了实体文本信息和实体图像。实验结果表明了该方法在离线评估中的有效性和有效性。在旅游平台 Fliggy 上进行了一个月的在线 A/B 测试后,在搜索场景中,我们的方法相对于 CTCVR 上的卖方选择的相对改善为5% ,并且我们的方法进一步将 pCTR 从人类选择的3.48% 提高到推荐场景中的3.53% 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=What+Image+do+You+Need?+A+Two-stage+Framework+for+Image+Selection+in+E-commerce)|0| |[Learning Geolocation by Accurately Matching Customer Addresses via Graph based Active Learning](https://doi.org/10.1145/3543873.3584647)|Saket Maheshwary, Saurabh Sohoney|Amazon, India|We propose a novel adaptation of graph-based active learning for customer address resolution or de-duplication, with the aim to determine if two addresses represent the same physical building or not. 
For delivery systems, improving address resolution positively impacts multiple downstream systems such as geocoding, route planning and delivery time estimations, leading to an efficient and reliable delivery experience for both customers and delivery agents. Our proposed approach jointly leverages address text, past delivery information and concepts from graph theory to retrieve informative and diverse record pairs to label. We empirically show the effectiveness of our approach on a manually curated dataset across addresses from India (IN) and the United Arab Emirates (UAE). We achieved absolute improvement in recall on average across IN and UAE while preserving precision over the existing production system. We also introduce delivery point (DP) geocode learning for cold-start addresses as a downstream application of address resolution. In addition to offline evaluation, we also performed online A/B experiments which show that when the production model is augmented with actively learnt record pairs, the delivery precision improved and delivery defects reduced on average across shipments from IN and UAE.|我们提出了一种新的基于图的主动学习的客户地址解析或去重复,目的是确定是否两个地址代表相同的物理建筑物。对于送货系统,提高地址分辨率会对多个下游系统产生积极影响,例如地理编码、路径规划和送货时间估计,从而为客户和送货代理带来高效和可靠的送货体验。我们提出的方法共同利用地址文本,过去的传递信息和概念从图论检索信息和不同的记录对标签。我们通过实验证明了我们的方法在人工管理来自印度(IN)和阿拉伯联合酋长国(UAE)的数据集方面的有效性。在保持现有生产系统精度的同时,我们在 IN 和阿联酋的平均召回率上取得了绝对的提高。我们还介绍了用于冷启动地址的交付点(DP)地理编码学习作为地址解析的下游应用。除了离线评估之外,我们还进行了在线 A/B 实验,结果表明,当生产模型增加了主动学习记录对时,从 IN 和阿联酋发货的交付精度提高了,交付缺陷平均减少了。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+Geolocation+by+Accurately+Matching+Customer+Addresses+via+Graph+based+Active+Learning)|0| |[CAViaR: Context Aware Video Recommendations](https://doi.org/10.1145/3543873.3584658)|Khushhall Chandra Mahajan, Aditya Palnitkar, Ameya Raul, Brad Schumitsch|Meta Inc., USA|Many recommendation systems rely on point-wise models, which score items individually. However, point-wise models generating scores for a video are unable to account for other videos being recommended in a query. Due to this, diversity has to be introduced through the application of heuristic-based rules, which are not able to capture user preferences, or make balanced trade-offs in terms of diversity and item relevance. In this paper, we propose a novel method which introduces diversity by modeling the impact of low diversity on user's engagement on individual items, thus being able to account for both diversity and relevance to adjust item scores. The proposed method is designed to be easily pluggable into existing large-scale recommender systems, while introducing minimal changes in the recommendations stack. Our models show significant improvements in offline metrics based on the normalized cross entropy loss compared to production point-wise models. 
Our approach also shows a substantial increase of 1.7% in topline engagements coupled with a 1.5% increase in daily active users in an A/B test with live traffic on Facebook Watch, which translates into an increase of millions in the number of daily active users for the product.|许多推荐系统依赖于逐点模型,这种模型对项目进行单独评分。然而,为视频生成分数的点模型无法解释在查询中推荐的其他视频。因此,必须通过应用启发式规则来引入多样性,这些规则不能捕捉用户的偏好,也不能在多样性和项目相关性方面做出平衡的权衡。本文提出了一种引入多样性的新方法,该方法通过建立低多样性对用户参与个别项目的影响模型,从而能够同时考虑多样性和相关性来调整项目得分。所提出的方法被设计成可以很容易地插入到现有的大规模推荐系统中,同时在推荐堆栈中引入最小的变化。与生产点模型相比,我们的模型显示了基于归一化交叉熵损失的离线指标的显著改进。我们的方法还显示,顶线参与度大幅增加了1.7% ,在 Facebook Watch 上的实时流量的 A/B 测试中,每日活跃用户增加了1.5% ,这意味着产品的每日活跃用户数量增加了数百万。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=CAViaR:+Context+Aware+Video+Recommendations)|0| |[Towards Building a Mobile App for People on the Spectrum](https://doi.org/10.1145/3543873.3587533)|Victoria Firsanova|Department of Mathematical Linguistics, Saint Petersburg State University, Russian Federation|The inclusion of autistic people can be augmented by a mobile app that provides information without a human mediator, making information perception more liberating for people on the spectrum. This paper is an overview of a doctoral work dedicated to the development of a web-based mobile tool for supporting the inclusion of people on the autism spectrum. The work includes UX/UI research conducted with psychiatry experts, web information retrieval study and neural question-answering research. Currently, the study results comprise several mobile app layouts, a retriever-reader model design and a fine-tuned neural network for extractive question-answering. Source code and other resources are available at https://github.com/vifirsanova/empi.|自闭症患者的包容性可以通过移动应用程序得到加强,这个程序可以在没有人类中介的情况下提供信息,从而使自闭症患者的信息感知更加自由。这篇文章是一篇博士论文的概述,该论文致力于开发一种基于网络的移动工具,以支持孤独症患者的融入。这项工作包括由精神病学专家进行的用户体验/用户界面研究、网络信息检索研究和神经问答研究。目前,研究结果包括几个移动应用程序的布局、一个检索-阅读器模型设计,以及一个用于抽取式问答的微调神经网络。源代码和其他资源可在 https://github.com/vifirsanova/empi 获得。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Towards+Building+a+Mobile+App+for+People+on+the+Spectrum)|0| |[Multi-turn mediated solutions for Conversational Artificial Intelligent systems leveraging graph-based techniques](https://doi.org/10.1145/3543873.3587540)|Riya Naik|Computer Science & Information Systems, Birla Institute Of Technology And Science, Pilani, India|The current era is dominated by intelligent Question Answering (QA) systems that can instantly answer almost all of users’ questions, saving search time and increasing the throughput and precision in the applied domain. A vast amount of work is being carried out in QA systems to deliver better content satisfying users’ information needs [2]. Since QA systems are ascending the cycle of emerging technologies, there are potential research gaps that can be explored. QA systems form a significant part of Conversational Artificial Intelligent systems giving rise to a new research pathway, i.e., Conversational Question Answering (CQA) systems [32]. We propose to design and develop a CQA system leveraging Hypergraph-based techniques. The approach focuses on the multi-turn conversation and multi-context to gauge users’ exact information needs and deliver better answers. We further aim to address "supporting evidence-based retrieval" for fact-based responsible answer generation.
Since the QA system requires a large amount of data and processing, we also intend to investigate hardware performance for effective system utilization.|当今时代的主流是智能问答(QA)系统,它可以即时回答几乎所有的问题,节省用户搜索时间,提高应用领域的吞吐量和精度。为了提供更好的内容以满足用户的信息需求,QA 系统正在进行大量的工作[2]。由于 QA 系统正在提升新兴技术的周期,因此存在可以探索的潜在研究差距。问答系统构成了会话人工智能系统的重要组成部分,从而产生了一种新的研究途径,即会话问答(CQA)系统[32]。我们建议利用基于 Hypergraph 的技术设计和开发一个 CQA 系统。该方法侧重于多回合会话和多上下文,以衡量用户的确切信息需求和提供更好的答案。我们进一步的目标是解决“支持基于证据的检索”的事实为基础的负责任的答案生成。由于 QA 系统需要大量的数据和处理,因此我们还打算研究硬件性能,以便有效地利用系统。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Multi-turn+mediated+solutions+for+Conversational+Artificial+Intelligent+systems+leveraging+graph-based+techniques)|0| |[Graph and Embedding based Approach for Text Clustering: Topic Detection in a Large Multilingual Public Consultation](https://doi.org/10.1145/3543873.3587627)|Nicolas Stefanovitch, Guillaume Jacquet, Bertrand De Longueville|European Commission - Joint Research Centre, Italy|We present a novel algorithm for multilingual text clustering built upon two well studied techniques: multilingual aligned embedding and community detection in graphs. The aim of our algorithm is to discover underlying topics in a multilingual dataset using clustering. We present both a numerical evaluation using silhouette and V-measure metrics, and a qualitative evaluation for which we propose a new systematic approach. Our algorithm presents robust overall performance and its results were empirically evaluated by an analyst. The work we present was done in the context of a large multilingual public consultation, for which our new algorithm was deployed and used on a daily basis.|本文提出了一种新的多语言文本聚类算法,该算法基于两种已经得到广泛研究的技术: 多语言对齐嵌入和图中的社区检测。我们的算法的目的是使用聚类来发现多语言数据集中的基本主题。我们提出了一个数值评估使用轮廓和 V 测量度量,和一个定性的评估,我们提出了一个新的系统方法。我们的算法提出了稳健的整体性能,其结果是经验性的分析评价。我们介绍的工作是在大规模多语种公众协商的背景下完成的,我们的新算法每天都得到部署和使用。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Graph+and+Embedding+based+Approach+for+Text+Clustering:+Topic+Detection+in+a+Large+Multilingual+Public+Consultation)|0| -|[Dual-grained Text-Image Olfactory Matching Model with Mutual Promotion Stages](https://doi.org/10.1145/3543873.3587649)|Yi Shao, Jiande Sun, Ye Jiang, Jing Li|Qingdao University of Science and Technology, China; Shandong Normal University, China; Shandong Management University, China|Olfactory experience has great advantages in awakening human memories and emotions, which may even surpass vision in some cases. Studies have proved that olfactory scene descriptions in images and text content can also arouse human olfactory imagination, but there are still few studies on solving related problems from the perspective of computer vision and NLP. This paper proposes a multimodal model that can detect similar olfactory experience in paired text-image samples. The model builds two stages, coarse-grained and fine-grained. The model adopts the feature fusion method based on pre-trained CLIP for coarse-grained matching training to obtain a preliminary feature extractor to promote fine-grained matching training, and then uses the similarity calculation method based on stacked cross attention for fine-grained matching training to obtain the final feature extractor which in turn promotes coarse-grained matching training. Finally, we manually build an approximate olfactory nouns list during fine-grained matching training, which not only yields significantly better performance when fed back to the fine-grained matching process, but this noun list can be used for future research. 
Experiments on the MUSTI task dataset of MediaEval2022 prove that the coarse-grained and fine-grained matching stages in proposed model both perform well, and both F1 measures exceed the existing baseline models.|嗅觉经验在唤醒人类记忆和情感方面有很大的优势,在某些情况下甚至可能超越视觉。研究表明,图像和文本内容中的嗅觉场景描述也能激发人类的嗅觉想象,但从计算机视觉和自然语言处理的角度解决相关问题的研究还很少。本文提出了一个多模态模型,可以检测相似的嗅觉经验配对文本图像样本。该模型分为粗粒度和细粒度两个阶段。该模型采用基于预训练 CLIP 的特征融合方法进行粗粒度匹配训练,得到初步的特征提取器以促进细粒度匹配训练,然后采用基于叠加交叉注意的相似度计算方法进行细粒度匹配训练,得到最终的特征提取器以促进粗粒度匹配训练。最后,我们在细粒度匹配训练过程中手动构建了一个近似的嗅觉名词列表,这不仅在反馈到细粒度匹配过程时产生了明显更好的性能,而且这个名词列表可以用于未来的研究。在 MediaEval2022的 MUSTI 任务数据集上的实验证明,所提出的模型中的粗粒度和细粒度匹配阶段都表现良好,而且两种 F1测度都超过了现有的基线模型。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Dual-grained+Text-Image+Olfactory+Matching+Model+with+Mutual+Promotion+Stages)|0| +|[Dual-grained Text-Image Olfactory Matching Model with Mutual Promotion Stages](https://doi.org/10.1145/3543873.3587649)|Yi Shao, Jiande Sun, Ye Jiang, Jing Li|Shandong Normal University, China; Qingdao University of Science and Technology, China; Shandong Management University, China|Olfactory experience has great advantages in awakening human memories and emotions, which may even surpass vision in some cases. Studies have proved that olfactory scene descriptions in images and text content can also arouse human olfactory imagination, but there are still few studies on solving related problems from the perspective of computer vision and NLP. This paper proposes a multimodal model that can detect similar olfactory experience in paired text-image samples. The model comprises two stages: coarse-grained and fine-grained. The model adopts the feature fusion method based on pre-trained CLIP for coarse-grained matching training to obtain a preliminary feature extractor to promote fine-grained matching training, and then uses the similarity calculation method based on stacked cross attention for fine-grained matching training to obtain the final feature extractor which in turn promotes coarse-grained matching training. Finally, we manually build an approximate olfactory noun list during fine-grained matching training, which not only yields significantly better performance when fed back to the fine-grained matching process, but can also be used in future research. Experiments on the MUSTI task dataset of MediaEval2022 prove that the coarse-grained and fine-grained matching stages in the proposed model both perform well, and both F1 measures exceed those of the existing baseline models.|嗅觉经验在唤醒人类记忆和情感方面有很大的优势,在某些情况下甚至可能超越视觉。研究表明,图像和文本内容中的嗅觉场景描述也能激发人类的嗅觉想象,但从计算机视觉和自然语言处理的角度解决相关问题的研究还很少。本文提出了一个多模态模型,可以检测相似的嗅觉经验配对文本图像样本。该模型分为粗粒度和细粒度两个阶段。该模型采用基于预训练 CLIP 的特征融合方法进行粗粒度匹配训练,得到初步的特征提取器以促进细粒度匹配训练,然后采用基于叠加交叉注意的相似度计算方法进行细粒度匹配训练,得到最终的特征提取器以促进粗粒度匹配训练。最后,我们在细粒度匹配训练过程中手动构建了一个近似的嗅觉名词列表,这不仅在反馈到细粒度匹配过程时产生了明显更好的性能,而且这个名词列表可以用于未来的研究。在 MediaEval2022的 MUSTI 任务数据集上的实验证明,所提出的模型中的粗粒度和细粒度匹配阶段都表现良好,而且两种 F1测度都超过了现有的基线模型。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Dual-grained+Text-Image+Olfactory+Matching+Model+with+Mutual+Promotion+Stages)|0| |[MEMER - Multimodal Encoder for Multi-signal Early-stage Recommendations](https://doi.org/10.1145/3543873.3587679)|Mohit Agarwal, Srijan Saket, Rishabh Mehrotra|ShareChat, India|Millions of content items are created daily on platforms like YouTube, Facebook, TikTok, etc. Most such large-scale recommender systems are data demanding, thus taking substantial time for content embeddings to mature. This problem is aggravated when there is no behavioral data available for new content.
Poor-quality recommendations for these items lead to user dissatisfaction and short content shelf-life. In this paper we propose a solution, MEMER (Multimodal Encoder for Multi-signal Early-stage Recommendations), that utilises the multimodal semantic information of content and uses it to generate better-quality embeddings for early-stage items. We demonstrate the flexibility of the framework by extending it to various explicit and implicit user actions. Using these learnt embeddings, we conduct offline and online experiments to verify its effectiveness. The predicted embeddings show significant gains in online early-stage experiments for both videos and images (videos: 44% relative gain in click-through rate, 46% relative gain in explicit engagements, 9% relative gain in successful video play, 20% relative reduction in skips, images: 56% relative gain in explicit engagements). This also compares well against the performance of mature embeddings (83.3% RelaImpr (RI) [18] in Successful Video Play, 97.8% RelaImpr in Clicks).|每天都有数以百万计的内容在 YouTube、 Facebook、 TikTok 等平台上被创造出来。大多数这样的大规模推荐系统都需要大量的数据,因此内容嵌入需要大量的时间才能成熟。当没有可用于新内容的行为数据时,这个问题就更加严重了。这些项目的低质量推荐导致用户不满意和内容保质期短。在本文中,我们提出了一个解决方案 MEMER (多信号早期推荐的多模式编码器) ,利用多模式内容语义信息,并使用它为早期项目产生更好的质量嵌入。我们通过将框架扩展到各种显式和隐式用户操作来演示框架的灵活性。使用这些学习嵌入,我们进行离线和在线实验,以验证其有效性。预测的嵌入在视频和图像的在线早期实验中都显示出显著的增益(视频: 点击率相对增益44% ,显性参与相对增益46% ,成功视频播放相对增益9% ,跳过相对减少20% ,图像: 显性参与相对增益56%)。这也与成熟嵌入的性能相当(83.3% RelaImpr (RI)[18]在成功的视频播放,97.8% RelaImpr 在点击)。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=MEMER+-+Multimodal+Encoder+for+Multi-signal+Early-stage+Recommendations)|0| -|[Social Re-Identification Assisted RTO Detection for E-Commerce](https://doi.org/10.1145/3543873.3587620)|Hitkul Jangra, Abinaya K, Soham Saha, Satyajit Banerjee, Muthusamy Chelliah, Ponnurangam Kumaraguru|Flipkart, India; IIIT Hyderabad, India; IIIT Delhi, India|E-commerce features like easy cancellations, returns, and refunds can be exploited by bad actors or uninformed customers, leading to revenue loss for organization. One such problem faced by e-commerce platforms is Return To Origin (RTO), where the user cancels an order while it is in transit for delivery. In such a scenario platform faces logistics and opportunity costs. Traditionally, models trained on historical trends are used to predict the propensity of an order becoming RTO. Sociology literature has highlighted clear correlations between socio-economic indicators and users’ tendency to exploit systems to gain financial advantage. Social media profiles have information about location, education, and profession which have been shown to be an estimator of socio-economic condition. We believe combining social media data with e-commerce information can lead to improvements in a variety of tasks like RTO, recommendation, fraud detection, and credit modeling. In our proposed system, we find the public social profile of an e-commerce user and extract socio-economic features. Internal data fused with extracted social features are used to train a RTO order detection model. Our system demonstrates a performance improvement in RTO detection of 3.1% and 19.9% on precision and recall, respectively.
Our system directly impacts the bottom line revenue and shows the applicability of social re-identification in e-commerce.|电子商务的特点,如容易取消,退货和退款可以利用不良行为者或不知情的客户,导致收入损失的组织。电子商务平台面临的一个这样的问题是返还原产地(RTO) ,即用户在运输途中取消订单。在这种情况下,平台面临物流和机会成本。传统上,根据历史趋势训练的模型被用来预测订单成为 RTO 的倾向。社会学文献强调了社会经济指标与用户利用系统获取金融优势的倾向之间的明确相关性。社交媒体档案包含有关地理位置、教育和职业的信息,这些信息已被证明是社会经济状况的估计值。我们相信,将社交媒体数据与电子商务信息结合起来,可以改进诸如 RTO、推荐、欺诈检测和信用建模等多种任务。在我们提出的系统中,我们找到电子商务用户的公共社会轮廓,并提取社会经济特征。利用内部数据与提取的社会特征融合,建立了 RTO 订单检测模型。我们的系统在检测准确率召回率方面的性能改善分别为3.1% 和19.9% 。我们的系统直接影响到底线收入,显示了社会再认同在电子商务中的适用性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Social+Re-Identification+Assisted+RTO+Detection+for+E-Commerce)|0| +|[Social Re-Identification Assisted RTO Detection for E-Commerce](https://doi.org/10.1145/3543873.3587620)|Hitkul Jangra, Abinaya K, Soham Saha, Satyajit Banerjee, Muthusamy Chelliah, Ponnurangam Kumaraguru|IIIT Hyderabad, India; IIIT Delhi, India; Flipkart, India|E-commerce features like easy cancellations, returns, and refunds can be exploited by bad actors or uninformed customers, leading to revenue loss for the organization. One such problem faced by e-commerce platforms is Return To Origin (RTO), where the user cancels an order while it is in transit for delivery. In such a scenario, the platform faces logistics and opportunity costs. Traditionally, models trained on historical trends are used to predict the propensity of an order becoming RTO. Sociology literature has highlighted clear correlations between socio-economic indicators and users’ tendency to exploit systems to gain financial advantage. Social media profiles have information about location, education, and profession which have been shown to be an estimator of socio-economic condition. We believe combining social media data with e-commerce information can lead to improvements in a variety of tasks like RTO, recommendation, fraud detection, and credit modeling. In our proposed system, we find the public social profile of an e-commerce user and extract socio-economic features. Internal data fused with extracted social features are used to train an RTO order detection model. Our system demonstrates a performance improvement in RTO detection of 3.1% and 19.9% on precision and recall, respectively. Our system directly impacts the bottom line revenue and shows the applicability of social re-identification in e-commerce.|容易取消、退货和退款等电子商务特性可能被不良行为者或不知情的客户利用,给组织造成收入损失。电子商务平台面临的一个这样的问题是返还原产地(RTO) ,即用户在运输途中取消订单。在这种情况下,平台面临物流和机会成本。传统上,根据历史趋势训练的模型被用来预测订单成为 RTO 的倾向。社会学文献强调了社会经济指标与用户利用系统获取金融优势的倾向之间的明确相关性。社交媒体档案包含有关地理位置、教育和职业的信息,这些信息已被证明是社会经济状况的估计值。我们相信,将社交媒体数据与电子商务信息结合起来,可以改进诸如 RTO、推荐、欺诈检测和信用建模等多种任务。在我们提出的系统中,我们找到电子商务用户的公共社会轮廓,并提取社会经济特征。利用内部数据与提取的社会特征融合,建立了 RTO 订单检测模型。我们的系统在 RTO 检测的准确率和召回率上分别提升了3.1% 和19.9% 。我们的系统直接影响到底线收入,显示了社会再认同在电子商务中的适用性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Social+Re-Identification+Assisted+RTO+Detection+for+E-Commerce)|0| |[Contextual Response Interpretation for Automated Structured Interviews: A Case Study in Market Research](https://doi.org/10.1145/3543873.3587657)|Harshita Sahijwani, Kaustubh D. Dhole, Ankur P. Purwar, Venugopal Vasudevan, Eugene Agichtein|Procter & Gamble, USA; Emory University, USA; Procter & Gamble, Singapore|Structured interviews are used in many settings, importantly in market research on topics such as brand perception, customer habits, or preferences, which are critical to product development, marketing, and e-commerce at large. Such interviews generally consist of a series of questions that are asked to a participant.
These interviews are typically conducted by skilled interviewers, who interpret the responses from the participants and can adapt the interview accordingly. Using automated conversational agents to conduct such interviews would enable reaching a much larger and potentially more diverse group of participants than currently possible. However, the technical challenges involved in building such a conversational system are relatively unexplored. To learn more about these challenges, we convert a market research multiple-choice questionnaire to a conversational format and conduct a user study. We address the key task of conducting structured interviews, namely interpreting the participant's response, for example, by matching it to one or more predefined options. Our findings can be applied to improve response interpretation for the information elicitation phase of conversational recommender systems.|结构化访谈用于许多场合,重要的是用于市场调查,如品牌认知、客户习惯或偏好,这些对产品开发、市场营销和电子商务至关重要。这种面试通常包括向参与者提出的一系列问题。这些面试通常由技术熟练的面试官进行,他们解释参与者的回答,并能相应地调整面试。使用自动对话代理进行这种访谈将能够接触到比目前可能的更多、可能更多样化的参与者群体。然而,构建这样一个会话系统所涉及的技术挑战相对而言还没有得到探索。为了更多地了解这些挑战,我们将市场调查多项选择问卷转换为会话形式,并进行用户研究。我们解决的关键任务是进行结构化访谈,即解释参与者的反应,例如,通过匹配一个或多个预先定义的选项。我们的研究结果可以用于改善会话推荐系统中信息激发阶段的响应解释。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Contextual+Response+Interpretation+for+Automated+Structured+Interviews:+A+Case+Study+in+Market+Research)|0| |[Knowledge Graph-Enhanced Neural Query Rewriting](https://doi.org/10.1145/3543873.3587678)|Shahla Farzana, Qunzhi Zhou, Petar Ristoski|University of Illinois Chicago, USA; eBay Inc, USA; eBay Inc., USA|The main task of an e-commerce search engine is to semantically match the user query to the product inventory and retrieve the most relevant items that match the user’s intent. This task is not trivial as often there can be a mismatch between the user’s intent and the product inventory for various reasons, the most prevalent being: (i) the buyers and sellers use different vocabularies, which leads to a mismatch; (ii) the inventory doesn’t contain products that match the user’s intent. To build a successful e-commerce platform it is of paramount importance to be able to address both of these challenges. To do so, query rewriting approaches are used, which try to bridge the semantic gap between the user’s intent and the available product inventory. Such approaches use a combination of query token dropping, replacement and expansion. In this work we introduce a novel Knowledge Graph-enhanced neural query rewriting in the e-commerce domain. We use a relationship-rich product Knowledge Graph to infuse auxiliary knowledge in a transformer-based query rewriting deep neural network. 
Experiments on two tasks, query pruning and complete query rewriting, show that our proposed approach significantly outperforms a baseline BERT-based query rewriting solution.|电子商务搜索引擎的主要任务是在语义上将用户查询与产品库存相匹配,并检索与用户意图相匹配的最相关项目。这个任务并不是微不足道的,因为在用户的意图和产品库存之间经常会因为各种原因而产生不匹配,最普遍的原因是: (i)买家和卖家使用不同的词汇,这会导致不匹配; (ii)库存不包含符合用户意图的产品。要建立一个成功的电子商务平台,最重要的是能够应对这两个挑战。为此,使用了查询重写方法,这些方法试图弥合用户意图和可用产品目录之间的语义差距。这种方法结合使用查询标记删除、替换和扩展。在这项工作中,我们介绍了电子商务领域中一种新的知识图增强神经查询重写方法。在基于 Transformer 的查询重写深度神经网络中,我们使用一个关系丰富的产品知识图来注入辅助知识。通过对查询裁剪和完全查询重写两个任务的实验表明,该方法的性能明显优于基于 BERT 的基线查询重写方案。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Knowledge+Graph-Enhanced+Neural+Query+Rewriting)|0| -|[Fairness-aware Differentially Private Collaborative Filtering](https://doi.org/10.1145/3543873.3587577)|Zhenhuan Yang, Yingqiang Ge, Congzhe Su, Dingxian Wang, Xiaoting Zhao, Yiming Ying|University at Albany, SUNY, USA; Rutgers University, USA; Etsy, USA|Recently, there has been an increasing adoption of differential privacy guided algorithms for privacy-preserving machine learning tasks. However, the use of such algorithms comes with trade-offs in terms of algorithmic fairness, which has been widely acknowledged. Specifically, we have empirically observed that the classical collaborative filtering method, trained by differentially private stochastic gradient descent (DP-SGD), results in a disparate impact on user groups with respect to different user engagement levels. This, in turn, causes the original unfair model to become even more biased against inactive users. To address the above issues, we propose \textbf{DP-Fair}, a two-stage framework for collaborative filtering based algorithms. Specifically, it combines differential privacy mechanisms with fairness constraints to protect user privacy while ensuring fair recommendations. The experimental results, based on Amazon datasets, and user history logs collected from Etsy, one of the largest e-commerce platforms, demonstrate that our proposed method exhibits superior performance in terms of both overall accuracy and user group fairness on both shallow and deep recommendation models compared to vanilla DP-SGD.|最近,在保护隐私的机器学习任务中越来越多地采用差分隐私指导算法。然而,这种算法的使用伴随着算法公平性方面的权衡,这已经得到了广泛的认可。具体来说,我们已经经验性地观察到,由差异私人协同过滤(DP-sgd)训练的经典随机梯度下降方法,在不同的用户参与水平方面对用户组产生了不同的影响。这反过来又导致原来的不公平模型对非活动用户变得更加有偏见。为了解决上述问题,我们提出 textbf { DP-fair } ,一个基于协同过滤的算法的两阶段框架。具体来说,它结合了差分隐私机制和公平约束,以保护用户隐私,同时确保公平推荐。基于 Amazon 数据集的实验结果,以及从最大的电子商务平台之一 Etsy 收集的用户历史记录表明,与普通的 DP-SGD 相比,我们提出的方法在浅层和深层推荐模型的总体准确性和用户组公平性方面表现出更好的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Fairness-aware+Differentially+Private+Collaborative+Filtering)|0| +|[Fairness-aware Differentially Private Collaborative Filtering](https://doi.org/10.1145/3543873.3587577)|Zhenhuan Yang, Yingqiang Ge, Congzhe Su, Dingxian Wang, Xiaoting Zhao, Yiming Ying|Rutgers University, USA; University at Albany, SUNY, USA; Etsy, USA|Recently, there has been an increasing adoption of differential privacy guided algorithms for privacy-preserving machine learning tasks. However, the use of such algorithms comes with trade-offs in terms of algorithmic fairness, which has been widely acknowledged. Specifically, we have empirically observed that the classical collaborative filtering method, trained by differentially private stochastic gradient descent (DP-SGD), results in a disparate impact on user groups with respect to different user engagement levels. This, in turn, causes the original unfair model to become even more biased against inactive users.
To address the above issues, we propose DP-Fair, a two-stage framework for collaborative filtering based algorithms. Specifically, it combines differential privacy mechanisms with fairness constraints to protect user privacy while ensuring fair recommendations. The experimental results, based on Amazon datasets and user history logs collected from Etsy, one of the largest e-commerce platforms, demonstrate that our proposed method exhibits superior performance in terms of both overall accuracy and user group fairness on both shallow and deep recommendation models compared to vanilla DP-SGD.|最近,在保护隐私的机器学习任务中越来越多地采用差分隐私指导算法。然而,这种算法的使用伴随着算法公平性方面的权衡,这已经得到了广泛的认可。具体来说,我们已经经验性地观察到,使用差分隐私随机梯度下降(DP-SGD)训练的经典协同过滤方法,在不同的用户参与水平方面对用户组产生了不同的影响。这反过来又导致原来的不公平模型对非活动用户变得更加有偏见。为了解决上述问题,我们提出 DP-Fair,一个基于协同过滤的算法的两阶段框架。具体来说,它结合了差分隐私机制和公平约束,以保护用户隐私,同时确保公平推荐。基于 Amazon 数据集的实验结果,以及从最大的电子商务平台之一 Etsy 收集的用户历史记录表明,与普通的 DP-SGD 相比,我们提出的方法在浅层和深层推荐模型的总体准确性和用户组公平性方面表现出更好的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Fairness-aware+Differentially+Private+Collaborative+Filtering)|0| |[Psychotherapy AI Companion with Reinforcement Learning Recommendations and Interpretable Policy Dynamics](https://doi.org/10.1145/3543873.3587623)|Baihan Lin, Guillermo A. Cecchi, Djallel Bouneffouf|Columbia University, USA; IBM TJ Watson Research Center, USA|We introduce a Reinforcement Learning Psychotherapy AI Companion that generates topic recommendations for therapists based on patient responses. The system uses Deep Reinforcement Learning (DRL) to generate multi-objective policies for four different psychiatric conditions: anxiety, depression, schizophrenia, and suicidal cases. We present our experimental results on the accuracy of recommended topics using three different scales of working alliance ratings: task, bond, and goal. We show that the system is able to capture the real data (historical topics discussed by the therapists) relatively well, and that the best-performing models vary by disorder and rating scale. To gain interpretable insights into the learned policies, we visualize policy trajectories in a 2D principal component analysis space and transition matrices. These visualizations reveal distinct patterns in the policies trained with different reward signals and trained on different clinical diagnoses. Our system's success in generating DIsorder-Specific Multi-Objective Policies (DISMOP) and interpretable policy dynamics demonstrates the potential of DRL in providing personalized and efficient therapeutic recommendations.|我们介绍一个基于强化学习的心理治疗 AI 伴侣,根据患者的反应为治疗师提供主题建议。该系统使用深度强化学习(DRL)为四种不同的精神疾病(焦虑症、抑郁症、精神分裂症和自杀病例)制定多目标策略。我们使用三种不同的工作联盟评分尺度(任务、联系和目标)对推荐话题的准确性进行了实验研究。我们表明,该系统能够相对较好地捕获真实数据(治疗师讨论的历史主题) ,并且表现最好的模型因障碍和评分尺度而异。为了获得对所学策略的可解释的见解,我们在二维主成分分析空间和转换矩阵中可视化策略轨迹。这些可视化显示了不同奖励信号训练和不同临床诊断训练的策略的不同模式。我们的系统在产生疾病特异性多目标策略(DISMOP)和可解释的策略动态方面的成功表明 DRL 在提供个性化和有效的治疗建议方面的潜力。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Psychotherapy+AI+Companion+with+Reinforcement+Learning+Recommendations+and+Interpretable+Policy+Dynamics)|0| |[Investigating Action-Space Generalization in Reinforcement Learning for Recommendation Systems](https://doi.org/10.1145/3543873.3587661)|Abhishek Naik, Bo Chang, Alexandros Karatzoglou, Martin Mladenov, Ed H. Chi, Minmin Chen|Google Research, USA; University of Alberta, Canada and Alberta Machine Intelligence Institute (Amii), Canada|Recommender systems are used to suggest items to users based on the users’ preferences.
Such systems often deal with massive item sets and incredibly sparse user-item interactions, which makes it very challenging to generate high-quality personalized recommendations. Reinforcement learning (RL) is a framework for sequential decision making and naturally formulates recommender-system tasks: recommending items as actions in different user and context states to maximize long-term user experience. We investigate two RL policy parameterizations that generalize sparse user-item interactions by leveraging the relationships between actions: parameterizing the policy over action features as a softmax or Gaussian distribution. Our experiments on synthetic problems suggest that the Gaussian parameterization—which is not commonly used on recommendation tasks—is more robust to the set of action features than the softmax parameterization. Based on these promising results, we propose a more thorough investigation of the theoretical properties and empirical benefits of the Gaussian parameterization for recommender systems.|推荐系统用于根据用户的喜好向用户推荐项目。这样的系统经常处理大量的项目集和难以置信的稀疏的用户-项目交互,这使得生成高质量的个性化推荐非常具有挑战性。强化学习(RL)是一个序贯决策框架,可以自然地刻画推荐系统的任务: 在不同的用户和上下文状态下,推荐项目作为行动,以最大限度地提高长期用户体验。我们研究了两种 RL 策略参数化,它们通过利用操作之间的关系来推广稀疏的用户-项目交互: 将策略参数化为 softmax 或者正态分布。我们在合成问题上的实验表明,高斯参数化(在推荐任务中并不常用)比 softmax 参数化对动作特征集的鲁棒性更强。基于这些有希望的结果,我们建议对高斯参数化在推荐系统中的理论特性和经验效益进行更深入的研究。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Investigating+Action-Space+Generalization+in+Reinforcement+Learning+for+Recommendation+Systems)|0| -|[Conversion of Legal Agreements into Smart Legal Contracts using NLP](https://doi.org/10.1145/3543873.3587554)|Eason Chen, Niall Roche, YuenHsien Tseng, Walter Hernández, Jiangbo Shangguan, Alastair Moore|HSBC Business School, Peking University, United Kingdom; National Taiwan Normal University, Taiwan; University College London, United Kingdom|A Smart Legal Contract (SLC) is a specialized digital agreement comprising natural language and computable components. The Accord Project provides an open-source SLC framework containing three main modules: Cicero, Concerto, and Ergo. Currently, we need lawyers, programmers, and clients to work together with great effort to create a usable SLC using the Accord Project. This paper proposes a pipeline to automate the SLC creation process with several Natural Language Processing (NLP) models to convert law contracts to the Accord Project's Concerto model. After evaluating the proposed pipeline, we discovered that our NER pipeline accurately detects CiceroMark from Accord Project template text with an accuracy of 0.8. Additionally, our Question Answering method can extract one-third of the Concerto variables from the template text. We also delve into some limitations and possible future research for the proposed pipeline. Finally, we describe a web interface enabling users to build SLCs.
This interface leverages the proposed pipeline to convert text documents to Smart Legal Contracts by using NLP models.|智能法律合同(SLC)是一种专门的数字协议,包括自然语言和可计算组件。Accord Project 提供了一个开源的 SLC 框架,其中包含三个主要模块: Cicero、 Concerto 和 Ergo。目前,我们需要律师、程序员和客户共同努力,使用 AccordProject 创建一个可用的 SLC。本文提出了一种使用多种自然语言处理(NLP)模型实现 SLC 创建过程自动化的流水线,将法律合同转换为 Accord Project 的 Concerto 模型。经过评估后,我们发现我们的 NER 流水线能够准确地从雅阁项目模板文本中检测到 CiceroMark,准确率为0.8。此外,我们的问题回答方法可以提取三分之一的协奏曲变量从模板文本。我们还深入探讨了一些局限性和可能的未来研究的建议管道。最后,我们描述了一个允许用户构建 SLC 的 Web 界面。该接口利用拟议的管道,通过使用 NLP 模型将文本文档转换为智能法律合同。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Conversion+of+Legal+Agreements+into+Smart+Legal+Contracts+using+NLP)|0| +|[Conversion of Legal Agreements into Smart Legal Contracts using NLP](https://doi.org/10.1145/3543873.3587554)|Eason Chen, Niall Roche, YuenHsien Tseng, Walter Hernández, Jiangbo Shangguan, Alastair Moore|National Taiwan Normal University, Taiwan; HSBC Business School, Peking University, United Kingdom; University College London, United Kingdom|A Smart Legal Contract (SLC) is a specialized digital agreement comprising natural language and computable components. The Accord Project provides an open-source SLC framework containing three main modules: Cicero, Concerto, and Ergo. Currently, we need lawyers, programmers, and clients to work together with great effort to create a usable SLC using the Accord Project. This paper proposes a pipeline to automate the SLC creation process with several Natural Language Processing (NLP) models to convert law contracts to the Accord Project's Concerto model. After evaluating the proposed pipeline, we discovered that our NER pipeline detects CiceroMark from Accord Project template text with an accuracy of 0.8. Additionally, our Question Answering method can extract one-third of the Concerto variables from the template text. We also delve into some limitations and possible future research for the proposed pipeline. Finally, we describe a web interface enabling users to build SLCs. This interface leverages the proposed pipeline to convert text documents to Smart Legal Contracts by using NLP models.|智能法律合同(SLC)是一种专门的数字协议,包括自然语言和可计算组件。Accord Project 提供了一个开源的 SLC 框架,其中包含三个主要模块: Cicero、 Concerto 和 Ergo。目前,我们需要律师、程序员和客户共同努力,使用 Accord Project 创建一个可用的 SLC。本文提出了一种使用多种自然语言处理(NLP)模型实现 SLC 创建过程自动化的流水线,将法律合同转换为 Accord Project 的 Concerto 模型。经过评估后,我们发现我们的 NER 流水线能够从 Accord Project 模板文本中检测到 CiceroMark,准确率为0.8。此外,我们的问答方法可以从模板文本中提取三分之一的 Concerto 变量。我们还深入探讨了该流水线的一些局限性和未来可能的研究方向。最后,我们描述了一个允许用户构建 SLC 的 Web 界面。该接口利用拟议的管道,通过使用 NLP 模型将文本文档转换为智能法律合同。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Conversion+of+Legal+Agreements+into+Smart+Legal+Contracts+using+NLP)|0| |[Query-Driven Knowledge Graph Construction using Question Answering and Multimodal Fusion](https://doi.org/10.1145/3543873.3587567)|Yang Peng|University of Florida, USA|Over recent years, large knowledge bases have been constructed to store massive knowledge graphs. However, these knowledge graphs are highly incomplete. To solve this problem, we propose a web-based question answering system with multimodal fusion of unstructured and structured information, to fill in missing information for knowledge bases. To utilize unstructured information from the Web for knowledge graph construction, we design multimodal features and question templates to extract missing facts, which can achieve good quality with very few questions.
The question answering system also employs structured information from knowledge bases, such as entity types and entity-to-entity relatedness, to help improve extraction quality. To improve system efficiency, we utilize a few query-driven techniques for web-based question answering to reduce the runtime and provide fast responses to user queries. Extensive experiments have been conducted to demonstrate the effectiveness and efficiency of our system.|近年来,人们建立了大型知识库来存储海量的知识图表。然而,这些知识图是非常不完整的。为了解决这一问题,我们提出了一种基于网络的非结构化和结构化信息多模态融合的问答系统,以填补知识库中缺失的信息。为了利用 Web 中的非结构化信息进行知识图的构造,我们设计了多模态特征和问题模板来提取缺失的事实,这样可以在很少的问题下获得很好的质量。问答系统还利用知识库中的结构化信息,如实体类型和实体间的相关性,以提高抽取质量。为了提高系统的效率,我们采用了一些基于查询驱动的网络问答技术,以减少运行时间,并提供快速响应用户的查询。通过大量的实验验证了该系统的有效性和高效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Query-Driven+Knowledge+Graph+Construction+using+Question+Answering+and+Multimodal+Fusion)|0| -|[Decoding Prompt Syntax: Analysing its Impact on Knowledge Retrieval in Large Language Models](https://doi.org/10.1145/3543873.3587655)|Stephan Linzbach, Tim Tressel, Laura Kallmeyer, Stefan Dietze, Hajira Jabeen|GESIS Leibniz Institute for Social Sciences, Germany and Heinrich Heine University, Germany; GESIS Leibniz Institut für Sozialwissenschaften, Germany; GESIS Leibniz Institute for Social Sciences, Germany; Heinrich Heine University, Germany|Large Language Models (LLMs), with their advanced architectures and training on massive language datasets, contain unexplored knowledge. One method to infer this knowledge is through the use of cloze-style prompts. Typically, these prompts are manually designed because the phrasing of these prompts impacts the knowledge retrieval performance, even if the LLM encodes the desired information. In this paper, we study the impact of prompt syntax on the knowledge retrieval capacity of LLMs. We use a template-based approach to paraphrase simple prompts into prompts with a more complex grammatical structure. We then analyse the LLM performance for these structurally different but semantically equivalent prompts. Our study reveals that simple prompts work better than complex forms of sentences. The performance across the syntactical variations for simple relations (1:1) remains best, with a marginal decrease across different typologies. These results reinforce that simple prompt structures are more effective for knowledge retrieval in LLMs and motivate future research into the impact of prompt syntax on various tasks.|大型语言模型(LLM)具有先进的体系结构和对大量语言数据集的训练,包含了未开发的知识。一种推断这种知识的方法是通过使用完形填空式的提示。通常,这些提示是手动设计的,因为这些提示的措辞会影响知识检索性能,即使 LLM 对所需的信息进行了编码。本文研究了提示语法对 LLM 知识检索能力的影响。我们使用基于模板的方法将简单的提示转述为具有更复杂语法结构的提示。然后,我们分析这些结构不同但语义相等的提示符的 LLM 性能。我们的研究表明,简单的提示语比复杂的句子形式更有效。简单关系(1:1)的句法变化的表现仍然是最好的,不同类型之间的表现略有下降。这些结果强调了简单的提示结构对于 LLM 中的知识检索更有效,并且激发了对提示句法对各种任务的影响的进一步研究。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Decoding+Prompt+Syntax:+Analysing+its+Impact+on+Knowledge+Retrieval+in+Large+Language+Models)|0| +|[Decoding Prompt Syntax: Analysing its Impact on Knowledge Retrieval in Large Language Models](https://doi.org/10.1145/3543873.3587655)|Stephan Linzbach, Tim Tressel, Laura Kallmeyer, Stefan Dietze, Hajira Jabeen|Heinrich Heine University, Germany; GESIS Leibniz Institute for Social Sciences, Germany and Heinrich Heine University, Germany; GESIS Leibniz Institut für Sozialwissenschaften, Germany; GESIS Leibniz Institute for Social Sciences, Germany|Large Language Models (LLMs), with their advanced architectures and training on massive language datasets, contain unexplored knowledge. 
One method to infer this knowledge is through the use of cloze-style prompts. Typically, these prompts are manually designed because the phrasing of these prompts impacts the knowledge retrieval performance, even if the LLM encodes the desired information. In this paper, we study the impact of prompt syntax on the knowledge retrieval capacity of LLMs. We use a template-based approach to paraphrase simple prompts into prompts with a more complex grammatical structure. We then analyse the LLM performance for these structurally different but semantically equivalent prompts. Our study reveals that simple prompts work better than complex forms of sentences. The performance across the syntactical variations for simple relations (1:1) remains best, with a marginal decrease across different typologies. These results reinforce that simple prompt structures are more effective for knowledge retrieval in LLMs and motivate future research into the impact of prompt syntax on various tasks.|大型语言模型(LLM)具有先进的体系结构和对大量语言数据集的训练,包含了未开发的知识。一种推断这种知识的方法是通过使用完形填空式的提示。通常,这些提示是手动设计的,因为这些提示的措辞会影响知识检索性能,即使 LLM 对所需的信息进行了编码。本文研究了提示语法对 LLM 知识检索能力的影响。我们使用基于模板的方法将简单的提示转述为具有更复杂语法结构的提示。然后,我们分析这些结构不同但语义相等的提示符的 LLM 性能。我们的研究表明,简单的提示语比复杂的句子形式更有效。简单关系(1:1)的句法变化的表现仍然是最好的,不同类型之间的表现略有下降。这些结果强调了简单的提示结构对于 LLM 中的知识检索更有效,并且激发了对提示句法对各种任务的影响的进一步研究。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Decoding+Prompt+Syntax:+Analysing+its+Impact+on+Knowledge+Retrieval+in+Large+Language+Models)|0| |[CS-TGN: Community Search via Temporal Graph Neural Networks](https://doi.org/10.1145/3543873.3587654)|Farnoosh Hashemi, Ali Behrouz, Milad Rezaei Hajidehi|University of British Columbia, Canada|Searching for local communities is an important research challenge that allows for personalized community discovery and supports advanced data analysis in various complex networks, such as the World Wide Web, social networks, and brain networks. The evolution of these networks over time has motivated several recent studies to identify local communities in temporal networks. Given any query nodes, Community Search aims to find a densely connected subgraph containing query nodes. However, existing community search approaches in temporal networks have two main limitations: (1) they adopt pre-defined subgraph patterns to model communities, which cannot find communities that do not conform to these patterns in real-world networks, and (2) they only use the aggregation of disjoint structural information to measure quality, missing the dynamics of connections and temporal properties. In this paper, we propose a query-driven Temporal Graph Convolutional Network (CS-TGN) that can capture flexible community structures by learning from the ground-truth communities in a data-driven manner. CS-TGN first combines the local query-dependent structure and the global graph embedding in each snapshot of the network and then uses a GRU cell with contextual attention to learn the dynamics of interactions and update node embeddings over time. We demonstrate how this model can be used for interactive community search in an online setting, allowing users to evaluate the found communities and provide feedback.
Experiments on real-world temporal graphs with ground-truth communities validate the superior quality of the solutions obtained and the efficiency of our model in both temporal and interactive static settings.|搜索当地社区是一个重要的研究挑战,它允许个性化的社区发现,并支持各种复杂网络中的先进数据分析,如万维网、社交网络和大脑网络。随着时间的推移,这些网络的演变促使最近几项研究在时间网络中识别局部群落。给定任何查询节点,Community Search 的目标是找到一个包含查询节点的密集连接子图。然而,现有的时态网络中的社区搜索方法存在两个主要的局限性: (1)它们采用预定义的子图模式来模拟社区,在现实网络中不能找到不符合这些模式的社区; (2)它们只使用不相交的结构信息的聚合来度量质量,缺乏连接的动态性和时态性。本文提出了一种基于查询驱动的时态图卷积网络(CS-TGN) ,该网络通过以数据驱动的方式从真实标注的社区中学习,可以捕获灵活的社区结构。CS-TGN 首先将本地查询依赖结构和全局图嵌入到网络的每个快照中,然后利用一个具有上下文关注的 GRU 单元来学习交互的动态性,并随着时间的推移更新节点嵌入。我们展示了这个模型如何在在线环境中用于交互式社区搜索,允许用户评估找到的社区并提供反馈。在带有真实社区标注的时态图上的实验验证了所得解的优越质量,以及我们的模型在时态和交互式静态设置下的效率。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=CS-TGN:+Community+Search+via+Temporal+Graph+Neural+Networks)|0| -|[Learned Temporal Aggregations for Fraud Classification on E-Commerce Platforms](https://doi.org/10.1145/3543873.3587632)|Xiao Ling, David Yan, Bilal Alsallakh, Ashutosh Pandey, Manan Bakshi, Pamela Bhattacharya|Meta, USA; North Carolina State University, USA; Voxel AI, USA; Meta, Canada|Fraud and other types of adversarial behavior are serious problems on customer-to-customer (C2C) e-commerce platforms, where harmful behaviors by bad actors erode user trust and safety. Many modern e-commerce integrity systems utilize machine learning (ML) to detect fraud and bad actors. We discuss the practical problems faced by integrity systems which utilize data associated with user interactions with the platform. Specifically, we focus on the challenge of representing the user interaction events, and aggregating their features. We compare the performance of two paradigms to handle the feature temporality when training the ML models: hand-engineered temporal aggregation and a learned aggregation using a sequence encoder. We show that a model which learns a time-aggregation using a sequence encoder outperforms models trained on handcrafted aggregations on the fraud classification task with a real-world dataset.
We show that a model which learns a time-aggregation using a sequence encoder outperforms models trained on handcrafted aggregations on the fraud classification task with a real-world dataset.|欺诈和其他类型的对抗行为是 C2C 电子商务平台上的严重问题,不良行为者的有害行为侵蚀了用户的信任和安全。许多现代电子商务完整性系统利用机器学习(ML)来检测欺诈和不良行为者。我们讨论完整性系统所面临的实际问题,这些系统利用与平台的用户交互相关的数据。具体来说,我们关注的挑战是表示用户交互事件,并聚合它们的特性。在训练机器学习模型时,我们比较了两种模式处理特征时间的性能: 手工时间聚合和使用序列编码器的学习聚合。我们展示了一个使用序列编码器学习时间聚合的模型优于使用真实世界数据集进行欺诈分类任务的手工聚合训练的模型。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learned+Temporal+Aggregations+for+Fraud+Classification+on+E-Commerce+Platforms)|0| |[Decency and Decentralisation: Verifiable Decentralised Knowledge Graph Querying](https://doi.org/10.1145/3543873.3587635)|Aisling Third, John Domingue|Knowledge Media Institute, The Open University, United Kingdom|Increasing interest in decentralisation for data and processing on the Web brings with it the need to re-examine methods for verifying data and behaviour for scalable multi-party interactions. We consider factors relevant to verification of querying activity on knowledge graphs in a Trusted Decentralised Web, and set out ideas for future research in this area.|随着人们对数据地方分权和网络处理的兴趣日益增长,人们需要重新审视可扩展的多方交互的数据和行为验证方法。我们考虑了与可信分布式网络中知识图表查询活动验证相关的因素,并为这一领域的未来研究提出了一些想法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Decency+and+Decentralisation:+Verifiable+Decentralised+Knowledge+Graph+Querying)|0| -|[Towards a Decentralized Data Hub and Query System for Federated Dynamic Data Spaces](https://doi.org/10.1145/3543873.3587646)|Danh Le Phuoc, Sonja Schimmler, Anh LeTuan, Uwe A. Kuehn, Manfred Hauswirth|TU Berlin, Germany; Fraunhofer Institute for Open Communication Systems, Berlin, Germany|This position paper proposes a hybrid architecture for secure and efficient data sharing and processing across dynamic data spaces. On the one hand, current centralized approaches are plagued by issues such as lack of privacy and control for users, high costs, and bad performance, making these approaches unsuitable for the decentralized data spaces prevalent in Europe and various industries (decentralized on the conceptual and physical levels while centralized in the underlying implementation). On the other hand, decentralized systems face challenges with limited knowledge of/control over the global system, fair resource utilization, and data provenance. Our proposed Semantic Data Ledger (SDL) approach combines the advantages of both architectures to overcome their limitations. SDL allows users to choose the best combination of centralized and decentralized features, providing a decentralized infrastructure for the publication of structured data with machine-readable semantics. It supports expressive structured queries, secure data sharing, and payment mechanisms based on an underlying autonomous ledger, enabling the implementation of economic models and fair-use strategies.|本文提出了一种跨动态数据空间的安全有效的数据共享和处理的混合体系结构。一方面,当前的集中式方法受到诸如用户缺乏隐私和控制、高成本和性能差等问题的困扰,使得这些方法不适合在欧洲和各种行业盛行的分散式数据空间(在概念和物理层面上分散,而在底层实现中集中)。另一方面,分散系统面临的挑战是对全球系统的了解和控制有限,资源利用不公平,数据来源不明确。我们提出的语义数据分类账(SDL)方法结合了两种体系结构的优点,克服了它们的局限性。SDL 允许用户选择集中和分散特性的最佳组合,为具有机器可读语义的结构化数据的发布提供分散的基础设施。它支持表达式结构化查询、安全数据共享和基于底层自治分类账的支付机制,使经济模型和合理使用策略的实施成为可能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Towards+a+Decentralized+Data+Hub+and+Query+System+for+Federated+Dynamic+Data+Spaces)|0| +|[Towards a Decentralized Data Hub and Query System for Federated Dynamic Data Spaces](https://doi.org/10.1145/3543873.3587646)|Danh Le Phuoc, Sonja Schimmler, Anh LeTuan, Uwe A. 
Kuehn, Manfred Hauswirth|Fraunhofer Institute for Open Communication Systems, Berlin, Germany; TU Berlin, Germany|This position paper proposes a hybrid architecture for secure and efficient data sharing and processing across dynamic data spaces. On the one hand, current centralized approaches are plagued by issues such as lack of privacy and control for users, high costs, and bad performance, making these approaches unsuitable for the decentralized data spaces prevalent in Europe and various industries (decentralized on the conceptual and physical levels while centralized in the underlying implementation). On the other hand, decentralized systems face challenges with limited knowledge of/control over the global system, fair resource utilization, and data provenance. Our proposed Semantic Data Ledger (SDL) approach combines the advantages of both architectures to overcome their limitations. SDL allows users to choose the best combination of centralized and decentralized features, providing a decentralized infrastructure for the publication of structured data with machine-readable semantics. It supports expressive structured queries, secure data sharing, and payment mechanisms based on an underlying autonomous ledger, enabling the implementation of economic models and fair-use strategies.|本文提出了一种跨动态数据空间的安全有效的数据共享和处理的混合体系结构。一方面,当前的集中式方法受到诸如用户缺乏隐私和控制、高成本和性能差等问题的困扰,使得这些方法不适合在欧洲和各种行业盛行的分散式数据空间(在概念和物理层面上分散,而在底层实现中集中)。另一方面,分散系统面临的挑战是对全球系统的了解和控制有限,资源利用不公平,数据来源不明确。我们提出的语义数据分类账(SDL)方法结合了两种体系结构的优点,克服了它们的局限性。SDL 允许用户选择集中和分散特性的最佳组合,为具有机器可读语义的结构化数据的发布提供分散的基础设施。它支持表达式结构化查询、安全数据共享和基于底层自治分类账的支付机制,使经济模型和合理使用策略的实施成为可能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Towards+a+Decentralized+Data+Hub+and+Query+System+for+Federated+Dynamic+Data+Spaces)|0| |[What are "personal data spaces"?](https://doi.org/10.1145/3543873.3587656)|Viivi Lähteenoja|University of Helsinki, Finland and Aalto University, Finland|While the concept of “data spaces” is no longer new, its specific application to individuals and personal data management is still undeveloped. This short paper presents a vision for “personal data spaces” in the shape of a work-in-progress description of them and some of the conceptual and implementation features envisioned. It is offered for discussion, debate, and improvement by professionals, policymakers, and researchers operating in the intersection of data spaces and personal data management.|虽然“数据空间”的概念不再是新的,但它在个人和个人数据管理方面的具体应用仍然没有得到发展。这篇简短的论文提出了一个“个人数据空间”的愿景,其形式是一个在建的数据空间描述,以及所设想的一些概念和实现特征。它提供了讨论,辩论和改进的专业人士,决策者和研究人员在数据空间和个人数据管理的交叉运作。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=What+are+"personal+data+spaces"?)|0| -|[TAPP: Defining standard provenance information for clinical research data and workflows - Obstacles and opportunities](https://doi.org/10.1145/3543873.3587562)|Kerstin Gierend, Judith A. H. 
Wodke, Sascha Genehr, Robert Gött, Ron Henkel, Frank Krüger, Markus Mandalka, Lea Michaelis, Alexander Scheuerlein, Max Schröder, Atinkut Zeleke, Dagmar Waltemath|Medical Informatics Laboratory, MeDaX Group, University Medicine Greifswald, Germany; Faculty of Engineering, Wismar University of Applied Sciences, Germany; Core Unit Data Integration Center, University Medicine Greifswald, Germany; Medical Informatics Laboratory, University Medicine Greifswald, Germany; Institute of Communications Engineering, University of Rostock, Germany; Department of Biomedical Informatics, Center for Preventive Medicine and Digital Health, Medical Faculty Mannheim, Heidelberg University, Germany; Institute for Data Science, University of Greifswald, Germany; Rostock University Library, University of Rostock, Germany|Data provenance has raised much attention across disciplines lately, as it has been shown that enrichment of data with provenance information leads to better credibility, renders data more FAIR fostering data reuse. Also, the biomedical domain has recognised the potential of provenance capture. However, several obstacles prevent efficient, automated, and machine-interpretable enrichment of biomedical data with provenance information, such as data heterogeneity, complexity, and sensitivity. Here, we explain how in Germany clinical data are transferred from hospital information systems into a data integration centre to enable secondary use of patient data and how it can be reused as research data. Considering the complex data infrastructures in hospitals, we indicate obstacles and opportunities when collecting provenance information along heterogeneous data processing pipelines. To express provenance data, we indicate the usage of the Fast Healthcare Interoperability Resource (FHIR) provenance resource for healthcare data. In addition, we consider already existing approaches from other research fields and standard communities. As a solution towards high-quality standardised clinical research data, we propose to develop a ’MInimal Requirements for Automated Provenance Information Enrichment’ (MIRAPIE) guideline. As a community project, MIRAPIE should generalise provenance information concepts to allow its world-wide applicability, possibly beyond the health care sector.|数据来源最近引起了跨学科的广泛关注,因为已经表明,用来源信息丰富数据可以提高可信度,使数据更加公平,促进数据重用。此外,生物医学领域已经认识到种源捕获的潜力。然而,一些障碍阻碍了生物医学数据的有效、自动化和机器可解释的来源信息的丰富,例如数据异构性、复杂性和敏感性。在这里,我们解释在德国如何将临床数据从医院信息系统转移到数据集成中心,以便能够对患者数据进行二次使用,以及如何将其重用为研究数据。考虑到医院复杂的数据基础设施,我们指出了沿着异构数据处理管道收集起源信息的障碍和机会。为了表示来源数据,我们指出了 Fast Healthcare Interoperability Resource (FHIR)来源资源对医疗数据的使用。此外,我们还考虑了来自其他研究领域和标准社区的已有方法。作为高质量标准化临床研究数据的解决方案,我们建议制定一个“自动起源信息丰富的最低要求”(MIRAPIE)指南。作为一个社区项目,MIRAPIE 应该推广起源信息的概念,使其在世界范围内适用,可能超出卫生保健部门。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=TAPP:+Defining+standard+provenance+information+for+clinical+research+data+and+workflows+-+Obstacles+and+opportunities)|0| +|[TAPP: Defining standard provenance information for clinical research data and workflows - Obstacles and opportunities](https://doi.org/10.1145/3543873.3587562)|Kerstin Gierend, Judith A. H. 
Wodke, Sascha Genehr, Robert Gött, Ron Henkel, Frank Krüger, Markus Mandalka, Lea Michaelis, Alexander Scheuerlein, Max Schröder, Atinkut Zeleke, Dagmar Waltemath|Medical Informatics Laboratory, University Medicine Greifswald, Germany; Medical Informatics Laboratory, MeDaX Group, University Medicine Greifswald, Germany; Department of Biomedical Informatics, Center for Preventive Medicine and Digital Health, Medical Faculty Mannheim, Heidelberg University, Germany; Institute of Communications Engineering, University of Rostock, Germany; Core Unit Data Integration Center, University Medicine Greifswald, Germany; Rostock University Library, University of Rostock, Germany; Institute for Data Science, University of Greifswald, Germany; Faculty of Engineering, Wismar University of Applied Sciences, Germany|Data provenance has raised much attention across disciplines lately, as it has been shown that enrichment of data with provenance information leads to better credibility and renders data more FAIR, fostering data reuse. Also, the biomedical domain has recognised the potential of provenance capture. However, several obstacles prevent efficient, automated, and machine-interpretable enrichment of biomedical data with provenance information, such as data heterogeneity, complexity, and sensitivity. Here, we explain how in Germany clinical data are transferred from hospital information systems into a data integration centre to enable secondary use of patient data and how it can be reused as research data. Considering the complex data infrastructures in hospitals, we indicate obstacles and opportunities when collecting provenance information along heterogeneous data processing pipelines. To express provenance data, we indicate the usage of the Fast Healthcare Interoperability Resource (FHIR) provenance resource for healthcare data. In addition, we consider already existing approaches from other research fields and standard communities. As a solution towards high-quality standardised clinical research data, we propose to develop a ’MInimal Requirements for Automated Provenance Information Enrichment’ (MIRAPIE) guideline. As a community project, MIRAPIE should generalise provenance information concepts to allow its world-wide applicability, possibly beyond the health care sector.|数据来源最近引起了跨学科的广泛关注,因为已经表明,用来源信息丰富数据可以提高可信度,使数据更加符合 FAIR 原则,促进数据重用。此外,生物医学领域已经认识到来源信息捕获的潜力。然而,一些障碍阻碍了生物医学数据的有效、自动化和机器可解释的来源信息的丰富,例如数据异构性、复杂性和敏感性。在这里,我们解释在德国如何将临床数据从医院信息系统转移到数据集成中心,以便能够对患者数据进行二次使用,以及如何将其重用为研究数据。考虑到医院复杂的数据基础设施,我们指出了沿着异构数据处理管道收集起源信息的障碍和机会。为了表示来源数据,我们指出了 Fast Healthcare Interoperability Resource (FHIR)来源资源对医疗数据的使用。此外,我们还考虑了来自其他研究领域和标准社区的已有方法。作为高质量标准化临床研究数据的解决方案,我们建议制定一个“自动起源信息丰富的最低要求”(MIRAPIE)指南。作为一个社区项目,MIRAPIE 应该推广起源信息的概念,使其在世界范围内适用,可能超出卫生保健部门。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=TAPP:+Defining+standard+provenance+information+for+clinical+research+data+and+workflows+-+Obstacles+and+opportunities)|0| |[ProSA: A provenance system for reproducing query results](https://doi.org/10.1145/3543873.3587563)|Tanja Auge|Faculty of Informatics and Data Science, University of Regensburg, Germany|Good scientific work requires comprehensible, transparent and reproducible research. One way to ensure this is to include all data relevant to a study or evaluation when publishing an article. This data should be at least aggregated or anonymized, at best compact and complete, but always resilient. In this paper we present ProSA, a system for calculating the minimal necessary data set, called a sub-database.
For this, we combine the Chase — a set of algorithms for transforming databases — with additional provenance information. We display the implementation of provenance guided by the ProSA pipeline and show its use to generate an optimized sub-database. Further, we demonstrate what the ProSA GUI looks like and present some applications and extensions.|好的科学工作需要可理解、透明和可重复的研究。确保这一点的一个方法是在发表文章时包括与研究或评估相关的所有数据。这些数据应该至少是聚合或匿名的,充其量是紧凑和完整的,但总是具有弹性。本文介绍了 ProSA,一个计算最小必要数据集的系统,称为子数据库。为此,我们将 Chase (一组转换数据库的算法)与其他来源信息结合起来。我们展示了 ProSA 流水线引导的起源实现,并展示了它用于生成优化的子数据库。进一步,我们将演示 ProSA GUI 的外观,并展示一些应用程序和扩展。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ProSA:+A+provenance+system+for+reproducing+query+results)|0| -|[Hybrid Query and Instance Explanations and Repairs](https://doi.org/10.1145/3543873.3587565)|Seokki Lee, Boris Glavic, Adriane Chapman, Bertram Ludäscher|University of Cincinnati, USA; Illinois Institute of Technology, USA; University of Southampton, United Kingdom; University of Illinois at Urbana-Champaign, USA|Prior work on explaining missing (unexpected) query results identifies which parts of the query or data are responsible for the erroneous result or repairs the query or data to fix such errors. The problem of generating repairs is typically expressed as an optimization problem, i.e., a single repair is returned that is optimal wrt. to some criterion such as minimizing the repair’s side effects. However, such an optimization objective may not concretely model a user’s (often hard to formalize) notion of which repair is “correct”. In this paper, we motivate hybrid explanations and repairs, i.e., that fix both the query and the data. Instead of returning one “optimal” repair, we argue for an approach that empowers the user to explore the space of possible repairs effectively. We also present a proof-of-concept implementation and outline open research problems.|先前解释丢失(意外)查询结果的工作确定了查询或数据的哪些部分对错误结果负责,或者修复查询或数据以修复此类错误。产生维修的问题通常表示为一个最佳化问题,也就是说,一个单一的维修被返回,这是最优的书面意见,以某些标准,如最小化维修的副作用。然而,这样的优化目标可能无法具体地模拟用户(通常很难形式化)的哪种修复是“正确的”概念。在本文中,我们激励混合解释和修复,即,修复查询和数据。与返回一个“最佳”修复相反,我们主张采用一种方法,使用户能够有效地探索可能的修复空间。我们还提出了一个概念验证实现,并概述了开放式研究问题。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Hybrid+Query+and+Instance+Explanations+and+Repairs)|0| -|[Querying Container Provenance](https://doi.org/10.1145/3543873.3587568)|Aniket Modi, Moaz Reyad, Tanu Malik, Ashish Gehani|SRI International, USA; College of Computing and Digital Media, DePaul University, USA; College of Computing and Digital Media, DePaul University, USA and Department of Computer Science and Engineering, IIT Delhi, India|Containers are lightweight mechanisms for the isolation of operating system resources. They are realized by activating a set of namespaces. Given the use of containers in scientific computing, tracking and managing provenance within and across containers is becoming essential for debugging and reproducibility. In this work, we examine the properties of container provenance graphs that result from auditing containerized applications. We observe that the generated container provenance graphs are hypergraphs because one resource may belong to one or more namespaces. We examine the hierarchical behavior of PID, mount, and user namespaces, that are more commonly activated and show that even when represented as hypergraphs, the resulting container provenance graphs are acyclic.
We experiment with recently published container logs and identify hypergraph properties.|容器是用于隔离操作系统资源的轻量级机制。它们是通过激活一组名称空间来实现的。鉴于容器在科学计算中的使用,跟踪和管理容器内部和跨容器的出处对于调试和再现性变得至关重要。在本文中,我们研究了审计容器化应用程序所产生的容器起源图的属性。我们注意到,生成的容器起源图是超图,因为一个资源可能属于一个或多个名称空间。我们研究了 PID、 mount 和用户名称空间的分层行为,这些名称空间通常被激活,并且表明即使用超图表示,最终的容器起源图也是无环的。我们使用最近发布的容器日志进行实验,并识别超图属性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Querying+Container+Provenance)|0| -|[Graph-less Collaborative Filtering](https://doi.org/10.1145/3543507.3583196)|Lianghao Xia, Chao Huang, Jiao Shi, Yong Xu|The University of Hong Kong, Hong Kong; South China University of Technology, China|Graph neural networks (GNNs) have shown the power in representation learning over graph-structured user-item interaction data for collaborative filtering (CF) task. However, with their inherently recursive message propagation among neighboring nodes, existing GNN-based CF models may generate indistinguishable and inaccurate user (item) representations due to the over-smoothing and noise effect with low-pass Laplacian smoothing operators. In addition, the recursive information propagation with the stacked aggregators in the entire graph structures may result in poor scalability in practical applications. Motivated by these limitations, we propose a simple and effective collaborative filtering model (SimRec) that marries the power of knowledge distillation and contrastive learning. In SimRec, adaptive transferring knowledge is enabled between the teacher GNN model and a lightweight student network, to not only preserve the global collaborative signals, but also address the over-smoothing issue with representation recalibration. Empirical results on public datasets show that SimRec archives better efficiency while maintaining superior recommendation performance compared with various strong baselines. Our implementations are publicly available at: https://github.com/HKUDS/SimRec.|图形神经网络(GNN)已经显示了在表示学习方面的能力,超过了图形结构的用户-项目交互数据的协同过滤(CF)任务。然而,由于现有的基于 GNN 的 CF 模型固有的相邻节点之间的递归消息传播特性,由于低通拉普拉斯平滑算子的过平滑和噪声效应,可能产生难以区分和不准确的用户(项)表示。此外,在整个图结构中,叠加聚合器的递归信息传播可能导致实际应用中的可扩展性较差。基于这些局限性,我们提出了一个简单有效的协同过滤模型(SimRec) ,它将知识提取和对比学习的力量结合在一起。在 SimRec 中,在教师 GNN 模型和轻量级学生网络之间实现了自适应知识传递,不仅保留了全局协作信号,而且通过表示重校正解决了过于平滑的问题。对公共数据集的实证结果表明,与各种强基线相比,SimRec 在保持优异推荐性能的同时提高了存档效率。我们的实施方案可以在以下 https://github.com/hkuds/simrec 公开获得:。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Graph-less+Collaborative+Filtering)|0| +|[Hybrid Query and Instance Explanations and Repairs](https://doi.org/10.1145/3543873.3587565)|Seokki Lee, Boris Glavic, Adriane Chapman, Bertram Ludäscher|University of Illinois at Urbana-Champaign, USA; University of Southampton, United Kingdom; University of Cincinnati, USA; Illinois Institute of Technology, USA|Prior work on explaining missing (unexpected) query results identifies which parts of the query or data are responsible for the erroneous result or repairs the query or data to fix such errors. The problem of generating repairs is typically expressed as an optimization problem, i.e., a single repair is returned that is optimal w.r.t. some criterion such as minimizing the repair’s side effects. However, such an optimization objective may not concretely model a user’s (often hard to formalize) notion of which repair is “correct”. In this paper, we motivate hybrid explanations and repairs, i.e., ones that fix both the query and the data. Instead of returning one “optimal” repair, we argue for an approach that empowers the user to explore the space of possible repairs effectively.
We also present a proof-of-concept implementation and outline open research problems.|先前解释丢失(意外)查询结果的工作确定了查询或数据的哪些部分对错误结果负责,或者修复查询或数据以修复此类错误。产生维修的问题通常表示为一个最佳化问题,也就是说,一个单一的维修被返回,这是最优的书面意见,以某些标准,如最小化维修的副作用。然而,这样的优化目标可能无法具体地模拟用户(通常很难形式化)的哪种修复是“正确的”概念。在本文中,我们激励混合解释和修复,即,修复查询和数据。与返回一个“最佳”修复相反,我们主张采用一种方法,使用户能够有效地探索可能的修复空间。我们还提出了一个概念验证实现,并概述了开放式研究问题。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Hybrid+Query+and+Instance+Explanations+and+Repairs)|0| +|[Querying Container Provenance](https://doi.org/10.1145/3543873.3587568)|Aniket Modi, Moaz Reyad, Tanu Malik, Ashish Gehani|College of Computing and Digital Media, DePaul University, USA and Department of Computer Science and Engineering, IIT Delhi, India; College of Computing and Digital Media, DePaul University, USA; SRI International, USA|Containers are lightweight mechanisms for the isolation of operating system resources. They are realized by activating a set of namespaces. Given the use of containers in scientific computing, tracking and managing provenance within and across containers is becoming essential for debugging and reproducibility. In this work, we examine the properties of container provenance graphs that result from auditing containerized applications. We observe that the generated container provenance graphs are hypergraphs because one resource may belong to one or more namespaces. We examine the hierarchical behavior of PID, mount, and user namespaces, that are more commonly activated and show that even when represented as hypergraphs, the resulting container provenance graphs are acyclic. We experiment with recently published container logs and identify hypergraph properties.|容器是用于隔离操作系统资源的轻量级机制。它们是通过激活一组名称空间来实现的。鉴于容器在科学计算中的使用,跟踪和管理容器内部和跨容器的出处对于调试和再现性变得至关重要。在本文中,我们研究了审计容器化应用程序所产生的容器起源图的属性。我们注意到,生成的容器起源图是超图,因为一个资源可能属于一个或多个名称空间。我们研究了 PID、 mount 和用户名称空间的分层行为,这些名称空间通常被激活,并且表明即使用超图表示,最终的容器起源图也是无环的。我们使用最近发布的容器日志进行实验,并识别超图属性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Querying+Container+Provenance)|0| +|[Graph-less Collaborative Filtering](https://doi.org/10.1145/3543507.3583196)|Lianghao Xia, Chao Huang, Jiao Shi, Yong Xu|South China University of Technology, China; The University of Hong Kong, Hong Kong|Graph neural networks (GNNs) have shown the power in representation learning over graph-structured user-item interaction data for the collaborative filtering (CF) task. However, with their inherently recursive message propagation among neighboring nodes, existing GNN-based CF models may generate indistinguishable and inaccurate user (item) representations due to the over-smoothing and noise effect with low-pass Laplacian smoothing operators. In addition, the recursive information propagation with the stacked aggregators in the entire graph structures may result in poor scalability in practical applications. Motivated by these limitations, we propose a simple and effective collaborative filtering model (SimRec) that marries the power of knowledge distillation and contrastive learning. In SimRec, adaptive knowledge transfer is enabled between the teacher GNN model and a lightweight student network, to not only preserve the global collaborative signals, but also address the over-smoothing issue with representation recalibration. Empirical results on public datasets show that SimRec achieves better efficiency while maintaining superior recommendation performance compared with various strong baselines.
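The teacher-student setup described in the SimRec entry above can be illustrated with a minimal sketch combining prediction-level distillation and an InfoNCE-style embedding alignment; the dimensions, loss weights, and the random stand-ins for the teacher's embeddings are illustrative assumptions, not SimRec's actual design:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
n_users, n_items, dim, tau = 64, 128, 16, 0.2

# Frozen embeddings from a pre-trained GNN teacher (random stand-ins here).
t_user, t_item = torch.randn(n_users, dim), torch.randn(n_items, dim)

# Graph-less student: free embedding tables trained without message passing.
s_user = torch.nn.Parameter(torch.randn(n_users, dim) * 0.1)
s_item = torch.nn.Parameter(torch.randn(n_items, dim) * 0.1)
opt = torch.optim.Adam([s_user, s_item], lr=1e-2)

users = torch.arange(n_users)
for step in range(200):
    # (1) Prediction-level distillation: match the teacher's user-item scores.
    kd = F.mse_loss(s_user @ s_item.T, t_user @ t_item.T)
    # (2) Embedding-level contrastive alignment (InfoNCE): a student user
    # embedding should be closest to its own teacher counterpart.
    logits = F.normalize(s_user, dim=1) @ F.normalize(t_user, dim=1).T / tau
    nce = F.cross_entropy(logits, users)
    loss = kd + 0.5 * nce
    opt.zero_grad(); loss.backward(); opt.step()
print(f"final loss: {loss.item():.4f}")
```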
Our implementations are publicly available at: https://github.com/HKUDS/SimRec.|图形神经网络(GNN)已经显示了在表示学习方面的能力,超过了图形结构的用户-项目交互数据的协同过滤(CF)任务。然而,由于现有的基于 GNN 的 CF 模型固有的相邻节点之间的递归消息传播特性,由于低通拉普拉斯平滑算子的过平滑和噪声效应,可能产生难以区分和不准确的用户(项)表示。此外,在整个图结构中,叠加聚合器的递归信息传播可能导致实际应用中的可扩展性较差。基于这些局限性,我们提出了一个简单有效的协同过滤模型(SimRec) ,它将知识提取和对比学习的力量结合在一起。在 SimRec 中,在教师 GNN 模型和轻量级学生网络之间实现了自适应知识传递,不仅保留了全局协作信号,而且通过表示重校正解决了过于平滑的问题。对公共数据集的实证结果表明,与各种强基线相比,SimRec 在保持优异推荐性能的同时提高了存档效率。我们的实施方案可以在以下 https://github.com/hkuds/simrec 公开获得:。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Graph-less+Collaborative+Filtering)|0| |[Collaboration-Aware Graph Convolutional Network for Recommender Systems](https://doi.org/10.1145/3543507.3583229)|Yu Wang, Yuying Zhao, Yi Zhang, Tyler Derr|Vanderbilt University, USA|Graph Neural Networks (GNNs) have been successfully adopted in recommender systems by virtue of the message-passing that implicitly captures collaborative effect. Nevertheless, most of the existing message-passing mechanisms for recommendation are directly inherited from GNNs without scrutinizing whether the captured collaborative effect would benefit the prediction of user preferences. In this paper, we first analyze how message-passing captures the collaborative effect and propose a recommendation-oriented topological metric, Common Interacted Ratio (CIR), which measures the level of interaction between a specific neighbor of a node with the rest of its neighbors. After demonstrating the benefits of leveraging collaborations from neighbors with higher CIR, we propose a recommendation-tailored GNN, Collaboration-Aware Graph Convolutional Network (CAGCN), that goes beyond the 1-Weisfeiler-Lehman (1-WL) test in distinguishing non-bipartite-subgraph-isomorphic graphs. Experiments on six benchmark datasets show that the best CAGCN variant outperforms the most representative GNN-based recommendation model, LightGCN, by nearly 10% in Recall@20 and also achieves around 80% speedup. Our code is publicly available at https://github.com/YuWVandy/CAGCN.|图形神经网络(GNN)通过隐式捕捉协作效果的消息传递,已成功地应用于推荐系统中。尽管如此,大多数现有的推荐信息传递机制都是直接从 GNN 继承而来的,没有仔细检查所捕获的协作效应是否有利于预测用户的偏好。在本文中,我们首先分析了消息传递是如何捕获协作效果的,并提出了一个面向推荐的拓扑度量,公共交互比(CIR) ,它测量节点的特定邻居与其他邻居之间的交互水平。在证明了利用具有较高 CIR 的邻居的协作的好处之后,我们提出了一种推荐量身定制的 GNN,协作感知图卷积网络(CAGCN) ,其超越了1-Weisfeiler-Lehman (1-WL)检验在区分非二部子图-同构图。在六个基准数据集上的实验表明,在 Recall@20中,最好的 CAGCN 变体比最具代表性的基于 GNN 的推荐模型 LightGCN 的性能提高了近10% ,并且还实现了约80% 的加速比。我们的代码可以在 https://github.com/yuwvandy/cagcn 上公开获取。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Collaboration-Aware+Graph+Convolutional+Network+for+Recommender+Systems)|0| -|[HyConvE: A Novel Embedding Model for Knowledge Hypergraph Link Prediction with Convolutional Neural Networks](https://doi.org/10.1145/3543507.3583256)|Chenxu Wang, Xin Wang, Zhao Li, Zirui Chen, Jianxin Li|Deakin University, Australia; Tianjin University, China|Knowledge hypergraph embedding, which projects entities and n-ary relations into a low-dimensional continuous vector space to predict missing links, remains a challenging area to be explored despite the ubiquity of n-ary relational facts in the real world. Currently, knowledge hypergraph link prediction methods are essentially simple extensions of those used in knowledge graphs, where n-ary relational facts are decomposed into different subelements. Convolutional neural networks have been shown to have remarkable information extraction capabilities in previous work on knowledge graph link prediction.
In this paper, we propose a novel embedding-based knowledge hypergraph link prediction model named HyConvE, which exploits the powerful learning ability of convolutional neural networks for effective link prediction. Specifically, we employ 3D convolution to capture the deep interactions of entities and relations to efficiently extract explicit and implicit knowledge in each n-ary relational fact without compromising its translation property. In addition, appropriate relation and position-aware filters are utilized sequentially to perform two-dimensional convolution operations to capture the intrinsic patterns and position information in each n-ary relation, respectively. Extensive experimental results on real datasets of knowledge hypergraphs and knowledge graphs demonstrate the superior performance of HyConvE compared with state-of-the-art baselines.|知识超图嵌入将实体和 n 元关系投影到低维连续向量空间中以预测缺失的链接,尽管 n 元关系事实在现实世界中无处不在,但仍然是一个有待探索的挑战领域。目前,知识超图链接预测方法基本上是知识图的简单扩展,其中 n 元关系事实被分解为不同的子元素。卷积神经网络已被证明具有显著的信息抽取能力在以前的工作中的知识图链接预测。本文提出了一种新的基于嵌入的知识超图链接预测模型 HyConve,该模型利用卷积神经网络强大的学习能力进行有效的链接预测。具体来说,我们使用三维卷积来捕捉实体和关系的深层交互作用,以有效地提取每个 n 元关系事实中的显性和隐性知识,而不损害其翻译性质。此外,还利用合适的关系和位置感知滤波器,分别进行二维卷积运算,捕获每个 n 元关系中的内在模式和位置信息。在知识超图和知识图的实际数据集上进行的大量实验结果表明,与最先进的基线相比,HyConve 方法具有更好的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=HyConvE:+A+Novel+Embedding+Model+for+Knowledge+Hypergraph+Link+Prediction+with+Convolutional+Neural+Networks)|0| +|[HyConvE: A Novel Embedding Model for Knowledge Hypergraph Link Prediction with Convolutional Neural Networks](https://doi.org/10.1145/3543507.3583256)|Chenxu Wang, Xin Wang, Zhao Li, Zirui Chen, Jianxin Li|Tianjin University, China; Deakin University, Australia|Knowledge hypergraph embedding, which projects entities and n-ary relations into a low-dimensional continuous vector space to predict missing links, remains a challenging area to be explored despite the ubiquity of n-ary relational facts in the real world. Currently, knowledge hypergraph link prediction methods are essentially simple extensions of those used in knowledge graphs, where n-ary relational facts are decomposed into different subelements. Convolutional neural networks have been shown to have remarkable information extraction capabilities in previous work on knowledge graph link prediction. In this paper, we propose a novel embedding-based knowledge hypergraph link prediction model named HyConvE, which exploits the powerful learning ability of convolutional neural networks for effective link prediction. Specifically, we employ 3D convolution to capture the deep interactions of entities and relations to efficiently extract explicit and implicit knowledge in each n-ary relational fact without compromising its translation property. In addition, appropriate relation and position-aware filters are utilized sequentially to perform two-dimensional convolution operations to capture the intrinsic patterns and position information in each n-ary relation, respectively. 
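As an illustration of the 3D-convolution idea described above, the following sketch stacks a relation embedding and the entity embeddings of an n-ary fact into depth-wise planes and applies torch.nn.Conv3d; all shapes, the kernel size, and the sigmoid scoring head are illustrative assumptions, not the paper's actual architecture:

```python
import torch

# Illustrative shapes only: a 5-ary fact (relation + 5 entities), each with a
# 64-d embedding reshaped into an 8x8 plane, stacked along depth so a 3D
# kernel can mix relation and entity information jointly.
arity, dim, side = 5, 64, 8
rel = torch.randn(1, dim)
entities = torch.randn(arity, dim)

planes = torch.cat([rel, entities]).view(1, 1, arity + 1, side, side)  # N,C,D,H,W
conv3d = torch.nn.Conv3d(in_channels=1, out_channels=16, kernel_size=(2, 3, 3))
features = conv3d(planes)                                 # -> (1, 16, 5, 6, 6)
score = torch.sigmoid(features.flatten(1).sum(-1, keepdim=True))  # plausibility
print(features.shape, score.shape)
```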
Extensive experimental results on real datasets of knowledge hypergraphs and knowledge graphs demonstrate the superior performance of HyConvE compared with state-of-the-art baselines.|知识超图嵌入将实体和 n 元关系投影到低维连续向量空间中以预测缺失的链接,尽管 n 元关系事实在现实世界中无处不在,但仍然是一个有待探索的挑战领域。目前,知识超图链接预测方法基本上是知识图的简单扩展,其中 n 元关系事实被分解为不同的子元素。卷积神经网络已被证明具有显著的信息抽取能力在以前的工作中的知识图链接预测。本文提出了一种新的基于嵌入的知识超图链接预测模型 HyConve,该模型利用卷积神经网络强大的学习能力进行有效的链接预测。具体来说,我们使用三维卷积来捕捉实体和关系的深层交互作用,以有效地提取每个 n 元关系事实中的显性和隐性知识,而不损害其翻译性质。此外,还利用合适的关系和位置感知滤波器,分别进行二维卷积运算,捕获每个 n 元关系中的内在模式和位置信息。在知识超图和知识图的实际数据集上进行的大量实验结果表明,与最先进的基线相比,HyConve 方法具有更好的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=HyConvE:+A+Novel+Embedding+Model+for+Knowledge+Hypergraph+Link+Prediction+with+Convolutional+Neural+Networks)|0| |[Efficient Approximation Algorithms for the Diameter-Bounded Max-Coverage Group Steiner Tree Problem](https://doi.org/10.1145/3543507.3583257)|Ke Zhang, Xiaoqing Wang, Gong Cheng|State Key Laboratory for Novel Software Technology, Nanjing University, China|The Diameter-bounded max-Coverage Group Steiner Tree (DCGST) problem has recently been proposed as an expressive way of formulating keyword-based search and exploration of knowledge graphs. It aims at finding a diameter-bounded tree which covers the most given groups of vertices and has the minimum weight. In contrast to its specialization—the classic Group Steiner Tree (GST) problem, which has been extensively studied—the emerging DCGST problem still lacks an efficient algorithm. In this paper, we propose Cba, the first approximation algorithm for the DCGST problem, and we prove its worst-case approximation ratio. Furthermore, we incorporate a best-first search strategy with two pruning methods into PrunedCBA, an improved approximation algorithm. Our extensive experiments on real and synthetic graphs demonstrate the effectiveness and efficiency of PrunedCBA.|直径有界的最大覆盖群 Steiner 树(DCGST)问题是近年来提出的一种基于关键字搜索和知识图探索的表示方法。它的目标是找到一个直径有界的树,它覆盖了最多给定的顶点群,并具有最小的权重。与经典的群 Steiner 树(GST)问题相比,新出现的 DCGST 问题仍然缺乏一种有效的算法。在这篇文章中,我们提出了 Cba,这是 DCGST 问题的第一个近似演算法,并且证明了它的最坏情况逼近比。此外,我们将最佳优先搜索策略和两种修剪方法结合到一个改进的近似演算法 PrunedCBA 中。我们在实图和合成图上的大量实验证明了 PrunedCBA 的有效性和效率。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Efficient+Approximation+Algorithms+for+the+Diameter-Bounded+Max-Coverage+Group+Steiner+Tree+Problem)|0| -|[ConsRec: Learning Consensus Behind Interactions for Group Recommendation](https://doi.org/10.1145/3543507.3583277)|Xixi Wu, Yun Xiong, Yao Zhang, Yizhu Jiao, Jiawei Zhang, Yangyong Zhu, Philip S. Yu|University of Illinois at Chicago, USA; Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University, China; University of Illinois at Urbana-Champaign, USA; IFM Lab, Department of Computer Science, University of California, Davis, USA|Since group activities have become very common in daily life, there is an urgent demand for generating recommendations for a group of users, referred to as group recommendation task. Existing group recommendation methods usually infer groups' preferences via aggregating diverse members' interests. Actually, groups' ultimate choice involves compromises between members, and finally, an agreement can be reached. However, existing individual information aggregation lacks a holistic group-level consideration, failing to capture the consensus information. Besides, their specific aggregation strategies either suffer from high computational costs or become too coarse-grained to make precise predictions.
To solve the aforementioned limitations, in this paper, we focus on exploring consensus behind group behavior data. To comprehensively capture the group consensus, we innovatively design three distinct views which provide mutually complementary information to enable multi-view learning, including member-level aggregation, item-level tastes, and group-level inherent preferences. To integrate and balance the multi-view information, an adaptive fusion component is further proposed. As to member-level aggregation, different from existing linear or attentive strategies, we design a novel hypergraph neural network that allows for efficient hypergraph convolutional operations to generate expressive member-level aggregation. We evaluate our ConsRec on two real-world datasets and experimental results show that our model outperforms state-of-the-art methods. An extensive case study also verifies the effectiveness of consensus modeling.|由于小组活动在日常生活中已经非常普遍,因此迫切需要为一组用户提供建议,称为小组推荐任务。现有的群体推荐方法通常通过聚合不同成员的兴趣来推断群体的偏好。实际上,团体的最终选择包括成员之间的妥协,最终可以达成协议。然而,现有的个人信息聚合缺乏整体的群体层面的考虑,未能捕获共识信息。此外,他们特定的聚合策略要么计算成本高,要么过于粗粒度,无法做出精确的预测。为了解决上述局限性,本文重点探讨群体行为数据背后的共识。为了全面捕捉群体共识,我们创新性地设计了三种不同的视图,提供相互补充的信息,使多视图学习成为可能,包括成员层面的聚合、项目层面的品味和群体层面的固有偏好。为了对多视点信息进行集成和平衡,进一步提出了一种自适应融合构件。对于成员级聚集,不同于现有的线性或注意策略,我们设计了一种新的超图神经网络,该网络允许有效的超图卷积操作来产生具有表达能力的成员级聚集。我们在两个真实世界的数据集上评估了我们的 ConsRec,实验结果表明我们的模型优于最先进的方法。一个广泛的案例研究也验证了一致性建模的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ConsRec:+Learning+Consensus+Behind+Interactions+for+Group+Recommendation)|0| -|[Graph Neural Networks with Diverse Spectral Filtering](https://doi.org/10.1145/3543507.3583324)|Jingwei Guo, Kaizhu Huang, Xinping Yi, Rui Zhang|Xi'an Jiaotong-Liverpool University, China; Xi'an Jiaotong-Liverpool University; The University of Liverpool, China; The University of Liverpool, United Kingdom; Duke Kunshan University, China|Spectral Graph Neural Networks (GNNs) have achieved tremendous success in graph machine learning, with polynomial filters applied for graph convolutions, where all nodes share the identical filter weights to mine their local contexts. Despite the success, existing spectral GNNs usually fail to deal with complex networks (e.g., WWW) due to such homogeneous spectral filtering setting that ignores the regional heterogeneity as typically seen in real-world networks. To tackle this issue, we propose a novel diverse spectral filtering (DSF) framework, which automatically learns node-specific filter weights to exploit the varying local structure properly. Particularly, the diverse filter weights consist of two components — A global one shared among all nodes, and a local one that varies along network edges to reflect node difference arising from distinct graph parts — to balance between local and global information. As such, not only can the global graph characteristics be captured, but also the diverse local patterns can be mined with awareness of different node positions. Interestingly, we formulate a novel optimization problem to assist in learning diverse filters, which also enables us to enhance any spectral GNNs with our DSF framework. We showcase the proposed framework on three state-of-the-arts including GPR-GNN, BernNet, and JacobiConv. 
Extensive experiments over 10 benchmark datasets demonstrate that our framework can consistently boost model performance by up to 4.92% in node classification tasks, producing diverse filters with enhanced interpretability.|谱图神经网络(GNN)在图形机器学习中取得了巨大的成功,其中多项式滤波器应用于图卷积,其中所有节点共享相同的滤波器权重来挖掘它们的局部上下文。尽管已有的光谱 GNN 在处理复杂网络(如 WWW)时取得了一定的成功,但由于均匀的光谱滤波设置忽略了现实网络中典型的区域异质性,导致 GNN 无法处理复杂网络(如 WWW)。针对这一问题,我们提出了一种新的多谱段滤波(DSF)框架,该框架能够自动学习节点特定的滤波器权值,以适当地利用变化的局部结构。特别地,不同的滤波器权重由两部分组成: 一部分是所有节点共享的全局权重,另一部分是沿网络边缘变化的局部权重,以反映不同图部分产生的节点差异,从而平衡局部和全局信息。因此,不仅可以捕获全局图的特征,而且可以挖掘不同节点位置的不同局部模式。有趣的是,我们制定了一个新的最佳化问题,以帮助学习不同的过滤器,这也使我们能够增强任何光谱 GNN 与我们的 dSF 框架。我们在 GPR-GNN、 BernNet 和 JacobiConv 这三种最新技术的基础上展示了提议的框架。通过对10个基准数据集的大量实验表明,我们的框架可以在节点分类任务中始终如一地将模型性能提高高达4.92% ,产生具有增强可解释性的多样化过滤器。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Graph+Neural+Networks+with+Diverse+Spectral+Filtering)|0| +|[ConsRec: Learning Consensus Behind Interactions for Group Recommendation](https://doi.org/10.1145/3543507.3583277)|Xixi Wu, Yun Xiong, Yao Zhang, Yizhu Jiao, Jiawei Zhang, Yangyong Zhu, Philip S. Yu|University of Illinois at Chicago, USA; University of Illinois at Urbana-Champaign, USA; IFM Lab, Department of Computer Science, University of California, Davis, USA; Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University, China|Since group activities have become very common in daily life, there is an urgent demand for generating recommendations for a group of users, referred to as group recommendation task. Existing group recommendation methods usually infer groups' preferences via aggregating diverse members' interests. Actually, groups' ultimate choice involves compromises between members, and finally, an agreement can be reached. However, existing individual information aggregation lacks a holistic group-level consideration, failing to capture the consensus information. Besides, their specific aggregation strategies either suffer from high computational costs or become too coarse-grained to make precise predictions. To solve the aforementioned limitations, in this paper, we focus on exploring consensus behind group behavior data. To comprehensively capture the group consensus, we innovatively design three distinct views which provide mutually complementary information to enable multi-view learning, including member-level aggregation, item-level tastes, and group-level inherent preferences. To integrate and balance the multi-view information, an adaptive fusion component is further proposed. As to member-level aggregation, different from existing linear or attentive strategies, we design a novel hypergraph neural network that allows for efficient hypergraph convolutional operations to generate expressive member-level aggregation. We evaluate our ConsRec on two real-world datasets and experimental results show that our model outperforms state-of-the-art methods. 
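The member-level hypergraph aggregation mentioned in the ConsRec entry above can be sketched with a standard hypergraph convolution, X' = Dv^{-1} H De^{-1} H^T X; the toy incidence matrix H is invented here, and this is the generic operator rather than ConsRec's exact design:

```python
import numpy as np

# Minimal hypergraph convolution X' = Dv^-1 H De^-1 H^T X (a common row-
# normalized variant); the group incidence matrix H is a toy stand-in.
H = np.array([[1, 0],    # member 0 in group 0
              [1, 1],    # member 1 in both groups
              [0, 1]])   # member 2 in group 1
X = np.random.rand(3, 4)                    # member embeddings

Dv = np.diag(1.0 / H.sum(axis=1))           # member degrees
De = np.diag(1.0 / H.sum(axis=0))           # group (hyperedge) sizes
X_new = Dv @ H @ De @ H.T @ X               # propagate member -> group -> member
print(X_new.shape)                          # (3, 4)
```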
An extensive case study also verifies the effectiveness of consensus modeling.|由于小组活动在日常生活中已经非常普遍,因此迫切需要为一组用户提供建议,称为小组推荐任务。现有的群体推荐方法通常通过聚合不同成员的兴趣来推断群体的偏好。实际上,团体的最终选择包括成员之间的妥协,最终可以达成协议。然而,现有的个人信息聚合缺乏整体的群体层面的考虑,未能捕获共识信息。此外,他们特定的聚合策略要么计算成本高,要么过于粗粒度,无法做出精确的预测。为了解决上述局限性,本文重点探讨群体行为数据背后的共识。为了全面捕捉群体共识,我们创新性地设计了三种不同的视图,提供相互补充的信息,使多视图学习成为可能,包括成员层面的聚合、项目层面的品味和群体层面的固有偏好。为了对多视点信息进行集成和平衡,进一步提出了一种自适应融合构件。对于成员级聚集,不同于现有的线性或注意策略,我们设计了一种新的超图神经网络,该网络允许有效的超图卷积操作来产生具有表达能力的成员级聚集。我们在两个真实世界的数据集上评估了我们的 ConsRec,实验结果表明我们的模型优于最先进的方法。一个广泛的案例研究也验证了一致性建模的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ConsRec:+Learning+Consensus+Behind+Interactions+for+Group+Recommendation)|0| +|[Graph Neural Networks with Diverse Spectral Filtering](https://doi.org/10.1145/3543507.3583324)|Jingwei Guo, Kaizhu Huang, Xinping Yi, Rui Zhang|Xi'an Jiaotong-Liverpool University, China; Duke Kunshan University, China; The University of Liverpool, United Kingdom; Xi'an Jiaotong-Liverpool University; The University of Liverpool, China|Spectral Graph Neural Networks (GNNs) have achieved tremendous success in graph machine learning, with polynomial filters applied for graph convolutions, where all nodes share the identical filter weights to mine their local contexts. Despite the success, existing spectral GNNs usually fail to deal with complex networks (e.g., WWW) due to such homogeneous spectral filtering setting that ignores the regional heterogeneity as typically seen in real-world networks. To tackle this issue, we propose a novel diverse spectral filtering (DSF) framework, which automatically learns node-specific filter weights to exploit the varying local structure properly. Particularly, the diverse filter weights consist of two components — a global one shared among all nodes, and a local one that varies along network edges to reflect node difference arising from distinct graph parts — to balance between local and global information. As such, not only can the global graph characteristics be captured, but also the diverse local patterns can be mined with awareness of different node positions. Interestingly, we formulate a novel optimization problem to assist in learning diverse filters, which also enables us to enhance any spectral GNNs with our DSF framework. We showcase the proposed framework on three state-of-the-art models, including GPR-GNN, BernNet, and JacobiConv.
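A minimal sketch of the node-wise diverse spectral filtering described above, using a monomial polynomial basis on the normalized Laplacian for brevity; DSF itself learns the local coefficients from local structure and supports richer bases, so everything below is an illustrative simplification:

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 5, 3
A = (rng.random((n, n)) < 0.4).astype(float)
A = np.triu(A, 1); A = A + A.T                       # undirected toy graph
d = A.sum(1); d[d == 0] = 1
L = np.eye(n) - A / np.sqrt(np.outer(d, d))          # normalized Laplacian
x = rng.standard_normal(n)

# Propagated signals L^k x for k = 0..K-1 (monomial basis for brevity).
props = [x]
for _ in range(K - 1):
    props.append(L @ props[-1])
P = np.stack(props, axis=1)                          # (n, K)

theta_global = rng.standard_normal(K)                # shared across all nodes
theta_local = 0.1 * rng.standard_normal((n, K))      # node-specific (learned in DSF)
y = (P * (theta_global + theta_local)).sum(axis=1)   # node-wise diverse filtering
print(y.shape)
```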
Extensive experiments over 10 benchmark datasets demonstrate that our framework can consistently boost model performance by up to 4.92% in node classification tasks, producing diverse filters with enhanced interpretability.|谱图神经网络(GNN)在图形机器学习中取得了巨大的成功,其中多项式滤波器应用于图卷积,其中所有节点共享相同的滤波器权重来挖掘它们的局部上下文。尽管已有的光谱 GNN 在处理复杂网络(如 WWW)时取得了一定的成功,但由于均匀的光谱滤波设置忽略了现实网络中典型的区域异质性,导致 GNN 无法处理复杂网络(如 WWW)。针对这一问题,我们提出了一种新的多谱段滤波(DSF)框架,该框架能够自动学习节点特定的滤波器权值,以适当地利用变化的局部结构。特别地,不同的滤波器权重由两部分组成: 一部分是所有节点共享的全局权重,另一部分是沿网络边缘变化的局部权重,以反映不同图部分产生的节点差异,从而平衡局部和全局信息。因此,不仅可以捕获全局图的特征,而且可以挖掘不同节点位置的不同局部模式。有趣的是,我们制定了一个新的最佳化问题,以帮助学习不同的过滤器,这也使我们能够增强任何光谱 GNN 与我们的 dSF 框架。我们在 GPR-GNN、 BernNet 和 JacobiConv 这三种最新技术的基础上展示了提议的框架。通过对10个基准数据集的大量实验表明,我们的框架可以在节点分类任务中始终如一地将模型性能提高高达4.92% ,产生具有增强可解释性的多样化过滤器。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Graph+Neural+Networks+with+Diverse+Spectral+Filtering)|0| |[Semi-decentralized Federated Ego Graph Learning for Recommendation](https://doi.org/10.1145/3543507.3583337)|Liang Qu, Ningzhi Tang, Ruiqi Zheng, Quoc Viet Hung Nguyen, Zi Huang, Yuhui Shi, Hongzhi Yin|Southern University of Science and Technology, China; Griffith University, Australia; The University of Queensland, Australia|Collaborative filtering (CF) based recommender systems are typically trained based on personal interaction data (e.g., clicks and purchases) that could be naturally represented as ego graphs. However, most existing recommendation methods collect these ego graphs from all users to compose a global graph to obtain high-order collaborative information between users and items, and these centralized CF recommendation methods inevitably lead to a high risk of user privacy leakage. Although recently proposed federated recommendation systems can mitigate the privacy problem, they either restrict the on-device local training to an isolated ego graph or rely on an additional third-party server to access other ego graphs, resulting in a cumbersome pipeline that is hard to make work in practice. In addition, existing federated recommendation systems require resource-limited devices to maintain the entire embedding tables, resulting in high communication costs. In light of this, we propose a semi-decentralized federated ego graph learning framework for on-device recommendations, named SemiDFEGL, which introduces new device-to-device collaborations to improve scalability and reduce communication costs and innovatively utilizes predicted interacted item nodes to connect isolated ego graphs to augment local subgraphs such that the high-order user-item collaborative information could be used in a privacy-preserving manner. Furthermore, the proposed framework is model-agnostic, meaning that it could be seamlessly integrated with existing graph neural network-based recommendation methods and privacy protection techniques.
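The federated training loop described above can be caricatured as federated averaging over per-device ego graphs; the toy on-device objective below is a stand-in for a real recommendation loss, and SemiDFEGL's device-to-device collaboration and predicted-item augmentation are deliberately not modeled:

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, dim, n_devices = 50, 8, 4

# Server-side shared item table; user embeddings never leave the device.
global_items = rng.standard_normal((n_items, dim)) * 0.1

def local_update(items, ego_items, steps=5, lr=0.05):
    # Toy on-device objective: pull the embeddings of items in this user's
    # ego graph towards each other (stand-in for a real recommendation loss).
    items = items.copy()
    for _ in range(steps):
        center = items[ego_items].mean(axis=0)
        items[ego_items] -= lr * (items[ego_items] - center)
    return items

for rnd in range(3):
    ego_graphs = [rng.choice(n_items, size=6, replace=False) for _ in range(n_devices)]
    updates = [local_update(global_items, eg) for eg in ego_graphs]
    global_items = np.mean(updates, axis=0)   # FedAvg-style aggregation
print(global_items.shape)
```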
To validate the effectiveness of the proposed SemiDFEGL, extensive experiments are conducted on three public datasets, and the results demonstrate the superiority of the proposed SemiDFEGL compared to other federated recommendation methods.|基于协同过滤(CF)的推荐系统通常基于个人交互数据(例如点击和购买)进行培训,这些数据可以自然地表示为自我图表。然而,现有的大多数推荐方法都是从所有用户中收集这些自我图来构成一个全局图,以获得用户和项目之间的高阶协同信息,而这些集中式的 CF 推荐方法不可避免地会导致用户隐私泄露的高风险。虽然最近提出的联邦推荐系统可以缓解隐私问题,但是它们要么将设备上的本地培训限制在一个孤立的自我图上,要么依赖于另外一个第三方服务器来访问其他自我图,从而产生一个繁琐的管道,这在实践中很难实现。此外,现有的联邦推荐系统需要资源有限的设备来维护整个嵌入表,从而导致高通信成本。鉴于此,我们提出了一个半分散的联邦自我图学习框架 SemiDFEGL,该框架引入了新的设备间协作以提高可扩展性和降低通信成本,并创新地利用预测的交互项节点连接孤立的自我图以增强局部子图,从而可以以保护隐私的方式使用高阶用户项协作信息。此外,提出的框架是模型无关的,这意味着它可以与现有的基于图神经网络的推荐方法和隐私保护技术无缝集成。为了验证半 DFEGL 的有效性,在三个公共数据集上进行了广泛的实验,实验结果表明了半 DFEGL 相对于其他联邦推荐方法的优越性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Semi-decentralized+Federated+Ego+Graph+Learning+for+Recommendation)|0| |[SINCERE: Sequential Interaction Networks representation learning on Co-Evolving RiEmannian manifolds](https://doi.org/10.1145/3543507.3583353)|Junda Ye, Zhongbao Zhang, Li Sun, Yang Yan, Feiyang Wang, Fuxin Ren|Beijing University of Posts and Telecommunications, China; North China Electric Power University, China|Sequential interaction networks (SIN) have been commonly adopted in many applications such as recommendation systems, search engines and social networks to describe the mutual influence between users and items/products. Efforts on representing SIN are mainly focused on capturing the dynamics of networks in Euclidean space, and recently plenty of work has extended to hyperbolic geometry for implicit hierarchical learning. Previous approaches which learn the embedding trajectories of users and items achieve promising results. However, there are still a range of fundamental issues remaining open. For example, is it appropriate to place user and item nodes in one identical space regardless of their inherent discrepancy? Instead of residing in a single fixed curvature space, how will the representation spaces evolve when new interaction occurs? To explore these issues for sequential interaction networks, we propose SINCERE, a novel method representing Sequential Interaction Networks on Co-Evolving RiEmannian manifolds. SINCERE not only takes the user and item embedding trajectories in respective spaces into account, but also emphasizes the space evolvement, i.e., how curvature changes over time. Specifically, we introduce a fresh cross-geometry aggregation which allows us to propagate information across different Riemannian manifolds without breaking conformal invariance, and a curvature estimator which is delicately designed to predict global curvatures effectively according to current local Ricci curvatures.
Extensive experiments on several real-world datasets demonstrate the promising performance of SINCERE over the state-of-the-art sequential interaction prediction methods.|在推荐系统、搜索引擎、社交网络等应用中,用户与产品之间的相互影响通常采用序贯交互网络(SIN)来描述。表示 SIN 的努力主要集中在捕捉欧几里得空间中网络的动态,最近大量的工作已经延伸到了隐含双曲几何的深度学习。以往的方法通过学习用户和项目的嵌入轨迹,取得了良好的效果。然而,仍有一系列基本问题悬而未决。例如,是否应该将用户和项目节点放在一个相同的空间中,而不管它们的固有差异?当发生新的相互作用时,表示空间将如何演化,而不是驻留在一个单一的固定曲率空间中?为了研究序贯相互作用网络的这些问题,我们提出了一种新的方法 SINCARE,它在共进化黎曼流形上表示序贯相互作用网络。SIN-CERE 不仅考虑了用户和项目在各自空间中的嵌入轨迹,而且强调了曲率随时间变化的空间演化。具体地说,我们引入了一个新的交叉几何聚合,它允许我们在不破坏共形不变性的情况下在不同的黎曼流形上传播信息,以及一个精心设计的曲率估计器,它可以根据当前的局部 Ricci 曲率有效地预测全局曲率。在几个真实世界数据集上的大量实验表明,SINCARE 相对于最先进的顺序交互预测方法具有很好的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=SINCERE:+Sequential+Interaction+Networks+representation+learning+on+Co-Evolving+RiEmannian+manifolds)|0| -|[TIGER: Temporal Interaction Graph Embedding with Restarts](https://doi.org/10.1145/3543507.3583433)|Yao Zhang, Yun Xiong, Yongxiang Liao, Yiheng Sun, Yucheng Jin, Xuehao Zheng, Yangyong Zhu|Tencent Weixin Group, China; Fudan University, China|Temporal interaction graphs (TIGs), consisting of sequences of timestamped interaction events, are prevalent in fields like e-commerce and social networks. To better learn dynamic node embeddings that vary over time, researchers have proposed a series of temporal graph neural networks for TIGs. However, due to the entangled temporal and structural dependencies, existing methods have to process the sequence of events chronologically and consecutively to ensure node representations are up-to-date. This prevents existing models from parallelization and reduces their flexibility in industrial applications. To tackle the above challenge, in this paper, we propose TIGER, a TIG embedding model that can restart at any timestamp. We introduce a restarter module that generates surrogate representations acting as the warm initialization of node representations. By restarting from multiple timestamps simultaneously, we divide the sequence into multiple chunks and naturally enable the parallelization of the model. Moreover, in contrast to previous models that utilize a single memory unit, we introduce a dual memory module to better exploit neighborhood information and alleviate the staleness problem. Extensive experiments on four public datasets and one industrial dataset are conducted, and the results verify both the effectiveness and the efficiency of our work.|时间交互图(TIGs)由时间戳交互事件序列组成,在电子商务和社交网络等领域非常普遍。为了更好地学习随时间变化的动态节点嵌入,研究人员提出了一系列针对 TIG 的时间图神经网络。然而,由于时间和结构的依赖性,现有的方法必须按时间顺序和连续地处理事件序列,以确保节点表示是最新的。这阻止了现有模型的并行化,并降低了它们在工业应用程序中的灵活性。为了应对上述挑战,本文提出了一种 TIG 嵌入模型 TIGER,它可以在任意时间戳重新启动。我们引入了一个 restarter 模块,它生成代理表示作为节点表示的温初始化。通过同时从多个时间戳重新开始,我们将序列划分为多个块,自然而然地实现了模型的并行化。此外,相对于以往的单一存储器模型,我们引入了双存储器模块,以更好地利用邻域信息和缓解过时问题。对四个公共数据集和一个工业数据集进行了广泛的实验,实验结果验证了本文工作的有效性和高效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=TIGER:+Temporal+Interaction+Graph+Embedding+with+Restarts)|0| -|[Expressive and Efficient Representation Learning for Ranking Links in Temporal Graphs](https://doi.org/10.1145/3543507.3583476)|Susheel Suresh, Mayank Shrivastava, Arko Mukherjee, Jennifer Neville, Pan Li|Microsoft, USA; Microsoft Research, USA; Purdue University, USA|Temporal graph representation learning (T-GRL) aims to learn representations that model how graph edges evolve over time. 
While recent works on T-GRL have improved link prediction accuracy in temporal settings, their methods optimize a point-wise loss function independently over future links rather than optimize jointly over a candidate set per node. In applications where resources (e.g., attention) are allocated based on ranking links by likelihood, the use of a ranking loss is preferred. However it is not straightforward to develop a T-GRL method to optimize a ranking loss due to a tradeoff between model expressivity and scalability. In this work, we address these issues and propose a Temporal Graph network for Ranking (TGRank), which significantly improves performance for link prediction tasks by (i) optimizing a list-wise loss for improved ranking, and (ii) incorporating a labeling approach designed to allow for efficient inference over the candidate set jointly, while provably boosting expressivity. We extensively evaluate TGRank over six real networks. TGRank outperforms the state-of-the-art baselines on average by 14.21%↑ (transductive) and 16.25% ↑ (inductive) in ranking metrics while being more efficient (up-to 65 × speed-up) to make inference on large networks.|时态图表示学习(T-GRL)的目的是学习模拟图边如何随时间演化的表示。尽管最近在 T-GRL 上的工作在时间设置上提高了链路预测的精度,但是他们的方法在未来链路上独立优化点损失函数,而不是在每个节点的候选集上联合优化。在应用程序中,资源(如注意力)的分配是基于可能性的排名链接,使用排名损失是首选。然而,由于模型表达性和可伸缩性之间的权衡,开发 T-GRL 方法来优化排名损失并不容易。在这项工作中,我们解决了这些问题,并提出了排名时态图网络(TGRank) ,它通过(i)优化改善排名的列表损失,以及(ii)结合标记方法,以便对候选集合进行有效的推理,同时可证明地提高表现力,从而显着改善链接预测任务的性能。我们广泛评估 TGRank 在六个实际网络。TGRank 在排序指标方面平均比最先进的基线表现出14.21% 惊(转导)和16.25% 惊(归纳)的优势,同时在大型网络上进行推理的效率更高(高达65倍加速)。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Expressive+and+Efficient+Representation+Learning+for+Ranking+Links+in+Temporal+Graphs)|0| -|[Semi-Supervised Embedding of Attributed Multiplex Networks](https://doi.org/10.1145/3543507.3583485)|Ylli Sadikaj, Justus Rass, Yllka Velaj, Claudia Plant|Faculty of Computer Science, University of Vienna, Austria; Faculty of Computer Science, University of Vienna, Austria and ds:Univie, University of Vienna, Austria; Faculty of Computer Science, University of Vienna, Austria and UniVie Doctoral School Computer Science, University of Vienna, Austria|Complex information can be represented as networks (graphs) characterized by a large number of nodes, multiple types of nodes, and multiple types of relationships between them, i.e. multiplex networks. Additionally, these networks are enriched with different types of node features. We propose a Semi-supervised Embedding approach for Attributed Multiplex Networks (SSAMN), to jointly embed nodes, node attributes, and node labels of multiplex networks in a low dimensional space. Network embedding techniques have garnered research attention for real-world applications. However, most existing techniques solely focus on learning the node embeddings, and only a few learn class label embeddings. Our method assumes that we have different classes of nodes and that we know the class label of some, very few nodes for every class. Guided by this type of supervision, SSAMN learns a low-dimensional representation incorporating all information in a large labeled multiplex network. SSAMN integrates techniques from Spectral Embedding and Homogeneity Analysis to improve the embedding of nodes, node attributes, and node labels. Our experiments demonstrate that we only need very few labels per class in order to have a final embedding that preservers the information of the graph. To evaluate the performance of SSAMN, we run experiments on four real-world datasets. 
The results show that our approach outperforms state-of-the-art methods for downstream tasks such as semi-supervised node classification and node clustering.|复杂的信息可以用网络(图形)来表示,拥有属性包括大量的节点、多种类型的节点以及它们之间多种类型的关系,即多路网络。此外,这些网络丰富了不同类型的节点特征。提出了一种基于半监督嵌入的属性化多路网络(SSAMN)方法,在低维空间中联合嵌入多路网络的节点、节点属性和节点标签。网络嵌入技术已经成为现实应用领域的研究热点。然而,大多数现有的技术只关注于学习节点嵌入,只有少数学习类标签嵌入。我们的方法假设我们有不同的节点类,并且我们知道每个类的一些非常少的节点的类标签。在这种类型的监督指导下,SSAMN 学习了一种低维表示,将所有信息整合到一个大的标记多路网络中。SSAMN 集成了谱嵌入和均匀性分析技术,改进了节点、节点属性和节点标签的嵌入。我们的实验表明,我们只需要非常少的标签每个类,以便有一个最终的嵌入,保存图的信息。为了评估 SSAMN 的表现,我们在四个真实世界的数据集上进行了实验。结果表明,该方法在处理半监督节点分类和节点聚类等下游任务时,性能优于现有方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Semi-Supervised+Embedding+of+Attributed+Multiplex+Networks)|0| -|[Search to Capture Long-range Dependency with Stacking GNNs for Graph Classification](https://doi.org/10.1145/3543507.3583486)|Lanning Wei, Zhiqiang He, Huan Zhao, Quanming Yao|Institute of Computing Technology, Chinese Academy of Sciences, China and University of Chinese Academy of Sciences, China; Department of Electronic Engineering, Tsinghua University, China; 4Paradigm. Inc, China; Institute of Computing Technology, Chinese Academy of Science, China and Lenovo, China|In recent years, Graph Neural Networks (GNNs) have been popular in the graph classification task. Currently, shallow GNNs are more common due to the well-known over-smoothing problem facing deeper GNNs. However, they are sub-optimal without utilizing the information from distant nodes, i.e., the long-range dependencies. The mainstream methods in the graph classification task can extract the long-range dependencies either by designing the pooling operations or incorporating the higher-order neighbors, while they have evident drawbacks by modifying the original graph structure, which may result in information loss in graph structure learning. In this paper, by justifying the smaller influence of the over-smoothing problem in the graph classification task, we evoke the importance of stacking-based GNNs and then employ them to capture the long-range dependencies without modifying the original graph structure. To achieve this, two design needs are given for stacking-based GNNs, i.e., sufficient model depth and adaptive skip-connection schemes. By transforming the two design needs into designing data-specific inter-layer connections, we propose a novel approach with the help of neural architecture search (NAS), which is dubbed LRGNN (Long-Range Graph Neural Networks). 
Extensive experiments on five datasets show that the proposed LRGNN can achieve the best performance, and obtained data-specific GNNs with different depth and skip-connection schemes, which can better capture the long-range dependencies.|近年来,图形神经网络(GNN)在图形分类任务中得到了广泛的应用。目前,浅 GNN 更常见,由于众所周知的过度平滑问题所面临的深 GNN。然而,如果没有利用来自远程节点的信息(即远程依赖) ,它们就是次优的。图分类任务中的主流方法可以通过设计合并操作或合并高阶邻居来提取远程依赖关系,但通过修改原有的图结构存在明显的缺陷,可能导致图结构学习中的信息丢失。本文通过证明过平滑问题在图分类任务中的影响较小,引出了基于叠加的 GNN 的重要性,并利用它们在不改变原始图结构的情况下捕获长程依赖关系。为了实现这一目标,给出了基于叠加的 GNN 的两种设计需求,即充分的模型深度和自适应跳跃连接方案。通过将这两种设计需求转化为设计数据特定的层间连接,我们提出了一种神经结构搜索(NAS)的新方法,称为长程图形神经网络(LRGNN)。通过对5个数据集的大量实验表明,本文提出的 LRGNN 能够获得最好的性能,并且能够获得具有不同深度和跳跃连接方案的数据特定 GNN,能够更好地捕获远程依赖关系。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Search+to+Capture+Long-range+Dependency+with+Stacking+GNNs+for+Graph+Classification)|0| +|[TIGER: Temporal Interaction Graph Embedding with Restarts](https://doi.org/10.1145/3543507.3583433)|Yao Zhang, Yun Xiong, Yongxiang Liao, Yiheng Sun, Yucheng Jin, Xuehao Zheng, Yangyong Zhu|Fudan University, China; Tencent Weixin Group, China|Temporal interaction graphs (TIGs), consisting of sequences of timestamped interaction events, are prevalent in fields like e-commerce and social networks. To better learn dynamic node embeddings that vary over time, researchers have proposed a series of temporal graph neural networks for TIGs. However, due to the entangled temporal and structural dependencies, existing methods have to process the sequence of events chronologically and consecutively to ensure node representations are up-to-date. This prevents existing models from parallelization and reduces their flexibility in industrial applications. To tackle the above challenge, in this paper, we propose TIGER, a TIG embedding model that can restart at any timestamp. We introduce a restarter module that generates surrogate representations acting as the warm initialization of node representations. By restarting from multiple timestamps simultaneously, we divide the sequence into multiple chunks and naturally enable the parallelization of the model. Moreover, in contrast to previous models that utilize a single memory unit, we introduce a dual memory module to better exploit neighborhood information and alleviate the staleness problem. Extensive experiments on four public datasets and one industrial dataset are conducted, and the results verify both the effectiveness and the efficiency of our work.|时间交互图(TIGs)由时间戳交互事件序列组成,在电子商务和社交网络等领域非常普遍。为了更好地学习随时间变化的动态节点嵌入,研究人员提出了一系列针对 TIG 的时间图神经网络。然而,由于时间和结构的依赖性,现有的方法必须按时间顺序和连续地处理事件序列,以确保节点表示是最新的。这阻止了现有模型的并行化,并降低了它们在工业应用程序中的灵活性。为了应对上述挑战,本文提出了一种 TIG 嵌入模型 TIGER,它可以在任意时间戳重新启动。我们引入了一个 restarter 模块,它生成代理表示作为节点表示的温初始化。通过同时从多个时间戳重新开始,我们将序列划分为多个块,自然而然地实现了模型的并行化。此外,相对于以往的单一存储器模型,我们引入了双存储器模块,以更好地利用邻域信息和缓解过时问题。对四个公共数据集和一个工业数据集进行了广泛的实验,实验结果验证了本文工作的有效性和高效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=TIGER:+Temporal+Interaction+Graph+Embedding+with+Restarts)|0| +|[Expressive and Efficient Representation Learning for Ranking Links in Temporal Graphs](https://doi.org/10.1145/3543507.3583476)|Susheel Suresh, Mayank Shrivastava, Arko Mukherjee, Jennifer Neville, Pan Li|Microsoft Research, USA; Purdue University, USA; Microsoft, USA|Temporal graph representation learning (T-GRL) aims to learn representations that model how graph edges evolve over time. 
While recent works on T-GRL have improved link prediction accuracy in temporal settings, their methods optimize a point-wise loss function independently over future links rather than optimize jointly over a candidate set per node. In applications where resources (e.g., attention) are allocated based on ranking links by likelihood, the use of a ranking loss is preferred. However, it is not straightforward to develop a T-GRL method to optimize a ranking loss due to a tradeoff between model expressivity and scalability. In this work, we address these issues and propose a Temporal Graph network for Ranking (TGRank), which significantly improves performance for link prediction tasks by (i) optimizing a list-wise loss for improved ranking, and (ii) incorporating a labeling approach designed to allow for efficient inference over the candidate set jointly, while provably boosting expressivity. We extensively evaluate TGRank over six real networks. TGRank outperforms the state-of-the-art baselines on average by 14.21% ↑ (transductive) and 16.25% ↑ (inductive) in ranking metrics while being more efficient (up-to 65 × speed-up) to make inference on large networks.|时态图表示学习(T-GRL)的目的是学习模拟图边如何随时间演化的表示。尽管最近在 T-GRL 上的工作在时间设置上提高了链路预测的精度,但是他们的方法在未来链路上独立优化点损失函数,而不是在每个节点的候选集上联合优化。在应用程序中,资源(如注意力)的分配是基于可能性的排名链接,使用排名损失是首选。然而,由于模型表达性和可伸缩性之间的权衡,开发 T-GRL 方法来优化排名损失并不容易。在这项工作中,我们解决了这些问题,并提出了排名时态图网络(TGRank) ,它通过(i)优化改善排名的列表损失,以及(ii)结合标记方法,以便对候选集合进行有效的推理,同时可证明地提高表现力,从而显着改善链接预测任务的性能。我们广泛评估 TGRank 在六个实际网络。TGRank 在排序指标方面平均比最先进的基线表现出14.21% ↑(转导)和16.25% ↑(归纳)的优势,同时在大型网络上进行推理的效率更高(高达65倍加速)。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Expressive+and+Efficient+Representation+Learning+for+Ranking+Links+in+Temporal+Graphs)|0| +|[Semi-Supervised Embedding of Attributed Multiplex Networks](https://doi.org/10.1145/3543507.3583485)|Ylli Sadikaj, Justus Rass, Yllka Velaj, Claudia Plant|Faculty of Computer Science, University of Vienna, Austria and UniVie Doctoral School Computer Science, University of Vienna, Austria; Faculty of Computer Science, University of Vienna, Austria; Faculty of Computer Science, University of Vienna, Austria and ds:Univie, University of Vienna, Austria|Complex information can be represented as networks (graphs) characterized by a large number of nodes, multiple types of nodes, and multiple types of relationships between them, i.e. multiplex networks. Additionally, these networks are enriched with different types of node features. We propose a Semi-supervised Embedding approach for Attributed Multiplex Networks (SSAMN), to jointly embed nodes, node attributes, and node labels of multiplex networks in a low dimensional space. Network embedding techniques have garnered research attention for real-world applications. However, most existing techniques solely focus on learning the node embeddings, and only a few learn class label embeddings. Our method assumes that we have different classes of nodes and that we know the class label of some, very few nodes for every class. Guided by this type of supervision, SSAMN learns a low-dimensional representation incorporating all information in a large labeled multiplex network. SSAMN integrates techniques from Spectral Embedding and Homogeneity Analysis to improve the embedding of nodes, node attributes, and node labels. Our experiments demonstrate that we only need very few labels per class in order to have a final embedding that preserves the information of the graph. To evaluate the performance of SSAMN, we run experiments on four real-world datasets.
The results show that our approach outperforms state-of-the-art methods for downstream tasks such as semi-supervised node classification and node clustering.|复杂的信息可以用网络(图形)来表示,拥有属性包括大量的节点、多种类型的节点以及它们之间多种类型的关系,即多路网络。此外,这些网络丰富了不同类型的节点特征。提出了一种基于半监督嵌入的属性化多路网络(SSAMN)方法,在低维空间中联合嵌入多路网络的节点、节点属性和节点标签。网络嵌入技术已经成为现实应用领域的研究热点。然而,大多数现有的技术只关注于学习节点嵌入,只有少数学习类标签嵌入。我们的方法假设我们有不同的节点类,并且我们知道每个类的一些非常少的节点的类标签。在这种类型的监督指导下,SSAMN 学习了一种低维表示,将所有信息整合到一个大的标记多路网络中。SSAMN 集成了谱嵌入和均匀性分析技术,改进了节点、节点属性和节点标签的嵌入。我们的实验表明,我们只需要非常少的标签每个类,以便有一个最终的嵌入,保存图的信息。为了评估 SSAMN 的表现,我们在四个真实世界的数据集上进行了实验。结果表明,该方法在处理半监督节点分类和节点聚类等下游任务时,性能优于现有方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Semi-Supervised+Embedding+of+Attributed+Multiplex+Networks)|0| +|[Search to Capture Long-range Dependency with Stacking GNNs for Graph Classification](https://doi.org/10.1145/3543507.3583486)|Lanning Wei, Zhiqiang He, Huan Zhao, Quanming Yao|Institute of Computing Technology, Chinese Academy of Sciences, China and University of Chinese Academy of Sciences, China; Institute of Computing Technology, Chinese Academy of Science, China and Lenovo, China; Department of Electronic Engineering, Tsinghua University, China; 4Paradigm. Inc, China|In recent years, Graph Neural Networks (GNNs) have been popular in the graph classification task. Currently, shallow GNNs are more common due to the well-known over-smoothing problem facing deeper GNNs. However, they are sub-optimal without utilizing the information from distant nodes, i.e., the long-range dependencies. The mainstream methods in the graph classification task can extract the long-range dependencies either by designing the pooling operations or incorporating the higher-order neighbors, while they have evident drawbacks by modifying the original graph structure, which may result in information loss in graph structure learning. In this paper, by justifying the smaller influence of the over-smoothing problem in the graph classification task, we evoke the importance of stacking-based GNNs and then employ them to capture the long-range dependencies without modifying the original graph structure. To achieve this, two design needs are given for stacking-based GNNs, i.e., sufficient model depth and adaptive skip-connection schemes. By transforming the two design needs into designing data-specific inter-layer connections, we propose a novel approach with the help of neural architecture search (NAS), which is dubbed LRGNN (Long-Range Graph Neural Networks). 
Extensive experiments on five datasets show that the proposed LRGNN can achieve the best performance, and obtains data-specific GNNs with different depth and skip-connection schemes, which can better capture the long-range dependencies.|近年来,图形神经网络(GNN)在图形分类任务中得到了广泛的应用。目前,浅 GNN 更常见,由于众所周知的过度平滑问题所面临的深 GNN。然而,如果没有利用来自远程节点的信息(即远程依赖) ,它们就是次优的。图分类任务中的主流方法可以通过设计合并操作或合并高阶邻居来提取远程依赖关系,但通过修改原有的图结构存在明显的缺陷,可能导致图结构学习中的信息丢失。本文通过证明过平滑问题在图分类任务中的影响较小,引出了基于叠加的 GNN 的重要性,并利用它们在不改变原始图结构的情况下捕获长程依赖关系。为了实现这一目标,给出了基于叠加的 GNN 的两种设计需求,即充分的模型深度和自适应跳跃连接方案。通过将这两种设计需求转化为设计数据特定的层间连接,我们提出了一种神经结构搜索(NAS)的新方法,称为长程图形神经网络(LRGNN)。通过对5个数据集的大量实验表明,本文提出的 LRGNN 能够获得最好的性能,并且能够获得具有不同深度和跳跃连接方案的数据特定 GNN,能够更好地捕获远程依赖关系。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Search+to+Capture+Long-range+Dependency+with+Stacking+GNNs+for+Graph+Classification)|0| |[Cut-matching Games for Generalized Hypergraph Ratio Cuts](https://doi.org/10.1145/3543507.3583539)|Nate Veldt|Texas A&M University, USA|Hypergraph clustering is a basic algorithmic primitive for analyzing complex datasets and systems characterized by multiway interactions, such as group email conversations, groups of co-purchased retail products, and co-authorship data. This paper presents a practical $O(\log n)$-approximation algorithm for a broad class of hypergraph ratio cut clustering objectives. This includes objectives involving generalized hypergraph cut functions, which allow a user to penalize cut hyperedges differently depending on the number of nodes in each cluster. Our method is a generalization of the cut-matching framework for graph ratio cuts, and relies only on solving maximum s-t flow problems in a special reduced graph. It is significantly faster than existing hypergraph ratio cut algorithms, while also solving a more general problem. In numerical experiments on various types of hypergraphs, we show that it quickly finds ratio cut solutions within a small factor of optimality.|Hypergraph 聚类是一种基本的算法原理,用于分析复杂的数据集和多拥有属性交互的系统,比如群组电子邮件对话、共同购买的零售产品组和合著者数据。本文提出了一个实用的 $o (log n) $- 近似演算法,用于一类广泛的超图比率削减聚类目标。这包括涉及广义超图割函数的目标,它允许用户根据每个簇中节点的数量对割超边进行不同的惩罚。该方法是图比割的割匹配框架的推广,仅依赖于求解特殊简化图中的最大 s-t 流问题。它明显快于现有的超图比率割算法,同时也解决了更一般的问题。通过对不同类型超图的数值实验,我们发现它能在一个小的最优性因子内快速找到比率割分解。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Cut-matching+Games+for+Generalized+Hypergraph+Ratio+Cuts)|0| -|[ApeGNN: Node-Wise Adaptive Aggregation in GNNs for Recommendation](https://doi.org/10.1145/3543507.3583530)|Dan Zhang, Yifan Zhu, Yuxiao Dong, Yuandong Wang, Wenzheng Feng, Evgeny Kharlamov, Jie Tang|Bosch Center for Artificial Intelligence, Germany; Tsinghua University, China|In recent years, graph neural networks (GNNs) have made great progress in recommendation. The core mechanism of GNNs-based recommender system is to iteratively aggregate neighboring information on the user-item interaction graph. However, existing GNNs treat users and items equally and cannot distinguish diverse local patterns of each node, which makes them suboptimal in the recommendation scenario. To resolve this challenge, we present a node-wise adaptive graph neural network framework ApeGNN. ApeGNN develops a node-wise adaptive diffusion mechanism for information aggregation, in which each node is enabled to adaptively decide its diffusion weights based on the local structure (e.g., degree). We perform experiments on six widely-used recommendation datasets.
The experimental results show that the proposed ApeGNN is superior to the most advanced GNN-based recommender methods (up to 48.94%), demonstrating the effectiveness of node-wise adaptive aggregation.|近年来,图神经网络在推荐方面取得了很大的进展。基于 GNN 的推荐系统的核心机制是在用户-项目交互图上迭代地聚合相邻信息。然而,现有的 GNN 对用户和项目一视同仁,不能区分每个节点的不同本地模式,这使得它们在推荐场景中处于次优状态。为了解决这一问题,我们提出了一种节点自适应图神经网络框架 ApeGNN。ApeGNN 提出了一种基于节点的自适应信息聚合扩散机制,该机制允许每个节点根据局部结构(如度)自适应确定其扩散权重。我们在六个广泛使用的推荐数据集上进行实验。实验结果表明,该算法优于目前最先进的基于 GNN 的推荐方法(48.94%) ,证明了节点自适应聚集的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ApeGNN:+Node-Wise+Adaptive+Aggregation+in+GNNs+for+Recommendation)|0| -|[Multi-Modal Self-Supervised Learning for Recommendation](https://doi.org/10.1145/3543507.3583206)|Wei Wei, Chao Huang, Lianghao Xia, Chuxu Zhang|The University of Hong Kong, Hong Kong; Brandeis University, USA; University of Hong Kong, Hong Kong|The online emergence of multi-modal sharing platforms (eg, TikTok, Youtube) is powering personalized recommender systems to incorporate various modalities (eg, visual, textual and acoustic) into the latent user representations. While existing works on multi-modal recommendation exploit multimedia content features in enhancing item embeddings, their model representation capability is limited by heavy label reliance and weak robustness on sparse user behavior data. Inspired by the recent progress of self-supervised learning in alleviating label scarcity issue, we explore deriving self-supervision signals with effectively learning of modality-aware user preference and cross-modal dependencies. To this end, we propose a new Multi-Modal Self-Supervised Learning (MMSSL) method which tackles two key challenges. Specifically, to characterize the inter-dependency between the user-item collaborative view and item multi-modal semantic view, we design a modality-aware interactive structure learning paradigm via adversarial perturbations for data augmentation. In addition, to capture the effects that user's modality-aware interaction pattern would interweave with each other, a cross-modal contrastive learning approach is introduced to jointly preserve the inter-modal semantic commonality and user preference diversity. Experiments on real-world datasets verify the superiority of our method in offering great potential for multimedia recommendation over various state-of-the-art baselines. The implementation is released at: https://github.com/HKUDS/MMSSL.|多模式共享平台(如 TikTok、 Youtube)的在线出现为个性化推荐系统提供了动力,将各种模式(如视觉、文本和声学)纳入潜在用户表示。现有的多模态推荐方法利用多媒体内容特征增强项目嵌入,但其模型表示能力受到严重的标签依赖和对稀疏用户行为数据鲁棒性较差的限制。受近年来自我监督学习在缓解标签稀缺问题上的进展的启发,我们探讨了如何通过有效地学习模式感知的用户偏好和跨模式依赖来获得自我监督信号。为此,我们提出了一种新的多模态自主学习(MMSSL)方法,解决了两个关键的挑战。为了刻画用户项目协作视图和项目多模态语义视图之间的相互依赖关系,我们设计了一个基于模态感知的交互式结构学习范式。此外,为了捕捉用户感知情态的交互模式相互交织的影响,引入了一种跨情态对比学习方法,以共同保持多情态语义共性和用户偏好多样性。在现实世界数据集上的实验验证了该方法的优越性,在各种最先进的基线上为多媒体推荐提供了巨大的潜力。实施 https://github.com/hkuds/mmssl 如下:。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Multi-Modal+Self-Supervised+Learning+for+Recommendation)|0| +|[ApeGNN: Node-Wise Adaptive Aggregation in GNNs for Recommendation](https://doi.org/10.1145/3543507.3583530)|Dan Zhang, Yifan Zhu, Yuxiao Dong, Yuandong Wang, Wenzheng Feng, Evgeny Kharlamov, Jie Tang|Tsinghua University, China; Bosch Center for Artificial Intelligence, Germany|In recent years, graph neural networks (GNNs) have made great progress in recommendation. The core mechanism of GNNs-based recommender system is to iteratively aggregate neighboring information on the user-item interaction graph. 
However, existing GNNs treat users and items equally and cannot distinguish diverse local patterns of each node, which makes them suboptimal in the recommendation scenario. To resolve this challenge, we present a node-wise adaptive graph neural network framework ApeGNN. ApeGNN develops a node-wise adaptive diffusion mechanism for information aggregation, in which each node is enabled to adaptively decide its diffusion weights based on the local structure (e.g., degree). We perform experiments on six widely-used recommendation datasets. The experimental results show that the proposed ApeGNN is superior to the most advanced GNN-based recommender methods (up to 48.94%), demonstrating the effectiveness of node-wise adaptive aggregation.|近年来,图神经网络在推荐方面取得了很大的进展。基于 GNN 的推荐系统的核心机制是在用户-项目交互图上迭代地聚合相邻信息。然而,现有的 GNN 对用户和项目一视同仁,不能区分每个节点的不同本地模式,这使得它们在推荐场景中处于次优状态。为了解决这一问题,我们提出了一种节点自适应图神经网络框架 ApeGNN。ApeGNN 提出了一种基于节点的自适应信息聚合扩散机制,该机制允许每个节点根据局部结构(如度)自适应确定其扩散权重。我们在六个广泛使用的推荐数据集上进行实验。实验结果表明,该算法优于目前最先进的基于 GNN 的推荐方法(48.94%) ,证明了节点自适应聚集的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ApeGNN:+Node-Wise+Adaptive+Aggregation+in+GNNs+for+Recommendation)|0| +|[Multi-Modal Self-Supervised Learning for Recommendation](https://doi.org/10.1145/3543507.3583206)|Wei Wei, Chao Huang, Lianghao Xia, Chuxu Zhang|University of Hong Kong, Hong Kong; Brandeis University, USA; The University of Hong Kong, Hong Kong|The online emergence of multi-modal sharing platforms (e.g., TikTok, YouTube) is powering personalized recommender systems to incorporate various modalities (e.g., visual, textual and acoustic) into the latent user representations. While existing works on multi-modal recommendation exploit multimedia content features in enhancing item embeddings, their model representation capability is limited by heavy label reliance and weak robustness on sparse user behavior data. Inspired by the recent progress of self-supervised learning in alleviating the label scarcity issue, we explore deriving self-supervision signals with effective learning of modality-aware user preference and cross-modal dependencies. To this end, we propose a new Multi-Modal Self-Supervised Learning (MMSSL) method which tackles two key challenges. Specifically, to characterize the inter-dependency between the user-item collaborative view and item multi-modal semantic view, we design a modality-aware interactive structure learning paradigm via adversarial perturbations for data augmentation. In addition, to capture the effects that users' modality-aware interaction patterns would interweave with each other, a cross-modal contrastive learning approach is introduced to jointly preserve the inter-modal semantic commonality and user preference diversity. Experiments on real-world datasets verify the superiority of our method in offering great potential for multimedia recommendation over various state-of-the-art baselines.
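The cross-modal contrastive component described above is commonly instantiated as a symmetric InfoNCE loss between modality-specific item embeddings; below is a minimal sketch with random stand-in features and an assumed temperature, not MMSSL's full objective:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
n_items, dim, tau = 32, 16, 0.2
visual = F.normalize(torch.randn(n_items, dim), dim=1)   # visual item features
textual = F.normalize(torch.randn(n_items, dim), dim=1)  # textual item features

# Cross-modal contrastive (InfoNCE) loss: the two modality views of the same
# item are positives; all other items in the batch act as negatives.
logits = visual @ textual.T / tau
targets = torch.arange(n_items)
loss = (F.cross_entropy(logits, targets) + F.cross_entropy(logits.T, targets)) / 2
print(loss.item())
```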
The implementation is released at: https://github.com/HKUDS/MMSSL.|多模式共享平台(如 TikTok、 Youtube)的在线出现为个性化推荐系统提供了动力,将各种模式(如视觉、文本和声学)纳入潜在用户表示。现有的多模态推荐方法利用多媒体内容特征增强项目嵌入,但其模型表示能力受到严重的标签依赖和对稀疏用户行为数据鲁棒性较差的限制。受近年来自我监督学习在缓解标签稀缺问题上的进展的启发,我们探讨了如何通过有效地学习模式感知的用户偏好和跨模式依赖来获得自我监督信号。为此,我们提出了一种新的多模态自主学习(MMSSL)方法,解决了两个关键的挑战。为了刻画用户项目协作视图和项目多模态语义视图之间的相互依赖关系,我们设计了一个基于模态感知的交互式结构学习范式。此外,为了捕捉用户感知情态的交互模式相互交织的影响,引入了一种跨情态对比学习方法,以共同保持多情态语义共性和用户偏好多样性。在现实世界数据集上的实验验证了该方法的优越性,在各种最先进的基线上为多媒体推荐提供了巨大的潜力。实施 https://github.com/hkuds/mmssl 如下:。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Multi-Modal+Self-Supervised+Learning+for+Recommendation)|0| |[Bootstrap Latent Representations for Multi-modal Recommendation](https://doi.org/10.1145/3543507.3583251)|Xin Zhou, Hongyu Zhou, Yong Liu, Zhiwei Zeng, Chunyan Miao, Pengwei Wang, Yuan You, Feijun Jiang|Alibaba, China; Nanyang Technological University, Singapore|This paper studies the multi-modal recommendation problem, where the item multi-modality information (e.g., images and textual descriptions) is exploited to improve the recommendation accuracy. Besides the user-item interaction graph, existing state-of-the-art methods usually use auxiliary graphs (e.g., user-user or item-item relation graph) to augment the learned representations of users and/or items. These representations are often propagated and aggregated on auxiliary graphs using graph convolutional networks, which can be prohibitively expensive in computation and memory, especially for large graphs. Moreover, existing multi-modal recommendation methods usually leverage randomly sampled negative examples in Bayesian Personalized Ranking (BPR) loss to guide the learning of user/item representations, which increases the computational cost on large graphs and may also bring noisy supervision signals into the training process. To tackle the above issues, we propose a novel self-supervised multi-modal recommendation model, dubbed BM3, which requires neither augmentations from auxiliary graphs nor negative samples. Specifically, BM3 first bootstraps latent contrastive views from the representations of users and items with a simple dropout augmentation. It then jointly optimizes three multi-modal objectives to learn the representations of users and items by reconstructing the user-item interaction graph and aligning modality features under both inter- and intra-modality perspectives. BM3 alleviates both the need for contrasting with negative examples and the complex graph augmentation from an additional target network for contrastive view generation. We show BM3 outperforms prior recommendation models on three datasets with number of nodes ranging from 20K to 200K, while achieving a 2-9X reduction in training time. 
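The negative-free bootstrapping that BM3 describes can be sketched roughly as follows, assuming dropout is the only augmentation and a stop-gradient copy plays the role of the target network; names are illustrative, not the authors' code.

```python
import torch
import torch.nn.functional as F

def bootstrap_alignment_loss(emb, dropout_p=0.3):
    """Make two latent views of the same embeddings via dropout, then align
    each view with a stop-gradient copy of the other, so no negatives or
    auxiliary graphs are needed."""
    v1 = F.dropout(emb, p=dropout_p, training=True)
    v2 = F.dropout(emb, p=dropout_p, training=True)
    v1, v2 = F.normalize(v1, dim=-1), F.normalize(v2, dim=-1)
    # symmetric cosine alignment against detached targets
    return (2 - (v1 * v2.detach()).sum(-1) - (v2 * v1.detach()).sum(-1)).mean()

loss = bootstrap_alignment_loss(torch.randn(8, 64, requires_grad=True))
```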
Our code is available at https://github.com/enoche/BM3.|本文研究了多模态推荐问题,该问题利用项目的多模态信息(如图像和文本描述)来提高推荐的准确性。除了用户-项目交互图,现有的方法通常使用辅助图(例如,用户-用户或项目-项目关系图)来增强用户和/或项目的学习表示。这些表示通常使用图卷积网络在辅助图上进行传播和聚合,这在计算和存储方面是非常昂贵的,特别是对于大图。此外,现有的多模态推荐方法通常利用贝叶斯个性化排序(BPR)损失中随机抽样的负例子来指导用户/项目表示的学习,这增加了大图上的计算成本,也可能使噪声监督信号进入训练过程。为了解决上述问题,我们提出了一种新的自监督多模态推荐模型,称为 BM3,它不需要辅助图和负样本的增广。具体来说,BM3首先通过简单的辍学增强从用户和项目的表示中引导潜在的对比视图。然后,通过重构用户-项目交互图和在情态间和情态内对齐情态特征,联合优化三个多模态目标来学习用户和项目的表征。BM3减轻了与负面例子对比的需要,也减轻了从一个额外的目标网络生成对比视图的复杂图形增强。我们发现 BM3在三个数据集上的节点数从20K 到200K 不等,优于先前的推荐模型,同时实现了2-9倍的训练时间缩短。我们的代码可以在 https://github.com/enoche/bm3找到。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Bootstrap+Latent+Representations+for+Multi-modal+Recommendation)|0| -|[Recommendation with Causality enhanced Natural Language Explanations](https://doi.org/10.1145/3543507.3583260)|Jingsen Zhang, Xu Chen, Jiakai Tang, Weiqi Shao, Quanyu Dai, Zhenhua Dong, Rui Zhang|Huawei Noah's Ark Lab, China; www.ruizhang.info, China; Renmin University of China, China|Explainable recommendation has recently attracted increasing attention from both academic and industry communities. Among different explainable strategies, generating natural language explanations is an important method, which can deliver more informative, flexible and readable explanations to facilitate better user decisions. Despite the effectiveness, existing models are mostly optimized based on the observed datasets, which can be skewed due to the selection or exposure bias. To alleviate this problem, in this paper, we formulate the task of explainable recommendation with a causal graph, and design a causality enhanced framework to generate unbiased explanations. More specifically, we firstly define an ideal unbiased learning objective, and then derive a tractable loss for the observational data based on the inverse propensity score (IPS), where the key is a sample re-weighting strategy for equalizing the loss and ideal objective in expectation. Considering that the IPS estimated from the sparse and noisy recommendation datasets can be inaccurate, we introduce a fault tolerant mechanism by minimizing the maximum loss induced by the sample weights near the IPS. For more comprehensive modeling, we further analyze and infer the potential latent confounders induced by the complex and diverse user personalities. 
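The IPS re-weighting step admits a compact sketch: a generic inverse-propensity-weighted loss, with simple weight clipping standing in for the paper's fault-tolerant min-max mechanism. This is a simplified illustration, not the authors' estimator.

```python
import torch

def ips_weighted_loss(per_example_loss, propensity, clip_min=0.1):
    """Re-weight observed examples by inverse propensity scores so that, in
    expectation, the weighted loss equals the ideal unbiased objective.
    Clipping bounds the weights when estimated propensities are tiny."""
    weights = 1.0 / propensity.clamp(min=clip_min)
    return (weights * per_example_loss).mean()

losses = torch.rand(16)        # per-record explanation-generation losses
propensity = torch.rand(16)    # estimated exposure/selection probabilities
loss = ips_weighted_loss(losses, propensity)
```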
We conduct extensive experiments by comparing with the state-of-the-art methods based on three real-world datasets to demonstrate the effectiveness of our method.|可解释的建议最近引起了学术界和工业界越来越多的关注。在不同的解释策略中,生成自然语言解释是一种重要的方法,它可以提供更多的信息,灵活和可读的解释,以便于更好的用户决策。尽管有效,现有的模型大多是优化的基础上观察数据集,这可能会由于选择或曝光偏差。为了解决这一问题,本文利用因果图构造了可解释推荐任务,并设计了一个因果增强框架来生成无偏解释。更具体地说,我们首先定义一个理想的无偏学习目标,然后推导出一个基于逆倾向评分(IPS)的观测数据易处理的损失,其中的关键是一个样本重新加权策略来均衡损失和期望的理想目标。针对由稀疏和噪声推荐数据集估计的 IPS 可能不准确的问题,我们引入了一种容错机制,使 IPS 附近的样本权重引起的最大损失最小。为了更全面的建模,我们进一步分析和推断潜在的潜在混杂因素引起的复杂和多样的用户个性。为了验证该方法的有效性,我们在三个实际数据集上进行了广泛的实验,并与现有的方法进行了比较。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Recommendation+with+Causality+enhanced+Natural+Language+Explanations)|0| -|[Two-Stage Constrained Actor-Critic for Short Video Recommendation](https://doi.org/10.1145/3543507.3583259)|Qingpeng Cai, Zhenghai Xue, Chi Zhang, Wanqi Xue, Shuchang Liu, Ruohan Zhan, Xueliang Wang, Tianyou Zuo, Wentao Xie, Dong Zheng, Peng Jiang, Kun Gai|Unaffiliated, China; Hong Kong University of Science and Technology, China; Kuaishou Technology, China|The wide popularity of short videos on social media poses new opportunities and challenges to optimize recommender systems on the video-sharing platforms. Users sequentially interact with the system and provide complex and multi-faceted responses, including watch time and various types of interactions with multiple videos. One the one hand, the platforms aims at optimizing the users' cumulative watch time (main goal) in long term, which can be effectively optimized by Reinforcement Learning. On the other hand, the platforms also needs to satisfy the constraint of accommodating the responses of multiple user interactions (auxiliary goals) such like, follow, share etc. In this paper, we formulate the problem of short video recommendation as a Constrained Markov Decision Process (CMDP). We find that traditional constrained reinforcement learning algorithms can not work well in this setting. We propose a novel two-stage constrained actor-critic method: At stage one, we learn individual policies to optimize each auxiliary signal. At stage two, we learn a policy to (i) optimize the main signal and (ii) stay close to policies learned at the first stage, which effectively guarantees the performance of this main policy on the auxiliaries. Through extensive offline evaluations, we demonstrate effectiveness of our method over alternatives in both optimizing the main goal as well as balancing the others. We further show the advantage of our method in live experiments of short video recommendations, where it significantly outperforms other baselines in terms of both watch time and interactions. 
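A rough sketch of the stage-two objective as described: a policy-gradient term on the main signal plus a KL term pulling the main policy toward each stage-one auxiliary policy. All tensors and the trade-off weight beta are hypothetical.

```python
import torch
import torch.nn.functional as F

def stage_two_loss(main_logits, aux_logits_list, actions, advantages, beta=0.1):
    """Policy-gradient term on the main signal plus a KL term that keeps the
    main policy close to each auxiliary policy learned at stage one."""
    log_probs = F.log_softmax(main_logits, dim=-1)
    chosen = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
    pg = -(advantages * chosen).mean()
    kl = sum(F.kl_div(log_probs, F.softmax(a, dim=-1), reduction="batchmean")
             for a in aux_logits_list) / len(aux_logits_list)
    return pg + beta * kl

loss = stage_two_loss(torch.randn(8, 5), [torch.randn(8, 5)] * 2,
                      torch.randint(0, 5, (8,)), torch.randn(8))
```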
Our approach has been fully launched in the production system to optimize user experiences on the platform.|短视频在社交媒体上的广泛流行为优化视频共享平台上的推荐系统带来了新的机遇和挑战。用户按顺序与系统互动,并提供复杂和多方面的反应,包括观看时间和与多个视频的各种类型的互动。一方面,这些平台旨在长期优化用户的累计观看时间(主要目标) ,这可以通过强化学习有效地优化。另一方面,平台还需要满足适应多用户交互(辅助目标)的响应约束,如跟踪、共享等。在这篇文章中,我们将短视频推荐问题描述为一个约束马可夫决策过程(CMDP)。我们发现传统的约束强化学习算法在这种情况下不能很好地工作。我们提出了一种新的两阶段约束行为者-评论方法: 在第一阶段,我们学习个体策略来优化每个辅助信号。在第二阶段,我们学习了一个策略来(i)优化主信号和(ii)紧跟在第一阶段学到的策略,这有效地保证了这个主策略在辅助系统上的性能。通过广泛的离线评估,我们证明了我们的方法在优化主要目标和平衡其他方面的有效性。我们进一步展示了我们的方法在短视频推荐的现场实验中的优势,在观看时间和交互方面显著优于其他基准。我们的方法已经在生产系统中全面推出,以优化平台上的用户体验。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Two-Stage+Constrained+Actor-Critic+for+Short+Video+Recommendation)|0| +|[Recommendation with Causality enhanced Natural Language Explanations](https://doi.org/10.1145/3543507.3583260)|Jingsen Zhang, Xu Chen, Jiakai Tang, Weiqi Shao, Quanyu Dai, Zhenhua Dong, Rui Zhang|www.ruizhang.info, China; Renmin University of China, China; Huawei Noah's Ark Lab, China|Explainable recommendation has recently attracted increasing attention from both academic and industry communities. Among different explainable strategies, generating natural language explanations is an important method, which can deliver more informative, flexible and readable explanations to facilitate better user decisions. Despite the effectiveness, existing models are mostly optimized based on the observed datasets, which can be skewed due to the selection or exposure bias. To alleviate this problem, in this paper, we formulate the task of explainable recommendation with a causal graph, and design a causality enhanced framework to generate unbiased explanations. More specifically, we firstly define an ideal unbiased learning objective, and then derive a tractable loss for the observational data based on the inverse propensity score (IPS), where the key is a sample re-weighting strategy for equalizing the loss and ideal objective in expectation. Considering that the IPS estimated from the sparse and noisy recommendation datasets can be inaccurate, we introduce a fault tolerant mechanism by minimizing the maximum loss induced by the sample weights near the IPS. For more comprehensive modeling, we further analyze and infer the potential latent confounders induced by the complex and diverse user personalities. We conduct extensive experiments by comparing with the state-of-the-art methods based on three real-world datasets to demonstrate the effectiveness of our method.|可解释的建议最近引起了学术界和工业界越来越多的关注。在不同的解释策略中,生成自然语言解释是一种重要的方法,它可以提供更多的信息,灵活和可读的解释,以便于更好的用户决策。尽管有效,现有的模型大多是优化的基础上观察数据集,这可能会由于选择或曝光偏差。为了解决这一问题,本文利用因果图构造了可解释推荐任务,并设计了一个因果增强框架来生成无偏解释。更具体地说,我们首先定义一个理想的无偏学习目标,然后推导出一个基于逆倾向评分(IPS)的观测数据易处理的损失,其中的关键是一个样本重新加权策略来均衡损失和期望的理想目标。针对由稀疏和噪声推荐数据集估计的 IPS 可能不准确的问题,我们引入了一种容错机制,使 IPS 附近的样本权重引起的最大损失最小。为了更全面的建模,我们进一步分析和推断潜在的潜在混杂因素引起的复杂和多样的用户个性。为了验证该方法的有效性,我们在三个实际数据集上进行了广泛的实验,并与现有的方法进行了比较。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Recommendation+with+Causality+enhanced+Natural+Language+Explanations)|0| +|[Two-Stage Constrained Actor-Critic for Short Video Recommendation](https://doi.org/10.1145/3543507.3583259)|Qingpeng Cai, Zhenghai Xue, Chi Zhang, Wanqi Xue, Shuchang Liu, Ruohan Zhan, Xueliang Wang, Tianyou Zuo, Wentao Xie, Dong Zheng, Peng Jiang, Kun Gai|Kuaishou Technology, China; Hong Kong University of Science and Technology, China; Unaffiliated, China|The wide popularity of short videos on social media poses new opportunities and challenges to optimize recommender systems on the video-sharing platforms. 
Users sequentially interact with the system and provide complex and multi-faceted responses, including watch time and various types of interactions with multiple videos. On the one hand, the platforms aim at optimizing the users' cumulative watch time (the main goal) in the long term, which can be effectively optimized by Reinforcement Learning. On the other hand, the platforms also need to satisfy the constraint of accommodating the responses of multiple user interactions (auxiliary goals) such as like, follow, share, etc. In this paper, we formulate the problem of short video recommendation as a Constrained Markov Decision Process (CMDP). We find that traditional constrained reinforcement learning algorithms cannot work well in this setting. We propose a novel two-stage constrained actor-critic method: At stage one, we learn individual policies to optimize each auxiliary signal. At stage two, we learn a policy to (i) optimize the main signal and (ii) stay close to policies learned at the first stage, which effectively guarantees the performance of this main policy on the auxiliaries. Through extensive offline evaluations, we demonstrate the effectiveness of our method over alternatives in both optimizing the main goal and balancing the others. We further show the advantage of our method in live experiments of short video recommendations, where it significantly outperforms other baselines in terms of both watch time and interactions. Our approach has been fully launched in the production system to optimize user experiences on the platform.|短视频在社交媒体上的广泛流行为优化视频共享平台上的推荐系统带来了新的机遇和挑战。用户按顺序与系统互动,并提供复杂和多方面的反应,包括观看时间和与多个视频的各种类型的互动。一方面,这些平台旨在长期优化用户的累计观看时间(主要目标) ,这可以通过强化学习有效地优化。另一方面,平台还需要满足适应多用户交互(辅助目标)的响应约束,如跟踪、共享等。在这篇文章中,我们将短视频推荐问题描述为一个约束马可夫决策过程(CMDP)。我们发现传统的约束强化学习算法在这种情况下不能很好地工作。我们提出了一种新的两阶段约束行为者-评论方法: 在第一阶段,我们学习个体策略来优化每个辅助信号。在第二阶段,我们学习了一个策略来(i)优化主信号和(ii)紧跟在第一阶段学到的策略,这有效地保证了这个主策略在辅助系统上的性能。通过广泛的离线评估,我们证明了我们的方法在优化主要目标和平衡其他方面的有效性。我们进一步展示了我们的方法在短视频推荐的现场实验中的优势,在观看时间和交互方面显著优于其他基准。我们的方法已经在生产系统中全面推出,以优化平台上的用户体验。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Two-Stage+Constrained+Actor-Critic+for+Short+Video+Recommendation)|0| |[Robust Recommendation with Adversarial Gaussian Data Augmentation](https://doi.org/10.1145/3543507.3583273)|Zhenlei Wang, Xu Chen|Gaoling School of Artificial Intelligence, Renmin University of China, China|Recommender systems hold the promise of accurately understanding and estimating user preferences. However, due to the extremely sparse user-item interactions, the learned recommender models can be less robust and sensitive to the highly dynamic user preferences and easily changed recommendation environments. To alleviate this problem, in this paper, we propose a simple yet effective robust recommender framework by generating additional samples from the Gaussian distributions. Specifically, we design two types of data augmentation strategies. For the first one, we directly produce the data based on the original samples, where we simulate the generation process in the latent space. For the second one, we first change the original samples towards the direction of maximizing the loss function, and then produce the data based on the altered samples to make more effective explorations. Based on both of the above strategies, we leverage adversarial training to optimize the recommender model with the generated data which can achieve the largest losses.
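The two augmentation strategies just described can be sketched as follows, with a stand-in loss function and illustrative step sizes; this is not the paper's implementation.

```python
import torch

def adversarial_gaussian_augment(emb, loss_fn, sigma=0.1, step=0.05):
    """Strategy 1: sample around the original latent representations.
    Strategy 2: first shift the representations along the loss-increasing
    gradient direction, then sample around the shifted point."""
    plain = emb + sigma * torch.randn_like(emb)

    emb_adv = emb.clone().detach().requires_grad_(True)
    loss_fn(emb_adv).backward()
    shifted = emb_adv + step * emb_adv.grad.sign()
    adversarial = shifted.detach() + sigma * torch.randn_like(emb)
    return plain, adversarial

emb = torch.randn(8, 32)
loss_fn = lambda e: (e ** 2).mean()          # stand-in recommender loss
aug_plain, aug_adv = adversarial_gaussian_augment(emb, loss_fn)
```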
In addition, we theoretically analyze our framework, and find that the above two data augmentation strategies equal to impose a gradient based regularization on the original recommender models. We conduct extensive experiments based on six real-world datasets to demonstrate the effectiveness of our framework.|推荐系统有希望准确理解和估计用户的偏好。然而,由于用户与项目之间的交互非常稀少,所学习的推荐模型可能对高度动态的用户偏好和容易更改的推荐环境不太健壮和敏感。为了解决这个问题,本文提出了一个简单而有效的鲁棒推荐框架,通过从高斯分布生成额外的样本。具体来说,我们设计了两种类型的数据增强策略。对于第一种方法,我们直接在原始样本的基础上生成数据,模拟潜在空间中的生成过程。第二种方法首先将原始样本向损失函数最大化方向改变,然后根据改变后的样本生成数据,进行更有效的探索。基于上述两种策略,我们利用对抗性训练来优化推荐模型,生成的数据可以达到最大的损失。此外,我们还从理论上分析了我们的框架,发现上述两种数据增强策略相当于在原有的推荐模型上加入了基于梯度的正则化。我们基于六个真实世界的数据集进行了广泛的实验,以证明我们的框架的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Robust+Recommendation+with+Adversarial+Gaussian+Data+Augmentation)|0| |[Anti-FakeU: Defending Shilling Attacks on Graph Neural Network based Recommender Model](https://doi.org/10.1145/3543507.3583289)|Xiaoyu You, Chi Li, Daizong Ding, Mi Zhang, Fuli Feng, Xudong Pan, Min Yang|Fudan University, School of Computer Science, China; University of Science and Technology of China, CCCD Key Lab of Ministry of Culture and Tourism, China|Graph neural network (GNN) based recommendation models are observed to be more vulnerable against carefully-designed malicious records injected into the system, i.e., shilling attacks, which manipulate the recommendation to common users and therefore impair user trust. In this paper, we for the first time conduct a systematic study on the vulnerability of GNN based recommendation model against the shilling attack. With the aid of theoretical analysis, we attribute the root cause of the vulnerability to its neighborhood aggregation mechanism, which could make the negative impact of attacks propagate rapidly in the system. To restore the robustness of GNN based recommendation model, the key factor lies in detecting malicious records in the system and preventing the propagation of misinformation. To this end, we construct a user-user graph to capture the patterns of malicious behaviors and design a novel GNN based detector to identify fake users. Furthermore, we develop a data augmentation strategy and a joint learning paradigm to train the recommender model and the proposed detector. Extensive experiments on benchmark datasets validate the enhanced robustness of the proposed method in resisting various types of shilling attacks and identifying fake users, e.g., our proposed method fully mitigating the impact of popularity attacks on target items up to , and improving the accuracy of detecting fake users on the Gowalla dataset by .|基于图神经网络(GNN)的推荐模型更容易受到注入系统的精心设计的恶意记录(即先令攻击)的攻击,这些恶意记录操纵普通用户的推荐,从而损害用户的信任。本文首次对基于 GNN 的推荐模型在面对先令攻击时的脆弱性进行了系统的研究。在理论分析的基础上,将易受攻击的根本原因归结为其邻域聚合机制,使得攻击的负面影响在系统中迅速传播。要恢复基于 GNN 的推荐模型的鲁棒性,关键在于检测系统中的恶意记录,防止错误信息的传播。为此,我们构造了一个用户-用户图来捕捉恶意行为的模式,并设计了一种新的基于 GNN 的检测器来识别虚假用户。此外,我们发展了一个数据增强策略和一个联合学习范式来训练推荐模型和建议的检测器。基准数据集的大量实验验证了该方法在抵御各种先令攻击和识别假用户方面的增强鲁棒性,例如,我们提出的方法充分减轻了流行攻击对目标项的影响,并通过以下方法提高了 Gowalla 数据集检测假用户的准确性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Anti-FakeU:+Defending+Shilling+Attacks+on+Graph+Neural+Network+based+Recommender+Model)|0| -|[Automated Self-Supervised Learning for Recommendation](https://doi.org/10.1145/3543507.3583336)|Lianghao Xia, Chao Huang, Chunzhen Huang, Kangyi Lin, Tao Yu, Ben Kao|The University of Hong Kong, Hong Kong; Tencent, China|Graph neural networks (GNNs) have emerged as the state-of-the-art paradigm for collaborative filtering (CF). 
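The Anti-FakeU entry above builds a GNN detector over a user-user graph; a minimal sketch of such a detector follows, with illustrative dimensions and a simple mean-aggregation layer rather than the paper's architecture.

```python
import torch
import torch.nn as nn

class FakeUserDetector(nn.Module):
    """One mean-aggregation layer over a user-user behavior graph followed
    by a per-user fake-probability head."""
    def __init__(self, dim=16):
        super().__init__()
        self.lin = nn.Linear(dim, dim)
        self.head = nn.Linear(dim, 1)

    def forward(self, user_feats, adj):
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        h = torch.relu(self.lin(adj @ user_feats / deg))  # aggregate neighbors
        return torch.sigmoid(self.head(h)).squeeze(-1)    # P(user is fake)

adj = (torch.rand(10, 10) < 0.2).float()     # toy user-user graph
probs = FakeUserDetector()(torch.randn(10, 16), adj)
```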
To improve the representation quality over limited labeled data, contrastive learning has attracted attention in recommendation and benefited graph-based CF model recently. However, the success of most contrastive methods heavily relies on manually generating effective contrastive views for heuristic-based data augmentation. This does not generalize across different datasets and downstream recommendation tasks, which is difficult to be adaptive for data augmentation and robust to noise perturbation. To fill this crucial gap, this work proposes a unified Automated Collaborative Filtering (AutoCF) to automatically perform data augmentation for recommendation. Specifically, we focus on the generative self-supervised learning framework with a learnable augmentation paradigm that benefits the automated distillation of important self-supervised signals. To enhance the representation discrimination ability, our masked graph autoencoder is designed to aggregate global information during the augmentation via reconstructing the masked subgraph structures. Experiments and ablation studies are performed on several public datasets for recommending products, venues, and locations. Results demonstrate the superiority of AutoCF against various baseline methods. We release the model implementation at https://github.com/HKUDS/AutoCF.|图形神经网络(GNN)已经成为最先进的协同过滤(CF)模式。为了提高有限标记数据的表示质量,对比学习近年来受到推荐界的关注,并受益于基于图的 CF 模型。然而,大多数对比方法的成功在很大程度上依赖于手工生成有效的对比视图,用于基于启发式的数据增强。这不能在不同的数据集和下游推荐任务之间推广,这对于数据增强和抗噪声干扰是很难自适应的。为了填补这个关键的空白,这项工作提出了一个统一的自动化协同过滤(AutoCF)来自动执行数据增强的推荐。具体来说,我们重点研究了具有可学习增强范式的生成式自监督学习框架,该框架有利于自动提取重要的自监督信号。为了提高表示识别能力,我们设计了掩码自动编码器,通过重构掩码子图结构来聚集增强过程中的全局信息。实验和烧蚀研究进行了几个公共数据集推荐产品,场所和地点。结果表明,AutoCF 方法与各种基线方法相比具有优越性。我们在 https://github.com/hkuds/autocf 发布模型实现。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Automated+Self-Supervised+Learning+for+Recommendation)|0| -|[AutoDenoise: Automatic Data Instance Denoising for Recommendations](https://doi.org/10.1145/3543507.3583339)|Weilin Lin, Xiangyu Zhao, Yejing Wang, Yuanshao Zhu, Wanyu Wang|City University of Hong Kong, Hong Kong and Southern University of Science and Technology, China; City University of Hong Kong, Hong Kong|Historical user-item interaction datasets are essential in training modern recommender systems for predicting user preferences. However, the arbitrary user behaviors in most recommendation scenarios lead to a large volume of noisy data instances being recorded, which cannot fully represent their true interests. While a large number of denoising studies are emerging in the recommender system community, all of them suffer from highly dynamic data distributions. In this paper, we propose a Deep Reinforcement Learning (DRL) based framework, AutoDenoise, with an Instance Denoising Policy Network, for denoising data instances with an instance selection manner in deep recommender systems. To be specific, AutoDenoise serves as an agent in DRL to adaptively select noise-free and predictive data instances, which can then be utilized directly in training representative recommendation models. In addition, we design an alternate two-phase optimization strategy to train and validate the AutoDenoise properly. In the searching phase, we aim to train the policy network with the capacity of instance denoising; in the validation phase, we find out and evaluate the denoised subset of data instances selected by the trained policy network, so as to validate its denoising ability. 
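For the AutoCF entry above, the core of a masked graph autoencoder can be sketched as masking observed user-item edges and training the embeddings to reconstruct them; a full implementation would also score negative edges. Names and shapes are illustrative.

```python
import torch
import torch.nn.functional as F

def masked_edge_reconstruction(user_emb, item_emb, edges, mask_ratio=0.3):
    """Hold out a random subset of observed user-item edges and train the
    embeddings to reconstruct (score highly) the held-out edges."""
    users, items = edges
    masked = torch.rand(users.numel()) < mask_ratio      # edges to reconstruct
    scores = (user_emb[users[masked]] * item_emb[items[masked]]).sum(-1)
    return F.binary_cross_entropy_with_logits(scores, torch.ones_like(scores))

user_emb = torch.randn(20, 16, requires_grad=True)
item_emb = torch.randn(30, 16, requires_grad=True)
edges = (torch.randint(0, 20, (50,)), torch.randint(0, 30, (50,)))
loss = masked_edge_reconstruction(user_emb, item_emb, edges)
```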
We conduct extensive experiments to validate the effectiveness of AutoDenoise combined with multiple representative recommender system models.|历史用户项目交互数据集对于培训现代推荐系统来预测用户偏好是必不可少的。然而,在大多数推荐场景中,任意的用户行为会导致大量有噪声的数据实例被记录下来,而这些数据实例并不能完全代表用户的真实兴趣。虽然在推荐系统社区中出现了大量的去噪研究,但所有这些研究都受到高度动态数据分布的影响。在这篇文章中,我们提出了一个基于深度强化学习的框架,AutoDenoise,和一个实例去噪策略网络,用于在深度推荐系统中用实例选择的方式去除数据实例。具体来说,自动去噪作为 DRL 中的一个代理,自适应地选择无噪声和预测数据实例,然后可以直接用于训练代表性的推荐模型。此外,我们还设计了一个交替的两阶段优化策略来训练和验证自动去噪的正确性。在搜索阶段,我们的目标是训练具有实例去噪能力的策略网络,在验证阶段,我们找出并评估训练后的策略网络选择的数据实例的去噪子集,以验证其去噪能力。我们进行了广泛的实验,以验证自动去噪与多个具有代表性的推荐系统模型相结合的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=AutoDenoise:+Automatic+Data+Instance+Denoising+for+Recommendations)|0| -|[AutoS2AE: Automate to Regularize Sparse Shallow Autoencoders for Recommendation](https://doi.org/10.1145/3543507.3583349)|Rui Fan, Yuanhao Pu, Jin Chen, Zhihao Zhu, Defu Lian, Enhong Chen|School of Data Science, University of Science and Technology of China, China; School of Computer Science, School of Data Science, University of Science and Technology of China, China and State Key Laboratory of Cognitive Intelligence, China; University of Electronic Science and Technology of China, China; School of Computer Science, University of Science and Technology of China, China|The Embarrassingly Shallow Autoencoders (EASE and SLIM) are strong recommendation methods based on implicit feedback, compared to competing methods like iALS and VAE-CF. However, EASE suffers from several major shortcomings. First, the training and inference of EASE can not scale with the increasing number of items since it requires storing and inverting a large dense matrix; Second, though its optimization objective – the square loss– can yield a closed-form solution, it is not consistent with recommendation goal – predicting a personalized ranking on a set of items, so that its performance is far from optimal w.r.t ranking-oriented recommendation metrics. Finally, the regularization coefficients are sensitive w.r.t recommendation accuracy and vary a lot across different datasets, so the fine-tuning of these parameters is important yet time-consuming. To improve training and inference efficiency, we propose a Similarity-Structure Aware Shallow Autoencoder on top of three similarity structures, including Co-Occurrence, KNN and NSW. We then optimize the model with a weighted square loss, which is proven effective for ranking-based recommendation but still capable of deriving closed-form solutions. However, the weight in the loss can not be learned in the training set and is similarly sensitive w.r.t the accuracy to regularization coefficients. To automatically tune the hyperparameters, we design two validation losses on the validation set for guidance, and update the hyperparameters with the gradient of the validation losses. 
We finally evaluate the proposed method on multiple real-world datasets and show that it outperforms seven competing baselines remarkably, and verify the effectiveness of each part in the proposed method.|令人尴尬的浅层自动编码器(EASE 和 SLIM)是基于隐式反馈的强有力的推荐方法,与 iALS 和 VAE-CF 等竞争方法相比。然而,EASE 有几个主要的缺点。首先,EASE 的训练和推理不能随着项目数量的增加而扩展,因为它需要存储和反演一个大的密集矩阵; 其次,虽然它的优化目标-平方损失-可以产生一个封闭形式的解决方案,但它不符合推荐目标-预测一组项目的个性化排名,因此它的性能远远不是最优的面向网络排名的推荐指标。最后,正则化系数对推荐精度非常敏感,并且在不同的数据集上有很大的差异,因此对这些参数进行微调非常重要,但也非常耗时。为了提高训练和推理效率,本文提出了一种基于共现、 KNN 和 NSW 三种相似结构的相似结构感知浅层自动编码器。然后,我们用加权平方损失优化模型,这被证明是有效的排名为基础的推荐,但仍然能够导出闭合形式的解决方案。然而,损失中的权重不能在训练集中学习,并且对正则化系数的精度同样敏感。为了自动调整超参数,我们在验证集上设计了两个验证损失作为指导,并用验证损失的梯度来更新超参数。最后,我们对该方法在多个实际数据集上的性能进行了评估,结果表明该方法明显优于7个竞争基线,并验证了该方法各部分的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=AutoS2AE:+Automate+to+Regularize+Sparse+Shallow+Autoencoders+for+Recommendation)|0| -|[Improving Recommendation Fairness via Data Augmentation](https://doi.org/10.1145/3543507.3583341)|Lei Chen, Le Wu, Kun Zhang, Richang Hong, Defu Lian, Zhiqiang Zhang, Jun Zhou, Meng Wang|Hefei University of Technology, China and Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, China; Ant Group, China; Hefei University of Technology, China; University of Science and Technology of China, China|Collaborative filtering based recommendation learns users' preferences from all users' historical behavior data, and has been popular to facilitate decision making. R Recently, the fairness issue of recommendation has become more and more essential. A recommender system is considered unfair when it does not perform equally well for different user groups according to users' sensitive attributes~(e.g., gender, race). Plenty of methods have been proposed to alleviate unfairness by optimizing a predefined fairness goal or changing the distribution of unbalanced training data. However, they either suffered from the specific fairness optimization metrics or relied on redesigning the current recommendation architecture. In this paper, we study how to improve recommendation fairness from the data augmentation perspective. The recommendation model amplifies the inherent unfairness of imbalanced training data. We augment imbalanced training data towards balanced data distribution to improve fairness. The proposed framework is generally applicable to any embedding-based recommendation, and does not need to pre-define a fairness metric. Extensive experiments on two real-world datasets clearly demonstrate the superiority of our proposed framework. 
We publish the source code at https://github.com/newlei/FDA.|基于协同过滤的推荐从所有用户的历史行为数据中了解用户的偏好,并且已经流行起来以促进决策制定。近年来,推荐的公平性问题变得越来越重要。根据用户的敏感属性 ~ (如性别、种族) ,一个推荐系统在不同用户组中的表现不尽相同,这被认为是不公平的。通过优化预定义的公平目标或改变不平衡训练数据的分布,已经提出了许多缓解不公平现象的方法。然而,它们要么受到特定公平性优化指标的影响,要么依赖于重新设计当前的推荐体系结构。本文从数据增强的角度研究如何提高推荐公平性。推荐模型放大了不平衡训练数据固有的不公平性。为了提高训练数据的公平性,我们对不平衡的训练数据进行扩充以达到平衡的数据分布。提出的框架通常适用于任何基于嵌入的建议,并且不需要预先定义公平性度量。在两个实际数据集上的大量实验清楚地表明了我们提出的框架的优越性。我们在 https://github.com/newlei/fda 公布源代码。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Improving+Recommendation+Fairness+via+Data+Augmentation)|0| -|[Robust Preference-Guided Denoising for Graph based Social Recommendation](https://doi.org/10.1145/3543507.3583374)|Yuhan Quan, Jingtao Ding, Chen Gao, Lingling Yi, Depeng Jin, Yong Li|Tencent, China; Tsinghua University, China|Graph Neural Network(GNN) based social recommendation models improve the prediction accuracy of user preference by leveraging GNN in exploiting preference similarity contained in social relations. However, in terms of both effectiveness and efficiency of recommendation, a large portion of social relations can be redundant or even noisy, e.g., it is quite normal that friends share no preference in a certain domain. Existing models do not fully solve this problem of relation redundancy and noise, as they directly characterize social influence over the full social network. In this paper, we instead propose to improve graph based social recommendation by only retaining the informative social relations to ensure an efficient and effective influence diffusion, i.e., graph denoising. Our designed denoising method is preference-guided to model social relation confidence and benefits user preference learning in return by providing a denoised but more informative social graph for recommendation models. Moreover, to avoid interference of noisy social relations, it designs a self-correcting curriculum learning module and an adaptive denoising strategy, both favoring highly-confident samples. Experimental results on three public datasets demonstrate its consistent capability of improving two state-of-the-art social recommendation models by robustly removing 10-40% of original relations. We release the source code at https://github.com/tsinghua-fib-lab/Graph-Denoising-SocialRec.|基于图神经网络(GNN)的社会推荐模型通过利用社会关系中包含的偏好相似性来提高用户偏好的预测精度。然而,就推荐的有效性和效率而言,很大一部分社会关系可能是冗余的,甚至是嘈杂的,例如,朋友在某个领域没有共同的偏好是很正常的。现有的模型并没有完全解决关系冗余和噪声的问题,因为它们直接表征了社会对整个社会网络的影响。在本文中,我们提出改进基于图的社会推荐,只保留信息性的社会关系,以确保有效和有效的影响扩散,即图去噪。我们设计的去噪方法是偏好引导的社会关系模型的信心和有益的用户偏好学习的回报,提供了一个去噪,但更多的信息社会图的推荐模型。同时,为了避免社会关系噪声的干扰,设计了自校正课程学习模块和自适应去噪策略,两者都有利于高自信样本。在三个公共数据集上的实验结果表明,该算法能够通过鲁棒地去除10-40% 的原始关系来改进两个最新的社会推荐模型。我们在 https://github.com/tsinghua-fib-lab/graph-denoising-socialrec 公布源代码。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Robust+Preference-Guided+Denoising+for+Graph+based+Social+Recommendation)|0| +|[Automated Self-Supervised Learning for Recommendation](https://doi.org/10.1145/3543507.3583336)|Lianghao Xia, Chao Huang, Chunzhen Huang, Kangyi Lin, Tao Yu, Ben Kao|Tencent, China; The University of Hong Kong, Hong Kong|Graph neural networks (GNNs) have emerged as the state-of-the-art paradigm for collaborative filtering (CF). To improve the representation quality over limited labeled data, contrastive learning has attracted attention in recommendation and benefited graph-based CF model recently. However, the success of most contrastive methods heavily relies on manually generating effective contrastive views for heuristic-based data augmentation. 
This does not generalize across different datasets and downstream recommendation tasks, and such hand-crafted augmentation is difficult to make adaptive to the data and robust to noise perturbation. To fill this crucial gap, this work proposes a unified Automated Collaborative Filtering (AutoCF) to automatically perform data augmentation for recommendation. Specifically, we focus on the generative self-supervised learning framework with a learnable augmentation paradigm that benefits the automated distillation of important self-supervised signals. To enhance the representation discrimination ability, our masked graph autoencoder is designed to aggregate global information during the augmentation via reconstructing the masked subgraph structures. Experiments and ablation studies are performed on several public datasets for recommending products, venues, and locations. Results demonstrate the superiority of AutoCF against various baseline methods. We release the model implementation at https://github.com/HKUDS/AutoCF.|图形神经网络(GNN)已经成为最先进的协同过滤(CF)模式。为了提高有限标记数据的表示质量,对比学习近年来受到推荐界的关注,并受益于基于图的 CF 模型。然而,大多数对比方法的成功在很大程度上依赖于手工生成有效的对比视图,用于基于启发式的数据增强。这不能在不同的数据集和下游推荐任务之间推广,这对于数据增强和抗噪声干扰是很难自适应的。为了填补这个关键的空白,这项工作提出了一个统一的自动化协同过滤(AutoCF)来自动执行数据增强的推荐。具体来说,我们重点研究了具有可学习增强范式的生成式自监督学习框架,该框架有利于自动提取重要的自监督信号。为了提高表示识别能力,我们设计了掩码自动编码器,通过重构掩码子图结构来聚集增强过程中的全局信息。实验和烧蚀研究进行了几个公共数据集推荐产品,场所和地点。结果表明,AutoCF 方法与各种基线方法相比具有优越性。我们在 https://github.com/hkuds/autocf 发布模型实现。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Automated+Self-Supervised+Learning+for+Recommendation)|0| +|[AutoDenoise: Automatic Data Instance Denoising for Recommendations](https://doi.org/10.1145/3543507.3583339)|Weilin Lin, Xiangyu Zhao, Yejing Wang, Yuanshao Zhu, Wanyu Wang|City University of Hong Kong, Hong Kong; City University of Hong Kong, Hong Kong and Southern University of Science and Technology, China|Historical user-item interaction datasets are essential in training modern recommender systems for predicting user preferences. However, the arbitrary user behaviors in most recommendation scenarios lead to a large volume of noisy data instances being recorded, which cannot fully represent their true interests. While a large number of denoising studies are emerging in the recommender system community, all of them suffer from highly dynamic data distributions. In this paper, we propose a Deep Reinforcement Learning (DRL) based framework, AutoDenoise, with an Instance Denoising Policy Network, for denoising data instances in an instance selection manner in deep recommender systems. To be specific, AutoDenoise serves as an agent in DRL to adaptively select noise-free and predictive data instances, which can then be utilized directly in training representative recommendation models. In addition, we design an alternate two-phase optimization strategy to train and validate AutoDenoise properly. In the searching phase, we aim to train the policy network with the capacity of instance denoising; in the validation phase, we find out and evaluate the denoised subset of data instances selected by the trained policy network, so as to validate its denoising ability.
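A REINFORCE-style sketch of an instance-denoising policy in the spirit of the AutoDenoise description, with hypothetical per-instance features and a stand-in validation reward; the paper's actual two-phase DRL procedure is more involved.

```python
import torch
import torch.nn as nn

class InstanceDenoisePolicy(nn.Module):
    """Scores each training instance and samples a keep/drop action."""
    def __init__(self, feat_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, feats):
        keep_prob = torch.sigmoid(self.net(feats)).squeeze(-1)
        action = torch.bernoulli(keep_prob)             # 1 = keep, 0 = drop
        log_prob = torch.log(torch.where(action > 0, keep_prob, 1 - keep_prob))
        return action, log_prob

policy = InstanceDenoisePolicy(feat_dim=16)
action, log_prob = policy(torch.randn(32, 16))          # per-instance features
reward = torch.rand(())                                 # e.g., validation recall
reinforce_loss = -(reward * log_prob.sum())             # REINFORCE update signal
```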
We conduct extensive experiments to validate the effectiveness of AutoDenoise combined with multiple representative recommender system models.|历史用户项目交互数据集对于培训现代推荐系统来预测用户偏好是必不可少的。然而,在大多数推荐场景中,任意的用户行为会导致大量有噪声的数据实例被记录下来,而这些数据实例并不能完全代表用户的真实兴趣。虽然在推荐系统社区中出现了大量的去噪研究,但所有这些研究都受到高度动态数据分布的影响。在这篇文章中,我们提出了一个基于深度强化学习的框架,AutoDenoise,和一个实例去噪策略网络,用于在深度推荐系统中用实例选择的方式去除数据实例。具体来说,自动去噪作为 DRL 中的一个代理,自适应地选择无噪声和预测数据实例,然后可以直接用于训练代表性的推荐模型。此外,我们还设计了一个交替的两阶段优化策略来训练和验证自动去噪的正确性。在搜索阶段,我们的目标是训练具有实例去噪能力的策略网络,在验证阶段,我们找出并评估训练后的策略网络选择的数据实例的去噪子集,以验证其去噪能力。我们进行了广泛的实验,以验证自动去噪与多个具有代表性的推荐系统模型相结合的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=AutoDenoise:+Automatic+Data+Instance+Denoising+for+Recommendations)|0| +|[AutoS2AE: Automate to Regularize Sparse Shallow Autoencoders for Recommendation](https://doi.org/10.1145/3543507.3583349)|Rui Fan, Yuanhao Pu, Jin Chen, Zhihao Zhu, Defu Lian, Enhong Chen|University of Electronic Science and Technology of China, China; School of Data Science, University of Science and Technology of China, China; School of Computer Science, School of Data Science, University of Science and Technology of China, China and State Key Laboratory of Cognitive Intelligence, China; School of Computer Science, University of Science and Technology of China, China|The Embarrassingly Shallow Autoencoders (EASE and SLIM) are strong recommendation methods based on implicit feedback, compared to competing methods like iALS and VAE-CF. However, EASE suffers from several major shortcomings. First, the training and inference of EASE cannot scale with the increasing number of items since it requires storing and inverting a large dense matrix. Second, though its optimization objective – the square loss – can yield a closed-form solution, it is not consistent with the recommendation goal – predicting a personalized ranking over a set of items – so its performance is far from optimal w.r.t. ranking-oriented recommendation metrics. Finally, the regularization coefficients are sensitive w.r.t. recommendation accuracy and vary a lot across different datasets, so the fine-tuning of these parameters is important yet time-consuming. To improve training and inference efficiency, we propose a Similarity-Structure Aware Shallow Autoencoder on top of three similarity structures, including Co-Occurrence, KNN and NSW. We then optimize the model with a weighted square loss, which is proven effective for ranking-based recommendation but still capable of deriving closed-form solutions. However, the weight in the loss cannot be learned from the training set, and recommendation accuracy is similarly sensitive to it as to the regularization coefficients. To automatically tune the hyperparameters, we design two validation losses on the validation set for guidance, and update the hyperparameters with the gradient of the validation losses.
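For context on the EASE baseline criticized above, its closed-form solution is short enough to write out; the dense matrix inverse below is exactly the scalability bottleneck the abstract points at, which AutoS2AE sidesteps with sparse similarity structures. This is a standard rendering of the published formula, not the AutoS2AE method itself.

```python
import numpy as np

def ease_closed_form(X, lam=100.0):
    """Closed-form EASE solution (Steck, 2019): minimize
    ||X - X B||^2 + lam * ||B||^2 subject to diag(B) = 0."""
    G = X.T @ X + lam * np.eye(X.shape[1])   # regularized item Gram matrix
    P = np.linalg.inv(G)                     # the dense inverse that limits scaling
    B = -P / np.diag(P)                      # Lagrangian solution
    np.fill_diagonal(B, 0.0)                 # enforce the zero-diagonal constraint
    return B

X = (np.random.rand(50, 20) < 0.1).astype(float)   # toy implicit-feedback matrix
scores = X @ ease_closed_form(X)                   # per-user item scores
```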
We finally evaluate the proposed method on multiple real-world datasets and show that it remarkably outperforms seven competing baselines, and we verify the effectiveness of each part of the proposed method.|令人尴尬的浅层自动编码器(EASE 和 SLIM)是基于隐式反馈的强有力的推荐方法,与 iALS 和 VAE-CF 等竞争方法相比。然而,EASE 有几个主要的缺点。首先,EASE 的训练和推理不能随着项目数量的增加而扩展,因为它需要存储和反演一个大的密集矩阵; 其次,虽然它的优化目标-平方损失-可以产生一个封闭形式的解决方案,但它不符合推荐目标-预测一组项目的个性化排名,因此它的性能远远不是最优的面向网络排名的推荐指标。最后,正则化系数对推荐精度非常敏感,并且在不同的数据集上有很大的差异,因此对这些参数进行微调非常重要,但也非常耗时。为了提高训练和推理效率,本文提出了一种基于共现、 KNN 和 NSW 三种相似结构的相似结构感知浅层自动编码器。然后,我们用加权平方损失优化模型,这被证明是有效的排名为基础的推荐,但仍然能够导出闭合形式的解决方案。然而,损失中的权重不能在训练集中学习,并且对正则化系数的精度同样敏感。为了自动调整超参数,我们在验证集上设计了两个验证损失作为指导,并用验证损失的梯度来更新超参数。最后,我们对该方法在多个实际数据集上的性能进行了评估,结果表明该方法明显优于7个竞争基线,并验证了该方法各部分的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=AutoS2AE:+Automate+to+Regularize+Sparse+Shallow+Autoencoders+for+Recommendation)|0| +|[Improving Recommendation Fairness via Data Augmentation](https://doi.org/10.1145/3543507.3583341)|Lei Chen, Le Wu, Kun Zhang, Richang Hong, Defu Lian, Zhiqiang Zhang, Jun Zhou, Meng Wang|Hefei University of Technology, China; Hefei University of Technology, China and Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, China; Ant Group, China; University of Science and Technology of China, China|Collaborative filtering based recommendation learns users' preferences from all users' historical behavior data, and has been popular to facilitate decision making. Recently, the fairness issue of recommendation has become more and more essential. A recommender system is considered unfair when it does not perform equally well for different user groups according to users' sensitive attributes (e.g., gender, race). Plenty of methods have been proposed to alleviate unfairness by optimizing a predefined fairness goal or changing the distribution of unbalanced training data. However, they either suffer from the specific fairness optimization metrics or rely on redesigning the current recommendation architecture. In this paper, we study how to improve recommendation fairness from the data augmentation perspective. The recommendation model amplifies the inherent unfairness of imbalanced training data. We augment imbalanced training data towards a balanced data distribution to improve fairness. The proposed framework is generally applicable to any embedding-based recommendation, and does not need to pre-define a fairness metric. Extensive experiments on two real-world datasets clearly demonstrate the superiority of our proposed framework.
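The "augment toward a balanced distribution" idea can be illustrated with simple record-level oversampling; note the paper augments in embedding space, so this sketch only conveys the rebalancing intuition, with made-up group labels.

```python
import numpy as np

def rebalance_by_group(records, groups, seed=0):
    """Oversample records of under-represented sensitive groups until every
    group contributes (roughly) the same number of training records."""
    rng = np.random.default_rng(seed)
    records, groups = np.asarray(records), np.asarray(groups)
    target = max((groups == g).sum() for g in np.unique(groups))
    extra = [rng.choice(np.flatnonzero(groups == g),
                        size=target - (groups == g).sum(), replace=True)
             for g in np.unique(groups)]
    idx = np.concatenate([np.arange(len(groups))] + extra)
    return records[idx], groups[idx]

recs = np.arange(10)                      # stand-in interaction records
grp = np.array([0] * 8 + [1] * 2)         # sensitive attribute (e.g., gender)
bal_recs, bal_grp = rebalance_by_group(recs, grp)
```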
We publish the source code at https://github.com/newlei/FDA.|基于协同过滤的推荐从所有用户的历史行为数据中了解用户的偏好,并且已经流行起来以促进决策制定。近年来,推荐的公平性问题变得越来越重要。根据用户的敏感属性 ~ (如性别、种族) ,一个推荐系统在不同用户组中的表现不尽相同,这被认为是不公平的。通过优化预定义的公平目标或改变不平衡训练数据的分布,已经提出了许多缓解不公平现象的方法。然而,它们要么受到特定公平性优化指标的影响,要么依赖于重新设计当前的推荐体系结构。本文从数据增强的角度研究如何提高推荐公平性。推荐模型放大了不平衡训练数据固有的不公平性。为了提高训练数据的公平性,我们对不平衡的训练数据进行扩充以达到平衡的数据分布。提出的框架通常适用于任何基于嵌入的建议,并且不需要预先定义公平性度量。在两个实际数据集上的大量实验清楚地表明了我们提出的框架的优越性。我们在 https://github.com/newlei/fda 公布源代码。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Improving+Recommendation+Fairness+via+Data+Augmentation)|0| +|[Robust Preference-Guided Denoising for Graph based Social Recommendation](https://doi.org/10.1145/3543507.3583374)|Yuhan Quan, Jingtao Ding, Chen Gao, Lingling Yi, Depeng Jin, Yong Li|Tsinghua University, China; Tencent, China|Graph Neural Network(GNN) based social recommendation models improve the prediction accuracy of user preference by leveraging GNN in exploiting preference similarity contained in social relations. However, in terms of both effectiveness and efficiency of recommendation, a large portion of social relations can be redundant or even noisy, e.g., it is quite normal that friends share no preference in a certain domain. Existing models do not fully solve this problem of relation redundancy and noise, as they directly characterize social influence over the full social network. In this paper, we instead propose to improve graph based social recommendation by only retaining the informative social relations to ensure an efficient and effective influence diffusion, i.e., graph denoising. Our designed denoising method is preference-guided to model social relation confidence and benefits user preference learning in return by providing a denoised but more informative social graph for recommendation models. Moreover, to avoid interference of noisy social relations, it designs a self-correcting curriculum learning module and an adaptive denoising strategy, both favoring highly-confident samples. Experimental results on three public datasets demonstrate its consistent capability of improving two state-of-the-art social recommendation models by robustly removing 10-40% of original relations. We release the source code at https://github.com/tsinghua-fib-lab/Graph-Denoising-SocialRec.|基于图神经网络(GNN)的社会推荐模型通过利用社会关系中包含的偏好相似性来提高用户偏好的预测精度。然而,就推荐的有效性和效率而言,很大一部分社会关系可能是冗余的,甚至是嘈杂的,例如,朋友在某个领域没有共同的偏好是很正常的。现有的模型并没有完全解决关系冗余和噪声的问题,因为它们直接表征了社会对整个社会网络的影响。在本文中,我们提出改进基于图的社会推荐,只保留信息性的社会关系,以确保有效和有效的影响扩散,即图去噪。我们设计的去噪方法是偏好引导的社会关系模型的信心和有益的用户偏好学习的回报,提供了一个去噪,但更多的信息社会图的推荐模型。同时,为了避免社会关系噪声的干扰,设计了自校正课程学习模块和自适应去噪策略,两者都有利于高自信样本。在三个公共数据集上的实验结果表明,该算法能够通过鲁棒地去除10-40% 的原始关系来改进两个最新的社会推荐模型。我们在 https://github.com/tsinghua-fib-lab/graph-denoising-socialrec 公布源代码。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Robust+Preference-Guided+Denoising+for+Graph+based+Social+Recommendation)|0| |[Few-shot News Recommendation via Cross-lingual Transfer](https://doi.org/10.1145/3543507.3583383)|Taicheng Guo, Lu Yu, Basem Shihada, Xiangliang Zhang||The cold-start problem has been commonly recognized in recommendation systems and studied by following a general idea to leverage the abundant interaction records of warm users to infer the preference of cold users. However, the performance of these solutions is limited by the amount of records available from warm users to use. Thus, building a recommendation system based on few interaction records from a few users still remains a challenging problem for unpopular or early-stage recommendation platforms. 
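The Robust Preference-Guided Denoising row above keeps only confident social edges; a minimal sketch of preference-guided edge filtering follows, with cosine similarity as the confidence score and a fixed keep ratio in place of the paper's curriculum and adaptive strategy.

```python
import torch
import torch.nn.functional as F

def denoise_social_edges(user_emb, edges, keep_ratio=0.7):
    """Score each social edge by the preference similarity of its endpoint
    users and keep only the most confident fraction."""
    src, dst = edges
    sim = F.cosine_similarity(user_emb[src], user_emb[dst], dim=-1)
    k = max(1, int(keep_ratio * sim.numel()))
    keep = sim.topk(k).indices
    return src[keep], dst[keep]

user_emb = torch.randn(10, 16)
edges = (torch.randint(0, 10, (40,)), torch.randint(0, 10, (40,)))
clean_src, clean_dst = denoise_social_edges(user_emb, edges)
```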
This paper focuses on solving the few-shot recommendation problem for news recommendation based on two observations. First, news at different platforms (even in different languages) may share similar topics. Second, the user preference over these topics is transferable across different platforms. Therefore, we propose to solve the few-shot news recommendation problem by transferring the user-news preference from a many-shot source domain to a few-shot target domain. To bridge two domains that are even in different languages and without any overlapping users and news, we propose a novel unsupervised cross-lingual transfer model as the news encoder that aligns semantically similar news in two domains. A user encoder is constructed on top of the aligned news encoding and transfers the user preference from the source to the target domain. Experimental results on two real-world news recommendation datasets show the superior performance of our proposed method on addressing few-shot news recommendation, compared to the baselines. The source code can be found at https://github.com/taichengguo/Few-shot-NewsRec.|冷启动问题在推荐系统中已经得到了广泛的认可,并且通过利用热用户丰富的交互记录来推断冷用户的偏好这一基本思想进行了研究。但是,这些解决方案的性能受限于可供暖用户使用的记录数量。因此,对于不受欢迎或处于早期阶段的推荐平台来说,建立一个基于少量用户交互记录的推荐系统仍然是一个具有挑战性的问题。本文主要研究基于两个观察值的新闻推荐中的少镜头推荐问题。首先,不同平台的新闻(甚至是不同语言的新闻)可能会有相似的话题。其次,用户对这些主题的偏好可以跨不同的平台传递。因此,我们提出通过将用户新闻偏好从多镜头源域转移到少镜头目标域来解决少镜头新闻推荐问题。为了在两个不同语言的领域之间架起一座桥梁,并且没有任何重叠的用户和新闻,我们提出了一种新的无监督跨语言传输模型作为新闻编码器,它将两个领域中语义相似的新闻进行对齐。用户编码器构造在对齐的新闻编码之上,并将用户首选项从源传输到目标域。在两个实际新闻推荐数据集上的实验结果表明,与基线相比,本文提出的方法在处理少镜头新闻推荐方面具有更好的性能。源代码可以在 https://github.com/taichengguo/few-shot-newsrec 找到。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Few-shot+News+Recommendation+via+Cross-lingual+Transfer)|0| -|[Show Me The Best Outfit for A Certain Scene: A Scene-aware Fashion Recommender System](https://doi.org/10.1145/3543507.3583435)|Tangwei Ye, Liang Hu, Qi Zhang, Zhong Yuan Lai, Usman Naseem, Dora D. Liu|DeepBlue Academy of Sciences, China; University of Technology Sydney, Australia and DeepBlue Academy of Sciences, China; DeepBlue Academy of Sciences, China and BirenTech Research, China; University of Sydney, Australia; Tongji University, China and DeepBlue Academy of Sciences, China|Fashion recommendation (FR) has received increasing attention in the research of new types of recommender systems. Existing fashion recommender systems (FRSs) typically focus on clothing item suggestions for users in three scenarios: 1) how to best recommend fashion items preferred by users; 2) how to best compose a complete outfit, and 3) how to best complete a clothing ensemble. However, current FRSs often overlook an important aspect when making FR, that is, the compatibility of the clothing item or outfit recommendations is highly dependent on the scene context. To this end, we propose the scene-aware fashion recommender system (SAFRS), which uncovers a hitherto unexplored avenue where scene information is taken into account when constructing the FR model. More specifically, our SAFRS addresses this problem by encoding scene and outfit information in separation attention encoders and then fusing the resulting feature embeddings via a novel scene-aware compatibility score function.
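A toy sketch of the scene-aware compatibility scoring that the SAFRS abstract describes, with separate attention encoders for scene and outfit features and a bilinear fusion; all dimensions and the fusion choice are assumptions, not the paper's design.

```python
import torch
import torch.nn as nn

class SceneAwareCompatibility(nn.Module):
    """Encode scene and outfit features with separate self-attention encoders,
    then fuse the pooled embeddings into a compatibility score."""
    def __init__(self, dim=32, heads=4):
        super().__init__()
        self.scene_att = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.outfit_att = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.score = nn.Bilinear(dim, dim, 1)

    def forward(self, scene_feats, outfit_feats):
        s, _ = self.scene_att(scene_feats, scene_feats, scene_feats)
        o, _ = self.outfit_att(outfit_feats, outfit_feats, outfit_feats)
        return self.score(s.mean(dim=1), o.mean(dim=1)).squeeze(-1)

model = SceneAwareCompatibility()
score = model(torch.randn(2, 5, 32), torch.randn(2, 4, 32))  # one score per pair
```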
Extensive qualitative and quantitative experiments are conducted to show that our SAFRS model outperforms all baselines for every evaluated metric.|时尚推荐(FR)在新型推荐系统的研究中受到越来越多的关注。现有的时尚推荐系统(FRSs)主要集中在三个场景中为用户提供服装项目建议: 1)如何最好地推荐用户喜欢的时尚项目; 2)如何最好地组合一套完整的服装; 3)如何最好地完成一套服装。然而,目前的 FRS 在制作 FR 时往往忽略了一个重要方面,即服装项目或服装的兼容性建议高度依赖于场景上下文。为此,我们提出了场景感知时尚推荐系统(SAFRS) ,它揭示了一个迄今为止尚未探索的途径,在构建 FR 模型时,场景信息被考虑在内。更具体地说,我们的 SAFRS 通过在分离注意力编码器中编码场景和装备信息,然后通过一种新颖的场景感知兼容性评分函数融合所得到的特征嵌入来解决这个问题。广泛的定性和定量实验表明,我们的 SAFRS 模型优于所有基线的每一个评估指标。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Show+Me+The+Best+Outfit+for+A+Certain+Scene:+A+Scene-aware+Fashion+Recommender+System)|0| -|[Invariant Collaborative Filtering to Popularity Distribution Shift](https://doi.org/10.1145/3543507.3583461)|An Zhang, Jingnan Zheng, Xiang Wang, Yancheng Yuan, TatSeng Chua|Sea-NExT Joint Lab, National University of Singapore, Singapore; National University of Singapore, Singapore; The Hong Kong Polytechnic University, Hong Kong; University of Science and Technology of China, China and Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, China|Collaborative Filtering (CF) models, despite their great success, suffer from severe performance drops due to popularity distribution shifts, where these changes are ubiquitous and inevitable in real-world scenarios. Unfortunately, most leading popularity debiasing strategies, rather than tackling the vulnerability of CF models to varying popularity distributions, require prior knowledge of the test distribution to identify the degree of bias and further learn the popularity-entangled representations to mitigate the bias. Consequently, these models result in significant performance benefits in the target test set, while dramatically deviating the recommendation from users' true interests without knowing the popularity distribution in advance. In this work, we propose a novel learning framework, Invariant Collaborative Filtering (InvCF), to discover disentangled representations that faithfully reveal the latent preference and popularity semantics without making any assumption about the popularity distribution. At its core is the distillation of unbiased preference representations (i.e., user preference on item property), which are invariant to the change of popularity semantics, while filtering out the popularity feature that is unstable or outdated. Extensive experiments on five benchmark datasets and four evaluation settings (i.e., synthetic long-tail, unbiased, temporal split, and out-of-distribution evaluations) demonstrate that InvCF outperforms the state-of-the-art baselines in terms of popularity generalization ability on real recommendations. Visualization studies shed light on the advantages of InvCF for disentangled representation learning. 
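The disentanglement idea in InvCF can be caricatured by splitting item embeddings into a preference part and a popularity part, with an auxiliary head absorbing popularity signal; the paper's actual training enforces invariance to popularity distribution shifts, which this toy omits.

```python
import torch
import torch.nn as nn

class DisentangledItemEmbedding(nn.Module):
    """Split each item embedding into a preference part (used for scoring)
    and a popularity part (trained to absorb popularity signal)."""
    def __init__(self, n_items, dim=32):
        super().__init__()
        self.pref = nn.Embedding(n_items, dim)
        self.pop = nn.Embedding(n_items, dim)
        self.pop_head = nn.Linear(dim, 1)    # predicts log item popularity

    def forward(self, items, log_pop):
        pop_pred = self.pop_head(self.pop(items)).squeeze(-1)
        return self.pref(items), ((pop_pred - log_pop) ** 2).mean()

model = DisentangledItemEmbedding(n_items=100)
pref_emb, pop_loss = model(torch.randint(0, 100, (16,)), torch.rand(16))
```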
Our codes are available at https://github.com/anzhang314/InvCF.|协同过滤(CF)模型尽管取得了巨大的成功,但由于受欢迎程度的分布变化,性能严重下降,这些变化在现实世界中无处不在,也是不可避免的。不幸的是,大多数领先的流行去偏策略,而不是解决 CF 模型对不同流行分布的脆弱性,需要事先了解测试分布以确定偏倚程度,并进一步学习流行纠缠表示以减轻偏倚。因此,这些模型在目标测试集中产生了显著的性能效益,同时在不事先知道用户流行度分布的情况下,大大偏离了用户的真实兴趣。在这项工作中,我们提出了一个新的学习框架,不变协同过滤(InvCF) ,发现分离的表征,忠实地揭示潜在的偏好和流行语义,而不作任何假设的流行分布。其核心是无偏好的偏好表示(即,用户对项目属性的偏好)的精华,这些偏好对流行语义的变化是不变的,同时过滤掉不稳定或过时的流行特征。对五个基准数据集和四个评估设置(即合成长尾,无偏见,时间分割和分布外评估)的广泛实验表明,InvCF 在真实推荐的普及概括能力方面优于最先进的基线。可视化研究揭示了 InvCF 在分离表征学习中的优势。我们的密码可以在 https://github.com/anzhang314/invcf 找到。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Invariant+Collaborative+Filtering+to+Popularity+Distribution+Shift)|0| +|[Show Me The Best Outfit for A Certain Scene: A Scene-aware Fashion Recommender System](https://doi.org/10.1145/3543507.3583435)|Tangwei Ye, Liang Hu, Qi Zhang, Zhong Yuan Lai, Usman Naseem, Dora D. Liu|Tongji University, China and DeepBlue Academy of Sciences, China; DeepBlue Academy of Sciences, China; DeepBlue Academy of Sciences, China and BirenTech Research, China; University of Sydney, Australia; University of Technology Sydney, Australia and DeepBlue Academy of Sciences, China|Fashion recommendation (FR) has received increasing attention in the research of new types of recommender systems. Existing fashion recommender systems (FRSs) typically focus on clothing item suggestions for users in three scenarios: 1) how to best recommend fashion items preferred by users; 2) how to best compose a complete outfit, and 3) how to best complete a clothing ensemble. However, current FRSs often overlook an important aspect when making FR, that is, the compatibility of the clothing item or outfit recommendations is highly dependent on the scene context. To this end, we propose the scene-aware fashion recommender system (SAFRS), which uncovers a hitherto unexplored avenue where scene information is taken into account when constructing the FR model. More specifically, our SAFRS addresses this problem by encoding scene and outfit information in separation attention encoders and then fusing the resulting feature embeddings via a novel scene-aware compatibility score function. Extensive qualitative and quantitative experiments are conducted to show that our SAFRS model outperforms all baselines for every evaluated metric.|时尚推荐(FR)在新型推荐系统的研究中受到越来越多的关注。现有的时尚推荐系统(FRSs)主要集中在三个场景中为用户提供服装项目建议: 1)如何最好地推荐用户喜欢的时尚项目; 2)如何最好地组合一套完整的服装; 3)如何最好地完成一套服装。然而,目前的 FRS 在制作 FR 时往往忽略了一个重要方面,即服装项目或服装的兼容性建议高度依赖于场景上下文。为此,我们提出了场景感知时尚推荐系统(SAFRS) ,它揭示了一个迄今为止尚未探索的途径,在构建 FR 模型时,场景信息被考虑在内。更具体地说,我们的 SAFRS 通过在分离注意力编码器中编码场景和装备信息,然后通过一种新颖的场景感知兼容性评分函数融合所得到的特征嵌入来解决这个问题。广泛的定性和定量实验表明,我们的 SAFRS 模型优于所有基线的每一个评估指标。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Show+Me+The+Best+Outfit+for+A+Certain+Scene:+A+Scene-aware+Fashion+Recommender+System)|0| +|[Invariant Collaborative Filtering to Popularity Distribution Shift](https://doi.org/10.1145/3543507.3583461)|An Zhang, Jingnan Zheng, Xiang Wang, Yancheng Yuan, TatSeng Chua|University of Science and Technology of China, China and Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, China; National University of Singapore, Singapore; The Hong Kong Polytechnic University, Hong Kong; Sea-NExT Joint Lab, National University of Singapore, Singapore|Collaborative Filtering (CF) models, despite their great success, suffer from severe performance drops due to popularity distribution shifts, where these changes are ubiquitous and inevitable in real-world scenarios. 
Unfortunately, most leading popularity debiasing strategies, rather than tackling the vulnerability of CF models to varying popularity distributions, require prior knowledge of the test distribution to identify the degree of bias and further learn the popularity-entangled representations to mitigate the bias. Consequently, these models result in significant performance benefits in the target test set, while dramatically deviating the recommendation from users' true interests without knowing the popularity distribution in advance. In this work, we propose a novel learning framework, Invariant Collaborative Filtering (InvCF), to discover disentangled representations that faithfully reveal the latent preference and popularity semantics without making any assumption about the popularity distribution. At its core is the distillation of unbiased preference representations (i.e., user preference on item property), which are invariant to the change of popularity semantics, while filtering out the popularity feature that is unstable or outdated. Extensive experiments on five benchmark datasets and four evaluation settings (i.e., synthetic long-tail, unbiased, temporal split, and out-of-distribution evaluations) demonstrate that InvCF outperforms the state-of-the-art baselines in terms of popularity generalization ability on real recommendations. Visualization studies shed light on the advantages of InvCF for disentangled representation learning. Our codes are available at https://github.com/anzhang314/InvCF.|协同过滤(CF)模型尽管取得了巨大的成功,但由于受欢迎程度的分布变化,性能严重下降,这些变化在现实世界中无处不在,也是不可避免的。不幸的是,大多数领先的流行去偏策略,而不是解决 CF 模型对不同流行分布的脆弱性,需要事先了解测试分布以确定偏倚程度,并进一步学习流行纠缠表示以减轻偏倚。因此,这些模型在目标测试集中产生了显著的性能效益,同时在不事先知道用户流行度分布的情况下,大大偏离了用户的真实兴趣。在这项工作中,我们提出了一个新的学习框架,不变协同过滤(InvCF) ,发现分离的表征,忠实地揭示潜在的偏好和流行语义,而不作任何假设的流行分布。其核心是无偏好的偏好表示(即,用户对项目属性的偏好)的精华,这些偏好对流行语义的变化是不变的,同时过滤掉不稳定或过时的流行特征。对五个基准数据集和四个评估设置(即合成长尾,无偏见,时间分割和分布外评估)的广泛实验表明,InvCF 在真实推荐的普及概括能力方面优于最先进的基线。可视化研究揭示了 InvCF 在分离表征学习中的优势。我们的密码可以在 https://github.com/anzhang314/invcf 找到。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Invariant+Collaborative+Filtering+to+Popularity+Distribution+Shift)|0| |[Code Recommendation for Open Source Software Developers](https://doi.org/10.1145/3543507.3583503)|Yiqiao Jin, Yunsheng Bai, Yanqiao Zhu, Yizhou Sun, Wei Wang|Georgia Institute of Technology, USA; University of California, Los Angeles, USA|Open Source Software (OSS) is forming the spines of technology infrastructures, attracting millions of talents to contribute. Notably, it is challenging and critical to consider both the developers' interests and the semantic features of the project code to recommend appropriate development tasks to OSS developers. In this paper, we formulate the novel problem of code recommendation, whose purpose is to predict the future contribution behaviors of developers given their interaction history, the semantic features of source code, and the hierarchical file structures of projects. Considering the complex interactions among multiple parties within the system, we propose CODER, a novel graph-based code recommendation framework for open source software developers. CODER jointly models microscopic user-code interactions and macroscopic user-project interactions via a heterogeneous graph and further bridges the two levels of information through aggregation on file-structure graphs that reflect the project hierarchy. Moreover, due to the lack of reliable benchmarks, we construct three large-scale datasets to facilitate future research in this direction. 
Extensive experiments show that our CODER framework achieves superior performance under various experimental settings, including intra-project, cross-project, and cold-start recommendation. We will release all the datasets, code, and utilities for data retrieval upon the acceptance of this work.|开源软件(OSS)正在形成技术基础设施的脊梁,吸引了数以百万计的人才贡献。值得注意的是,同时考虑开发人员的兴趣和项目代码的语义特性,以便向 OSS 开发人员推荐适当的开发任务,这是一项具有挑战性和关键性的工作。本文提出了一个新的代码推荐问题,其目的是根据开发人员的交互历史、源代码的语义特征以及项目的层次化文件结构来预测开发人员未来的贡献行为。考虑到系统中多方之间的复杂交互,我们提出了一种新的基于图的开源软件开发者代码推荐框架 CODER。CODER 通过异构图联合建模微观用户-代码交互和宏观用户-项目交互,并通过聚合反映项目层次结构的文件结构图进一步桥接两个层次的信息。此外,由于缺乏可靠的基准,我们建立了三个大规模的数据集,以方便未来在这方面的研究。大量的实验表明,我们的 CODER 框架在不同的实验环境下,包括项目内、项目间和冷启动推荐,都取得了较好的性能。我们将发布所有的数据集,代码和实用程序的数据检索后,接受这项工作。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Code+Recommendation+for+Open+Source+Software+Developers)|0| |[pFedPrompt: Learning Personalized Prompt for Vision-Language Models in Federated Learning](https://doi.org/10.1145/3543507.3583518)|Tao Guo, Song Guo, Junxiao Wang|The Hong Kong Polytechnic University, Hong Kong|Pre-trained vision-language models like CLIP show great potential in learning representations that capture latent characteristics of users. A recently proposed method called Contextual Optimization (CoOp) introduces the concept of training prompt for adapting pre-trained vision-language models. Given the lightweight nature of this method, researchers have migrated the paradigm from centralized to decentralized system to innovate the collaborative training framework of Federated Learning (FL). However, current prompt training in FL mainly focuses on modeling user consensus and lacks the adaptation to user characteristics, leaving the personalization of prompt largely under-explored. Researches over the past few years have applied personalized FL (pFL) approaches to customizing models for heterogeneous users. Unfortunately, we find that with the variation of modality and training behavior, directly applying the pFL methods to prompt training leads to insufficient personalization and performance. To bridge the gap, we present pFedPrompt, which leverages the unique advantage of multimodality in vision-language models by learning user consensus from linguistic space and adapting to user characteristics in visual space in a non-parametric manner. Through this dual collaboration, the learned prompt will be fully personalized and aligned to the user’s local characteristics. We conduct extensive experiments across various datasets under the FL setting with statistical heterogeneity. 
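pFedPrompt learns user consensus over prompts in linguistic space; that consensus half can be sketched as FedAvg over client prompt embeddings, with the non-parametric visual-space personalization omitted. Shapes and names are illustrative.

```python
import torch

def fedavg_prompts(client_prompts, client_sizes):
    """Aggregate client prompt embeddings into a consensus prompt by
    data-size-weighted averaging (FedAvg)."""
    weights = torch.tensor(client_sizes, dtype=torch.float)
    weights = weights / weights.sum()
    stacked = torch.stack(client_prompts)        # (clients, tokens, dim)
    return (weights.view(-1, 1, 1) * stacked).sum(dim=0)

prompts = [torch.randn(4, 512) for _ in range(3)]   # 4 prompt tokens per client
consensus = fedavg_prompts(prompts, client_sizes=[100, 50, 25])
```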
The results demonstrate the superiority of our pFedPrompt against the alternative approaches with robust performance.|像 CLIP 这样的预训练视觉语言模型,在学习能捕捉用户潜在特征的表示方面显示出巨大潜力。最近提出的上下文优化(CoOp)方法引入了通过训练提示(prompt)来适配预训练视觉语言模型的思路。鉴于该方法的轻量级特性,研究人员已将这一范式从集中式系统迁移到分散式系统,以创新联邦学习(FL)的协同训练框架。然而,目前联邦学习中的提示训练主要侧重于建模用户共识,缺乏对用户特征的适应,使得提示的个性化在很大程度上尚未得到探索。过去几年的研究已将个性化联邦学习(pFL)方法应用于为异构用户定制模型。不幸的是,我们发现,由于模态和训练行为的差异,直接将 pFL 方法应用于提示训练会导致个性化不足和性能欠佳。为了弥合这一差距,我们提出了 pFedPrompt,它通过在语言空间中学习用户共识,并以非参数方式在视觉空间中适应用户特征,充分利用了视觉语言模型多模态的独特优势。通过这种双重协作,学到的提示将被充分个性化,并与用户的本地特征保持一致。我们在具有统计异质性的联邦学习设置下对多种数据集进行了广泛实验。实验结果表明,我们的 pFedPrompt 以稳健的性能优于各种替代方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=pFedPrompt:+Learning+Personalized+Prompt+for+Vision-Language+Models+in+Federated+Learning)|0| -|[Word Sense Disambiguation by Refining Target Word Embedding](https://doi.org/10.1145/3543507.3583191)|Xuefeng Zhang, Richong Zhang, Xiaoyang Li, Fanshuang Kong, Junfan Chen, Samuel Mensah, Yongyi Mao|SKLSDE, School of Computer Science and Engineering, Beihang University, China; The University of Sheffield, United Kingdom; University of Ottawa, Canada|Word Sense Disambiguation (WSD) which aims to identify the correct sense of a target word appearing in a specific context is essential for web text analysis. The use of glosses has been explored as a means for WSD. However, only a few works model the correlation between the target context and gloss. We add to the body of literature by presenting a model that employs a multi-head attention mechanism on deep contextual features of the target word and candidate glosses to refine the target word embedding. Furthermore, to encourage the model to learn the relevant part of target features that align with the correct gloss, we recursively alternate attention on target word features and that of candidate glosses to gradually extract the relevant contextual features of the target word, refining its representation and strengthening the final disambiguation results. Empirical studies on the five most commonly used benchmark datasets show that our proposed model is effective and achieves state-of-the-art results.|词义消歧(WSD)是识别特定语境中目标词的正确意义,是网络文本分析的基础。水务署已研究使用注释作为一种方法。然而,只有少数作品模拟了目标语境和注释之间的相关性。本文提出了一种基于目标词深层语境特征和候选修饰语的多目标注意机制来完善目标词嵌入的模型,并对文献进行了补充。此外,为了鼓励模型学习与正确的注释相一致的目标特征的相关部分,我们递归地交替关注目标词特征和候选注释的特征,以逐渐提取目标词的相关上下文特征,完善其表示并加强最终的消歧结果。对五个最常用的基准数据集的实证研究表明,我们提出的模型是有效的,并取得了最先进的结果。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Word+Sense+Disambiguation+by+Refining+Target+Word+Embedding)|0| -|[Dual Policy Learning for Aggregation Optimization in Graph Neural Network-based Recommender Systems](https://doi.org/10.1145/3543507.3583241)|Heesoo Jung, Sangpil Kim, Hogun Park|Dept. of Artificial Intelligence, Sungkyunkwan University, Republic of Korea; Dept. of Electrical and Computer Engineering, Sungkyunkwan University, Republic of Korea and Dept. of Artificial Intelligence, Sungkyunkwan University, Republic of Korea; Dept. of Artificial Intelligence, Korea University, Republic of Korea|Graph Neural Networks (GNNs) provide powerful representations for recommendation tasks. GNN-based recommendation systems capture the complex high-order connectivity between users and items by aggregating information from distant neighbors and can improve the performance of recommender systems. 
Recently, Knowledge Graphs (KGs) have also been incorporated into the user-item interaction graph to provide more abundant contextual information; they are exploited to address cold-start problems and enable more explainable aggregation in GNN-based recommender systems (GNN-Rs). However, due to the heterogeneous nature of users and items, developing an effective aggregation strategy that works across multiple GNN-Rs, such as LightGCN and KGAT, remains a challenge. In this paper, we propose a novel reinforcement learning-based message passing framework for recommender systems, which we call DPAO (Dual Policy framework for Aggregation Optimization). This framework adaptively determines high-order connectivity to aggregate users and items using dual policy learning. Dual policy learning leverages two Deep-Q-Network models to exploit the user- and item-aware feedback from a GNN-R and boost the performance of the target GNN-R. Our proposed framework was evaluated with both non-KG-based and KG-based GNN-R models on six real-world datasets, and their results show that our proposed framework significantly enhances the recent base model, improving nDCG and Recall by up to 63.7% and 42.9%, respectively. Our implementation code is available at https://github.com/steve30572/DPAO/.|图形神经网络(GNN)为推荐任务提供了强有力的表示。基于 GNN 的推荐系统通过聚合来自远邻的信息来捕获用户和项目之间复杂的高阶连通性,从而提高推荐系统的性能。最近,知识图(KGs)也被纳入到用户项目交互图中,以提供更丰富的上下文信息; 它们被用来解决冷启动问题,并能够在基于 GNN 的推荐系统(GNN-Rs)中实现更可解释的聚合。然而,由于用户和项目的异构性,开发一个跨多个 GNN-R (如 LightGCN 和 KGAT)的有效聚合策略仍然是一个挑战。本文提出了一种新的基于强化学习的推荐系统消息传递框架,称之为聚合优化的双策略框架(DPAO)。该框架使用双策略学习自适应地确定与聚合用户和项目的高阶连通性。双策略学习利用两个 Deep-Q 网络模型来利用来自 GNN-R 的用户和项目感知反馈,提高目标 GNN-R 的性能。我们提出的框架在六个实际数据集上用非 KG 和基于 KG 的 GNN-R 模型进行了评估,结果表明,我们提出的框架显着增强了最近的基础模型,使 nDCG 和 Recall 分别提高了63.7% 和42.9% 。我们的实施守则可于 https://github.com/steve30572/dpao/索取。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Dual+Policy+Learning+for+Aggregation+Optimization+in+Graph+Neural+Network-based+Recommender+Systems)|0| +|[Word Sense Disambiguation by Refining Target Word Embedding](https://doi.org/10.1145/3543507.3583191)|Xuefeng Zhang, Richong Zhang, Xiaoyang Li, Fanshuang Kong, Junfan Chen, Samuel Mensah, Yongyi Mao|University of Ottawa, Canada; The University of Sheffield, United Kingdom; SKLSDE, School of Computer Science and Engineering, Beihang University, China|Word Sense Disambiguation (WSD) which aims to identify the correct sense of a target word appearing in a specific context is essential for web text analysis. The use of glosses has been explored as a means for WSD. However, only a few works model the correlation between the target context and gloss. We add to the body of literature by presenting a model that employs a multi-head attention mechanism on deep contextual features of the target word and candidate glosses to refine the target word embedding. Furthermore, to encourage the model to learn the relevant part of target features that align with the correct gloss, we recursively alternate attention on target word features and that of candidate glosses to gradually extract the relevant contextual features of the target word, refining its representation and strengthening the final disambiguation results. 
Empirical studies on the five most commonly used benchmark datasets show that our proposed model is effective and achieves state-of-the-art results.|词义消歧(WSD)旨在识别目标词在特定语境中的正确词义,是网络文本分析的基础。注释(gloss)作为实现 WSD 的一种手段已得到探索。然而,只有少数工作对目标语境与注释之间的相关性进行建模。我们提出了一种在目标词和候选注释的深层语境特征上采用多头注意力机制来细化目标词嵌入的模型,为已有文献做出补充。此外,为了鼓励模型学习目标特征中与正确注释相一致的相关部分,我们递归地交替关注目标词特征和候选注释特征,以逐步提取目标词的相关上下文特征,完善其表示并强化最终的消歧结果。在五个最常用的基准数据集上的实证研究表明,我们提出的模型是有效的,并取得了最先进的结果。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Word+Sense+Disambiguation+by+Refining+Target+Word+Embedding)|0| +|[Dual Policy Learning for Aggregation Optimization in Graph Neural Network-based Recommender Systems](https://doi.org/10.1145/3543507.3583241)|Heesoo Jung, Sangpil Kim, Hogun Park|Dept. of Electrical and Computer Engineering, Sungkyunkwan University, Republic of Korea and Dept. of Artificial Intelligence, Sungkyunkwan University, Republic of Korea; Dept. of Artificial Intelligence, Sungkyunkwan University, Republic of Korea; Dept. of Artificial Intelligence, Korea University, Republic of Korea|Graph Neural Networks (GNNs) provide powerful representations for recommendation tasks. GNN-based recommendation systems capture the complex high-order connectivity between users and items by aggregating information from distant neighbors and can improve the performance of recommender systems. Recently, Knowledge Graphs (KGs) have also been incorporated into the user-item interaction graph to provide more abundant contextual information; they are exploited to address cold-start problems and enable more explainable aggregation in GNN-based recommender systems (GNN-Rs). However, due to the heterogeneous nature of users and items, developing an effective aggregation strategy that works across multiple GNN-Rs, such as LightGCN and KGAT, remains a challenge. In this paper, we propose a novel reinforcement learning-based message passing framework for recommender systems, which we call DPAO (Dual Policy framework for Aggregation Optimization). This framework adaptively determines high-order connectivity to aggregate users and items using dual policy learning. Dual policy learning leverages two Deep-Q-Network models to exploit the user- and item-aware feedback from a GNN-R and boost the performance of the target GNN-R. Our proposed framework was evaluated with both non-KG-based and KG-based GNN-R models on six real-world datasets, and their results show that our proposed framework significantly enhances the recent base model, improving nDCG and Recall by up to 63.7% and 42.9%, respectively. 
Our implementation code is available at https://github.com/steve30572/DPAO/.|图神经网络(GNN)为推荐任务提供了强有力的表示。基于 GNN 的推荐系统通过聚合来自远邻的信息来捕获用户和物品之间复杂的高阶连通性,从而提升推荐系统的性能。最近,知识图谱(KG)也被引入用户-物品交互图,以提供更丰富的上下文信息;它们被用来解决冷启动问题,并使基于 GNN 的推荐系统(GNN-R)中的聚合更具可解释性。然而,由于用户和物品的异构性,开发一种能够跨多个 GNN-R(如 LightGCN 和 KGAT)工作的有效聚合策略仍然是一个挑战。本文提出了一种新的基于强化学习的推荐系统消息传递框架,称之为聚合优化双策略框架(DPAO)。该框架使用双策略学习,自适应地确定用于聚合用户和物品的高阶连通性。双策略学习利用两个 Deep-Q-Network 模型来利用来自 GNN-R 的用户感知和物品感知反馈,提升目标 GNN-R 的性能。我们在六个真实数据集上用非基于 KG 和基于 KG 的 GNN-R 模型对所提框架进行了评估,结果表明,该框架显著增强了最近的基础模型,使 nDCG 和 Recall 分别提升至多 63.7% 和 42.9%。我们的实现代码可在 https://github.com/steve30572/DPAO/ 获取。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Dual+Policy+Learning+for+Aggregation+Optimization+in+Graph+Neural+Network-based+Recommender+Systems)|0| |[Addressing Heterophily in Graph Anomaly Detection: A Perspective of Graph Spectrum](https://doi.org/10.1145/3543507.3583268)|Yuan Gao, Xiang Wang, Xiangnan He, Zhenguang Liu, Huamin Feng, Yongdong Zhang|Beijing Electronic Science And Technology Institute, China; Zhejiang University, China; University of Science and Technology of China, China|Graph anomaly detection (GAD) suffers from heterophily — abnormal nodes are sparse so that they are connected to vast normal nodes. The current solutions upon Graph Neural Networks (GNNs) blindly smooth the representation of neighboring nodes, thus undermining the discriminative information of the anomalies. To alleviate the issue, recent studies identify and discard inter-class edges through estimating and comparing the node-level representation similarity. However, the representation of a single node can be misleading when the prediction error is high, thus hindering the performance of the edge indicator. In graph signal processing, the smoothness index is a widely adopted metric which plays the role of frequency in classical spectral analysis. Considering the ground truth Y to be a signal on graph, the smoothness index is equivalent to the value of the heterophily ratio. From this perspective, we aim to address the heterophily problem in the spectral domain. First, we point out that heterophily is positively associated with the frequency of a graph. Towards this end, we could prune inter-class edges by simply emphasizing and delineating the high-frequency components of the graph. Recall that graph Laplacian is a high-pass filter, we adopt it to measure the extent of 1-hop label changing of the center node and indicate high-frequency components. As GAD can be formulated as a semi-supervised binary classification problem, only part of the nodes are labeled. As an alternative, we use the prediction of the nodes to estimate it. Through our analysis, we show that prediction errors are less likely to affect the identification process. Extensive empirical evaluations on four benchmarks demonstrate the effectiveness of the indicator over popular homophilic, heterophilic, and tailored fraud detection methods. Our proposed indicator can effectively reduce the heterophily degree of the graph, thus boosting the overall GAD performance. 
Codes are open-sourced in https://github.com/blacksingular/GHRN.|图异常检测(GAD)受异配性(heterophily)困扰:异常节点稀疏,因而与大量正常节点相连。现有的基于图神经网络(GNN)的解决方案盲目地平滑相邻节点的表示,从而破坏了异常节点的判别信息。为缓解这一问题,最近的研究通过估计和比较节点级表示相似度来识别并丢弃类间边。然而,当预测误差较大时,单个节点的表示可能产生误导,从而影响边指示器的性能。在图信号处理中,平滑度指标是一个被广泛采用的度量,其作用相当于经典谱分析中的频率。将真实标签 Y 视为图上的信号时,平滑度指标等价于异配率的值。从这个角度出发,我们旨在在谱域中解决异配性问题。首先,我们指出异配性与图的频率正相关。为此,只需强调并刻画图的高频成分,即可剪除类间边。注意到图拉普拉斯算子是一个高通滤波器,我们用它来度量中心节点一跳邻域内标签变化的程度,以指示高频成分。由于 GAD 可以表述为一个半监督二分类问题,只有部分节点带有标签,作为替代,我们使用节点的预测值来进行估计。我们的分析表明,预测误差不太可能影响识别过程。在四个基准上的大量实证评估表明,该指示器优于流行的同配、异配以及专门定制的欺诈检测方法。我们提出的指示器可以有效降低图的异配程度,从而提升整体的 GAD 性能。代码已开源于 https://github.com/blacksingular/GHRN 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Addressing+Heterophily+in+Graph+Anomaly+Detection:+A+Perspective+of+Graph+Spectrum)|0| |[Ginver: Generative Model Inversion Attacks Against Collaborative Inference](https://doi.org/10.1145/3543507.3583306)|Yupeng Yin, Xianglong Zhang, Huanle Zhang, Feng Li, Yue Yu, Xiuzhen Cheng, Pengfei Hu|School of Computer Science and Technology, Shandong University, China; National University of Defense Technology, China and Peng Cheng Laboratory, China|Deep Learning (DL) has been widely adopted in almost all domains, from threat recognition to medical diagnosis. Albeit its supreme model accuracy, DL imposes a heavy burden on devices as it incurs overwhelming system overhead to execute DL models, especially on Internet-of-Things (IoT) and edge devices. Collaborative inference is a promising approach to supporting DL models, by which the data owner (the victim) runs the first layers of the model on her local device and then a cloud provider (the adversary) runs the remaining layers of the model. Compared to offloading the entire model to the cloud, the collaborative inference approach is more data privacy-preserving as the owner’s model input is not exposed to outsiders. However, we show in this paper that the adversary can restore the victim’s model input by exploiting the output of the victim’s local model. Our attack is dubbed Ginver 1: Generative model inversion attacks against collaborative inference. Once trained, Ginver can infer the victim’s unseen model inputs without remaking the inversion attack model and thus has the generative capability. We extensively evaluate Ginver under different settings (e.g., white-box and black-box of the victim’s local model) and applications (e.g., CIFAR10 and FaceScrub datasets). The experimental results show that Ginver recovers high-quality images from the victims.|从威胁识别到医学诊断,深度学习已被广泛应用于几乎所有的领域。尽管 DL 具有出色的模型精度,但它对设备造成了沉重的负担,因为它在执行 DL 模型时会产生巨大的系统开销,特别是在物联网(IoT)和边缘设备上。协作推理是支持 DL 模型的一种有前途的方法,通过这种方法,数据所有者(受害者)在其本地设备上运行模型的第一层,然后云提供者(对手)运行模型的其余层。与将整个模型卸载到云中相比,协作推理方法更能保护数据隐私,因为所有者的模型输入不会暴露给外部人员。然而,本文证明了对手可以通过利用被害人局部模型的输出来恢复被害人的模型输入。我们的攻击被称为 Ginver 1: 针对协作推理的生成模型反转攻击。一旦被训练,Ginver 可以推断出受害者看不见的模型输入,而无需重建反转攻击模型,因此具有生成能力。我们在不同的设置(例如,受害者本地模型的白盒和黑盒)和应用程序(例如,CIFAR10和 FaceScrub 数据集)下广泛评估 Ginver。实验结果表明,Ginver 可以从受害者身上恢复出高质量的图像。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Ginver:+Generative+Model+Inversion+Attacks+Against+Collaborative+Inference)|0| |[All Your Shops Are Belong to Us: Security Weaknesses in E-commerce Platforms](https://doi.org/10.1145/3543507.3583319)|Rohan Pagey, Mohammad Mannan, Amr M. Youssef|Concordia Institute for Information Systems Engineering, Concordia University, Canada|Software as a Service (SaaS) e-commerce platforms for merchants allow individual business owners to set up their online stores almost instantly. 
Prior work has shown that the checkout flows and payment integration of some e-commerce applications are vulnerable to logic bugs with serious financial consequences, e.g., allowing “shopping for free”. Apart from checkout and payment integration, vulnerabilities in other e-commerce operations have remained largely unexplored, even though they can have far more serious consequences, e.g., enabling “store takeover”. In this work, we design and implement a security evaluation framework to uncover security vulnerabilities in e-commerce operations beyond checkout/payment integration. We use this framework to analyze 32 representative e-commerce platforms, including web services of 24 commercial SaaS platforms and 15 associated Android apps, and 8 open source platforms; these platforms host over 10 million stores as approximated through Google dorks. We uncover several new vulnerabilities with serious consequences, e.g., allowing an attacker to take over all stores under a platform, and listing illegal products at a victim’s store—in addition to “shopping for free” bugs, without exploiting the checkout/payment process. We found 12 platforms vulnerable to store takeover (affecting 41000+ stores) and 6 platforms vulnerable to shopping for free (affecting 19000+ stores, approximated via Google dorks on Oct. 8, 2022). We have responsibly disclosed the vulnerabilities to all affected parties, and requested four CVEs (three assigned, and one is pending review).|软件即服务(SaaS)电子商务平台让个体商家几乎可以即时建立自己的在线商店。先前的研究表明,一些电子商务应用的结账流程和支付集成容易受到逻辑漏洞的影响,并带来严重的财务后果,例如允许“免费购物”。除了结账和支付集成之外,其他电子商务操作中的漏洞在很大程度上仍未被探索,尽管它们可能产生远为严重的后果,例如导致“商店接管”。在这项工作中,我们设计并实现了一个安全评估框架,用于发现结账/支付集成之外的电子商务操作中的安全漏洞。我们使用该框架分析了 32 个具有代表性的电子商务平台,包括 24 个商业 SaaS 平台的 Web 服务和 15 个相关的 Android 应用,以及 8 个开源平台;这些平台托管的商店超过 1000 万家(通过 Google dorks 估算)。我们发现了若干后果严重的新漏洞,例如允许攻击者接管平台下的所有商店、在受害者的商店中上架非法商品,此外还有无需利用结账/支付流程的“免费购物”漏洞。我们发现 12 个平台存在商店接管漏洞(影响 41000 多家商店),6 个平台存在免费购物漏洞(影响 19000 多家商店,于 2022 年 10 月 8 日通过 Google dorks 估算)。我们已负责任地向所有受影响方披露了这些漏洞,并申请了四个 CVE(三个已分配,一个待审核)。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=All+Your+Shops+Are+Belong+to+Us:+Security+Weaknesses+in+E-commerce+Platforms)|0| |[An Empirical Study of the Usage of Checksums for Web Downloads](https://doi.org/10.1145/3543507.3583326)|Gaël Bernard, Rémi Coudert, Bertil Chapuis, Kévin Huguenin|University of Applied Sciences Western Switzerland, Switzerland; Department of Information Systems, University of Lausanne, Switzerland; EPFL, Switzerland|Checksums, typically provided on webpages and generated from cryptographic hash functions (e.g., MD5, SHA256) or signature schemes (e.g., PGP), are commonly used on websites to enable users to verify that the files they download have not been tampered with when stored on possibly untrusted servers. In this paper, we elucidate the current practices regarding the usage of checksums for web downloads (hash functions used, visibility and validity of checksums, type of websites and files, etc.), as this has been mostly overlooked so far. Using a snowball-sampling strategy for the 200000 most popular domains of the Web, we first crawled a dataset of 8.5M webpages, from which we built, through an active-learning approach, a unique dataset of 277 diverse webpages that contain checksums. Our analysis of these webpages reveals interesting findings about the usage of checksums. 
For instance, it shows that checksums are used mostly to verify program files, that weak hash functions are frequently used, and that a non-negligible proportion of the checksums provided on webpages do not match that of their associated files. Finally, we complement our analysis with a survey of the webmasters of the considered webpages (N = 26), thus shedding light on the reasons behind the checksum-related choices they make.|校验和通常在网页上提供,由加密散列函数(如 MD5、SHA256)或签名方案(如 PGP)生成,网站普遍用它来让用户验证所下载的文件在存储于可能不受信任的服务器上时未被篡改。在本文中,我们阐明了当前网页下载中校验和的使用实践(所用的散列函数、校验和的可见性和有效性、网站和文件的类型等),因为这一问题迄今在很大程度上被忽视。我们对 20 万个最流行的 Web 域名采用滚雪球抽样策略,首先抓取了一个包含 850 万个网页的数据集,并在此基础上通过主动学习方法构建了一个由 277 个包含校验和的多样化网页组成的独特数据集。我们对这些网页的分析揭示了关于校验和使用的有趣发现。例如,结果表明校验和主要用于验证程序文件,弱散列函数被频繁使用,而且网页上提供的校验和中有不可忽视的比例与其关联文件的校验和不匹配。最后,我们对所考察网页的网站管理员进行了调查(N = 26),以此补充我们的分析,从而阐明他们做出校验和相关选择背后的原因。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=An+Empirical+Study+of+the+Usage+of+Checksums+for+Web+Downloads)|0| |[Quantifying and Defending against Privacy Threats on Federated Knowledge Graph Embedding](https://doi.org/10.1145/3543507.3583450)|Yuke Hu, Wei Liang, Ruofan Wu, Kai Xiao, Weiqiang Wang, Xiaochen Li, Jinfei Liu, Zhan Qin|Zhejiang University, China and HIC-ZJU, China; Ant Group, China|Knowledge Graph Embedding (KGE) is a fundamental technique that extracts expressive representation from knowledge graph (KG) to facilitate diverse downstream tasks. The emerging federated KGE (FKGE) collaboratively trains from distributed KGs held among clients while avoiding exchanging clients' sensitive raw KGs, which can still suffer from privacy threats as evidenced in other federated model trainings (e.g., neural networks). However, quantifying and defending against such privacy threats remain unexplored for FKGE which possesses unique properties not shared by previously studied models. In this paper, we conduct the first holistic study of the privacy threat on FKGE from both attack and defense perspectives. For the attack, we quantify the privacy threat by proposing three new inference attacks, which reveal substantial privacy risk by successfully inferring the existence of the KG triple from victim clients. For the defense, we propose DP-Flames, a novel differentially private FKGE with private selection, which offers a better privacy-utility tradeoff by exploiting the entity-binding sparse gradient property of FKGE and comes with a tight privacy accountant by incorporating the state-of-the-art private selection technique. We further propose an adaptive privacy budget allocation policy to dynamically adjust defense magnitude across the training procedure. 
Comprehensive evaluations demonstrate that the proposed defense can successfully mitigate the privacy threat by effectively reducing the success rate of inference attacks from $83.1\%$ to $59.4\%$ on average with only a modest utility decrease.|知识图谱嵌入(KGE)是一种从知识图谱(KG)中提取表达性表示以支撑各种下游任务的基础技术。新兴的联邦 KGE(FKGE)在客户端持有的分布式知识图谱上进行协同训练,同时避免交换客户端敏感的原始知识图谱;但正如其他联邦模型训练(如神经网络)所表明的,它仍可能面临隐私威胁。然而,FKGE 具有以往所研究模型不具备的独特性质,针对它的隐私威胁的量化与防御仍未被探索。本文首次从攻击和防御两个角度对 FKGE 的隐私威胁进行了全面研究。在攻击方面,我们提出三种新的推理攻击来量化隐私威胁,通过成功推断受害客户端中 KG 三元组的存在,揭示了巨大的隐私风险。在防御方面,我们提出 DP-Flames,一种带有私有选择的新型差分隐私 FKGE,它利用 FKGE 的实体绑定稀疏梯度特性提供了更好的隐私-效用权衡,并通过引入最先进的私有选择技术提供了严密的隐私核算。我们进一步提出了一种自适应隐私预算分配策略,以在整个训练过程中动态调整防御强度。综合评估表明,所提出的防御能够成功缓解隐私威胁,将推理攻击的平均成功率从 83.1% 有效降至 59.4%,而效用仅有适度下降。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Quantifying+and+Defending+against+Privacy+Threats+on+Federated+Knowledge+Graph+Embedding)|0| |[Sanitizing Sentence Embeddings (and Labels) for Local Differential Privacy](https://doi.org/10.1145/3543507.3583512)|Minxin Du, Xiang Yue, Sherman S. M. Chow, Huan Sun|Department of Information Engineering, The Chinese University of Hong Kong, Hong Kong; Department of Computer Science and Engineering, The Ohio State University, USA|Differentially private (DP) learning, notably DP stochastic gradient descent (DP-SGD), has limited applicability in fine-tuning gigantic pre-trained language models (LMs) for natural language processing tasks. The culprit is the perturbation of gradients (as gigantic as entire models), leading to significant efficiency and accuracy drops. We show how to achieve metric-based local DP (LDP) by sanitizing (high-dimensional) sentence embedding, extracted by LMs and much smaller than gradients. For potential utility improvement, we impose a consistency constraint on the sanitization. We explore two approaches: One is brand new and can directly output consistent noisy embeddings; the other is an upgradation with post-processing. To further mitigate “the curse of dimensionality,” we introduce two trainable linear maps for mediating dimensions without hurting privacy or utility. Our protection can effectively defend against privacy threats on embeddings. It also naturally extends to inference. Our experiments1 show that we reach the non-private accuracy under properly configured parameters, e.g., 0.92 for SST-2 with a privacy budget ϵ = 10 and the reduced dimension as 16. 
We also sanitize the label for LDP (with another small privacy budget) with limited accuracy losses to fully protect every sequence-label pair.|差分隐私(DP)学习,特别是 DP 随机梯度下降(DP-SGD),在为自然语言处理任务微调庞大的预训练语言模型(LM)方面适用性有限。罪魁祸首是对梯度(与整个模型一样庞大)的扰动,它导致效率和精度显著下降。我们展示了如何通过净化(高维)句子嵌入来实现基于度量的本地差分隐私(LDP);这些嵌入由 LM 提取,且远小于梯度。为了潜在地提升效用,我们对净化过程施加一致性约束。我们探索了两种方法:一种是全新的,可以直接输出一致的含噪嵌入;另一种是带后处理的升级版。为了进一步缓解“维数灾难”,我们引入了两个可训练的线性映射,在不损害隐私或效用的情况下调节维度。我们的保护可以有效抵御针对嵌入的隐私威胁,并且自然地扩展到推理阶段。我们的实验表明,在适当配置参数的情况下可以达到非隐私设置下的精度,例如在隐私预算 ϵ = 10、维度降至 16 时,在 SST-2 上达到 0.92。我们还以有限的精度损失对标签进行净化以满足 LDP(使用另一份较小的隐私预算),从而完整保护每个序列-标签对。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Sanitizing+Sentence+Embeddings+(and+Labels)+for+Local+Differential+Privacy)|0| -|[Heterogeneous Federated Knowledge Graph Embedding Learning and Unlearning](https://doi.org/10.1145/3543507.3583305)|Xiangrong Zhu, Guangyao Li, Wei Hu|State Key Laboratory for Novel Software Technology, Nanjing University, China and National Institute of Healthcare Data Science, Nanjing University, China; State Key Laboratory for Novel Software Technology, Nanjing University, China|Federated Learning (FL) recently emerges as a paradigm to train a global machine learning model across distributed clients without sharing raw data. Knowledge Graph (KG) embedding represents KGs in a continuous vector space, serving as the backbone of many knowledge-driven applications. As a promising combination, federated KG embedding can fully take advantage of knowledge learned from different clients while preserving the privacy of local data. However, realistic problems such as data heterogeneity and knowledge forgetting still remain to be concerned. In this paper, we propose FedLU, a novel FL framework for heterogeneous KG embedding learning and unlearning. To cope with the drift between local optimization and global convergence caused by data heterogeneity, we propose mutual knowledge distillation to transfer local knowledge to global, and absorb global knowledge back. Moreover, we present an unlearning method based on cognitive neuroscience, which combines retroactive interference and passive decay to erase specific knowledge from local clients and propagate to the global model by reusing knowledge distillation. We construct new datasets for assessing realistic performance of the state-of-the-arts. Extensive experiments show that FedLU achieves superior results in both link prediction and knowledge forgetting.|联邦学习(Federated Learning,FL)最近作为一种模式出现,它可以在不共享原始数据的情况下跨分布式客户机训练全局机器学习模型。知识图(KG)嵌入表示连续向量空间中的 KG,作为许多知识驱动应用程序的骨干。作为一种有前途的组合,联邦 KG 嵌入可以充分利用从不同客户端获得的知识,同时保护本地数据的隐私。然而,数据异构性和知识遗忘等现实问题仍然值得关注。本文提出了一种适用于异构 KG 嵌入学习和非学习的 FL 框架 FedLU。针对数据异构导致的局部优化和全局收敛之间的漂移问题,提出了互知识提取方法,将局部知识转化为全局知识,并将全局知识吸收回来。此外,我们还提出了一种基于认知神经科学的去学习方法,它结合了追溯干扰和被动衰减,从本地客户删除特定的知识,并通过重复使用知识提取传播到全球模型。我们建立了新的数据集来评估现实性能的最新技术。大量实验表明,FedLU 在链路预测和知识遗忘方面都取得了较好的效果。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Heterogeneous+Federated+Knowledge+Graph+Embedding+Learning+and+Unlearning)|0| -|[A Single Vector Is Not Enough: Taxonomy Expansion via Box Embeddings](https://doi.org/10.1145/3543507.3583310)|Song Jiang, Qiyue Yao, Qifan Wang, Yizhou Sun|University of California, Los Angeles, USA; Meta AI, USA|Taxonomies, which organize knowledge hierarchically, support various practical web applications such as product navigation in online shopping and user profile tagging on social platforms. 
Given the continued and rapid emergence of new entities, maintaining a comprehensive taxonomy in a timely manner through human annotation is prohibitively expensive. Therefore, expanding a taxonomy automatically with new entities is essential. Most existing methods for expanding taxonomies encode entities into vector embeddings (i.e., single points). However, we argue that vectors are insufficient to model the “is-a” hierarchy in taxonomy (asymmetrical relation), because two points can only represent pairwise similarity (symmetrical relation). To this end, we propose to project taxonomy entities into boxes (i.e., hyperrectangles). Two boxes can be "contained", "disjoint" and "intersecting", thus naturally representing an asymmetrical taxonomic hierarchy. Upon box embeddings, we propose a novel model BoxTaxo for taxonomy expansion. The core of BoxTaxo is to learn boxes for entities to capture their child-parent hierarchies. To achieve this, BoxTaxo optimizes the box embeddings from a joint view of geometry and probability. BoxTaxo also offers an easy and natural way for inference: examine whether the box of a given new entity is fully enclosed inside the box of a candidate parent from the existing taxonomy. Extensive experiments on two benchmarks demonstrate the effectiveness of BoxTaxo compared to vector based models.|分类法按层次组织知识,支持各种实际的网络应用程序,如在线购物中的产品导航和社交平台上的用户配置文件标签。鉴于新实体的持续和快速出现,通过人工注释及时维护全面的分类是非常昂贵的。因此,使用新实体自动扩展分类法是必不可少的。大多数现有的分类扩展方法都将实体编码为向量嵌入(即单点)。然而,我们认为向量不足以模拟分类学中的“ is-a”层次结构(非对称关系) ,因为两点只能表示成对的相似性(对称关系)。为此,我们建议将分类实体投影到盒子(即超矩形)中。两个盒子可以“包含”、“不相交”和“交叉”,因此自然地代表了一个不对称的分类层次。在盒子嵌入的基础上,提出了一种新的分类扩展模型 BoxTaxo。BoxTaxo 的核心是为实体学习用于捕获其子-父层次结构的框。为了实现这一点,BoxTaxo 从几何和概率的联合视图优化了盒子嵌入。BoxTaxo 还提供了一种简单而自然的推理方法: 检查给定新实体的框是否完全封闭在现有分类法的候选父类的框中。在两个基准上的大量实验证明了 BoxTaxo 与基于向量的模型相比的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Single+Vector+Is+Not+Enough:+Taxonomy+Expansion+via+Box+Embeddings)|0| -|[Knowledge Graph Question Answering with Ambiguous Query](https://doi.org/10.1145/3543507.3583316)|Lihui Liu, Yuzhong Chen, Mahashweta Das, Hao Yang, Hanghang Tong|Department of Computer Science, University of Illinois at Urbana Champaign, USA; visa research, USA|Knowledge graph question answering aims to identify answers of the query according to the facts in the knowledge graph. In the vast majority of the existing works, the input queries are considered perfect and can precisely express the user’s query intention. However, in reality, input queries might be ambiguous and elusive which only contain a limited amount of information. Directly answering these ambiguous queries may yield unwanted answers and deteriorate user experience. In this paper, we propose PReFNet which focuses on answering ambiguous queries with pseudo relevance feedback on knowledge graphs. In order to leverage the hidden (pseudo) relevance information existed in the results that are initially returned from a given query, PReFNet treats the top-k returned candidate answers as a set of most relevant answers, and uses variational Bayesian inference to infer user’s query intention. To boost the quality of the inferred queries, a neighborhood embedding based VGAE model is used to prune inferior inferred queries. The inferred high quality queries will be returned to the users to help them search with ease. Moreover, all the high-quality candidate nodes will be re-ranked according to the inferred queries. 
The experiment results show that our proposed method can recommend high-quality query graphs to users and improve the question answering accuracy.|知识图问答的目的是根据知识图中的事实来识别问题的答案。在现有的大多数工作中,输入查询被认为是完美的,可以准确地表达用户的查询意图。然而,在现实中,输入查询可能是模糊和难以捉摸的,只包含有限数量的信息。直接回答这些模棱两可的问题可能会得到不想要的答案,并损害用户体验。在这篇文章中,我们提出了一种基于知识图的伪关联反馈回答模糊查询的方法。为了利用最初从给定查询返回的结果中存在的隐藏(伪)相关信息,prefNet 将返回的候选答案视为一组最相关的答案,并使用变异贝叶斯推断来推断用户的查询意图。为了提高推理查询的质量,提出了一种基于邻域嵌入的 VGAE 模型来裁剪劣质推理查询。推断出的高质量查询将返回给用户,以帮助他们轻松搜索。此外,所有高质量的候选节点将根据推断的查询进行重新排序。实验结果表明,该方法可以向用户推荐高质量的查询图,提高问答的准确率。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Knowledge+Graph+Question+Answering+with+Ambiguous+Query)|0| +|[Heterogeneous Federated Knowledge Graph Embedding Learning and Unlearning](https://doi.org/10.1145/3543507.3583305)|Xiangrong Zhu, Guangyao Li, Wei Hu|State Key Laboratory for Novel Software Technology, Nanjing University, China; State Key Laboratory for Novel Software Technology, Nanjing University, China and National Institute of Healthcare Data Science, Nanjing University, China|Federated Learning (FL) recently emerges as a paradigm to train a global machine learning model across distributed clients without sharing raw data. Knowledge Graph (KG) embedding represents KGs in a continuous vector space, serving as the backbone of many knowledge-driven applications. As a promising combination, federated KG embedding can fully take advantage of knowledge learned from different clients while preserving the privacy of local data. However, realistic problems such as data heterogeneity and knowledge forgetting still remain to be concerned. In this paper, we propose FedLU, a novel FL framework for heterogeneous KG embedding learning and unlearning. To cope with the drift between local optimization and global convergence caused by data heterogeneity, we propose mutual knowledge distillation to transfer local knowledge to global, and absorb global knowledge back. Moreover, we present an unlearning method based on cognitive neuroscience, which combines retroactive interference and passive decay to erase specific knowledge from local clients and propagate to the global model by reusing knowledge distillation. We construct new datasets for assessing realistic performance of the state-of-the-arts. Extensive experiments show that FedLU achieves superior results in both link prediction and knowledge forgetting.|联邦学习(Federated Learning,FL)最近作为一种范式出现,可以在不共享原始数据的情况下跨分布式客户端训练全局机器学习模型。知识图谱(KG)嵌入将 KG 表示在连续向量空间中,是许多知识驱动应用的骨干。作为一种有前途的组合,联邦 KG 嵌入可以充分利用从不同客户端学到的知识,同时保护本地数据的隐私。然而,数据异构和知识遗忘等现实问题仍有待解决。本文提出了 FedLU,一种用于异构 KG 嵌入学习与遗忘学习的新型 FL 框架。针对数据异构导致的局部优化与全局收敛之间的漂移问题,我们提出互知识蒸馏,将局部知识迁移到全局,并将全局知识吸收回来。此外,我们提出了一种基于认知神经科学的遗忘学习方法,它结合倒摄干扰和被动衰减,从本地客户端擦除特定知识,并通过复用知识蒸馏将其传播到全局模型。我们构建了新的数据集来评估最新方法的实际性能。大量实验表明,FedLU 在链接预测和知识遗忘方面都取得了优越的结果。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Heterogeneous+Federated+Knowledge+Graph+Embedding+Learning+and+Unlearning)|0| +|[A Single Vector Is Not Enough: Taxonomy Expansion via Box Embeddings](https://doi.org/10.1145/3543507.3583310)|Song Jiang, Qiyue Yao, Qifan Wang, Yizhou Sun|Meta AI, USA; University of California, Los Angeles, USA|Taxonomies, which organize knowledge hierarchically, support various practical web applications such as product navigation in online shopping and user profile tagging on social platforms. Given the continued and rapid emergence of new entities, maintaining a comprehensive taxonomy in a timely manner through human annotation is prohibitively expensive. 
Therefore, expanding a taxonomy automatically with new entities is essential. Most existing methods for expanding taxonomies encode entities into vector embeddings (i.e., single points). However, we argue that vectors are insufficient to model the “is-a” hierarchy in taxonomy (asymmetrical relation), because two points can only represent pairwise similarity (symmetrical relation). To this end, we propose to project taxonomy entities into boxes (i.e., hyperrectangles). Two boxes can be "contained", "disjoint" and "intersecting", thus naturally representing an asymmetrical taxonomic hierarchy. Upon box embeddings, we propose a novel model BoxTaxo for taxonomy expansion. The core of BoxTaxo is to learn boxes for entities to capture their child-parent hierarchies. To achieve this, BoxTaxo optimizes the box embeddings from a joint view of geometry and probability. BoxTaxo also offers an easy and natural way for inference: examine whether the box of a given new entity is fully enclosed inside the box of a candidate parent from the existing taxonomy. Extensive experiments on two benchmarks demonstrate the effectiveness of BoxTaxo compared to vector based models.|分类法按层次组织知识,支持各种实际的网络应用程序,如在线购物中的产品导航和社交平台上的用户配置文件标签。鉴于新实体的持续和快速出现,通过人工注释及时维护全面的分类是非常昂贵的。因此,使用新实体自动扩展分类法是必不可少的。大多数现有的分类扩展方法都将实体编码为向量嵌入(即单点)。然而,我们认为向量不足以模拟分类学中的“ is-a”层次结构(非对称关系) ,因为两点只能表示成对的相似性(对称关系)。为此,我们建议将分类实体投影到盒子(即超矩形)中。两个盒子可以“包含”、“不相交”和“交叉”,因此自然地代表了一个不对称的分类层次。在盒子嵌入的基础上,提出了一种新的分类扩展模型 BoxTaxo。BoxTaxo 的核心是为实体学习用于捕获其子-父层次结构的框。为了实现这一点,BoxTaxo 从几何和概率的联合视图优化了盒子嵌入。BoxTaxo 还提供了一种简单而自然的推理方法: 检查给定新实体的框是否完全封闭在现有分类法的候选父类的框中。在两个基准上的大量实验证明了 BoxTaxo 与基于向量的模型相比的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Single+Vector+Is+Not+Enough:+Taxonomy+Expansion+via+Box+Embeddings)|0| +|[Knowledge Graph Question Answering with Ambiguous Query](https://doi.org/10.1145/3543507.3583316)|Lihui Liu, Yuzhong Chen, Mahashweta Das, Hao Yang, Hanghang Tong|visa research, USA; Department of Computer Science, University of Illinois at Urbana Champaign, USA|Knowledge graph question answering aims to identify answers of the query according to the facts in the knowledge graph. In the vast majority of the existing works, the input queries are considered perfect and can precisely express the user’s query intention. However, in reality, input queries might be ambiguous and elusive which only contain a limited amount of information. Directly answering these ambiguous queries may yield unwanted answers and deteriorate user experience. In this paper, we propose PReFNet which focuses on answering ambiguous queries with pseudo relevance feedback on knowledge graphs. In order to leverage the hidden (pseudo) relevance information existed in the results that are initially returned from a given query, PReFNet treats the top-k returned candidate answers as a set of most relevant answers, and uses variational Bayesian inference to infer user’s query intention. To boost the quality of the inferred queries, a neighborhood embedding based VGAE model is used to prune inferior inferred queries. The inferred high quality queries will be returned to the users to help them search with ease. Moreover, all the high-quality candidate nodes will be re-ranked according to the inferred queries. 
The experiment results show that our proposed method can recommend high-quality query graphs to users and improve the question answering accuracy.|知识图谱问答旨在根据知识图谱中的事实识别查询的答案。在现有的绝大多数工作中,输入查询被认为是完美的,能够准确表达用户的查询意图。然而,在现实中,输入查询可能模糊且难以捉摸,只包含有限的信息。直接回答这些模糊查询可能产生不想要的答案,并损害用户体验。在本文中,我们提出了 PReFNet,其专注于利用知识图谱上的伪相关反馈来回答模糊查询。为了利用给定查询最初返回结果中隐藏的(伪)相关信息,PReFNet 将返回的 top-k 候选答案视为一组最相关的答案,并使用变分贝叶斯推断来推断用户的查询意图。为了提升推断查询的质量,我们采用基于邻域嵌入的 VGAE 模型来剪除质量较差的推断查询。推断出的高质量查询将返回给用户,帮助他们轻松搜索。此外,所有高质量候选节点将根据推断的查询重新排序。实验结果表明,该方法能够向用户推荐高质量的查询图,并提高问答准确率。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Knowledge+Graph+Question+Answering+with+Ambiguous+Query)|0| |[Hierarchical Self-Attention Embedding for Temporal Knowledge Graph Completion](https://doi.org/10.1145/3543507.3583397)|Xin Ren, Luyi Bai, Qianwen Xiao, Xiangxi Meng|Northeastern University, China|Temporal Knowledge Graph (TKG) is composed of a series of facts related to timestamps in the real world and has become the basis of many artificial intelligence applications. However, the existing TKG is usually incomplete. It has become a hot research task to infer missing facts based on existing facts in a TKG; namely, Temporal Knowledge Graph Completion (TKGC). The current mainstream TKGC models are embedded models that predict missing facts by representing entities, relations and timestamps as low-dimensional vectors. In order to deal with the TKG structure information, there are some models that try to introduce attention mechanism into the embedding process. But they only consider the structure information of entities or relations, and ignore the structure information of the whole TKG. Moreover, most of them usually treat timestamps as a general feature and cannot take advantage of the potential time series information of the timestamp. To solve these problems, we propose a new Hierarchical Self-Attention Embedding (HSAE) model which is inspired by self-attention mechanism and diachronic embedding technique. For structure information of the whole TKG, we divide the TKG into two layers: entity layer and relation layer, and then apply the self-attention mechanism to the entity layer and relation layer respectively to capture the structure information. For time series information of the timestamp, we capture them by combining positional encoding and diachronic embedding technique into the above two self-attention layers. Finally, we can get the embedded representation vectors of entities, relations and timestamps, which can be combined with other models for better results. We evaluate our model on three TKG datasets: ICEWS14, ICEWS05-15 and GDELT. 
Experimental results on the TKGC (interpolation) task demonstrate that our model achieves state-of-the-art results.|时间知识图(TKG)由现实世界中与时间戳相关的一系列事实组成,已成为许多人工智能应用的基础。然而,现有的 TKG 通常是不完整的。基于 TKG 中已有事实推断缺失事实,即时态知识图完成(TKGC) ,已成为一个热门的研究课题。目前主流的 TKGC 模型是嵌入式模型,通过将实体、关系和时间戳表示为低维向量来预测缺失事实。为了处理 TKG 的结构信息,有一些模型尝试在嵌入过程中引入注意机制。但他们只考虑实体或关系的结构信息,而忽略了整个 TKG 的结构信息。此外,它们中的大多数通常将时间戳视为一个通用特性,不能利用时间戳的潜在时间序列信息。为了解决这些问题,我们提出了一种新的分层自我注意嵌入(HSAE)模型,该模型受自我注意机制和历时嵌入技术的启发。对于整个 TKG 的结构信息,我们将 TKG 分为实体层和关系层两个层次,然后将自注意机制分别应用于实体层和关系层,以获取 TKG 的结构信息。对于时间戳的时间序列信息,我们将位置编码和历时嵌入技术结合到上述两个自我注意层中来获取它们。最后,我们可以得到实体、关系和时间戳的嵌入式表示向量,这些表示向量可以与其他模型相结合以获得更好的结果。我们在三个 TKG 数据集上评估我们的模型: ICEWS14,ICEWS05-15和 GDELT。在 TKGC (插值)任务上的实验结果表明,我们的模型达到了最先进的效果。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Hierarchical+Self-Attention+Embedding+for+Temporal+Knowledge+Graph+Completion)|0| -|[Link Prediction with Attention Applied on Multiple Knowledge Graph Embedding Models](https://doi.org/10.1145/3543507.3583358)|Cosimo Gregucci, Mojtaba Nayyeri, Daniel Hernández, Steffen Staab|University of Stuttgart, Germany and University of Southampton, United Kingdom; University of Stuttgart, Germany|Predicting missing links between entities in a knowledge graph is a fundamental task to deal with the incompleteness of data on the Web. Knowledge graph embeddings map nodes into a vector space to predict new links, scoring them according to geometric criteria. Relations in the graph may follow patterns that can be learned, e.g., some relations might be symmetric and others might be hierarchical. However, the learning capability of different embedding models varies for each pattern and, so far, no single model can learn all patterns equally well. In this paper, we combine the query representations from several models in a unified one to incorporate patterns that are independently captured by each model. Our combination uses attention to select the most suitable model to answer each query. The models are also mapped onto a non-Euclidean manifold, the Poincar\'e ball, to capture structural patterns, such as hierarchies, besides relational patterns, such as symmetry. We prove that our combination provides a higher expressiveness and inference power than each model on its own. As a result, the combined model can learn relational and structural patterns. 
We conduct extensive experimental analysis with various link prediction benchmarks showing that the combined model outperforms individual models, including state-of-the-art approaches.|预测知识图中实体之间的缺失链接是处理网络数据不完整性的基本任务。知识图嵌入将节点映射到向量空间中,预测新的链接,并根据几何标准对其进行评分。图中的关系可能遵循可以学习的模式,例如,一些关系可能是对称的,另一些可能是等级的。然而,不同嵌入模型的学习能力因模式的不同而异,到目前为止,还没有一个单独的模型能够同样好地学习所有的模式。在本文中,我们将来自多个模型的查询表示结合在一个统一的模型中,以合并由每个模型独立捕获的模式。我们的组合使用注意力来选择最合适的模型来回答每个查询。这些模型还被映射到一个非欧几里德流形,即 Poincar‘ e 球,以捕获结构模式,如层次结构,以及关系模式,如对称性。我们证明,我们的组合提供了更高的表达能力和推理能力比每个模型本身。因此,组合模型可以学习关系模式和结构模式。我们进行了广泛的实验分析与各种链接预测基准表明,组合模型优于个别模型,包括最先进的方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Link+Prediction+with+Attention+Applied+on+Multiple+Knowledge+Graph+Embedding+Models)|0| -|[SeqCare: Sequential Training with External Medical Knowledge Graph for Diagnosis Prediction in Healthcare Data](https://doi.org/10.1145/3543507.3583543)|Yongxin Xu, Xu Chu, Kai Yang, Zhiyuan Wang, Peinie Zou, Hongxin Ding, Junfeng Zhao, Yasha Wang, Bing Xie|Zhongguancun Laboratory, China; Key Laboratory of High Confidence Software Technologies, Ministry of Education, China and Peking University, China; Tsinghua University, China|Deep learning techniques are capable of capturing complex input-output relationships, and have been widely applied to the diagnosis prediction task based on web-based patient electronic health records (EHR) data. To improve the prediction and interpretability of pure data-driven deep learning with only a limited amount of labeled data, a pervasive trend is to assist the model training with knowledge priors from online medical knowledge graphs. However, they marginally investigated the label imbalance and the task-irrelevant noise in the external knowledge graph. The imbalanced label distribution would bias the learning and knowledge extraction towards the majority categories. The task-irrelevant noise introduces extra uncertainty to the model performance. To this end, aiming at by-passing the bias-variance trade-off dilemma, we introduce a new sequential learning framework, dubbed SeqCare, for diagnosis prediction with online medical knowledge graphs. Concretely, in the first step, SeqCare learns a bias-reduced space through a self-supervised graph contrastive learning task. Secondly, SeqCare reduces the learning uncertainty by refining the supervision signal and the graph structure of the knowledge graph simultaneously. Lastly, SeqCare trains the model in the bias-variance reduced space with a self-distillation to further filter out irrelevant information in the data. Experimental evaluations on two real-world datasets show that SeqCare outperforms state-of-the-art approaches. Case studies exemplify the interpretability of SeqCare. 
Moreover, the medical findings discovered by SeqCare are consistent with experts and medical literature.|深度学习技术能够捕捉复杂的输入输出关系,已广泛应用于基于网络病人电子健康记录(EHR)数据的诊断预测任务中。为了提高纯数据驱动的深度学习的预测性和可解释性,通过在线医学知识图的知识先验来辅助模型训练是一个普遍的趋势。然而,他们对外部知识图中的标签不平衡和任务不相关噪声的研究很少。不均衡的标签分布会使学习和知识抽取偏向于大多数类别。任务无关噪声给模型性能带来额外的不确定性。为此,针对偏差-方差权衡困境,我们引入了一个新的序列学习框架,称为 SeqCare,用于在线医学知识图的诊断预测。具体地说,在第一步中,SeqCare 通过一个自监督的图形对比学习任务学习一个减少偏差的空间。其次,SeqCare 通过同时细化监督信号和知识图的图形结构来降低学习的不确定性。最后,SeqCare 在偏差-方差缩减空间中用自精馏的方法对模型进行训练,以进一步滤除数据中的不相关信息。对两个真实世界数据集的实验评估表明,SeqCare 的性能优于最先进的方法。案例研究例证了 SeqCare 的可解释性。此外,SeqCare 发现的医学发现与专家和医学文献一致。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=SeqCare:+Sequential+Training+with+External+Medical+Knowledge+Graph+for+Diagnosis+Prediction+in+Healthcare+Data)|0| +|[Link Prediction with Attention Applied on Multiple Knowledge Graph Embedding Models](https://doi.org/10.1145/3543507.3583358)|Cosimo Gregucci, Mojtaba Nayyeri, Daniel Hernández, Steffen Staab|University of Stuttgart, Germany; University of Stuttgart, Germany and University of Southampton, United Kingdom|Predicting missing links between entities in a knowledge graph is a fundamental task to deal with the incompleteness of data on the Web. Knowledge graph embeddings map nodes into a vector space to predict new links, scoring them according to geometric criteria. Relations in the graph may follow patterns that can be learned, e.g., some relations might be symmetric and others might be hierarchical. However, the learning capability of different embedding models varies for each pattern and, so far, no single model can learn all patterns equally well. In this paper, we combine the query representations from several models in a unified one to incorporate patterns that are independently captured by each model. Our combination uses attention to select the most suitable model to answer each query. The models are also mapped onto a non-Euclidean manifold, the Poincar\'e ball, to capture structural patterns, such as hierarchies, besides relational patterns, such as symmetry. We prove that our combination provides a higher expressiveness and inference power than each model on its own. As a result, the combined model can learn relational and structural patterns. 
We conduct extensive experimental analysis with various link prediction benchmarks showing that the combined model outperforms individual models, including state-of-the-art approaches.|预测知识图谱中实体之间的缺失链接是应对网络数据不完整性的基础任务。知识图谱嵌入将节点映射到向量空间以预测新的链接,并根据几何准则对其评分。图中的关系可能遵循可学习的模式,例如,一些关系可能是对称的,另一些可能是层次的。然而,不同嵌入模型的学习能力因模式而异,迄今为止,没有任何单一模型能够同样好地学习所有模式。在本文中,我们将来自多个模型的查询表示结合到一个统一的表示中,以纳入各模型独立捕获的模式。我们的组合使用注意力机制为每个查询选择最合适的模型。这些模型还被映射到一个非欧几里得流形,即庞加莱球(Poincaré ball)上,以便在对称等关系模式之外,捕获层次等结构模式。我们证明,我们的组合比任何单个模型都具有更高的表达能力和推理能力。因此,组合模型能够学习关系模式和结构模式。我们在各种链接预测基准上进行了广泛的实验分析,结果表明组合模型优于各个单独模型,包括最先进的方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Link+Prediction+with+Attention+Applied+on+Multiple+Knowledge+Graph+Embedding+Models)|0| +|[SeqCare: Sequential Training with External Medical Knowledge Graph for Diagnosis Prediction in Healthcare Data](https://doi.org/10.1145/3543507.3583543)|Yongxin Xu, Xu Chu, Kai Yang, Zhiyuan Wang, Peinie Zou, Hongxin Ding, Junfeng Zhao, Yasha Wang, Bing Xie|Tsinghua University, China; Zhongguancun Laboratory, China; Key Laboratory of High Confidence Software Technologies, Ministry of Education, China and Peking University, China|Deep learning techniques are capable of capturing complex input-output relationships, and have been widely applied to the diagnosis prediction task based on web-based patient electronic health records (EHR) data. To improve the prediction and interpretability of pure data-driven deep learning with only a limited amount of labeled data, a pervasive trend is to assist the model training with knowledge priors from online medical knowledge graphs. However, they marginally investigated the label imbalance and the task-irrelevant noise in the external knowledge graph. The imbalanced label distribution would bias the learning and knowledge extraction towards the majority categories. The task-irrelevant noise introduces extra uncertainty to the model performance. To this end, aiming at by-passing the bias-variance trade-off dilemma, we introduce a new sequential learning framework, dubbed SeqCare, for diagnosis prediction with online medical knowledge graphs. Concretely, in the first step, SeqCare learns a bias-reduced space through a self-supervised graph contrastive learning task. Secondly, SeqCare reduces the learning uncertainty by refining the supervision signal and the graph structure of the knowledge graph simultaneously. Lastly, SeqCare trains the model in the bias-variance reduced space with a self-distillation to further filter out irrelevant information in the data. Experimental evaluations on two real-world datasets show that SeqCare outperforms state-of-the-art approaches. Case studies exemplify the interpretability of SeqCare. 
Moreover, the medical findings discovered by SeqCare are consistent with experts and medical literature.|深度学习技术能够捕捉复杂的输入输出关系,已被广泛应用于基于 Web 的患者电子健康记录(EHR)数据的诊断预测任务。为了在仅有少量标注数据的情况下提高纯数据驱动深度学习的预测能力和可解释性,一个普遍的趋势是利用在线医学知识图谱中的先验知识来辅助模型训练。然而,这些工作很少研究外部知识图谱中的标签不平衡和任务无关噪声。不平衡的标签分布会使学习和知识抽取偏向多数类别,任务无关噪声则给模型性能带来额外的不确定性。为此,为绕开偏差-方差权衡的困境,我们引入了一个新的序贯学习框架 SeqCare,用于基于在线医学知识图谱的诊断预测。具体而言,第一步,SeqCare 通过自监督的图对比学习任务学习一个偏差减小的空间;第二步,SeqCare 通过同时细化监督信号和知识图谱的图结构来降低学习的不确定性;最后,SeqCare 在偏差-方差减小的空间中以自蒸馏方式训练模型,进一步滤除数据中的无关信息。在两个真实数据集上的实验评估表明,SeqCare 优于最先进的方法。案例研究例证了 SeqCare 的可解释性。此外,SeqCare 得出的医学发现与专家意见和医学文献一致。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=SeqCare:+Sequential+Training+with+External+Medical+Knowledge+Graph+for+Diagnosis+Prediction+in+Healthcare+Data)|0| |[The Thin Ideology of Populist Advertising on Facebook during the 2019 EU Elections](https://doi.org/10.1145/3543507.3583267)|Arthur Capozzi, Gianmarco De Francisci Morales, Yelena Mejova, Corrado Monti, André Panisson|ISI Foundation, Italy; Centai, Italy; Computer Science, Universita' di Torino, Italy|Social media has been an important tool in the expansion of the populist message, and it is thought to have contributed to the electoral success of populist parties in the past decade. This study compares how populist parties advertised on Facebook during the 2019 European Parliamentary election. In particular, we examine commonalities and differences in which audiences they reach and on which issues they focus. By using data from Meta (previously Facebook) Ad Library, we analyze 45k ad campaigns by 39 parties, both populist and mainstream, in Germany, United Kingdom, Italy, Spain, and Poland. While populist parties represent just over 20% of the total expenditure on political ads, they account for 40% of the total impressions–most of which from Eurosceptic and far-right parties–thus hinting at a competitive advantage for populist parties on Facebook. We further find that ads posted by populist parties are more likely to reach male audiences, and sometimes much older ones. In terms of issues, populist politicians focus on monetary policy, state bureaucracy and reforms, and security, while the focus on EU and Brexit is on par with non-populist, mainstream parties. However, issue preferences are largely country-specific, thus supporting the view in political science that populism is a "thin ideology", that does not have a universal, coherent policy agenda. 
This study illustrates the usefulness of publicly available advertising data for monitoring the populist outreach to, and engagement with, millions of potential voters, while outlining the limitations of currently available data.|社交媒体一直是传播民粹主义信息的重要工具,据认为它在过去十年中为民粹主义政党的选举成功做出了贡献。本研究比较了 2019 年欧洲议会选举期间各民粹主义政党在 Facebook 上的广告投放方式,特别考察了它们在触达受众和关注议题上的共性与差异。通过使用 Meta(前身为 Facebook)广告库的数据,我们分析了德国、英国、意大利、西班牙和波兰 39 个政党(包括民粹主义政党和主流政党)的 4.5 万个广告活动。尽管民粹主义政党仅占政治广告总支出的 20% 多一点,却占据了总展示量的 40%(其中大部分来自疑欧和极右翼政党),这暗示民粹主义政党在 Facebook 上具有竞争优势。我们进一步发现,民粹主义政党发布的广告更有可能触达男性受众,有时还触达年龄更大的受众。在议题方面,民粹主义政客关注货币政策、国家官僚体系与改革以及安全,而对欧盟和英国脱欧的关注程度则与非民粹主义的主流政党相当。然而,议题偏好在很大程度上因国家而异,这支持了政治学中的观点,即民粹主义是一种“单薄的意识形态”,没有普遍而连贯的政策议程。本研究说明了公开可用的广告数据在监测民粹主义政党对数百万潜在选民的触达与互动方面的有用性,同时也指出了现有数据的局限性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=The+Thin+Ideology+of+Populist+Advertising+on+Facebook+during+the+2019+EU+Elections)|0| -|[FlexiFed: Personalized Federated Learning for Edge Clients with Heterogeneous Model Architectures](https://doi.org/10.1145/3543507.3583347)|Kaibin Wang, Qiang He, Feifei Chen, Chunyang Chen, Faliang Huang, Hai Jin, Yun Yang|Deakin University, Australia; Huazhong University of Science and Technology, China and Swinburne University of Technology, Australia; Swinburne University of Technology, Australia; Huazhong University of Science and Technology, China; Nanning Normal University, China; Monash University, Australia|Mobile and Web-of-Things (WoT) devices at the network edge account for more than half of the world’s web traffic, making a great data source for various machine learning (ML) applications, particularly federated learning (FL) which offers a promising solution to privacy-preserving ML feeding on these data. FL allows edge mobile and WoT devices to train a shared global ML model under the orchestration of a central parameter server. In the real world, due to resource heterogeneity, these edge devices often train different versions of models (e.g., VGG-16 and VGG-19) or different ML models (e.g., VGG and ResNet) for the same ML task (e.g., computer vision and speech recognition). Existing FL schemes have assumed that participating edge devices share a common model architecture, and thus cannot facilitate FL across edge devices with heterogeneous ML model architectures. We explored this architecture heterogeneity challenge and found that FL can and should accommodate these edge devices to improve model accuracy and accelerate model training. This paper presents our findings and FlexiFed, a novel scheme for FL across edge devices with heterogeneous model architectures, and three model aggregation strategies for accommodating architecture heterogeneity under FlexiFed. 
Experiments with four widely-used ML models on four public datasets demonstrate 1) the usefulness of FlexiFed; and 2) that compared with the state-of-the-art FL scheme, FlexiFed improves model accuracy by 2.6%-9.7% and accelerates model convergence by 1.24 × -4.04 ×.|处于网络边缘的移动和物联网(WoT)设备占据了世界网络流量的一半以上,为各种机器学习(ML)应用提供了一个巨大的数据源,特别是联邦学习(FL) ,它为依靠这些数据保护隐私的机器学习提供了一个有前途的解决方案。FL 允许边缘移动和 WoT 设备在一个中心参数服务器的协调下训练一个共享的全球 ML 模型。在现实世界中,由于资源的异质性,这些边缘设备经常为相同的机器学习任务训练不同版本的模型(例如,VGG-16和 VGG-19)或不同的机器学习模型(例如,VGG 和 ResNet)(例如,计算机视觉和语音识别)。现有的 FL 方案假定参与的边缘设备共享一个共同的模型架构,因此不能促进具有异构机器学习模型架构的边缘设备之间的 FL。我们探讨了这种体系结构异构性的挑战,发现 FL 可以而且应该适应这些边缘设备,以提高模型精度和加速模型训练。本文介绍了我们的研究结果和 FlexiFed,这是一个跨具有异构模型结构的边缘设备的 FL 的新方案,以及在 FlexiFed 下适应结构异构性的三种模型聚合策略。在四个公共数据集上对四个广泛使用的机器学习模型进行的实验表明: 1) FlexiFed 的有用性; 2)与最先进的 FL 方案相比,FlexiFed 将模型精度提高了2.6% -9.7% ,并加速了模型收敛1.24 × -4.04 × 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=FlexiFed:+Personalized+Federated+Learning+for+Edge+Clients+with+Heterogeneous+Model+Architectures)|0| -|[PipeEdge: A Trusted Pipelining Collaborative Edge Training based on Blockchain](https://doi.org/10.1145/3543507.3583413)|Liang Yuan, Qiang He, Feifei Chen, Ruihan Dou, Hai Jin, Yun Yang|Deakin University, Australia; Huazhong University of Science and Technology, China and Swinburne University of Technology, Australia; University of Waterloo, Canada; Huazhong University of Science and Technology, China; Swinburne University of Technology, Australia|Powered by the massive data generated by the blossom of mobile and Web-of-Things (WoT) devices, Deep Neural Networks (DNNs) have developed both in accuracy and size in recent years. Conventional cloud-based DNN training incurs rapidly-increasing data and model transmission overheads as well as privacy issues. Mobile edge computing (MEC) provides a promising solution by facilitating DNN model training on edge servers at the network edge. However, edge servers often suffer from constrained resources and need to collaborate on DNN training. Unfortunately, managed by different telecoms, edge servers cannot properly collaborate with each other without incentives and trust. In this paper, we introduce PipeEdge, a scheme that promotes collaborative edge training between edge servers by introducing incentives and trust based on blockchain. Under the PipeEdge scheme, edge servers can hire trustworthy workers for pipelined DNN training tasks based on model parallelism. We implement PipeEdge and evaluate it comprehensively with four different DNN models. 
The results show that it outperforms state-of-the-art schemes by up to 173.98% with negligible overheads.|深度神经网络(DNN)由移动设备和物联网(WoT)设备的蓬勃发展所产生的大量数据所驱动,近年来在精度和规模上都有所发展。传统的基于云的 DNN 培训会带来快速增长的数据和模型传输开销以及隐私问题。移动边缘计算(MEC)为网络边缘服务器上的 DNN 模型训练提供了一种有前途的解决方案。然而,边缘服务器经常受到资源的限制,需要协作进行 DNN 培训。不幸的是,由不同电信公司管理的边缘服务器如果没有激励和信任,就无法正确地相互协作。本文介绍了 PipeEdge 方案,该方案通过引入基于区块链的激励和信任来促进边缘服务器之间的协同边缘训练。在 PipeEdge 方案下,边缘服务器可以雇佣值得信赖的工人,根据模型并行性执行流水线 DNN 培训任务。我们实现了 PipeEdge,并使用四种不同的 DNN 模型对其进行了综合评估。结果表明,该方案的性能优于最先进的方案达173.98% ,开销可以忽略不计。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=PipeEdge:+A+Trusted+Pipelining+Collaborative+Edge+Training+based+on+Blockchain)|0| -|[ELASTIC: Edge Workload Forecasting based on Collaborative Cloud-Edge Deep Learning](https://doi.org/10.1145/3543507.3583436)|Yanan Li, Haitao Yuan, Zhe Fu, Xiao Ma, Mengwei Xu, Shangguang Wang|Beijing University of Posts and Telecommunications, China; Tsinghua University, China; Nanyang Technological University, Singapore|With the rapid development of edge computing in the post-COVID19 pandemic period, precise workload forecasting is considered the basis for making full use of the edge limited resources, and both edge service providers (ESPs) and edge service consumers (ESCs) can benefit significantly from it. Existing paradigms of workload forecasting (i.e., edge-only or cloud-only) are improper, due to failing to consider the inter-site correlations and might suffer from significant data transmission delays. With the increasing adoption of edge platforms by web services, it is critical to balance both accuracy and efficiency in workload forecasting. In this paper, we propose ELASTIC, which is the first study that leverages a cloud-edge collaborative paradigm for edge workload forecasting with multi-view graphs. Specifically, at the global stage, we design a learnable aggregation layer on each edge site to reduce the time consumption while capturing the inter-site correlation. Additionally, at the local stage, we design a disaggregation layer combining both the intra-site correlation and inter-site correlation to improve the prediction accuracy. Extensive experiments on realistic edge workload datasets collected from China’s largest edge service provider show that ELASTIC outperforms state-of-the-art methods, decreases time consumption, and reduces communication cost.|随着后 COVID19大流行时期边缘计算的快速发展,精确的工作量预测被认为是充分利用边缘有限资源的基础,边缘服务提供商(ESP)和边缘服务消费者(ESCs)都可以从中受益匪浅。现有的工作量预测模式(即仅边缘预测或仅云预测)是不适当的,因为没有考虑到站点间的相关性,并可能遭受显著的数据传输延迟。随着 Web 服务越来越多地采用边缘平台,在工作负载预测中平衡准确性和效率至关重要。在本文中,我们提出了 ELASTIC,这是第一个利用云端协作范式进行多视图边缘工作负荷预测的研究。具体来说,在全局阶段,我们在每个边缘站点上设计一个可学习的聚合层,以减少时间消耗,同时捕获站点间的相关性。此外,在局部阶段,我们设计了一个解体层,将场内相关性和场间相关性结合起来,以提高预测的准确性。对中国最大的边缘服务提供商收集的真实边缘工作负载数据集进行的大量实验表明,ELASTIC 优于最先进的方法,减少了时间消耗,降低了通信成本。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ELASTIC:+Edge+Workload+Forecasting+based+on+Collaborative+Cloud-Edge+Deep+Learning)|0| -|[DUET: A Tuning-Free Device-Cloud Collaborative Parameters Generation Framework for Efficient Device Model Generalization](https://doi.org/10.1145/3543507.3583451)|Zheqi Lv, Wenqiao Zhang, Shengyu Zhang, Kun Kuang, Feng Wang, Yongwei Wang, Zhengyu Chen, Tao Shen, Hongxia Yang, Beng Chin Ooi, Fei Wu|National University of Singapore, Singapore; Zhejiang University, China; Alibaba Group, China|Device Model Generalization (DMG) is a practical yet under-investigated research topic for on-device machine learning applications. 
It aims to improve the generalization ability of pre-trained models when deployed on resource-constrained devices, such as improving the performance of pre-trained cloud models on smart mobiles. While quite a lot of works have investigated the data distribution shift across clouds and devices, most of them focus on model fine-tuning on personalized data for individual devices to facilitate DMG. Despite their promising, these approaches require on-device re-training, which is practically infeasible due to the overfitting problem and high time delay when performing gradient calculation on real-time data. In this paper, we argue that the computational cost brought by fine-tuning can be rather unnecessary. We consequently present a novel perspective to improving DMG without increasing computational cost, i.e., device-specific parameter generation which directly maps data distribution to parameters. Specifically, we propose an efficient Device-cloUd collaborative parametErs generaTion framework DUET. DUET is deployed on a powerful cloud server that only requires the low cost of forwarding propagation and low time delay of data transmission between the device and the cloud. By doing so, DUET can rehearse the device-specific model weight realizations conditioned on the personalized real-time data for an individual device. Importantly, our DUET elegantly connects the cloud and device as a 'duet' collaboration, frees the DMG from fine-tuning, and enables a faster and more accurate DMG paradigm. We conduct an extensive experimental study of DUET on three public datasets, and the experimental results confirm our framework's effectiveness and generalisability for different DMG tasks.|设备模型综合(DMG)是设备上机器学习应用中一个实用但尚未得到充分研究的课题。它旨在提高预先训练的模型在资源有限的设备上部署时的泛化能力,例如提高预先训练的云模型在智能手机上的性能。虽然许多工作已经研究了跨云和设备的数据分布转移,但大多数工作集中在针对个人设备的个性化数据的模型微调上,以促进 DMG。尽管这些方法很有前景,但是需要在设备上进行再训练,这在实际中是不可行的,因为在对实时数据进行梯度计算时,存在过拟合问题和高时延。在本文中,我们认为微调带来的计算成本可能是相当不必要的。因此,我们提出了一个新的视角来改善 DMG 而不增加计算成本,即,设备特定的参数生成,直接映射数据分布到参数。具体地说,我们提出了一种高效的设备云协同参数生成框架 DUET。DUET 部署在一个强大的云服务器上,只需要较低的转发传播成本和较低的设备与云之间的数据传输延迟。通过这样做,DUET 可以预演设备特定的模型权重实现条件下的个性化实时数据为单个设备。重要的是,我们的 DUET 作为一个“二重奏”协作优雅地连接了云和设备,从微调中解放了 DMG,并支持更快、更准确的 DMG 范例。我们对三个公共数据集上的 DUET 进行了广泛的实验研究,实验结果证实了我们的框架对不同 DMG 任务的有效性和通用性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=DUET:+A+Tuning-Free+Device-Cloud+Collaborative+Parameters+Generation+Framework+for+Efficient+Device+Model+Generalization)|0| +|[FlexiFed: Personalized Federated Learning for Edge Clients with Heterogeneous Model Architectures](https://doi.org/10.1145/3543507.3583347)|Kaibin Wang, Qiang He, Feifei Chen, Chunyang Chen, Faliang Huang, Hai Jin, Yun Yang|Swinburne University of Technology, Australia; Monash University, Australia; Nanning Normal University, China; Huazhong University of Science and Technology, China; Deakin University, Australia; Huazhong University of Science and Technology, China and Swinburne University of Technology, Australia|Mobile and Web-of-Things (WoT) devices at the network edge account for more than half of the world’s web traffic, making a great data source for various machine learning (ML) applications, particularly federated learning (FL) which offers a promising solution to privacy-preserving ML feeding on these data. FL allows edge mobile and WoT devices to train a shared global ML model under the orchestration of a central parameter server. 
In the real world, due to resource heterogeneity, these edge devices often train different versions of models (e.g., VGG-16 and VGG-19) or different ML models (e.g., VGG and ResNet) for the same ML task (e.g., computer vision and speech recognition). Existing FL schemes have assumed that participating edge devices share a common model architecture, and thus cannot facilitate FL across edge devices with heterogeneous ML model architectures. We explored this architecture heterogeneity challenge and found that FL can and should accommodate these edge devices to improve model accuracy and accelerate model training. This paper presents our findings and FlexiFed, a novel scheme for FL across edge devices with heterogeneous model architectures, and three model aggregation strategies for accommodating architecture heterogeneity under FlexiFed. Experiments with four widely-used ML models on four public datasets demonstrate 1) the usefulness of FlexiFed; and 2) that compared with the state-of-the-art FL scheme, FlexiFed improves model accuracy by 2.6%-9.7% and accelerates model convergence by 1.24 × -4.04 ×.|处于网络边缘的移动和物联网(WoT)设备占据了世界网络流量的一半以上,为各种机器学习(ML)应用提供了一个巨大的数据源,特别是联邦学习(FL) ,它为依靠这些数据保护隐私的机器学习提供了一个有前途的解决方案。FL 允许边缘移动和 WoT 设备在一个中心参数服务器的协调下训练一个共享的全球 ML 模型。在现实世界中,由于资源的异质性,这些边缘设备经常为相同的机器学习任务训练不同版本的模型(例如,VGG-16和 VGG-19)或不同的机器学习模型(例如,VGG 和 ResNet)(例如,计算机视觉和语音识别)。现有的 FL 方案假定参与的边缘设备共享一个共同的模型架构,因此不能促进具有异构机器学习模型架构的边缘设备之间的 FL。我们探讨了这种体系结构异构性的挑战,发现 FL 可以而且应该适应这些边缘设备,以提高模型精度和加速模型训练。本文介绍了我们的研究结果和 FlexiFed,这是一个跨具有异构模型结构的边缘设备的 FL 的新方案,以及在 FlexiFed 下适应结构异构性的三种模型聚合策略。在四个公共数据集上对四个广泛使用的机器学习模型进行的实验表明: 1) FlexiFed 的有用性; 2)与最先进的 FL 方案相比,FlexiFed 将模型精度提高了2.6% -9.7% ,并加速了模型收敛1.24 × -4.04 × 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=FlexiFed:+Personalized+Federated+Learning+for+Edge+Clients+with+Heterogeneous+Model+Architectures)|0| +|[PipeEdge: A Trusted Pipelining Collaborative Edge Training based on Blockchain](https://doi.org/10.1145/3543507.3583413)|Liang Yuan, Qiang He, Feifei Chen, Ruihan Dou, Hai Jin, Yun Yang|Swinburne University of Technology, Australia; University of Waterloo, Canada; Huazhong University of Science and Technology, China; Deakin University, Australia; Huazhong University of Science and Technology, China and Swinburne University of Technology, Australia|Powered by the massive data generated by the blossom of mobile and Web-of-Things (WoT) devices, Deep Neural Networks (DNNs) have developed both in accuracy and size in recent years. Conventional cloud-based DNN training incurs rapidly-increasing data and model transmission overheads as well as privacy issues. Mobile edge computing (MEC) provides a promising solution by facilitating DNN model training on edge servers at the network edge. However, edge servers often suffer from constrained resources and need to collaborate on DNN training. Unfortunately, managed by different telecoms, edge servers cannot properly collaborate with each other without incentives and trust. In this paper, we introduce PipeEdge, a scheme that promotes collaborative edge training between edge servers by introducing incentives and trust based on blockchain. Under the PipeEdge scheme, edge servers can hire trustworthy workers for pipelined DNN training tasks based on model parallelism. We implement PipeEdge and evaluate it comprehensively with four different DNN models. 
The results show that it outperforms state-of-the-art schemes by up to 173.98% with negligible overheads.|深度神经网络(DNN)由移动设备和物联网(WoT)设备的蓬勃发展所产生的大量数据所驱动,近年来在精度和规模上都有所发展。传统的基于云的 DNN 培训会带来快速增长的数据和模型传输开销以及隐私问题。移动边缘计算(MEC)为网络边缘服务器上的 DNN 模型训练提供了一种有前途的解决方案。然而,边缘服务器经常受到资源的限制,需要协作进行 DNN 培训。不幸的是,由不同电信公司管理的边缘服务器如果没有激励和信任,就无法正确地相互协作。本文介绍了 PipeEdge 方案,该方案通过引入基于区块链的激励和信任来促进边缘服务器之间的协同边缘训练。在 PipeEdge 方案下,边缘服务器可以雇佣值得信赖的工人,根据模型并行性执行流水线 DNN 培训任务。我们实现了 PipeEdge,并使用四种不同的 DNN 模型对其进行了综合评估。结果表明,该方案的性能优于最先进的方案达173.98% ,开销可以忽略不计。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=PipeEdge:+A+Trusted+Pipelining+Collaborative+Edge+Training+based+on+Blockchain)|0| +|[ELASTIC: Edge Workload Forecasting based on Collaborative Cloud-Edge Deep Learning](https://doi.org/10.1145/3543507.3583436)|Yanan Li, Haitao Yuan, Zhe Fu, Xiao Ma, Mengwei Xu, Shangguang Wang|Tsinghua University, China; Beijing University of Posts and Telecommunications, China; Nanyang Technological University, Singapore|With the rapid development of edge computing in the post-COVID19 pandemic period, precise workload forecasting is considered the basis for making full use of the edge limited resources, and both edge service providers (ESPs) and edge service consumers (ESCs) can benefit significantly from it. Existing paradigms of workload forecasting (i.e., edge-only or cloud-only) are improper, due to failing to consider the inter-site correlations and might suffer from significant data transmission delays. With the increasing adoption of edge platforms by web services, it is critical to balance both accuracy and efficiency in workload forecasting. In this paper, we propose ELASTIC, which is the first study that leverages a cloud-edge collaborative paradigm for edge workload forecasting with multi-view graphs. Specifically, at the global stage, we design a learnable aggregation layer on each edge site to reduce the time consumption while capturing the inter-site correlation. Additionally, at the local stage, we design a disaggregation layer combining both the intra-site correlation and inter-site correlation to improve the prediction accuracy. Extensive experiments on realistic edge workload datasets collected from China’s largest edge service provider show that ELASTIC outperforms state-of-the-art methods, decreases time consumption, and reduces communication cost.|随着后 COVID19大流行时期边缘计算的快速发展,精确的工作量预测被认为是充分利用边缘有限资源的基础,边缘服务提供商(ESP)和边缘服务消费者(ESCs)都可以从中受益匪浅。现有的工作量预测模式(即仅边缘预测或仅云预测)是不适当的,因为没有考虑到站点间的相关性,并可能遭受显著的数据传输延迟。随着 Web 服务越来越多地采用边缘平台,在工作负载预测中平衡准确性和效率至关重要。在本文中,我们提出了 ELASTIC,这是第一个利用云端协作范式进行多视图边缘工作负荷预测的研究。具体来说,在全局阶段,我们在每个边缘站点上设计一个可学习的聚合层,以减少时间消耗,同时捕获站点间的相关性。此外,在局部阶段,我们设计了一个解体层,将场内相关性和场间相关性结合起来,以提高预测的准确性。对中国最大的边缘服务提供商收集的真实边缘工作负载数据集进行的大量实验表明,ELASTIC 优于最先进的方法,减少了时间消耗,降低了通信成本。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ELASTIC:+Edge+Workload+Forecasting+based+on+Collaborative+Cloud-Edge+Deep+Learning)|0| +|[DUET: A Tuning-Free Device-Cloud Collaborative Parameters Generation Framework for Efficient Device Model Generalization](https://doi.org/10.1145/3543507.3583451)|Zheqi Lv, Wenqiao Zhang, Shengyu Zhang, Kun Kuang, Feng Wang, Yongwei Wang, Zhengyu Chen, Tao Shen, Hongxia Yang, Beng Chin Ooi, Fei Wu|Zhejiang University, China; National University of Singapore, Singapore; Alibaba Group, China|Device Model Generalization (DMG) is a practical yet under-investigated research topic for on-device machine learning applications. 
It aims to improve the generalization ability of pre-trained models when deployed on resource-constrained devices, such as improving the performance of pre-trained cloud models on smart mobiles. While quite a lot of works have investigated the data distribution shift across clouds and devices, most of them focus on model fine-tuning on personalized data for individual devices to facilitate DMG. Despite their promise, these approaches require on-device re-training, which is practically infeasible due to the overfitting problem and high time delay when performing gradient calculation on real-time data. In this paper, we argue that the computational cost brought by fine-tuning can be rather unnecessary. We consequently present a novel perspective on improving DMG without increasing computational cost, i.e., device-specific parameter generation which directly maps data distribution to parameters. Specifically, we propose an efficient Device-cloUd collaborative parametErs generaTion framework DUET. DUET is deployed on a powerful cloud server that only requires the low cost of forwarding propagation and low time delay of data transmission between the device and the cloud. By doing so, DUET can rehearse the device-specific model weight realizations conditioned on the personalized real-time data for an individual device. Importantly, our DUET elegantly connects the cloud and device as a 'duet' collaboration, frees the DMG from fine-tuning, and enables a faster and more accurate DMG paradigm. We conduct an extensive experimental study of DUET on three public datasets, and the experimental results confirm our framework's effectiveness and generalisability for different DMG tasks.|设备模型综合(DMG)是设备上机器学习应用中一个实用但尚未得到充分研究的课题。它旨在提高预先训练的模型在资源有限的设备上部署时的泛化能力,例如提高预先训练的云模型在智能手机上的性能。虽然许多工作已经研究了跨云和设备的数据分布转移,但大多数工作集中在针对个人设备的个性化数据的模型微调上,以促进 DMG。尽管这些方法很有前景,但是需要在设备上进行再训练,这在实际中是不可行的,因为在对实时数据进行梯度计算时,存在过拟合问题和高时延。在本文中,我们认为微调带来的计算成本可能是相当不必要的。因此,我们提出了一个新的视角来改善 DMG 而不增加计算成本,即,设备特定的参数生成,直接映射数据分布到参数。具体地说,我们提出了一种高效的设备云协同参数生成框架 DUET。DUET 部署在一个强大的云服务器上,只需要较低的转发传播成本和较低的设备与云之间的数据传输延迟。通过这样做,DUET 可以预演设备特定的模型权重实现条件下的个性化实时数据为单个设备。重要的是,我们的 DUET 作为一个“二重奏”协作优雅地连接了云和设备,从微调中解放了 DMG,并支持更快、更准确的 DMG 范例。我们对三个公共数据集上的 DUET 进行了广泛的实验研究,实验结果证实了我们的框架对不同 DMG 任务的有效性和通用性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=DUET:+A+Tuning-Free+Device-Cloud+Collaborative+Parameters+Generation+Framework+for+Efficient+Device+Model+Generalization)|0|
|[RL-MPCA: A Reinforcement Learning Based Multi-Phase Computation Allocation Approach for Recommender Systems](https://doi.org/10.1145/3543507.3583313)|Jiahong Zhou, Shunhui Mao, Guoliang Yang, Bo Tang, Qianlong Xie, Lebin Lin, Xingxing Wang, Dong Wang|Meituan, China|Recommender systems aim to recommend the most suitable items to users from a large number of candidates. Their computation cost grows as the number of user requests and the complexity of services (or models) increases. Under the limitation of computation resources (CRs), how to make a trade-off between computation cost and business revenue becomes an essential question. The existing studies focus on dynamically allocating CRs in queue truncation scenarios (i.e., allocating the size of candidates), and formulate the CR allocation problem as an optimization problem with constraints. Some of them focus on single-phase CR allocation, and others focus on multi-phase CR allocation but introduce some assumptions about queue truncation scenarios. 
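The parameter-generation idea described for DUET above can be made concrete with a toy hypernetwork: a fixed cloud-side map turns a summary of a device's data distribution into personalized weights using forward passes only, so no on-device gradients are needed. All names and the linear generator are illustrative assumptions, not DUET's architecture:

```python
import random

random.seed(0)
DIM = 4  # size of the device's data-distribution summary

# Fixed, pre-trained "generator": here just a random linear map from the
# distribution summary to a personalized weight vector.
G = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(DIM)]

def generate_device_weights(summary):
    # Cloud side: a single forward pass maps data statistics to weights.
    return [sum(g * s for g, s in zip(row, summary)) for row in G]

def device_predict(weights, features):
    # Device side: score an item with the freshly generated weights.
    return sum(w * f for w, f in zip(weights, features))

summary = [0.7, 0.1, 0.0, 0.2]     # e.g. category frequencies seen on-device
weights = generate_device_weights(summary)
print(round(device_predict(weights, [1.0, 0.5, 0.2, 0.1]), 4))
```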
However, these assumptions do not hold in other scenarios, such as retrieval channel selection and prediction model selection. Moreover, existing studies ignore the state transition process of requests between different phases, limiting the effectiveness of their approaches. This paper proposes a Reinforcement Learning (RL) based Multi-Phase Computation Allocation approach (RL-MPCA), which aims to maximize the total business revenue under the limitation of CRs. RL-MPCA formulates the CR allocation problem as a Weakly Coupled MDP problem and solves it with an RL-based approach. Specifically, RL-MPCA designs a novel deep Q-network to adapt to various CR allocation scenarios, and calibrates the Q-value by introducing multiple adaptive Lagrange multipliers (adaptive-λ) to avoid violating the global CR constraints. Finally, experiments on the offline simulation environment and online real-world recommender system validate the effectiveness of our approach.|推荐系统的目的是从大量的候选人中向用户推荐最合适的项目。它们的计算成本随着用户请求的数量和服务(或模型)的复杂性的增加而增加。在计算资源有限的情况下,如何在计算成本和业务收益之间取得平衡成为一个必须解决的问题。现有的研究集中于在队列截断情况下(即分配候选人的人数)动态分配登记册编号,并将登记册编号分配问题制订为一个有约束的最佳化问题。其中一些关注于单相 CR 分配,另一些关注于多相 CR 分配,但是引入了一些关于队列截断场景的假设。但是,这些假设在其他场景中不成立,例如检索通道选择和预测模型选择。此外,现有的研究忽视了请求在不同阶段之间的状态转换过程,限制了这些方法的有效性。本文提出了一种基于强化学习的多阶段计算分配方法(RL-MPCA) ,其目标是在客户关系的限制下实现企业总收入的最大化。RL-MPCA 将 CR 分配问题表示为弱耦合 MDP 问题,并用基于 RL 的方法求解。具体来说,RL-MPCA 设计了一种新的深层 Q 网络来适应各种 CR 分配场景,并通过引入多个自适应拉格朗日乘子(Adaptive-λ)来校准 Q 值,以避免违反全局 CR 约束。最后,离线仿真环境和在线真实世界推荐系统的实验验证了我们方法的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=RL-MPCA:+A+Reinforcement+Learning+Based+Multi-Phase+Computation+Allocation+Approach+for+Recommender+Systems)|0| |[Learning To Rank Resources with GNN](https://doi.org/10.1145/3543507.3583360)|Ulugbek Ergashev, Eduard C. Dragut, Weiyi Meng|Computer and Information Sciences, Temple University, USA; Computer Science, Binghamton University, USA|As the content on the Internet continues to grow, many new dynamically changing and heterogeneous sources of data constantly emerge. A conventional search engine cannot crawl and index at the same pace as the expansion of the Internet. Moreover, a large portion of the data on the Internet is not accessible to traditional search engines. Distributed Information Retrieval (DIR) is a viable solution to this as it integrates multiple shards (resources) and provides a unified access to them. Resource selection is a key component of DIR systems. There is a rich body of literature on resource selection approaches for DIR. A key limitation of the existing approaches is that they primarily use term-based statistical features and do not generally model resource-query and resource-resource relationships. In this paper, we propose a graph neural network (GNN) based approach to learning-to-rank that is capable of modeling resource-query and resource-resource relationships. Specifically, we utilize a pre-trained language model (PTLM) to obtain semantic information from queries and resources. Then, we explicitly build a heterogeneous graph to preserve structural information of query-resource relationships and employ GNN to extract structural information. In addition, the heterogeneous graph is enriched with resource-resource type of edges to further enhance the ranking accuracy. Extensive experiments on benchmark datasets show that our proposed approach is highly effective in resource selection. 
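The adaptive-λ calibration sketched for RL-MPCA above amounts to dual ascent on a Lagrangian: the revenue Q-value is penalized by λ times the cost Q-value, and λ rises whenever the resource budget is violated. Below is a minimal toy version with made-up numbers and a hand-written update rule, rather than the paper's learned deep Q-network:

```python
def calibrated_q(q_revenue, q_cost, lam):
    # Q-value calibrated by a Lagrange multiplier on the resource cost.
    return q_revenue - lam * q_cost

def update_lambda(lam, observed_cost, budget, step=0.05):
    # Dual ascent: raise lambda when cost exceeds budget, relax otherwise.
    return max(0.0, lam + step * (observed_cost - budget))

lam, budget = 0.0, 10.0
actions = {"small_model": (1.0, 4.0), "large_model": (1.6, 14.0)}  # (revenue Q, cost Q)

for _ in range(50):
    best = max(actions, key=lambda a: calibrated_q(*actions[a], lam))
    lam = update_lambda(lam, actions[best][1], budget)

print(best, round(lam, 3))  # the multiplier oscillates around the point
                            # where the costly action stops paying off
```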
Our method outperforms the state-of-the-art by 6.4% to 42% on various performance metrics.|随着 Internet 上的内容不断增长,许多新的动态变化和异构的数据源不断出现。传统的搜索引擎无法以与互联网扩展同样的速度爬行和索引。此外,互联网上的大部分数据不能被传统的搜索引擎访问。分布式信息检索(DIR)是一个可行的解决方案,因为它集成了多个碎片(资源) ,并提供了对它们的统一访问。资源选择是 DIR 系统的关键组成部分。关于 DIR 的资源选择方法有大量的文献。现有方法的一个主要局限性在于,它们主要使用基于术语的统计特征,并且一般不对资源-查询和资源-资源关系建模。本文提出了一种基于图神经网络(GNN)的排序学习方法,该方法能够对资源-查询和资源-资源关系进行建模。具体来说,我们利用预先训练的语言模型(PTLM)从查询和资源中获取语义信息。然后,我们显式地构建一个异构图来保存查询-资源关系的结构信息,并使用 GNN 来提取结构信息。此外,对异构图进行了资源-资源类型边的丰富,进一步提高了排序的准确性。对基准数据集的大量实验表明,该方法在资源选择方面是非常有效的。在各种性能指标上,我们的方法比最先进的方法表现好6.4% 到42% 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+To+Rank+Resources+with+GNN)|0|
|[CgAT: Center-Guided Adversarial Training for Deep Hashing-Based Retrieval](https://doi.org/10.1145/3543507.3583369)|Xunguang Wang, Yiqun Lin, Xiaomeng Li|The Hong Kong University of Science and Technology, China and The Hong Kong University of Science and Technology Shenzhen Research Institute, China; The Hong Kong University of Science and Technology, China|Deep hashing has been extensively utilized in massive image retrieval because of its efficiency and effectiveness. However, deep hashing models are vulnerable to adversarial examples, making it essential to develop adversarial defense methods for image retrieval. Existing solutions achieved limited defense performance because of using weak adversarial samples for training and lacking discriminative optimization objectives to learn robust features. In this paper, we present a min-max based Center-guided Adversarial Training, namely CgAT, to improve the robustness of deep hashing networks through worst adversarial examples. Specifically, we first formulate the center code as a semantically-discriminative representative of the input image content, which preserves the semantic similarity with positive samples and dissimilarity with negative examples. We prove that the center code can be computed immediately by a closed-form mathematical formula. After obtaining the center codes in each optimization iteration of the deep hashing network, they are adopted to guide the adversarial training process. On the one hand, CgAT generates the worst adversarial examples as augmented data by maximizing the Hamming distance between the hash codes of the adversarial examples and the center codes. On the other hand, CgAT learns to mitigate the effects of adversarial samples by minimizing the Hamming distance to the center codes. Extensive experiments on the benchmark datasets demonstrate the effectiveness of our adversarial training algorithm in defending against adversarial attacks for deep hashing-based retrieval. Compared with the current state-of-the-art defense method, we significantly improve the defense performance by an average of 18.61\%, 12.35\%, and 11.56\% on FLICKR-25K, NUS-WIDE, and MS-COCO, respectively. 
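The two Hamming-distance objectives described for CgAT above are easy to state in miniature: an attacker perturbs the input to maximize the distance between its hash code and the center code, while training minimizes that distance. The toy below uses a sign-of-linear-map "hash model" and grid search instead of gradients; everything here is illustrative:

```python
def hash_code(W, x):
    # Toy "deep hashing model": sign of a linear map.
    return [1 if sum(wi * xi for wi, xi in zip(row, x)) >= 0 else -1
            for row in W]

def hamming(a, b):
    return sum(ai != bi for ai, bi in zip(a, b))

W = [[0.6, -0.2], [-0.4, 0.9], [0.3, 0.3]]   # 2-D input -> 3-bit code
center = [1, 1, -1]                          # semantic center code
x = [1.0, 0.5]

# Attack: search a small perturbation budget for the input whose code is
# farthest (in Hamming distance) from the center code.
eps, best = 0.3, None
for dx in (-eps, 0.0, eps):
    for dy in (-eps, 0.0, eps):
        x_adv = [x[0] + dx, x[1] + dy]
        d = hamming(hash_code(W, x_adv), center)
        if best is None or d > best[0]:
            best = (d, x_adv)

print("clean distance:", hamming(hash_code(W, x), center))   # 1
print("worst-case distance:", best[0])                       # 2
# Training would do the opposite: pull adversarial codes back toward center.
```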
The code is available at https://github.com/xunguangwang/CgAT.|深度散列由于其高效性和有效性,在海量图像检索中得到了广泛的应用。然而,深度散列模型很容易受到敌对实例的影响,因此开发图像检索的敌对防御方法是非常必要的。现有的解决方案由于使用弱对手样本进行训练,缺乏区分性优化目标来学习鲁棒特征,因此防御性能有限。本文提出了一种基于最小-最大中心引导的对抗训练方法,即 CgAT,通过最坏的对抗实例来提高深度哈希网络的鲁棒性。具体来说,我们首先将中心代码表示为输入图像内容的语义识别代表,保留了正样本的语义相似性和负样本的语义不相似性。我们证明了一个数学公式可以立即计算中心码。在深度哈希网络的每次优化迭代中获得中心码后,采用中心码来指导对抗训练过程。一方面,通过最大化对手的散列码和中心代码之间的汉明距离,CgAT 生成最坏的对手的例子作为增强数据。另一方面,CgAT 学会了通过最小化对中心代码的汉明距离来减轻对抗性样本的影响。在基准数据集上的大量实验证明了我们的对抗性训练算法在防御基于深度散列检索的对抗性攻击方面的有效性。与目前最先进的防御方法相比,FLICKR-25K、 NUS-WIDE 和 MS-COCO 的防御性能分别平均提高了18.61% 、12.35% 和11.56% 。密码可在 https://github.com/xunguangwang/cgat 查阅。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=CgAT:+Center-Guided+Adversarial+Training+for+Deep+Hashing-Based+Retrieval)|0| -|[Algorithmic Vibe in Information Retrieval](https://doi.org/10.1145/3543507.3583384)|Ali Montazeralghaem, Nick Craswell, Ryen W. White, Ahmed Hassan Awadallah, Byungki Byun|Microsoft, USA; University of Massachusetts Amherst, USA|When information retrieval systems return a ranked list of results in response to a query, they may be choosing from a large set of candidate results that are equally useful and relevant. This means we might be able to identify a difference between rankers A and B, where ranker A systematically prefers a certain type of relevant results. Ranker A may have this systematic difference (different “vibe”) without having systematically better or worse results according to standard information retrieval metrics. We first show that a vibe difference can exist, comparing two publicly available rankers, where the one that is trained on health-related queries will systematically prefer health-related results, even for non-health queries. We define a vibe metric that lets us see the words that a ranker prefers. We investigate the vibe of search engine clicks vs. human labels. We perform an initial study into correcting for vibe differences to make ranker A more like ranker B via changes in negative sampling during training.|当信息检索系统对一个查询返回一个排名结果列表时,它们可能会从一大堆同样有用和相关的候选结果中进行选择。这意味着我们可能能够识别排名 A 和 B 之间的差异,其中排名 A 系统地偏好某种类型的相关结果。排名 a 可能有这种系统性差异(不同的“内心感应”) ,而没有根据标准的信息检索指标得出系统性更好或更差的结果。我们首先展示了内心感应差异的存在,比较两个公开可用的排名,其中受过健康相关查询培训的人会系统地偏好与健康相关的结果,即使对于非健康查询也是如此。我们定义一个内心感应度量,让我们看到一个排名喜欢的词。我们调查了搜索引擎点击与人类标签之间的关系。我们进行了一个初步的研究,以纠正内心感应的差异,使排名 A 更像排名 B 通过改变负面抽样在训练期间。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Algorithmic+Vibe+in+Information+Retrieval)|0| +|[Algorithmic Vibe in Information Retrieval](https://doi.org/10.1145/3543507.3583384)|Ali Montazeralghaem, Nick Craswell, Ryen W. White, Ahmed Hassan Awadallah, Byungki Byun|University of Massachusetts Amherst, USA; Microsoft, USA|When information retrieval systems return a ranked list of results in response to a query, they may be choosing from a large set of candidate results that are equally useful and relevant. This means we might be able to identify a difference between rankers A and B, where ranker A systematically prefers a certain type of relevant results. Ranker A may have this systematic difference (different “vibe”) without having systematically better or worse results according to standard information retrieval metrics. We first show that a vibe difference can exist, comparing two publicly available rankers, where the one that is trained on health-related queries will systematically prefer health-related results, even for non-health queries. We define a vibe metric that lets us see the words that a ranker prefers. We investigate the vibe of search engine clicks vs. 
human labels. We perform an initial study into correcting for vibe differences to make ranker A more like ranker B via changes in negative sampling during training.|当信息检索系统对一个查询返回一个排名结果列表时,它们可能会从一大堆同样有用和相关的候选结果中进行选择。这意味着我们可能能够识别排名 A 和 B 之间的差异,其中排名 A 系统地偏好某种类型的相关结果。排名 a 可能有这种系统性差异(不同的“内心感应”) ,而没有根据标准的信息检索指标得出系统性更好或更差的结果。我们首先展示了内心感应差异的存在,比较两个公开可用的排名,其中受过健康相关查询培训的人会系统地偏好与健康相关的结果,即使对于非健康查询也是如此。我们定义一个内心感应度量,让我们看到一个排名喜欢的词。我们调查了搜索引擎点击与人类标签之间的关系。我们进行了一个初步的研究,以纠正内心感应的差异,使排名 A 更像排名 B 通过改变负面抽样在训练期间。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Algorithmic+Vibe+in+Information+Retrieval)|0| |[Geographic Information Retrieval Using Wikipedia Articles](https://doi.org/10.1145/3543507.3583469)|Amir Krause, Sara Cohen|The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University, Israel|Assigning semantically relevant, real-world locations to documents opens new possibilities to perform geographic information retrieval. We propose a novel approach to automatically determine the latitude-longitude coordinates of appropriate Wikipedia articles with high accuracy, leveraging both text and metadata in the corpus. By examining articles whose base-truth coordinates are known, we show that our method attains a substantial improvement over state of the art works. We subsequently demonstrate how our approach could yield two benefits: (1) detecting significant geolocation errors in Wikipedia; and (2) proposing approximated coordinates for hundreds of thousands of articles which are not traditionally considered to be locations (such as events, ideas or people), opening new possibilities for conceptual geographic retrievals over Wikipedia.|为文档分配语义相关的、现实世界中的位置,为执行地理信息检索开辟了新的可能性。我们提出了一种新的方法,利用语料库中的文本和元数据,高精度地自动确定适当 Wikipedia 文章的经纬度坐标。通过检查文章的基础-真理坐标已知,我们表明,我们的方法取得了实质性的改善状态的艺术作品。我们随后展示了我们的方法如何产生两个好处: (1)检测维基百科中的重大地理定位错误; (2)提出几十万个传统上不被认为是地点的文章(如事件、想法或人)的大致坐标,为维基百科的概念地理检索开辟了新的可能性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Geographic+Information+Retrieval+Using+Wikipedia+Articles)|0| |[Optimizing Guided Traversal for Fast Learned Sparse Retrieval](https://doi.org/10.1145/3543507.3583497)|Yifan Qiao, Yingrui Yang, Haixin Lin, Tao Yang|Department of Computer Science, University of California, Santa Barbara, USA; Department of Computer Science, University of California at Santa Barbara, USA|Recent studies show that BM25-driven dynamic index skipping can greatly accelerate MaxScore-based document retrieval based on the learned sparse representation derived by DeepImpact. This paper investigates the effectiveness of such a traversal guidance strategy during top k retrieval when using other models such as SPLADE and uniCOIL, and finds that unconstrained BM25-driven skipping could have a visible relevance degradation when the BM25 model is not well aligned with a learned weight model or when retrieval depth k is small. This paper generalizes the previous work and optimizes the BM25 guided index traversal with a two-level pruning control scheme and model alignment for fast retrieval using a sparse representation. Although there can be a cost of increased latency, the proposed scheme is much faster than the original MaxScore method without BM25 guidance while retaining the relevance effectiveness. 
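The skipping idea behind such guided traversal can be shown in miniature: if each term carries an upper bound on its possible impact, a document whose optimistic bound cannot beat the current k-th best score is skipped without full scoring. Below is a toy index with made-up impact scores and a single bound check; a real MaxScore implementation with two-level pruning control is considerably more involved:

```python
import heapq

postings = {                        # term -> {doc_id: learned impact score}
    "edge":  {1: 2.0, 2: 0.4, 4: 1.9},
    "cloud": {2: 1.5, 3: 0.2, 4: 1.4},
}
upper = {t: max(p.values()) for t, p in postings.items()}   # term upper bounds

def topk(query, k=2):
    heap = []                       # min-heap of (score, doc) for the top-k
    for doc in sorted(set().union(*(postings[t].keys() for t in query))):
        threshold = heap[0][0] if len(heap) == k else float("-inf")
        # Pruning: skip scoring when even the optimistic bound cannot win.
        if sum(upper[t] for t in query if doc in postings[t]) <= threshold:
            continue
        score = sum(postings[t].get(doc, 0.0) for t in query)
        if score > threshold:
            heapq.heappush(heap, (score, doc))
            if len(heap) > k:
                heapq.heappop(heap)
    return sorted(heap, reverse=True)

print(topk(["edge", "cloud"]))      # [(3.3, 4), (2.0, 1)]; doc 3 is skipped
```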
This paper analyzes the competitiveness of this two-level pruning scheme, and evaluates its tradeoff in ranking relevance and time efficiency when searching several test datasets.|最近的研究表明,BM25驱动的动态指数跳跃可以大大加快基于深度影响学习稀疏表示的基于 MaxScore 的文献检索。本文研究了使用 SPLADE 和 uniCOIL 等其他模型进行 top k 检索时,这种遍历指导策略的有效性,发现当 BM25模型与学习权重模型不匹配或检索深度 k 较小时,无约束 BM25驱动的跳跃可能具有明显的相关性退化。本文在总结前人工作的基础上,采用两级剪枝控制策略和稀疏表示模型对齐方法对 BM25引导的索引遍历进行了优化,实现了快速检索。虽然可能会增加延迟的代价,提出的方案比原来的 MaxScore 方法快得多没有 BM25的指导,同时保留了相关性的有效性。本文分析了这种两级剪枝方案的竞争力,并在搜索多个测试数据集时,对其在排序相关性和时间效率方面的权衡进行了评估。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Optimizing+Guided+Traversal+for+Fast+Learned+Sparse+Retrieval)|0|
|[Stability and Efficiency of Personalised Cultural Markets](https://doi.org/10.1145/3543507.3583315)|Haiqing Zhu, Yun Kuen Cheung, Lexing Xie|Royal Holloway University of London, United Kingdom; Australian National University, Australia|This work is concerned with the dynamics of online cultural markets, namely, attention allocation of many users on a set of digital goods with infinite supply. Such dynamics are important in shaping processes and outcomes in society, from trending items in entertainment, collective knowledge creation, to election outcomes. The outcomes of online cultural markets are susceptible to intricate social influence dynamics, particularly so when the community comprises consumers with heterogeneous interests. This has made formal analysis of these markets improbable. In this paper, we remedy this by establishing robust connections between influence dynamics and optimization processes, in trial-offer markets where the consumer preferences are modelled by multinomial logit. Among other results, we show that the proportional-response-esque influence dynamic is equivalent to stochastic mirror descent on a convex objective function, thus leading to a stable and predictable outcome. When all consumers are homogeneous, the objective function has a natural interpretation as a weighted sum of efficiency and diversity of the culture market. In simulations driven by real-world preferences collected from a large-scale recommender system, we observe that ranking strategies aligned with the underlying heterogeneous preferences are more stable, and achieve higher efficiency and diversity.|本文研究的是在线文化市场的动态变化,即在无限供应的数字商品上,许多用户的注意力分配。这种动态对于塑造社会的进程和结果至关重要,从娱乐的趋势项目、集体知识创造到选举结果。在线文化市场的结果容易受到错综复杂的社会影响力动态的影响,特别是当社区由具有不同兴趣的消费者组成时。这使得对这些市场的正式分析成为不可能。在本文中,我们通过建立影响力动态和最优化过程之间的强有力的联系来纠正这个问题,在试销市场中,消费者的偏好是由多项式 logit 建模的。在其他结果中,我们表明,比例响应方式的影响动态相当于随机镜下降的凸目标函数,从而导致一个稳定和可预测的结果。当所有消费者都是同质的时候,目标函数就自然地被解释为文化市场效率和多样性的加权和。在从大规模推荐系统中收集的现实世界偏好驱动的模拟中,我们观察到与潜在的异质偏好相一致的排序策略更加稳定,并且实现更高的效率和多样性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Stability+and+Efficiency+of+Personalised+Cultural+Markets)|0|
-|[Eligibility Mechanisms: Auctions Meet Information Retrieval](https://doi.org/10.1145/3543507.3583478)|Gagan Goel, Renato Paes Leme, Jon Schneider, David Thompson, Hanrui Zhang|Google, USA; Carnegie Mellon University, USA|The design of internet advertisement systems is both an auction design problem and an information retrieval (IR) problem. As an auction, the designer needs to take the participants incentives into account. 
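The trial-offer dynamic analyzed in the cultural-markets paper above can be simulated in a few lines: items are chosen with probabilities mixing intrinsic appeal and current popularity, and popularity is updated proportionally to choices. The appeal values, learning-rate schedule, and update rule below are illustrative assumptions:

```python
import random

random.seed(1)
appeal = [3.0, 2.0, 1.0]        # intrinsic item qualities
shares = [1 / 3] * 3            # initial attention allocation

def choice_probs(appeal, shares):
    # Multinomial-logit-style choice: appeal weighted by current popularity.
    w = [a * s for a, s in zip(appeal, shares)]
    total = sum(w)
    return [x / total for x in w]

for step in range(5000):
    p = choice_probs(appeal, shares)
    pick = random.choices(range(3), weights=p)[0]
    lr = 1.0 / (step + 2)       # decaying step, as in stochastic approximation
    shares = [(1 - lr) * s + lr * (1.0 if i == pick else 0.0)
              for i, s in enumerate(shares)]

print([round(s, 2) for s in shares])  # attention concentrates on high-appeal items
```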
As an information retrieval problem, it needs to identify the ad that it is the most relevant to a user out of an enormous set of ad candidates. Those aspects are combined by first having an IR system narrow down the initial set of ad candidates to a manageable size followed by an auction that ranks and prices those candidates. If the IR system uses information about bids, agents could in principle manipulate the system by manipulating the IR stage even when the subsequent auction is truthful. In this paper we investigate the design of truthful IR mechanisms, which we term eligibility mechanisms. We model it as a truthful version of the stochastic probing problem. We show that there is a constant gap between the truthful and non-truthful versions of the stochastic probing problem and exhibit a constant approximation algorithm. En route, we also characterize the set of eligibility mechanisms, which provides necessary and sufficient conditions for an IR system to be truthful.|互联网广告系统的设计既是一个拍卖设计问题,也是一个信息检索(IR)问题。作为一种拍卖,设计师需要考虑参与者的激励因素。作为一个信息检索问题,它需要从大量的候选广告中找出与用户最相关的广告。将这些方面结合起来,首先通过投资者关系系统将最初的广告候选人缩小到一个可管理的规模,然后通过拍卖对这些候选人进行排名和定价。如果 IR 系统使用关于出价的信息,即使随后的拍卖是真实的,代理人原则上也可以通过操纵 IR 阶段来操纵系统。本文研究了真实信息检索机制的设计,我们称之为资格机制。我们将其建模为随机探测问题的真实版本。我们证明了随机探测问题的真实版本和非真实版本之间存在一个恒定的差距,并呈现出一个恒定的近似演算法。在此过程中,我们还刻画了一组资格机制,它为 IR 系统的真实性提供了充分的必要条件。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Eligibility+Mechanisms:+Auctions+Meet+Information+Retrieval)|0|
+|[Eligibility Mechanisms: Auctions Meet Information Retrieval](https://doi.org/10.1145/3543507.3583478)|Gagan Goel, Renato Paes Leme, Jon Schneider, David Thompson, Hanrui Zhang|Carnegie Mellon University, USA; Google, USA|The design of internet advertisement systems is both an auction design problem and an information retrieval (IR) problem. As an auction, the designer needs to take the participants' incentives into account. As an information retrieval problem, it needs to identify the ad that is the most relevant to a user out of an enormous set of ad candidates. Those aspects are combined by first having an IR system narrow down the initial set of ad candidates to a manageable size followed by an auction that ranks and prices those candidates. If the IR system uses information about bids, agents could in principle manipulate the system by manipulating the IR stage even when the subsequent auction is truthful. In this paper we investigate the design of truthful IR mechanisms, which we term eligibility mechanisms. We model it as a truthful version of the stochastic probing problem. We show that there is a constant gap between the truthful and non-truthful versions of the stochastic probing problem and exhibit a constant approximation algorithm. En route, we also characterize the set of eligibility mechanisms, which provides necessary and sufficient conditions for an IR system to be truthful.|互联网广告系统的设计既是一个拍卖设计问题,也是一个信息检索(IR)问题。作为一种拍卖,设计师需要考虑参与者的激励因素。作为一个信息检索问题,它需要从大量的候选广告中找出与用户最相关的广告。将这些方面结合起来,首先通过投资者关系系统将最初的广告候选人缩小到一个可管理的规模,然后通过拍卖对这些候选人进行排名和定价。如果 IR 系统使用关于出价的信息,即使随后的拍卖是真实的,代理人原则上也可以通过操纵 IR 阶段来操纵系统。本文研究了真实信息检索机制的设计,我们称之为资格机制。我们将其建模为随机探测问题的真实版本。我们证明了随机探测问题的真实版本和非真实版本之间存在一个恒定的差距,并呈现出一个恒定的近似演算法。在此过程中,我们还刻画了一组资格机制,它为 IR 系统的真实性提供了充分的必要条件。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Eligibility+Mechanisms:+Auctions+Meet+Information+Retrieval)|0|
|[Scoping Fairness Objectives and Identifying Fairness Metrics for Recommender Systems: The Practitioners' Perspective](https://doi.org/10.1145/3543507.3583204)|Jessie J. 
Smith, Lex Beattie, Henriette Cramer||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Scoping+Fairness+Objectives+and+Identifying+Fairness+Metrics+for+Recommender+Systems:+The+Practitioners'+Perspective)|0| |[Same Same, But Different: Conditional Multi-Task Learning for Demographic-Specific Toxicity Detection](https://doi.org/10.1145/3543507.3583290)|Soumyajit Gupta, Sooyong Lee, Maria DeArteaga, Matthew Lease||Algorithmic bias often arises as a result of differential subgroup validity, in which predictive relationships vary across groups. For example, in toxic language detection, comments targeting different demographic groups can vary markedly across groups. In such settings, trained models can be dominated by the relationships that best fit the majority group, leading to disparate performance. We propose framing toxicity detection as multi-task learning (MTL), allowing a model to specialize on the relationships that are relevant to each demographic group while also leveraging shared properties across groups. With toxicity detection, each task corresponds to identifying toxicity against a particular demographic group. However, traditional MTL requires labels for all tasks to be present for every data point. To address this, we propose Conditional MTL (CondMTL), wherein only training examples relevant to the given demographic group are considered by the loss function. This lets us learn group specific representations in each branch which are not cross contaminated by irrelevant labels. Results on synthetic and real data show that using CondMTL improves predictive recall over various baselines in general and for the minority demographic group in particular, while having similar overall accuracy.|算法偏差通常由于不同的子群效度而产生,其中预测关系因组而异。例如,在毒性语言检测中,针对不同人口群体的评论可能因群体而有显著差异。在这种情况下,训练有素的模型可以被最适合大多数群体的关系所主导,从而导致不同的表现。我们建议将毒性检测框架为多任务学习(MTL) ,允许模型专门处理与每个人口组相关的关系,同时利用跨组的共享属性。通过毒性检测,每项任务都对应于确定针对特定人口群体的毒性。但是,传统的 MTL 要求为每个数据点显示所有任务的标签。为了解决这个问题,我们提出了条件 MTL (CondMTL) ,其中只有与给定人口组相关的训练实例被损失函数考虑。这使我们能够了解每个分支中不受不相关标签交叉污染的特定分组表示。对合成和真实数据的研究结果表明,使用 CondMTL 提高了对各种基线的预测性回忆,特别是对少数人口群体,同时具有相似的整体准确性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Same+Same,+But+Different:+Conditional+Multi-Task+Learning+for+Demographic-Specific+Toxicity+Detection)|0| |[Towards Explainable Collaborative Filtering with Taste Clusters Learning](https://doi.org/10.1145/3543507.3583303)|Yuntao Du, Jianxun Lian, Jing Yao, Xiting Wang, Mingqi Wu, Lu Chen, Yunjun Gao, Xing Xie||Collaborative Filtering (CF) is a widely used and effective technique for recommender systems. In recent decades, there have been significant advancements in latent embedding-based CF methods for improved accuracy, such as matrix factorization, neural collaborative filtering, and LightGCN. However, the explainability of these models has not been fully explored. Adding explainability to recommendation models can not only increase trust in the decisionmaking process, but also have multiple benefits such as providing persuasive explanations for item recommendations, creating explicit profiles for users and items, and assisting item producers in design improvements. 
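The conditional masking at the core of CondMTL above reduces to a simple rule: an example contributes to a task branch's loss only if it actually carries a label for that task. Here is a minimal sketch with a squared-error loss and None marking "irrelevant to this task"; the real model uses neural branches and toxicity labels:

```python
def cond_mtl_loss(predictions, labels):
    # predictions/labels: one list per task branch; a label of None means
    # "this example is irrelevant to this task" and is masked out.
    total, count = 0.0, 0
    for task_preds, task_labels in zip(predictions, labels):
        for p, y in zip(task_preds, task_labels):
            if y is None:              # conditional masking
                continue
            total += (p - y) ** 2
            count += 1
    return total / max(count, 1)

# Two branches (toxicity toward group A / group B), three examples;
# examples 1 and 3 are labeled for A, examples 2 and 3 for B.
preds  = [[0.9, 0.2, 0.6], [0.1, 0.8, 0.4]]
labels = [[1.0, None, 1.0], [None, 1.0, 0.0]]
print(round(cond_mtl_loss(preds, labels), 4))   # 0.0925
```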
In this paper, we propose a neat and effective Explainable Collaborative Filtering (ECF) model that leverages interpretable cluster learning to achieve the two most demanding objectives: (1) Precise - the model should not compromise accuracy in the pursuit of explainability; and (2) Self-explainable - the model's explanations should truly reflect its decision-making process, not generated from post-hoc methods. The core of ECF is mining taste clusters from user-item interactions and item profiles. We map each user and item to a sparse set of taste clusters, and taste clusters are distinguished by a few representative tags. The user-item preference, users/items' cluster affiliations, and the generation of taste clusters are jointly optimized in an end-to-end manner. Additionally, we introduce a forest mechanism to ensure the model's accuracy, explainability, and diversity. To comprehensively evaluate the explainability quality of taste clusters, we design several quantitative metrics, including in-cluster item coverage, tag utilization, silhouette, and informativeness. Our model's effectiveness is demonstrated through extensive experiments on three real-world datasets.|协同过滤(CF)是推荐系统中广泛使用的有效技术。近几十年来,基于潜在嵌入的 CF 方法在提高准确性方面取得了重大进展,例如矩阵分解、神经协同过滤和 LightGCN。然而,这些模型的可解释性还没有得到充分的探索。为推荐模型增加可解释性不仅可以增加对决策过程的信任,而且还有多种好处,例如为项目推荐提供有说服力的解释,为用户和项目创建明确的配置文件,以及帮助项目生产者改进设计。在本文中,我们提出了一个简洁而有效的可解释协同过滤(ECF)模型,它利用可解释的聚类学习来实现两个最严格的目标: (1)精确——模型在追求可解释性的过程中不应该损害准确性; (2)自我解释——模型的解释应该真实地反映其决策过程,而不是由事后方法产生。ECF 的核心是从用户-项目交互和项目配置文件中挖掘味觉集群。我们将每个用户和项目映射到一个稀疏的味觉集群,味觉集群通过几个代表性的标签来区分。用户项偏好、用户/项目的集群附属关系以及味道集群的生成都以端到端的方式进行了联合优化。此外,我们还引入了森林机制来保证模型的准确性、可解释性和多样性。为了全面评价味觉集群的可解释性质量,我们设计了几个量化指标,包括集群内项目覆盖率、标签利用率、轮廓和信息量。通过对三个实际数据集的大量实验,验证了该模型的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Towards+Explainable+Collaborative+Filtering+with+Taste+Clusters+Learning)|0|
@@ -256,104 +256,104 @@
|[Fairness-Aware Clique-Preserving Spectral Clustering of Temporal Graphs](https://doi.org/10.1145/3543507.3583423)|Dongqi Fu, Dawei Zhou, Ross Maciejewski, Arie Croitoru, Marcus Boyd, Jingrui He||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Fairness-Aware+Clique-Preserving+Spectral+Clustering+of+Temporal+Graphs)|0|
|[HybridEval: A Human-AI Collaborative Approach for Evaluating Design Ideas at Scale](https://doi.org/10.1145/3543507.3583496)|Sepideh Mesbah, Ines Arous, Jie Yang, Alessandro Bozzon|Booking, Netherlands; Delft University of Technology, Netherlands; University of Fribourg, Switzerland|Evaluating design ideas is necessary to predict their success and assess their impact early on in the process. Existing methods rely either on metrics computed by systems that are effective but subject to errors and bias, or experts’ ratings, which are accurate but expensive and long to collect. Crowdsourcing offers a compelling way to evaluate a large number of design ideas in a short amount of time while being cost-effective. Workers’ evaluation is, however, less reliable and might substantially differ from experts’ evaluation. In this work, we investigate workers’ rating behavior and compare it with experts. First, we instrument a crowdsourcing study where we asked workers to evaluate design ideas from three innovation challenges. We show that workers share similar insights with experts but tend to rate more generously and weigh certain criteria more importantly. Next, we develop a hybrid human-AI approach that combines a machine learning model with crowdsourcing to evaluate ideas. 
Our approach models workers’ reliability and bias while leveraging ideas’ textual content to train a machine learning model. It is able to incorporate experts’ ratings whenever available, to supervise the model training and infer worker performance. Results show that our framework outperforms baseline methods and requires significantly less training data from experts, thus providing a viable solution for evaluating ideas at scale.|评估设计想法是必要的,以预测他们的成功和评估他们的影响早在过程中。现有的方法要么依赖于有效但容易出错和偏差的系统计算出的指标,要么依赖于专家的评分,这些评分准确但昂贵,而且需要很长时间才能收集到。众包提供了一个引人注目的方式来评估大量的设计想法在短时间内,同时具有成本效益。然而,工人的评估不太可靠,可能与专家的评估大不相同。在这项工作中,我们调查工人的评分行为,并与专家进行比较。首先,我们进行了一项众包研究,要求工人评估来自三个创新挑战的设计理念。我们的研究表明,员工与专家有着相似的见解,但他们倾向于更慷慨地给出评价,更重视某些标准。接下来,我们开发了一种混合的人工智能方法,它结合了机器学习模型和众包来评估想法。我们的方法模拟工人的可靠性和偏见,同时利用想法的文本内容来训练机器学习模型。它能够在任何时候合并专家的评分,以监督模型培训和推断工人的表现。结果表明,我们的框架优于基线方法,需要的专家培训数据明显减少,从而为大规模评估想法提供了一个可行的解决方案。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=HybridEval:+A+Human-AI+Collaborative+Approach+for+Evaluating+Design+Ideas+at+Scale)|0| |[A Multi-task Model for Emotion and Offensive Aided Stance Detection of Climate Change Tweets](https://doi.org/10.1145/3543507.3583860)|Apoorva Upadhyaya, Marco Fisichella, Wolfgang Nejdl|L3S Research Center, Leibniz University Hannover, Germany|In this work, we address the United Nations Sustainable Development Goal 13: Climate Action by focusing on identifying public attitudes toward climate change on social media platforms such as Twitter. Climate change is threatening the health of the planet and humanity. Public engagement is critical to address climate change. However, climate change conversations on Twitter tend to polarize beliefs, leading to misinformation and fake news that influence public attitudes, often dividing them into climate change believers and deniers. Our paper proposes an approach to classify the attitude of climate change tweets (believe/deny/ambiguous) to identify denier statements on Twitter. Most existing approaches for detecting stances and classifying climate change tweets either overlook deniers’ tweets or do not have a suitable architecture. The relevant literature suggests that emotions and higher levels of toxicity are prevalent in climate change Twitter conversations, leading to a delay in appropriate climate action. Therefore, our work focuses on learning stance detection (main task) while exploiting the auxiliary tasks of recognizing emotions and offensive utterances. We propose a multimodal multitasking framework MEMOCLiC that captures the input data using different embedding techniques and attention frameworks, and then incorporates the learned emotional and offensive expressions to obtain an overall representation of the features relevant to the stance of the input tweet. 
Extensive experiments conducted on a novel curated climate change dataset and two benchmark stance detection datasets (SemEval-2016 and ClimateStance-2022) demonstrate the effectiveness of our approach.|在这项工作中,我们致力于实现联合国可持续发展目标13: 气候行动,重点是在 Twitter 等社交媒体平台上确定公众对气候变化的态度。气候变化正威胁着地球和人类的健康。公众参与对应对气候变化至关重要。然而,Twitter 上关于气候变化的讨论往往会导致信仰的两极分化,导致错误信息和假新闻,从而影响公众态度,往往将公众分为气候变化信徒和否认者。我们的论文提出了一种方法来分类气候变化推文的态度(相信/否认/模棱两可) ,以确定否认者的声明在 Twitter 上。大多数现有的检测立场和分类气候变化推文的方法要么忽略否认者的推文,要么没有一个合适的架构。相关文献表明,在气候变化的 Twitter 对话中,情绪和较高的毒性水平普遍存在,导致适当的气候行动出现延误。因此,我们的工作集中在学习姿势检测(主要任务) ,同时利用辅助任务的识别情绪和攻击性话语。我们提出了一个多模式多任务框架 MEMOCLiC,它使用不同的嵌入技术和注意力框架捕获输入数据,然后结合所学到的情绪和攻击性表达来获得与输入 tweet 立场相关的特征的整体表示。在一个新的策划气候变化数据集和两个基准姿态检测数据集(SemEval-2016和 ClimateStance-2022)上进行的大量实验证明了我们方法的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Multi-task+Model+for+Emotion+and+Offensive+Aided+Stance+Detection+of+Climate+Change+Tweets)|0| -|[Cross-center Early Sepsis Recognition by Medical Knowledge Guided Collaborative Learning for Data-scarce Hospitals](https://doi.org/10.1145/3543507.3583989)|Ruiqing Ding, Fangjie Rong, Xiao Han, Leye Wang|Peking University, China; Shanghai University of Finance and Economics, China|There are significant regional inequities in health resources around the world. It has become one of the most focused topics to improve health services for data-scarce hospitals and promote health equity through knowledge sharing among medical institutions. Because electronic medical records (EMRs) contain sensitive personal information, privacy protection is unavoidable and essential for multi-hospital collaboration. In this paper, for a common disease in ICU patients, sepsis, we propose a novel cross-center collaborative learning framework guided by medical knowledge, SofaNet, to achieve early recognition of this disease. The Sepsis-3 guideline, published in 2016, defines that sepsis can be diagnosed by satisfying both suspicion of infection and Sequential Organ Failure Assessment (SOFA) greater than or equal to 2. Based on this knowledge, SofaNet adopts a multi-channel GRU structure to predict SOFA values of different systems, which can be seen as an auxiliary task to generate better health status representations for sepsis recognition. Moreover, we only achieve feature distribution alignment in the hidden space during cross-center collaborative learning, which ensures secure and compliant knowledge transfer without raw data exchange. 
Extensive experiments on two open clinical datasets, MIMIC-III and Challenge, demonstrate that SofaNet can benefit early sepsis recognition when hospitals only have limited EMRs.|世界各地在卫生资源方面存在着严重的区域不平等。通过医疗机构之间的知识共享,改善数据稀缺的医院的卫生服务,促进卫生公平,已成为最重点的议题之一。由于电子病历(EMR)包含敏感的个人信息,隐私保护是不可避免的,也是多医院合作的必要条件。在这篇文章中,我们针对 ICU 患者常见的一种疾病,败血症,提出了一种新的跨中心合作学习框架,以医学知识为指导,SofaNet,以实现对这种疾病的早期识别。2016年发布的脓毒症3指南规定,脓毒症可以通过满足感染的怀疑和大于或等于2的序贯性器官衰竭评估(SOFA)来诊断。在此基础上,SofaNet 采用多通道 GRU 结构来预测不同系统的 SOFA 值,这可以看作是一个辅助任务,以产生更好的脓毒症识别健康状态表示。此外,我们只有在跨中心合作学习时才能在隐藏空间中实现特征分布对齐,从而确保在没有原始数据交换的情况下安全和兼容的知识传输。在两个开放的临床数据集 MIMIC-III 和 Challenge 上的大量实验表明,当医院只有有限的 EMR 时,SofaNet 可以有利于早期脓毒症识别。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Cross-center+Early+Sepsis+Recognition+by+Medical+Knowledge+Guided+Collaborative+Learning+for+Data-scarce+Hospitals)|0| -|[Breaking Filter Bubble: A Reinforcement Learning Framework of Controllable Recommender System](https://doi.org/10.1145/3543507.3583856)|Zhenyang Li, Yancheng Dong, Chen Gao, Yizhou Zhao, Dong Li, Jianye Hao, Kai Zhang, Yong Li, Zhi Wang|Tsinghua-Berkeley Shenzhen Institute, Tsinghua Shenzhen International Graduate School, China and Peng Cheng Laboratory, China; Tsinghua University, China and Huawei Noah's Ark Lab, China; Huawei Noah's Ark Lab, China; Carnegie Mellon University, USA; Tsinghua Shenzhen International Graduate School, Tsinghua University, China and Research Institute of Tsinghua, Pearl River Delta, China; Tsinghua University, China|In the information-overloaded era of the Web, recommender systems that provide personalized content filtering are now the mainstream portal for users to access Web information. Recommender systems deploy machine learning models to learn users’ preferences from collected historical data, leading to more centralized recommendation results due to the feedback loop. As a result, it will harm the ranking of content outside the narrowed scope and limit the options seen by users. In this work, we first conduct data analysis from a graph view to observe that the users’ feedback is restricted to limited items, verifying the phenomenon of centralized recommendation. We further develop a general simulation framework to derive the procedure of the recommender system, including data collection, model learning, and item exposure, which forms a loop. To address the filter bubble issue under the feedback loop, we then propose a general and easy-to-use reinforcement learning-based method, which can adaptively select few but effective connections between nodes from different communities as the exposure list. We conduct extensive experiments in the simulation framework based on large-scale real-world datasets. The results demonstrate that our proposed reinforcement learning-based control method can serve as an effective solution to alleviate the filter bubble and the separated communities induced by it. 
We believe the proposed framework of controllable recommendation in this work can inspire not only the researchers of recommender systems, but also a broader community concerned with artificial intelligence algorithms’ impact on humanity, especially for those vulnerable populations on the Web.|在信息过载的 Web 时代,提供个性化内容过滤的推荐系统现在已经成为用户访问 Web 信息的主流门户。推荐系统部署机器学习模型,从收集的历史数据中了解用户的偏好,由于反馈回路的存在,推荐结果更加集中。因此,它将损害内容在狭窄范围之外的排名,并限制用户看到的选项。本文首先从图的角度进行数据分析,发现用户的反馈仅限于有限的项目,验证了集中推荐的现象。我们进一步开发了一个通用的模拟框架来推导推荐系统的过程,包括数据收集、模型学习和项目曝光,形成一个循环。为了解决反馈回路下的过滤泡问题,提出了一种通用的、易于使用的强化学习方法,该方法可以自适应地选择不同社区节点之间少量但有效的连接作为暴露列表。我们在基于大规模真实世界数据集的仿真框架中进行了广泛的实验。结果表明,本文提出的基于强化学习的控制方法可以作为一种有效的解决方案,以减轻过滤气泡及其引起的社区分离。我们相信,这项工作中提出的可控推荐框架不仅可以激励推荐系统的研究人员,而且可以激励更广泛的社区关注人工智能算法对人类的影响,特别是对那些网络上的弱势群体。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Breaking+Filter+Bubble:+A+Reinforcement+Learning+Framework+of+Controllable+Recommender+System)|0| +|[Cross-center Early Sepsis Recognition by Medical Knowledge Guided Collaborative Learning for Data-scarce Hospitals](https://doi.org/10.1145/3543507.3583989)|Ruiqing Ding, Fangjie Rong, Xiao Han, Leye Wang|Shanghai University of Finance and Economics, China; Peking University, China|There are significant regional inequities in health resources around the world. It has become one of the most focused topics to improve health services for data-scarce hospitals and promote health equity through knowledge sharing among medical institutions. Because electronic medical records (EMRs) contain sensitive personal information, privacy protection is unavoidable and essential for multi-hospital collaboration. In this paper, for a common disease in ICU patients, sepsis, we propose a novel cross-center collaborative learning framework guided by medical knowledge, SofaNet, to achieve early recognition of this disease. The Sepsis-3 guideline, published in 2016, defines that sepsis can be diagnosed by satisfying both suspicion of infection and Sequential Organ Failure Assessment (SOFA) greater than or equal to 2. Based on this knowledge, SofaNet adopts a multi-channel GRU structure to predict SOFA values of different systems, which can be seen as an auxiliary task to generate better health status representations for sepsis recognition. Moreover, we only achieve feature distribution alignment in the hidden space during cross-center collaborative learning, which ensures secure and compliant knowledge transfer without raw data exchange. 
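The Sepsis-3 criterion that SofaNet builds on is simple to state in code: a patient screens positive when suspicion of infection co-occurs with a total SOFA score of at least 2. The per-system scores below are illustrative stand-ins for the multi-channel GRU's predicted values:

```python
def sepsis3_positive(suspected_infection, sofa_by_system):
    # Sepsis-3 screen: suspicion of infection AND total SOFA >= 2.
    return suspected_infection and sum(sofa_by_system.values()) >= 2

# Illustrative per-system scores standing in for SofaNet's predictions.
predicted = {"respiration": 1, "coagulation": 0, "liver": 0,
             "cardiovascular": 1, "cns": 0, "renal": 0}
print(sepsis3_positive(True, predicted))   # True: infection + SOFA total of 2
```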
Extensive experiments on two open clinical datasets, MIMIC-III and Challenge, demonstrate that SofaNet can benefit early sepsis recognition when hospitals only have limited EMRs.|世界各地在卫生资源方面存在着严重的区域不平等。通过医疗机构之间的知识共享,改善数据稀缺的医院的卫生服务,促进卫生公平,已成为最重点的议题之一。由于电子病历(EMR)包含敏感的个人信息,隐私保护是不可避免的,也是多医院合作的必要条件。在这篇文章中,我们针对 ICU 患者常见的一种疾病,败血症,提出了一种新的跨中心合作学习框架,以医学知识为指导,SofaNet,以实现对这种疾病的早期识别。2016年发布的脓毒症3指南规定,脓毒症可以通过满足感染的怀疑和大于或等于2的序贯性器官衰竭评估(SOFA)来诊断。在此基础上,SofaNet 采用多通道 GRU 结构来预测不同系统的 SOFA 值,这可以看作是一个辅助任务,以产生更好的脓毒症识别健康状态表示。此外,我们只有在跨中心合作学习时才能在隐藏空间中实现特征分布对齐,从而确保在没有原始数据交换的情况下安全和兼容的知识传输。在两个开放的临床数据集 MIMIC-III 和 Challenge 上的大量实验表明,当医院只有有限的 EMR 时,SofaNet 可以有利于早期脓毒症识别。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Cross-center+Early+Sepsis+Recognition+by+Medical+Knowledge+Guided+Collaborative+Learning+for+Data-scarce+Hospitals)|0| +|[Breaking Filter Bubble: A Reinforcement Learning Framework of Controllable Recommender System](https://doi.org/10.1145/3543507.3583856)|Zhenyang Li, Yancheng Dong, Chen Gao, Yizhou Zhao, Dong Li, Jianye Hao, Kai Zhang, Yong Li, Zhi Wang|Tsinghua Shenzhen International Graduate School, Tsinghua University, China and Research Institute of Tsinghua, Pearl River Delta, China; Tsinghua University, China; Tsinghua-Berkeley Shenzhen Institute, Tsinghua Shenzhen International Graduate School, China and Peng Cheng Laboratory, China; Tsinghua University, China and Huawei Noah's Ark Lab, China; Huawei Noah's Ark Lab, China; Carnegie Mellon University, USA|In the information-overloaded era of the Web, recommender systems that provide personalized content filtering are now the mainstream portal for users to access Web information. Recommender systems deploy machine learning models to learn users’ preferences from collected historical data, leading to more centralized recommendation results due to the feedback loop. As a result, it will harm the ranking of content outside the narrowed scope and limit the options seen by users. In this work, we first conduct data analysis from a graph view to observe that the users’ feedback is restricted to limited items, verifying the phenomenon of centralized recommendation. We further develop a general simulation framework to derive the procedure of the recommender system, including data collection, model learning, and item exposure, which forms a loop. To address the filter bubble issue under the feedback loop, we then propose a general and easy-to-use reinforcement learning-based method, which can adaptively select few but effective connections between nodes from different communities as the exposure list. We conduct extensive experiments in the simulation framework based on large-scale real-world datasets. The results demonstrate that our proposed reinforcement learning-based control method can serve as an effective solution to alleviate the filter bubble and the separated communities induced by it. 
We believe the proposed framework of controllable recommendation in this work can inspire not only the researchers of recommender systems, but also a broader community concerned with artificial intelligence algorithms’ impact on humanity, especially for those vulnerable populations on the Web.|在信息过载的 Web 时代,提供个性化内容过滤的推荐系统现在已经成为用户访问 Web 信息的主流门户。推荐系统部署机器学习模型,从收集的历史数据中了解用户的偏好,由于反馈回路的存在,推荐结果更加集中。因此,它将损害内容在狭窄范围之外的排名,并限制用户看到的选项。本文首先从图的角度进行数据分析,发现用户的反馈仅限于有限的项目,验证了集中推荐的现象。我们进一步开发了一个通用的模拟框架来推导推荐系统的过程,包括数据收集、模型学习和项目曝光,形成一个循环。为了解决反馈回路下的过滤泡问题,提出了一种通用的、易于使用的强化学习方法,该方法可以自适应地选择不同社区节点之间少量但有效的连接作为暴露列表。我们在基于大规模真实世界数据集的仿真框架中进行了广泛的实验。结果表明,本文提出的基于强化学习的控制方法可以作为一种有效的解决方案,以减轻过滤气泡及其引起的社区分离。我们相信,这项工作中提出的可控推荐框架不仅可以激励推荐系统的研究人员,而且可以激励更广泛的社区关注人工智能算法对人类的影响,特别是对那些网络上的弱势群体。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Breaking+Filter+Bubble:+A+Reinforcement+Learning+Framework+of+Controllable+Recommender+System)|0| |[CollabEquality: A Crowd-AI Collaborative Learning Framework to Address Class-wise Inequality in Web-based Disaster Response](https://doi.org/10.1145/3543507.3583871)|Yang Zhang, Lanyu Shang, Ruohan Zong, Huimin Zeng, Zhenrui Yue, Dong Wang|School of Information Sciences, University of Illinois Urbana-Champaign, USA|Web-based disaster response (WebDR) is emerging as a pervasive approach to acquire real-time situation awareness of disaster events by collecting timely observations from the Web (e.g., social media). This paper studies a class-wise inequality problem in WebDR applications where the objective is to address the limitation of current WebDR solutions that often have imbalanced classification performance across different classes. To address such a limitation, this paper explores the collaborative strengths of the diversified yet complementary biases of AI and crowdsourced human intelligence to ensure a more balanced and accurate performance for WebDR applications. However, two critical challenges exist: i) it is difficult to identify the imbalanced AI results without knowing the ground-truth WebDR labels a priori; ii) it is non-trivial to address the class-wise inequality problem using potentially imperfect crowd labels. To address the above challenges, we develop CollabEquality, an inequality-aware crowd-AI collaborative learning framework that carefully models the inequality bias of both AI and human intelligence from crowdsourcing systems into a principled learning framework. 
Extensive experiments on two real-world WebDR applications demonstrate that CollabEquality consistently outperforms the state-of-the-art baselines by significantly reducing class-wise inequality while improving the WebDR classification accuracy.|基于 Web 的灾难响应(WebDR)正在成为一种普遍的方法,通过从 Web (例如,社交媒体)收集及时的观察结果来获得对灾难事件的实时情况感知。本文研究了 WebDR 应用程序中的类别不等式问题,其目的是解决目前 WebDR 解决方案的局限性,这些解决方案通常在不同类别之间具有不平衡的分类性能。为了解决这一局限性,本文探讨了人工智能和众包人类智能的多样化但互补的偏见的协作优势,以确保更平衡和准确的 WebDR 应用程序的性能。然而,存在两个关键的挑战: 1)在不知道基本事实 WebDR 标签的情况下很难识别不平衡的 AI 结果; 2)使用潜在不完美的群体标签来解决类别不平等问题是不平凡的。为了应对上述挑战,我们开发了 CollabEquality,这是一个意识到不平等的群体人工智能合作学习框架,它仔细地将人工智能和人类智能的不平等偏见从众包系统建模成一个有原则的学习框架。在两个真实世界的 WebDR 应用程序上进行的大量实验表明,CollabEquality 通过显著减少类别不平等,同时提高 WebDR 分类准确性,始终优于最先进的基线。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=CollabEquality:+A+Crowd-AI+Collaborative+Learning+Framework+to+Address+Class-wise+Inequality+in+Web-based+Disaster+Response)|0| |[MoleRec: Combinatorial Drug Recommendation with Substructure-Aware Molecular Representation Learning](https://doi.org/10.1145/3543507.3583872)|Nianzu Yang, Kaipeng Zeng, Qitian Wu, Junchi Yan|Shanghai Jiao Tong University, China|Combinatorial drug recommendation involves recommending a personalized combination of medication (drugs) to a patient over his/her longitudinal history, which essentially aims at solving a combinatorial optimization problem that pursues high accuracy under the safety constraint. Among existing learning-based approaches, the association between drug substructures (i.e., a sub-graph of the molecule that contributes to certain chemical effect) and the target disease is largely overlooked, though the function of drugs in fact exhibits strong relevance with particular substructures. To address this issue, we propose a molecular substructure-aware encoding method entitled MoleRec that entails a hierarchical architecture aimed at modeling inter-substructure interactions and individual substructures’ impact on patient’s health condition, in order to identify those substructures that really contribute to healing patients. Specifically, MoleRec learns to attentively pool over substructure representations which will be element-wisely re-scaled by the model’s inferred relevancy with a patient’s health condition to obtain a prior-knowledge-informed drug representation. We further design a weight annealing strategy for the drug-drug-interaction (DDI) objective to adaptively control the balance between accuracy and safety criteria throughout training. Experiments on the MIMIC-III dataset demonstrate that our approach achieves new state-of-the-art performance w.r.t. four accuracy and safety metrics. 
Our source code is publicly available at https://github.com/yangnianzu0515/MoleRec.|组合药物推荐包括在病人的纵向病史中向病人推荐个性化的药物组合(药物) ,其主要目的是解决在安全约束下追求高准确性的组合优化问题。在现有的基于学习的方法中,药物子结构(即有助于某些化学效应的分子的子图)与目标疾病之间的关联在很大程度上被忽视,尽管药物的功能实际上与特定的子结构显示出强烈的相关性。为了解决这个问题,我们提出了一个名为 MoleRec 的分子子结构感知编码方法,它需要一个层次结构,旨在建模子结构间的相互作用和个体子结构对患者健康状况的影响,以确定那些真正有助于治愈患者的子结构。具体而言,MoleRec 学习仔细汇集子结构表示,这些子结构表示将通过模型与患者健康状况的推断相关性进行元素智能重新缩放,以获得事先知情的药物表示。我们进一步设计了药物-药物相互作用(DDI)目标的权重退火策略,以自适应地控制整个训练过程中准确性和安全性标准之间的平衡。在 MIMIC-III 数据集上的实验表明,我们的方法实现了新的最先进的性能和四个准确性和安全性指标。我们的源代码可以在 https://github.com/yangnianzu0515/molerec 上公开。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=MoleRec:+Combinatorial+Drug+Recommendation+with+Substructure-Aware+Molecular+Representation+Learning)|0| -|[Moral Narratives Around the Vaccination Debate on Facebook](https://doi.org/10.1145/3543507.3583865)|Mariano Gastón Beiró, Jacopo D'Ignazi, Victoria Perez Bustos, Maria Florencia Prado, Kyriaki Kalimeri|; ISI Foundation, Italy; Universidad de Buenos Aires. Facultad de Ingeniería, Paseo Colón 850, C1063ACV, Argentina and CONICET, Universidad de Buenos Aires, INTECIN, Paseo Colón 850, C1063ACV, Argentina|Vaccine hesitancy is a complex issue with psychological, cultural, and even societal factors entangled in the decision-making process. The narrative around this process is captured in our everyday interactions; social media data offer a direct and spontaneous view of peoples' argumentation. Here, we analysed more than 500,000 public posts and comments from Facebook Pages dedicated to the topic of vaccination to study the role of moral values and, in particular, the understudied role of the Liberty moral foundation from the actual user-generated text. We operationalise morality by employing the Moral Foundations Theory, while our proposed framework is based on recurrent neural network classifiers with a short memory and entity linking information. Our findings show that the principal moral narratives around the vaccination debate focus on the values of Liberty, Care, and Authority. Vaccine advocates urge compliance with the authorities as prosocial behaviour to protect society. On the other hand, vaccine sceptics mainly build their narrative around the value of Liberty, advocating for the right to choose freely whether to adhere or not to the vaccination. We contribute to the automatic understanding of vaccine hesitancy drivers emerging from user-generated text, providing concrete insights into the moral framing around vaccination decision-making. 
Especially in emergencies such as the Covid-19 pandemic, contrary to traditional surveys, these insights can be provided contemporary to the event, helping policymakers craft communication campaigns that adequately address the concerns of the hesitant population.|疫苗犹豫不决是一个复杂的问题,心理,文化,甚至社会因素纠缠在决策过程中。围绕这一过程的叙述被捕捉在我们日常的互动中; 社交媒体数据提供了人们争论的直接和自发的视角。在这里,我们分析了超过500,000个来自 Facebook 页面的公开帖子和评论,这些帖子和评论专门针对疫苗接种这一主题,研究道德价值观的作用,特别是从实际的用户生成的文本中研究自由道德基础的被忽视的作用。我们运用道德基础理论来操作道德,而我们提出的框架是基于短记忆的递归神经网络分类器和连接信息的实体。我们的研究结果表明,围绕疫苗接种辩论的主要道德叙事集中在自由、关怀和权威的价值观上。疫苗倡导者敦促当局遵守规定,以此作为保护社会的亲社会行为。另一方面,疫苗怀疑论者主要围绕自由的价值建立他们的叙述,倡导自由选择是否坚持接种疫苗的权利。我们有助于自动理解疫苗犹豫驱动程序出现从用户生成的文本,提供具体的见解围绕疫苗接种决策的道德框架。特别是在2019冠状病毒疾病大流行这样的紧急情况下,与传统的调查不同,这些见解可以在事件发生时提供,帮助政策制定者制定沟通运动,充分解决犹豫不决的民众的关切。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Moral+Narratives+Around+the+Vaccination+Debate+on+Facebook)|0| +|[Moral Narratives Around the Vaccination Debate on Facebook](https://doi.org/10.1145/3543507.3583865)|Mariano Gastón Beiró, Jacopo D'Ignazi, Victoria Perez Bustos, Maria Florencia Prado, Kyriaki Kalimeri|; Universidad de Buenos Aires. Facultad de Ingeniería, Paseo Colón 850, C1063ACV, Argentina and CONICET, Universidad de Buenos Aires, INTECIN, Paseo Colón 850, C1063ACV, Argentina; ISI Foundation, Italy|Vaccine hesitancy is a complex issue with psychological, cultural, and even societal factors entangled in the decision-making process. The narrative around this process is captured in our everyday interactions; social media data offer a direct and spontaneous view of peoples' argumentation. Here, we analysed more than 500,000 public posts and comments from Facebook Pages dedicated to the topic of vaccination to study the role of moral values and, in particular, the understudied role of the Liberty moral foundation from the actual user-generated text. We operationalise morality by employing the Moral Foundations Theory, while our proposed framework is based on recurrent neural network classifiers with a short memory and entity linking information. Our findings show that the principal moral narratives around the vaccination debate focus on the values of Liberty, Care, and Authority. Vaccine advocates urge compliance with the authorities as prosocial behaviour to protect society. On the other hand, vaccine sceptics mainly build their narrative around the value of Liberty, advocating for the right to choose freely whether to adhere or not to the vaccination. We contribute to the automatic understanding of vaccine hesitancy drivers emerging from user-generated text, providing concrete insights into the moral framing around vaccination decision-making. 
Especially in emergencies such as the Covid-19 pandemic, contrary to traditional surveys, these insights can be provided contemporary to the event, helping policymakers craft communication campaigns that adequately address the concerns of the hesitant population.|疫苗犹豫不决是一个复杂的问题,心理,文化,甚至社会因素纠缠在决策过程中。围绕这一过程的叙述被捕捉在我们日常的互动中; 社交媒体数据提供了人们争论的直接和自发的视角。在这里,我们分析了超过500,000个来自 Facebook 页面的公开帖子和评论,这些帖子和评论专门针对疫苗接种这一主题,研究道德价值观的作用,特别是从实际的用户生成的文本中研究自由道德基础的被忽视的作用。我们运用道德基础理论来操作道德,而我们提出的框架是基于短记忆的递归神经网络分类器和连接信息的实体。我们的研究结果表明,围绕疫苗接种辩论的主要道德叙事集中在自由、关怀和权威的价值观上。疫苗倡导者敦促当局遵守规定,以此作为保护社会的亲社会行为。另一方面,疫苗怀疑论者主要围绕自由的价值建立他们的叙述,倡导自由选择是否坚持接种疫苗的权利。我们有助于自动理解疫苗犹豫驱动程序出现从用户生成的文本,提供具体的见解围绕疫苗接种决策的道德框架。特别是在2019冠状病毒疾病大流行这样的紧急情况下,与传统的调查不同,这些见解可以在事件发生时提供,帮助政策制定者制定沟通运动,充分解决犹豫不决的民众的关切。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Moral+Narratives+Around+the+Vaccination+Debate+on+Facebook)|0| |[Exploration of Framing Biases in Polarized Online Content Consumption](https://doi.org/10.1145/3543873.3587534)|Markus ReiterHaas|Institute of Interactive Systems and Data Science, Graz University of Technology, Austria|The study of framing bias on the Web is crucial in our digital age, as the framing of information can influence human behavior and decision on critical issues such as health or politics. Traditional frame analysis requires a curated set of frames derived from manual content analysis by domain experts. In this work, we introduce a frame analysis approach based on pretrained Transformer models that let us capture frames in an exploratory manner beyond predefined frames. In our experiments on two public online news and social media datasets, we show that our approach lets us identify underexplored conceptualizations, such as that health-related content is framed in terms of beliefs for conspiracy media, while mainstream media is instead concerned with science. We anticipate our work to be a starting point for further research on exploratory computational framing analysis using pretrained Transformers.|在我们的数字时代,研究网络上的框架偏见是至关重要的,因为信息的框架可以影响人类的行为和决策,如健康或政治等关键问题。传统的框架分析需要一组精心策划的框架,这些框架来自于领域专家的手工内容分析。在这项工作中,我们介绍了一种基于预先训练的变压器模型的帧分析方法,使我们能够以一种探索性的方式捕获超越预先定义的帧。在我们对两个公共在线新闻和社交媒体数据集的实验中,我们表明,我们的方法让我们确定了未被充分探索的概念化,例如,健康相关的内容被框定在阴谋媒体的信念方面,而主流媒体关注的是科学。我们期望我们的工作是一个开始点,进一步研究探索性计算框架分析使用预先训练的变压器。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Exploration+of+Framing+Biases+in+Polarized+Online+Content+Consumption)|0| -|[On Modeling Long-Term User Engagement from Stochastic Feedback](https://doi.org/10.1145/3543873.3587626)|Guoxi Zhang, Xing Yao, Xuanji Xiao|Shopee Inc., China; China Central Depository & Clearing Co., Ltd., China; Graduate School of Informatics, Kyoto University, Japan|An ultimate goal of recommender systems (RS) is to improve user engagement. Reinforcement learning (RL) is a promising paradigm for this goal, as it directly optimizes overall performance of sequential recommendation. However, many existing RL-based approaches induce huge computational overhead, because they require not only the recommended items but also all other candidate items to be stored. This paper proposes an efficient alternative that does not require the candidate items. The idea is to model the correlation between user engagement and items directly from data. Moreover, the proposed approach consider randomness in user feedback and termination behavior, which are ubiquitous for RS but rarely discussed in RL-based prior work. 
With online A/B experiments on real-world RS, we confirm the efficacy of the proposed approach and the importance of modeling the two types of randomness.|推荐系统(RS)的最终目标是提高用户参与度。强化学习(RL)是这个目标的一个很有前途的范例,因为它直接优化了顺序推荐的整体性能。然而,许多现有的基于 RL 的方法会产生巨大的计算开销,因为它们不仅需要存储推荐的项,而且还需要存储所有其他候选项。本文提出了一种不需要候选项的有效方案。其思想是直接从数据中建模用户参与度和项目之间的相关性。此外,提出的方法考虑了用户反馈和终止行为的随机性,这在 RS 中是普遍存在的,但在基于 RL 的先前工作中很少讨论。通过在现实世界中的在线 A/B 实验,我们证实了该方法的有效性和建模两种类型的随机性的重要性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=On+Modeling+Long-Term+User+Engagement+from+Stochastic+Feedback)|0| +|[On Modeling Long-Term User Engagement from Stochastic Feedback](https://doi.org/10.1145/3543873.3587626)|Guoxi Zhang, Xing Yao, Xuanji Xiao|Shopee Inc., China; Graduate School of Informatics, Kyoto University, Japan; China Central Depository & Clearing Co., Ltd., China|An ultimate goal of recommender systems (RS) is to improve user engagement. Reinforcement learning (RL) is a promising paradigm for this goal, as it directly optimizes overall performance of sequential recommendation. However, many existing RL-based approaches induce huge computational overhead, because they require not only the recommended items but also all other candidate items to be stored. This paper proposes an efficient alternative that does not require the candidate items. The idea is to model the correlation between user engagement and items directly from data. Moreover, the proposed approach considers randomness in user feedback and termination behavior, which are ubiquitous for RS but rarely discussed in RL-based prior work. With online A/B experiments on real-world RS, we confirm the efficacy of the proposed approach and the importance of modeling the two types of randomness.|推荐系统(RS)的最终目标是提高用户参与度。强化学习(RL)是这个目标的一个很有前途的范例,因为它直接优化了顺序推荐的整体性能。然而,许多现有的基于 RL 的方法会产生巨大的计算开销,因为它们不仅需要存储推荐的项,而且还需要存储所有其他候选项。本文提出了一种不需要候选项的有效方案。其思想是直接从数据中建模用户参与度和项目之间的相关性。此外,提出的方法考虑了用户反馈和终止行为的随机性,这在 RS 中是普遍存在的,但在基于 RL 的先前工作中很少讨论。通过在现实世界中的在线 A/B 实验,我们证实了该方法的有效性和建模两种类型的随机性的重要性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=On+Modeling+Long-Term+User+Engagement+from+Stochastic+Feedback)|0| |[CaML: Carbon Footprinting of Household Products with Zero-Shot Semantic Text Similarity](https://doi.org/10.1145/3543507.3583882)|Bharathan Balaji, Venkata Sai Gargeya Vunnava, Geoffrey Guest, Jared Kramer|Amazon, USA|Products contribute to carbon emissions in each phase of their life cycle, from manufacturing to disposal. Estimating the embodied carbon in products is a key step towards understanding their impact, and undertaking mitigation actions. Precise carbon attribution is challenging at scale, requiring both domain expertise and granular supply chain data. As a first-order approximation, standard reports use Economic Input-Output based Life Cycle Assessment (EIO-LCA) which estimates carbon emissions per dollar at an industry sector level using transactions between different parts of the economy. EIO-LCA models map products to an industry sector, and use the corresponding carbon per dollar estimates to calculate the embodied carbon footprint of a product. An LCA expert needs to map each product to one of upwards of 1000 potential industry sectors. To reduce the annotation burden, the standard practice is to group products by categories, and map categories to their corresponding industry sector. We present CaML, an algorithm to automate EIO-LCA using semantic text similarity matching by leveraging the text descriptions of the product and the industry sector. 
CaML uses a pre-trained sentence transformer model to rank the top-5 matches, and asks a human to check if any of them are a good match. We annotated 40K products using non-experts. Our results reveal that pre-defined product categories are heterogeneous with respect to EIO-LCA industry sectors, and lead to a large mean absolute percentage error (MAPE) of 51% in kgCO2e/$. CaML outperforms the previous manually intensive method, yielding a MAPE of 22% with no domain labels (zero-shot). We compared annotations of a small sample of 210 products with LCA experts, and found that CaML accuracy is comparable to that of annotations by non-experts.|产品在其生命周期的每个阶段(从生产到处理)都会造成碳排放。评估产品中的含碳量是了解其影响并采取缓解行动的关键一步。精确的碳归属在规模上具有挑战性,需要领域专业知识和细粒度供应链数据。作为一阶近似,标准报告使用基于经济投入产出的生命周期评估(EIO-LCA) ,该评估利用不同经济部门之间的交易,在行业部门水平上估计每美元的碳排放量。生命周期评估模型将产品映射到一个行业部门,并使用相应的每美元碳排放估计值来计算产品的碳足印。LCA 专家需要将每个产品映射到1000个以上的潜在行业部门之一。为了减少注释负担,标准实践是按类别对产品进行分组,并将类别映射到相应的行业部门。我们提出了 CaML,一种利用产品和工业部门的文本描述,使用语义文本相似性匹配实现 EIO-LCA 自动化的算法。CaML 使用一个预先训练好的句子转换模型对前5个匹配项进行排序,并要求人类检查它们中是否有一个是良好匹配的。我们用非专家注释40K 产品。我们的研究结果表明,预定义的产品类别相对于 EIO-LCA 行业部门是异构的,并导致 kgCO2e/$的大平均绝对百分比误差(MAPE)为51% 。CaML 优于以前的手动密集型方法,产生的 MAPE 为22% ,没有域标签(0-shot)。我们将210个产品的小样本注释与 LCA 专家进行了比较,发现 CaML 的准确性与非专家的注释相当。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=CaML:+Carbon+Footprinting+of+Household+Products+with+Zero-Shot+Semantic+Text+Similarity)|0| -|[RDF Playground: An Online Tool for Learning about the Semantic Web](https://doi.org/10.1145/3543873.3587325)|Bastián Inostroza, Raúl Cid, Aidan Hogan|DCC, Universidad de Chile, Chile; DCC, Universidad de Chile, Chile and Instituto Milenio Fundamentos de los Datos (IMFD), Chile|We present RDF Playground: a web-based tool to assist those who wish to learn or teach about the Semantic Web. The tool integrates functionalities relating to the key features of RDF, allowing users to specify an RDF graph in Turtle syntax, visualise it as an interactive graph, query it using SPARQL, reason over it using OWL 2 RL, and to validate it using SHACL or ShEx. The tool further provides the ability to import and explore data from the Web through a graph-based Linked Data browser. We discuss the design and functionality of the tool, its implementation, and the results of a usability study considering students from a Web of Data course that used it for lab assignments. We conclude with a discussion of these results, as well as future directions that we envisage for improving the tool.|我们介绍 RDF Playground: 一个基于 Web 的工具,用于帮助那些希望学习或教授语义 Web 的人。该工具集成了与 RDF 关键特性相关的功能,允许用户用 Turtle 语法指定一个 RDF 图形,将其可视化为一个交互式图形,使用 SPARQL 查询它,使用 OWL 2 RL 推理它,并使用 SHACL 或 ShEx 验证它。该工具还提供了通过基于图形的关联数据浏览器从 Web 导入和探索数据的能力。我们讨论了该工具的设计和功能,它的实现,以及一个可用性研究的结果,该研究考虑了使用它完成实验作业的数据网络课程的学生。最后,我们讨论了这些结果以及我们设想的改进该工具的未来方向。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=RDF+Playground:+An+Online+Tool+for+Learning+about+the+Semantic+Web)|0| +|[RDF Playground: An Online Tool for Learning about the Semantic Web](https://doi.org/10.1145/3543873.3587325)|Bastián Inostroza, Raúl Cid, Aidan Hogan|DCC, Universidad de Chile, Chile and Instituto Milenio Fundamentos de los Datos (IMFD), Chile; DCC, Universidad de Chile, Chile|We present RDF Playground: a web-based tool to assist those who wish to learn or teach about the Semantic Web. The tool integrates functionalities relating to the key features of RDF, allowing users to specify an RDF graph in Turtle syntax, visualise it as an interactive graph, query it using SPARQL, reason over it using OWL 2 RL, and validate it using SHACL or ShEx. 
The tool further provides the ability to import and explore data from the Web through a graph-based Linked Data browser. We discuss the design and functionality of the tool, its implementation, and the results of a usability study considering students from a Web of Data course that used it for lab assignments. We conclude with a discussion of these results, as well as future directions that we envisage for improving the tool.|我们介绍 RDF Playground: 一个基于 Web 的工具,用于帮助那些希望学习或教授语义 Web 的人。该工具集成了与 RDF 关键特性相关的功能,允许用户用 Turtle 语法指定一个 RDF 图形,将其可视化为一个交互式图形,使用 SPARQL 查询它,使用 OWL 2 RL 推理它,并使用 SHACL 或 ShEx 验证它。该工具还提供了通过基于图形的关联数据浏览器从 Web 导入和探索数据的能力。我们讨论了该工具的设计和功能,它的实现,以及一个可用性研究的结果,该研究考虑了使用它完成实验作业的数据网络课程的学生。最后,我们讨论了这些结果以及我们设想的改进该工具的未来方向。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=RDF+Playground:+An+Online+Tool+for+Learning+about+the+Semantic+Web)|0| |[Locating Faulty Applications via Semantic and Topology Estimation](https://doi.org/10.1145/3543873.3584660)|Shuyi Niu, Jiawei Jin, Xiutian Huang, Yonggeng Wang, Wenhao Xu, Youyong Kong|Southeast University, China; Ant Group, China|With the explosion of Internet product users, how to locate the faulty ones from numerous back-end applications after a customer complaint has become an essential issue in improving user experience. However, existing solutions mostly rely on manual testing to infer the fault, severely limiting their efficiency. In this paper, we transform the problem of locating faulty applications into two subproblems and propose a fully automated framework. We design a scorecard model in one stage to evaluate the semantic relevance between applications and customer complaints. Then in the other stage, topology graphs that reflect the actual calling relationship and engineering connection relationship between applications are utilized to evaluate the topology relevance between applications. Specifically, we employ a multi-graph co-learning framework constrained by consistency-independence loss and an engineering-theory-driven clustering strategy for the unsupervised learning of graphs. With semantic and topology relevance, we can comprehensively locate relevant faulty applications. Experiments on the Alipay dataset show that our method gains significant improvements in both model performance and efficiency.|随着互联网产品用户的爆炸式增长,如何在用户投诉之后从众多的后端应用程序中定位出故障用户已成为提高用户体验的关键问题。然而,现有的解决方案大多依赖于人工测试来推断故障,这严重限制了它们的效率。本文将故障应用定位问题转化为两个子问题,并提出了一个全自动化的框架。我们在一个阶段中设计了一个记分卡模型来评估应用程序和客户投诉之间的语义相关性。然后在另一个阶段,利用反映实际调用关系和应用间工程连接关系的拓扑图来评估应用间的拓扑相关性。具体来说,我们采用了一个受一致性独立性损失约束的多图协同学习框架和一个工程理论驱动的聚类策略来处理图的非监督式学习。通过语义和拓扑相关性,我们可以全面定位相关的故障应用。在支付宝数据集上的实验表明,该方法在模型性能和效率方面都得到了显著的改善。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Locating+Faulty+Applications+via+Semantic+and+Topology+Estimation)|0| |[Analyzing COVID-Related Social Discourse on Twitter using Emotion, Sentiment, Political Bias, Stance, Veracity and Conspiracy Theories](https://doi.org/10.1145/3543873.3587622)|Youri Peskine, Raphaël Troncy, Paolo Papotti|EURECOM, France|Online misinformation has become a major concern in recent years, and it has been further emphasized during the COVID-19 pandemic. Social media platforms, such as Twitter, can be serious vectors of misinformation online. In order to better understand the spread of these fake-news, lies, deceptions, and rumours, we analyze the correlations between the following textual features in tweets: emotion, sentiment, political bias, stance, veracity and conspiracy theories. 
We train several transformer-based classifiers from multiple datasets to detect these textual features and identify potential correlations using conditional distributions of the labels. Our results show that the online discourse regarding some topics, such as COVID-19 regulations or conspiracy theories, is highly controversial and reflects the actual U.S. political landscape.|近年来,网上虚假信息已成为一个主要问题,在2019冠状病毒疾病大流行期间,这一问题得到了进一步强调。像 Twitter 这样的社交媒体平台可能是网络上错误信息的严重载体。为了更好地理解这些假新闻、谎言、欺骗和谣言的传播,我们分析了以下推文文本特征之间的相关性: 情绪、情绪、政治偏见、立场、真实性和阴谋论。我们训练了多个来自多个数据集的基于转换器的分类器来检测这些文本特征,并使用标签的条件分布来识别潜在的相关性。我们的研究结果表明,网上关于某些话题的讨论,比如2019冠状病毒疾病监管或阴谋论,具有高度的争议性,反映了美国实际的政治环境。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Analyzing+COVID-Related+Social+Discourse+on+Twitter+using+Emotion,+Sentiment,+Political+Bias,+Stance,+Veracity+and+Conspiracy+Theories)|0| -|[Machine Learning for Streaming Media](https://doi.org/10.1145/3543873.3589751)|Sudarshan Lamkhede, Praveen Chandar, Vladan Radosavljevic, Amit Goyal, Lan Luo|Amazon Music, USA; University of Southern California, USA; Spotify, USA; Netflix Research, USA|Streaming media has become a popular medium for consumers of all ages, with people spending several hours a day streaming videos, games, music, or podcasts across devices. Most global streaming services have introduced Machine Learning (ML) into their operations to personalize consumer experience, improve content, and further enhance the value proposition of streaming services. Despite the rapid growth, there is a need to bridge the gap between academic research and industry requirements and build connections between researchers and practitioners in the field. This workshop aims to provide a unique forum for practitioners and researchers interested in Machine Learning to get together, exchange ideas and get a pulse for the state of the art in research and burning issues in the industry.|流媒体已经成为所有年龄段消费者的流行媒体,人们每天花几个小时在各种设备上观看视频、游戏、音乐或播客。大多数全球流媒体服务已经将机器学习(ML)引入到它们的运营中,以个性化消费者体验、改善内容并进一步提高流媒体服务的价值主张。尽管增长迅速,但仍需要弥合学术研究和行业需求之间的差距,并在该领域的研究人员和从业人员之间建立联系。本研讨会旨在为对机器学习感兴趣的从业人员和研究人员提供一个独特的论坛,让他们聚集在一起,交流思想,了解行业研究的最新进展和热点问题。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Machine+Learning+for+Streaming+Media)|0| +|[Machine Learning for Streaming Media](https://doi.org/10.1145/3543873.3589751)|Sudarshan Lamkhede, Praveen Chandar, Vladan Radosavljevic, Amit Goyal, Lan Luo|University of Southern California, USA; Spotify, USA; Amazon Music, USA; Netflix Research, USA|Streaming media has become a popular medium for consumers of all ages, with people spending several hours a day streaming videos, games, music, or podcasts across devices. Most global streaming services have introduced Machine Learning (ML) into their operations to personalize consumer experience, improve content, and further enhance the value proposition of streaming services. Despite the rapid growth, there is a need to bridge the gap between academic research and industry requirements and build connections between researchers and practitioners in the field. 
This workshop aims to provide a unique forum for practitioners and researchers interested in Machine Learning to get together, exchange ideas and get a pulse for the state of the art in research and burning issues in the industry.|流媒体已经成为所有年龄段消费者的流行媒体,人们每天花几个小时在各种设备上观看视频、游戏、音乐或播客。大多数全球流媒体服务已经将机器学习(ML)引入到它们的运营中,以个性化消费者体验、改善内容并进一步提高流媒体服务的价值主张。尽管增长迅速,但仍需要弥合学术研究和行业需求之间的差距,并在该领域的研究人员和从业人员之间建立联系。本研讨会旨在为对机器学习感兴趣的从业人员和研究人员提供一个独特的论坛,让他们聚集在一起,交流思想,了解行业研究的最新进展和热点问题。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Machine+Learning+for+Streaming+Media)|0| |[Interleaved Online Testing in Large-Scale Systems](https://doi.org/10.1145/3543873.3587572)|Nan Bi, Bai Li, Ruoyuan Gao, Graham Edge, Sachin Ahuja|Amazon, USA|Online testing is indispensable in decision making for information retrieval systems. Interleaving emerges as an online testing method with orders of magnitude higher sensitivity than the pervading A/B testing. It merges the compared results into a single interleaved result to show to users, and attributes user actions back to the systems being tested. However, its pairwise design also brings practical challenges to real-world systems, in terms of effectively comparing multiple (more than two) systems and interpreting the magnitude of raw interleaving measurement. We present two novel methods to address these challenges that make interleaving practically applicable. The first method infers the ordering of multiple systems based on interleaving pairwise results with false discovery control. The second method estimates A/B effect size based on interleaving results using a weighted linear model that adjusts for uncertainties of different measurements. We showcase the effectiveness of our methods in large-scale e-commerce experiments, reporting as many as 75 interleaving results, and provide extensive evaluations of their underlying assumptions.|在线测试对于信息检索系统的决策是不可或缺的。交错测试作为一种在线测试方法出现,其灵敏度数量级高于普遍采用的 A/B 测试。它将比较的结果合并到一个交错的结果中以显示给用户,并将用户操作归结到正在测试的系统。然而,它的成对设计也给现实世界的系统带来了实际的挑战,就有效地比较多个(两个以上)系统和解释原始交错测量的大小而言。我们提出了两种新的方法来解决这些挑战,使交织实际上适用。第一种方法是基于错误发现控制交织成对结果来推断多系统的排序。第二种方法基于交错结果估计 A/B 效应大小,使用加权线性模型来调整不同测量的不确定性。我们展示了我们的方法在大规模电子商务实验中的有效性,报告了多达75个交错的结果,并提供了对其基本假设的广泛评估。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Interleaved+Online+Testing+in+Large-Scale+Systems)|0| |[Impact of COVID-19 Pandemic on Cultural Products Interests](https://doi.org/10.1145/3543873.3587594)|Ke Li, Zhiwen Yu, Ying Zhang, Bin Guo|School of Computer Science, Northwestern Polytechnical University, China and Harbin Engineering University, China; School of Computer Science, Northwestern Polytechnical University, China|The COVID-19 pandemic has had a significant impact on human behaviors and how it influenced people’s interests in cultural products is an unsolved problem. While prior studies mostly adopt subjective surveys to find an answer, these methods suffer from high cost, limited size, and subjective bias. Inspired by the rich user-oriented data over the Internet, this work explores the possibility to leverage users’ search logs to reflect humans’ underlying cultural product interests. To further examine how the COVID-19 mobility policy might influence cultural interest changes, we propose a new regression discontinuity design that has the additional potential to predict the recovery phase of people’s cultural product interests. 
By analyzing the 1592 search interest time series in 6 countries, we found different patterns of change in interest in movies, music, and art during the COVID-19 pandemic, but a clear overall incremental increase. Across the six countries we studied, we found that changes in interest in cultural products were strongly correlated with mobility and that as mobility declined, interest in movies, music, and art increased by an average of 35, 27 and 20, respectively, with these changes lasting at least eight weeks.|2019冠状病毒疾病疫情对人类行为产生了重大影响,它如何影响人们对文化产品的兴趣是一个尚未解决的问题。以往的研究大多采用主观调查的方法来寻找答案,但这些方法往往成本高、规模有限、存在主观偏差。受互联网上丰富的以用户为导向的数据的启发,这项工作探索了利用用户的搜索日志来反映人类潜在的文化产品兴趣的可能性。为了进一步研究2019冠状病毒疾病流动政策可能如何影响文化兴趣的变化,我们提出了一种新的回归不连续性设计,它具有额外的潜力来预测人们的文化产品兴趣的恢复阶段。通过分析6个国家1592年的搜索兴趣时间序列,我们发现在2019冠状病毒疾病大流行期间,人们对电影、音乐和艺术的兴趣有不同的变化模式,但总体上有明显的增长。在我们研究的六个国家中,我们发现对文化产品兴趣的变化与流动性密切相关,随着流动性的下降,对电影、音乐和艺术的兴趣分别平均增加了35、27和20,这些变化至少持续了八周。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Impact+of+COVID-19+Pandemic+on+Cultural+Products+Interests)|0| -|[Enhancing Data Space Semantic Interoperability through Machine Learning: a Visionary Perspective](https://doi.org/10.1145/3543873.3587658)|Zeyd Boukhers, Christoph Lange, Oya Beyan|Fraunhofer Institute for Applied Information Technology, Germany and Faculty of Medicine and University Hospital Cologne, University of Cologne, Germany; Faculty of Medicine and University Hospital Cologne, University of Cologne, Germany and Fraunhofer Institute for Applied Information Technology, Germany; Fraunhofer Institute for Applied Information Technology, Germany and RWTH Aachen University, Germany|Our vision paper outlines a plan to improve the future of semantic interoperability in data spaces through the application of machine learning. The use of data spaces, where data is exchanged among members in a self-regulated environment, is becoming increasingly popular. However, the current manual practices of managing metadata and vocabularies in these spaces are time-consuming, prone to errors, and may not meet the needs of all stakeholders. By leveraging the power of machine learning, we believe that semantic interoperability in data spaces can be significantly improved. This involves automatically generating and updating metadata, which results in a more flexible vocabulary that can accommodate the diverse terminologies used by different sub-communities. 
Our vision for the future of data spaces addresses the limitations of conventional data exchange and makes data more accessible and valuable for all members of the community.|我们的愿景文件概述了一个通过机器学习应用改善数据空间语义互操作性未来的计划。数据空间的使用正变得越来越流行,数据空间是在一个自我管理的环境中在成员之间交换数据的地方。然而,在这些空间中管理元数据和词汇表的当前手工实践非常耗时,容易出错,并且可能不能满足所有涉众的需求。通过利用机器学习的力量,我们相信数据空间的语义互操作性可以得到显著改善。这涉及到自动生成和更新元数据,从而产生更灵活的词汇表,可以适应不同子社区使用的不同术语。我们对数据空间未来的展望解决了传统数据交换的局限性,并使数据对社区的所有成员更容易获得和更有价值。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Enhancing+Data+Space+Semantic+Interoperability+through+Machine+Learning:+a+Visionary+Perspective)|0| +|[Enhancing Data Space Semantic Interoperability through Machine Learning: a Visionary Perspective](https://doi.org/10.1145/3543873.3587658)|Zeyd Boukhers, Christoph Lange, Oya Beyan|Fraunhofer Institute for Applied Information Technology, Germany and RWTH Aachen University, Germany; Fraunhofer Institute for Applied Information Technology, Germany and Faculty of Medicine and University Hospital Cologne, University of Cologne, Germany; Faculty of Medicine and University Hospital Cologne, University of Cologne, Germany and Fraunhofer Institute for Applied Information Technology, Germany|Our vision paper outlines a plan to improve the future of semantic interoperability in data spaces through the application of machine learning. The use of data spaces, where data is exchanged among members in a self-regulated environment, is becoming increasingly popular. However, the current manual practices of managing metadata and vocabularies in these spaces are time-consuming, prone to errors, and may not meet the needs of all stakeholders. By leveraging the power of machine learning, we believe that semantic interoperability in data spaces can be significantly improved. This involves automatically generating and updating metadata, which results in a more flexible vocabulary that can accommodate the diverse terminologies used by different sub-communities. Our vision for the future of data spaces addresses the limitations of conventional data exchange and makes data more accessible and valuable for all members of the community.|我们的愿景文件概述了一个通过机器学习应用改善数据空间语义互操作性未来的计划。数据空间的使用正变得越来越流行,数据空间是在一个自我管理的环境中在成员之间交换数据的地方。然而,在这些空间中管理元数据和词汇表的当前手工实践非常耗时,容易出错,并且可能不能满足所有涉众的需求。通过利用机器学习的力量,我们相信数据空间的语义互操作性可以得到显著改善。这涉及到自动生成和更新元数据,从而产生更灵活的词汇表,可以适应不同子社区使用的不同术语。我们对数据空间未来的展望解决了传统数据交换的局限性,并使数据对社区的所有成员更容易获得和更有价值。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Enhancing+Data+Space+Semantic+Interoperability+through+Machine+Learning:+a+Visionary+Perspective)|0| |[The PLASMA Framework: Laying the Path to Domain-Specific Semantics in Dataspaces](https://doi.org/10.1145/3543873.3587662)|Alexander Paulus, André Pomp, Tobias Meisen|Institute for Technologies and Management of Digital Transformation, University of Wuppertal, Germany|Modern data management is evolving from centralized integration-based solutions to a non-integration-based process of finding, accessing and processing data, as observed within dataspaces. Common reference dataspace architectures assume that sources publish their own domain-specific schema. These schemas, also known as semantic models, can only be partially created automatically and require oversight and refinement by human modellers. Non-expert users, such as mechanical engineers or municipal workers, often have difficulty building models because they are faced with multiple ontologies, classes, and relations, and existing tools are not designed for non-expert users. 
The PLASMA framework consists of a platform and auxiliary services that focus on providing non-expert users with an accessible way to create and edit semantic models, combining automation approaches and support systems such as a recommendation engine. It also provides data conversion from raw data to RDF. In this paper we highlight the main features, like the modeling interface and the data conversion engine. We discuss how PLASMA as a tool is suitable for building semantic models by non-expert users in the context of dataspaces and show some applications where PLASMA has already been used in data management projects.|现代数据管理正在从基于集中式集成的解决方案演变为基于非集成的查找、访问和处理数据的过程,正如在数据空间中观察到的那样。通用参考数据空间体系结构假设源发布自己的特定于域的模式。这些模式也称为语义模型,只能部分自动创建,需要人类建模者进行监督和细化。非专家用户,如机械工程师或市政工人,往往难以建立模型,因为他们面临着多个本体、类和关系,现有的工具不是为非专家用户设计的。PLASMA 框架由一个平台和辅助服务组成,侧重于为非专家用户提供创建和编辑语义模型的便捷方式,将自动化方法和推荐引擎等支持系统结合起来。它还提供从原始数据到 RDF 的数据转换。本文着重介绍了建模接口和数据转换引擎等主要特性。我们讨论了 PLASMA 作为一种工具是如何适用于非专家用户在数据空间上下文中建立语义模型的,并展示了 PLASMA 已经在数据管理项目中使用的一些应用。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=The+PLASMA+Framework:+Laying+the+Path+to+Domain-Specific+Semantics+in+Dataspaces)|0| -|[Pairwise-interactions-based Bayesian Inference of Network Structure from Information Cascades](https://doi.org/10.1145/3543507.3583231)|Chao Gao, Yuchen Wang, Zhen Wang, Xianghua Li, Xuelong Li|School of Artificial Intelligence, Optics and Electronics (iOPEN), Northwestern Polytechnical University, China; Northwestern Polytechnical University, China|An explicit network structure plays an important role when analyzing and understanding diffusion processes. In many scenarios, however, the interactions between nodes in an underlying network are unavailable. Although many methods for inferring a network structure from observed cascades have been proposed, they did not perceive the relationship between pairwise interactions in a cascade. Therefore, this paper proposes a Pairwise-interactions-based Bayesian Inference method (named PBI) to infer the underlying diffusion network structure. More specifically, to get more accurate inference results, we measure the weights of each candidate pairwise interaction in different cascades and add them to the likelihood of a contagion process. In addition, a pre-pruning work is introduced for candidate edges to further improve the inference efficiency. Experiments on synthetic and real-world networks show that PBI achieves significantly better results.|显式的网络结构在分析和理解扩散过程中起着重要作用。然而,在许多场景中,底层网络中的节点之间的交互是不可用的。虽然已经提出了许多从观察到的级联推断网络结构的方法,但它们没有认识到级联中成对相互作用之间的关系。因此,本文提出了一种基于成对互动的贝叶斯推断方法(称为 PBI)来推断基础扩散网络的结构。更具体地说,为了得到更准确的推断结果,我们测量了不同级联中每个候选者成对相互作用的权重,并将它们加到传染过程的可能性上。此外,为了进一步提高推理效率,引入了候选边缘的预剪枝方法。在合成网络和真实网络上的实验表明,PBI 取得了明显的改善效果。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Pairwise-interactions-based+Bayesian+Inference+of+Network+Structure+from+Information+Cascades)|0| +|[Pairwise-interactions-based Bayesian Inference of Network Structure from Information Cascades](https://doi.org/10.1145/3543507.3583231)|Chao Gao, Yuchen Wang, Zhen Wang, Xianghua Li, Xuelong Li|Northwestern Polytechnical University, China; School of Artificial Intelligence, Optics and Electronics (iOPEN), Northwestern Polytechnical University, China|An explicit network structure plays an important role when analyzing and understanding diffusion processes. In many scenarios, however, the interactions between nodes in an underlying network are unavailable. 
Although many methods for inferring a network structure from observed cascades have been proposed, they do not perceive the relationship between pairwise interactions in a cascade. Therefore, this paper proposes a Pairwise-interactions-based Bayesian Inference method (named PBI) to infer the underlying diffusion network structure. More specifically, to get more accurate inference results, we measure the weights of each candidate pairwise interaction in different cascades and add them to the likelihood of a contagion process. In addition, a pre-pruning step is introduced for candidate edges to further improve the inference efficiency. Experiments on synthetic and real-world networks show that PBI achieves significantly better results.|显式的网络结构在分析和理解扩散过程中起着重要作用。然而,在许多场景中,底层网络中的节点之间的交互是不可用的。虽然已经提出了许多从观察到的级联推断网络结构的方法,但它们没有认识到级联中成对相互作用之间的关系。因此,本文提出了一种基于成对互动的贝叶斯推断方法(称为 PBI)来推断基础扩散网络的结构。更具体地说,为了得到更准确的推断结果,我们测量了不同级联中每个候选者成对相互作用的权重,并将它们加到传染过程的可能性上。此外,为了进一步提高推理效率,引入了候选边缘的预剪枝方法。在合成网络和真实网络上的实验表明,PBI 取得了明显的改善效果。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Pairwise-interactions-based+Bayesian+Inference+of+Network+Structure+from+Information+Cascades)|0| |[Graph Neural Network with Two Uplift Estimators for Label-Scarcity Individual Uplift Modeling](https://doi.org/10.1145/3543507.3583368)|Dingyuan Zhu, Daixin Wang, Zhiqiang Zhang, Kun Kuang, Yan Zhang, Yulin Kang, Jun Zhou|Ant Group, China; Zhejiang University, China|Uplift modeling aims to measure the incremental effect, which we call uplift, of a strategy or action on the users from randomized experiments or observational data. Most existing uplift methods only use individual data, which are usually not informative enough to capture the unobserved and complex hidden factors regarding the uplift. Furthermore, the uplift modeling scenario usually has scarce labeled data, especially for the treatment group, which also poses a great challenge for model training. Considering that the neighbors’ features and the social relationships are very informative to characterize a user’s uplift, we propose a graph neural network-based framework with two uplift estimators, called GNUM, to learn from the social graph for uplift estimation. Specifically, we design the first estimator based on a class-transformed target. The estimator is general for all types of outcomes, and is able to comprehensively model the treatment and control group data together to approach the uplift. When the outcome is discrete, we further design the other uplift estimator based on our defined partial labels, which is able to utilize more labeled data from both the treatment and control groups, to further alleviate the label scarcity problem. Comprehensive experiments on a public dataset and two industrial datasets show a superior performance of our proposed framework over state-of-the-art methods under various evaluation metrics. 
The proposed algorithms have been deployed online to serve real-world uplift estimation scenarios.|提升模型的目的是通过随机实验或观察数据来衡量策略或行动对用户的增量效应,我们称之为提升。大多数现有的抬升方法只使用单独的数据,这些数据通常不足以获取关于抬升的未观测到的复杂的隐藏因素。此外,抬升模型场景通常缺乏标记数据,特别是对于治疗组,这也对模型训练提出了很大的挑战。考虑到邻居的特征和社会关系对于表征用户的提升是非常有用的,我们提出了一种基于图神经网络的提升估计框架,称为 GNUM,以学习社会图的提升估计。具体地说,我们设计了基于类转换目标的第一个估计器。该估计值对于所有类型的结果都是通用的,并且能够将治疗组和对照组的数据综合建模以接近隆起。当结果是离散的,我们进一步设计其他提升估计的基础上我们定义的部分标签,它能够利用更多的标记数据从治疗组和对照组,以进一步减轻标签稀缺问题。对一个公共数据集和两个工业数据集的综合实验表明,在各种评估指标下,我们提出的框架比最先进的方法具有更好的性能。提出的算法已经在线部署,以服务于真实世界的抬升估计场景。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Graph+Neural+Network+with+Two+Uplift+Estimators+for+Label-Scarcity+Individual+Uplift+Modeling)|0| |[Multi-head Variational Graph Autoencoder Constrained by Sum-product Networks](https://doi.org/10.1145/3543507.3583517)|Riting Xia, Yan Zhang, Chunxu Zhang, Xueyan Liu, Bo Yang|Jilin University, China|Variational graph autoencoder (VGAE) is a promising deep probabilistic model in graph representation learning. However, most existing VGAEs adopt the mean-field assumption, and cannot characterize graphs with noise well. In this paper, we propose a novel deep probabilistic model for graph analysis, termed Multi-head Variational Graph Autoencoder Constrained by Sum-product Networks (named SPN-MVGAE), which helps to relax the mean-field assumption and learns better latent representation with fault tolerance. Our proposed model SPN-MVGAE uses conditional sum-product networks as constraints to learn the dependencies between latent factors in an end-to-end manner. Furthermore, we introduce the superposition of the latent representations learned by multiple variational networks to represent the final latent representations of nodes. Our model is the first to use sum-product networks for graph representation learning, extending the scope of sum-product network applications. Experimental results show that compared with other baseline methods, our model has competitive advantages in link prediction, fault tolerance, node classification, and graph visualization on real datasets.|变分图自动编码器(VGAE)是图表示学习中一种很有前途的深度概率模型。然而,现有的 VGAE 大多采用平均场假设,不能很好地刻画有噪声的图。本文提出了一种新的图分析的深度概率模型,称为和积网络约束下的多头变分图自动编码器(SPN-MVGAE)。我们提出的模型 SPN-MVGAE 使用条件和积网络作为约束,以端到端的方式学习潜在因素之间的相关性。此外,我们还引入了多变分网络学习的潜在表征的叠加来表示节点的最终潜在表征。该模型首次将和积网络用于图表示学习,扩展了和积网络的应用范围。实验结果表明,与其他基线方法相比,该模型在实际数据集的链接预测、容错、节点分类和图形可视化等方面具有优势。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Multi-head+Variational+Graph+Autoencoder+Constrained+by+Sum-product+Networks)|0| -|[Interactive Log Parsing via Light-weight User Feedback](https://doi.org/10.1145/3543507.3583456)|Liming Wang, Hong Xie, Ye Li, Jian Tan, John C. S. Lui|College of Computer Science, Chongqing University, China; Alibaba, Hong Kong; Alibaba, China; Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong|Template mining is one of the foundational tasks to support log analysis, which supports the diagnosis and troubleshooting of large scale Web applications. This paper develops a human-in-the-loop template mining framework to support interactive log analysis, which is highly desirable in real-world diagnosis or troubleshooting of Web applications but yet previous template mining algorithms fails to support it. We formulate three types of light-weight user feedbacks and based on them we design three atomic human-in-the-loop template mining algorithms. We derive mild conditions under which the outputs of our proposed algorithms are provably correct. 
We also derive upper bounds on the computational complexity and query complexity of each algorithm. We demonstrate the versatility of our proposed algorithms by combining them to improve the template mining accuracy of five representative algorithms over sixteen widely used benchmark datasets.|模板挖掘是支持日志分析的基本任务之一,它支持大规模 Web 应用程序的诊断和故障排除。本文提出了一种支持交互式日志分析的半人工模板挖掘框架,该框架在 Web 应用程序的实际诊断和故障排除中非常有用,但以往的模板挖掘算法都不支持。我们提出了三种轻量级用户反馈,并在此基础上设计了三种原子人在环模板挖掘算法。我们推导出我们提出的算法输出可证明正确的温和条件。我们还推导了每种算法的计算复杂度和查询复杂度的上界。我们证明了我们提出的算法的通用性,通过结合他们来提高模板挖掘准确性的五个代表性算法超过16个广泛使用的基准数据集。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Interactive+Log+Parsing+via+Light-weight+User+Feedback)|0| +|[Interactive Log Parsing via Light-weight User Feedback](https://doi.org/10.1145/3543507.3583456)|Liming Wang, Hong Xie, Ye Li, Jian Tan, John C. S. Lui|Alibaba, China; Alibaba, Hong Kong; Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong; College of Computer Science, Chongqing University, China|Template mining is one of the foundational tasks to support log analysis, which supports the diagnosis and troubleshooting of large scale Web applications. This paper develops a human-in-the-loop template mining framework to support interactive log analysis, which is highly desirable in real-world diagnosis or troubleshooting of Web applications, yet previous template mining algorithms fail to support it. We formulate three types of light-weight user feedback and, based on them, we design three atomic human-in-the-loop template mining algorithms. We derive mild conditions under which the outputs of our proposed algorithms are provably correct. We also derive upper bounds on the computational complexity and query complexity of each algorithm. We demonstrate the versatility of our proposed algorithms by combining them to improve the template mining accuracy of five representative algorithms over sixteen widely used benchmark datasets.|模板挖掘是支持日志分析的基本任务之一,它支持大规模 Web 应用程序的诊断和故障排除。本文提出了一种支持交互式日志分析的半人工模板挖掘框架,该框架在 Web 应用程序的实际诊断和故障排除中非常有用,但以往的模板挖掘算法都不支持。我们提出了三种轻量级用户反馈,并在此基础上设计了三种原子人在环模板挖掘算法。我们推导出我们提出的算法输出可证明正确的温和条件。我们还推导了每种算法的计算复杂度和查询复杂度的上界。我们证明了我们提出的算法的通用性,通过结合他们来提高模板挖掘准确性的五个代表性算法超过16个广泛使用的基准数据集。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Interactive+Log+Parsing+via+Light-weight+User+Feedback)|0| |[Misbehavior and Account Suspension in an Online Financial Communication Platform](https://doi.org/10.1145/3543507.3583385)|Taro Tsuchiya, Alejandro Cuevas, Thomas Magelinski, Nicolas Christin|Carnegie Mellon University, USA|The expanding accessibility and appeal of investing have attracted millions of new retail investors. As such, investment discussion boards became the de facto communities where traders create, disseminate, and discuss investing ideas. These communities, which can provide useful information to support investors, have anecdotally also attracted a wide range of misbehavior – toxicity, spam/fraud, and reputation manipulation. This paper is the first comprehensive analysis of online misbehavior in the context of investment communities. We study TradingView, the largest online communication platform for financial trading. We collect 2.76M user profiles with their corresponding social graphs, 4.2M historical article posts, and 5.3M comments, including information on nearly 4 000 suspended accounts and 17 000 removed comments. Price fluctuations seem to drive abuse across the platform and certain types of assets, such as “meme” stocks, attract disproportionate misbehavior. 
Suspended user accounts tend to form more closely-knit communities than those formed by non-suspended accounts; and paying accounts are less likely to be suspended than free accounts even when posting similar levels of content violating platform policies. We conclude by offering guidelines on how to adapt content moderation efforts to fit the particularities of online investment communities.|投资的可及性和吸引力不断扩大,吸引了数以百万计的新散户投资者。因此,投资讨论委员会事实上成为了交易员创建、传播和讨论投资理念的社区。这些社区可以提供有用的信息来支持投资者,据说也吸引了大量的不良行为——毒性、垃圾邮件/欺诈和声誉操纵。本文首次全面分析了投资社区背景下的网络不良行为。我们研究了 TradingView,这是最大的金融交易在线交流平台。我们收集了276万用户的个人资料及其相应的社交图表,420万篇历史文章和530万条评论,包括近4000个被暂停的账户和17000条被删除的评论。价格波动似乎推动了整个平台的滥用,某些类型的资产,如“模因”股票,吸引了不成比例的不当行为。与非暂停用户账户相比,暂停用户账户往往形成更为紧密的社区; 即使发布了违反平台政策的类似水平的内容,付费账户被暂停的可能性也低于免费账户。最后,我们提供了关于如何调整内容审核工作以适应在线投资社区的特殊性的指导方针。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Misbehavior+and+Account+Suspension+in+an+Online+Financial+Communication+Platform)|0| -|[BiSR: Bidirectionally Optimized Super-Resolution for Mobile Video Streaming](https://doi.org/10.1145/3543507.3583519)|Qian Yu, Qing Li, Rui He, Gareth Tyson, Wanxin Shi, Jianhui Lv, Zhenhui Yuan, Peng Zhang, Yulong Lan, Zhicheng Li|International Graduate School, Tsinghua University, China; Peng Cheng Laboratory, China; SUSTech, China and Peng Cheng Laboratory, China; Hong Kong University of Science and Technology(GZ), China; Tencent, China; Northumbria University, United Kingdom|The user experience of mobile web video streaming is often impacted by insufficient and dynamic network bandwidth. In this paper, we design Bidirectionally Optimized Super-Resolution (BiSR) to improve the quality of experience (QoE) for mobile web users under limited bandwidth. BiSR exploits a deep neural network (DNN)-based model to super-resolve key frames efficiently without changing the inter-frame spatial-temporal information. We then propose a downscaling DNN and a mobile-specific optimized lightweight super-resolution DNN to enhance the performance. Finally, a novel reinforcement learning-based adaptive bitrate (ABR) algorithm is proposed to verify the performance of BiSR on real network traces. Our evaluation, using a full system implementation, shows that BiSR saves 26% of bitrate compared to the traditional H.264 codec and improves the SSIM of video by 3.7% compared to the prior state-of-the-art. Overall, BiSR enhances the user-perceived quality of experience by up to 30.6%.|移动网络视频流的用户体验往往受到网络带宽不足和动态性的影响。本文设计了双向优化超分辨率(BiSR)算法,以提高有限带宽下移动网络用户的体验质量(QoE)。BiSR 利用基于深度神经网络(DNN)的模型,在不改变帧间时空信息的情况下,有效地对关键帧进行超分辨。然后,我们提出了一个缩放 DNN 和一个移动专用的优化轻量级超分辨率 DNN,以提高性能。最后,提出了一种新的基于强化学习的自适应比特率(ABR)算法来验证 BiSR 在实际网络跟踪中的性能。我们的评估,使用一个完整的系统实现,表明 BiSR 节省26% 的比特率相比,传统的 H.264编解码器和提高了3.7% 的 SSIM 的视频相比,以前的最先进的国家。总的来说,BiSR 提高了30.6% 的用户感知体验质量。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=BiSR:+Bidirectionally+Optimized+Super-Resolution+for+Mobile+Video+Streaming)|0| +|[BiSR: Bidirectionally Optimized Super-Resolution for Mobile Video Streaming](https://doi.org/10.1145/3543507.3583519)|Qian Yu, Qing Li, Rui He, Gareth Tyson, Wanxin Shi, Jianhui Lv, Zhenhui Yuan, Peng Zhang, Yulong Lan, Zhicheng Li|Tencent, China; SUSTech, China and Peng Cheng Laboratory, China; Peng Cheng Laboratory, China; Hong Kong University of Science and Technology(GZ), China; Northumbria University, United Kingdom; International Graduate School, Tsinghua University, China|The user experience of mobile web video streaming is often impacted by insufficient and dynamic network bandwidth. 
In this paper, we design Bidirectionally Optimized Super-Resolution (BiSR) to improve the quality of experience (QoE) for mobile web users under limited bandwidth. BiSR exploits a deep neural network (DNN)-based model to super-resolve key frames efficiently without changing the inter-frame spatial-temporal information. We then propose a downscaling DNN and a mobile-specific optimized lightweight super-resolution DNN to enhance the performance. Finally, a novel reinforcement learning-based adaptive bitrate (ABR) algorithm is proposed to verify the performance of BiSR on real network traces. Our evaluation, using a full system implementation, shows that BiSR saves 26% of bitrate compared to the traditional H.264 codec and improves the SSIM of video by 3.7% compared to the prior state-of-the-art. Overall, BiSR enhances the user-perceived quality of experience by up to 30.6%.|移动网络视频流的用户体验往往受到网络带宽不足和动态性的影响。本文设计了双向优化超分辨率(BiSR)算法,以提高有限带宽下移动网络用户的体验质量(QoE)。BiSR 利用基于深度神经网络(DNN)的模型,在不改变帧间时空信息的情况下,有效地对关键帧进行超分辨。然后,我们提出了一个缩放 DNN 和一个移动专用的优化轻量级超分辨率 DNN,以提高性能。最后,提出了一种新的基于强化学习的自适应比特率(ABR)算法来验证 BiSR 在实际网络跟踪中的性能。我们的评估,使用一个完整的系统实现,表明 BiSR 节省26% 的比特率相比,传统的 H.264编解码器和提高了3.7% 的 SSIM 的视频相比,以前的最先进的国家。总的来说,BiSR 提高了30.6% 的用户感知体验质量。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=BiSR:+Bidirectionally+Optimized+Super-Resolution+for+Mobile+Video+Streaming)|0| |[Autobidding Auctions in the Presence of User Costs](https://doi.org/10.1145/3543507.3583234)|Yuan Deng, Jieming Mao, Vahab Mirrokni, Hanrui Zhang, Song Zuo|Google Research, USA; Carnegie Mellon University, USA|We study autobidding ad auctions with user costs, where each bidder is value-maximizing subject to a return-over-investment (ROI) constraint, and the seller aims to maximize the social welfare taking into consideration the user's cost of viewing an ad. We show that in the worst case, the approximation ratio of social welfare by running the vanilla VCG auctions with user costs could be as bad as 0. To improve the performance of VCG, we propose a new variant of VCG based on properly chosen cost multipliers, and prove that there exist auction-dependent and bidder-dependent cost multipliers that guarantee approximation ratios of 1/2 and 1/4 respectively in terms of the social welfare.|本文研究了具有用户成本的自动竞价广告拍卖,其中每个竞价者的价值最大化受到投资回报率(ROI)的约束,卖方的目标是最大化社会福利,同时考虑用户观看广告的成本。我们指出,在最坏的情况下,通过运行普通的 VCG 拍卖与用户成本的社会福利的近似比率可能为0。为了提高 VCG 的性能,我们提出了一种新的基于合理选择成本乘数的 VCG 变体,并证明了存在拍卖相关成本乘数和投标人相关成本乘数,它们分别保证在社会福利方面的近似比为1/2和1/4。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Autobidding+Auctions+in+the+Presence+of+User+Costs)|0| -|[Online Bidding Algorithms for Return-on-Spend Constrained Advertisers✱](https://doi.org/10.1145/3543507.3583491)|Zhe Feng, Swati Padmanabhan, Di Wang|University of Washington, Seattle, USA; Google Research, USA|Online advertising has recently grown into a highly competitive and complex multi-billion-dollar industry, with advertisers bidding for ad slots at large scales and high frequencies. This has resulted in a growing need for efficient "auto-bidding" algorithms that determine the bids for incoming queries to maximize advertisers' targets subject to their specified constraints. This work explores efficient online algorithms for a single value-maximizing advertiser under an increasingly popular constraint: Return-on-Spend (RoS). We quantify efficiency in terms of regret relative to the optimal algorithm, which knows all queries a priori. 
We contribute a simple online algorithm that achieves near-optimal regret in expectation while always respecting the specified RoS constraint when the input sequence of queries are i.i.d. samples from some distribution. We also integrate our results with the previous work of Balseiro, Lu, and Mirrokni [BLM20] to achieve near-optimal regret while respecting both RoS and fixed budget constraints. Our algorithm follows the primal-dual framework and uses online mirror descent (OMD) for the dual updates. However, we need to use a non-canonical setup of OMD, and therefore the classic low-regret guarantee of OMD, which is for the adversarial setting in online learning, no longer holds. Nonetheless, in our case and more generally where low-regret dynamics are applied in algorithm design, the gradients encountered by OMD can be far from adversarial but influenced by our algorithmic choices. We exploit this key insight to show our OMD setup achieves low regret in the realm of our algorithm.|最近,在线广告业已发展成为一个竞争激烈、规模高达数十亿美元的复杂行业,广告客户大规模、高频率地竞标广告位置。这就导致了对有效的“自动竞价”算法的需求日益增长,这种算法可以确定收到的查询的出价,从而最大限度地提高广告商的目标,使其受到特定的约束。这项工作探讨了一个单一的价值最大化的广告客户在一个日益流行的约束下有效的在线算法: 支出回报(RoS)。我们量化效率的遗憾相对于优化算法,它知道所有查询的先验。我们提出了一个简单的在线算法,当查询的输入序列是来自某个分布的标识样本时,该算法在期望中达到接近最优的遗憾,同时始终遵守指定的 RoS 约束。我们还将我们的研究结果与 Balseiro、 Lu 和 Mirrokni [ BLM20]的前期工作结合起来,以在尊重 RoS 和固定预算约束的情况下实现近乎最佳的遗憾。我们的算法遵循原始-对偶框架,并使用在线镜像下降(OMD)的双重更新。然而,我们需要使用一个非规范的 OMD 设置,因此 OMD 的经典的低后悔保证,这是在线学习的对抗设置,不再成立。尽管如此,在我们的案例中,以及更一般的低后悔动力学应用于算法设计的情况下,OMD 遇到的梯度可能远不是对手,而是受到我们的算法选择的影响。我们利用这个关键的洞察力来展示我们的 OMD 设置在我们的算法领域实现了低遗憾。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Online+Bidding+Algorithms+for+Return-on-Spend+Constrained+Advertisers✱)|0| -|[EDNet: Attention-Based Multimodal Representation for Classification of Twitter Users Related to Eating Disorders](https://doi.org/10.1145/3543507.3583863)|Mohammad Abuhassan, Tarique Anwar, Chengfei Liu, Hannah K. Jarman, Matthew FullerTyszkiewicz|Deakin University, Australia; University of York, United Kingdom; Swinburne University of Technology, Australia|Social media platforms provide rich data sources in several domains. In mental health, individuals experiencing an Eating Disorder (ED) are often hesitant to seek help through conventional healthcare services. However, many people seek help with diet and body image issues on social media. To better distinguish at-risk users who may need help for an ED from those who are simply commenting on ED in social environments, highly sophisticated approaches are required. Assessment of ED risks in such a situation can be done in various ways, and each has its own strengths and weaknesses. Hence, there is a need for and potential benefit of a more complex multimodal approach. To this end, we collect historical tweets, user biographies, and online behaviours of relevant users from Twitter, and generate a reasonably large labelled benchmark dataset. Thereafter, we develop an advanced multimodal deep learning model called EDNet using these data to identify the different types of users with ED engagement (e.g., potential ED sufferers, healthcare professionals, or communicators) and distinguish them from those not experiencing EDs on Twitter. EDNet consists of five deep neural network layers. With the help of its embedding, representation and behaviour modeling layers, it effectively learns the multimodalities of social media. In our experiments, EDNet consistently outperforms all the baseline techniques by significant margins. 
It achieves an accuracy of up to 94.32% and F1 score of up to 93.91% F1 score. To the best of our knowledge, this is the first such study to propose a multimodal approach for user-level classification according to their engagement with ED content on social media.|社交媒体平台在多个领域提供丰富的数据源。在心理健康方面,患有饮食失调(ED)的个体往往不愿意通过传统的医疗服务寻求帮助。然而,许多人在社交媒体上寻求关于饮食和身体形象问题的帮助。为了更好地区分那些可能需要急诊帮助的高危患者和那些只是在社会环境中评论急诊的患者,需要高度复杂的方法。在这种情况下,评估教育署的风险有多种方法,每种方法各有优缺点。因此,需要一种更复杂的多模式方法,而且这种方法还有潜在的好处。为此,我们从 Twitter 收集历史推文、用户简历和相关用户的在线行为,并生成一个相当大的带标签的基准数据集。此后,我们开发了一种称为 EDNet 的高级多模式深度学习模型,使用这些数据来确定不同类型的 ED 参与用户(例如,潜在的 ED 患者,医疗保健专业人员或沟通者) ,并将他们与 Twitter 上没有经历 ED 的人区分开来。EDNet 由五个深层神经网络层组成。借助其嵌入层、表征层和行为建模层,它有效地学习了社会媒体的多重形态。在我们的实验中,EDNet 始终以显著的优势优于所有的基线技术。该算法的正确率达到94.32% ,F1得分达到93.91% 。据我们所知,这是第一个这样的研究,提出了一个多模式的方法,用户级别的分类,根据他们的参与,教育署的内容在社交媒体上。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=EDNet:+Attention-Based+Multimodal+Representation+for+Classification+of+Twitter+Users+Related+to+Eating+Disorders)|0| +|[Online Bidding Algorithms for Return-on-Spend Constrained Advertisers✱](https://doi.org/10.1145/3543507.3583491)|Zhe Feng, Swati Padmanabhan, Di Wang|Google Research, USA; University of Washington, Seattle, USA|Online advertising has recently grown into a highly competitive and complex multi-billion-dollar industry, with advertisers bidding for ad slots at large scales and high frequencies. This has resulted in a growing need for efficient "auto-bidding" algorithms that determine the bids for incoming queries to maximize advertisers' targets subject to their specified constraints. This work explores efficient online algorithms for a single value-maximizing advertiser under an increasingly popular constraint: Return-on-Spend (RoS). We quantify efficiency in terms of regret relative to the optimal algorithm, which knows all queries a priori. We contribute a simple online algorithm that achieves near-optimal regret in expectation while always respecting the specified RoS constraint when the input sequence of queries are i.i.d. samples from some distribution. We also integrate our results with the previous work of Balseiro, Lu, and Mirrokni [BLM20] to achieve near-optimal regret while respecting both RoS and fixed budget constraints. Our algorithm follows the primal-dual framework and uses online mirror descent (OMD) for the dual updates. However, we need to use a non-canonical setup of OMD, and therefore the classic low-regret guarantee of OMD, which is for the adversarial setting in online learning, no longer holds. Nonetheless, in our case and more generally where low-regret dynamics are applied in algorithm design, the gradients encountered by OMD can be far from adversarial but influenced by our algorithmic choices. 
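
To make the primal-dual idea in the Return-on-Spend row above concrete, here is a stylized simulation: a dual variable for the RoS constraint shades the bid, and an entropic online-mirror-descent step updates it from the realized slack. The bid formula, the mirror map, and all constants are our own illustrative assumptions, not the paper's exact algorithm.

```python
import math
import random

# Stylized RoS-constrained bidder: maximize acquired value subject to
# total value >= total spend, with OMD-style multiplicative dual updates.

def run(T=10_000, eta=0.05, seed=0):
    rng = random.Random(seed)
    lam = 1.0                              # dual variable of the RoS constraint
    value_sum = spend_sum = 0.0
    for _ in range(T):
        v = rng.uniform(0.0, 1.0)          # advertiser's value for the query
        price = rng.uniform(0.0, 1.0)      # second-price threshold to beat
        bid = v * (1.0 + lam) / lam        # more aggressive when lam is small
        slack = 0.0
        if bid >= price:                   # win and pay the threshold price
            value_sum += v
            spend_sum += price
            slack = v - price              # realized per-query RoS slack
        # entropic mirror-descent step: positive slack relaxes the dual
        lam = min(max(lam * math.exp(-eta * slack), 1e-3), 1e3)
    print(f"value={value_sum:.1f} spend={spend_sum:.1f} lam={lam:.3f}")

run()
```
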
We exploit this key insight to show our OMD setup achieves low regret in the realm of our algorithm.|最近,在线广告业已发展成为一个竞争激烈、规模高达数十亿美元的复杂行业,广告客户大规模、高频率地竞标广告位置。这就导致了对有效的“自动竞价”算法的需求日益增长,这种算法可以确定收到的查询的出价,从而最大限度地提高广告商的目标,使其受到特定的约束。这项工作探讨了一个单一的价值最大化的广告客户在一个日益流行的约束下有效的在线算法: 支出回报(RoS)。我们量化效率的遗憾相对于优化算法,它知道所有查询的先验。我们提出了一个简单的在线算法,当查询的输入序列是来自某个分布的标识样本时,该算法在期望中达到接近最优的遗憾,同时始终遵守指定的 RoS 约束。我们还将我们的研究结果与 Balseiro、 Lu 和 Mirrokni [ BLM20]的前期工作结合起来,以在尊重 RoS 和固定预算约束的情况下实现近乎最佳的遗憾。我们的算法遵循原始-对偶框架,并使用在线镜像下降(OMD)的双重更新。然而,我们需要使用一个非规范的 OMD 设置,因此 OMD 的经典的低后悔保证,这是在线学习的对抗设置,不再成立。尽管如此,在我们的案例中,以及更一般的低后悔动力学应用于算法设计的情况下,OMD 遇到的梯度可能远不是对手,而是受到我们的算法选择的影响。我们利用这个关键的洞察力来展示我们的 OMD 设置在我们的算法领域实现了低遗憾。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Online+Bidding+Algorithms+for+Return-on-Spend+Constrained+Advertisers✱)|0| +|[EDNet: Attention-Based Multimodal Representation for Classification of Twitter Users Related to Eating Disorders](https://doi.org/10.1145/3543873.3583863)|Mohammad Abuhassan, Tarique Anwar, Chengfei Liu, Hannah K. Jarman, Matthew FullerTyszkiewicz|Deakin University, Australia; Swinburne University of Technology, Australia; University of York, United Kingdom|Social media platforms provide rich data sources in several domains. In mental health, individuals experiencing an Eating Disorder (ED) are often hesitant to seek help through conventional healthcare services. However, many people seek help with diet and body image issues on social media. To better distinguish at-risk users who may need help for an ED from those who are simply commenting on ED in social environments, highly sophisticated approaches are required. Assessment of ED risks in such a situation can be done in various ways, and each has its own strengths and weaknesses. Hence, there is a need for and potential benefit of a more complex multimodal approach. To this end, we collect historical tweets, user biographies, and online behaviours of relevant users from Twitter, and generate a reasonably large labelled benchmark dataset. Thereafter, we develop an advanced multimodal deep learning model called EDNet using these data to identify the different types of users with ED engagement (e.g., potential ED sufferers, healthcare professionals, or communicators) and distinguish them from those not experiencing EDs on Twitter. EDNet consists of five deep neural network layers. With the help of its embedding, representation and behaviour modeling layers, it effectively learns the multimodalities of social media. In our experiments, EDNet consistently outperforms all the baseline techniques by significant margins. It achieves an accuracy of up to 94.32% and an F1 score of up to 93.91%.
To the best of our knowledge, this is the first such study to propose a multimodal approach for user-level classification according to their engagement with ED content on social media.|社交媒体平台在多个领域提供丰富的数据源。在心理健康方面,患有饮食失调(ED)的个体往往不愿意通过传统的医疗服务寻求帮助。然而,许多人在社交媒体上寻求关于饮食和身体形象问题的帮助。为了更好地区分那些可能需要急诊帮助的高危患者和那些只是在社会环境中评论急诊的患者,需要高度复杂的方法。在这种情况下,评估教育署的风险有多种方法,每种方法各有优缺点。因此,需要一种更复杂的多模式方法,而且这种方法还有潜在的好处。为此,我们从 Twitter 收集历史推文、用户简历和相关用户的在线行为,并生成一个相当大的带标签的基准数据集。此后,我们开发了一种称为 EDNet 的高级多模式深度学习模型,使用这些数据来确定不同类型的 ED 参与用户(例如,潜在的 ED 患者,医疗保健专业人员或沟通者) ,并将他们与 Twitter 上没有经历 ED 的人区分开来。EDNet 由五个深层神经网络层组成。借助其嵌入层、表征层和行为建模层,它有效地学习了社会媒体的多重形态。在我们的实验中,EDNet 始终以显著的优势优于所有的基线技术。该算法的正确率达到94.32% ,F1得分达到93.91% 。据我们所知,这是第一个这样的研究,提出了一个多模式的方法,用户级别的分类,根据他们的参与,教育署的内容在社交媒体上。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=EDNet:+Attention-Based+Multimodal+Representation+for+Classification+of+Twitter+Users+Related+to+Eating+Disorders)|0| |[C-Affinity: A Novel Similarity Measure for Effective Data Clustering](https://doi.org/10.1145/3543873.3587307)|Jiwon Hong, SangWook Kim|Hanyang University, Republic of Korea|Clustering is widely employed in various applications as it is one of the most useful data mining techniques. In performing clustering, a similarity measure, which defines how similar a pair of data objects are, plays an important role. A similarity measure is employed by considering a target dataset’s characteristics. Current similarity measures (or distances) do not reflect the distribution of data objects in a dataset at all. From the clustering point of view, this fact may limit the clustering accuracy. In this paper, we propose c-affinity, a new notion of a similarity measure that reflects the distribution of objects in the given dataset from a clustering point of view. We design c-affinity between any two objects to have a higher value as they are more likely to belong to the same cluster by learning the data distribution. We use random walk with restart (RWR) on the k-nearest neighbor graph of the given dataset to measure (1) how similar a pair of objects are and (2) how densely other objects are distributed between them. Via extensive experiments on sixteen synthetic and real-world datasets, we verify that replacing the existing similarity measure with our c-affinity improves the clustering accuracy significantly.|聚类作为最有用的数据挖掘技术之一,被广泛应用于各种应用程序中。在进行聚类时,一个相似性度量定义了一对数据对象的相似程度,它扮演着重要的角色。通过考虑目标数据集的特征,采用相似性度量。当前的相似性度量(或距离)根本不反映数据集中数据对象的分布。从聚类的角度来看,这一事实可能会限制聚类的准确性。在这篇文章中,我们提出了一个新的概念,即从聚类的角度来反映给定数据集中对象的分布的相似性度量。通过学习数据分布,我们设计任意两个对象之间的 c 亲和关系,使其具有更高的值,因为它们更可能属于同一个集群。我们在给定数据集的 k 最近邻图上使用重启随机游动(RWR)来测量(1)一对对象有多相似,(2)其他对象在它们之间的分布有多密集。通过在16个合成和真实数据集上的大量实验,我们验证了用 c 亲和度取代现有的相似度度量可以显著提高聚类的准确性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=C-Affinity:+A+Novel+Similarity+Measure+for+Effective+Data+Clustering)|0| |[Knowledge Distillation on Cross-Modal Adversarial Reprogramming for Data-Limited Attribute Inference](https://doi.org/10.1145/3543873.3587313)|Quan Li, Lingwei Chen, Shixiong Jing, Dinghao Wu|Pennsylvania State University, USA; Wright State University, USA|Social media generates a rich source of text data with intrinsic user attributes (e.g., age, gender), where different parties benefit from disclosing them. Attribute inference can be cast as a text classification problem, which, however, suffers from labeled data scarcity. To address this challenge, we propose a data-limited learning model to distill knowledge on adversarial reprogramming of a visual transformer (ViT) for attribute inferences. 
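
The c-affinity row above lends itself to a compact sketch: build a k-NN graph over the data, run random walk with restart (RWR) from every object, and read off a symmetric affinity. The choices of k, restart probability, and symmetrization below are ours, so this is a plausible reconstruction of the measure rather than the authors' implementation.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def rwr_affinity(X, k=10, restart=0.15, iters=50):
    """Affinity in the spirit of c-affinity: RWR scores on a k-NN graph."""
    n = len(X)
    _, idx = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
    A = np.zeros((n, n))
    for i, row in enumerate(idx):
        A[i, row[1:]] = 1.0                 # column 0 is the point itself
    A = np.maximum(A, A.T)                  # symmetrize the k-NN graph
    P = A / A.sum(axis=1, keepdims=True)    # row-stochastic transition matrix
    R = np.eye(n)                           # column j: walk restarting at j
    for _ in range(iters):
        R = restart * np.eye(n) + (1 - restart) * P.T @ R
    return (R + R.T) / 2                    # symmetric pairwise affinity

X = np.random.RandomState(0).randn(100, 2)
S = rwr_affinity(X)
print(S.shape)   # (100, 100); 1 - S can then serve as a clustering distance
```
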
Not only does this novel cross-modal model transfer the powerful learning capability from ViT, but it also leverages unlabeled texts to reduce the demand on labeled data. Experiments on social media datasets demonstrate the state-of-the-art performance of our model on data-limited attribute inferences.|社交媒体产生了丰富的具有内在用户属性(例如,年龄,性别)的文本数据来源,不同的方面从披露这些数据中获益。属性推理可以被看作是一个文本分类问题,但是,这个问题存在标记数据稀缺性。为了解决这个问题,我们提出了一个数据有限的学习模型,以提取知识的对抗性重编程的可视化转换器(ViT)的属性推理。这种新颖的跨模式模型不仅转移了 ViT 强大的学习能力,而且利用未标记的文本来减少对标记数据的需求。在社会媒体数据集上的实验证明了我们的模型在数据有限属性推理上的最新性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Knowledge+Distillation+on+Cross-Modal+Adversarial+Reprogramming+for+Data-Limited+Attribute+Inference)|0| |[Copyright Protection and Accountability of Generative AI: Attack, Watermarking and Attribution](https://doi.org/10.1145/3543873.3587321)|Haonan Zhong, Jiamin Chang, Ziyue Yang, Tingmin Wu, Pathum Chamikara Mahawaga Arachchige, Chehara Pathmabandu, Minhui Xue||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Copyright+Protection+and+Accountability+of+Generative+AI:+Attack,+Watermarking+and+Attribution)|0| |[How Streaming Can Improve the World (Wide Web)](https://doi.org/10.1145/3543873.3587332)|Lucas Vogel, Thomas Springer|TU Dresden, Germany|Since its beginnings, web pages have been based on files. This means that HTML, CSS, and JavaScript are transferred from server to client as files, which by default need to be fully loaded before the web page is displayed. This render-blocking procedure increases loading times significantly, leading to reduced user satisfaction and revenue loss due to lower conversion rates. We present a full implementation of a new approach for loading web pages by splitting up every component and loading the page via a text-based stream. Such a modification aligns with current trends of the HTTP protocol, which has been using streams internally since HTTP/2. It significantly improves loading times, independent of the total page size.|从一开始,网页就是基于文件的。这意味着 HTML、 CSS 和 JavaScript 作为文件从服务器传输到客户端,默认情况下需要在网页显示之前完全加载这些文件。这种渲染阻塞过程大大增加了加载时间,导致用户满意度降低和收入损失,由于较低的转换率。我们提出了一个全新的加载网页的方法,通过拆分每个组件,并通过一个基于文本的流加载网页的完整实现。这种修改符合 HTTP 协议的当前趋势,自 HTTP/2以来,HTTP 协议一直在内部使用流。它显著提高了加载时间,与总页面大小无关。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=How+Streaming+Can+Improve+the+World+(Wide+Web))|0| |[PyPoll: A python library automating mining of networks, discussions and polarization on Twitter](https://doi.org/10.1145/3543873.3587349)|Dimitrios Panteleimon Giakatos, Pavlos Sermpezis, Athena Vakali|Aristotle University of Thessaloniki, Greece|Today online social networks have a high impact in our society as more and more people use them for communicating with each other, expressing their opinions, participating in public discussions, etc. In particular, Twitter is one of the most popular social network platforms people mainly use for political discussions. This attracted the interest of many research studies that analyzed social phenomena on Twitter, by collecting data, analysing communication patterns, and exploring the structure of user networks. While previous works share many common methodologies for data collection and analysis, these are mainly re-implemented every time by researchers in a custom way. In this paper, we introduce PyPoll, an open-source Python library that operationalizes common analysis tasks for Twitter discussions. With PyPoll users can perform Twitter graph mining, calculate the polarization index and generate interactive visualizations without needing third-party tools.
We believe that PyPoll can help researchers automate their tasks by giving them methods that are easy to use. Also, we demonstrate the use of the library by presenting two use cases; the PyPoll visualization app, an online application for graph visualizing and sharing, and the Political Lighthouse, a Web portal for displaying the polarization in various political topics on Twitter.|今天,在线社交网络对我们的社会产生了很大的影响,因为越来越多的人使用它们进行交流、表达意见、参与公共讨论等等。特别值得一提的是,Twitter 是人们主要用于政治讨论的最流行的社交网络平台之一。这引起了许多研究的兴趣,这些研究通过收集数据、分析交流模式和探索用户网络的结构来分析 Twitter 上的社会现象。虽然以前的工作共享许多共同的方法收集和分析数据,这些主要是重新实现每次由研究人员在一个自定义的方式。在本文中,我们介绍了 PyPoll,它是一个开放源码的 Python 库,可以为 Twitter 讨论操作常见的分析任务。使用 PyPoll,用户可以执行 Twitter 图形挖掘、计算极化指数和生成交互式可视化,而不需要第三方工具。我们相信 PyPoll 可以帮助研究人员通过提供易于使用的方法来自动化他们的任务。此外,我们通过展示两个用例来演示该库的使用: PyPoll 可视化应用程序,一个用于图形可视化和共享的在线应用程序,以及 Political Lighthouse,一个用于在 Twitter 上显示各种政治主题的两极分化的门户网站。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=PyPoll:+A+python+library+automating+mining+of+networks,+discussions+and+polarization+on+Twitter)|0| |[WebSHAP: Towards Explaining Any Machine Learning Models Anywhere](https://doi.org/10.1145/3543873.3587362)|Zijie J. Wang, Duen Horng Chau|Georgia Institute of Technology, USA|As machine learning (ML) is increasingly integrated into our everyday Web experience, there is a call for transparent and explainable web-based ML. However, existing explainability techniques often require dedicated backend servers, which limit their usefulness as the Web community moves toward in-browser ML for lower latency and greater privacy. To address the pressing need for a client-side explainability solution, we present WebSHAP, the first in-browser tool that adapts the state-of-the-art model-agnostic explainability technique SHAP to the Web environment. Our open-source tool is developed with modern Web technologies such as WebGL that leverage client-side hardware capabilities and make it easy to integrate into existing Web ML applications. We demonstrate WebSHAP in a usage scenario of explaining ML-based loan approval decisions to loan applicants. Reflecting on our work, we discuss the opportunities and challenges for future research on transparent Web ML. WebSHAP is available at https://github.com/poloclub/webshap.|随着机器学习(ML)越来越多地融入我们的日常网络经验,有一个透明的和可解释的基于网络的 ML 的呼吁。然而,现有的可解释性技术通常需要专用的后端服务器,这限制了它们的有用性,因为 Web 社区正在朝着浏览器内机器学习的方向发展,以获得更低的延迟和更大的隐私。为了满足对客户端可解释性解决方案的迫切需求,我们提出了 WebSHAP,这是第一个在浏览器中使最先进的模型无关可解释性技术 SHAP 适用于 Web 环境的工具。我们的开源工具是使用现代 Web 技术(如 WebGL)开发的,这些技术利用了客户端硬件功能,并使其易于集成到现有的 Web ML 应用程序中。我们在向贷款申请者解释基于 ML 的贷款批准决策的使用场景中演示了 WebSHAP。回顾我们的工作,我们讨论了未来研究透明 Web 机器学习的机遇和挑战。「网上民政事务及安全资讯 https://github.com/poloclub/WebSHAP 」已上载至。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=WebSHAP:+Towards+Explaining+Any+Machine+Learning+Models+Anywhere)|0| -|[Privacy-Preserving Online Content Moderation: A Federated Learning Use Case](https://doi.org/10.1145/3543873.3587604)|Pantelitsa Leonidou, Nicolas Kourtellis, Nikos Salamanos, Michael Sirivianos|Cyprus University of Technology, Cyprus; Telefonica Research, Spain|Users are daily exposed to a large volume of harmful content on various social network platforms. One solution is developing online moderation tools using Machine Learning techniques. However, the processing of user data by online platforms requires compliance with privacy policies. Federated Learning (FL) is an ML paradigm where the training is performed locally on the users' devices. Although the FL framework complies, in theory, with the GDPR policies, privacy leaks can still occur. 
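
WebSHAP (in the row above) ports Kernel SHAP to the browser; the snippet below shows the same model-agnostic attribution technique in Python via the `shap` package, which may help readers map the idea back to familiar tooling. The dataset and model are arbitrary stand-ins.

```python
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression

# Kernel SHAP: model-agnostic per-feature attributions for one prediction.
X, y = load_breast_cancer(return_X_y=True)
model = LogisticRegression(max_iter=5000).fit(X, y)

background = shap.sample(X, 50)     # background data for the expectation
explainer = shap.KernelExplainer(lambda Z: model.predict_proba(Z)[:, 1],
                                 background)
phi = explainer.shap_values(X[:1])  # contributions of each input feature
print(phi.shape)
```
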
For instance, an attacker accessing the final trained model can successfully perform unwanted inference of the data belonging to the users who participated in the training process. In this paper, we propose a privacy-preserving FL framework for online content moderation that incorporates Differential Privacy (DP). To demonstrate the feasibility of our approach, we focus on detecting harmful content on Twitter - but the overall concept can be generalized to other types of misbehavior. We simulate a text classifier - in FL fashion - which can detect tweets with harmful content. We show that the performance of the proposed FL framework can be close to the centralized approach - for both the DP and non-DP FL versions. Moreover, it has a high performance even if a small number of clients (each with a small number of data points) are available for the FL training. When reducing the number of clients (from 50 to 10) or the data points per client (from 1K to 0.1K), the classifier can still achieve ~81% AUC. Furthermore, we extend the evaluation to four other Twitter datasets that capture different types of user misbehavior and still obtain a promising performance (61% - 80% AUC). Finally, we explore the overhead on the users' devices during the FL training phase and show that the local training does not introduce excessive CPU utilization and memory consumption overhead.|用户每天都会在各种社交网络平台上接触到大量的有害内容。一个解决方案是使用机器学习技术开发在线审核工具。然而,通过在线平台处理用户数据需要遵守隐私策略。联邦学习(FL)是一种机器学习范式,其中的培训是在用户的设备上本地执行的。尽管 FL 框架在理论上符合 GDPR 策略,但仍然可能发生隐私泄露。例如,访问最终训练模型的攻击者可以成功地对属于参与训练过程的用户的数据进行不必要的推断。在这篇文章中,我们提出了一个保护隐私的在线内容审查框架,该框架结合了差分隐私(DP)。为了证明我们方法的可行性,我们将重点放在检测 Twitter 上的有害内容上——但总体概念可以推广到其他类型的不当行为。我们模拟了一个文本分类器——以 FL 的方式——它可以检测带有有害内容的 tweet。我们展示了所提出的 FL 框架的性能可以接近于集中式方法-对于 DP 和非 DP FL 版本都是如此。此外,即使只有少量的客户端(每个客户端都有少量的数据点)可用于 FL 培训,它也具有很高的性能。当减少客户端数量(从50到10)或每个客户端的数据点数量(从1K 到0.1 K)时,分类器仍然可以达到约81% 的 AUC。此外,我们将评估扩展到其他四个 Twitter 数据集,这些数据集捕获了不同类型的用户不当行为,并且仍然获得了有希望的性能(61% -80% AUC)。最后,我们研究了 FL 训练阶段用户设备上的开销,结果表明本地训练不会引入过多的 CPU 利用率和内存消耗开销。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Privacy-Preserving+Online+Content+Moderation:+A+Federated+Learning+Use+Case)|0| -|[Intent-based Web Page Summarization with Structure-Aware Chunking and Generative Language Models](https://doi.org/10.1145/3543873.3587372)|HuanYuan Chen, Hong Yu|School of Computer and Information Sciences, University of Massachusetts Lowell, USA; College of Information and Computer Sciences, University of Massachusetts Amherst, USA|This paper introduces a structure-aware method to segment web pages into chunks based on their web structures. We utilize large language models to select chunks correspond to a given intent and generate the abstractive summary. Experiments on a food pantry dataset developed for mitigating food insecurity show that the proposed framework is promising.|介绍了一种基于结构感知的网页分块方法。我们利用大型语言模型来选择与给定意图相对应的块,并生成抽象的摘要。为减轻粮食不安全而开发的食品储藏室数据集的实验表明,所提出的框架是有希望的。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Intent-based+Web+Page+Summarization+with+Structure-Aware+Chunking+and+Generative+Language+Models)|0| +|[Privacy-Preserving Online Content Moderation: A Federated Learning Use Case](https://doi.org/10.1145/3543873.3587604)|Pantelitsa Leonidou, Nicolas Kourtellis, Nikos Salamanos, Michael Sirivianos|Telefonica Research, Spain; Cyprus University of Technology, Cyprus|Users are daily exposed to a large volume of harmful content on various social network platforms. One solution is developing online moderation tools using Machine Learning techniques. 
However, the processing of user data by online platforms requires compliance with privacy policies. Federated Learning (FL) is an ML paradigm where the training is performed locally on the users' devices. Although the FL framework complies, in theory, with the GDPR policies, privacy leaks can still occur. For instance, an attacker accessing the final trained model can successfully perform unwanted inference of the data belonging to the users who participated in the training process. In this paper, we propose a privacy-preserving FL framework for online content moderation that incorporates Differential Privacy (DP). To demonstrate the feasibility of our approach, we focus on detecting harmful content on Twitter - but the overall concept can be generalized to other types of misbehavior. We simulate a text classifier - in FL fashion - which can detect tweets with harmful content. We show that the performance of the proposed FL framework can be close to the centralized approach - for both the DP and non-DP FL versions. Moreover, it has a high performance even if a small number of clients (each with a small number of data points) are available for the FL training. When reducing the number of clients (from 50 to 10) or the data points per client (from 1K to 0.1K), the classifier can still achieve ~81% AUC. Furthermore, we extend the evaluation to four other Twitter datasets that capture different types of user misbehavior and still obtain a promising performance (61% - 80% AUC). Finally, we explore the overhead on the users' devices during the FL training phase and show that the local training does not introduce excessive CPU utilization and memory consumption overhead.|用户每天都会在各种社交网络平台上接触到大量的有害内容。一个解决方案是使用机器学习技术开发在线审核工具。然而,通过在线平台处理用户数据需要遵守隐私策略。联邦学习(FL)是一种机器学习范式,其中的培训是在用户的设备上本地执行的。尽管 FL 框架在理论上符合 GDPR 策略,但仍然可能发生隐私泄露。例如,访问最终训练模型的攻击者可以成功地对属于参与训练过程的用户的数据进行不必要的推断。在这篇文章中,我们提出了一个保护隐私的在线内容审查框架,该框架结合了差分隐私(DP)。为了证明我们方法的可行性,我们将重点放在检测 Twitter 上的有害内容上——但总体概念可以推广到其他类型的不当行为。我们模拟了一个文本分类器——以 FL 的方式——它可以检测带有有害内容的 tweet。我们展示了所提出的 FL 框架的性能可以接近于集中式方法-对于 DP 和非 DP FL 版本都是如此。此外,即使只有少量的客户端(每个客户端都有少量的数据点)可用于 FL 培训,它也具有很高的性能。当减少客户端数量(从50到10)或每个客户端的数据点数量(从1K 到0.1 K)时,分类器仍然可以达到约81% 的 AUC。此外,我们将评估扩展到其他四个 Twitter 数据集,这些数据集捕获了不同类型的用户不当行为,并且仍然获得了有希望的性能(61% -80% AUC)。最后,我们研究了 FL 训练阶段用户设备上的开销,结果表明本地训练不会引入过多的 CPU 利用率和内存消耗开销。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Privacy-Preserving+Online+Content+Moderation:+A+Federated+Learning+Use+Case)|0| +|[Intent-based Web Page Summarization with Structure-Aware Chunking and Generative Language Models](https://doi.org/10.1145/3543873.3587372)|HuanYuan Chen, Hong Yu|College of Information and Computer Sciences, University of Massachusetts Amherst, USA; School of Computer and Information Sciences, University of Massachusetts Lowell, USA|This paper introduces a structure-aware method to segment web pages into chunks based on their web structures. We utilize large language models to select chunks corresponding to a given intent and generate the abstractive summary.
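
The federated-learning row above is easy to mimic at toy scale: the sketch below runs DP-flavoured FedAvg rounds in which each simulated client clips its local gradient and adds Gaussian noise before the server averages. The clip norm and noise scale are illustrative only and are not calibrated to any formal (epsilon, delta) guarantee.

```python
import numpy as np

# Toy DP-FedAvg for a linear classifier: clip each client update, add
# Gaussian noise, then average on the server. All data are synthetic.

rng = np.random.default_rng(0)
d, n_clients = 50, 10
w = np.zeros(d)                                   # global model

def local_grad(w, X, y):                          # logistic-loss gradient
    p = 1 / (1 + np.exp(-X @ w))
    return X.T @ (p - y) / len(y)

clients = [(rng.normal(size=(100, d)), rng.integers(0, 2, 100).astype(float))
           for _ in range(n_clients)]

for _ in range(20):                               # federated rounds
    updates = []
    for X, y in clients:
        g = local_grad(w, X, y)
        g = g / max(1.0, np.linalg.norm(g))       # clip to norm 1
        updates.append(g + rng.normal(scale=0.1, size=d))  # Gaussian DP noise
    w -= 0.5 * np.mean(updates, axis=0)           # server FedAvg step

print("model norm after training:", round(float(np.linalg.norm(w)), 3))
```
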
Experiments on a food pantry dataset developed for mitigating food insecurity show that the proposed framework is promising.|介绍了一种基于结构感知的网页分块方法。我们利用大型语言模型来选择与给定意图相对应的块,并生成抽象的摘要。为减轻粮食不安全而开发的食品储藏室数据集的实验表明,所提出的框架是有希望的。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Intent-based+Web+Page+Summarization+with+Structure-Aware+Chunking+and+Generative+Language+Models)|0| |[Measuring and Detecting Virality on Social Media: The Case of Twitter's Viral Tweets Topic](https://doi.org/10.1145/3543873.3587373)|Tugrulcan Elmas, Stephane Selim, Célia Houssiaux||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Measuring+and+Detecting+Virality+on+Social+Media:+The+Case+of+Twitter's+Viral+Tweets+Topic)|0| |[Anytime-Valid Confidence Sequences in an Enterprise A/B Testing Platform](https://doi.org/10.1145/3543873.3584635)|Akash Maharaj, Ritwik Sinha, David Arbour, Ian WaudbySmith, Simon Z. Liu, Moumita Sinha, Raghavendra Addanki, Aaditya Ramdas, Manas Garg, Viswanathan Swaminathan||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Anytime-Valid+Confidence+Sequences+in+an+Enterprise+A/B+Testing+Platform)|0| |[Contrastive Fine-tuning on Few Shot Intent Detection with Topological Intent Tree](https://doi.org/10.1145/3543873.3584648)|Wei Yuan, Martin Dimkovski, Aijun An|Intact Financial Corporation, Canada; Department of Electrical Engineering and Computer Science, York University, Canada|We present a few-shot intent detection model for an enterprise’s conversational dialogue system. The model uses an intent topological tree to guide the search for the user intent using large language models (LLMs). The intents are resolved based on semantic similarities between user utterances and the text descriptions of the internal nodes of the intent tree or the intent examples in the leaf nodes of the tree. Our results show that an off-the-shelf language model can work reasonably well in a large enterprise deployment without fine-tuning, and its performance can be further improved with fine-tuning as more domain-specific data becomes available. We also show that the fine-tuned language model meets and outperforms the state-of-the-art (SOTA) results in resolving conversation intents without training classifiers. With the use of a topological intent tree, our model provides more interpretability to cultivate people’s trust in their decisions.|针对企业会话对话系统,提出了一种少镜头意图检测模型。该模型使用一个意图拓扑树来指导使用大型语言模型(LLM)搜索用户意图。根据用户语句与意图树内部节点的文本描述或树叶节点中的意图示例之间的语义相似性来解析意图。我们的研究结果表明,现成的语言模型可以在不进行微调的情况下在大型企业部署中工作得相当好,而且随着更多特定于领域的数据可用,通过微调可以进一步提高其性能。我们还表明,经过微调的语言模型满足并优于最先进的(SOTA)结果,无需训练分类器就能解析会话意图。通过使用拓扑意图树,我们的模型提供了更多的可解释性来培养人们对其决策的信任。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Contrastive+Fine-tuning+on+Few+Shot+Intent+Detection+with+Topological+Intent+Tree)|0| -|[Visual Item Selection With Voice Assistants: A systems perspective](https://doi.org/10.1145/3543873.3584655)|Prashan Wanigasekara, Rafid AlHumaimidi, Turan Gojayev, Niloofar Gheissari, Achal Dave, Stephen Rawls, Fan Yang, Kechen Qin, Nalin Gupta, Spurthi Sandiri, Chevanthie Dissanayake, Zeynab Raeesy, Emre Barut, Chengwei Su|Amazon, Canada; Amazon, USA; Amazon, Germany|Interacting with voice assistants, such as Amazon Alexa to aid in day-to-day tasks has become a ubiquitous phenomenon in modern-day households. These voice assistants often have screens to provide visual content (e.g., images, videos) to their users. 
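
For the intent-based summarization row above, a minimal sketch of the structure-aware chunking step: split the page into block-level chunks, then choose the chunk matching the intent. The toy HTML is invented, and a crude lexical-overlap score stands in for the LLM-based chunk selection described in the abstract.

```python
from bs4 import BeautifulSoup

# Sketch of structure-aware chunking: block-level segmentation plus a
# relevance score per chunk (a stand-in for an LLM selection call).

html = """<html><body>
<div><h2>Hours</h2><p>Open Tuesdays 9am-1pm.</p></div>
<div><h2>Eligibility</h2><p>Bring proof of residence to receive food.</p></div>
</body></html>"""

soup = BeautifulSoup(html, "html.parser")
chunks = [div.get_text(" ", strip=True) for div in soup.find_all("div")]

intent = "when is the food pantry open"
def score(chunk):                      # crude stand-in for an LLM relevance call
    return len(set(chunk.lower().split()) & set(intent.split()))

print(max(chunks, key=score))          # selects the "Hours" chunk
```
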
There is an increasing trend of users shopping or searching for products using these devices, yet, these voice assistants do not support commands or queries that contain visual references to the content shown on screen (e.g., “blue one”, “red dress”). We introduce a novel multi-modal visual shopping experience where the voice assistant is aware of the visual content shown on the screen and assists the user in item selection using natural language multi-modal interactions. We detail a practical, lightweight end-to-end system architecture spanning from model fine-tuning, deployment, to skill invocation on an Amazon Echo family device with a screen. We also define a niche “Visual Item Selection” task and evaluate whether we can effectively leverage publicly available multi-modal models, and embeddings produced from these models for the task. We show that open source contrastive embeddings like CLIP [30] and ALBEF [24] have zero-shot accuracy above for the “Visual Item Selection” task on an internally collected visual shopping dataset. By further fine-tuning the embeddings, we obtain further gains of 8.6% to 24.0% in relative accuracy improvement over a baseline. The technology that enables our visual shopping assistant is available as an Alexa Skill in the Alexa Skills store.|与诸如亚马逊 Alexa 这样的语音助手进行交互以帮助完成日常任务,已经成为现代家庭中无处不在的现象。这些语音助手通常有屏幕为用户提供视觉内容(如图像、视频)。用户使用这些设备购物或搜索产品的趋势正在增加,然而,这些语音助手不支持包含对屏幕上显示的内容的可视化参考的命令或查询(例如,“蓝色的”、“红色的裙子”)。我们介绍了一种新颖的多模态视觉购物体验,语音助手可以感知屏幕上显示的视觉内容,并利用自然语言的多模态交互协助用户选择商品。我们详细介绍了一个实用的、轻量级的端到端系统架构,从模型微调、部署到带屏幕的 Amazon Echo 系列设备上的技能调用。我们还定义了一个小型的“可视化项目选择”任务,并评估我们是否能够有效地利用公开可用的多模态模型,以及从这些模型产生的嵌入任务。我们展示了像 CLIP [30]和 ALBEF [24]这样的开源对比嵌入对于内部收集的可视化购物数据集上的“可视化项目选择”任务具有零拍摄精度。通过对嵌入进行进一步的微调,我们获得了比基线相对准确度提高8.6% 到24.0% 的进一步增益。该技术,使我们的视觉购物助理可作为一个 Alexa 技能在 Alexa 技能商店。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Visual+Item+Selection+With+Voice+Assistants:+A+systems+perspective)|0| -|[Multi-Source Domain Adaptation via Latent Domain Reconstruction](https://doi.org/10.1145/3543873.3584659)|Jun Zhou, Chilin Fu, Xiaolu Zhang|Ant Group, China; College of Computer Science and Technology, Zhejiang University, China and Ant Group, China|Multi-Source Domain Adaptation (MSDA) is widely used in various machine learning scenarios for domain shifts between labeled source domains and unlabeled target domains. Conventional MSDA methods are built on a strong hypothesis that data samples from the same source belong to the same domain with the same latent distribution. However, in practice sources and their latent domains are not necessarily one-to-one correspondence. To tackle this problem, a novel Multi-source Reconstructed Domain Adaptation (MRDA) framework for MSDA is proposed. We use an Expectation-Maximization (EM) mechanism that iteratively reconstructs the source domains to recover the latent domains and performs domain adaptation on the reconstructed domains. Specifically, in the E-step, we cluster the samples from multiple sources into different latent domains, and a soft assignment strategy is proposed to avoid cluster imbalance. In the M-step, we freeze the latent domains clustered in the E-step and optimize the objective function for domain adaptation, and a global-specific feature extractor is used to capture both domain-invariant and domain-specific features. 
Extensive experiments demonstrate that our approach can reconstruct source domains and perform domain adaptation on the reconstructed domains effectively, thus significantly outperforming state-of-the-art (SOTA) baselines (e.g., 1% to 3.1% absolute improvement in AUC).|多源域自适应(MSDA)广泛应用于各种机器学习场景中,用于标记源域和未标记目标域之间的域移位。传统的 MSDA 方法是建立在一个强大的假设,即来自同一来源的数据样本属于同一领域,具有相同的潜在分布。然而,在实践中,来源及其潜在领域并不一定是双射。针对这一问题,提出了一种新的多源重构域自适应(MRDA)框架。我们使用一个期望最大化(EM)机制,迭代地重构源域来恢复潜在域,并对重构域执行域适应。具体来说,在 E 步中,我们将来自多个源的样本聚类到不同的潜在域中,并提出了一种软分配策略来避免聚类不平衡。在 M 步中,我们冻结了聚集在 E 步中的潜在领域,并优化了领域适应的目标函数,并且使用了全局特征提取器来捕获领域不变和领域特定的特征。广泛的实验表明,我们的方法可以重建源域并有效地对重建域进行域适应,从而显着优于最先进的(SOTA)基线(例如,AUC 的1% 至3.1% 的绝对改善)。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Multi-Source+Domain+Adaptation+via+Latent+Domain+Reconstruction)|0| +|[Visual Item Selection With Voice Assistants: A systems perspective](https://doi.org/10.1145/3543873.3584655)|Prashan Wanigasekara, Rafid AlHumaimidi, Turan Gojayev, Niloofar Gheissari, Achal Dave, Stephen Rawls, Fan Yang, Kechen Qin, Nalin Gupta, Spurthi Sandiri, Chevanthie Dissanayake, Zeynab Raeesy, Emre Barut, Chengwei Su|Amazon, Germany; Amazon, USA; Amazon, Canada|Interacting with voice assistants, such as Amazon Alexa to aid in day-to-day tasks has become a ubiquitous phenomenon in modern-day households. These voice assistants often have screens to provide visual content (e.g., images, videos) to their users. There is an increasing trend of users shopping or searching for products using these devices, yet, these voice assistants do not support commands or queries that contain visual references to the content shown on screen (e.g., “blue one”, “red dress”). We introduce a novel multi-modal visual shopping experience where the voice assistant is aware of the visual content shown on the screen and assists the user in item selection using natural language multi-modal interactions. We detail a practical, lightweight end-to-end system architecture spanning from model fine-tuning, deployment, to skill invocation on an Amazon Echo family device with a screen. We also define a niche “Visual Item Selection” task and evaluate whether we can effectively leverage publicly available multi-modal models, and embeddings produced from these models for the task. We show that open source contrastive embeddings like CLIP [30] and ALBEF [24] have zero-shot accuracy above for the “Visual Item Selection” task on an internally collected visual shopping dataset. By further fine-tuning the embeddings, we obtain further gains of 8.6% to 24.0% in relative accuracy improvement over a baseline. 
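
The visual item selection task just described can be approximated with an off-the-shelf contrastive model; below, public CLIP weights score a spoken reference against the items on screen. The image paths are hypothetical placeholders, and this is a zero-shot sketch, not the authors' fine-tuned system.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Zero-shot visual item selection: rank on-screen item images by their
# CLIP similarity to the user's spoken reference.

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

screen_items = ["item0.jpg", "item1.jpg", "item2.jpg"]   # hypothetical paths
images = [Image.open(p) for p in screen_items]
query = "the blue one"

inputs = processor(text=[query], images=images, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_text             # shape: (1, n_images)
print("selected:", screen_items[int(logits.argmax())])
```
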
The technology that enables our visual shopping assistant is available as an Alexa Skill in the Alexa Skills store.|与诸如亚马逊 Alexa 这样的语音助手进行交互以帮助完成日常任务,已经成为现代家庭中无处不在的现象。这些语音助手通常有屏幕为用户提供视觉内容(如图像、视频)。用户使用这些设备购物或搜索产品的趋势正在增加,然而,这些语音助手不支持包含对屏幕上显示的内容的可视化参考的命令或查询(例如,“蓝色的”、“红色的裙子”)。我们介绍了一种新颖的多模态视觉购物体验,语音助手可以感知屏幕上显示的视觉内容,并利用自然语言的多模态交互协助用户选择商品。我们详细介绍了一个实用的、轻量级的端到端系统架构,从模型微调、部署到带屏幕的 Amazon Echo 系列设备上的技能调用。我们还定义了一个小型的“可视化项目选择”任务,并评估我们是否能够有效地利用公开可用的多模态模型,以及从这些模型产生的嵌入任务。我们展示了像 CLIP [30]和 ALBEF [24]这样的开源对比嵌入对于内部收集的可视化购物数据集上的“可视化项目选择”任务具有零拍摄精度。通过对嵌入进行进一步的微调,我们获得了比基线相对准确度提高8.6% 到24.0% 的进一步增益。该技术,使我们的视觉购物助理可作为一个 Alexa 技能在 Alexa 技能商店。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Visual+Item+Selection+With+Voice+Assistants:+A+systems+perspective)|0| +|[Multi-Source Domain Adaptation via Latent Domain Reconstruction](https://doi.org/10.1145/3543873.3584659)|Jun Zhou, Chilin Fu, Xiaolu Zhang|College of Computer Science and Technology, Zhejiang University, China and Ant Group, China; Ant Group, China|Multi-Source Domain Adaptation (MSDA) is widely used in various machine learning scenarios for domain shifts between labeled source domains and unlabeled target domains. Conventional MSDA methods are built on a strong hypothesis that data samples from the same source belong to the same domain with the same latent distribution. However, in practice sources and their latent domains are not necessarily in one-to-one correspondence. To tackle this problem, a novel Multi-source Reconstructed Domain Adaptation (MRDA) framework for MSDA is proposed. We use an Expectation-Maximization (EM) mechanism that iteratively reconstructs the source domains to recover the latent domains and performs domain adaptation on the reconstructed domains. Specifically, in the E-step, we cluster the samples from multiple sources into different latent domains, and a soft assignment strategy is proposed to avoid cluster imbalance. In the M-step, we freeze the latent domains clustered in the E-step and optimize the objective function for domain adaptation, and a global-specific feature extractor is used to capture both domain-invariant and domain-specific features. Extensive experiments demonstrate that our approach can reconstruct source domains and perform domain adaptation on the reconstructed domains effectively, thus significantly outperforming state-of-the-art (SOTA) baselines (e.g., 1% to 3.1% absolute improvement in AUC).|多源域自适应(MSDA)广泛应用于各种机器学习场景中,用于标记源域和未标记目标域之间的域移位。传统的 MSDA 方法是建立在一个强大的假设,即来自同一来源的数据样本属于同一领域,具有相同的潜在分布。然而,在实践中,来源及其潜在领域并不一定是双射。针对这一问题,提出了一种新的多源重构域自适应(MRDA)框架。我们使用一个期望最大化(EM)机制,迭代地重构源域来恢复潜在域,并对重构域执行域适应。具体来说,在 E 步中,我们将来自多个源的样本聚类到不同的潜在域中,并提出了一种软分配策略来避免聚类不平衡。在 M 步中,我们冻结了聚集在 E 步中的潜在领域,并优化了领域适应的目标函数,并且使用了全局特征提取器来捕获领域不变和领域特定的特征。广泛的实验表明,我们的方法可以重建源域并有效地对重建域进行域适应,从而显着优于最先进的(SOTA)基线(例如,AUC 的1% 至3.1% 的绝对改善)。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Multi-Source+Domain+Adaptation+via+Latent+Domain+Reconstruction)|0| |[Human Dimensions of Animal Exploitation: Towards Understanding the International Wildlife Trade and Selfie-Tourism on Twitter](https://doi.org/10.1145/3543873.3587538)|Sean P. Rogers, Jeremiah Onaolapo|University of Vermont, USA|This study investigates statements of participation in an exploitative animal activity on social media website Twitter. The data include social posts (tweets) related to two exploited species - the sloth (N=32,119), and the elephant (N=15,160). Tweets for each of these case studies were examined and labeled.
The initial results reveal several features of interaction with exploited species. Namely, there are a high number of tweets indicating that individuals participated in exploited species activities during vacations in destinations that double as native countries for the exploited species. The data also indicate that a large number of exploited species activities take place at fairs, carnivals, and circuses. These initial results shed light on the trends in human participation in activities with exploited species. These findings will offer insight to stakeholders seeking to bolster education programs and quantify the level of animal exploitation.|本研究调查了社交媒体网站 Twitter 上参与剥削动物活动的声明。这些数据包括与两个被开发的物种——树懒(N = 32,119)和大象(N = 15,160)有关的社交帖子(tweet)。这些个案研究的推文都经过了检查和标记。初步结果揭示了与被开发物种相互作用的几个特征。也就是说,有大量推文表明,个人在假期期间参加了被捕捞物种的活动,而目的地是被捕捞物种的本土国家。数据还表明,大量被开发的物种活动发生在集市、嘉年华会和马戏团。这些初步结果说明了人类参与与被捕捞物种有关的活动的趋势。这些发现将为利益相关者提供深刻的见解,以支持教育项目和量化动物剥削的水平。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Human+Dimensions+of+Animal+Exploitation:+Towards+Understanding+the+International+Wildlife+Trade+and+Selfie-Tourism+on+Twitter)|0| |[A Bridge over the Troll: Non-Complementary Activism Online](https://doi.org/10.1145/3543873.3587541)|Emyn Dean|Knowledge Media Institute (KMi), Open University, United Kingdom|Previous research has identified phenomena such as cyberbystander intervention and various other forms of responses to aggressive or hateful behaviours online. In the online media ecosystem, some people from marginalized communities and their allies have attempted to enhance organic engagement by participating in organized activism, which is sometimes characterized as "non-complementary" or "indirect". This paper attempts to identify, recognize, and label this phenomenon, as well as provide suggestions for further research in this area.|以前的研究已经确定了一些现象,例如网络旁观者的干预和对网上攻击性或仇恨行为的各种其他形式的反应。在网络媒体生态系统中,一些来自边缘化社区及其盟友的人试图通过参与有组织的行动主义来加强有机参与,这种行动主义有时被描述为“非互补”或“间接”。本文试图对这一现象进行识别、识别和标记,并为该领域的进一步研究提供建议。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Bridge+over+the+Troll:+Non-Complementary+Activism+Online)|0| |[The DEEP Sensorium: a multidimensional approach to sensory domain labelling](https://doi.org/10.1145/3543873.3587631)|Simona Corciulo, Livio Bioglio, Valerio Basile, Viviana Patti, Rossana Damiano|Dipartimento di Informatica, University of Turin, Italy; Dipartimento di Studi Umanistici, University of Turin, Italy|In this paper, we describe our intuitions about how language technologies can contribute to create new ways to enhance the accessibility of exhibits in cultural contexts by exploiting the knowledge about the history of our senses and the link between perception and language. We evaluate the performance of five multi-class classification models for the task of sensory recognition and introduce the DEEP Sensorium (Deep Engaging Experiences and Practices - Sensorium), a multidimensional dataset that combines cognitive and affective features to inform systematic methodologies for augmenting exhibits with multi-sensory stimuli. 
For each model, using different feature sets, we show that the features expressing the affective dimension of words combined with sub-lexical features perform better than uni-dimensional training sets.|在本文中,我们描述了我们的直觉,关于语言技术如何能够有助于创造新的方式,通过利用我们的感官历史知识和感知与语言之间的联系,提高展品在文化背景下的可及性。我们评估了五个多类分类模型在感官识别任务中的表现,并介绍了 DEEP Sensorium (DEEP Engaging Experience and Practices-Sensorium) ,这是一个结合认知和情感特征的多维数据集,以通知系统方法用多感官刺激来增强展品。对于每个模型,使用不同的特征集,我们发现结合亚词汇特征表达词的情感维度的特征比一维训练集表现得更好。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=The+DEEP+Sensorium:+a+multidimensional+approach+to+sensory+domain+labelling)|0| -|[A Survey of General Ontologies for the Cross-Industry Domain of Circular Economy](https://doi.org/10.1145/3543873.3587613)|Huanyu Li, Mina Abd Nikooie Pour, Ying Li, Mikael Lindecrantz, Eva Blomqvist, Patrick Lambrix|Linköping University, Sweden and University of Gävle, Sweden; Ragn-Sells AB, Sweden; Linköping University, Sweden|Circular Economy has the goal to reduce value loss and avoid waste by extending the life span of materials and products, including circulating materials or product parts before they become waste. Circular economy models (e.g., circular value networks) are typically complex and networked, involving different cross-industry domains. In the context of a circular value network, multiple actors, such as suppliers, manufacturers, recyclers, and product end-users, may be involved. In addition, there may be various flows of resources, energy, information and value throughout the network. This means that we face the challenge that the data and information from cross-industry domains in a circular economy model are not built on common ground, and as a result are difficult to understand and use for both humans and machines. Using ontologies to represent domain knowledge can enable actors and stakeholders from different industries in the circular economy to communicate using a common language. The knowledge domains involved include circular economy, sustainability, materials, products, manufacturing, and logistics. The objective of this paper is to investigate the landscape of current ontologies for these domains. This will enable us to in the future explore what existing knowledge can be adapted or used to develop ontologies for circular value networks.|循环经济的目标是通过延长材料和产品的寿命来减少价值损失和避免浪费,包括在成为废物之前循环的材料或产品部件。循环经济模型(例如,循环价值网络)通常是复杂和网络化的,涉及不同的跨行业领域。在循环价值网络的背景下,可能涉及多个参与者,如供应商、制造商、回收商和产品最终用户。此外,整个网络可能有各种各样的资源、能源、信息和价值流。这意味着我们面临的挑战是,来自循环经济模型中跨行业领域的数据和信息不是建立在共同的基础之上,因此人类和机器都难以理解和使用。使用本体来表示领域知识可以使循环经济中不同行业的参与者和利益相关者使用一种共同的语言进行交流。所涉及的知识领域包括循环经济、可持续性、材料、产品、制造和物流。本文的目的是研究这些领域当前的本体论景观。这将使我们能够在未来探索什么现有的知识可以被调整或用于开发循环价值网络的本体论。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Survey+of+General+Ontologies+for+the+Cross-Industry+Domain+of+Circular+Economy)|0| +|[A Survey of General Ontologies for the Cross-Industry Domain of Circular Economy](https://doi.org/10.1145/3543873.3587613)|Huanyu Li, Mina Abd Nikooie Pour, Ying Li, Mikael Lindecrantz, Eva Blomqvist, Patrick Lambrix|Ragn-Sells AB, Sweden; Linköping University, Sweden; Linköping University, Sweden and University of Gävle, Sweden|Circular Economy has the goal to reduce value loss and avoid waste by extending the life span of materials and products, including circulating materials or product parts before they become waste. Circular economy models (e.g., circular value networks) are typically complex and networked, involving different cross-industry domains. 
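
For the DEEP Sensorium row above, combining sub-lexical character n-grams with a second feature view is straightforward in scikit-learn; in this sketch a plain word-level view stands in for the affective-dimension features, and the four-example dataset is invented.

```python
from sklearn.pipeline import FeatureUnion, make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Two feature views for sensory-domain labelling: sub-lexical character
# n-grams plus a word-level view (a stand-in for affective features).

texts = ["a sharp metallic clang", "the scent of fresh bread",
         "rough bark under the palm", "a bitter espresso note"]
labels = ["hearing", "smell", "touch", "taste"]

features = FeatureUnion([
    ("char", TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))),
    ("word", TfidfVectorizer()),
])
clf = make_pipeline(features, LogisticRegression(max_iter=1000)).fit(texts, labels)
print(clf.predict(["the smell of rain"]))
```
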
In the context of a circular value network, multiple actors, such as suppliers, manufacturers, recyclers, and product end-users, may be involved. In addition, there may be various flows of resources, energy, information and value throughout the network. This means that we face the challenge that the data and information from cross-industry domains in a circular economy model are not built on common ground, and as a result are difficult to understand and use for both humans and machines. Using ontologies to represent domain knowledge can enable actors and stakeholders from different industries in the circular economy to communicate using a common language. The knowledge domains involved include circular economy, sustainability, materials, products, manufacturing, and logistics. The objective of this paper is to investigate the landscape of current ontologies for these domains. This will enable us to in the future explore what existing knowledge can be adapted or used to develop ontologies for circular value networks.|循环经济的目标是通过延长材料和产品的寿命来减少价值损失和避免浪费,包括在成为废物之前循环的材料或产品部件。循环经济模型(例如,循环价值网络)通常是复杂和网络化的,涉及不同的跨行业领域。在循环价值网络的背景下,可能涉及多个参与者,如供应商、制造商、回收商和产品最终用户。此外,整个网络可能有各种各样的资源、能源、信息和价值流。这意味着我们面临的挑战是,来自循环经济模型中跨行业领域的数据和信息不是建立在共同的基础之上,因此人类和机器都难以理解和使用。使用本体来表示领域知识可以使循环经济中不同行业的参与者和利益相关者使用一种共同的语言进行交流。所涉及的知识领域包括循环经济、可持续性、材料、产品、制造和物流。本文的目的是研究这些领域当前的本体论景观。这将使我们能够在未来探索什么现有的知识可以被调整或用于开发循环价值网络的本体论。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Survey+of+General+Ontologies+for+the+Cross-Industry+Domain+of+Circular+Economy)|0| |[Improving Netflix Video Quality with Neural Networks](https://doi.org/10.1145/3543873.3587553)|Christos G. Bampis, LiHeng Chen, Zhi Li||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Improving+Netflix+Video+Quality+with+Neural+Networks)|0| |[Graph2Feat: Inductive Link Prediction via Knowledge Distillation](https://doi.org/10.1145/3543873.3587596)|Ahmed E. Samy, Zekarias T. Kefato, Sarunas Girdzijauskas|KTH, Royal Institue of Technology, Sweden; KTH, Royal Institute of Technology, Sweden|Link prediction between two nodes is a critical task in graph machine learning. Most approaches are based on variants of graph neural networks (GNNs) that focus on transductive link prediction and have high inference latency. However, many real-world applications require fast inference over new nodes in inductive settings where no information on connectivity is available for these nodes. Thereby, node features provide an inevitable alternative in the latter scenario. To that end, we propose Graph2Feat, which enables inductive link prediction by exploiting knowledge distillation (KD) through the Student-Teacher learning framework. In particular, Graph2Feat learns to match the representations of a lightweight student multi-layer perceptron (MLP) with a more expressive teacher GNN while learning to predict missing links based on the node features, thus attaining both GNN’s expressiveness and MLP’s fast inference. Furthermore, our approach is general; it is suitable for transductive and inductive link predictions on different types of graphs regardless of them being homogeneous or heterogeneous, directed or undirected. We carry out extensive experiments on seven real-world datasets including homogeneous and heterogeneous graphs. Our experiments demonstrate that Graph2Feat significantly outperforms SOTA methods in terms of AUC and average precision in homogeneous and heterogeneous graphs. 
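
The Graph2Feat row above distils a graph-aware teacher into a feature-only student; the sketch below reproduces that shape with a frozen one-layer graph convolution as the teacher and an MLP student trained to match its embeddings. The architectures, the random graph, and the plain MSE matching loss are our simplifications of the paper's design.

```python
import torch
import torch.nn as nn

# Teacher-student distillation in the Graph2Feat spirit: the student MLP
# learns graph-smoothed representations from node features alone.

n, d, h = 200, 16, 32
X = torch.randn(n, d)
A = (torch.rand(n, n) < 0.05).float()
A_hat = (A + torch.eye(n)) / (A.sum(1, keepdim=True) + 1)   # crude row normalization

teacher = nn.Linear(d, h)
with torch.no_grad():
    target = torch.relu(A_hat @ teacher(X))    # graph-aware teacher embeddings

student = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, h))
opt = torch.optim.Adam(student.parameters(), lr=1e-2)
for _ in range(200):
    loss = nn.functional.mse_loss(student(X), target)  # representation matching
    opt.zero_grad(); loss.backward(); opt.step()
print("distillation loss:", float(loss))

# At serving time a brand-new node needs only its features:
print(student(torch.randn(1, d)).shape)        # fast, adjacency-free inference
```
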
Finally, Graph2Feat has the minimum inference time compared to the SOTA methods, and 100x acceleration compared to GNNs. The code and datasets are available on GitHub.|两节点之间的链路预测是图形机器学习中的一个关键问题。大多数方法是基于图神经网络(GNN)的变体,侧重于传导链接预测和具有较高的推理潜伏期。然而,许多实际应用程序需要在归纳设置中对新节点进行快速推理,因为这些节点没有关于连通性的信息。因此,在后一种情况下,节点特性提供了一种不可避免的替代方案。为此,我们提出了 Graph2Feat,它通过学生-教师学习框架利用知识提取(KD)来实现归纳链接预测。特别是,Graph2Feat 学习匹配轻量级学生多层感知器(MLP)和更具表现力的教师 GNN 的表示,同时学习基于节点特征预测缺失链接,从而实现 GNN 的表现力和 MLP 的快速推理。此外,我们的方法是通用的,它适合于对不同类型的图的传导和归纳链接预测,不管它们是同质的或异质的,有向的或无向的。我们在七个真实世界的数据集上进行了广泛的实验,包括同质和异质图。我们的实验表明,Graph2Feat 在 AUC 和均匀和异构图的平均精度方面显著优于 SOTA 方法。最后,Graph2Feat 与 SOTA 方法相比具有最短的推理时间,与 GNN 相比具有100倍的加速度。代码和数据集可以在 GitHub 上获得。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Graph2Feat:+Inductive+Link+Prediction+via+Knowledge+Distillation)|0| |[Universal Model in Online Customer Service](https://doi.org/10.1145/3543873.3587630)|ShuTing Pi, ChengPing Hsieh, Qun Liu, Yuying Zhu|Amazon, USA|Building machine learning models can be a time-consuming process that often takes several months to implement in typical business scenarios. To ensure consistent model performance and account for variations in data distribution, regular retraining is necessary. This paper introduces a solution for improving online customer service in e-commerce by presenting a universal model for predicting labels based on customer questions, without requiring training. Our novel approach involves using machine learning techniques to tag customer questions in transcripts and create a repository of questions and corresponding labels. When a customer requests assistance, an information retrieval model searches the repository for similar questions, and statistical analysis is used to predict the corresponding label. By eliminating the need for individual model training and maintenance, our approach reduces both the model development cycle and costs. The repository only requires periodic updating to maintain accuracy.|构建机器学习模型可能是一个耗时的过程,在典型的业务场景中通常需要几个月才能实现。为了确保模型性能一致,并考虑到数据分布的差异,定期再培训是必要的。本文介绍了一种改进电子商务中在线客户服务的解决方案,提出了一种基于客户问题的通用标签预测模型,该模型不需要培训。我们的新方法包括使用机器学习技术在文本中标记客户的问题,并创建一个问题库和相应的标签。当客户要求协助时,信息检索模型会在储存库中搜索类似的问题,并使用统计分析来预测相应的标签。通过消除对单个模型的培训和维护的需要,我们的方法降低了模型开发周期和成本。存储库只需要定期更新以保持准确性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Universal+Model+in+Online+Customer+Service)|0| -|[Robust Stochastic Multi-Armed Bandits with Historical Data](https://doi.org/10.1145/3543873.3587653)|Sarah Boufelja Yacobi, Djallel Bouneffouf|Imperial College London, United Kingdom; IBM Research, USA|We consider the problem of Stochastic Contextual Multi-Armed Bandits (CMABs) initialised with Historical data. Initialisation with historical data is an example of data-driven regularisation which should, in theory, accelerate the convergence of CMABs. However, in practice, we have little to no control over the underlying generation process of such data, which may exhibit some pathologies, possibly impeding the convergence and the stability of the algorithm. In this paper, we focus on two main challenges: bias selection and data corruption. We propose two new algorithms to solve these specific issues: LinUCB with historical data and offline balancing (OB-HLinUCB) and Robust LinUCB with corrupted historical data (R-HLinUCB).
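
The training-free "universal model" in the customer-service row above reduces to retrieve-then-vote; a minimal sketch with TF-IDF retrieval and majority voting follows, with an invented four-question repository standing in for the tagged transcript data.

```python
from collections import Counter
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors

# Retrieve similar tagged questions, then predict the majority label.
repo_questions = ["where is my package", "cancel my order please",
                  "package arrived damaged", "how do I cancel my order"]
repo_labels = ["shipping", "cancellation", "shipping", "cancellation"]

vec = TfidfVectorizer().fit(repo_questions)
index = NearestNeighbors(n_neighbors=3, metric="cosine").fit(
    vec.transform(repo_questions))

def predict(question):
    _, idx = index.kneighbors(vec.transform([question]))
    votes = Counter(repo_labels[i] for i in idx[0])
    return votes.most_common(1)[0][0]          # simple statistical aggregation

print(predict("my package never showed up"))
```
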
We derive their theoretical regret bounds and discuss their computational performance using real-world datasets.|我们考虑的问题随机上下文多武装匪徒(CMABs)初始化与历史数据。使用历史数据进行初始化,是数据驱动的正规化的一个例子,理论上应能加速 CMAB 的融合。然而,在实践中,我们对这些数据的基本生成过程几乎没有控制,这可能会表现出一些病态,可能会妨碍算法的收敛和稳定性。在本文中,我们主要关注两个主要的挑战: 偏差选择和数据损坏。我们提出了两种新的算法来解决这些具体问题: 具有历史数据和离线平衡的 LinUCB (OB-HLinUCB)和具有损坏历史数据的鲁棒 LinUCB (R-HLinUCB)。我们推导了它们的理论遗憾界限,并利用实际数据集讨论了它们的计算性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Robust+Stochastic+Multi-Armed+Bandits+with+Historical+Data)|0| +|[Robust Stochastic Multi-Armed Bandits with Historical Data](https://doi.org/10.1145/3543873.3587653)|Sarah Boufelja Yacobi, Djallel Bouneffouf|IBM Research, USA; Imperial College London, United Kingdom|We consider the problem of Stochastic Contextual Multi-Armed Bandits (CMABs) initialised with Historical data. Initialisation with historical data is an example of data-driven regularisation which should, in theory, accelerate the convergence of CMABs. However, in practice, we have little to no control over the underlying generation process of such data, which may exhibit some pathologies, possibly impeding the convergence and the stability of the algorithm. In this paper, we focus on two main challenges: bias selection and data corruption. We propose two new algorithms to solve these specific issues: LinUCB with historical data and offline balancing (OB-HLinUCB) and Robust LinUCB with corrupted historical data (R-HLinUCB). We derive their theoretical regret bounds and discuss their computational performance using real-world datasets.|我们考虑的问题随机上下文多武装匪徒(CMABs)初始化与历史数据。使用历史数据进行初始化,是数据驱动的正规化的一个例子,理论上应能加速 CMAB 的融合。然而,在实践中,我们对这些数据的基本生成过程几乎没有控制,这可能会表现出一些病态,可能会妨碍算法的收敛和稳定性。在本文中,我们主要关注两个主要的挑战: 偏差选择和数据损坏。我们提出了两种新的算法来解决这些具体问题: 具有历史数据和离线平衡的 LinUCB (OB-HLinUCB)和具有损坏历史数据的鲁棒 LinUCB (R-HLinUCB)。我们推导了它们的理论遗憾界限,并利用实际数据集讨论了它们的计算性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Robust+Stochastic+Multi-Armed+Bandits+with+Historical+Data)|0| |[Skill Graph Construction From Semantic Understanding](https://doi.org/10.1145/3543873.3587667)|Shiyong Lin, Yiping Yuan, Carol Jin, Yi Pan|LinkedIn, USA|LinkedIn is building a skill graph to power a skill-first talent marketplace. Constructing a skill graph from a flat list is not an trivial task, especially by human curation. In this paper, we leverage the pre-trained large language model BERT to achieve this through semantic understanding on synthetically generated texts as training data. We automatically create positive and negative labels from the seed skill graph. The training data are encoded by pre-trained language models into embeddings and they are consumed by the downstream classification module to classify the relationships between skill pairs.|LinkedIn 正在构建一个技能图表,为技能优先的人才市场提供动力。从一个平面列表中构建一个技能图表并不是一件微不足道的事情,尤其是对于人工管理来说。在本文中,我们利用预训练的大语言模型 BERT,通过对综合生成的文本作为训练数据的语义理解来实现这一点。我们自动从种子技能图中创建正面和负面的标签。训练数据通过预先训练的语言模型进行编码嵌入,然后由下游分类模块使用这些数据对技能对之间的关系进行分类。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Skill+Graph+Construction+From+Semantic+Understanding)|0| |[Cultural Differences in Signed Ego Networks on Twitter: An Investigatory Analysis](https://doi.org/10.1145/3543873.3587641)|Jack Tacchi, Chiara Boldrini, Andrea Passarella, Marco Conti|Istituto di Informatica e Telematica - Consiglio Nazionale delle Ricerche, Italy; Istituto di Informatica e Telematica - Consiglio Nazionale delle Ricerche, Italy and Scuola Normale Superiore, Italy|Human social behaviour has been observed to adhere to certain structures. 
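
The historical-data bandit row above builds on LinUCB; the sketch below warm-starts LinUCB's sufficient statistics from logged interactions before scoring fresh candidates. The synthetic log and the plain warm start are our illustration; the paper's OB-HLinUCB and R-HLinUCB corrections for selection bias and corruption are not implemented here.

```python
import numpy as np

# LinUCB warm-started from historical (context, reward) pairs.
rng = np.random.default_rng(1)
d, alpha = 5, 1.0
A = np.eye(d)                      # ridge covariance matrix
b = np.zeros(d)

# warm start: fold logged interactions into the sufficient statistics
for _ in range(500):
    x = rng.normal(size=d)
    r = x @ np.array([0.5, -0.2, 0.1, 0.0, 0.3]) + rng.normal(scale=0.1)
    A += np.outer(x, x)
    b += r * x

def ucb(x):
    theta = np.linalg.solve(A, b)                       # ridge estimate
    return x @ theta + alpha * np.sqrt(x @ np.linalg.solve(A, x))

candidates = rng.normal(size=(10, d))                   # candidate arm contexts
print("chosen arm:", int(max(range(10), key=lambda i: ucb(candidates[i]))))
```
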
One such structure, the Ego Network Model (ENM), has been found almost ubiquitously in human society. Recently, this model has been extended to include signed connections. While the unsigned ENM has been rigorously observed for decades, the signed version is still somewhat novel and lacks the same breadth of observation. Therefore, the main aim of this paper is to examine this signed structure across various categories of individuals from a swathe of culturally distinct regions. Minor differences in the distribution of signs across the SENM can be observed between cultures. However, these can be overwhelmed when the network is centred around a specific topic. Indeed, users who are engaged with specific themes display higher levels of negativity in their networks. This effect is further supported by a significant negative correlation between the number of "general" topics discussed in a network and that network’s percentage of negative connections. These findings suggest that the negativity of communications and relationships on Twitter are very dependent on the topics being discussed and, furthermore, these relationships are more likely to be negative when they are based around a specific topic.|人类的社会行为已经被观察到遵循某些结构。其中一种结构,自我网络模型(ENM) ,在人类社会中几乎无处不在。最近,这个模型已经扩展到包括有符号连接。虽然未签名的 ENM 已经被严格观察了几十年,但是签名版本仍然有些新颖,缺乏同样广度的观察。因此,本文的主要目的是考察来自不同文化区域的不同类别的个体的这种符号结构。不同文化之间可以观察到在 SENM 中符号分布的细微差别。然而,当网络围绕某个特定主题时,这些问题可能会不堪重负。事实上,参与特定主题的用户在他们的网络中表现出更高水平的消极性。网络中讨论的“一般”主题的数量与网络负连接的百分比之间存在显著的负相关性,进一步支持了这种效应。这些发现表明,Twitter 上的交流和关系的消极性很大程度上取决于正在讨论的话题,而且,当这些关系围绕着一个特定的话题时,它们更有可能是消极的。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Cultural+Differences+in+Signed+Ego+Networks+on+Twitter:+An+Investigatory+Analysis)|0| |[Don't Trust, Verify: The Case of Slashing from a Popular Ethereum Explorer](https://doi.org/10.1145/3543873.3587555)|Zhiguo He, Jiasun Li, Zhengxun Wu|University of Chicago and NBER, USA; George Mason University, USA; Independent, USA|Blockchain explorers are important tools for quick look-ups of on-chain activities. However, as centralized data providers, their reliability remains under-studied. As a case study, we investigate Beaconcha.in , a leading explorer serving Ethereum’s proof-of-stake (PoS) update. According to the explorer, we find that more than 75% of slashable Byzantine actions were not slashed. Since Ethereum relies on the “stake-and-slash" mechanism to align incentives, this finding would at its face value cause concern over Ethereum’s security. However, further investigation reveals that all the apparent unslashed incidents were erroneously recorded due to the explorer’s mishandling of consensus edge cases. 
Besides the usual message of using caution with centralized information providers, our findings also call for attention to improving the monitoring of blockchain systems that support high-value applications.|区块链浏览器是快速查找链上活动的重要工具。然而,作为中心化的数据提供者,它们的可靠性仍然没有得到充分的研究。作为一个案例研究,我们调查了 Beaconcha.in,它是一个领先的区块链浏览器,服务于以太坊的权益证明(PoS)升级。根据该浏览器的数据,我们发现超过75% 的可罚没的拜占庭行为并没有被罚没。由于以太坊依靠“质押-罚没”机制来调整激励机制,这一发现从表面上看会引起人们对以太坊安全性的担忧。然而,进一步的调查表明,所有这些表面上未被罚没的事件,都是由于该浏览器对共识边缘情况处理不当而造成的错误记录。除了提醒人们对中心化信息提供者保持谨慎这一常见信息之外,我们的研究结果还呼吁注意改进对支持高价值应用的区块链系统的监测。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Don't+Trust,+Verify:+The+Case+of+Slashing+from+a+Popular+Ethereum+Explorer)|0| |[An Exploration on Cryptocurrency Corporations' Fiscal Opportunities](https://doi.org/10.1145/3543873.3587603)|Thomas Charest, Masarah PaquetClouston|School of Criminology, University of Montreal, Canada|As the decentralized finance industry gains traction, governments worldwide are creating or modifying legislations to regulate such financial activities. To avoid these new legislations, decentralized finance enterprises may shop for fiscally advantageous jurisdictions. This study explores global tax evasion opportunities for decentralized finance enterprises. Opportunities are identified by considering various jurisdictions’ tax laws on cryptocurrencies along with their corporate income tax rates, corporate capital gains tax rates, level of financial development and level of cryptocurrency adoption. They are visualized with the manifold approximation and projection for dimension reduction (UMAP) technique. The study results show that there exist a substantial number of tax evasion opportunities for decentralized finance enterprises through both traditional offshore jurisdictions and crypto-advantageous jurisdictions. The latter jurisdictions are usually considered high-tax fiscal regimes; but, given that they do not apply tax laws, tax evasion opportunities arise, especially in jurisdictions that have high financial development and high cryptocurrency adoption. Further research should investigate these new opportunities and how they are evolving.
Understanding the global landscape surrounding tax evasion opportunities in decentralized finance represents a first step at preventing corporate capital flight of cryptocurrencies.|随着去中心化金融行业的兴起,世界各国政府正在制定或修订法律,以管制此类金融活动。为了避免这些新的立法,去中心化金融企业可以选择财政上有利的司法管辖区。本研究探讨去中心化金融企业的全球逃税机会。透过考虑不同地区有关加密货币的税务法例,以及其公司所得税税率、公司资本增值税税率、金融发展水平和加密货币的采用程度,我们可找出机会。这些机会通过流形近似与投影降维(UMAP)技术进行可视化。研究结果表明,无论是传统的离岸管辖区还是加密优势管辖区,去中心化金融企业都有大量的逃税机会。后者通常被认为是高税收的财政体制; 但是,由于它们不适用税法,逃税机会就会出现,特别是在金融发展程度高、采用加密货币程度高的地区。进一步的研究应该调查这些新的机会以及它们是如何演变的。了解去中心化金融中逃税机会的全球环境,是防止加密货币企业资本外逃的第一步。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=An+Exploration+on+Cryptocurrency+Corporations'+Fiscal+Opportunities)|0| |[Improving the Exploration/Exploitation Trade-Off in Web Content Discovery](https://doi.org/10.1145/3543873.3587574)|Peter Schulam, Ion Muslea||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Improving+the+Exploration/Exploitation+Trade-Off+in+Web+Content+Discovery)|0| -|[SoarGraph: Numerical Reasoning over Financial Table-Text Data via Semantic-Oriented Hierarchical Graphs](https://doi.org/10.1145/3543873.3587598)|Fengbin Zhu, Moxin Li, Junbin Xiao, Fuli Feng, Chao Wang, TatSeng Chua|6ESTATES PTE LTD, Singapore; National University of Singapore, Singapore; University of Science and Technology of China, China|Towards the intelligent understanding of table-text data in the finance domain, previous research explores numerical reasoning over table-text content with Question Answering (QA) tasks. A general framework is to extract supporting evidence from the table and text and then perform numerical reasoning over extracted evidence for inferring the answer. However, existing models are vulnerable to missing supporting evidence, which limits their performance. In this work, we propose a novel Semantic-Oriented Hierarchical Graph (SoarGraph) that models the semantic relationships and dependencies among the different elements (e.g., question, table cells, text paragraphs, quantities, and dates) using hierarchical graphs to facilitate supporting evidence extraction and enhance numerical reasoning capability. We conduct our experiments on two popular benchmarks, FinQA and TAT-QA datasets, and the results show that our SoarGraph significantly outperforms all strong baselines, demonstrating remarkable effectiveness.|针对金融领域中表格文本数据的智能理解问题,以往的研究采用问答(QA)任务对表格文本内容进行数值推理。一个通用的框架是从表格和文本中提取支持证据,然后对提取的证据进行数值推理,从而推断出答案。然而,现有的模型容易失去支持证据,这限制了它们的性能。在这项工作中,我们提出了一种新的面向语义的层次图(SoarGraph) ,它使用层次图来模拟不同元素(如问题、表格单元、文本段落、数量和日期)之间的语义关系和依赖关系,以便于支持证据提取和增强数值推理能力。我们在 FinQA 和 TAT-QA 数据集上进行了实验,结果表明我们的 SoarGraph 显著优于所有强基线,显示出显著的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=SoarGraph:+Numerical+Reasoning+over+Financial+Table-Text+Data+via+Semantic-Oriented+Hierarchical+Graphs)|0| +|[SoarGraph: Numerical Reasoning over Financial Table-Text Data via Semantic-Oriented Hierarchical Graphs](https://doi.org/10.1145/3543873.3587598)|Fengbin Zhu, Moxin Li, Junbin Xiao, Fuli Feng, Chao Wang, TatSeng Chua|National University of Singapore, Singapore; University of Science and Technology of China, China; 6ESTATES PTE LTD, Singapore|Towards the intelligent understanding of table-text data in the finance domain, previous research explores numerical reasoning over table-text content with Question Answering (QA) tasks. A general framework is to extract supporting evidence from the table and text and then perform numerical reasoning over extracted evidence for inferring the answer.
However, existing models are vulnerable to missing supporting evidence, which limits their performance. In this work, we propose a novel Semantic-Oriented Hierarchical Graph (SoarGraph) that models the semantic relationships and dependencies among the different elements (e.g., question, table cells, text paragraphs, quantities, and dates) using hierarchical graphs to facilitate supporting evidence extraction and enhance numerical reasoning capability. We conduct our experiments on two popular benchmarks, FinQA and TAT-QA datasets, and the results show that our SoarGraph significantly outperforms all strong baselines, demonstrating remarkable effectiveness.|针对金融领域中表格文本数据的智能理解问题,以往的研究采用问答(QA)任务对表格文本内容进行数值推理。一个通用的框架是从表格和文本中提取支持证据,然后对提取的证据进行数值推理,从而推断出答案。然而,现有的模型容易失去支持证据,这限制了它们的性能。在这项工作中,我们提出了一种新的面向语义的层次图(SoarGraph) ,它使用层次图来模拟不同元素(如问题、表格单元、文本段落、数量和日期)之间的语义关系和依赖关系,以便于支持证据提取和增强数值推理能力。我们在 FinQA 和 TAT-QA 数据集上进行了实验,结果表明我们的 SoarGraph 显著优于所有强基线,显示出显著的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=SoarGraph:+Numerical+Reasoning+over+Financial+Table-Text+Data+via+Semantic-Oriented+Hierarchical+Graphs)|0| |[Online to Offline Crossover of White Supremacist Propaganda](https://doi.org/10.1145/3543873.3587569)|Ahmad Diab, BolorErdene Jagdagdorj, Lynnette Hui Xian Ng, YuRu Lin, Michael Miller Yoder|University of Pittsburgh, USA; Carnegie Mellon University, USA|White supremacist extremist groups are a significant domestic terror threat in many Western nations. These groups harness the Internet to spread their ideology via online platforms: blogs, chat rooms, forums, and social media, which can inspire violence offline. In this work, we study the persistence and reach of white supremacist propaganda in both online and offline environments. We also study patterns in narratives that crossover from online to offline environments, or vice versa. From a geospatial analysis, we find that offline propaganda is geographically widespread in the United States, with a slight tendency toward Northeastern states. Propaganda that spreads the farthest and lasts the longest has a patriotic framing and is short, memorable, and repeatable. Through text comparison methods, we illustrate that online propaganda typically leads the appearance of the same propaganda in offline flyers, banners, and graffiti. We hope that this study sheds light on the characteristics of persistent white supremacist narratives both online and offline.|在许多西方国家,白人至上主义极端组织是一个重大的国内恐怖主义威胁。这些团体利用互联网通过在线平台传播他们的意识形态: 博客、聊天室、论坛和社交媒体,这些平台可以在线下煽动暴力。在这项工作中,我们研究了白人至上主义宣传在两个在线和离线环境中的持续性和影响范围。我们也研究从 Online To Offline线上到线下环境中交叉出来的叙述模式,反之亦然。通过地理空间分析,我们发现线下宣传在美国的地理位置上非常普遍,有轻微的美国东北部倾向。传播最广、持续时间最长的宣传具有爱国主义的框架,简短、令人难忘、可重复。通过文本比较的方法,我们说明了在线宣传通常会导致同样的宣传出现在线下传单、横幅和涂鸦中。我们希望这项研究能够揭示白人至上主义叙事的在线和离线。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Online+to+Offline+Crossover+of+White+Supremacist+Propaganda)|0| -|[Privacy-Preserving Online Content Moderation with Federated Learning](https://doi.org/10.1145/3543873.3587366)|Pantelitsa Leonidou, Nicolas Kourtellis, Nikos Salamanos, Michael Sirivianos|Cyprus University of Technology, Cyprus; Telefonica Research, Spain|Users are daily exposed to a large volume of harmful content on various social network platforms. One solution is developing online moderation tools using Machine Learning techniques. However, the processing of user data by online platforms requires compliance with privacy policies. Federated Learning (FL) is an ML paradigm where the training is performed locally on the users' devices. 
Although the FL framework complies, in theory, with the GDPR policies, privacy leaks can still occur. For instance, an attacker accessing the final trained model can successfully perform unwanted inference of the data belonging to the users who participated in the training process. In this paper, we propose a privacy-preserving FL framework for online content moderation that incorporates Differential Privacy (DP). To demonstrate the feasibility of our approach, we focus on detecting harmful content on Twitter - but the overall concept can be generalized to other types of misbehavior. We simulate a text classifier - in FL fashion - which can detect tweets with harmful content. We show that the performance of the proposed FL framework can be close to the centralized approach - for both the DP and non-DP FL versions. Moreover, it has a high performance even if a small number of clients (each with a small number of data points) are available for the FL training. When reducing the number of clients (from 50 to 10) or the data points per client (from 1K to 0.1K), the classifier can still achieve ~81% AUC. Furthermore, we extend the evaluation to four other Twitter datasets that capture different types of user misbehavior and still obtain a promising performance (61% - 80% AUC). Finally, we explore the overhead on the users' devices during the FL training phase and show that the local training does not introduce excessive CPU utilization and memory consumption overhead.|用户每天都会在各种社交网络平台上接触到大量的有害内容。一个解决方案是使用机器学习技术开发在线审核工具。然而,通过在线平台处理用户数据需要遵守隐私策略。联邦学习(FL)是一种机器学习范式,其中的培训是在用户的设备上本地执行的。尽管 FL 框架在理论上符合 GDPR 策略,但仍然可能发生隐私泄露。例如,访问最终训练模型的攻击者可以成功地对属于参与训练过程的用户的数据进行不必要的推断。在这篇文章中,我们提出了一个保护隐私的在线内容审查框架,该框架结合了差分隐私(DP)。为了证明我们方法的可行性,我们将重点放在检测 Twitter 上的有害内容上——但总体概念可以推广到其他类型的不当行为。我们模拟了一个文本分类器——以 FL 的方式——它可以检测带有有害内容的 tweet。我们展示了所提出的 FL 框架的性能可以接近于集中式方法——对于 DP 和非 DP FL 版本都是如此。此外,即使只有少量的客户端(每个客户端都有少量的数据点)可用于 FL 培训,它也具有很高的性能。当减少客户端数量(从50到10)或每个客户端的数据点数量(从1K 到0.1 K)时,分类器仍然可以达到约81% 的 AUC。此外,我们将评估扩展到其他四个 Twitter 数据集,这些数据集捕获不同类型的用户不当行为,并仍然获得有希望的性能(61% -80% AUC)。最后,我们研究了 FL 训练阶段用户设备上的开销,结果表明本地训练不会引入过多的 CPU 利用率和内存消耗开销。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Privacy-Preserving+Online+Content+Moderation+with+Federated+Learning)|0| +|[Privacy-Preserving Online Content Moderation with Federated Learning](https://doi.org/10.1145/3543873.3587366)|Pantelitsa Leonidou, Nicolas Kourtellis, Nikos Salamanos, Michael Sirivianos|Telefonica Research, Spain; Cyprus University of Technology, Cyprus|Users are daily exposed to a large volume of harmful content on various social network platforms. One solution is developing online moderation tools using Machine Learning techniques. However, the processing of user data by online platforms requires compliance with privacy policies. Federated Learning (FL) is an ML paradigm where the training is performed locally on the users' devices. Although the FL framework complies, in theory, with the GDPR policies, privacy leaks can still occur. For instance, an attacker accessing the final trained model can successfully perform unwanted inference of the data belonging to the users who participated in the training process. In this paper, we propose a privacy-preserving FL framework for online content moderation that incorporates Differential Privacy (DP). To demonstrate the feasibility of our approach, we focus on detecting harmful content on Twitter - but the overall concept can be generalized to other types of misbehavior. 
We simulate a text classifier - in FL fashion - which can detect tweets with harmful content. We show that the performance of the proposed FL framework can be close to the centralized approach - for both the DP and non-DP FL versions. Moreover, it has a high performance even if a small number of clients (each with a small number of data points) are available for the FL training. When reducing the number of clients (from 50 to 10) or the data points per client (from 1K to 0.1K), the classifier can still achieve ~81% AUC. Furthermore, we extend the evaluation to four other Twitter datasets that capture different types of user misbehavior and still obtain a promising performance (61% - 80% AUC). Finally, we explore the overhead on the users' devices during the FL training phase and show that the local training does not introduce excessive CPU utilization and memory consumption overhead.|用户每天都会在各种社交网络平台上接触到大量的有害内容。一个解决方案是使用机器学习技术开发在线审核工具。然而,通过在线平台处理用户数据需要遵守隐私策略。联邦学习(FL)是一种机器学习范式,其中的培训是在用户的设备上本地执行的。尽管 FL 框架在理论上符合 GDPR 策略,但仍然可能发生隐私泄露。例如,访问最终训练模型的攻击者可以成功地对属于参与训练过程的用户的数据进行不必要的推断。在这篇文章中,我们提出了一个保护隐私的在线内容审查框架,该框架结合了差分隐私(DP)。为了证明我们方法的可行性,我们将重点放在检测 Twitter 上的有害内容上——但总体概念可以推广到其他类型的不当行为。我们模拟了一个文本分类器——以 FL 的方式——它可以检测带有有害内容的 tweet。我们展示了所提出的 FL 框架的性能可以接近于集中式方法——对于 DP 和非 DP FL 版本都是如此。此外,即使只有少量的客户端(每个客户端都有少量的数据点)可用于 FL 培训,它也具有很高的性能。当减少客户端数量(从50到10)或每个客户端的数据点数量(从1K 到0.1 K)时,分类器仍然可以达到约81% 的 AUC。此外,我们将评估扩展到其他四个 Twitter 数据集,这些数据集捕获不同类型的用户不当行为,并仍然获得有希望的性能(61% -80% AUC)。最后,我们研究了 FL 训练阶段用户设备上的开销,结果表明本地训练不会引入过多的 CPU 利用率和内存消耗开销。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Privacy-Preserving+Online+Content+Moderation+with+Federated+Learning)|0| |[Graph-based Approach for Studying Spread of Radical Online Sentiment](https://doi.org/10.1145/3543873.3587634)|Le Nguyen, Nidhi Rastogi|Golisano College of Computing and Information Science, Department of Software Engineering, Rochester Institute of Technology, USA|The spread of radicalization through the Internet is a growing problem. We are witnessing a rise in online hate groups, inspiring the impressionable and vulnerable population towards extreme actions in the real world. In this paper, we study the spread of hate sentiments in online forums by collecting 1,973 long comment threads (30+ comments per thread) posted on dark-web forums and containing a combination of benign posts and radical comments on the Islamic religion. This framework allows us to leverage network analysis tools to investigate sentiment propagation through a social network. By combining sentiment analysis, social network analysis, and graph theory, we aim to shed light on the propagation of hate speech in online forums and the extent to which such speech can influence individuals. The results of the intra-thread analysis suggests that sentiment tends to cluster within comment threads, with around 75% of connected members sharing similar sentiments. They also indicate that online forums can act as echo chambers where people with similar views reinforce each other’s beliefs and opinions. On the other hand, the inter-thread shows that 64% of connected threads share similar sentiments, suggesting similarities between the ideologies present in different threads and that there likely is a wider network of individuals spreading hate speech across different forums. 
Finally, we plan to study this work with a larger dataset, which could provide further insights into the spread of hate speech in online forums and how to mitigate it.|通过互联网传播激进主义是一个日益严重的问题。我们正在目睹网上仇恨团体的增加,鼓励易受影响的弱势群体在现实世界中采取极端行动。在这篇论文中,我们收集了在暗网论坛上发布的1,973个长评论串(每串30条以上评论),这些评论串中既有良性帖子,也有针对伊斯兰教的激进评论,以此研究网络论坛上仇恨情绪的传播。这个框架允许我们利用网络分析工具来调查通过社交网络的情感传播。通过结合情绪分析、社会网络分析和图论,我们旨在阐明仇恨言论在网络论坛中的传播以及这种言论对个人的影响程度。串内分析的结果表明,情绪倾向于在评论串内聚集,大约75% 的相互连接的成员共享类似的情绪。这些结果还表明,在线论坛可以充当回音室,在这里,持有相似观点的人们可以相互加强信念和观点。另一方面,串间分析显示,64% 的相互连接的评论串具有相似的情感,表明不同评论串中的意识形态之间存在相似性,并且可能有一个更广泛的个人网络在不同的论坛上传播仇恨言论。最后,我们计划利用一个更大的数据集来研究这项工作,它可以为在线论坛中仇恨言论的传播以及如何减轻这种传播提供进一步的见解。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Graph-based+Approach+for+Studying+Spread+of+Radical+Online+Sentiment)|0| -|[Swinging in the States: Does disinformation on Twitter mirror the US presidential election system?](https://doi.org/10.1145/3543873.3587638)|Manuel Pratelli, Marinella Petrocchi, Fabio Saracco, Rocco De Nicola|“Enrico Fermi” Research Center (CREF), Italy and IMT - School for Advanced Studies Lucca, Italy; Istituto di Informatica e Telematica (IIT) CNR, Italy and IMT - School for Advanced Studies Lucca, Italy; IMT - School for Advanced Studies Lucca, Italy and Istituto di Informatica e Telematica (IIT) CNR, Italy; IMT - School for Advanced Studies Lucca, Italy|For more than a decade scholars have been investigating the disinformation flow on social media contextually to societal events, like, e.g., elections. In this paper, we analyze the Twitter traffic related to the US 2020 pre-election debate and ask whether it mirrors the electoral system. The U.S. electoral system provides that, regardless of the actual vote gap, the premier candidate who received more votes in one state `takes' that state. Criticisms of this system have pointed out that election campaigns can be more intense in particular key states to achieve victory, so-called {\it swing states}. Our intuition is that election debate may cause more traffic on Twitter-and probably be more plagued by misinformation-when associated with swing states. The results mostly confirm the intuition. About 88\% of the entire traffic can be associated with swing states, and links to non-trustworthy news are shared far more in swing-related traffic than the same type of news in safe-related traffic. Considering traffic origin instead, non-trustworthy tweets generated by automated accounts, so-called social bots, are mostly associated with swing states.
Our work sheds light on the role an electoral system plays in the evolution of online debates, with, in the spotlight, disinformation and social bots.|十多年来,学者们一直在研究社交媒体上的虚假信息与社会事件的关系,比如选举。在本文中,我们分析了与美国2020年大选前辩论有关的 Twitter 流量,并询问它是否反映了选举制度。美国的选举制度规定,不管实际的选票差距如何,在一个州获得更多选票的总理候选人“接受”该州。对这种制度的批评指出,竞选活动可以更加激烈,在特定的关键州取得胜利,所谓的“摇摆州”。我们的直觉是,当选举辩论与摇摆州联系在一起时,可能会在 Twitter 上引起更多流量,而且可能更容易受到错误信息的困扰。结果大多证实了直觉。大约88% 的流量可以与摇摆州联系起来,与安全相关的流量相比,与摇摆州相关的流量中分享的不可信新闻的链接要多得多。相反,考虑到流量来源,由自动账户(即所谓的社交机器人)生成的不可信的 tweet 大多与摇摆州有关。我们的工作揭示了选举系统在网络辩论演变中所扮演的角色,聚光灯下是虚假信息和社交机器人。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Swinging+in+the+States:+Does+disinformation+on+Twitter+mirror+the+US+presidential+election+system?)|0| +|[Swinging in the States: Does disinformation on Twitter mirror the US presidential election system?](https://doi.org/10.1145/3543873.3587638)|Manuel Pratelli, Marinella Petrocchi, Fabio Saracco, Rocco De Nicola|IMT - School for Advanced Studies Lucca, Italy and Istituto di Informatica e Telematica (IIT) CNR, Italy; “Enrico Fermi” Research Center (CREF), Italy and IMT - School for Advanced Studies Lucca, Italy; IMT - School for Advanced Studies Lucca, Italy; Istituto di Informatica e Telematica (IIT) CNR, Italy and IMT - School for Advanced Studies Lucca, Italy|For more than a decade scholars have been investigating the disinformation flow on social media contextually to societal events, like, e.g., elections. In this paper, we analyze the Twitter traffic related to the US 2020 pre-election debate and ask whether it mirrors the electoral system. The U.S. electoral system provides that, regardless of the actual vote gap, the premier candidate who received more votes in one state `takes' that state. Criticisms of this system have pointed out that election campaigns can be more intense in particular key states to achieve victory, so-called {\it swing states}. Our intuition is that election debate may cause more traffic on Twitter-and probably be more plagued by misinformation-when associated with swing states. The results mostly confirm the intuition. About 88\% of the entire traffic can be associated with swing states, and links to non-trustworthy news are shared far more in swing-related traffic than the same type of news in safe-related traffic. Considering traffic origin instead, non-trustworthy tweets generated by automated accounts, so-called social bots, are mostly associated with swing states. Our work sheds light on the role an electoral system plays in the evolution of online debates, with, in the spotlight, disinformation and social bots.|十多年来,学者们一直在研究社交媒体上的虚假信息与社会事件的关系,比如选举。在本文中,我们分析了与美国2020年大选前辩论有关的 Twitter 流量,并询问它是否反映了选举制度。美国的选举制度规定,不管实际的选票差距如何,在一个州获得更多选票的总理候选人“接受”该州。对这种制度的批评指出,竞选活动可以更加激烈,在特定的关键州取得胜利,所谓的“摇摆州”。我们的直觉是,当选举辩论与摇摆州联系在一起时,可能会在 Twitter 上引起更多流量,而且可能更容易受到错误信息的困扰。结果大多证实了直觉。大约88% 的流量可以与摇摆州联系起来,与安全相关的流量相比,与摇摆州相关的流量中分享的不可信新闻的链接要多得多。相反,考虑到流量来源,由自动账户(即所谓的社交机器人)生成的不可信的 tweet 大多与摇摆州有关。我们的工作揭示了选举系统在网络辩论演变中所扮演的角色,聚光灯下是虚假信息和社交机器人。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Swinging+in+the+States:+Does+disinformation+on+Twitter+mirror+the+US+presidential+election+system?)|0| |[Analyzing Activity and Suspension Patterns of Twitter Bots Attacking Turkish Twitter Trends by a Longitudinal Dataset](https://doi.org/10.1145/3543873.3587650)|Tugrulcan Elmas|Indiana University Bloomington, USA|Twitter bots amplify target content in a coordinated manner to make them appear popular, which is an astroturfing attack. 
Such attacks promote certain keywords to push them to Twitter trends to make them visible to a broader audience. Past work on such fake trends revealed a new astroturfing attack named ephemeral astroturfing that employs a very unique bot behavior in which bots post and delete generated tweets in a coordinated manner. As such, it is easy to mass-annotate such bots reliably, making them a convenient source of ground truth for bot research. In this paper, we detect and disclose over 212,000 such bots targeting Turkish trends, which we name astrobots. We also analyze their activity and suspension patterns. We found that Twitter purged those bots en-masse 6 times since June 2018. However, the adversaries reacted quickly and deployed new bots that were created years ago. We also found that many such bots do not post tweets apart from promoting fake trends, which makes it challenging for bot detection methods to detect them. Our work provides insights into platforms' content moderation practices and bot detection research. The dataset is publicly available at https://github.com/tugrulz/EphemeralAstroturfing.|Twitter 机器人以一种协调的方式放大目标内容,让它们看起来很受欢迎,这是一种伪草根营销(astroturfing)攻击。这样的攻击推广某些关键词,将它们推向 Twitter 趋势,使它们对更广泛的受众可见。过去对这种虚假趋势的研究揭示了一种名为“短暂伪草根营销”(ephemeral astroturfing)的新型攻击,该攻击采用了一种非常独特的机器人行为,机器人以协调的方式发布和删除生成的推文。因此,很容易可靠地对这些机器人进行大规模标注,使它们成为机器人研究的一个方便的真实标注(ground truth)来源。在这篇论文中,我们发现并揭露了超过212,000个瞄准土耳其趋势的这类机器人,我们将其命名为 astrobots。我们还分析了它们的活动和封禁模式。我们发现,自2018年6月以来,Twitter 共大规模清除了这些机器人6次。然而,对手反应迅速,部署了多年前就已创建的新机器人。我们还发现,许多这样的机器人除了推广虚假趋势之外,不会发布推文,这使得机器人检测方法很难检测到它们。我们的工作为平台的内容审核实践和机器人检测研究提供了见解。该数据集可在 https://github.com/tugrulz/EphemeralAstroturfing 公开获取。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Analyzing+Activity+and+Suspension+Patterns+of+Twitter+Bots+Attacking+Turkish+Twitter+Trends+by+a+Longitudinal+Dataset)|0| -|[Socio-Emotional Computational Analysis of Propaganda Campaigns on Social Media Users in the Middle East](https://doi.org/10.1145/3543873.3587677)|Zain A. Halloush, Ahmed Aleroud, Craig Albert|Pamplin College of Arts, Humanities, and Social Sciences, Augusta University, USA; School of Computer and Cyber Sciences, Augusta University, Augusta University, USA|Society has been significantly impacted by social media platforms in almost every aspect of their life. This impact has been effectively formulating people’s global mindsets and opinions on political, economic, and social events. Such waves of opinion formation are referred to as propagandas and misinformation. Online propaganda influences the emotional and psychological orientation of people. The remarkable leaps in Machine Learning models and Natural Language Processing have helped in analyzing the emotional and psychological effects of cyber social threats such as propaganda campaigns on different nations, specifically in the Middle East, where rates of disputes have risen after the Arab Spring and the ongoing crises. In this paper, we present an approach to detect propagandas and the associated emotional and psychological aspects from social media news headlines that contain such a contextualized cyber social attack. We created a new dataset of headlines containing propaganda tweets and another dataset of potential emotions that the audience might endure when being exposed to such propaganda headlines.
We believe that this is the first research to address the detection of emotional reactions linked to propaganda types on social media in the Middle East.|社会几乎在其生活的各个方面都受到社交媒体平台的显著影响。这种影响有效地形成了人们对政治、经济和社会事件的全球心态和观点。这种形成意见的浪潮被称为宣传和错误信息。网络宣传影响着人们的情感和心理取向。机器学习模型和自然语言处理的显著飞跃有助于分析网络社会威胁的情感和心理影响,例如针对不同国家的宣传活动,特别是在中东,在阿拉伯之春和持续的危机之后,那里的争端比率已经上升。在本文中,我们提出了一种方法来检测宣传和相关的情绪和心理方面的社会媒体新闻标题,其中包含这样一个情境化的网络社会攻击。我们创建了一个新的标题数据集,其中包含宣传推文和另一个潜在情绪数据集,观众在接触这些宣传标题时可能会忍受这些情绪。我们认为,这是第一次研究如何在中东社交媒体上发现与宣传类型相关的情绪反应。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Socio-Emotional+Computational+Analysis+of+Propaganda+Campaigns+on+Social+Media+Users+in+the+Middle+East)|0| +|[Socio-Emotional Computational Analysis of Propaganda Campaigns on Social Media Users in the Middle East](https://doi.org/10.1145/3543873.3587677)|Zain A. Halloush, Ahmed Aleroud, Craig Albert|School of Computer and Cyber Sciences, Augusta University, Augusta University, USA; Pamplin College of Arts, Humanities, and Social Sciences, Augusta University, USA|Society has been significantly impacted by social media platforms in almost every aspect of their life. This impact has been effectively formulating people’s global mindsets and opinions on political, economic, and social events. Such waves of opinion formation are referred to as propagandas and misinformation. Online propaganda influences the emotional and psychological orientation of people. The remarkable leaps in Machine Learning models and Natural Language Processing have helped in analyzing the emotional and psychological effects of cyber social threats such as propaganda campaigns on different nations, specifically in the Middle East, where rates of disputes have risen after the Arab Spring and the ongoing crises. In this paper, we present an approach to detect propagandas and the associated emotional and psychological aspects from social media news headlines that contain such a contextualized cyber social attack. We created a new dataset of headlines containing propaganda tweets and another dataset of potential emotions that the audience might endure when being exposed to such propaganda headlines. We believe that this is the first research to address the detection of emotional reactions linked to propaganda types on social media in the Middle East.|社会几乎在其生活的各个方面都受到社交媒体平台的显著影响。这种影响有效地形成了人们对政治、经济和社会事件的全球心态和观点。这种形成意见的浪潮被称为宣传和错误信息。网络宣传影响着人们的情感和心理取向。机器学习模型和自然语言处理的显著飞跃有助于分析网络社会威胁的情感和心理影响,例如针对不同国家的宣传活动,特别是在中东,在阿拉伯之春和持续的危机之后,那里的争端比率已经上升。在本文中,我们提出了一种方法来检测宣传和相关的情绪和心理方面的社会媒体新闻标题,其中包含这样一个情境化的网络社会攻击。我们创建了一个新的标题数据集,其中包含宣传推文和另一个潜在情绪数据集,观众在接触这些宣传标题时可能会忍受这些情绪。我们认为,这是第一次研究如何在中东社交媒体上发现与宣传类型相关的情绪反应。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Socio-Emotional+Computational+Analysis+of+Propaganda+Campaigns+on+Social+Media+Users+in+the+Middle+East)|0| |[Towards a Semantic Approach for Linked Dataspace, Model and Data Cards](https://doi.org/10.1145/3543873.3587659)|Andy Donald, Apostolos Galanopoulos, Edward Curry, Emir Muñoz, Ihsan Ullah, M. A. Waskow, Maciej Dabrowski, Manan Kalra|Genesys Cloud Services Inc., Bonham Quay, Galway, Ireland, Ireland; Insight SFI Centre for Data Analytics, Data Science Institute, University of Galway, Galway, Ireland, Ireland|The vast majority of artificial intelligence practitioners overlook the importance of documentation when building and publishing models and datasets. 
However, due to the recent trend in the explainability and fairness of AI models, several frameworks have been proposed such as Model Cards, and Data Cards, among others, to help in the appropriate re-usage of those models and datasets. In addition, because of the introduction of the dataspace concept for similar datasets in one place, there is potential that similar Model Cards, Data Cards, Service Cards, and Dataspace Cards can be linked to extract helpful information for better decision-making about which model and data can be used for a specific application. This paper reviews the case for considering a Semantic Web approach for exchanging Model/Data Cards as Linked Data or knowledge graphs in a dataspace, making them machine-readable. We discuss the basic concepts and propose a schema for linking Data Cards and Model Cards within a dataspace. In addition, we introduce the concept of a dataspace card which can be a starting point for extracting knowledge about models and datasets in a dataspace. This helps in building trust and reuse of models and data among companies and individuals participating as publishers or consumers of such assets.|绝大多数人工智能从业者在构建和发布模型和数据集时忽视了文档的重要性。然而,由于人工智能模型的可解释性和公平性最近的趋势,已经提出了几个框架,如模型卡和数据卡等,以帮助这些模型和数据集的适当重用。此外,由于在一个地方引入了类似数据集的数据空间概念,可以将类似的模型卡、数据卡、服务卡和数据空间卡联系起来,提取有用的信息,以便更好地决定哪些模型和数据可用于特定应用。本文回顾了在数据空间中考虑语义 Web 方法将模型/数据卡交换为链接数据或知识图的情况,从而使它们具有机器可读性。我们讨论了基本概念,并提出了一个在数据空间中连接数据卡和模型卡的模式。此外,我们还介绍了数据空间卡的概念,它可以作为提取数据空间中模型和数据集知识的起点。这有助于在作为此类资产的发布者或消费者参与的公司和个人之间建立模型和数据的信任和重用。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Towards+a+Semantic+Approach+for+Linked+Dataspace,+Model+and+Data+Cards)|0| -|[Semantics in Dataspaces: Origin and Future Directions](https://doi.org/10.1145/3543873.3587689)|Johannes TheissenLipp, Max Kocher, Christoph Lange, Stefan Decker, Alexander Paulus, André Pomp, Edward Curry|University of Galway, Ireland; RWTH Aachen University, Germany; RWTH Aachen University, Germany and Fraunhofer Institute for Applied Information Technology FIT, Germany; University of Wuppertal, Germany|The term dataspace was coined two decades ago [12] and has evolved since then. Definitions range from (i) an abstraction for data management in an identifiable scope [15] over (iii) a multi-sided data platform connecting participants in an ecosystem [21] to (iii) interlinking data towards loosely connected (global) information [17]. Many implementations and scientific notions follow different interpretations of the term dataspace, but agree on some use of semantic technologies. For example, dataspaces such as the European Open Science Cloud and the German National Research Data Infrastructure are committed to applying the FAIR principles [11, 16]. Dataspaces built on top of Gaia-X are using semantic methods for service Self-Descriptions [13]. 
This paper investigates ongoing dataspace efforts and aims to provide insights on the definition of the term dataspace, the usage of semantics and FAIR principles, and future directions for the role of semantics in dataspaces.|数据空间这个术语是在20年前创造出来的,并且从那时起一直在进化。定义范围从(i)可识别范围内的数据管理的抽象[15]到(iii)连接生态系统参与者的多边数据平台[21]到(iii)将数据互联到松散连接的(全球)信息[17]。许多实现和科学概念遵循对数据空间这一术语的不同解释,但对语义技术的某些使用达成了一致。例如,欧洲开放科学云和德国国家研究数据基础设施等数据空间致力于应用 FAIR 原则[11,16]。建立在 Gaia-X 之上的数据空间正在使用语义方法进行服务自我描述[13]。本文研究了正在进行的数据空间工作,旨在提供对术语数据空间的定义、语义和 FAIR 原则的使用以及语义在数据空间中的作用的未来方向的见解。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Semantics+in+Dataspaces:+Origin+and+Future+Directions)|0| +|[Semantics in Dataspaces: Origin and Future Directions](https://doi.org/10.1145/3543873.3587689)|Johannes TheissenLipp, Max Kocher, Christoph Lange, Stefan Decker, Alexander Paulus, André Pomp, Edward Curry|RWTH Aachen University, Germany and Fraunhofer Institute for Applied Information Technology FIT, Germany; University of Wuppertal, Germany; University of Galway, Ireland; RWTH Aachen University, Germany|The term dataspace was coined two decades ago [12] and has evolved since then. Definitions range from (i) an abstraction for data management in an identifiable scope [15] over (ii) a multi-sided data platform connecting participants in an ecosystem [21] to (iii) interlinking data towards loosely connected (global) information [17]. Many implementations and scientific notions follow different interpretations of the term dataspace, but agree on some use of semantic technologies. For example, dataspaces such as the European Open Science Cloud and the German National Research Data Infrastructure are committed to applying the FAIR principles [11, 16]. Dataspaces built on top of Gaia-X are using semantic methods for service Self-Descriptions [13]. This paper investigates ongoing dataspace efforts and aims to provide insights on the definition of the term dataspace, the usage of semantics and FAIR principles, and future directions for the role of semantics in dataspaces.|数据空间这个术语是在20年前创造出来的,并且从那时起一直在进化。定义范围从(i)可识别范围内的数据管理的抽象[15],经(ii)连接生态系统参与者的多边数据平台[21],到(iii)将数据互联到松散连接的(全球)信息[17]。许多实现和科学概念遵循对数据空间这一术语的不同解释,但对语义技术的某些使用达成了一致。例如,欧洲开放科学云和德国国家研究数据基础设施等数据空间致力于应用 FAIR 原则[11,16]。建立在 Gaia-X 之上的数据空间正在使用语义方法进行服务自我描述[13]。本文研究了正在进行的数据空间工作,旨在提供对术语数据空间的定义、语义和 FAIR 原则的使用以及语义在数据空间中的作用的未来方向的见解。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Semantics+in+Dataspaces:+Origin+and+Future+Directions)|0| |[Efficient Sampling for Big Provenance](https://doi.org/10.1145/3543873.3587556)|Sara Moshtaghi Largani, Seokki Lee|University of Cincinnati, USA|Provenance has been studied extensively to explain existing and missing results for many applications while focusing on scalability and usability challenges. Recently, techniques that efficiently compute a compact representation of provenance have been introduced. In this work, we introduce a practical solution that computes a sample of provenance for existing results without computing full provenance. Our technique computes a sample of provenance based on the distribution of provenance wrt the query result that is estimated from the distribution of input data while considering the correlation among the data.
The preliminary evaluation demonstrates that, compared to the naive approach, our method efficiently computes a sample of (large size of) provenance with low errors.|数据来源(provenance)已被广泛研究,以解释许多应用程序的现有结果和缺失结果,同时关注可伸缩性和可用性方面的挑战。最近,有效地计算来源的紧凑表示的技术已经被引入。在这项工作中,我们介绍了一个实用的解决方案,可以在不计算完整来源的情况下计算现有结果的来源样本。该方法基于来源相对于查询结果的分布来计算来源样本,这一分布是在考虑数据间相关性的情况下由输入数据的分布估计得到的。初步评估表明,与朴素方法相比,我们的方法能够以较低的误差高效地计算出(大规模的)来源样本。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Efficient+Sampling+for+Big+Provenance)|0| |[Provenance Tracking for End-to-End Machine Learning Pipelines](https://doi.org/10.1145/3543873.3587557)|Stefan Grafberger, Paul Groth, Sebastian Schelter|University of Amsterdam, Netherlands|No abstract available.|没有摘要。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Provenance+Tracking+for+End-to-End+Machine+Learning+Pipelines)|0| -|[SeeGera: Self-supervised Semi-implicit Graph Variational Auto-encoders with Masking](https://doi.org/10.1145/3543507.3583245)|Xiang Li, Tiandi Ye, Caihua Shan, Dongsheng Li, Ming Gao|East China Normal University, China; Microsoft Research Asia, China|Generative graph self-supervised learning (SSL) aims to learn node representations by reconstructing the input graph data. However, most existing methods focus on unsupervised learning tasks only and very few work has shown its superiority over the state-of-the-art graph contrastive learning (GCL) models, especially on the classification task. While a very recent model has been proposed to bridge the gap, its performance on unsupervised learning tasks is still unknown. In this paper, to comprehensively enhance the performance of generative graph SSL against other GCL models on both unsupervised and supervised learning tasks, we propose the SeeGera model, which is based on the family of self-supervised variational graph auto-encoder (VGAE). Specifically, SeeGera adopts the semi-implicit variational inference framework, a hierarchical variational framework, and mainly focuses on feature reconstruction and structure/feature masking. On the one hand, SeeGera co-embeds both nodes and features in the encoder and reconstructs both links and features in the decoder. Since feature embeddings contain rich semantic information on features, they can be combined with node embeddings to provide fine-grained knowledge for feature reconstruction. On the other hand, SeeGera adds an additional layer for structure/feature masking to the hierarchical variational framework, which boosts the model generalizability. We conduct extensive experiments comparing SeeGera with 9 other state-of-the-art competitors.
Our results show that SeeGera can compare favorably against other state-of-the-art GCL methods in a variety of unsupervised and supervised learning tasks.|生成图自监督学习(SSL)的目的是通过重构输入图数据来学习节点表示。然而,大多数现有的方法只关注非监督式学习任务,很少有研究显示其优于最先进的图形对比学习(GCL)模型,特别是在分类任务上。虽然最近有人提出了一个模型来弥补这一差距,但它在非监督式学习任务上的表现仍然是未知的。为了全面提高生成图 SSL 在无监督任务和监督式学习任务中相对于其他 GCL 模型的性能,我们提出了基于自监督变分图自动编码器(VGAE)系列的 SeeGera 模型。具体来说,SeeGera 采用了半隐式变分推理框架(一种层次化的变分框架),主要研究特征重构和结构/特征掩蔽。一方面,SeeGera 在编码器中共同嵌入节点和特征,并在解码器中重构链路和特征。由于特征嵌入包含丰富的特征语义信息,因此它们可以与节点嵌入相结合,为特征重构提供细粒度的知识。另一方面,SeeGera 为层次变分框架增加了一个结构/特征屏蔽层,从而提高了模型的通用性。我们进行了广泛的实验,将 SeeGera 与其他9种最先进的竞争方法进行比较。我们的研究结果表明,SeeGera 可以在各种无监督和有监督学习任务中与其他最先进的 GCL 方法相媲美。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=SeeGera:+Self-supervised+Semi-implicit+Graph+Variational+Auto-encoders+with+Masking)|0| -|[Lightweight source localization for large-scale social networks](https://doi.org/10.1145/3543507.3583299)|Zhen Wang, Dongpeng Hou, Chao Gao, Xiaoyu Li, Xuelong Li|School of Artificial Intelligence, Optics and Electronics (iOPEN), Northwestern Polytechnical University, China; Northwestern Polytechnical University, China|The rapid diffusion of hazardous information in large-flow-based social media causes great economic losses and potential threats to society. It is crucial to infer the inner information source as early as possible to prevent further losses. However, existing localization methods wait until all deployed sensors obtain propagation information before starting source inference within a network, and hence the best opportunity to control propagation is missed. In this paper, we propose a new localization strategy based on finite deployed sensors, named Greedy-coverage-based Rapid Source Localization (GRSL), to rapidly, flexibly and accurately infer the source in the early propagation stage of large-scale networks. There are two phases in GRSL. In the first phase, the Greedy-based Strategy (GS) greedily deploys sensors to rapidly achieve wide area coverage at a low cost. In the second phase, when a propagation event within a network is observed by a part of the sensors, the Inference Strategy (IS) with an earlier response mechanism begins executing the source inference task in an earlier small infected area.
Comprehensive experiments with the SOTA methods demonstrate the superior performance and robustness of GRSL in various application scenarios.|危险信息在基于大流量的社交媒体上的快速传播给社会造成了巨大的经济损失和潜在的威胁。为了防止进一步的损失,尽早推断出内部信息源至关重要。然而,现有的定位方法要等到所有部署的传感器获得传播信息后,才能在网络中开始源推理,因此错过了控制传播的最佳机会。提出了一种基于有限部署传感器的快速信源定位策略,即基于贪婪覆盖的快速信源定位(GRSL)策略,可以在大规模网络传播的早期阶段快速、灵活、准确地推断信源。GRSL 有两个阶段。在第一阶段,基于贪婪策略(GS)贪婪地部署传感器,以低成本快速实现广域覆盖。在第二个阶段,当一部分传感器观察到网络中的传播事件时,具有早期响应机制的推理策略(IS)开始在早期的小受感染区域中执行源推理任务。通过 SOTA 方法的综合实验,证明了 GRSL 在各种应用场景下的优越性能和鲁棒性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Lightweight+source+localization+for+large-scale+social+networks)|0| -|[xGCN: An Extreme Graph Convolutional Network for Large-scale Social Link Prediction](https://doi.org/10.1145/3543507.3583340)|Xiran Song, Jianxun Lian, Hong Huang, Zihan Luo, Wei Zhou, Xue Lin, Mingqi Wu, Chaozhuo Li, Xing Xie, Hai Jin|Microsoft Gaming, USA; National Engineering Research Center for Big Data Technology and System, Services Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, China; Microsoft Research Asia, China|Graph neural networks (GNNs) have seen widespread usage across multiple real-world applications, yet in transductive learning, they still face challenges in accuracy, efficiency, and scalability, due to the extensive number of trainable parameters in the embedding table and the paradigm of stacking neighborhood aggregations. This paper presents a novel model called xGCN for large-scale network embedding, which is a practical solution for link predictions. xGCN addresses these issues by encoding graph-structure data in an extreme convolutional manner, and has the potential to push the performance of network embedding-based link predictions to a new record. Specifically, instead of assigning each node with a directly learnable embedding vector, xGCN regards node embeddings as static features. It uses a propagation operation to smooth node embeddings and relies on a Refinement neural Network (RefNet) to transform the coarse embeddings derived from the unsupervised propagation into new ones that optimize a training objective. The output of RefNet, which are well-refined embeddings, will replace the original node embeddings. This process is repeated iteratively until the model converges to a satisfying status. 
Experiments on three social network datasets with link prediction tasks show that xGCN not only achieves the best accuracy compared with a series of competitive baselines but also is highly efficient and scalable.|图形神经网络(GNN)在多种现实应用中得到了广泛的应用,然而在传导学习中,由于嵌入表中大量的可训练参数和邻域聚合堆叠的范式,它们在精度、效率和可扩展性方面仍然面临挑战。提出了一种新的大规模网络嵌入模型 xGCN,该模型是链路预测的一种实用解决方案。XGCN 通过以极端卷积方式编码图形结构数据来解决这些问题,并且有可能将基于网络嵌入的链接预测的性能提升到一个新的记录。具体来说,xGCN 没有为每个节点分配一个可直接学习的嵌入向量,而是将节点嵌入视为静态特征。该算法利用传播操作来平滑节点嵌入,并依靠细化神经网络(RefNet)将无监督传播产生的粗嵌入转化为优化训练目标的新嵌入。RefNet 的输出是经过良好改进的嵌入,它将取代原始节点嵌入。这个过程反复重复,直到模型收敛到一个令人满意的状态。通过对三个具有链接预测任务的社会网络数据集的实验表明,xGCN 不仅比一系列竞争性基线获得了最佳的预测精度,而且具有高效性和可扩展性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=xGCN:+An+Extreme+Graph+Convolutional+Network+for+Large-scale+Social+Link+Prediction)|0| -|[GraphPrompt: Unifying Pre-Training and Downstream Tasks for Graph Neural Networks](https://doi.org/10.1145/3543507.3583386)|Zemin Liu, Xingtong Yu, Yuan Fang, Xinming Zhang|Singapore Management University, Singapore; National University of Singapore, Singapore; University of Science and Technology of China, China|Graphs can model complex relationships between objects, enabling a myriad of Web applications such as online page/article classification and social recommendation. While graph neural networks (GNNs) have emerged as a powerful tool for graph representation learning, in an end-to-end supervised setting, their performance heavily relies on a large amount of task-specific supervision. To reduce labeling requirement, the "pre-train, fine-tune" and "pre-train, prompt" paradigms have become increasingly common. In particular, prompting is a popular alternative to fine-tuning in natural language processing, which is designed to narrow the gap between pre-training and downstream objectives in a task-specific manner. However, existing study of prompting on graphs is still limited, lacking a universal treatment to appeal to different downstream tasks. In this paper, we propose GraphPrompt, a novel pre-training and prompting framework on graphs. GraphPrompt not only unifies pre-training and downstream tasks into a common task template, but also employs a learnable prompt to assist a downstream task in locating the most relevant knowledge from the pre-train model in a task-specific manner. Finally, we conduct extensive experiments on five public datasets to evaluate and analyze GraphPrompt.|图形可以模拟对象之间的复杂关系,支持无数的 Web 应用程序,例如在线页面/文章分类和社交推荐。尽管图神经网络(GNN)已经成为图表示学习的有力工具,但在端到端的监督环境下,它们的性能很大程度上依赖于大量的任务特定监督。为了减少标签要求,“预训练,微调”和“预训练,提示”的范例已经变得越来越普遍。特别是,提示是自然语言处理中微调的一种流行替代方法,其目的是以特定于任务的方式缩小预训练和下游目标之间的差距。然而,现有的图上提示研究仍然有限,缺乏能够适用于不同下游任务的通用处理方式。本文提出了一种新的图形预训练和提示框架 GraphPrompt。GraphPrompt 不仅将预训练和下游任务统一到一个共同的任务模板中,而且还使用可学习的提示来帮助下游任务以特定于任务的方式从预训练模型中找到最相关的知识。最后,我们在五个公共数据集上进行了广泛的实验来评估和分析 GraphPrompt。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=GraphPrompt:+Unifying+Pre-Training+and+Downstream+Tasks+for+Graph+Neural+Networks)|0| -|[FedACK: Federated Adversarial Contrastive Knowledge Distillation for Cross-Lingual and Cross-Model Social Bot Detection](https://doi.org/10.1145/3543507.3583500)|Yingguang Yang, Renyu Yang, Hao Peng, Yangyang Li, Tong Li, Yong Liao, Pengyuan Zhou|NERC-RPP, CAEIT, China; Beihang University, China; University of Science and Technology of China, China; University of Leeds, United Kingdom; Tsinghua University, China|Social bot detection is of paramount importance to the resilience and security of online social platforms.
The state-of-the-art detection models are siloed and have largely overlooked a variety of data characteristics from multiple cross-lingual platforms. Meanwhile, the heterogeneity of data distribution and model architecture makes it intricate to devise an efficient cross-platform and cross-model detection framework. In this paper, we propose FedACK, a new federated adversarial contrastive knowledge distillation framework for social bot detection. We devise a GAN-based federated knowledge distillation mechanism for efficiently transferring knowledge of data distribution among clients. In particular, a global generator is used to extract the knowledge of global data distribution and distill it into each client's local model. We leverage local discriminator to enable customized model design and use local generator for data enhancement with hard-to-decide samples. Local training is conducted as multi-stage adversarial and contrastive learning to enable consistent feature spaces among clients and to constrain the optimization direction of local models, reducing the divergences between local and global models. Experiments demonstrate that FedACK outperforms the state-of-the-art approaches in terms of accuracy, communication efficiency, and feature space consistency.|社交机器人检测对于在线社交平台的弹性和安全性至关重要。最先进的检测模型是孤立的,在很大程度上忽略了来自多个跨语言平台的各种数据特征。同时,数据分布和模型结构的异构性使得设计一个有效的跨平台、跨模型检测框架变得复杂。在本文中,我们提出了一个新的联邦对抗性对比知识提取框架 FedACK,用于社会机器人检测。设计了一种基于 GAN 的联邦知识提取机制,用于在客户端之间有效地传递数据分布的知识。特别地,全局生成器用于提取全局数据分布的知识,并将其提取到每个客户机的本地模型中。我们利用局部鉴别器来实现定制的模型设计,并使用局部生成器对难以确定的样本进行数据增强。局部训练作为多阶段对抗性和对比性学习进行,以使客户之间的特征空间保持一致,并约束局部模型的优化方向,减少局部模型和全局模型之间的差异。实验结果表明,FedACK 算法在准确性、通信效率和特征空间一致性方面优于目前最先进的算法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=FedACK:+Federated+Adversarial+Contrastive+Knowledge+Distillation+for+Cross-Lingual+and+Cross-Model+Social+Bot+Detection)|0| -|[Self-training through Classifier Disagreement for Cross-Domain Opinion Target Extraction](https://doi.org/10.1145/3543507.3583325)|Kai Sun, Richong Zhang, Samuel Mensah, Nikolaos Aletras, Yongyi Mao, Xudong Liu|; SKLSDE, School of Computer Science and Engineering, Beihang University, China; Computer Science Department, University of Sheffield, UK, United Kingdom; School of Electrical Engineering and Computer Science, University of Ottawa, Canada|Opinion target extraction (OTE) or aspect extraction (AE) is a fundamental task in opinion mining that aims to extract the targets (or aspects) on which opinions have been expressed. Recent work focus on cross-domain OTE, which is typically encountered in real-world scenarios, where the testing and training distributions differ. Most methods use domain adversarial neural networks that aim to reduce the domain gap between the labelled source and unlabelled target domains to improve target domain performance. However, this approach only aligns feature distributions and does not account for class-wise feature alignment, leading to suboptimal results. Semi-supervised learning (SSL) has been explored as a solution, but is limited by the quality of pseudo-labels generated by the model. Inspired by the theoretical foundations in domain adaptation [2], we propose a new SSL approach that opts for selecting target samples whose model output from a domain-specific teacher and student network disagree on the unlabelled target data, in an effort to boost the target domain performance. 
Extensive experiments on benchmark cross-domain OTE datasets show that this approach is effective and performs consistently well in settings with large domain shifts.|意见目标提取(OTE)或方面提取(AE)是意见挖掘中的一个基本任务,其目的是提取表达意见的目标(或方面)。最近的工作集中在跨域 OTE 上,这是在现实世界的场景中经常遇到的问题,其中测试和训练的分布是不同的。大多数方法使用域对抗神经网络,目的是减少标记源和未标记目标域之间的域差,以提高目标域的性能。然而,这种方法只对齐了特征分布,没有考虑类别特征对齐,导致次优结果。半监督学习(SSL)作为一种解决方案,但受到模型生成的伪标签质量的限制。受领域适应理论基础[2]的启发,我们提出了一种新的 SSL 方法,选择领域特定的教师和学生网络的模型输出与未标记的目标数据不一致的目标样本,以提高目标领域的性能。在基准跨域 OTE 数据集上的大量实验表明,该方法是有效的,并且在具有较大域移位的情况下表现一致。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Self-training+through+Classifier+Disagreement+for+Cross-Domain+Opinion+Target+Extraction)|0| -|[Fast and Multi-aspect Mining of Complex Time-stamped Event Streams](https://doi.org/10.1145/3543507.3583370)|Kota Nakamura, Yasuko Matsubara, Koki Kawabata, Yuhei Umeda, Yuichiro Wada, Yasushi Sakurai|AI Lab., Fujitsu, Japan; SANKEN, Osaka University, Japan; AI Lab., Fujitsu, Japan and AIP, RIKEN, Japan|Given a huge, online stream of time-evolving events with multiple attributes, such as online shopping logs: (item, price, brand, time), and local mobility activities: (pick-up and drop-off locations, time), how can we summarize large, dynamic high-order tensor streams? How can we see any hidden patterns, rules, and anomalies? Our answer is to focus on two types of patterns, i.e., ''regimes'' and ''components'', for which we present CubeScope, an efficient and effective method over high-order tensor streams. Specifically, it identifies any sudden discontinuity and recognizes distinct dynamical patterns, ''regimes'' (e.g., weekday/weekend/holiday patterns). In each regime, it also performs multi-way summarization for all attributes (e.g., item, price, brand, and time) and discovers hidden ''components'' representing latent groups (e.g., item/brand groups) and their relationship. Thanks to its concise but effective summarization, CubeScope can also detect the sudden appearance of anomalies and identify the types of anomalies that occur in practice. Our proposed method has the following properties: (a) Effective: it captures dynamical multi-aspect patterns, i.e., regimes and components, and statistically summarizes all the events; (b) General: it is practical for successful application to data compression, pattern discovery, and anomaly detection on various types of tensor streams; (c) Scalable: our algorithm does not depend on the length of the data stream and its dimensionality. 
Extensive experiments on real datasets demonstrate that CubeScope finds meaningful patterns and anomalies correctly, and consistently outperforms the state-of-the-art methods as regards accuracy and execution speed.|给定一个巨大的,在线时间演化事件流,具有多种属性,如在线购物日志: (项目,价格,品牌,时间) ,和本地流动性活动: (上下车地点,时间) ,我们如何总结大型,动态高阶张量流?我们怎么才能看到隐藏的模式,规则和异常呢?我们的答案是关注两种类型的模式,即“体制”和“组件”,对于这两种模式,我们提出 CubeScope,一种高阶张量流上的高效和有效的方法。具体来说,它识别任何突然的不连续性,并识别不同的动态模式,“制度”(例如,工作日/周末/假日模式)。在每个体系中,它还对所有属性(例如,商品、价格、品牌和时间)进行多方面的总结,并发现代表潜在群体(例如,商品/品牌群体)及其关系的隐藏“组成部分”。由于其简洁而有效的总结,CubeScope 还可以检测异常的突然出现并识别实际发生的异常类型。我们提出的方法具有以下特点: (a)有效: 它捕获动态的多方面模式,即制度和组成部分,并对所有事件进行统计总结; (b)一般情况: 成功应用于各种类型的张量流的数据压缩、模式发现和异常检测是可行的; (c)可扩展性: 我们的算法不依赖于数据流的长度及其维度。在真实数据集上的大量实验表明,CubeScope 能够正确地发现有意义的模式和异常,并且在准确性和执行速度方面始终优于最先进的方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Fast+and+Multi-aspect+Mining+of+Complex+Time-stamped+Event+Streams)|0| +|[SeeGera: Self-supervised Semi-implicit Graph Variational Auto-encoders with Masking](https://doi.org/10.1145/3543507.3583245)|Xiang Li, Tiandi Ye, Caihua Shan, Dongsheng Li, Ming Gao|Microsoft Research Asia, China; East China Normal University, China|Generative graph self-supervised learning (SSL) aims to learn node representations by reconstructing the input graph data. However, most existing methods focus on unsupervised learning tasks only and very few work has shown its superiority over the state-of-the-art graph contrastive learning (GCL) models, especially on the classification task. While a very recent model has been proposed to bridge the gap, its performance on unsupervised learning tasks is still unknown. In this paper, to comprehensively enhance the performance of generative graph SSL against other GCL models on both unsupervised and supervised learning tasks, we propose the SeeGera model, which is based on the family of self-supervised variational graph auto-encoder (VGAE). Specifically, SeeGera adopts the semi-implicit variational inference framework, a hierarchical variational framework, and mainly focuses on feature reconstruction and structure/feature masking. On the one hand, SeeGera co-embeds both nodes and features in the encoder and reconstructs both links and features in the decoder. Since feature embeddings contain rich semantic information on features, they can be combined with node embeddings to provide fine-grained knowledge for feature reconstruction. On the other hand, SeeGera adds an additional layer for structure/feature masking to the hierarchical variational framework, which boosts the model generalizability. We conduct extensive experiments comparing SeeGera with 9 other state-of-the-art competitors. 
Our results show that SeeGera can compare favorably against other state-of-the-art GCL methods in a variety of unsupervised and supervised learning tasks.|生成图自监督学习(SSL)的目的是通过重构输入图数据来学习节点表示。然而,大多数现有的方法只关注非监督式学习任务,很少有研究显示其优于最先进的图形对比学习(GCL)模型,特别是在分类任务上。虽然最近有人提出了一个模型来弥补这一差距,但它在非监督式学习任务上的表现仍然是未知的。为了全面提高生成图 SSL 在无监督任务和监督式学习任务中相对于其他 GCL 模型的性能,我们提出了基于自监督变分图自动编码器(VGAE)系列的 SeeGera 模型。具体来说,SeeGera 采用了半隐式变分推理框架(一种层次化的变分框架),主要研究特征重构和结构/特征掩蔽。一方面,SeeGera 在编码器中共同嵌入节点和特征,并在解码器中重构链路和特征。由于特征嵌入包含丰富的特征语义信息,因此它们可以与节点嵌入相结合,为特征重构提供细粒度的知识。另一方面,SeeGera 为层次变分框架增加了一个结构/特征屏蔽层,从而提高了模型的通用性。我们进行了广泛的实验,将 SeeGera 与其他9种最先进的竞争方法进行比较。我们的研究结果表明,SeeGera 可以在各种无监督和有监督学习任务中与其他最先进的 GCL 方法相媲美。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=SeeGera:+Self-supervised+Semi-implicit+Graph+Variational+Auto-encoders+with+Masking)|0| +|[Lightweight source localization for large-scale social networks](https://doi.org/10.1145/3543507.3583299)|Zhen Wang, Dongpeng Hou, Chao Gao, Xiaoyu Li, Xuelong Li|Northwestern Polytechnical University, China; School of Artificial Intelligence, Optics and Electronics (iOPEN), Northwestern Polytechnical University, China|The rapid diffusion of hazardous information in large-flow-based social media causes great economic losses and potential threats to society. It is crucial to infer the inner information source as early as possible to prevent further losses. However, existing localization methods wait until all deployed sensors obtain propagation information before starting source inference within a network, and hence the best opportunity to control propagation is missed. In this paper, we propose a new localization strategy based on finite deployed sensors, named Greedy-coverage-based Rapid Source Localization (GRSL), to rapidly, flexibly and accurately infer the source in the early propagation stage of large-scale networks. There are two phases in GRSL. In the first phase, the Greedy-based Strategy (GS) greedily deploys sensors to rapidly achieve wide area coverage at a low cost. In the second phase, when a propagation event within a network is observed by a part of the sensors, the Inference Strategy (IS) with an earlier response mechanism begins executing the source inference task in an earlier small infected area.
Comprehensive experiments with the SOTA methods demonstrate the superior performance and robustness of GRSL in various application scenarios.|危险信息在基于大流量的社交媒体上的快速传播给社会造成了巨大的经济损失和潜在的威胁。为了防止进一步的损失,尽早推断出内部信息源至关重要。然而,现有的定位方法要等到所有部署的传感器获得传播信息后,才能在网络中开始源推理,因此错过了控制传播的最佳机会。提出了一种基于有限部署传感器的快速信源定位策略,即基于贪婪覆盖的快速信源定位(GRSL)策略,可以在大规模网络传播的早期阶段快速、灵活、准确地推断信源。GRSL 有两个阶段。在第一阶段,基于贪婪策略(GS)贪婪地部署传感器,以低成本快速实现广域覆盖。在第二个阶段,当一部分传感器观察到网络中的传播事件时,具有早期响应机制的推理策略(IS)开始在早期的小受感染区域中执行源推理任务。通过 SOTA 方法的综合实验,证明了 GRSL 在各种应用场景下的优越性能和鲁棒性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Lightweight+source+localization+for+large-scale+social+networks)|0| +|[xGCN: An Extreme Graph Convolutional Network for Large-scale Social Link Prediction](https://doi.org/10.1145/3543507.3583340)|Xiran Song, Jianxun Lian, Hong Huang, Zihan Luo, Wei Zhou, Xue Lin, Mingqi Wu, Chaozhuo Li, Xing Xie, Hai Jin|Microsoft Research Asia, China; National Engineering Research Center for Big Data Technology and System, Services Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, China; Microsoft Gaming, USA|Graph neural networks (GNNs) have seen widespread usage across multiple real-world applications, yet in transductive learning, they still face challenges in accuracy, efficiency, and scalability, due to the extensive number of trainable parameters in the embedding table and the paradigm of stacking neighborhood aggregations. This paper presents a novel model called xGCN for large-scale network embedding, which is a practical solution for link predictions. xGCN addresses these issues by encoding graph-structure data in an extreme convolutional manner, and has the potential to push the performance of network embedding-based link predictions to a new record. Specifically, instead of assigning each node with a directly learnable embedding vector, xGCN regards node embeddings as static features. It uses a propagation operation to smooth node embeddings and relies on a Refinement neural Network (RefNet) to transform the coarse embeddings derived from the unsupervised propagation into new ones that optimize a training objective. The output of RefNet, which are well-refined embeddings, will replace the original node embeddings. This process is repeated iteratively until the model converges to a satisfying status. 
Experiments on three social network datasets with link prediction tasks show that xGCN not only achieves the best accuracy compared with a series of competitive baselines but also is highly efficient and scalable.|图形神经网络(GNN)在多种现实应用中得到了广泛的应用,然而在直推学习中,由于嵌入表中大量的可训练参数和邻域聚合堆叠的范式,它们在精度、效率和可扩展性方面仍然面临挑战。提出了一种新的大规模网络嵌入模型 xGCN,该模型是链路预测的一种实用解决方案。xGCN 通过以极端卷积方式编码图形结构数据来解决这些问题,并且有可能将基于网络嵌入的链接预测的性能提升到一个新的记录。具体来说,xGCN 没有为每个节点分配一个可直接学习的嵌入向量,而是将节点嵌入视为静态特征。该算法利用传播操作来平滑节点嵌入,并依靠细化神经网络(RefNet)将无监督传播产生的粗嵌入转化为优化训练目标的新嵌入。RefNet 的输出是经过良好改进的嵌入,它将取代原始节点嵌入。这个过程反复重复,直到模型收敛到一个令人满意的状态。通过对三个具有链接预测任务的社会网络数据集的实验表明,xGCN 不仅比一系列竞争性基线获得了最佳的预测精度,而且具有高效性和可扩展性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=xGCN:+An+Extreme+Graph+Convolutional+Network+for+Large-scale+Social+Link+Prediction)|0| +|[GraphPrompt: Unifying Pre-Training and Downstream Tasks for Graph Neural Networks](https://doi.org/10.1145/3543507.3583386)|Zemin Liu, Xingtong Yu, Yuan Fang, Xinming Zhang|National University of Singapore, Singapore; University of Science and Technology of China, China; Singapore Management University, Singapore|Graphs can model complex relationships between objects, enabling a myriad of Web applications such as online page/article classification and social recommendation. While graph neural networks (GNNs) have emerged as a powerful tool for graph representation learning, in an end-to-end supervised setting, their performance heavily relies on a large amount of task-specific supervision. To reduce the labeling requirement, the "pre-train, fine-tune" and "pre-train, prompt" paradigms have become increasingly common. In particular, prompting is a popular alternative to fine-tuning in natural language processing, which is designed to narrow the gap between pre-training and downstream objectives in a task-specific manner. However, existing study of prompting on graphs is still limited, lacking a universal treatment to appeal to different downstream tasks. In this paper, we propose GraphPrompt, a novel pre-training and prompting framework on graphs. GraphPrompt not only unifies pre-training and downstream tasks into a common task template, but also employs a learnable prompt to assist a downstream task in locating the most relevant knowledge from the pre-trained model in a task-specific manner. Finally, we conduct extensive experiments on five public datasets to evaluate and analyze GraphPrompt.|图形可以模拟对象之间的复杂关系,支持无数的 Web 应用程序,例如在线页面/文章分类和社交推荐。尽管图神经网络(GNN)已经成为图表示学习的有力工具,但在端到端的监督环境下,它们的性能很大程度上依赖于大量的任务特定监督。为了减少标注要求,“预训练,微调”和“预训练,提示”的范式已经变得越来越普遍。特别是,提示是自然语言处理中微调的一种流行替代方法,其目的是以特定于任务的方式缩小预训练和下游目标之间的差距。然而,现有的图上提示研究仍然有限,缺乏能够适用于不同下游任务的通用处理方式。本文提出了一种新的图预训练和提示框架 GraphPrompt。GraphPrompt 不仅将预训练和下游任务统一到一个共同的任务模板中,而且还使用可学习的提示来帮助下游任务以特定于任务的方式从预训练模型中找到最相关的知识。最后,我们在五个公共数据集上进行了广泛的实验来评估和分析 GraphPrompt。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=GraphPrompt:+Unifying+Pre-Training+and+Downstream+Tasks+for+Graph+Neural+Networks)|0| +|[FedACK: Federated Adversarial Contrastive Knowledge Distillation for Cross-Lingual and Cross-Model Social Bot Detection](https://doi.org/10.1145/3543507.3583500)|Yingguang Yang, Renyu Yang, Hao Peng, Yangyang Li, Tong Li, Yong Liao, Pengyuan Zhou|Tsinghua University, China; University of Science and Technology of China, China; NERC-RPP, CAEIT, China; University of Leeds, United Kingdom; Beihang University, China|Social bot detection is of paramount importance to the resilience and security of online social platforms. 
The state-of-the-art detection models are siloed and have largely overlooked a variety of data characteristics from multiple cross-lingual platforms. Meanwhile, the heterogeneity of data distribution and model architecture makes it intricate to devise an efficient cross-platform and cross-model detection framework. In this paper, we propose FedACK, a new federated adversarial contrastive knowledge distillation framework for social bot detection. We devise a GAN-based federated knowledge distillation mechanism for efficiently transferring knowledge of data distribution among clients. In particular, a global generator is used to extract the knowledge of global data distribution and distill it into each client's local model. We leverage local discriminator to enable customized model design and use local generator for data enhancement with hard-to-decide samples. Local training is conducted as multi-stage adversarial and contrastive learning to enable consistent feature spaces among clients and to constrain the optimization direction of local models, reducing the divergences between local and global models. Experiments demonstrate that FedACK outperforms the state-of-the-art approaches in terms of accuracy, communication efficiency, and feature space consistency.|社交机器人检测对于在线社交平台的弹性和安全性至关重要。最先进的检测模型是孤立的,在很大程度上忽略了来自多个跨语言平台的各种数据特征。同时,数据分布和模型结构的异构性使得设计一个有效的跨平台、跨模型检测框架变得复杂。在本文中,我们提出了一个新的联邦对抗性对比知识提取框架 FedACK,用于社会机器人检测。设计了一种基于 GAN 的联邦知识提取机制,用于在客户端之间有效地传递数据分布的知识。特别地,全局生成器用于提取全局数据分布的知识,并将其提取到每个客户机的本地模型中。我们利用局部鉴别器来实现定制的模型设计,并使用局部生成器对难以确定的样本进行数据增强。局部训练作为多阶段对抗性和对比性学习进行,以使客户之间的特征空间保持一致,并约束局部模型的优化方向,减少局部模型和全局模型之间的差异。实验结果表明,FedACK 算法在准确性、通信效率和特征空间一致性方面优于目前最先进的算法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=FedACK:+Federated+Adversarial+Contrastive+Knowledge+Distillation+for+Cross-Lingual+and+Cross-Model+Social+Bot+Detection)|0| +|[Self-training through Classifier Disagreement for Cross-Domain Opinion Target Extraction](https://doi.org/10.1145/3543507.3583325)|Kai Sun, Richong Zhang, Samuel Mensah, Nikolaos Aletras, Yongyi Mao, Xudong Liu|; Computer Science Department, University of Sheffield, UK, United Kingdom; SKLSDE, School of Computer Science and Engineering, Beihang University, China; School of Electrical Engineering and Computer Science, University of Ottawa, Canada|Opinion target extraction (OTE) or aspect extraction (AE) is a fundamental task in opinion mining that aims to extract the targets (or aspects) on which opinions have been expressed. Recent work focus on cross-domain OTE, which is typically encountered in real-world scenarios, where the testing and training distributions differ. Most methods use domain adversarial neural networks that aim to reduce the domain gap between the labelled source and unlabelled target domains to improve target domain performance. However, this approach only aligns feature distributions and does not account for class-wise feature alignment, leading to suboptimal results. Semi-supervised learning (SSL) has been explored as a solution, but is limited by the quality of pseudo-labels generated by the model. Inspired by the theoretical foundations in domain adaptation [2], we propose a new SSL approach that opts for selecting target samples whose model output from a domain-specific teacher and student network disagree on the unlabelled target data, in an effort to boost the target domain performance. 
Extensive experiments on benchmark cross-domain OTE datasets show that this approach is effective and performs consistently well in settings with large domain shifts.|意见目标提取(OTE)或方面提取(AE)是意见挖掘中的一个基本任务,其目的是提取表达意见的目标(或方面)。最近的工作集中在跨域 OTE 上,这是在现实世界的场景中经常遇到的问题,其中测试和训练的分布是不同的。大多数方法使用域对抗神经网络,目的是减少标记源和未标记目标域之间的域差,以提高目标域的性能。然而,这种方法只对齐了特征分布,没有考虑类别特征对齐,导致次优结果。半监督学习(SSL)作为一种解决方案,但受到模型生成的伪标签质量的限制。受领域适应理论基础[2]的启发,我们提出了一种新的 SSL 方法,选择领域特定的教师和学生网络的模型输出与未标记的目标数据不一致的目标样本,以提高目标领域的性能。在基准跨域 OTE 数据集上的大量实验表明,该方法是有效的,并且在具有较大域移位的情况下表现一致。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Self-training+through+Classifier+Disagreement+for+Cross-Domain+Opinion+Target+Extraction)|0| +|[Fast and Multi-aspect Mining of Complex Time-stamped Event Streams](https://doi.org/10.1145/3543507.3583370)|Kota Nakamura, Yasuko Matsubara, Koki Kawabata, Yuhei Umeda, Yuichiro Wada, Yasushi Sakurai|SANKEN, Osaka University, Japan; AI Lab., Fujitsu, Japan; AI Lab., Fujitsu, Japan and AIP, RIKEN, Japan|Given a huge, online stream of time-evolving events with multiple attributes, such as online shopping logs: (item, price, brand, time), and local mobility activities: (pick-up and drop-off locations, time), how can we summarize large, dynamic high-order tensor streams? How can we see any hidden patterns, rules, and anomalies? Our answer is to focus on two types of patterns, i.e., ''regimes'' and ''components'', for which we present CubeScope, an efficient and effective method over high-order tensor streams. Specifically, it identifies any sudden discontinuity and recognizes distinct dynamical patterns, ''regimes'' (e.g., weekday/weekend/holiday patterns). In each regime, it also performs multi-way summarization for all attributes (e.g., item, price, brand, and time) and discovers hidden ''components'' representing latent groups (e.g., item/brand groups) and their relationship. Thanks to its concise but effective summarization, CubeScope can also detect the sudden appearance of anomalies and identify the types of anomalies that occur in practice. Our proposed method has the following properties: (a) Effective: it captures dynamical multi-aspect patterns, i.e., regimes and components, and statistically summarizes all the events; (b) General: it is practical for successful application to data compression, pattern discovery, and anomaly detection on various types of tensor streams; (c) Scalable: our algorithm does not depend on the length of the data stream and its dimensionality. 
Extensive experiments on real datasets demonstrate that CubeScope finds meaningful patterns and anomalies correctly, and consistently outperforms the state-of-the-art methods as regards accuracy and execution speed.|给定一个巨大的,在线时间演化事件流,具有多种属性,如在线购物日志: (项目,价格,品牌,时间) ,和本地流动性活动: (上下车地点,时间) ,我们如何总结大型,动态高阶张量流?我们怎么才能看到隐藏的模式,规则和异常呢?我们的答案是关注两种类型的模式,即“体制”和“组件”,对于这两种模式,我们提出 CubeScope,一种高阶张量流上的高效和有效的方法。具体来说,它识别任何突然的不连续性,并识别不同的动态模式,“制度”(例如,工作日/周末/假日模式)。在每个体系中,它还对所有属性(例如,商品、价格、品牌和时间)进行多方面的总结,并发现代表潜在群体(例如,商品/品牌群体)及其关系的隐藏“组成部分”。由于其简洁而有效的总结,CubeScope 还可以检测异常的突然出现并识别实际发生的异常类型。我们提出的方法具有以下特点: (a)有效: 它捕获动态的多方面模式,即制度和组成部分,并对所有事件进行统计总结; (b)一般情况: 成功应用于各种类型的张量流的数据压缩、模式发现和异常检测是可行的; (c)可扩展性: 我们的算法不依赖于数据流的长度及其维度。在真实数据集上的大量实验表明,CubeScope 能够正确地发现有意义的模式和异常,并且在准确性和执行速度方面始终优于最先进的方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Fast+and+Multi-aspect+Mining+of+Complex+Time-stamped+Event+Streams)|0| |[PDSum: Prototype-driven Continuous Summarization of Evolving Multi-document Sets Stream](https://doi.org/10.1145/3543507.3583371)|Susik Yoon, Hou Pong Chan, Jiawei Han|University of Illinois at Urbana-Champaign, USA; University of Macau, Macao|Summarizing text-rich documents has been long studied in the literature, but most of the existing efforts have been made to summarize a static and predefined multi-document set. With the rapid development of online platforms for generating and distributing text-rich documents, there arises an urgent need for continuously summarizing dynamically evolving multi-document sets where the composition of documents and sets is changing over time. This is especially challenging as the summarization should be not only effective in incorporating relevant, novel, and distinctive information from each concurrent multi-document set, but also efficient in serving online applications. In this work, we propose a new summarization problem, Evolving Multi-Document sets stream Summarization (EMDS), and introduce a novel unsupervised algorithm PDSum with the idea of prototype-driven continuous summarization. PDSum builds a lightweight prototype of each multi-document set and exploits it to adapt to new documents while preserving accumulated knowledge from previous documents. To update new summaries, the most representative sentences for each multi-document set are extracted by measuring their similarities to the prototypes. 
A thorough evaluation with real multi-document sets streams demonstrates that PDSum outperforms state-of-the-art unsupervised multi-document summarization algorithms in EMDS in terms of relevance, novelty, and distinctiveness and is also robust to various evaluation settings.|总结文本丰富的文档在文献中已经研究了很长时间,但现有的大多数努力都是总结一个静态的和预定义的多文档集。随着生成和分发文本丰富的文件的在线平台的迅速发展,迫切需要不断总结动态演变的多文件集,其中文件和文件集的组成随着时间的推移而变化。这尤其具有挑战性,因为摘要不仅应该有效地合并每个并发多文档集中的相关、新颖和独特信息,而且还应该有效地为在线应用程序提供服务。本文提出了一个新的文摘问题——进化多文档集流文摘(EMDS) ,并引入了一种基于原型驱动的连续文摘思想的无监督算法 PDSum。PDSum 为每个多文档集构建一个轻量级原型,并利用它来适应新文档,同时保留以前文档中积累的知识。为了更新新的摘要,通过测量每个多文档集合与原型的相似性来提取最有代表性的句子。对真实多文档集合流的全面评估表明,PDSum 在相关性、新颖性和独特性方面优于 EMDS 中最先进的无监督多文档摘要算法,并且对各种评估设置也具有鲁棒性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=PDSum:+Prototype-driven+Continuous+Summarization+of+Evolving+Multi-document+Sets+Stream)|0| -|[Learning Disentangled Representation via Domain Adaptation for Dialogue Summarization](https://doi.org/10.1145/3543507.3583389)|Jinpeng Li, Yingce Xia, Xin Cheng, Dongyan Zhao, Rui Yan|Wangxuan Institute of Computer Technology, Peking University, China; Gaoling School of Artificial Intelligence, Renmin University of China, China; Wangxuan Institute of Computer Technology, Peking University, China and National Key Laboratory of General Artificial Intelligence, Beijing Institute for General Artificial Intelligence, China; Microsoft Research, China|Dialogue summarization, which aims to generate a summary for an input dialogue, plays a vital role in intelligent dialogue systems. The end-to-end models have achieved satisfactory performance in summarization, but the success is built upon enough annotated data, which is costly to obtain, especially in the dialogue summarization. To leverage the rich external data, previous works first pre-train the model on the other domain data (e.g., the news domain), and then fine-tune it directly on the dialogue domain. The data from different domains are equally treated during the training process, while the vast differences between dialogues (usually informal, repetitive, and with multiple speakers) and conventional articles (usually formal and concise) are neglected. In this work, we propose to use a disentangled representation method to reduce the deviation between data in different domains, where the input data is disentangled into domain-invariant and domain-specific representations. The domain-invariant representation carries context information that is supposed to be the same across domains (e.g., news, dialogue) and the domain-specific representation indicates the input data belongs to a particular domain. We use adversarial learning and contrastive learning to constrain the disentangled representations to the target space. Furthermore, we propose two novel reconstruction strategies, namely backtracked and cross-track reconstructions, which aim to reduce the domain characteristics of out-of-domain data and mitigate the domain bias of the model. 
Experimental results on three public datasets show that our model significantly outperforms the strong baselines.|对话摘要是智能对话系统中的一个重要组成部分,其目的是为输入对话生成摘要。端到端模型在摘要方面取得了令人满意的效果,但是这种成功是建立在足够的注释数据的基础上的,而这些注释数据的获取成本很高,尤其是在对话摘要方面。为了利用丰富的外部数据,以前的工作首先在其他领域数据(例如,新闻领域)上预训练模型,然后直接在对话领域进行微调。不同领域的数据在训练过程中被平等对待,而对话(通常是非正式的、重复的、多人使用的)和传统文章(通常是正式的和简洁的)之间的巨大差异被忽略。在这项工作中,我们提出使用一个分离的表示方法,以减少不同领域的数据之间的偏差,其中输入数据被分离成领域不变和领域特定的表示。领域不变表示携带的上下文信息应该是相同的跨领域(例如,新闻,对话)和领域特定的表示表明输入数据属于一个特定的领域。我们使用对抗学习和对比学习来约束离散表征到目标空间。此外,我们提出了两种新的重建策略,即回溯重建和交叉跟踪重建,旨在减少域外数据的域特征和减轻模型的域偏差。在三个公共数据集上的实验结果表明,我们的模型明显优于强基线。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+Disentangled+Representation+via+Domain+Adaptation+for+Dialogue+Summarization)|0| +|[Learning Disentangled Representation via Domain Adaptation for Dialogue Summarization](https://doi.org/10.1145/3543507.3583389)|Jinpeng Li, Yingce Xia, Xin Cheng, Dongyan Zhao, Rui Yan|Wangxuan Institute of Computer Technology, Peking University, China and National Key Laboratory of General Artificial Intelligence, Beijing Institute for General Artificial Intelligence, China; Microsoft Research, China; Gaoling School of Artificial Intelligence, Renmin University of China, China; Wangxuan Institute of Computer Technology, Peking University, China|Dialogue summarization, which aims to generate a summary for an input dialogue, plays a vital role in intelligent dialogue systems. The end-to-end models have achieved satisfactory performance in summarization, but the success is built upon enough annotated data, which is costly to obtain, especially in the dialogue summarization. To leverage the rich external data, previous works first pre-train the model on the other domain data (e.g., the news domain), and then fine-tune it directly on the dialogue domain. The data from different domains are equally treated during the training process, while the vast differences between dialogues (usually informal, repetitive, and with multiple speakers) and conventional articles (usually formal and concise) are neglected. In this work, we propose to use a disentangled representation method to reduce the deviation between data in different domains, where the input data is disentangled into domain-invariant and domain-specific representations. The domain-invariant representation carries context information that is supposed to be the same across domains (e.g., news, dialogue) and the domain-specific representation indicates the input data belongs to a particular domain. We use adversarial learning and contrastive learning to constrain the disentangled representations to the target space. Furthermore, we propose two novel reconstruction strategies, namely backtracked and cross-track reconstructions, which aim to reduce the domain characteristics of out-of-domain data and mitigate the domain bias of the model. 
Experimental results on three public datasets show that our model significantly outperforms the strong baselines.|对话摘要是智能对话系统中的一个重要组成部分,其目的是为输入对话生成摘要。端到端模型在摘要方面取得了令人满意的效果,但是这种成功是建立在足够的注释数据的基础上的,而这些注释数据的获取成本很高,尤其是在对话摘要方面。为了利用丰富的外部数据,以前的工作首先在其他领域数据(例如,新闻领域)上预训练模型,然后直接在对话领域进行微调。不同领域的数据在训练过程中被平等对待,而对话(通常是非正式的、重复的、多人使用的)和传统文章(通常是正式的和简洁的)之间的巨大差异被忽略。在这项工作中,我们提出使用一个分离的表示方法,以减少不同领域的数据之间的偏差,其中输入数据被分离成领域不变和领域特定的表示。领域不变表示携带的上下文信息应该是相同的跨领域(例如,新闻,对话)和领域特定的表示表明输入数据属于一个特定的领域。我们使用对抗学习和对比学习来约束离散表征到目标空间。此外,我们提出了两种新的重建策略,即回溯重建和交叉跟踪重建,旨在减少域外数据的域特征和减轻模型的域偏差。在三个公共数据集上的实验结果表明,我们的模型明显优于强基线。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+Disentangled+Representation+via+Domain+Adaptation+for+Dialogue+Summarization)|0| |[Towards Understanding Consumer Healthcare Questions on the Web with Semantically Enhanced Contrastive Learning](https://doi.org/10.1145/3543507.3583449)|Shweta Yadav, Stefan Cobeli, Cornelia Caragea|University of Illinois at Chicago, USA|In recent years, seeking health information on the web has become a preferred way for healthcare consumers to support their information needs. Generally, healthcare consumers use long and detailed questions with several peripheral details to express their healthcare concerns, contributing to natural language understanding challenges. One way to address this challenge is by summarizing the questions. However, most of the existing abstractive summarization systems generate impeccably fluent yet factually incorrect summaries. In this paper, we present a semantically-enhanced contrastive learning-based framework for generating abstractive question summaries that are faithful and factually correct. We devised multiple strategies based on question semantics to generate the erroneous (negative) summaries, such that the model has the understanding of plausible and incorrect perturbations of the original summary. Our extensive experimental results on two benchmark consumer health question summarization datasets confirm the effectiveness of our proposed method by achieving state-of-the-art performance and generating factually correct and fluent summaries, as measured by human evaluation.|近年来,在网上寻找健康信息已成为医疗保健消费者支持其信息需求的首选方式。一般来说,医疗保健消费者使用长而详细的问题和几个外围细节来表达他们的医疗保健关注,有助于自然语言理解的挑战。解决这个问题的一个方法是总结问题。然而,大多数现有的抽象摘要系统生成的摘要无可挑剔地流畅,但事实上是不正确的。本文提出了一个基于语义增强的对比学习框架,用于生成抽象的、忠实的、事实正确的问题摘要。我们设计了多种基于问题语义的策略来生成错误的(否定的)总结,使得模型能够理解原始总结的合理和不正确的扰动。我们在两个基准的消费者健康问题摘要数据集上的广泛实验结果证实了我们提出的方法的有效性,通过实现最先进的性能和生成事实上正确和流畅的摘要,由人类评估测量。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Towards+Understanding+Consumer+Healthcare+Questions+on+the+Web+with+Semantically+Enhanced+Contrastive+Learning)|0| |[Modeling Dynamic Interactions over Tensor Streams](https://doi.org/10.1145/3543507.3583458)|Koki Kawabata, Yasuko Matsubara, Yasushi Sakurai|SANKEN, Osaka University, Japan|Many web applications, such as search engines and social network services, are continuously producing a huge number of events with a multi-order tensor form, {count;query, location, …, timestamp}, and so how can we discover important trends to enables us to forecast long-term future events? Can we interpret any relationships between events that determine the trends from multi-aspect perspectives? Real-world online activities can be composed of (1) many time-changing interactions that control trends, for example, competition/cooperation to gain user attention, as well as (2) seasonal patterns that covers trends. 
To model the shifting trends via interactions, namely dynamic interactions over tensor streams, in this paper, we propose a streaming algorithm, DISMO, that we designed to discover Dynamic Interactions and Seasonality in a Multi-Order tensor. Our approach has the following properties. (a) Interpretable: it incorporates interpretable non-linear differential equations in tensor factorization so that it can reveal latent interactive relationships and thus generate future events effectively; (b) Dynamic: it can be aware of shifting trends by switching multi-aspect factors while summarizing their characteristics incrementally; and (c) Automatic: it finds every factor automatically without losing forecasting accuracy. Extensive experiments on real datasets demonstrate that our algorithm extracts interpretable interactions between data attributes, while simultaneously providing improved forecasting accuracy and a great reduction in computational time.|许多网络应用程序,如搜索引擎和社交网络服务,不断以多阶张量形式 {count; query, location, …, timestamp} 产生大量事件,那么我们如何发现重要的趋势,使我们能够预测长期的未来事件?我们能否从多方面的角度解释事件之间确定趋势的任何关系?现实世界的在线活动可以包括(1)许多时间变化的交互,控制趋势,例如,竞争/合作,以获得用户的关注,以及(2)涵盖趋势的季节性模式。为了建模由交互作用驱动的变化趋势(即张量流上的动态交互),在本文中,我们提出了一个流算法 DISMO,用于发现多阶张量中的动态交互和季节性。我们的方法具有以下属性。(a)可解释性: 在张量因子分解中加入可解释的非线性微分方程,以便能够揭示潜在的交互关系,从而有效地产生未来事件; (b)动态性: 通过切换多方面因子,逐步总结其特征,它可以意识到变化的趋势; (c)自动性: 它自动找到每个因子,而不会失去预测的准确性。在实际数据集上的大量实验表明,该算法提取了数据属性之间可解释的交互,同时提高了预测精度,大大减少了计算时间。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Modeling+Dynamic+Interactions+over+Tensor+Streams)|0| -|[Constrained Subset Selection from Data Streams for Profit Maximization](https://doi.org/10.1145/3543507.3583490)|Shuang Cui, Kai Han, Jing Tang, He Huang|School of Computer Science and Technology, Soochow University, China; The Hong Kong University of Science and Technology (Guangzhou), The Hong Kong University of Science and Technology, China; School of Computer Science and Technology, University of Science and Technology of China, China|The problem of constrained subset selection from a large data stream for profit maximization has many applications in web data mining and machine learning, such as social advertising, team formation and recommendation systems. Such a problem can be formulated as maximizing a regularized submodular function under certain constraints. In this paper, we consider a generalized k-system constraint, which captures various requirements in real-world applications. For this problem, we propose the first streaming algorithm with provable performance bounds, leveraging a novel multitudinous distorted filter framework. 
The empirical performance of our algorithm is extensively evaluated in several applications including web data mining and recommendation systems, and the experimental results demonstrate the superiorities of our algorithm in terms of both effectiveness and efficiency.|面向利润最大化的大型数据流的约束子集选择问题在网络数据挖掘和机器学习中有许多应用,例如社交广告、团队组建和推荐系统。这类问题可以表述为在一定约束条件下正则化子模函数的最大化问题。在本文中,我们考虑了一个广义 k 系统约束,它能够满足实际应用中的各种需求。针对这个问题,我们提出了第一个具有可证明性能界限的流式算法,利用了一个新的大量失真过滤器框架。该算法的经验性能在网络数据挖掘和推荐系统等应用中得到了广泛的评价,实验结果表明了该算法在效率和效果方面的优越性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Constrained+Subset+Selection+from+Data+Streams+for+Profit+Maximization)|0| -|[SCStory: Self-supervised and Continual Online Story Discovery](https://doi.org/10.1145/3543507.3583507)|Susik Yoon, Yu Meng, Dongha Lee, Jiawei Han|Yonsei University, Republic of Korea; University of Illinois at Urbana-Champaign, USA|We present a framework SCStory for online story discovery, that helps people digest rapidly published news article streams in real-time without human annotations. To organize news article streams into stories, existing approaches directly encode the articles and cluster them based on representation similarity. However, these methods yield noisy and inaccurate story discovery results because the generic article embeddings do not effectively reflect the story-indicative semantics in an article and cannot adapt to the rapidly evolving news article streams. SCStory employs self-supervised and continual learning with a novel idea of story-indicative adaptive modeling of news article streams. With a lightweight hierarchical embedding module that first learns sentence representations and then article representations, SCStory identifies story-relevant information of news articles and uses them to discover stories. The embedding module is continuously updated to adapt to evolving news streams with a contrastive learning objective, backed up by two unique techniques, confidence-aware memory replay and prioritized-augmentation, employed for label absence and data scarcity problems. Thorough experiments on real and the latest news data sets demonstrate that SCStory outperforms existing state-of-the-art algorithms for unsupervised online story discovery.|我们提出了一个用于在线故事发现的框架 SCStory,它可以帮助人们在没有人工注释的情况下实时消化快速发布的新闻文章流。为了将新闻文章流组织成故事,现有的方法直接对文章进行编码,并根据表示相似性对文章进行聚类。然而,这些方法产生的噪音和不准确的故事发现结果,因为通用文章嵌入不能有效地反映故事指示语义在一篇文章,不能适应迅速发展的新闻文章流。SCStory 采用自我监督和持续学习的思想,对新闻文章流进行故事指示性自适应建模。SCStory 使用一个轻量级的层次化嵌入模块,首先学习句子表示,然后学习文章表示,SCStory 识别新闻文章的故事相关信息,并使用它们来发现故事。嵌入模块不断更新,以适应具有对比学习目标的不断发展的新闻流,并辅以两种独特的技术: 可信度感知记忆重放和优先级增强,用于解决标签缺失和数据稀缺问题。对真实新闻和最新新闻数据集的彻底实验表明,SCStory 在无监督的在线故事发现方面优于现有的最先进算法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=SCStory:+Self-supervised+and+Continual+Online+Story+Discovery)|0| -|[Know Your Transactions: Real-time and Generic Transaction Semantic Representation on Blockchain & Web3 Ecosystem](https://doi.org/10.1145/3543507.3583537)|Zhiying Wu, Jieli Liu, Jiajing Wu, Zibin Zheng, Xiapu Luo, Ting Chen|University of Electronic Science and Technologyo of China, China; Sun Yat-sen University, China; Hong Kong Polytechnic University, China|Web3, based on blockchain technology, is the evolving next generation Internet of value. Massive active applications on Web3, e.g. DeFi and NFT, usually rely on blockchain transactions to achieve value transfer as well as complex and diverse custom logic and intentions. 
Various risky or illegal behaviors such as financial fraud, hacking, money laundering are currently rampant in the blockchain ecosystem, and it is thus important to understand the intent behind the pseudonymous transactions. To reveal the intent of transactions, much effort has been devoted to extracting some particular transaction semantics through specific expert experiences. However, the limitations of existing methods in terms of effectiveness and generalization make it difficult to extract diverse transaction semantics in the rapidly growing and evolving Web3 ecosystem. In this paper, we propose the Motif-based Transaction Semantics representation method (MoTS), which can capture the transaction semantic information in the real-time transaction data workflow. To the best of our knowledge, MoTS is the first general semantic extraction method in Web3 blockchain ecosystem. Experimental results show that MoTS can effectively distinguish different transaction semantics in real-time, and can be used for various downstream tasks, giving new insights to understand the Web3 blockchain ecosystem. Our codes are available at https://github.com/wuzhy1ng/MoTS.|基于区块链技术的 Web3是不断发展的下一代价值互联网。Web3上的大量活动应用程序,例如 DeFi 和 NFT,通常依赖区块链事务来实现价值转移以及复杂多样的定制逻辑和意图。各种风险或非法行为,如金融欺诈,黑客,洗钱目前在区块链生态系统中猖獗,因此了解这些假名交易背后的意图非常重要。为了揭示事务的意图,人们花费了大量的精力通过特定的专家经验提取特定的事务语义。然而,现有方法在有效性和泛化方面的局限性使得在快速增长和发展的 Web3生态系统中很难提取不同的事务语义。在本文中,我们提出了基于主题的事务语义表示方法(moTS) ,它可以捕获实时事务数据流中的事务语义信息。据我们所知,MoTS 是 Web3区块链生态系统中第一个通用的语义提取方法。实验结果表明,MoTS 可以有效地实时区分不同的事务语义,并可用于各种下游任务,为理解 Web3区块链生态系统提供了新的视角。我们的密码可以在 https://github.com/wuzhy1ng/mots 找到。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Know+Your+Transactions:+Real-time+and+Generic+Transaction+Semantic+Representation+on+Blockchain+&+Web3+Ecosystem)|0| +|[Constrained Subset Selection from Data Streams for Profit Maximization](https://doi.org/10.1145/3543507.3583490)|Shuang Cui, Kai Han, Jing Tang, He Huang|The Hong Kong University of Science and Technology (Guangzhou), The Hong Kong University of Science and Technology, China; School of Computer Science and Technology, Soochow University, China; School of Computer Science and Technology, University of Science and Technology of China, China|The problem of constrained subset selection from a large data stream for profit maximization has many applications in web data mining and machine learning, such as social advertising, team formation and recommendation systems. Such a problem can be formulated as maximizing a regularized submodular function under certain constraints. In this paper, we consider a generalized k-system constraint, which captures various requirements in real-world applications. For this problem, we propose the first streaming algorithm with provable performance bounds, leveraging a novel multitudinous distorted filter framework. 
The empirical performance of our algorithm is extensively evaluated in several applications including web data mining and recommendation systems, and the experimental results demonstrate the superiority of our algorithm in terms of both effectiveness and efficiency.|面向利润最大化的大型数据流的约束子集选择问题在网络数据挖掘和机器学习中有许多应用,例如社交广告、团队组建和推荐系统。这类问题可以表述为在一定约束条件下正则化子模函数的最大化问题。在本文中,我们考虑了一个广义 k 系统约束,它能够满足实际应用中的各种需求。针对这个问题,我们提出了第一个具有可证明性能界限的流式算法,利用了一个新的大量失真过滤器框架。该算法的经验性能在网络数据挖掘和推荐系统等应用中得到了广泛的评价,实验结果表明了该算法在效率和效果方面的优越性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Constrained+Subset+Selection+from+Data+Streams+for+Profit+Maximization)|0| +|[SCStory: Self-supervised and Continual Online Story Discovery](https://doi.org/10.1145/3543507.3583507)|Susik Yoon, Yu Meng, Dongha Lee, Jiawei Han|University of Illinois at Urbana-Champaign, USA; Yonsei University, Republic of Korea|We present a framework SCStory for online story discovery that helps people digest rapidly published news article streams in real-time without human annotations. To organize news article streams into stories, existing approaches directly encode the articles and cluster them based on representation similarity. However, these methods yield noisy and inaccurate story discovery results because the generic article embeddings do not effectively reflect the story-indicative semantics in an article and cannot adapt to the rapidly evolving news article streams. SCStory employs self-supervised and continual learning with a novel idea of story-indicative adaptive modeling of news article streams. With a lightweight hierarchical embedding module that first learns sentence representations and then article representations, SCStory identifies story-relevant information of news articles and uses them to discover stories. The embedding module is continuously updated to adapt to evolving news streams with a contrastive learning objective, backed up by two unique techniques, confidence-aware memory replay and prioritized-augmentation, employed for label absence and data scarcity problems. Thorough experiments on real and the latest news data sets demonstrate that SCStory outperforms existing state-of-the-art algorithms for unsupervised online story discovery.|我们提出了一个用于在线故事发现的框架 SCStory,它可以帮助人们在没有人工注释的情况下实时消化快速发布的新闻文章流。为了将新闻文章流组织成故事,现有的方法直接对文章进行编码,并根据表示相似性对文章进行聚类。然而,由于通用的文章嵌入不能有效反映文章中指示故事的语义,也无法适应快速演变的新闻文章流,这些方法产生的故事发现结果噪声大且不准确。SCStory 采用自我监督和持续学习的思想,对新闻文章流进行故事指示性自适应建模。SCStory 使用一个轻量级的层次化嵌入模块,首先学习句子表示,然后学习文章表示,从而识别新闻文章中与故事相关的信息,并使用它们来发现故事。嵌入模块不断更新,以适应具有对比学习目标的不断发展的新闻流,并辅以两种独特的技术: 可信度感知记忆重放和优先级增强,用于解决标签缺失和数据稀缺问题。对真实新闻和最新新闻数据集的彻底实验表明,SCStory 在无监督的在线故事发现方面优于现有的最先进算法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=SCStory:+Self-supervised+and+Continual+Online+Story+Discovery)|0| +|[Know Your Transactions: Real-time and Generic Transaction Semantic Representation on Blockchain & Web3 Ecosystem](https://doi.org/10.1145/3543507.3583537)|Zhiying Wu, Jieli Liu, Jiajing Wu, Zibin Zheng, Xiapu Luo, Ting Chen|University of Electronic Science and Technology of China, China; Hong Kong Polytechnic University, China; Sun Yat-sen University, China|Web3, based on blockchain technology, is the evolving next generation Internet of value. Massive active applications on Web3, e.g. DeFi and NFT, usually rely on blockchain transactions to achieve value transfer as well as complex and diverse custom logic and intentions. 
Various risky or illegal behaviors such as financial fraud, hacking, and money laundering are currently rampant in the blockchain ecosystem, and it is thus important to understand the intent behind the pseudonymous transactions. To reveal the intent of transactions, much effort has been devoted to extracting some particular transaction semantics through specific expert experiences. However, the limitations of existing methods in terms of effectiveness and generalization make it difficult to extract diverse transaction semantics in the rapidly growing and evolving Web3 ecosystem. In this paper, we propose the Motif-based Transaction Semantics representation method (MoTS), which can capture the transaction semantic information in the real-time transaction data workflow. To the best of our knowledge, MoTS is the first general semantic extraction method in the Web3 blockchain ecosystem. Experimental results show that MoTS can effectively distinguish different transaction semantics in real-time, and can be used for various downstream tasks, giving new insights to understand the Web3 blockchain ecosystem. Our codes are available at https://github.com/wuzhy1ng/MoTS.|基于区块链技术的 Web3是不断发展的下一代价值互联网。Web3上的大量活动应用程序,例如 DeFi 和 NFT,通常依赖区块链事务来实现价值转移以及复杂多样的定制逻辑和意图。各种风险或非法行为,如金融欺诈,黑客,洗钱目前在区块链生态系统中猖獗,因此了解这些假名交易背后的意图非常重要。为了揭示事务的意图,人们花费了大量的精力通过特定的专家经验提取特定的事务语义。然而,现有方法在有效性和泛化方面的局限性使得在快速增长和发展的 Web3生态系统中很难提取不同的事务语义。在本文中,我们提出了基于模体(motif)的事务语义表示方法(MoTS),它可以捕获实时事务数据流中的事务语义信息。据我们所知,MoTS 是 Web3区块链生态系统中第一个通用的语义提取方法。实验结果表明,MoTS 可以有效地实时区分不同的事务语义,并可用于各种下游任务,为理解 Web3区块链生态系统提供了新的视角。我们的代码可以在 https://github.com/wuzhy1ng/MoTS 找到。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Know+Your+Transactions:+Real-time+and+Generic+Transaction+Semantic+Representation+on+Blockchain+&+Web3+Ecosystem)|0| |[Toward Open-domain Slot Filling via Self-supervised Co-training](https://doi.org/10.1145/3543507.3583541)|Adib Mosharrof, Moghis Fereidouni, A. B. Siddique|University of Kentucky, USA|Slot filling is one of the critical tasks in modern conversational systems. The majority of existing literature employs supervised learning methods, which require labeled training data for each new domain. Zero-shot learning and weak supervision approaches, among others, have shown promise as alternatives to manual labeling. Nonetheless, these learning paradigms are significantly inferior to supervised learning approaches in terms of performance. To minimize this performance gap and demonstrate the possibility of open-domain slot filling, we propose a Self-supervised Co-training framework, called SCot, that requires zero in-domain manually labeled training examples and works in three phases. Phase one acquires two sets of complementary pseudo labels automatically. Phase two leverages the power of the pre-trained language model BERT, by adapting it for the slot filling task using these sets of pseudo labels. In phase three, we introduce a self-supervised co-training mechanism, where both models automatically select high-confidence soft labels to further improve the performance of the other in an iterative fashion. Our thorough evaluations show that SCot outperforms state-of-the-art models by 45.57% and 37.56% on SGD and MultiWoZ datasets, respectively. 
Moreover, our proposed framework SCot achieves comparable performance when compared to state-of-the-art fully supervised models.|槽填充是现代会话系统的关键任务之一。现有的大多数文献采用监督式学习方法,每个新领域都需要有标签的训练数据。零样本学习和弱监督方法,除其他外,已经显示出作为手工标注替代品的前景。尽管如此,这些学习模式在表现方面明显逊色于监督式学习方法。为了尽可能减小这种性能差距,并证明开放域槽填充的可能性,我们提出了一种称为 SCot 的自监督协同训练框架,它不需要任何域内人工标注的训练样例,并分三个阶段工作。第一阶段自动获取两套互补的伪标签。第二阶段利用预先训练好的语言模型 BERT 的能力,通过使用这些伪标签集使其适应槽填充任务。在第三阶段,我们引入一个自我监督的协同训练机制,其中两个模型自动选择高置信软标签,以迭代的方式进一步提升对方的性能。我们的全面评估表明,在 SGD 和 MultiWoZ 数据集上,SCot 的性能分别比最先进的模型高出45.57% 和37.56% 。此外,与最先进的全监督模型相比,我们提出的 SCot 框架取得了相当的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Toward+Open-domain+Slot+Filling+via+Self-supervised+Co-training)|0| -|[Measuring and Evading Turkmenistan's Internet Censorship: A Case Study in Large-Scale Measurements of a Low-Penetration Country](https://doi.org/10.1145/3543507.3583189)|Sadia Nourin, Van Tran, Xi Jiang, Kevin Bock, Nick Feamster, Nguyen Phong Hoang, Dave Levin|University of Maryland, USA; University of Chicago, USA|Since 2006, Turkmenistan has been listed as one of the few Internet enemies by Reporters without Borders due to its extensively censored Internet and strictly regulated information control policies. Existing reports of filtering in Turkmenistan rely on a small number of vantage points or test a small number of websites. Yet, the country's poor Internet adoption rates and small population can make more comprehensive measurement challenging. With a population of only six million people and an Internet penetration rate of only 38%, it is challenging to either recruit in-country volunteers or obtain vantage points to conduct remote network measurements at scale. We present the largest measurement study to date of Turkmenistan's Web censorship. To do so, we developed TMC, which tests the blocking status of millions of domains across the three foundational protocols of the Web (DNS, HTTP, and HTTPS). Importantly, TMC does not require access to vantage points in the country. We apply TMC to 15.5M domains, our results reveal that Turkmenistan censors more than 122K domains, using different blocklists for each protocol. We also reverse-engineer these censored domains, identifying 6K over-blocking rules causing incidental filtering of more than 5.4M domains. Finally, we use Geneva, an open-source censorship evasion tool, to discover five new censorship evasion strategies that can defeat Turkmenistan's censorship at both transport and application layers. 
We will publicly release both the data collected by TMC and the code for censorship evasion.|自2006年以来,土库曼斯坦由于其广泛的互联网审查和严格的信息控制政策而被列为少数几个互联网无国界记者之一。土库曼斯坦现有的过滤报告依赖于少数有利位置或测试少数网站。然而,该国糟糕的互联网使用率和较小的人口可以使更全面的测量具有挑战性。由于人口只有600万,互联网普及率只有38% ,因此,要么征聘国内志愿人员,要么获得有利位置,进行大规模的远程网络测量,这是一个挑战。我们提出了迄今为止土库曼斯坦网络审查最大的测量研究。为此,我们开发了 TMC,它测试跨 Web 的三个基本协议(DNS、 HTTP 和 HTTPS)的数百万个域的阻塞状态。重要的是,TMC 不要求进入该国的有利位置。我们将 TMC 应用于1550万个域名,我们的结果显示,土库曼斯坦审查超过122K 域名,使用不同的区块列表为每个协议。我们还反向工程这些删除域,确定6K 过度阻塞规则导致超过5.4 M 域的附带过滤。最后,我们使用日内瓦,一个开源的审查规避工具,发现五个新的审查规避策略,可以击败土库曼斯坦的审查在传输和应用层。我们将公开发布 TMC 收集的数据和规避审查的代码。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Measuring+and+Evading+Turkmenistan's+Internet+Censorship:+A+Case+Study+in+Large-Scale+Measurements+of+a+Low-Penetration+Country)|0| -|[NetGuard: Protecting Commercial Web APIs from Model Inversion Attacks using GAN-generated Fake Samples](https://doi.org/10.1145/3543507.3583224)|Xueluan Gong, Ziyao Wang, Yanjiao Chen, Qian Wang, Cong Wang, Chao Shen|Wuhan University, China; Xi'an Jiaotong University, China; City University of Hong Kong, China; Zhejiang University, China|Recently more and more cloud service providers (e.g., Microsoft, Google, and Amazon) have commercialized their well-trained deep learning models by providing limited access via web API interfaces. However, it is shown that these APIs are susceptible to model inversion attacks, where attackers can recover the training data with high fidelity, which may cause serious privacy leakage.Existing defenses against model inversion attacks, however, hinder the model performance and are ineffective for more advanced attacks, e.g., Mirror [4]. In this paper, we proposed NetGuard, a novel utility-aware defense methodology against model inversion attacks (MIAs). Unlike previous works that perturb prediction outputs of the victim model, we propose to mislead the MIA effort by inserting engineered fake samples during the training process. A generative adversarial network (GAN) is carefully built to construct fake training samples to mislead the attack model without degrading the performance of the victim model. Besides, we adopt continual learning to further improve the utility of the victim model. 
Extensive experiments on CelebA, VGG-Face, and VGG-Face2 datasets show that NetGuard is superior to existing defenses, including DP [37] and Ad-mi [32] on state-of-the-art model inversion attacks, i.e., DMI [8], Mirror [4], Privacy [12], and Alignment [34].|最近越来越多的云服务提供商(如微软、谷歌和亚马逊)通过提供有限的 Web API 接口访问,将他们训练有素的深度学习模型商业化。然而,这些 API 容易受到模型反转攻击,攻击者可以恢复高保真的训练数据,从而导致严重的隐私泄漏。然而,针对模型反转攻击的现有防御措施阻碍了模型的性能,并且对于更高级的攻击无效,例如,Mirror [4]。在本文中,我们提出了一种新的效用感知的防御模型反转攻击(MIA)方法 NetGuard。不同于以往的工作,扰乱预测输出的受害者模型,我们建议误导 MIA 的努力,插入工程假样本在训练过程中。构造一个生成式对抗网络(GAN)来构造虚假的训练样本,在不降低受害者模型性能的前提下误导攻击模型。此外,我们采用不断学习的方法进一步提高了被害人模型的有效性。在 CelebA,VGG-Face 和 VGG-Face2数据集上的大量实验表明,NetGuard 优于现有的防御系统,包括 DP [37]和 Ad-mi [32]对最先进的模型反转攻击,即 DMI [8] ,Mirror [4] ,Privacy [12]和 Align [34]。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=NetGuard:+Protecting+Commercial+Web+APIs+from+Model+Inversion+Attacks+using+GAN-generated+Fake+Samples)|0| -|[Meteor: Improved Secure 3-Party Neural Network Inference with Reducing Online Communication Costs](https://doi.org/10.1145/3543507.3583272)|Ye Dong, Xiaojun Chen, Weizhan Jing, Kaiyun Li, Weiping Wang|; Institute of Information Engineering,Chinese Academy of Sciences, China and School of Cyber Security, University of Chinese Academy of Sciences, China; Institute of Information Engineering, Chinese Academy of Sciences, China and School of Cyber Security, University of Chinese Academy of Sciences, China|Secure neural network inference has been a promising solution to private Deep-Learning-as-a-Service, which enables the service provider and user to execute neural network inference without revealing their private inputs. However, the expensive overhead of current schemes is still an obstacle when applied in real applications. In this work, we present Meteor, an online communication-efficient and fast secure 3-party computation neural network inference system aginst semi-honest adversary in honest-majority. The main contributions of Meteor are two-fold: i) We propose a new and improved 3-party secret sharing scheme stemming from the linearity of replicated secret sharing, and design efficient protocols for the basic cryptographic primitives, including linear operations, multiplication, most significant bit extraction, and multiplexer. ii) Furthermore, we build efficient and secure blocks for the widely used neural network operators such as Matrix Multiplication, ReLU, and Maxpool, along with exploiting several specific optimizations for better efficiency. Our total communication with the setup phase is a little larger than SecureNN (PoPETs’19) and Falcon (PoPETs’21), two state-of-the-art solutions, but the gap is not significant when the online phase must be optimized as a priority. Using Meteor, we perform extensive evaluations on various neural networks. Compared to SecureNN and Falcon, we reduce the online communication costs by up to 25.6 × and 1.5 ×, and improve the running-time by at most 9.8 × (resp. 8.1 ×) and 1.5 × (resp. 2.1 ×) in LAN (resp. 
WAN) for the online inference.|安全神经网络推理已成为私有深度学习即服务(Deep-Learning-as-a-Service)的一种有前途的解决方案,它使服务提供者和用户能够在不暴露其私有输入的情况下执行神经网络推理。然而,在实际应用中,现有方案昂贵的开销仍然是一个障碍。在这项工作中,我们提出了一个在线通信效率和快速安全的三方计算神经网络推理系统对半诚实的对手在诚实的大多数。基于复制密钥共享的线性特性,提出了一种新的改进的三方密钥共享方案,并针对基本密码原语设计了高效的协议,包括线性运算、乘法、最大位提取和多路复用。Ii)此外,我们还为广泛使用的神经网络操作器(如矩阵乘法、 relU 和 Maxpool)构建了高效、安全的模块,同时利用一些特定的优化来提高效率。我们与安装阶段的总体沟通略大于 SecureNN (PoPETs’19)和 Falcon (PoPETs’21)这两种最先进的解决方案,但当在线阶段必须优先优化时,差距并不显着。使用流星,我们对各种神经网络进行广泛的评估。与 SecureNN 和 Falcon 相比,在线通信成本分别降低了25.6 × 和1.5 × ,运行时间最多提高了9.8 × 。8.1 ×)和1.5 × (分辨率)。2.1 ×).WAN)进行在线推理。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Meteor:+Improved+Secure+3-Party+Neural+Network+Inference+with+Reducing+Online+Communication+Costs)|0| -|[IRWArt: Levering Watermarking Performance for Protecting High-quality Artwork Images](https://doi.org/10.1145/3543507.3583489)|Yuanjing Luo, Tongqing Zhou, Fang Liu, Zhiping Cai|National University of Defense Technology, China; Hunan University, China|Increasing artwork plagiarism incidents underscores the urgent need for reliable copyright protection for high-quality artwork images. Although watermarking is helpful to this issue, existing methods are limited in imperceptibility and robustness. To provide high-level protection for valuable artwork images, we propose a novel invisible robust watermarking framework, dubbed as IRWArt. In our architecture, the embedding and recovery of the watermark are treated as a pair of image transformations’ inverse problems, and can be implemented through the forward and backward processes of an invertible neural networks (INN), respectively. For high visual quality, we embed the watermark in high-frequency domains with minimal impact on artwork and supervise image reconstruction using a human visual system(HVS)-consistent deep perceptual loss. For strong plagiarism-resistant, we construct a quality enhancement module for the embedded image against possible distortions caused by plagiarism actions. Moreover, the two-stagecontrastive training strategy enables the simultaneous realization of the above two goals. Experimental results on 4 datasets demonstrate the superiority of our IRWArt over other state-of-the-art watermarking methods. Code: https://github.com/1024yy/IRWArt.|越来越多的艺术品剽窃事件突出表明,迫切需要为高质量的艺术品图像提供可靠的版权保护。虽然水印技术有助于解决这一问题,但现有的水印方法在不可感知性和鲁棒性方面存在局限性。为了对有价值的艺术品图像提供高层次的保护,我们提出了一种新的不可见的鲁棒水印框架,称为 IRWArt。在我们的体系结构中,水印的嵌入和恢复被视为一对图像变换的逆问题,可以分别通过可逆神经网络(INN)的正向和反向过程来实现。为了提高视觉质量,我们将水印嵌入到高频域中,尽量减少对作品的影响,并使用人类视觉系统(HVS)一致的深度感知损失来监督图像重建。为了提高嵌入图像的抗剽窃能力,我们构建了一个质量增强模块,用于对抗剽窃行为可能造成的图像失真。此外,两阶段对比训练策略可以同时实现上述两个目标。在4个数据集上的实验结果表明了我们的 IRWArt 相对于其他最先进的水印方法的优越性。密码: https://github.com/1024yy/irwart。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=IRWArt:+Levering+Watermarking+Performance+for+Protecting+High-quality+Artwork+Images)|0| +|[Measuring and Evading Turkmenistan's Internet Censorship: A Case Study in Large-Scale Measurements of a Low-Penetration Country](https://doi.org/10.1145/3543507.3583189)|Sadia Nourin, Van Tran, Xi Jiang, Kevin Bock, Nick Feamster, Nguyen Phong Hoang, Dave Levin|University of Chicago, USA; University of Maryland, USA|Since 2006, Turkmenistan has been listed as one of the few Internet enemies by Reporters without Borders due to its extensively censored Internet and strictly regulated information control policies. Existing reports of filtering in Turkmenistan rely on a small number of vantage points or test a small number of websites. 
Yet, the country's poor Internet adoption rates and small population can make more comprehensive measurement challenging. With a population of only six million people and an Internet penetration rate of only 38%, it is challenging to either recruit in-country volunteers or obtain vantage points to conduct remote network measurements at scale. We present the largest measurement study to date of Turkmenistan's Web censorship. To do so, we developed TMC, which tests the blocking status of millions of domains across the three foundational protocols of the Web (DNS, HTTP, and HTTPS). Importantly, TMC does not require access to vantage points in the country. We apply TMC to 15.5M domains; our results reveal that Turkmenistan censors more than 122K domains, using different blocklists for each protocol. We also reverse-engineer these censored domains, identifying 6K over-blocking rules causing incidental filtering of more than 5.4M domains. Finally, we use Geneva, an open-source censorship evasion tool, to discover five new censorship evasion strategies that can defeat Turkmenistan's censorship at both transport and application layers. We will publicly release both the data collected by TMC and the code for censorship evasion.|自2006年以来,由于其广泛的互联网审查和严格的信息控制政策,土库曼斯坦被无国界记者组织列为少数几个互联网敌人之一。土库曼斯坦现有的过滤报告依赖于少数有利位置或测试少数网站。然而,该国糟糕的互联网使用率和较小的人口可以使更全面的测量具有挑战性。由于人口只有600万,互联网普及率只有38% ,因此,要么征聘国内志愿人员,要么获得有利位置,进行大规模的远程网络测量,这是一个挑战。我们提出了迄今为止土库曼斯坦网络审查最大的测量研究。为此,我们开发了 TMC,它测试跨 Web 的三个基本协议(DNS、 HTTP 和 HTTPS)的数百万个域的阻塞状态。重要的是,TMC 不要求进入该国的有利位置。我们将 TMC 应用于1550万个域名,结果显示,土库曼斯坦审查了超过122K 个域名,并对每个协议使用不同的封锁列表。我们还对这些被审查的域名进行逆向工程,识别出6K 条过度封锁规则,它们导致超过5.4M 个域名被附带过滤。最后,我们使用开源审查规避工具 Geneva,发现了五种新的审查规避策略,可以在传输层和应用层击败土库曼斯坦的审查。我们将公开发布 TMC 收集的数据和规避审查的代码。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Measuring+and+Evading+Turkmenistan's+Internet+Censorship:+A+Case+Study+in+Large-Scale+Measurements+of+a+Low-Penetration+Country)|0| +|[NetGuard: Protecting Commercial Web APIs from Model Inversion Attacks using GAN-generated Fake Samples](https://doi.org/10.1145/3543507.3583224)|Xueluan Gong, Ziyao Wang, Yanjiao Chen, Qian Wang, Cong Wang, Chao Shen|Wuhan University, China; Xi'an Jiaotong University, China; Zhejiang University, China; City University of Hong Kong, China|Recently more and more cloud service providers (e.g., Microsoft, Google, and Amazon) have commercialized their well-trained deep learning models by providing limited access via web API interfaces. However, it is shown that these APIs are susceptible to model inversion attacks, where attackers can recover the training data with high fidelity, which may cause serious privacy leakage. Existing defenses against model inversion attacks, however, hinder the model performance and are ineffective for more advanced attacks, e.g., Mirror [4]. In this paper, we propose NetGuard, a novel utility-aware defense methodology against model inversion attacks (MIAs). Unlike previous works that perturb prediction outputs of the victim model, we propose to mislead the MIA effort by inserting engineered fake samples during the training process. A generative adversarial network (GAN) is carefully built to construct fake training samples to mislead the attack model without degrading the performance of the victim model. Besides, we adopt continual learning to further improve the utility of the victim model. 
Extensive experiments on CelebA, VGG-Face, and VGG-Face2 datasets show that NetGuard is superior to existing defenses, including DP [37] and Ad-mi [32] on state-of-the-art model inversion attacks, i.e., DMI [8], Mirror [4], Privacy [12], and Alignment [34].|最近越来越多的云服务提供商(如微软、谷歌和亚马逊)通过提供有限的 Web API 接口访问,将他们训练有素的深度学习模型商业化。然而,这些 API 容易受到模型反转攻击,攻击者可以恢复高保真的训练数据,从而导致严重的隐私泄漏。然而,针对模型反转攻击的现有防御措施阻碍了模型的性能,并且对于更高级的攻击无效,例如,Mirror [4]。在本文中,我们提出了一种新的效用感知的防御模型反转攻击(MIA)方法 NetGuard。不同于以往扰动受害者模型预测输出的工作,我们建议在训练过程中插入精心构造的假样本来误导 MIA。构造一个生成式对抗网络(GAN)来构造虚假的训练样本,在不降低受害者模型性能的前提下误导攻击模型。此外,我们采用持续学习进一步提升受害者模型的效用。在 CelebA、VGG-Face 和 VGG-Face2数据集上的大量实验表明,在最先进的模型反转攻击(即 DMI [8]、Mirror [4]、Privacy [12]和 Alignment [34])下,NetGuard 优于包括 DP [37]和 Ad-mi [32]在内的现有防御方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=NetGuard:+Protecting+Commercial+Web+APIs+from+Model+Inversion+Attacks+using+GAN-generated+Fake+Samples)|0| +|[Meteor: Improved Secure 3-Party Neural Network Inference with Reducing Online Communication Costs](https://doi.org/10.1145/3543507.3583272)|Ye Dong, Xiaojun Chen, Weizhan Jing, Kaiyun Li, Weiping Wang|Institute of Information Engineering, Chinese Academy of Sciences, China and School of Cyber Security, University of Chinese Academy of Sciences, China; Institute of Information Engineering, Chinese Academy of Sciences, China and School of Cyber Security, University of Chinese Academy of Sciences, China|Secure neural network inference has been a promising solution to private Deep-Learning-as-a-Service, which enables the service provider and user to execute neural network inference without revealing their private inputs. However, the expensive overhead of current schemes is still an obstacle when applied in real applications. In this work, we present Meteor, an online communication-efficient and fast secure 3-party computation neural network inference system against a semi-honest adversary in the honest-majority setting. The main contributions of Meteor are two-fold: i) We propose a new and improved 3-party secret sharing scheme stemming from the linearity of replicated secret sharing, and design efficient protocols for the basic cryptographic primitives, including linear operations, multiplication, most significant bit extraction, and multiplexer. ii) Furthermore, we build efficient and secure blocks for the widely used neural network operators such as Matrix Multiplication, ReLU, and Maxpool, along with exploiting several specific optimizations for better efficiency. Our total communication with the setup phase is a little larger than SecureNN (PoPETs’19) and Falcon (PoPETs’21), two state-of-the-art solutions, but the gap is not significant when the online phase must be optimized as a priority. Using Meteor, we perform extensive evaluations on various neural networks. Compared to SecureNN and Falcon, we reduce the online communication costs by up to 25.6 × and 1.5 ×, and improve the running-time by at most 9.8 × (resp. 8.1 ×) and 1.5 × (resp. 2.1 ×) in LAN (resp. 
WAN) for the online inference.|安全神经网络推理已成为私有深度学习即服务(Deep-Learning-as-a-Service)的一种有前途的解决方案,它使服务提供者和用户能够在不暴露其私有输入的情况下执行神经网络推理。然而,在实际应用中,现有方案昂贵的开销仍然是一个障碍。在这项工作中,我们提出了 Meteor,一个在诚实多数设定下对抗半诚实敌手的、在线通信高效且快速安全的三方计算神经网络推理系统。Meteor 的主要贡献有两方面: i)基于复制秘密共享的线性特性,我们提出了一种新的改进的三方秘密共享方案,并针对基本密码原语设计了高效的协议,包括线性运算、乘法、最高有效位提取和多路复用器。ii)此外,我们还为广泛使用的神经网络算子(如矩阵乘法、ReLU 和 Maxpool)构建了高效、安全的模块,同时利用一些特定的优化来提高效率。包括设置阶段在内,我们的总通信量略大于 SecureNN (PoPETs’19)和 Falcon (PoPETs’21)这两种最先进的解决方案,但当在线阶段必须优先优化时,差距并不显著。使用 Meteor,我们对各种神经网络进行了广泛的评估。与 SecureNN 和 Falcon 相比,我们将在线通信成本最多降低了25.6 × 和1.5 × ,并将在线推理的运行时间在 LAN (相应地,WAN)环境下最多提升了9.8 × (相应地,8.1 ×)和1.5 × (相应地,2.1 ×)。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Meteor:+Improved+Secure+3-Party+Neural+Network+Inference+with+Reducing+Online+Communication+Costs)|0| +|[IRWArt: Levering Watermarking Performance for Protecting High-quality Artwork Images](https://doi.org/10.1145/3543507.3583489)|Yuanjing Luo, Tongqing Zhou, Fang Liu, Zhiping Cai|Hunan University, China; National University of Defense Technology, China|Increasing artwork plagiarism incidents underscore the urgent need for reliable copyright protection for high-quality artwork images. Although watermarking is helpful to this issue, existing methods are limited in imperceptibility and robustness. To provide high-level protection for valuable artwork images, we propose a novel invisible robust watermarking framework, dubbed as IRWArt. In our architecture, the embedding and recovery of the watermark are treated as a pair of image transformations’ inverse problems, and can be implemented through the forward and backward processes of an invertible neural network (INN), respectively. For high visual quality, we embed the watermark in high-frequency domains with minimal impact on artwork and supervise image reconstruction using a human visual system(HVS)-consistent deep perceptual loss. For strong plagiarism resistance, we construct a quality enhancement module for the embedded image against possible distortions caused by plagiarism actions. Moreover, the two-stage contrastive training strategy enables the simultaneous realization of the above two goals. Experimental results on 4 datasets demonstrate the superiority of our IRWArt over other state-of-the-art watermarking methods. Code: https://github.com/1024yy/IRWArt.|越来越多的艺术品剽窃事件突出表明,迫切需要为高质量的艺术品图像提供可靠的版权保护。虽然水印技术有助于解决这一问题,但现有的水印方法在不可感知性和鲁棒性方面存在局限性。为了对有价值的艺术品图像提供高层次的保护,我们提出了一种新的不可见的鲁棒水印框架,称为 IRWArt。在我们的体系结构中,水印的嵌入和恢复被视为一对图像变换的逆问题,可以分别通过可逆神经网络(INN)的正向和反向过程来实现。为了提高视觉质量,我们将水印嵌入到高频域中,尽量减少对作品的影响,并使用人类视觉系统(HVS)一致的深度感知损失来监督图像重建。为了提高嵌入图像的抗剽窃能力,我们构建了一个质量增强模块,用于对抗剽窃行为可能造成的图像失真。此外,两阶段对比训练策略可以同时实现上述两个目标。在4个数据集上的实验结果表明了我们的 IRWArt 相对于其他最先进的水印方法的优越性。代码: https://github.com/1024yy/IRWArt。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=IRWArt:+Levering+Watermarking+Performance+for+Protecting+High-quality+Artwork+Images)|0| |[CapEnrich: Enriching Caption Semantics for Web Images via Cross-modal Pre-trained Knowledge](https://doi.org/10.1145/3543507.3583232)|Linli Yao, Weijing Chen, Qin Jin|School of Information, Renmin University of China, China|Automatically generating textual descriptions for massive unlabeled images on the web can greatly benefit realistic web applications, e.g. multimodal retrieval and recommendation. However, existing models suffer from the problem of generating ``over-generic'' descriptions, such as their tendency to generate repetitive sentences with common concepts for different images. These generic descriptions fail to provide sufficient textual semantics for ever-changing web images. 
Inspired by the recent success of Vision-Language Pre-training (VLP) models that learn diverse image-text concept alignment during pretraining, we explore leveraging their cross-modal pre-trained knowledge to automatically enrich the textual semantics of image descriptions. With no need for additional human annotations, we propose a plug-and-play framework, i.e., CapEnrich, to complement the generic image descriptions with more semantic details. Specifically, we first propose an automatic data-building strategy to get desired training sentences, based on which we then adopt prompting strategies, i.e. learnable and template prompts, to incentivize VLP models to generate more textual details. For learnable templates, we fix the whole VLP model and only tune the prompt vectors, which leads to two advantages: 1) the pre-training knowledge of VLP models can be reserved as much as possible to describe diverse visual concepts; 2) only lightweight trainable parameters are required, so it is friendly to low data resources. Extensive experiments show that our method significantly improves the descriptiveness and diversity of generated sentences for web images. The code is available at https://github.com/yaolinli/CapEnrich.|为网络上的大量未标记图像自动生成文本描述,可以大大有益于现实的网络应用程序,例如多模式检索和推荐。然而,现有的模型存在“过度泛型”描述的问题,例如它们倾向于为不同的图像生成具有共同概念的重复句子。这些通用的描述无法为不断变化的网络图像提供足够的文本语义。受最近成功的视觉语言预训练(VLP)模型(它们在预训练过程中学习多样的图像-文本概念对齐)的启发,我们探索利用其跨模式预训练知识来自动丰富图像描述的文本语义。由于不需要额外的人工注释,我们提出了一个即插即用的框架,即 CapEnrich,用更多的语义细节来补充通用图像描述。具体来说,我们首先提出一个自动的数据建立策略,以获得所需的训练句子,然后在此基础上,我们采用提示策略,即可学习和模板提示,以激励 VLP 模型生成更多的文本细节。对于可学习的模板,我们冻结整个 VLP 模型,只对提示向量进行调整,这带来了两个好处: 1) VLP 模型的预训练知识可以尽可能地保留,以描述不同的视觉概念; 2)只需要轻量级的可训练参数,因此对低数据资源是友好的。大量实验表明,该方法显著提高了网络图像生成句子的描述性和多样性。代码可在 https://github.com/yaolinli/capenrich 获取。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=CapEnrich:+Enriching+Caption+Semantics+for+Web+Images+via+Cross-modal+Pre-trained+Knowledge)|0| |[MLN4KB: an efficient Markov logic network engine for large-scale knowledge bases and structured logic rules](https://doi.org/10.1145/3543507.3583248)|Huang Fang, Yang Liu, Yunfeng Cai, Mingming Sun|Baidu, China|Markov logic network (MLN) is a powerful statistical modeling framework for probabilistic logic reasoning. Despite the elegancy and effectiveness of MLN, the inference of MLN is known to suffer from an efficiency issue. Even the state-of-the-art MLN engines cannot scale to medium-size real-world knowledge bases in the open-world setting, i.e., all unobserved facts in the knowledge base need predictions. In this work, by focusing on a certain class of first-order logic rules that are sufficiently expressive, we develop a highly efficient MLN inference engine called MLN4KB that can leverage the sparsity of knowledge bases. MLN4KB enjoys quite strong theoretical properties; its space and time complexities can be exponentially smaller than existing MLN engines. Experiments on both synthetic and real-world knowledge bases demonstrate the effectiveness of the proposed method. MLN4KB is orders of magnitude faster (more than 10^3 times faster on some datasets) than existing MLN engines in the open-world setting. Without any approximation tricks, MLN4KB can scale to real-world knowledge bases including WN-18 and YAGO3-10 and achieve decent prediction accuracy without bells and whistles. We implement MLN4KB as a Julia package called MLN4KB.jl. The package supports both maximum a posteriori (MAP) inference and learning the weights of rules.
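The MLN4KB entry above exploits knowledge-base sparsity: grounding a first-order rule only needs to touch stored facts rather than the full space of possible groundings. A toy illustration of that idea for a single Horn rule (hypothetical data; this is not the MLN4KB.jl API or its Julia implementation):

```python
# Toy illustration of sparsity-aware rule grounding, in the spirit of the
# MLN4KB entry above. Rule: friends(x, y) AND smokes(x) => smokes(y).
from collections import defaultdict

friends = {("alice", "bob"), ("bob", "carol")}  # stored facts only
smokes = {"alice"}

# Index friends by their first argument so grounding only visits known facts.
by_head = defaultdict(set)
for x, y in friends:
    by_head[x].add(y)

# One forward-chaining step costs O(#friends), not O(|entities|^2),
# because we never enumerate entity pairs that are absent from the KB.
inferred = {y for x in smokes for y in by_head.get(x, set())}
print(inferred)  # {'bob'}
```

Weighted MAP inference, as the entry describes, repeats this kind of grounding under rule weights; the sparsity argument is what keeps the space and time costs small.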
MLN4KB.jl is publicly available at https://github.com/baidu-research/MLN4KB .|马尔可夫逻辑网络(MLN)是一个强大的统计建模框架,用于概率逻辑推理。尽管 MLN 具有优雅性和有效性,但 MLN 的推理却存在效率问题。即使是最先进的 MLN 引擎也无法在开放世界环境下扩展到中等规模的现实世界知识库,即知识库中所有未观察到的事实都需要预测。在这项工作中,通过关注一类具有足够表现力的一阶逻辑规则,我们开发了一种高效的 MLN 推理机,称为 MLN4KB,它可以利用知识库的稀疏性。MLN4KB 具有很强的理论性质,其空间和时间复杂度可以比现有的 MLN 引擎小指数量级。在合成知识库和现实知识库上的实验表明了该方法的有效性。MLN4KB 比开放世界中现有的 MLN 引擎快几个数量级(在一些数据集上快10^3倍以上)。在没有任何近似技巧的情况下,MLN4KB 可以扩展到包括 WN-18和 YAGO3-10在内的真实世界的知识库,并且不需要花哨的功夫就能获得相当高的预测精度。我们将 MLN4KB 实现为一个名为 MLN4KB.jl 的 Julia 包。该软件包支持最大后验(MAP)推理和学习规则的权重。MLN4KB.jl 公开于 https://github.com/baidu-research/mln4kb。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=MLN4KB:+an+efficient+Markov+logic+network+engine+for+large-scale+knowledge+bases+and+structured+logic+rules)|0| -|[Learning Long- and Short-term Representations for Temporal Knowledge Graph Reasoning](https://doi.org/10.1145/3543507.3583242)|Mengqi Zhang, Yuwei Xia, Qiang Liu, Shu Wu, Liang Wang|School of Cyber Security, University of Chinese Academy of Sciences, China and Institute of Information Engineering, Chinese Academy of Sciences, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, China and CRIPAC, MAIS, Institute of Automation, Chinese Academy of Sciences, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, China and CRIPAC,MAIS, Institute of Automation, Chinese Academy of Sciences, China|Temporal Knowledge graph (TKG) reasoning aims to predict missing facts based on historical TKG data. Most of the existing methods are incapable of explicitly modeling the long-term time dependencies from history and neglect the adaptive integration of the long- and short-term information. To tackle these problems, we propose a novel method that utilizes a designed Hierarchical Relational Graph Neural Network to learn the Long- and Short-term representations for TKG reasoning, namely HGLS. Specifically, to explicitly associate entities in different timestamps, we first transform the TKG into a global graph. Based on the built graph, we design a Hierarchical Relational Graph Neural Network that executes in two levels: The sub-graph level is to capture the semantic dependencies within concurrent facts of each KG. And the global-graph level aims to model the temporal dependencies between entities. Furthermore, we design a module to extract the long- and short-term information from the output of these two levels. Finally, the long- and short-term representations are fused into a unified one by Gating Integration for entity prediction. Extensive experiments on four datasets demonstrate the effectiveness of HGLS.|时态知识图(TKG)推理的目的是根据历史 TKG 数据预测缺失事实。现有的方法大多不能从历史上明确地建立长期时间依赖关系模型,忽视了长期和短期信息的自适应集成。为了解决这些问题,我们提出了一种利用所设计的层次关系图神经网络来学习 TKG 推理的长期和短期表示的新方法,即 HGLS。具体来说,为了显式地关联不同时间戳中的实体,我们首先将 TKG 转换为一个全局图。基于构建的图,我们设计了一个分层关系图神经网络,该网络在两个层次上执行: 子图层次是在每个 KG 的并发事实中捕获语义依赖。全局图层次的目标是建立实体之间的时间依赖关系模型。此外,我们还设计了一个模块,从这两个级别的输出中提取长期和短期信息。最后,通过门限积分将长期和短期表示融合为一个统一的表示,用于实体预测。在四个数据集上的大量实验证明了 HGLS 的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+Long-+and+Short-term+Representations+for+Temporal+Knowledge+Graph+Reasoning)|0| -|[Mutually-paced Knowledge Distillation for Cross-lingual Temporal Knowledge Graph Reasoning](https://doi.org/10.1145/3543507.3583407)|Ruijie Wang, Zheng Li, Jingfeng Yang, Tianyu Cao, Chao Zhang, Bing Yin, Tarek F.
Abdelzaher|School of Computational Science and Engineering, Georgia Institute of Technology, USA; Department of Computer Science, University of Illinois Urbana-Champaign, USA; Amazon.com Inc, USA|This paper investigates cross-lingual temporal knowledge graph reasoning problem, which aims to facilitate reasoning on Temporal Knowledge Graphs (TKGs) in low-resource languages by transfering knowledge from TKGs in high-resource ones. The cross-lingual distillation ability across TKGs becomes increasingly crucial, in light of the unsatisfying performance of existing reasoning methods on those severely incomplete TKGs, especially in low-resource languages. However, it poses tremendous challenges in two aspects. First, the cross-lingual alignments, which serve as bridges for knowledge transfer, are usually too scarce to transfer sufficient knowledge between two TKGs. Second, temporal knowledge discrepancy of the aligned entities, especially when alignments are unreliable, can mislead the knowledge distillation process. We correspondingly propose a mutually-paced knowledge distillation model MP-KD, where a teacher network trained on a source TKG can guide the training of a student network on target TKGs with an alignment module. Concretely, to deal with the scarcity issue, MP-KD generates pseudo alignments between TKGs based on the temporal information extracted by our representation module. To maximize the efficacy of knowledge transfer and control the noise caused by the temporal knowledge discrepancy, we enhance MP-KD with a temporal cross-lingual attention mechanism to dynamically estimate the alignment strength. The two procedures are mutually paced along with model training. Extensive experiments on twelve cross-lingual TKG transfer tasks in the EventKG benchmark demonstrate the effectiveness of the proposed MP-KD method.|本文研究了跨语言时态知识图的推理问题,目的是通过将知识从低资源语言的时态知识图中转移到高资源语言的时态知识图中来实现对低资源语言的时态知识图的推理。由于现有的推理方法对于严重不完备的 TKG,特别是在资源不足的 TKG 中的推理效果不理想,跨 TKG 的跨语言精馏能力变得越来越重要。然而,它在两个方面提出了巨大的挑战。首先,作为知识转移桥梁的跨语言对齐通常过于稀缺,无法在两个 TKG 之间转移足够的知识。其次,排列实体的时间知识差异,特别是排列不可靠时,会误导知识提取过程。相应地,我们提出了一个相互节奏的知识提取模型 MP-KD,其中一个在源 TKG 上训练的教师网络可以用一个对齐模块来指导学生网络在目标 TKG 上的训练。具体来说,MP-KD 基于表示模块提取的时间信息生成 TKG 之间的伪对齐,以解决缺失问题。为了最大限度地提高知识转移的效率,控制时间知识差异带来的噪声,我们采用时间跨语言注意机制增强 MP-KD,动态估计对齐强度。这两个程序与模型训练是相互配合的。通过对 EventKG 基准中12个跨语言 TKG 传递任务的大量实验,证明了所提出的 MP-KD 方法的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Mutually-paced+Knowledge+Distillation+for+Cross-lingual+Temporal+Knowledge+Graph+Reasoning)|0| -|[Large-Scale Analysis of New Employee Network Dynamics](https://doi.org/10.1145/3543507.3583400)|Yulin Yu, Longqi Yang, Siân Lindley, Mengting Wan|Microsoft Research, United Kingdom; School of Information, University of Michigan, USA; Microsoft, USA|The COVID-19 pandemic has accelerated digital transformations across industries, but also introduced new challenges into workplaces, including the difficulties of effectively socializing with colleagues when working remotely. This challenge is exacerbated for new employees who need to develop workplace networks from the outset. In this paper, by analyzing a large-scale telemetry dataset of more than 10,000 Microsoft employees who joined the company in the first three months of 2022, we describe how new employees interact and telecommute with their colleagues during their ``onboarding'' period. 
Our results reveal that although new hires are gradually expanding networks over time, there still exists significant gaps between their network statistics and those of tenured employees even after the six-month onboarding phase. We also observe that heterogeneity exists among new employees in how their networks change over time, where employees whose job tasks do not necessarily require extensive and diverse connections could be at a disadvantaged position in this onboarding process. By investigating how web-based people recommendations in organizational knowledge base facilitate new employees naturally expand their networks, we also demonstrate the potential of web-based applications for addressing the aforementioned socialization challenges. Altogether, our findings provide insights on new employee network dynamics in remote and hybrid work environments, which may help guide organizational leaders and web application developers on quantifying and improving the socialization experiences of new employees in digital workplaces.|2019冠状病毒疾病疫情加速了各行各业的数字化转型,但也给工作场所带来了新的挑战,包括在远程工作时难以有效地与同事进行社交。对于从一开始就需要发展工作场所网络的新员工来说,这一挑战更加严峻。在这篇论文中,我们通过分析一个大规模的遥测数据集,这个数据集包含了2022年前三个月加入微软的10,000多名员工,我们描述了新员工在“入职”期间是如何与他们的同事进行互动和远程办公的。我们的研究结果表明,尽管新员工的人际网络随着时间的推移逐渐扩大,但即使在六个月的入职阶段之后,他们的人际网络统计数据与终身雇员之间仍然存在显著差距。我们还观察到,新员工的网络随着时间的推移如何变化存在异质性,其工作任务不一定需要广泛和多样化的联系的员工可能在这一入职过程中处于不利地位。通过调查组织知识库中基于网络的人员推荐如何促进新员工自然地扩展他们的网络,我们也展示了基于网络的应用程序在解决上述社会化挑战方面的潜力。总之,我们的研究结果提供了远程和混合工作环境中新员工网络动态的见解,这可能有助于指导组织领导者和网络应用程序开发人员量化和改善新员工在数字工作场所的社会化经验。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Large-Scale+Analysis+of+New+Employee+Network+Dynamics)|0| +|[Learning Long- and Short-term Representations for Temporal Knowledge Graph Reasoning](https://doi.org/10.1145/3543507.3583242)|Mengqi Zhang, Yuwei Xia, Qiang Liu, Shu Wu, Liang Wang|School of Cyber Security, University of Chinese Academy of Sciences, China and Institute of Information Engineering, Chinese Academy of Sciences, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, China and CRIPAC,MAIS, Institute of Automation, Chinese Academy of Sciences, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, China and CRIPAC, MAIS, Institute of Automation, Chinese Academy of Sciences, China|Temporal Knowledge graph (TKG) reasoning aims to predict missing facts based on historical TKG data. Most of the existing methods are incapable of explicitly modeling the long-term time dependencies from history and neglect the adaptive integration of the long- and short-term information. To tackle these problems, we propose a novel method that utilizes a designed Hierarchical Relational Graph Neural Network to learn the Long- and Short-term representations for TKG reasoning, namely HGLS. Specifically, to explicitly associate entities in different timestamps, we first transform the TKG into a global graph. Based on the built graph, we design a Hierarchical Relational Graph Neural Network that executes in two levels: The sub-graph level is to capture the semantic dependencies within concurrent facts of each KG. And the global-graph level aims to model the temporal dependencies between entities. Furthermore, we design a module to extract the long- and short-term information from the output of these two levels. Finally, the long- and short-term representations are fused into a unified one by Gating Integration for entity prediction. 
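The HGLS entry above fuses long- and short-term entity representations through Gating Integration. A common form of such a gate is a learned sigmoid mix over the two inputs, as in this sketch (an assumed parameterization; the paper's exact formulation may differ):

```python
# Sketch of a gating-integration step as described in the HGLS entry above:
# a learned sigmoid gate mixes long- and short-term representations.
# The paper's exact parameterization may differ.
import torch
import torch.nn as nn

class GatingIntegration(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)  # scores each fused dimension

    def forward(self, h_long, h_short):
        g = torch.sigmoid(self.gate(torch.cat([h_long, h_short], dim=-1)))
        return g * h_long + (1.0 - g) * h_short  # per-dimension convex mix

h_long, h_short = torch.randn(32, 64), torch.randn(32, 64)
fused = GatingIntegration(64)(h_long, h_short)  # shape (32, 64)
```

The gate lets the model decide, per entity and per dimension, how much of the long-range history versus the recent context should drive entity prediction.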
Extensive experiments on four datasets demonstrate the effectiveness of HGLS.|时态知识图(TKG)推理的目的是根据历史 TKG 数据预测缺失事实。现有的方法大多不能从历史上明确地建立长期时间依赖关系模型,忽视了长期和短期信息的自适应集成。为了解决这些问题,我们提出了一种利用所设计的层次关系图神经网络来学习 TKG 推理的长期和短期表示的新方法,即 HGLS。具体来说,为了显式地关联不同时间戳中的实体,我们首先将 TKG 转换为一个全局图。基于构建的图,我们设计了一个分层关系图神经网络,该网络在两个层次上执行: 子图层次是在每个 KG 的并发事实中捕获语义依赖。全局图层次的目标是建立实体之间的时间依赖关系模型。此外,我们还设计了一个模块,从这两个级别的输出中提取长期和短期信息。最后,通过门控集成(Gating Integration)将长期和短期表示融合为一个统一的表示,用于实体预测。在四个数据集上的大量实验证明了 HGLS 的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Learning+Long-+and+Short-term+Representations+for+Temporal+Knowledge+Graph+Reasoning)|0| +|[Mutually-paced Knowledge Distillation for Cross-lingual Temporal Knowledge Graph Reasoning](https://doi.org/10.1145/3543507.3583407)|Ruijie Wang, Zheng Li, Jingfeng Yang, Tianyu Cao, Chao Zhang, Bing Yin, Tarek F. Abdelzaher|School of Computational Science and Engineering, Georgia Institute of Technology, USA; Amazon.com Inc, USA; Department of Computer Science, University of Illinois Urbana-Champaign, USA|This paper investigates the cross-lingual temporal knowledge graph reasoning problem, which aims to facilitate reasoning on Temporal Knowledge Graphs (TKGs) in low-resource languages by transferring knowledge from TKGs in high-resource ones. The cross-lingual distillation ability across TKGs becomes increasingly crucial, in light of the unsatisfying performance of existing reasoning methods on those severely incomplete TKGs, especially in low-resource languages. However, it poses tremendous challenges in two aspects. First, the cross-lingual alignments, which serve as bridges for knowledge transfer, are usually too scarce to transfer sufficient knowledge between two TKGs. Second, temporal knowledge discrepancy of the aligned entities, especially when alignments are unreliable, can mislead the knowledge distillation process. We correspondingly propose a mutually-paced knowledge distillation model MP-KD, where a teacher network trained on a source TKG can guide the training of a student network on target TKGs with an alignment module. Concretely, to deal with the scarcity issue, MP-KD generates pseudo alignments between TKGs based on the temporal information extracted by our representation module. To maximize the efficacy of knowledge transfer and control the noise caused by the temporal knowledge discrepancy, we enhance MP-KD with a temporal cross-lingual attention mechanism to dynamically estimate the alignment strength. The two procedures are mutually paced along with model training.
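The MP-KD entry above scores pseudo-alignments between two TKGs with a cross-lingual attention mechanism. Stripped of its temporal machinery, the core scoring step is a scaled dot-product attention between source and target entity embeddings, roughly as follows (illustrative only; not the paper's exact formulation):

```python
# Illustrative sketch of scoring pseudo-alignments between two TKGs with
# scaled dot-product attention, in the spirit of the MP-KD entry above.
# The paper's temporal cross-lingual attention is more involved.
import numpy as np

def alignment_strength(src_emb, tgt_emb):
    """src_emb: (n, d) source entities; tgt_emb: (m, d) target entities.
    Returns an (n, m) row-stochastic matrix of alignment strengths."""
    d = src_emb.shape[1]
    scores = src_emb @ tgt_emb.T / np.sqrt(d)
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    return w / w.sum(axis=1, keepdims=True)

strength = alignment_strength(np.random.randn(5, 16), np.random.randn(7, 16))
print(strength.shape, strength.sum(axis=1))  # (5, 7); each row sums to 1
```

Down-weighting low-strength pairs is what lets the distillation loss tolerate unreliable pseudo-alignments.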
Extensive experiments on twelve cross-lingual TKG transfer tasks in the EventKG benchmark demonstrate the effectiveness of the proposed MP-KD method.|本文研究了跨语言时态知识图的推理问题,目的是通过将知识从低资源语言的时态知识图中转移到高资源语言的时态知识图中来实现对低资源语言的时态知识图的推理。由于现有的推理方法对于严重不完备的 TKG,特别是在资源不足的 TKG 中的推理效果不理想,跨 TKG 的跨语言精馏能力变得越来越重要。然而,它在两个方面提出了巨大的挑战。首先,作为知识转移桥梁的跨语言对齐通常过于稀缺,无法在两个 TKG 之间转移足够的知识。其次,排列实体的时间知识差异,特别是排列不可靠时,会误导知识提取过程。相应地,我们提出了一个相互节奏的知识提取模型 MP-KD,其中一个在源 TKG 上训练的教师网络可以用一个对齐模块来指导学生网络在目标 TKG 上的训练。具体来说,MP-KD 基于表示模块提取的时间信息生成 TKG 之间的伪对齐,以解决缺失问题。为了最大限度地提高知识转移的效率,控制时间知识差异带来的噪声,我们采用时间跨语言注意机制增强 MP-KD,动态估计对齐强度。这两个程序与模型训练是相互配合的。通过对 EventKG 基准中12个跨语言 TKG 传递任务的大量实验,证明了所提出的 MP-KD 方法的有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Mutually-paced+Knowledge+Distillation+for+Cross-lingual+Temporal+Knowledge+Graph+Reasoning)|0| +|[Large-Scale Analysis of New Employee Network Dynamics](https://doi.org/10.1145/3543507.3583400)|Yulin Yu, Longqi Yang, Siân Lindley, Mengting Wan|School of Information, University of Michigan, USA; Microsoft Research, United Kingdom; Microsoft, USA|The COVID-19 pandemic has accelerated digital transformations across industries, but also introduced new challenges into workplaces, including the difficulties of effectively socializing with colleagues when working remotely. This challenge is exacerbated for new employees who need to develop workplace networks from the outset. In this paper, by analyzing a large-scale telemetry dataset of more than 10,000 Microsoft employees who joined the company in the first three months of 2022, we describe how new employees interact and telecommute with their colleagues during their ``onboarding'' period. Our results reveal that although new hires are gradually expanding networks over time, there still exists significant gaps between their network statistics and those of tenured employees even after the six-month onboarding phase. We also observe that heterogeneity exists among new employees in how their networks change over time, where employees whose job tasks do not necessarily require extensive and diverse connections could be at a disadvantaged position in this onboarding process. By investigating how web-based people recommendations in organizational knowledge base facilitate new employees naturally expand their networks, we also demonstrate the potential of web-based applications for addressing the aforementioned socialization challenges. 
Altogether, our findings provide insights on new employee network dynamics in remote and hybrid work environments, which may help guide organizational leaders and web application developers on quantifying and improving the socialization experiences of new employees in digital workplaces.|2019冠状病毒疾病疫情加速了各行各业的数字化转型,但也给工作场所带来了新的挑战,包括在远程工作时难以有效地与同事进行社交。对于从一开始就需要发展工作场所网络的新员工来说,这一挑战更加严峻。在这篇论文中,我们通过分析一个大规模的遥测数据集,这个数据集包含了2022年前三个月加入微软的10,000多名员工,我们描述了新员工在“入职”期间是如何与他们的同事进行互动和远程办公的。我们的研究结果表明,尽管新员工的人际网络随着时间的推移逐渐扩大,但即使在六个月的入职阶段之后,他们的人际网络统计数据与终身雇员之间仍然存在显著差距。我们还观察到,新员工的网络随着时间的推移如何变化存在异质性,其工作任务不一定需要广泛和多样化的联系的员工可能在这一入职过程中处于不利地位。通过调查组织知识库中基于网络的人员推荐如何促进新员工自然地扩展他们的网络,我们也展示了基于网络的应用程序在解决上述社会化挑战方面的潜力。总之,我们的研究结果提供了远程和混合工作环境中新员工网络动态的见解,这可能有助于指导组织领导者和网络应用程序开发人员量化和改善新员工在数字工作场所的社会化经验。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Large-Scale+Analysis+of+New+Employee+Network+Dynamics)|0| |[MassNE: Exploring Higher-Order Interactions with Marginal Effect for Massive Battle Outcome Prediction](https://doi.org/10.1145/3543507.3583390)|Yin Gu, Kai Zhang, Qi Liu, Xin Lin, Zhenya Huang, Enhong Chen|Anhui Province Key Laboratory of Big Data Analysis and Application, University of Science and Technology of China & State Key Laboratory of Cognitive Intelligence, China; Anhui Province Key Laboratory of Big Data Analysis and Application, University of Science and Technology of China & State Key Laboratory of Cognitive Intelligence, China and Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, China|In online games, predicting massive battle outcomes is a fundamental task of many applications, such as team optimization and tactical formulation. Existing works do not pay adequate attention to the massive battle. They either seek to evaluate individuals in isolation or mine simple pair-wise interactions between individuals, neither of which effectively captures the intricate interactions between massive units (e.g., individuals). Furthermore, as the team size increases, the phenomenon of diminishing marginal utility of units emerges. Such a diminishing pattern is rarely noticed in previous work, and how to capture it from data remains a challenge. To this end, we propose a novel Massive battle outcome predictor with margiNal Effect modules, namely MassNE, which comprehensively incorporates individual effects, cooperation effects (i.e., intra-team interactions) and suppression effects (i.e., inter-team interactions) for predicting battle outcomes. Specifically, we design marginal effect modules to learn how units’ marginal utility changes with respect to their number, where the monotonicity assumption is applied to ensure rationality. In addition, we evaluate the current classical models and provide mathematical proofs that MassNE is able to generalize several earlier works in massive settings. Massive battle datasets generated by StarCraft II APIs are adopted to evaluate the performances of MassNE.
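The MassNE entry above constrains marginal-utility curves to be monotone in the unit count. One standard way to enforce that is to parameterize each per-unit increment as a strictly positive quantity and accumulate, as in this sketch (an assumed parameterization, not the paper's actual module; the diminishing-returns shape is then left for the data to determine):

```python
# Sketch of a monotonicity-constrained marginal-utility curve, in the
# spirit of the MassNE entry above (assumed parameterization, not the
# paper's module). Total utility is non-decreasing in unit count because
# every per-unit increment is forced positive via softplus.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MarginalUtility(nn.Module):
    def __init__(self, max_units):
        super().__init__()
        self.raw = nn.Parameter(torch.zeros(max_units))  # one increment per added unit

    def forward(self, n):
        increments = F.softplus(self.raw)         # > 0, so utility is monotone in n
        totals = torch.cumsum(increments, dim=0)  # utility of 1..max_units units
        return totals[n - 1]

u = MarginalUtility(max_units=50)
print(u(torch.tensor([1, 10, 50])))  # non-decreasing total utility
```

If training shrinks the later increments, the learned curve exhibits exactly the diminishing marginal utility the entry describes.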
Extensive experiments empirically demonstrate the effectiveness of MassNE, and MassNE can reveal reasonable cooperation effects, suppression effects, and marginal utilities of combat units from the data.|在网络游戏中,预测大规模战斗的结果是许多应用程序的基本任务,如团队优化和战术制定。现有的工程对这场大规模的战斗没有给予足够的重视。他们要么试图孤立地评估个体,要么挖掘个体之间简单的成对互动,这两种方法都没有有效地捕捉到大量单位(例如个体)之间错综复杂的互动。此外,随着团队规模的扩大,单位边际效用递减的现象也随之出现。这种递减模式在以前的工作中很少被注意到,如何从数据中捕获它仍然是一个挑战。为此,我们提出了一个新的具有边际效应模块的大规模战斗结果预测器,即 MassNE,它综合了个体效应,合作效应(即团队内部相互作用)和抑制效应(即团队间相互作用)来预测战斗结果。具体来说,我们设计边际效应模块来研究单位的边际效用是如何随着数量的变化而变化的,其中使用了单调性假设来确保合理性。此外,我们评估了目前的经典模型,并提供数学证明,MassNE 能够在大规模的背景下推广几个早期的作品。采用星际争霸 II API 生成的海量战斗数据集对 MassNE 的性能进行评估。大量的实验结果表明,MassNE 和 MassNE 能够从数据中揭示作战单元的合理协同效应、抑制效应和边际效用。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=MassNE:+Exploring+Higher-Order+Interactions+with+Marginal+Effect+for+Massive+Battle+Outcome+Prediction)|0| -|[Online Advertising in Ukraine and Russia During the 2022 Russian Invasion](https://doi.org/10.1145/3543507.3583484)|Christina Yeung, Umar Iqbal, Yekaterina Tsipenyuk O'Neil, Tadayoshi Kohno, Franziska Roesner|Security and Privacy Research Lab, University of Washington, USA; Micro Focus, USA|Online ads are a major source of information on the web. The mass reach of online advertising is often leveraged for information dissemination, at times with an objective to influence public opinion (e.g., election misinformation). We hypothesized that online advertising, due to its reach and potential, might have been used to spread information around the 2022 Russian invasion of Ukraine. Thus, to understand the online ad ecosystem during this conflict, we conducted a five-month long large-scale measurement study of online advertising in Ukraine, Russia, and the US. We studied advertising trends of ad platforms that delivered ads in Ukraine, Russia, and the US and conducted an in-depth qualitative analysis of the conflict-related ad content. We found that prominent US-based advertisers continued to support Russian websites, and a portion of online ads were used to spread conflict-related information, including protesting the invasion, and spreading awareness, which might have otherwise potentially been censored in Russia.|在线广告是网络信息的主要来源。在线广告的广泛传播经常被用于信息传播,有时是为了影响公众舆论(例如,选举错误信息)。我们假设,由于网络广告的影响力和潜力,它可能被用来传播2022年俄罗斯入侵乌克兰前后的信息。因此,为了了解这场冲突中的在线广告生态系统,我们对乌克兰、俄罗斯和美国的在线广告进行了为期五个月的大规模测量研究。我们研究了在乌克兰、俄罗斯和美国发布广告的广告平台的广告趋势,并对与冲突相关的广告内容进行了深入的定性分析。我们发现,美国知名的广告商继续支持俄罗斯网站,一部分在线广告被用于传播与冲突有关的信息,包括抗议入侵和传播意识,否则这些信息可能会在俄罗斯受到审查。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Online+Advertising+in+Ukraine+and+Russia+During+the+2022+Russian+Invasion)|0| -|[Understanding the Behaviors of Toxic Accounts on Reddit](https://doi.org/10.1145/3543507.3583522)|Deepak Kumar, Jeff T. Hancock, Kurt Thomas, Zakir Durumeric||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Understanding+the+Behaviors+of+Toxic+Accounts+on+Reddit)|0| -|[Online Reviews Are Leading Indicators of Changes in K-12 School Attributes](https://doi.org/10.1145/3543507.3583531)|Linsen Li, Aron Culotta, Douglas N. 
Harris, Nicholas Mattei||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Online+Reviews+Are+Leading+Indicators+of+Changes+in+K-12+School+Attributes)|0| +|[Online Advertising in Ukraine and Russia During the 2022 Russian Invasion](https://doi.org/10.1145/3543507.3583484)|Christina Yeung, Umar Iqbal, Yekaterina Tsipenyuk O'Neil, Tadayoshi Kohno, Franziska Roesner|Micro Focus, USA; Security and Privacy Research Lab, University of Washington, USA|Online ads are a major source of information on the web. The mass reach of online advertising is often leveraged for information dissemination, at times with an objective to influence public opinion (e.g., election misinformation). We hypothesized that online advertising, due to its reach and potential, might have been used to spread information around the 2022 Russian invasion of Ukraine. Thus, to understand the online ad ecosystem during this conflict, we conducted a five-month long large-scale measurement study of online advertising in Ukraine, Russia, and the US. We studied advertising trends of ad platforms that delivered ads in Ukraine, Russia, and the US and conducted an in-depth qualitative analysis of the conflict-related ad content. We found that prominent US-based advertisers continued to support Russian websites, and a portion of online ads were used to spread conflict-related information, including protesting the invasion, and spreading awareness, which might have otherwise potentially been censored in Russia.|在线广告是网络信息的主要来源。在线广告的广泛传播经常被用于信息传播,有时是为了影响公众舆论(例如,选举错误信息)。我们假设,由于网络广告的影响力和潜力,它可能被用来传播2022年俄罗斯入侵乌克兰前后的信息。因此,为了了解这场冲突中的在线广告生态系统,我们对乌克兰、俄罗斯和美国的在线广告进行了为期五个月的大规模测量研究。我们研究了在乌克兰、俄罗斯和美国发布广告的广告平台的广告趋势,并对与冲突相关的广告内容进行了深入的定性分析。我们发现,美国知名的广告商继续支持俄罗斯网站,一部分在线广告被用于传播与冲突有关的信息,包括抗议入侵和传播意识,否则这些信息可能会在俄罗斯受到审查。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Online+Advertising+in+Ukraine+and+Russia+During+the+2022+Russian+Invasion)|0| +|[Understanding the Behaviors of Toxic Accounts on Reddit](https://doi.org/10.1145/3543507.3583522)|Deepak Kumar, Jeff T. Hancock, Kurt Thomas, Zakir Durumeric|Google, USA; Stanford University, USA|Toxic comments are the top form of hate and harassment experienced online. While many studies have investigated the types of toxic comments posted online, the effects that such content has on people, and the impact of potential defenses, no study has captured the behaviors of the accounts that post toxic comments or how such attacks are operationalized. In this paper, we present a measurement study of 929K accounts that post toxic comments on Reddit over an 18 month period. Combined, these accounts posted over 14 million toxic comments that encompass insults, identity attacks, threats of violence, and sexual harassment. We explore the impact that these accounts have on Reddit, the targeting strategies that abusive accounts adopt, and the distinct patterns that distinguish classes of abusive accounts. 
Our analysis informs the nuanced interventions needed to curb unwanted toxic behaviors online.|有害的评论是网络上最常见的仇恨和骚扰。尽管许多研究调查了网上发布的有毒评论的类型、这些内容对人们的影响以及潜在防御的影响,但没有一项研究捕捉到发布有毒评论的账户的行为或者这种攻击是如何操作的。在这篇论文中,我们对18个月内在 Reddit 上发布有毒评论的92.9万个账户进行了测量研究。这些账户总共发布了超过1400万条有毒评论,包括侮辱、身份攻击、暴力威胁和性骚扰。我们探讨了这些账户对 Reddit 的影响,滥用账户采用的定位策略,以及区分滥用账户类别的独特模式。我们的分析为控制网上不良行为提供了细致入微的干预措施。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Understanding+the+Behaviors+of+Toxic+Accounts+on+Reddit)|0| +|[Online Reviews Are Leading Indicators of Changes in K-12 School Attributes](https://doi.org/10.1145/3543507.3583531)|Linsen Li, Aron Culotta, Douglas N. Harris, Nicholas Mattei|Department of Computer Science, Tulane University, USA; Department of Economics, Tulane University, USA|School rating websites are increasingly used by parents to assess the quality and fit of U.S. K-12 schools for their children. These online reviews often contain detailed descriptions of a school’s strengths and weaknesses, which both reflect and inform perceptions of a school. Existing work on these text reviews has focused on finding words or themes that underlie these perceptions, but has stopped short of using the textual reviews as leading indicators of school performance. In this paper, we investigate to what extent the language used in online reviews of a school is predictive of changes in the attributes of that school, such as its socio-economic makeup and student test scores. Using over 300K reviews of 70K U.S. schools from a popular ratings website, we apply language processing models to predict whether schools will significantly increase or decrease in an attribute of interest over a future time horizon. We find that using the text improves predictive performance significantly over a baseline model that does not include text but only the historical time-series of the indicators themselves, suggesting that the review text carries predictive power.
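The K-12 entry above frames the task as text classification: given a school's reviews, predict whether an attribute will significantly rise or fall over a future horizon. A minimal version of such a text baseline (toy data and labels invented for illustration; the paper's features and model are not specified here):

```python
# Minimal text-classification baseline in the spirit of the K-12 entry
# above: predict from review text whether a school attribute will rise.
# Toy data; not the paper's actual features, labels, or model.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

reviews = [
    "new principal, strong focus on testing this year",
    "leadership turnover and safety concerns on campus",
    "wonderful diverse community, engaged teachers",
    "test scores keep slipping, little communication",
]
will_improve = [1, 0, 1, 0]  # hypothetical labels over a future horizon

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(reviews, will_improve)
print(model.predict(["focus on testing and new leadership"]))
```

Inspecting the fitted coefficients of a model like this is also how one surfaces the kind of leading-indicator phrases (diversity, leadership changes, testing, safety) that the entry's qualitative analysis reports.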
A qualitative analysis of the most predictive terms and phrases used in the text reviews indicates a number of topics that serve as leading indicators, such as diversity, changes in school leadership, a focus on testing, and school safety.|越来越多的家长使用学校评级网站来评估美国 K-12学校的质量和适合他们孩子的程度。这些在线评论通常包含对一所学校优势和劣势的详细描述,这些描述既反映也影响了人们对学校的看法。关于这些文本评论的现有工作集中在寻找构成这些观念的词汇或主题,但没有使用文本评论作为学校表现的领先指标。在这篇论文中,我们调查了在网上评论一所学校时使用的语言在多大程度上可以预测该学校的属性变化,例如其社会经济构成和学生考试成绩。我们利用一个受欢迎的评级网站上关于美国7万所学校的30万余条评论,应用语言处理模型来预测学校在未来的时间范围内是否会在感兴趣的属性上显著提高或下降。我们发现使用文本比不包括文本但仅包括指标本身的历史时间序列的基线模型显著提高了预测性能,这表明评论文本具有预测能力。对文本评论中使用的最具预测性的术语和短语进行定性分析,可以发现一些作为领先指标的主题,如多样性、学校领导层的变化、对测试的关注和学校安全。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Online+Reviews+Are+Leading+Indicators+of+Changes+in+K-12+School+Attributes)|0| |[Beyond Fine-Tuning: Efficient and Effective Fed-Tuning for Mobile/Web Users](https://doi.org/10.1145/3543507.3583212)|Bingyan Liu, Yifeng Cai, Hongzhe Bi, Ziqi Zhang, Ding Li, Yao Guo, Xiangqun Chen||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Beyond+Fine-Tuning:+Efficient+and+Effective+Fed-Tuning+for+Mobile/Web+Users)|0| -|[Automated WebAssembly Function Purpose Identification With Semantics-Aware Analysis](https://doi.org/10.1145/3543507.3583235)|Alan Romano, Weihang Wang||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Automated+WebAssembly+Function+Purpose+Identification+With+Semantics-Aware+Analysis)|0| -|[SCTAP: Supporting Scenario-Centric Trigger-Action Programming based on Software-Defined Physical Environments](https://doi.org/10.1145/3543507.3583293)|Bingkun Sun, Liwei Shen, Xin Peng, Ziming Wang||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=SCTAP:+Supporting+Scenario-Centric+Trigger-Action+Programming+based+on+Software-Defined+Physical+Environments)|0| -|[DeeProphet: Improving HTTP Adaptive Streaming for Low Latency Live Video by Meticulous Bandwidth Prediction](https://doi.org/10.1145/3543507.3583364)|Kefan Chen, Bo Wang, Wufan Wang, Xiaoyu Li, Fengyuan Ren||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=DeeProphet:+Improving+HTTP+Adaptive+Streaming+for+Low+Latency+Live+Video+by+Meticulous+Bandwidth+Prediction)|0| -|[Is IPFS Ready for Decentralized Video Streaming?](https://doi.org/10.1145/3543507.3583404)|Zhengyu Wu, ChengHao Ryan Yang, Santiago Vargas, Aruna Balasubramanian||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Is+IPFS+Ready+for+Decentralized+Video+Streaming?)|0| -|[SISSI: An Architecture for Semantic Interoperable Self-Sovereign Identity-based Access Control on the Web](https://doi.org/10.1145/3543507.3583409)|Christoph H.J. Braun, Vasil Papanchev, Tobias Käfer||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=SISSI:+An+Architecture+for+Semantic+Interoperable+Self-Sovereign+Identity-based+Access+Control+on+the+Web)|0| -|[Detecting Socially Abnormal Highway Driving Behaviors via Recurrent Graph Attention Networks](https://doi.org/10.1145/3543507.3583452)|Yue Hu, Yuhang Zhang, Yanbing Wang, Daniel B.
Work||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Detecting+Socially+Abnormal+Highway+Driving+Behaviors+via+Recurrent+Graph+Attention+Networks)|0| -|[GROUP: An End-to-end Multi-step-ahead Workload Prediction Approach Focusing on Workload Group Behavior](https://doi.org/10.1145/3543507.3583460)|Binbin Feng, Zhijun Ding||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=GROUP:+An+End-to-end+Multi-step-ahead+Workload+Prediction+Approach+Focusing+on+Workload+Group+Behavior)|0| +|[Automated WebAssembly Function Purpose Identification With Semantics-Aware Analysis](https://doi.org/10.1145/3543507.3583235)|Alan Romano, Weihang Wang|University of Southern California, USA|WebAssembly is a recent web standard built for better performance in web applications. The standard defines a binary code format to use as a compilation target for a variety of languages, such as C, C++, and Rust. The standard also defines a text representation for readability, although, WebAssembly modules are difficult to interpret by human readers, regardless of their experience level. This makes it difficult to understand and maintain any existing WebAssembly code. As a result, third-party WebAssembly modules need to be implicitly trusted by developers as verifying the functionality themselves may not be feasible. To this end, we construct WASPur, a tool to automatically identify the purposes of WebAssembly functions. To build this tool, we first construct an extensive collection of WebAssembly samples that represent the state of WebAssembly. Second, we analyze the dataset and identify the diverse use cases of the collected WebAssembly modules. We leverage the dataset of WebAssembly modules to construct semantics-aware intermediate representations (IR) of the functions in the modules. We encode the function IR for use in a machine learning classifier, and we find that this classifier can predict the similarity of a given function against known named functions with an accuracy rate of 88.07%. We hope our tool will enable inspection of optimized and minified WebAssembly modules that remove function names and most other semantic identifiers.|WebAssembly 是最近为了在 Web 应用程序中获得更好的性能而建立的 Web 标准。该标准定义了一种二进制代码格式,用作各种语言(如 C、 C + + 和 Rust)的编译目标。该标准还为可读性定义了一个文本表示,尽管 WebAssembly 模块很难被人类读者解释,不管他们的经验水平如何。这使得理解和维护任何现有的 WebAssembly 代码变得非常困难。因此,开发人员需要隐式地信任第三方 WebAssembly 模块,因为验证功能本身可能是不可行的。为此,我们构建了 WASPur,这是一个自动识别 WebAssembly 函数用途的工具。为了构建这个工具,我们首先构建一个广泛的 WebAssembly 示例集合,这些示例代表 WebAssembly 的状态。其次,我们分析数据集并识别收集的 WebAssembly 模块的不同用例。我们利用 WebAssembly 模块的数据集来构造模块中函数的语义感知中间表示(IR)。将函数 IR 编码后用于机器学习分类器,结果表明,该分类器可以预测给定函数与已知命名函数的相似度,准确率为88.07% 。我们希望我们的工具能够检查优化和缩小的 WebAssembly 模块,这些模块删除了函数名和大多数其他语义标识符。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Automated+WebAssembly+Function+Purpose+Identification+With+Semantics-Aware+Analysis)|0| +|[SCTAP: Supporting Scenario-Centric Trigger-Action Programming based on Software-Defined Physical Environments](https://doi.org/10.1145/3543507.3583293)|Bingkun Sun, Liwei Shen, Xin Peng, Ziming Wang|School of Computer Science and Shanghai Key Laboratory of Data Science, Fudan University, China|The physical world we live in is accelerating digitalization with the vigorous development of Internet of Things (IoT). Following this trend, Web of Things (WoT) further enables fast and efficient creation of various applications that perceive and act on the physical world using standard Web technologies. 
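The WASPur entry above classifies a WebAssembly function's purpose by encoding a semantics-aware IR of the function and matching it against known named functions. As a loose analogy only (this is not WASPur's IR, features, or classifier), one can featurize a function's instruction mnemonics and fit a nearest-neighbor classifier:

```python
# Loose analogy to the WASPur entry above: classify WebAssembly functions
# by their instruction mix. WASPur's semantics-aware IR and learned model
# are more sophisticated; mnemonic counts are just a stand-in feature here.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Each sample: space-joined mnemonics of one function body (toy examples).
functions = [
    "local.get local.get i32.add",           # arithmetic helper
    "local.get f64.mul f64.add loop br_if",  # numeric kernel
    "call memory.grow i32.store call",       # allocator-like routine
]
labels = ["math", "math", "memory"]

clf = make_pipeline(CountVectorizer(token_pattern=r"\S+"),
                    KNeighborsClassifier(n_neighbors=1))
clf.fit(functions, labels)
print(clf.predict(["local.get i32.add local.get"]))  # -> ['math']
```

The practical point the entry makes survives even in this toy: the classifier needs no function names, which is what lets it label optimized and minified modules.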
A popular way for creating WoT applications is Trigger-Action Programming (TAP), which allows users to orchestrate the capabilities of IoT devices in the form of “if trigger, then action”. However, existing TAP approaches don’t support scenario-centric WoT applications which involve abstract modeling of physical environments and complex spatio-temporal dependencies between events and actions. In this paper, we propose an approach called SCTAP which supports Scenario-Centric Trigger-Action Programming based on software-defined physical environments. SCTAP defines a structured and conceptual representation for physical environments, which provides the required programming abstractions for WoT applications. Based on the representation, SCTAP defines a grammar for specifying scenario-centric WoT applications with spatio-temporal dependencies. Furthermore, we design a service-based architecture for SCTAP which supports the integration of device access, event perception, environment representation, and rule execution in a loosely-coupled and extensible way. We implement SCTAP as a WoT infrastructure and evaluate it with two case studies including a smart laboratory and a smart coffee house. The results confirm the usability, feasibility and efficiency of SCTAP and its implementation.|随着物联网(IoT)的蓬勃发展,我们生活的物理世界正在加速数字化进程。遵循这一趋势,万维物联网(Web of Things,WoT)进一步支持使用标准 Web 技术快速高效地创建各种能够感知物理世界并对其施加作用的应用程序。创建 WoT 应用程序的一种流行方式是触发-动作编程(Trigger-Action Programming,TAP),它允许用户以“如果触发,那么动作”的形式编排物联网设备的功能。然而,现有的 TAP 方法不支持以场景为中心的 WoT 应用程序,这些应用程序涉及物理环境的抽象建模以及事件和动作之间复杂的时空依赖关系。在本文中,我们提出了一种支持基于软件定义的物理环境的以场景为中心的触发-动作编程的 SCTAP 方法。SCTAP 定义了物理环境的结构化和概念化表示,为 WoT 应用程序提供了所需的编程抽象。基于这种表示,SCTAP 定义了一种语法,用于指定具有时空依赖性的以场景为中心的 WoT 应用程序。此外,我们还为 SCTAP 设计了一个基于服务的体系结构,该体系结构以松耦合和可扩展的方式支持设备访问、事件感知、环境表示和规则执行的集成。我们将 SCTAP 作为一个 WoT 基础设施来实施,并通过两个案例研究对其进行评估,其中包括一个智能实验室和一个智能咖啡屋。实验结果验证了 SCTAP 及其实现的可用性、可行性和有效性。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=SCTAP:+Supporting+Scenario-Centric+Trigger-Action+Programming+based+on+Software-Defined+Physical+Environments)|0| +|[DeeProphet: Improving HTTP Adaptive Streaming for Low Latency Live Video by Meticulous Bandwidth Prediction](https://doi.org/10.1145/3543507.3583364)|Kefan Chen, Bo Wang, Wufan Wang, Xiaoyu Li, Fengyuan Ren|Tsinghua University, China; Beijing Institute of Technology, China; Tsinghua University, China and Zhongguancun Laboratory, China|The performance of HTTP adaptive streaming (HAS) depends heavily on the prediction of end-to-end network bandwidth. The increasingly popular low latency live streaming (LLLS) faces greater challenges since it requires accurate, short-term bandwidth prediction, compared with VOD streaming which needs long-term bandwidth prediction and has good tolerance against prediction error. Part of the challenges comes from the fact that short-term bandwidth experiences both large abrupt changes and uncertain fluctuations.
Experiment results show that DeeProphet improves the overall QoE by 17.7%-359.2% compared with state-of-the-art LLLS ABR algorithms, and reduces the median bandwidth prediction error to 2.7%.|HTTP 自适应流(HAS)的性能在很大程度上取决于端到端网络带宽的预测。相对于需要长期带宽预测且对预测误差有较好承受能力的 VOD 流,低延迟直播流(LLLS)由于需要准确、短期的带宽预测而面临着更大的挑战。部分挑战来自短期带宽经历巨大的突变和不确定的波动这一事实。此外,在 LLLS 中,由于组间和组内发送空闲,很难获得有效的带宽测量样本。在这项工作中,我们提出了 DeeProphet,一个在 LLLS 中准确的带宽预测系统,以改善 HAS 的性能。DeeProphet 通过使用细粒度的 TCP 状态信息收集有效的测量样本来识别数据包的爆发间隔,并结合时间序列模型和基于学习的模型来预测大的变化和不确定的波动,克服了上述挑战。实验结果表明,与最先进的 LLLS ABR 算法相比,DeeProphet 算法的总体 QoE 提高了17.7% -359.2% ,中值带宽预测误差降低到2.7% 。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=DeeProphet:+Improving+HTTP+Adaptive+Streaming+for+Low+Latency+Live+Video+by+Meticulous+Bandwidth+Prediction)|0| +|[Is IPFS Ready for Decentralized Video Streaming?](https://doi.org/10.1145/3543507.3583404)|Zhengyu Wu, ChengHao Ryan Yang, Santiago Vargas, Aruna Balasubramanian|Computer Science, Stony Brook University, USA; Computer Science, Northeastern University, USA|InterPlanetary File System (IPFS) is a peer-to-peer protocol for decentralized content storage and retrieval. The IPFS platform has the potential to help users evade censorship and avoid a central point of failure. IPFS is seeing increasing adoption for distributing various kinds of files, including video. However, the performance of video streaming on IPFS has not been well-studied. We conduct a measurement study with over 28,000 videos hosted on the IPFS network and find that video streaming experiences high stall rates due to relatively high Round Trip Times (RTT). Further, videos are encoded using a single static quality, because of which streaming cannot adapt to different network conditions. A natural approach is to use adaptive bitrate (ABR) algorithms for streaming, which encode videos in multiple qualities and streams according to the throughput available. However, traditional ABR algorithms perform poorly on IPFS because the throughput cannot be estimated correctly. The main problem is that video segments can be retrieved from multiple sources, making it difficult to estimate the throughput. To overcome this issue, we have designed Telescope, an IPFS-aware ABR system. We conduct experiments on the IPFS network, where IPFS video providers are geographically distributed across the globe. Our results show that Telescope significantly improves the Quality of Experience (QoE) of videos, for a diverse set of network and cache conditions, compared to traditional ABR.|行星文件系统(IPFS)是一种用于分散内容存储和检索的对等协议。IPFS 平台具有帮助用户规避审查和避免中心故障点的潜力。IPFS 越来越多地被用于发布各种文件,包括视频。然而,IPFS 上视频流的性能还没有得到很好的研究。我们对 IPFS 网络上的28,000多个视频进行了测量研究,发现由于往返时间(RTT)相对较高,视频流经历了较高的失速率。而且,视频是使用单一静态质量进行编码的,因为流不能适应不同的网络条件。一种自然的方法是使用自适应比特率(ABR)算法进行视频流编码,它根据可用的吞吐量以多种质量和流的形式对视频进行编码。然而,传统的 ABR 算法在 IPFS 上表现不佳,因为不能正确估计吞吐量。主要问题是视频片段可以从多个源检索,这使得估计吞吐量变得困难。为了克服这个问题,我们设计了望远镜,一个 IPFS 感知 ABR 系统。我们在 IPFS 网络上进行实验,IPFS 视频提供商分布在全球各地。我们的研究结果表明,与传统的 ABR 相比,Telescope 在不同的网络和缓存条件下显著提高了视频的体验质量(QoE)。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Is+IPFS+Ready+for+Decentralized+Video+Streaming?)|0| +|[SISSI: An Architecture for Semantic Interoperable Self-Sovereign Identity-based Access Control on the Web](https://doi.org/10.1145/3543507.3583409)|Christoph H.J. Braun, Vasil Papanchev, Tobias Käfer|Karlsruhe Institute of Technology, Germany|We present an architecture for authentication and authorization on the Web that is based on the Self-Sovereign Identity paradigm. 
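The DeeProphet entry above pairs a time-series model for smooth fluctuation with a learning-based model for large abrupt changes. The following sketch conveys only that two-part split, with an EWMA standing in for the time-series model and a simple divergence heuristic standing in for the learned change detector; the thresholds and structure are invented placeholders, not the paper's design:

```python
# Sketch of the two-part prediction idea in the DeeProphet entry above:
# a smooth time-series estimate handles ordinary fluctuation, and a
# change heuristic overrides it after a large abrupt shift. Placeholder
# values throughout; not the paper's models or thresholds.
def predict_bandwidth(samples, alpha=0.3, jump_ratio=2.0):
    """samples: recent per-burst throughput measurements (Mbps)."""
    history, last = samples[:-1], samples[-1]
    ewma = history[0]
    for s in history[1:]:
        ewma = alpha * s + (1 - alpha) * ewma  # slow-moving baseline
    # Abrupt-change heuristic: trust the newest sample when it diverges a lot.
    if last > jump_ratio * ewma or last < ewma / jump_ratio:
        return last
    return ewma

print(predict_bandwidth([10.2, 9.8, 10.5, 31.0]))  # jump detected -> 31.0
```

The entry's other contribution, filtering measurements down to packet-bursting intervals via TCP state, addresses where trustworthy `samples` come from in the first place.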
Using our architecture, we aim to achieve semantic interoperability across different approaches to SSI. We build on the underlying RDF data model of the W3C’s recommendation for Verifiable Credentials and specify semantic access control rules using SHACL. Our communication protocol for an authorization process is based on Decentralised Identifiers and extends the Hyperledger Aries Present Proof protocol. We propose a modular architecture that allows for flexible extension, e.g., for supporting more signature schemes or Decentralised Identifier Methods. For evaluation, we implemented a Proof-of-Concept: We show that a Web-based approach to SSI outperforms a blockchain-based approach to SSI in terms of End-to-End execution time.|我们提出了一种基于自主身份(SSI)范式的 Web 身份验证和授权体系结构。利用我们的架构,我们的目标是在不同的 SSI 方法之间实现语义互操作性。我们基于 W3C 推荐的可验证凭证的底层 RDF 数据模型,并使用 SHACL 指定语义访问控制规则。我们的授权过程的通信协议基于去中心化标识符,并扩展了 Hyperledger Aries Present Proof 协议。我们提出了一个模块化的体系结构,允许灵活的扩展,例如,支持更多的签名方案或去中心化标识符方法。对于评估,我们实现了一个概念验证: 我们展示了基于 Web 的 SSI 方法在端到端执行时间方面优于基于区块链的 SSI 方法。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=SISSI:+An+Architecture+for+Semantic+Interoperable+Self-Sovereign+Identity-based+Access+Control+on+the+Web)|0| +|[Detecting Socially Abnormal Highway Driving Behaviors via Recurrent Graph Attention Networks](https://doi.org/10.1145/3543507.3583452)|Yue Hu, Yuhang Zhang, Yanbing Wang, Daniel B. Work|Vanderbilt University, USA|With the rapid development of Internet of Things technologies, the next generation traffic monitoring infrastructures are connected via the web, to aid traffic data collection and intelligent traffic management. One of the most important tasks in traffic is anomaly detection, since abnormal drivers can reduce traffic efficiency and cause safety issues. This work focuses on detecting abnormal driving behaviors from trajectories produced by highway video surveillance systems. Most of the current abnormal driving behavior detection methods focus on a limited category of abnormal behaviors that deal with a single vehicle without considering vehicular interactions. In this work, we consider the problem of detecting a variety of socially abnormal driving behaviors, i.e., behaviors that do not conform to the behavior of other nearby drivers. This task is complicated by the variety of vehicular interactions and the spatial-temporal varying nature of highway traffic. To solve this problem, we propose an autoencoder with a Recurrent Graph Attention Network that can capture the highway driving behaviors contextualized on the surrounding cars, and detect anomalies that deviate from learned patterns. Our model is scalable to large freeways with thousands of cars. Experiments on data generated from traffic simulation software show that our model is the only one that can spot the exact vehicle conducting socially abnormal behaviors, among the state-of-the-art anomaly detection models.
We further show the performance on real world HighD traffic dataset, where our model detects vehicles that violate the local driving norms.|随着物联网技术的飞速发展,下一代的交通监控基础设施通过网络连接起来,有助于交通数据的采集和智能交通管理。交通中最重要的任务之一就是异常检测,因为不正常的司机会降低交通效率并引起安全问题。本文主要研究如何从高速公路视频监控系统产生的轨迹中检测出不正常的驾驶行为。目前的异常驾驶行为检测方法大多集中在有限的一类异常驾驶行为上,这类异常驾驶行为只检测一辆车,而不考虑车辆之间的相互作用。在这项工作中,我们考虑的问题,检测各种社会异常驾驶行为,即行为不符合其他附近的驾驶员的行为。由于车辆相互作用的多样性以及高速公路交通的时空变化特性,这项任务变得更加复杂。为了解决这一问题,我们提出了一种基于循环图形注意网络的自动编码器,它可以捕获与周围车辆相关的高速公路驾驶行为,并检测偏离学习模式的异常。我们的模型可扩展到拥有数千辆汽车的大型高速公路。对交通模拟软体数据的实验表明,在最先进的异常检测模型中,我们的模型是唯一能够准确识别出行为异常的车辆的模型。我们进一步显示在真实世界的高速交通数据集,其中我们的模型检测违反当地驾驶规范的车辆的性能。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Detecting+Socially+Abnormal+Highway+Driving+Behaviors+via+Recurrent+Graph+Attention+Networks)|0| +|[GROUP: An End-to-end Multi-step-ahead Workload Prediction Approach Focusing on Workload Group Behavior](https://doi.org/10.1145/3543507.3583460)|Binbin Feng, Zhijun Ding|Tongji University, China|Accurately forecasting workloads can enable web service providers to achieve proactive runtime management for applications and ensure service quality and cost efficiency. For cloud-native applications, multiple containers collaborate to handle user requests, making each container’s workload changes influenced by workload group behavior. However, existing approaches mainly analyze the individual changes of each container and do not explicitly model the workload group evolution of containers, resulting in sub-optimal results. Therefore, we propose a workload prediction method, GROUP, which implements the shifts of workload prediction focus from individual to group, workload group behavior representation from data similarity to data correlation, and workload group behavior evolution from implicit modeling to explicit modeling. First, we model the workload group behavior and its evolution from multiple perspectives. Second, we propose a container correlation calculation algorithm that considers static and dynamic container information to represent the workload group behavior. Third, we propose an end-to-end multi-step-ahead prediction method that explicitly portrays the complex relationship between the evolution of workload group behavior and the workload changes of each container. 
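The highway-driving entry above flags a trajectory as anomalous when its reconstruction error under the learned autoencoder deviates from normal patterns. A bare-bones stand-in conveys the thresholding idea, with a linear autoencoder (PCA reconstruction) replacing the recurrent graph attention model and synthetic features replacing real trajectories:

```python
# Bare-bones reconstruction-error anomaly detector, as a stand-in for the
# recurrent-graph-attention autoencoder in the highway-driving entry above.
# A linear autoencoder (PCA) and synthetic data replace the real model and
# trajectories; the threshold rule is the transferable idea.
import numpy as np

rng = np.random.default_rng(0)
normal = rng.normal(size=(500, 8))   # e.g., flattened trajectory features
mean = normal.mean(axis=0)
_, _, vt = np.linalg.svd(normal - mean, full_matrices=False)
basis = vt[:3]                       # top-3 principal directions as the "encoder"

def recon_error(x):
    z = (x - mean) @ basis.T                                  # encode
    return np.linalg.norm((x - mean) - z @ basis, axis=-1)    # decode + compare

threshold = np.percentile(recon_error(normal), 99)  # calibrated on normal data
suspect = rng.normal(size=8) + 6.0   # trajectory deviating from the norm
print(recon_error(suspect[None]) > threshold)  # likely [ True]
```

What the paper's model adds on top of this skeleton is the context: each vehicle is reconstructed relative to its neighbors via graph attention over time, so "abnormal" means socially abnormal rather than merely unusual in isolation.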
Lastly, enough experiments on public datasets show the advantages of GROUP, which provides an effective solution to achieve workload prediction for cloud-native applications.|准确预测工作负载可以使 Web 服务提供商实现应用程序的主动运行时管理,并确保服务质量和成本效率。对于云本地应用程序,多个容器协作处理用户请求,使每个容器的工作负载变化受到工作负载组行为的影响。然而,现有的方法主要分析每个集装箱的个别变化,并没有明确的模型集装箱的工作负载组演化,导致次优结果。为此,提出了一种工作负载预测方法 GROUP,该方法实现了工作负载预测焦点从个体到群体的转移,实现了工作负载群体行为从数据相似性到数据相关性的表示,实现了工作负载群体行为从隐式建模到显式建模的演化。首先,我们从多个角度对工作负载组行为及其演化进行建模。其次,提出了一种考虑静态和动态容器信息来表示工作负载组行为的容器关联计算算法。第三,提出了一种端到端多步提前预测方法,该方法明确描述了工作负载组行为的演化与每个容器的工作负载变化之间的复杂关系。最后,在公共数据集上进行了大量的实验,验证了 GROUP 的优势,为云本地应用程序实现工作负载预测提供了有效的解决方案。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=GROUP:+An+End-to-end+Multi-step-ahead+Workload+Prediction+Approach+Focusing+on+Workload+Group+Behavior)|0| |[FANS: Fast Non-Autoregressive Sequence Generation for Item List Continuation](https://doi.org/10.1145/3543507.3583430)|Qijiong Liu, Jieming Zhu, Jiahao Wu, Tiandeng Wu, Zhenhua Dong, XiaoMing Wu||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=FANS:+Fast+Non-Autoregressive+Sequence+Generation+for+Item+List+Continuation)|0| |[DANCE: Learning A Domain Adaptive Framework for Deep Hashing](https://doi.org/10.1145/3543507.3583445)|Haixin Wang, Jinan Sun, Xiang Wei, Shikun Zhang, Chong Chen, XianSheng Hua, Xiao Luo||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=DANCE:+Learning+A+Domain+Adaptive+Framework+for+Deep+Hashing)|0| |[Differentiable Optimized Product Quantization and Beyond](https://doi.org/10.1145/3543507.3583482)|Zepu Lu, Defu Lian, Jin Zhang, Zaixi Zhang, Chao Feng, Hao Wang, Enhong Chen||||[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Differentiable+Optimized+Product+Quantization+and+Beyond)|0| diff --git a/results.json b/results.json index 8e687dd8..fbd7491b 100644 --- a/results.json +++ b/results.json @@ -130879,9 +130879,28 @@ "Kurt Thomas", "Zakir Durumeric" ], - "paper_abstract": "", + "paper_abstract": "Toxic comments are the top form of hate and harassment experienced online. While many studies have investigated the types of toxic comments posted online, the effects that such content has on people, and the impact of potential defenses, no study has captured the behaviors of the accounts that post toxic comments or how such attacks are operationalized. In this paper, we present a measurement study of 929K accounts that post toxic comments on Reddit over an 18 month period. Combined, these accounts posted over 14 million toxic comments that encompass insults, identity attacks, threats of violence, and sexual harassment. We explore the impact that these accounts have on Reddit, the targeting strategies that abusive accounts adopt, and the distinct patterns that distinguish classes of abusive accounts. 
Our analysis informs the nuanced interventions needed to curb unwanted toxic behaviors online.", "paper_code": "#", - "paper_cite": 0 + "paper_cite": 0, + "authors_detail": [ + { + "name": "Deepak Kumar", + "org": "Stanford University, USA" + }, + { + "name": "Jeff Hancock", + "org": "Stanford University, USA" + }, + { + "name": "Kurt Thomas", + "org": "Google, USA" + }, + { + "name": "Zakir Durumeric", + "org": "Stanford University, USA" + } + ], + "translated": "有害的评论是网络上最常见的仇恨和骚扰。尽管许多研究调查了网上发布的有毒评论的类型、这些内容对人们的影响以及潜在防御的影响,但没有一项研究捕捉到发布有毒评论的账户的行为或者这种攻击是如何操作的。在这篇论文中,我们提出了一个测量研究的929K 帐户后,有毒评论 Reddit 在18个月期间。这些账户总共发布了超过1400万条有毒评论,包括侮辱、身份攻击、暴力威胁和性骚扰。我们探讨了这些账户对 Reddit 的影响,滥用账户采用的定位策略,以及区分滥用账户类别的独特模式。我们的分析为控制网上不良行为提供了细致入微的干预措施。" }, { "paper_name": "Online Reviews Are Leading Indicators of Changes in K-12 School Attributes", @@ -130892,9 +130911,28 @@ "Douglas N. Harris", "Nicholas Mattei" ], - "paper_abstract": "", + "paper_abstract": "School rating websites are increasingly used by parents to assess the quality and fit of U.S. K-12 schools for their children. These online reviews often contain detailed descriptions of a school’s strengths and weaknesses, which both reflect and inform perceptions of a school. Existing work on these text reviews has focused on finding words or themes that underlie these perceptions, but has stopped short of using the textual reviews as leading indicators of school performance. In this paper, we investigate to what extent the language used in online reviews of a school is predictive of changes in the attributes of that school, such as its socio-economic makeup and student test scores. Using over 300K reviews of 70K U.S. schools from a popular ratings website, we apply language processing models to predict whether schools will significantly increase or decrease in an attribute of interest over a future time horizon. We find that using the text improves predictive performance significantly over a baseline model that does not include text but only the historical time-series of the indicators themselves, suggesting that the review text carries predictive power. A qualitative analysis of the most predictive terms and phrases used in the text reviews indicates a number of topics that serve as leading indicators, such as diversity, changes in school leadership, a focus on testing, and school safety.", "paper_code": "#", - "paper_cite": 0 + "paper_cite": 0, + "authors_detail": [ + { + "name": "Linsen Li", + "org": "Department of Computer Science, Tulane University, USA" + }, + { + "name": "Aron Culotta", + "org": "Department of Computer Science, Tulane University, USA" + }, + { + "name": "Douglas N. 
Harris", + "org": "Department of Economics, Tulane University, USA" + }, + { + "name": "Nicholas Mattei", + "org": "Department of Computer Science, Tulane University, USA" + } + ], + "translated": "越来越多的家长使用学校评级网站来评估美国 K-12学校的质量和适合他们孩子的程度。这些在线评论通常包含对一所学校优势和劣势的详细描述,这些优势和劣势既反映了对一所学校的看法,也反映了对这所学校的看法。关于这些文本审查的现有工作集中在寻找构成这些观念的词汇或主题,但没有使用文本审查作为学校表现的主要指标。在这篇论文中,我们调查了在网上评论一所学校时使用的语言在多大程度上可以预测该学校的属性变化,例如其社会经济构成和学生考试成绩。我们利用一个受欢迎的评级网站对美国70000所学校的300K 评论,应用语言处理模型来预测学校在未来的时间范围内是否会显著增加或减少兴趣属性。我们发现使用文本比不包括文本但仅包括指标本身的历史时间序列的基线模型显着提高了预测性能,这表明审查文本具有预测能力。对文本评论中使用的最具预测性的术语和短语进行定性分析,可以发现一些主要指标,如多样性、学校领导层的变化、对测试的关注和学校安全。" }, { "paper_name": "Beyond Fine-Tuning: Efficient and Effective Fed-Tuning for Mobile/Web Users", @@ -130919,9 +130957,20 @@ "Alan Romano", "Weihang Wang" ], - "paper_abstract": "", + "paper_abstract": "WebAssembly is a recent web standard built for better performance in web applications. The standard defines a binary code format to use as a compilation target for a variety of languages, such as C, C++, and Rust. The standard also defines a text representation for readability, although, WebAssembly modules are difficult to interpret by human readers, regardless of their experience level. This makes it difficult to understand and maintain any existing WebAssembly code. As a result, third-party WebAssembly modules need to be implicitly trusted by developers as verifying the functionality themselves may not be feasible. To this end, we construct WASPur, a tool to automatically identify the purposes of WebAssembly functions. To build this tool, we first construct an extensive collection of WebAssembly samples that represent the state of WebAssembly. Second, we analyze the dataset and identify the diverse use cases of the collected WebAssembly modules. We leverage the dataset of WebAssembly modules to construct semantics-aware intermediate representations (IR) of the functions in the modules. We encode the function IR for use in a machine learning classifier, and we find that this classifier can predict the similarity of a given function against known named functions with an accuracy rate of 88.07%. We hope our tool will enable inspection of optimized and minified WebAssembly modules that remove function names and most other semantic identifiers.", "paper_code": "#", - "paper_cite": 0 + "paper_cite": 0, + "authors_detail": [ + { + "name": "Alan Romano", + "org": "University of Southern California, USA" + }, + { + "name": "Weihang Wang", + "org": "University of Southern California, USA" + } + ], + "translated": "WebAssembly 是最近为了在 Web 应用程序中获得更好的性能而建立的 Web 标准。该标准定义了一种二进制代码格式,用作各种语言(如 C、 C + + 和 Rust)的编译目标。该标准还为可读性定义了一个文本表示,尽管 WebAssembly 模块很难被人类读者解释,不管他们的经验水平如何。这使得理解和维护任何现有的 WebAssembly 代码变得非常困难。因此,开发人员需要隐式地信任第三方 WebAssembly 模块,因为验证功能本身可能是不可行的。为此,我们构建了 WASPur,这是一个自动识别 WebAssembly 函数用途的工具。为了构建这个工具,我们首先构建一个广泛的 WebAssembly 示例集合,这些示例代表 WebAssembly 的状态。其次,我们分析数据集并识别收集的 WebAssembly 模块的不同用例。我们利用 WebAssembly 模块的数据集来构造模块中函数的语义感知中间表示(IR)。将函数 IR 编码后用于机器学习分类器,结果表明,该分类器可以预测给定函数与已知命名函数的相似度,准确率为88.07% 。我们希望我们的工具能够检查优化和缩小的 WebAssembly 模块,这些模块删除了函数名和大多数其他语义标识符。" }, { "paper_name": "SCTAP: Supporting Scenario-Centric Trigger-Action Programming based on Software-Defined Physical Environments", @@ -130932,9 +130981,28 @@ "Xin Peng", "Ziming Wang" ], - "paper_abstract": "", + "paper_abstract": "The physical world we live in is accelerating digitalization with the vigorous development of Internet of Things (IoT). 
Following this trend, Web of Things (WoT) further enables fast and efficient creation of various applications that perceive and act on the physical world using standard Web technologies. A popular way for creating WoT applications is Trigger-Action Programming (TAP), which allows users to orchestrate the capabilities of IoT devices in the form of “if trigger, then action”. However, existing TAP approaches don’t support scenario-centric WoT applications which involve abstract modeling of physical environments and complex spatio-temporal dependencies between events and actions. In this paper, we propose an approach called SCTAP which supports Scenario-Centric Trigger-Action Programming based on software-defined physical environments. SCTAP defines a structured and conceptual representation for physical environments, which provides the required programming abstractions for WoT applications. Based on the representation, SCTAP defines a grammar for specifying scenario-centric WoT applications with spatio-temporal dependencies. Furthermore, we design a service-based architecture for SCTAP which supports the integration of device access, event perception, environment representation, and rule execution in a loosely-coupled and extensible way. We implement SCTAP as a WoT infrastructure and evaluate it with two case studies including a smart laboratory and a smart coffee house. The results confirm the usability, feasibility and efficiency of SCTAP and its implementation.", "paper_code": "#", - "paper_cite": 0 + "paper_cite": 0, + "authors_detail": [ + { + "name": "Bingkun Sun", + "org": "School of Computer Science and Shanghai Key Laboratory of Data Science, Fudan University, China" + }, + { + "name": "Liwei Shen", + "org": "School of Computer Science and Shanghai Key Laboratory of Data Science, Fudan University, China" + }, + { + "name": "Xin Peng", + "org": "School of Computer Science and Shanghai Key Laboratory of Data Science, Fudan University, China" + }, + { + "name": "Ziming Wang", + "org": "School of Computer Science and Shanghai Key Laboratory of Data Science, Fudan University, China" + } + ], + "translated": "随着物联网(IoT)的蓬勃发展,我们生活的物理世界正在加速数字化。遵循这一趋势,万维物联网(Web of Things,WoT)进一步支持使用标准 Web 技术快速高效地创建各种感知并作用于物理世界的应用程序。创建 WoT 应用程序的一种流行方式是触发-动作编程(Trigger-Action Programming,TAP),它允许用户以“如果触发,那么动作”的形式编排物联网设备的能力。然而,现有的 TAP 方法不支持以场景为中心的 WoT 应用程序,这类应用程序涉及物理环境的抽象建模以及事件和动作之间复杂的时空依赖关系。在本文中,我们提出了一种称为 SCTAP 的方法,基于软件定义的物理环境支持以场景为中心的触发-动作编程。SCTAP 为物理环境定义了结构化的概念表示,为 WoT 应用程序提供所需的编程抽象。基于这种表示,SCTAP 定义了一种语法,用于描述具有时空依赖关系的以场景为中心的 WoT 应用程序。此外,我们还为 SCTAP 设计了一个基于服务的体系结构,以松耦合、可扩展的方式支持设备访问、事件感知、环境表示和规则执行的集成。我们将 SCTAP 实现为一个 WoT 基础设施,并通过智能实验室和智能咖啡屋两个案例研究对其进行评估。结果验证了 SCTAP 及其实现的可用性、可行性和效率。" }, { "paper_name": "DeeProphet: Improving HTTP Adaptive Streaming for Low Latency Live Video by Meticulous Bandwidth Prediction", @@ -130946,9 +131014,32 @@ "Xiaoyu Li", "Fengyuan Ren" ], - "paper_abstract": "", + "paper_abstract": "The performance of HTTP adaptive streaming (HAS) depends heavily on the prediction of end-to-end network bandwidth. The increasingly popular low latency live streaming (LLLS) faces greater challenges since it requires accurate, short-term bandwidth prediction, compared with VOD streaming which needs long-term bandwidth prediction and has good tolerance against prediction error. Part of the challenges comes from the fact that short-term bandwidth experiences both large abrupt changes and uncertain fluctuations. 
Additionally, it is hard to obtain valid bandwidth measurement samples in LLLS due to its inter-chunk and intra-chunk sending idleness. In this work, we present DeeProphet, a system for accurate bandwidth prediction in LLLS to improve the performance of HAS. DeeProphet overcomes the above challenges by collecting valid measurement samples using fine-grained TCP state information to identify the packet bursting intervals, and by combining the time series model and learning-based model to predict both large change and uncertain fluctuations. Experiment results show that DeeProphet improves the overall QoE by 17.7%-359.2% compared with state-of-the-art LLLS ABR algorithms, and reduces the median bandwidth prediction error to 2.7%.", "paper_code": "#", - "paper_cite": 0 + "paper_cite": 0, + "authors_detail": [ + { + "name": "Kefan Chen", + "org": "Tsinghua University, China" + }, + { + "name": "Bo Wang", + "org": "Tsinghua University, China and Zhongguancun Laboratory, China" + }, + { + "name": "Wufang Wang", + "org": "Beijing Institute of Technology, China" + }, + { + "name": "Xiaoyu Li", + "org": "Tsinghua University, China" + }, + { + "name": "Fengyuan Ren", + "org": "Tsinghua University, China" + } + ], + "translated": "HTTP 自适应流(HAS)的性能在很大程度上取决于端到端网络带宽的预测。相对于需要长期带宽预测且对预测误差有较好容忍度的 VOD 流,日益流行的低延迟直播流(LLLS)需要准确的短期带宽预测,因而面临更大的挑战。部分挑战源于短期带宽既会发生大幅突变,又存在不确定的波动。此外,由于块间和块内的发送空闲,在 LLLS 中很难获得有效的带宽测量样本。在这项工作中,我们提出了 DeeProphet,一个面向 LLLS 的精确带宽预测系统,用于提升 HAS 的性能。DeeProphet 利用细粒度的 TCP 状态信息识别数据包突发区间以收集有效的测量样本,并结合时间序列模型和基于学习的模型来同时预测大幅变化和不确定波动,从而克服了上述挑战。实验结果表明,与最先进的 LLLS ABR 算法相比,DeeProphet 将总体 QoE 提高了17.7%-359.2%,并将带宽预测误差的中位数降低到2.7%。" }, { "paper_name": "Is IPFS Ready for Decentralized Video Streaming?", @@ -130959,9 +131050,28 @@ "Santiago Vargas", "Aruna Balasubramanian" ], - "paper_abstract": "", + "paper_abstract": "InterPlanetary File System (IPFS) is a peer-to-peer protocol for decentralized content storage and retrieval. The IPFS platform has the potential to help users evade censorship and avoid a central point of failure. IPFS is seeing increasing adoption for distributing various kinds of files, including video. However, the performance of video streaming on IPFS has not been well-studied. We conduct a measurement study with over 28,000 videos hosted on the IPFS network and find that video streaming experiences high stall rates due to relatively high Round Trip Times (RTT). Further, videos are encoded using a single static quality, because of which streaming cannot adapt to different network conditions. A natural approach is to use adaptive bitrate (ABR) algorithms for streaming, which encode videos in multiple qualities and streams according to the throughput available. However, traditional ABR algorithms perform poorly on IPFS because the throughput cannot be estimated correctly. The main problem is that video segments can be retrieved from multiple sources, making it difficult to estimate the throughput. To overcome this issue, we have designed Telescope, an IPFS-aware ABR system. We conduct experiments on the IPFS network, where IPFS video providers are geographically distributed across the globe. 
Our results show that Telescope significantly improves the Quality of Experience (QoE) of videos, for a diverse set of network and cache conditions, compared to traditional ABR.", "paper_code": "#", - "paper_cite": 0 + "paper_cite": 0, + "authors_detail": [ + { + "name": "Zhengyu Wu", + "org": "Computer Science, Stony Brook University, USA" + }, + { + "name": "Cheng Hao Yang", + "org": "Computer Science, Northeastern University, USA" + }, + { + "name": "Santiago Vargas", + "org": "Computer Science, Stony Brook University, USA" + }, + { + "name": "Aruna Balasubramanian", + "org": "Computer Science, Stony Brook University, USA" + } + ], + "translated": "星际文件系统(InterPlanetary File System,IPFS)是一种用于去中心化内容存储和检索的点对点协议。IPFS 平台有望帮助用户规避审查并避免单点故障。IPFS 正被越来越多地用于分发包括视频在内的各类文件。然而,IPFS 上视频流的性能尚未得到充分研究。我们对 IPFS 网络上托管的28,000多个视频进行了测量研究,发现由于往返时间(RTT)相对较高,视频流的卡顿率很高。而且,视频仅以单一固定质量编码,因此流式传输无法适应不同的网络条件。一种自然的方法是使用自适应比特率(ABR)算法,将视频编码为多种质量,并根据可用吞吐量选择相应的码率进行流式传输。然而,传统的 ABR 算法在 IPFS 上表现不佳,因为其吞吐量无法被正确估计。主要问题在于视频片段可以从多个来源检索,这使得吞吐量难以估计。为了克服这个问题,我们设计了 Telescope,一个感知 IPFS 的 ABR 系统。我们在 IPFS 网络上进行实验,其中 IPFS 视频提供者分布在全球各地。我们的结果表明,与传统 ABR 相比,Telescope 在各种网络和缓存条件下均显著提高了视频的体验质量(QoE)。" }, { "paper_name": "SISSI: An Architecture for Semantic Interoperable Self-Sovereign Identity-based Access Control on the Web", @@ -130971,9 +131081,24 @@ "Vasil Papanchev", "Tobias Käfer" ], - "paper_abstract": "", + "paper_abstract": "We present an architecture for authentication and authorization on the Web that is based on the Self-Sovereign Identity paradigm. Using our architecture, we aim to achieve semantic interoperability across different approaches to SSI. We build on the underlying RDF data model of the W3C’s recommendation for Verifiable Credentials and specify semantic access control rules using SHACL. Our communication protocol for an authorization process is based on Decentralised Identifiers and extends the Hyperledger Aries Present Proof protocol. We propose a modular architecture that allows for flexible extension, e. g., for supporting more signature schemes or Decentralised Identifier Methods. For evaluation, we implemented a Proof-of-Concept: We show that a Web-based approach to SSI outperfoms a blockchain-based approach to SSI in terms of End-to-End execution time.", "paper_code": "#", - "paper_cite": 0 + "paper_cite": 0, + "authors_detail": [ + { + "name": "Christoph H.-J. Braun", + "org": "Karlsruhe Institute of Technology, Germany" + }, + { + "name": "Vasil Papanchev", + "org": "Karlsruhe Institute of Technology, Germany" + }, + { + "name": "Tobias Käfer", + "org": "Karlsruhe Institute of Technology, Germany" + } + ], + "translated": "我们提出了一种基于自主主权身份(Self-Sovereign Identity,SSI)范式的 Web 身份验证和授权体系结构。利用该架构,我们的目标是在不同的 SSI 方案之间实现语义互操作性。我们基于 W3C 可验证凭证(Verifiable Credentials)推荐标准的底层 RDF 数据模型,并使用 SHACL 描述语义访问控制规则。我们用于授权过程的通信协议基于去中心化标识符(Decentralised Identifiers),并扩展了 Hyperledger Aries 的 Present Proof 协议。我们提出了一个模块化的体系结构,允许灵活扩展,例如支持更多的签名方案或去中心化标识符方法。为了评估,我们实现了一个概念验证:结果表明,在端到端执行时间方面,基于 Web 的 SSI 方法优于基于区块链的 SSI 方法。" }, { "paper_name": "To Store or Not? Online Data Selection for Federated Learning with Limited Storage", @@ -130986,9 +131111,36 @@ "Bingshuai Li", "Guihai Chen" ], - "paper_abstract": "", + "paper_abstract": "Machine learning models have been deployed in mobile networks to deal with massive data from different layers to enable automated network management and intelligence on devices. To overcome high communication cost and severe privacy concerns of centralized machine learning, federated learning (FL) has been proposed to achieve distributed machine learning among networked devices. 
While the computation and communication limitation has been widely studied, the impact of on-device storage on the performance of FL is still not explored. Without an effective data selection policy to filter the massive streaming data on devices, classical FL can suffer from much longer model training time ($4\\times$) and significant inference accuracy reduction ($7\\%$), observed in our experiments. In this work, we take the first step to consider the online data selection for FL with limited on-device storage. We first define a new data valuation metric for data evaluation and selection in FL with theoretical guarantees for speeding up model convergence and enhancing final model accuracy, simultaneously. We further design {\\ttfamily ODE}, a framework of \\textbf{O}nline \\textbf{D}ata s\\textbf{E}lection for FL, to coordinate networked devices to store valuable data samples. Experimental results on one industrial dataset and three public datasets show the remarkable advantages of {\\ttfamily ODE} over the state-of-the-art approaches. Particularly, on the industrial dataset, {\\ttfamily ODE} achieves as high as $2.5\\times$ speedup of training time and $6\\%$ increase in inference accuracy, and is robust to various factors in practical environments.", "paper_code": "#", - "paper_cite": 1 + "paper_cite": 1, + "authors_detail": [ + { + "name": "Chen Gong", + "org": "Department of Computer Science and Engineering, Shanghai Jiao Tong University, China" + }, + { + "name": "Zhenzhe Zheng", + "org": "Department of Computer Science and Engineering, Shanghai Jiao Tong University, China" + }, + { + "name": "Yunfeng Shao", + "org": "" + }, + { + "name": "Bingshuai Li", + "org": "" + }, + { + "name": "Fan Wu", + "org": "" + }, + { + "name": "Guihai Chen", + "org": "Department of Computer Science and Engineering, Shanghai Jiao Tong University, China" + } + ], + "translated": "机器学习模型已经部署在移动网络中,用于处理来自不同层次的海量数据,以实现自动化网络管理和设备端智能。为了克服集中式机器学习的高通信成本和严重的隐私问题,联邦学习(FL)被提出,以在联网设备之间实现分布式机器学习。虽然计算和通信方面的限制已被广泛研究,但设备端存储对 FL 性能的影响尚未被探讨。在我们的实验中观察到,如果没有有效的数据选择策略来过滤设备上的海量流式数据,经典 FL 的模型训练时间会显著延长(4倍),推理精度也会明显下降(7%)。在这项工作中,我们首次考虑设备存储受限情况下 FL 的在线数据选择问题。我们首先定义了一种新的数据价值度量,用于 FL 中的数据评估和选择,并在理论上保证其能够同时加快模型收敛并提高最终模型精度。我们进一步设计了 ODE(Online Data sElection for FL)框架,用于协调联网设备存储有价值的数据样本。在一个工业数据集和三个公共数据集上的实验结果表明,ODE 相比现有最先进的方法具有显著优势。特别是在工业数据集上,ODE 实现了高达2.5倍的训练时间加速和6%的推理精度提升,并且对实际环境中的各种因素具有鲁棒性。" }, { "paper_name": "Detecting Socially Abnormal Highway Driving Behaviors via Recurrent Graph Attention Networks", @@ -130999,9 +131151,28 @@ "Yanbing Wang", "Daniel B. Work" ], - "paper_abstract": "", + "paper_abstract": "With the rapid development of Internet of Things technologies, the next generation traffic monitoring infrastructures are connected via the web, to aid traffic data collection and intelligent traffic management. One of the most important tasks in traffic is anomaly detection, since abnormal drivers can reduce traffic efficiency and cause safety issues. This work focuses on detecting abnormal driving behaviors from trajectories produced by highway video surveillance systems. Most of the current abnormal driving behavior detection methods focus on a limited category of abnormal behaviors that deal with a single vehicle without considering vehicular interactions. In this work, we consider the problem of detecting a variety of socially abnormal driving behaviors, i.e., behaviors that do not conform to the behavior of other nearby drivers. 
This task is complicated by the variety of vehicular interactions and the spatial-temporal varying nature of highway traffic. To solve this problem, we propose an autoencoder with a Recurrent Graph Attention Network that can capture the highway driving behaviors contextualized on the surrounding cars, and detect anomalies that deviate from learned patterns. Our model is scalable to large freeways with thousands of cars. Experiments on data generated from traffic simulation software show that our model is the only one that can spot the exact vehicle conducting socially abnormal behaviors, among the state-of-the-art anomaly detection models. We further show the performance on real world HighD traffic dataset, where our model detects vehicles that violate the local driving norms.", "paper_code": "#", - "paper_cite": 0 + "paper_cite": 0, + "authors_detail": [ + { + "name": "Yue Hu", + "org": "Vanderbilt University, USA" + }, + { + "name": "Yuhang Zhang", + "org": "Vanderbilt University, USA" + }, + { + "name": "Yanbing Wang", + "org": "Vanderbilt University, USA" + }, + { + "name": "Daniel Work", + "org": "Vanderbilt University, USA" + } + ], + "translated": "随着物联网技术的飞速发展,下一代交通监控基础设施通过 Web 连接起来,以辅助交通数据采集和智能交通管理。交通领域最重要的任务之一是异常检测,因为异常驾驶员会降低交通效率并引发安全问题。本文主要研究如何从高速公路视频监控系统产生的轨迹中检测异常驾驶行为。目前的异常驾驶行为检测方法大多只针对有限类别的异常行为,仅考虑单辆车而不考虑车辆之间的相互作用。在这项工作中,我们研究检测各类社会性异常驾驶行为的问题,即与周围其他驾驶员行为不一致的行为。车辆相互作用的多样性以及高速公路交通的时空变化特性使这项任务更加复杂。为了解决这一问题,我们提出了一种带有循环图注意力网络(Recurrent Graph Attention Network)的自编码器,它可以结合周围车辆的上下文来捕获高速公路驾驶行为,并检测偏离已学习模式的异常。我们的模型可扩展到拥有数千辆汽车的大型高速公路。在交通仿真软件生成的数据上的实验表明,在各种最先进的异常检测模型中,我们的模型是唯一能够准确定位实施社会性异常行为车辆的模型。我们进一步在真实世界的 HighD 交通数据集上展示了模型性能,其中我们的模型检测出了违反当地驾驶规范的车辆。" }, { "paper_name": "GROUP: An End-to-end Multi-step-ahead Workload Prediction Approach Focusing on Workload Group Behavior", @@ -131010,9 +131181,20 @@ "Binbin Feng", "Zhijun Ding" ], - "paper_abstract": "", + "paper_abstract": "Accurately forecasting workloads can enable web service providers to achieve proactive runtime management for applications and ensure service quality and cost efficiency. For cloud-native applications, multiple containers collaborate to handle user requests, making each container’s workload changes influenced by workload group behavior. However, existing approaches mainly analyze the individual changes of each container and do not explicitly model the workload group evolution of containers, resulting in sub-optimal results. Therefore, we propose a workload prediction method, GROUP, which implements the shifts of workload prediction focus from individual to group, workload group behavior representation from data similarity to data correlation, and workload group behavior evolution from implicit modeling to explicit modeling. First, we model the workload group behavior and its evolution from multiple perspectives. Second, we propose a container correlation calculation algorithm that considers static and dynamic container information to represent the workload group behavior. Third, we propose an end-to-end multi-step-ahead prediction method that explicitly portrays the complex relationship between the evolution of workload group behavior and the workload changes of each container. 
Lastly, enough experiments on public datasets show the advantages of GROUP, which provides an effective solution to achieve workload prediction for cloud-native applications.", "paper_code": "#", - "paper_cite": 0 + "paper_cite": 0, + "authors_detail": [ + { + "name": "Binbin Feng", + "org": "Tongji University, China" + }, + { + "name": "Zhijun Ding", + "org": "Tongji University, China" + } + ], + "translated": "准确预测工作负载可以使 Web 服务提供商对应用程序实现主动的运行时管理,并确保服务质量和成本效率。对于云原生应用程序,多个容器协作处理用户请求,使每个容器的工作负载变化受到工作负载组行为的影响。然而,现有方法主要分析每个容器的个体变化,没有显式地对容器的工作负载组演化进行建模,导致结果次优。为此,我们提出了一种工作负载预测方法 GROUP,实现了工作负载预测焦点从个体到群体的转移、工作负载组行为表示从数据相似性到数据相关性的转变,以及工作负载组行为演化从隐式建模到显式建模的转变。首先,我们从多个角度对工作负载组行为及其演化进行建模。其次,我们提出了一种考虑静态和动态容器信息的容器相关性计算算法,用以表示工作负载组行为。第三,我们提出了一种端到端多步提前预测方法,显式刻画工作负载组行为演化与每个容器工作负载变化之间的复杂关系。最后,在公共数据集上的充分实验展示了 GROUP 的优势,为云原生应用程序的工作负载预测提供了有效的解决方案。" }, { "paper_name": "FANS: Fast Non-Autoregressive Sequence Generation for Item List Continuation",