Maximum Entropy Policy for Long-Term Fairness in Interactive Recommender Systems

doi:10.1109/TSC.2024.3349636


	Shi, Xiaoyu 1; Liu, Quanliang 1,2; Xie, Hong 1; Bai, Yanan 3; Shang, Mingsheng 1
	2024-05-01
Abstract	This article considers the problem of maintaining long-term fairness of item exposure in interactive recommender systems in a dynamic setting where user preference and item popularity evolve over time. The challenge is that the evolving dynamics of user preference and item popularity in the feedback loop amplify the long-term "unfairness" of item exposure. To address this challenge, we first formulate a constrained Markov Decision Process (MDP) to capture the evolving dynamics of user preference. The proposed constrained MDP imposes long-term fairness requirements via maximum entropy techniques. Moreover, to mitigate the "unfairness"-amplifying effect caused by the evolving dynamics of item popularity in the feedback loop, we design a debiased reward function that eliminates popularity bias in the training data. As a result, the proposed framework can maintain acceptable recommendation accuracy while exposing items as randomly as possible, ensuring long-term benefits for users. To address the data sparsity issue, the proposed framework can easily integrate self-supervised learning methods to enhance state representation. Experiments on three datasets and an authentic reinforcement learning environment (Virtual-Taobao) demonstrate the effectiveness and superiority of the proposed framework in terms of recommendation accuracy and fairness, and show its robustness against data sparsity and noise.
Keywords	Entropy; Recommender systems; Training; Feedback loop; Training data; Robustness; Real-time systems; Long-term fairness; maximum entropy policy; popularity bias; recommender system; reinforcement learning; web services
DOI	10.1109/TSC.2024.3349636
Journal	IEEE TRANSACTIONS ON SERVICES COMPUTING
ISSN	1939-1374
Volume	17 Issue: 3 Pages: 1029-1043
Corresponding author	Shang, Mingsheng (msshang@cigit.ac.cn)
Indexed in	SCI
WOS accession number	WOS:001248286200003
Language	English

Institutional Repository of the Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences