Post-Training Generative Recommenders with Advantage-Weighted Supervised Finetuning
Generative recommender systems need more than observed user behavior to make accurate recommendations. The A-SFT algorithm improves alignment between pre-trained models and reward models for more effective post-training...










