TSR: Trajectory‑Search Rollouts for Multi‑Turn RL of LLM Agents. Aladin Djuhera, Swanand Ravindra Kadhe, et al. 2026. ICLR 2026. Workshop paper.
When Data is the Algorithm: A Systematic Study and Curation of Preference Optimization Datasets. Aladin Djuhera, Farhan Ahmed, et al. 2025. NeurIPS 2025. Workshop paper.