Large Language Models as Generalizable Policies for Embodied Tasks Paper • 2310.17722 • Published Oct 26, 2023
Sample-Efficient Preference-based Reinforcement Learning with Dynamics Aware Rewards Paper • 2402.17975 • Published Feb 28, 2024
Hindsight PRIORs for Reward Learning from Human Preferences Paper • 2404.08828 • Published Apr 12, 2024
Symbol Guided Hindsight Priors for Reward Learning from Human Preferences Paper • 2210.09151 • Published Oct 17, 2022