Collection of models and datasets for Beyond Binary Rewards: Training LMs to Reason about their Uncertainty
Mehul Damani
mehuldamani
AI & ML interests
Reinforcement Learning, Large Language Models
Recent Activity
Updated a model about 3 hours ago: mehuldamani/sft-base-half-tranches-v1-global-step-394
Published a model about 3 hours ago: mehuldamani/sft-base-half-tranches-v1-global-step-394
Updated a dataset 1 day ago: mehuldamani/judge-new-sft