Soumye Singhal

soumye

AI & ML interests

LLM Post-training

authored 6 papers 7 months ago

Llama-Nemotron: Efficient Reasoning Models

Paper • 2505.00949 • Published May 2 • 42

Effective Backdoor Mitigation in Vision-Language Models Depends on the Pre-training Objective

Paper • 2311.14948 • Published Nov 25, 2023

Adversarial Training of Reward Models

Paper • 2504.06141 • Published Apr 8

Countering Language Drift with Seeded Iterated Learning

Paper • 2003.12694 • Published Mar 28, 2020 • 1

Recall Traces: Backtracking Models for Efficient Reinforcement Learning

Paper • 1804.00379 • Published Apr 2, 2018

Supervised Seeded Iterated Learning for Interactive Language Learning

Paper • 2010.02975 • Published Oct 6, 2020

authored 2 papers 8 months ago

Reward-aware Preference Optimization: A Unified Mathematical Framework for Model Alignment

Paper • 2502.00203 • Published Jan 31 • 1

Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models

Paper • 2504.03624 • Published Apr 4 • 15