Hugging Face

dasom oh

dasomoh
  • OhDasom88

AI & ML interests

None yet

Organizations

  • Dasom

Collections 1

papers
  • PERL: Parameter Efficient Reinforcement Learning from Human Feedback

    Paper • 2403.10704 • Published Mar 15, 2024 • 59
  • RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback

    Paper • 2309.00267 • Published Sep 1, 2023 • 52
  • Absolute Zero: Reinforced Self-play Reasoning with Zero Data

    Paper • 2505.03335 • Published May 6, 2025 • 188
  • Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

    Paper • 2506.01939 • Published Jun 2, 2025 • 187

models 0

None public yet

datasets 0

None public yet