-
Can Large Language Models Understand Context?
Paper • 2402.00858 • Published • 23 -
OLMo: Accelerating the Science of Language Models
Paper • 2402.00838 • Published • 85 -
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 151 -
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
Paper • 2401.17072 • Published • 25
Collections
Discover the best community collections!
Collections including paper arxiv:2507.13348
-
Robust Multimodal Large Language Models Against Modality Conflict
Paper • 2507.07151 • Published • 5 -
One Token to Fool LLM-as-a-Judge
Paper • 2507.08794 • Published • 31 -
Test-Time Scaling with Reflective Generative Model
Paper • 2507.01951 • Published • 107 -
KV Cache Steering for Inducing Reasoning in Small Language Models
Paper • 2507.08799 • Published • 40
-
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning
Paper • 2505.24726 • Published • 276 -
SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis
Paper • 2506.02096 • Published • 52 -
OThink-R1: Intrinsic Fast/Slow Thinking Mode Switching for Over-Reasoning Mitigation
Paper • 2506.02397 • Published • 35 -
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
Paper • 2505.24864 • Published • 142
-
Reasoning with Sampling: Your Base Model is Smarter Than You Think
Paper • 2510.14901 • Published • 47 -
VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning?
Paper • 2505.23359 • Published • 39 -
OThink-R1: Intrinsic Fast/Slow Thinking Mode Switching for Over-Reasoning Mitigation
Paper • 2506.02397 • Published • 35 -
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
Paper • 2505.24864 • Published • 142
-
MiCo: Multi-image Contrast for Reinforcement Visual Reasoning
Paper • 2506.22434 • Published • 10 -
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning
Paper • 2507.13348 • Published • 77 -
RewardDance: Reward Scaling in Visual Generation
Paper • 2509.08826 • Published • 73 -
Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs
Paper • 2510.18876 • Published • 36
-
Pixel Reasoner: Incentivizing Pixel-Space Reasoning with Curiosity-Driven Reinforcement Learning
Paper • 2505.15966 • Published • 53 -
GRIT: Teaching MLLMs to Think with Images
Paper • 2505.15879 • Published • 12 -
Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models
Paper • 2505.16854 • Published • 11 -
VLM-R^3: Region Recognition, Reasoning, and Refinement for Enhanced Multimodal Chain-of-Thought
Paper • 2505.16192 • Published • 12
-
Can Large Language Models Understand Context?
Paper • 2402.00858 • Published • 23 -
OLMo: Accelerating the Science of Language Models
Paper • 2402.00838 • Published • 85 -
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 151 -
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
Paper • 2401.17072 • Published • 25
-
Reasoning with Sampling: Your Base Model is Smarter Than You Think
Paper • 2510.14901 • Published • 47 -
VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning?
Paper • 2505.23359 • Published • 39 -
OThink-R1: Intrinsic Fast/Slow Thinking Mode Switching for Over-Reasoning Mitigation
Paper • 2506.02397 • Published • 35 -
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
Paper • 2505.24864 • Published • 142
-
Robust Multimodal Large Language Models Against Modality Conflict
Paper • 2507.07151 • Published • 5 -
One Token to Fool LLM-as-a-Judge
Paper • 2507.08794 • Published • 31 -
Test-Time Scaling with Reflective Generative Model
Paper • 2507.01951 • Published • 107 -
KV Cache Steering for Inducing Reasoning in Small Language Models
Paper • 2507.08799 • Published • 40
-
MiCo: Multi-image Contrast for Reinforcement Visual Reasoning
Paper • 2506.22434 • Published • 10 -
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning
Paper • 2507.13348 • Published • 77 -
RewardDance: Reward Scaling in Visual Generation
Paper • 2509.08826 • Published • 73 -
Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs
Paper • 2510.18876 • Published • 36
-
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning
Paper • 2505.24726 • Published • 276 -
SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis
Paper • 2506.02096 • Published • 52 -
OThink-R1: Intrinsic Fast/Slow Thinking Mode Switching for Over-Reasoning Mitigation
Paper • 2506.02397 • Published • 35 -
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
Paper • 2505.24864 • Published • 142
-
Pixel Reasoner: Incentivizing Pixel-Space Reasoning with Curiosity-Driven Reinforcement Learning
Paper • 2505.15966 • Published • 53 -
GRIT: Teaching MLLMs to Think with Images
Paper • 2505.15879 • Published • 12 -
Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models
Paper • 2505.16854 • Published • 11 -
VLM-R^3: Region Recognition, Reasoning, and Refinement for Enhanced Multimodal Chain-of-Thought
Paper • 2505.16192 • Published • 12