-
Reasoning Introduces New Poisoning Attacks Yet Makes Them More Complicated
Paper • 2509.05739 • Published • 2 -
Loong: Synthesize Long Chain-of-Thoughts at Scale through Verifiers
Paper • 2509.03059 • Published • 24 -
Universal Deep Research: Bring Your Own Model and Strategy
Paper • 2509.00244 • Published • 13 -
<think> So let's replace this phrase with insult... </think> Lessons learned from generation of toxic texts with LLMs
Paper • 2509.08358 • Published • 13
Collections
Discover the best community collections!
Collections including paper arxiv:2504.21233
-
Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers
Paper • 2504.20752 • Published • 92 -
Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math
Paper • 2504.21233 • Published • 49 -
AF Adapter: Continual Pretraining for Building Chinese Biomedical Language Model
Paper • 2211.11363 • Published • 1 -
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning
Paper • 2405.12130 • Published • 50
-
CoRAG: Collaborative Retrieval-Augmented Generation
Paper • 2504.01883 • Published • 9 -
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning
Paper • 2504.08837 • Published • 43 -
Mavors: Multi-granularity Video Representation for Multimodal Large Language Model
Paper • 2504.10068 • Published • 30 -
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations
Paper • 2504.10481 • Published • 85
-
The Lessons of Developing Process Reward Models in Mathematical Reasoning
Paper • 2501.07301 • Published • 99 -
Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2
Paper • 2502.03544 • Published • 44 -
FoNE: Precise Single-Token Number Embeddings via Fourier Features
Paper • 2502.09741 • Published • 15 -
SoS1: O1 and R1-Like Reasoning LLMs are Sum-of-Square Solvers
Paper • 2502.20545 • Published • 22
-
EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
Paper • 2402.04252 • Published • 29 -
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
Paper • 2402.03749 • Published • 14 -
ScreenAI: A Vision-Language Model for UI and Infographics Understanding
Paper • 2402.04615 • Published • 44 -
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
Paper • 2402.05008 • Published • 23
-
RL + Transformer = A General-Purpose Problem Solver
Paper • 2501.14176 • Published • 28 -
Towards General-Purpose Model-Free Reinforcement Learning
Paper • 2501.16142 • Published • 30 -
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
Paper • 2501.17161 • Published • 123 -
MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization
Paper • 2412.12098 • Published • 4
-
MLLM-as-a-Judge for Image Safety without Human Labeling
Paper • 2501.00192 • Published • 31 -
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
Paper • 2501.00958 • Published • 107 -
Xmodel-2 Technical Report
Paper • 2412.19638 • Published • 26 -
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs
Paper • 2412.18925 • Published • 104
-
Reasoning Introduces New Poisoning Attacks Yet Makes Them More Complicated
Paper • 2509.05739 • Published • 2 -
Loong: Synthesize Long Chain-of-Thoughts at Scale through Verifiers
Paper • 2509.03059 • Published • 24 -
Universal Deep Research: Bring Your Own Model and Strategy
Paper • 2509.00244 • Published • 13 -
<think> So let's replace this phrase with insult... </think> Lessons learned from generation of toxic texts with LLMs
Paper • 2509.08358 • Published • 13
-
Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers
Paper • 2504.20752 • Published • 92 -
Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math
Paper • 2504.21233 • Published • 49 -
AF Adapter: Continual Pretraining for Building Chinese Biomedical Language Model
Paper • 2211.11363 • Published • 1 -
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning
Paper • 2405.12130 • Published • 50
-
CoRAG: Collaborative Retrieval-Augmented Generation
Paper • 2504.01883 • Published • 9 -
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning
Paper • 2504.08837 • Published • 43 -
Mavors: Multi-granularity Video Representation for Multimodal Large Language Model
Paper • 2504.10068 • Published • 30 -
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations
Paper • 2504.10481 • Published • 85
-
RL + Transformer = A General-Purpose Problem Solver
Paper • 2501.14176 • Published • 28 -
Towards General-Purpose Model-Free Reinforcement Learning
Paper • 2501.16142 • Published • 30 -
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
Paper • 2501.17161 • Published • 123 -
MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization
Paper • 2412.12098 • Published • 4
-
The Lessons of Developing Process Reward Models in Mathematical Reasoning
Paper • 2501.07301 • Published • 99 -
Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2
Paper • 2502.03544 • Published • 44 -
FoNE: Precise Single-Token Number Embeddings via Fourier Features
Paper • 2502.09741 • Published • 15 -
SoS1: O1 and R1-Like Reasoning LLMs are Sum-of-Square Solvers
Paper • 2502.20545 • Published • 22
-
MLLM-as-a-Judge for Image Safety without Human Labeling
Paper • 2501.00192 • Published • 31 -
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
Paper • 2501.00958 • Published • 107 -
Xmodel-2 Technical Report
Paper • 2412.19638 • Published • 26 -
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs
Paper • 2412.18925 • Published • 104
-
EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
Paper • 2402.04252 • Published • 29 -
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
Paper • 2402.03749 • Published • 14 -
ScreenAI: A Vision-Language Model for UI and Infographics Understanding
Paper • 2402.04615 • Published • 44 -
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
Paper • 2402.05008 • Published • 23