-
Can Large Language Models Understand Context?
Paper • 2402.00858 • Published • 23 -
OLMo: Accelerating the Science of Language Models
Paper • 2402.00838 • Published • 85 -
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 151 -
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
Paper • 2401.17072 • Published • 25
Collections
Collections including paper arxiv:2410.13720
-
Scaling Laws for Neural Language Models
Paper • 2001.08361 • Published • 9 -
Scaling Laws for Autoregressive Generative Modeling
Paper • 2010.14701 • Published • 1 -
Training Compute-Optimal Large Language Models
Paper • 2203.15556 • Published • 11 -
A Survey on Data Selection for Language Models
Paper • 2402.16827 • Published • 4
-
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
Paper • 2405.08748 • Published • 24 -
Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection
Paper • 2405.10300 • Published • 30 -
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Paper • 2405.09818 • Published • 132 -
OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework
Paper • 2405.11143 • Published • 41
-
Wan-AI/Wan2.1-T2V-1.3B
Text-to-Video • Updated • 13k • 401 -
VBench: Comprehensive Benchmark Suite for Video Generative Models
Paper • 2311.17982 • Published • 9 -
VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models
Paper • 2411.13503 • Published • 34 -
tencent/HunyuanVideo-I2V
Image-to-Video • Updated • 752 • 344
-
Movie Gen: A Cast of Media Foundation Models
Paper • 2410.13720 • Published • 98 -
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching
Paper • 2410.06885 • Published • 46 -
Flow Matching for Generative Modeling
Paper • 2210.02747 • Published • 3 -
Matcha-TTS: A fast TTS architecture with conditional flow matching
Paper • 2309.03199 • Published • 14
-
STaR: Bootstrapping Reasoning With Reasoning
Paper • 2203.14465 • Published • 9 -
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
Paper • 2401.06066 • Published • 58 -
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Paper • 2405.04434 • Published • 24 -
Prompt Cache: Modular Attention Reuse for Low-Latency Inference
Paper • 2311.04934 • Published • 34
-
LIMA: Less Is More for Alignment
Paper • 2305.11206 • Published • 26 -
Garment3DGen: 3D Garment Stylization and Texture Generation
Paper • 2403.18816 • Published • 25 -
EgoLifter: Open-world 3D Segmentation for Egocentric Perception
Paper • 2403.18118 • Published • 12 -
The Unreasonable Ineffectiveness of the Deeper Layers
Paper • 2403.17887 • Published • 82