Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2411.12580

Papers - Reasoning - Visualization - Pearson’s R

Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models

Paper • 2411.12580 • Published Nov 19, 2024 • 2

Papers - Training - Influence Functions - EK-FAC

Studying Large Language Model Generalization with Influence Functions

Paper • 2308.03296 • Published Aug 7, 2023 • 13
Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models

Paper • 2411.12580 • Published Nov 19, 2024 • 2

Papers - Pre-training

Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning

Paper • 2310.20587 • Published Oct 31, 2023 • 18
Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15, 2024 • 109
LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement

Paper • 2403.15042 • Published Mar 22, 2024 • 27
LIMA: Less Is More for Alignment

Paper • 2305.11206 • Published May 18, 2023 • 26

🔍 Interpretability & Analysis of LMs

Outstanding research in LM interpretability and evaluation, summarized

Latent Reasoning in LLMs as a Vocabulary-Space Superposition

Paper • 2510.15522 • Published Oct 17 • 1
Language Models are Injective and Hence Invertible

Paper • 2510.15511 • Published Oct 17 • 69
Eliciting Secret Knowledge from Language Models

Paper • 2510.01070 • Published Oct 1 • 4
Interpreting Language Models Through Concept Descriptions: A Survey

Paper • 2510.01048 • Published Oct 1 • 2

Papers - Training - Scaling - Influence Functions

Studying Large Language Model Generalization with Influence Functions

Paper • 2308.03296 • Published Aug 7, 2023 • 13
Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models

Paper • 2411.12580 • Published Nov 19, 2024 • 2

Papers - Training

SELF: Language-Driven Self-Evolution for Large Language Model

Paper • 2310.00533 • Published Oct 1, 2023 • 2
GrowLength: Accelerating LLMs Pretraining by Progressively Growing Training Length

Paper • 2310.00576 • Published Oct 1, 2023 • 2
A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity

Paper • 2305.13169 • Published May 22, 2023 • 3
Transformers Can Achieve Length Generalization But Not Robustly

Paper • 2402.09371 • Published Feb 14, 2024 • 15

Papers - Reasoning

Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models

Paper • 2402.14848 • Published Feb 19, 2024 • 20
Teaching Large Language Models to Reason with Reinforcement Learning

Paper • 2403.04642 • Published Mar 7, 2024 • 50
How Far Are We from Intelligent Visual Deductive Reasoning?

Paper • 2403.04732 • Published Mar 7, 2024 • 23
Learning to Reason and Memorize with Self-Notes

Paper • 2305.00833 • Published May 1, 2023 • 5

Papers - Reasoning - Visualization - Pearson’s R

Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models

Paper • 2411.12580 • Published Nov 19, 2024 • 2

Papers - Training - Scaling - Influence Functions

Studying Large Language Model Generalization with Influence Functions

Paper • 2308.03296 • Published Aug 7, 2023 • 13
Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models

Paper • 2411.12580 • Published Nov 19, 2024 • 2

Papers - Training - Influence Functions - EK-FAC

Studying Large Language Model Generalization with Influence Functions

Paper • 2308.03296 • Published Aug 7, 2023 • 13
Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models

Paper • 2411.12580 • Published Nov 19, 2024 • 2

Papers - Training

SELF: Language-Driven Self-Evolution for Large Language Model

Paper • 2310.00533 • Published Oct 1, 2023 • 2
GrowLength: Accelerating LLMs Pretraining by Progressively Growing Training Length

Paper • 2310.00576 • Published Oct 1, 2023 • 2
A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity

Paper • 2305.13169 • Published May 22, 2023 • 3
Transformers Can Achieve Length Generalization But Not Robustly

Paper • 2402.09371 • Published Feb 14, 2024 • 15

Papers - Pre-training

Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning

Paper • 2310.20587 • Published Oct 31, 2023 • 18
Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15, 2024 • 109
LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement

Paper • 2403.15042 • Published Mar 22, 2024 • 27
LIMA: Less Is More for Alignment

Paper • 2305.11206 • Published May 18, 2023 • 26

Papers - Reasoning

Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models

Paper • 2402.14848 • Published Feb 19, 2024 • 20
Teaching Large Language Models to Reason with Reinforcement Learning

Paper • 2403.04642 • Published Mar 7, 2024 • 50
How Far Are We from Intelligent Visual Deductive Reasoning?

Paper • 2403.04732 • Published Mar 7, 2024 • 23
Learning to Reason and Memorize with Self-Notes

Paper • 2305.00833 • Published May 1, 2023 • 5

🔍 Interpretability & Analysis of LMs

Outstanding research in LM interpretability and evaluation, summarized

Latent Reasoning in LLMs as a Vocabulary-Space Superposition

Paper • 2510.15522 • Published Oct 17 • 1
Language Models are Injective and Hence Invertible

Paper • 2510.15511 • Published Oct 17 • 69
Eliciting Secret Knowledge from Language Models

Paper • 2510.01070 • Published Oct 1 • 4
Interpreting Language Models Through Concept Descriptions: A Survey

Paper • 2510.01048 • Published Oct 1 • 2

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs