nanochat-d34-finetuned

This repository contains the SFT-only checkpoints of karpathy/nanochat-d34, trained using the nanochat framework.

Model Description

  • Base Model: karpathy/nanochat-d34 (2.2B parameters, pre-trained on 88B tokens)
  • Architecture: GPT-style transformer with depth=34
  • Training Pipeline: Mid-training → SFT → RL (optional)
  • Hardware: 8x A100-80GB GPUs
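
To try the model locally, one option is to pull the checkpoints from the Hub and chat through nanochat's CLI. A minimal sketch, assuming nanochat (github.com/karpathy/nanochat) is installed and reads checkpoints from its default ~/.cache/nanochat base directory, which this repository's layout mirrors:

import os
from huggingface_hub import snapshot_download

# Download tokenizer + checkpoints into nanochat's default base dir so the
# chat scripts can find them; the target path is an assumption.
local_dir = snapshot_download(
    repo_id="pankajmathur/nanochat-d34-sft",
    local_dir=os.path.expanduser("~/.cache/nanochat"),
)
print(f"checkpoints at {local_dir}")

# Then, from the nanochat repo root:
#   python -m scripts.chat_cli   # terminal chat
#   python -m scripts.chat_web   # ChatGPT-style web UI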

Training Details

Base Model (Pre-trained by Karpathy)

  • Parameters: 2,217,082,880
  • Training tokens: 88,683,315,200 (40:1 token-to-parameter ratio)
  • Max sequence length: 2048
  • Base CORE score: 0.3382
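
The token budget matches nanochat's roughly 40 tokens-per-parameter rule exactly, which is easy to verify from the figures above:

params = 2_217_082_880
tokens = 88_683_315_200
assert tokens == 40 * params  # exact 40:1 token-to-parameter ratio
print(tokens / params)        # 40.0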

Dataset

This is the same smol-smoltalk dataset referenced in Karpathy's original nanochat discussion.
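
For reference, the dataset can be pulled with the datasets library. A sketch; the Hub id HuggingFaceTB/smol-smoltalk and the messages column are assumptions based on the SmolTalk naming and chat format:

from datasets import load_dataset

ds = load_dataset("HuggingFaceTB/smol-smoltalk", split="train")
print(ds[0]["messages"])  # expected: list of {"role", "content"} turns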

Eval Reports

Full details in report/report.md

Metric          BASE   MID      SFT      RL
ARC-Challenge   -      0.5367   0.5418   -
ARC-Easy        -      0.6961   0.7210   -
GSM8K           -      0.1137   0.1327   -
HumanEval       -      0.1098   0.1037   -
MMLU            -      0.4229   0.4304   -
ChatCORE        -      0.4045   0.4157   -
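
The MID and SFT columns can be re-generated from the downloaded checkpoints with nanochat's eval entry point. A minimal sketch, assuming the scripts.chat_eval script and the flag conventions from nanochat's speedrun.sh, where -i selects the checkpoint stage:

import subprocess

# Re-run chat evals on 8 GPUs for the mid-trained and SFT checkpoints.
# The torchrun invocation and -i flag mirror nanochat's speedrun.sh;
# treat them as assumptions, not a guaranteed interface.
for stage in ("mid", "sft"):
    subprocess.run(
        ["torchrun", "--standalone", "--nproc_per_node=8",
         "-m", "scripts.chat_eval", "--", "-i", stage],
        check=True,
    )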

Fine-tuning Pipeline

  1. Mid-training: General instruction tuning on SmolTalk, MMLU, GSM8K, Spelling tasks
  2. SFT (Supervised Fine-Tuning): Chat-specific training on ARC, GSM8K, SmolTalk
  3. RL (Reinforcement Learning): Optional GRPO-style training on GSM8K (if included; see the command sketch below)
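
The three stages above map onto nanochat training scripts. A hedged sketch, assuming the script names from the nanochat repo (scripts/mid_train.py, scripts/chat_sft.py, scripts/chat_rl.py) and the 8-GPU torchrun launch used for this model:

import subprocess

def run_stage(module: str) -> None:
    # Launch one pipeline stage across 8 GPUs, as on the 8x A100 node above.
    subprocess.run(
        ["torchrun", "--standalone", "--nproc_per_node=8", "-m", module],
        check=True,
    )

run_stage("scripts.mid_train")  # 1. mid-training
run_stage("scripts.chat_sft")   # 2. supervised fine-tuning
run_stage("scripts.chat_rl")    # 3. optional GRPO-style RL on GSM8K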

WandB Reports

Full Report

Repository Structure

├── tokenizer/
│   ├── tokenizer.pkl          # Tokenizer
│   └── token_bytes.pt         # Token byte mappings
├── mid_checkpoints/d34/       # Mid-training checkpoint
│   ├── model_*.pt
│   └── meta_*.json
├── chatsft_checkpoints/d34/   # SFT checkpoint
│   ├── model_*.pt
│   └── meta_*.json
├── chatsft_checkpoints_int8/d34/   # SFT checkpoint (int8 quantized)
│   ├── model_*.pt
│   └── meta_*.json
├── chatrl_checkpoints/d34/    # RL checkpoint (if available)
│   ├── model_*.pt
│   └── meta_*.json
├── report/                    # Evaluation reports
│   └── report.md
└── logs/                      # Training logs
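
Checkpoints can also be inspected without the nanochat runtime. A minimal sketch, assuming each model_*.pt file holds a flat state dict of tensors alongside its meta_*.json config:

import glob
import json
import torch

ckpt_dir = "chatsft_checkpoints/d34"
model_path = glob.glob(f"{ckpt_dir}/model_*.pt")[0]  # step suffix varies
meta_path = glob.glob(f"{ckpt_dir}/meta_*.json")[0]

with open(meta_path) as f:
    print(json.load(f))  # config such as depth and step (assumed keys)

state_dict = torch.load(model_path, map_location="cpu")
n_params = sum(t.numel() for t in state_dict.values())
print(f"{n_params:,} parameters")  # should be ~2,217,082,880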

License

MIT License (same as nanochat)

Acknowledgments

@misc{nanochat,
  author = {Andrej Karpathy},
  title = {nanochat: The best ChatGPT that $100 can buy},
  year = {2025},
  publisher = {GitHub},
  url = {https://github.com/karpathy/nanochat}
}
  • The nanochat community