Chirag2207 committed
Commit be73b15 (verified) · Parent: 937a539

Update README.md

Files changed (1)
  1. README.md +66 -40
README.md CHANGED
tags:
- coding
- mathematics
- quantization
- 4-bit model
- state-of-the-art
license: apache-2.0
datasets:
- synthetic
 
pipeline_tag: text-generation
---

# Alpie-Core: 4-bit Quantized Reasoning Model

📄 **[Technical Report: Alpie Core.pdf](./Alpie_Core.pdf)**

<p align="center">
<a href="https://169pi.ai/"><img src="https://img.shields.io/badge/🌐%20Website-169Pi%20AI-blue" alt="Website"></a>
 
## 1. Introduction

**Alpie-Core is one of the first fine-tuned 4-bit reasoning models from India, and among the first worldwide.** Fine-tuned on just 8 Hopper GPUs using LoRA/QLoRA and distillation over synthetic STEM-rich datasets, it shows that aggressive quantization can not only match but surpass full-precision baselines.

With a dramatically reduced memory footprint, Alpie-Core delivers competitive, frontier-level reasoning performance, even beating some top proprietary models. It achieves **81.28% on MMLU, 92.75% on GSM8K, and 57.8% on SWE-Bench Verified**, ranking among the top models on competitive leaderboards and demonstrating that efficient models can rival frontier systems while remaining practical for real-world deployment at scale.

![Combined Benchmark](combined_benchmark.png)

## 2. Model Summary
 
 
- **Quantization**: 4-bit NF4 with double quantization
- **Context Length**: 65k tokens
- **Max Output Length**: 16,384 tokens
- **Training Data Sources**: Synthetic (STEM, reasoning, coding) + domain-rich curated data (law, Indian context, exams, multilingual)
- **License**: Apache 2.0
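The quantization bullets above map one-to-one onto a standard bitsandbytes configuration in transformers. A minimal sketch of those settings follows (stock library API, given as an assumption about the setup rather than the team's exact training code):

```python
import torch
from transformers import BitsAndBytesConfig

# 4-bit NF4 with double quantization and FP16 compute, per the summary above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",            # 4-bit NormalFloat quantization
    bnb_4bit_use_double_quant=True,       # quantize the quantization constants as well
    bnb_4bit_compute_dtype=torch.float16, # matmuls run in FP16
)
```
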
## 3. Approach

**Alpie-Core** has undergone extensive **supervised fine-tuning (SFT)** to strengthen reasoning, robustness, and safety. The training leveraged a diverse mixture of curated open-source datasets and proprietary synthetic data, optimized with high-quality LLM-generated responses. The fine-tuning process emphasized adherence to rigorous safety and usability standards, including:

1. **User Understanding and Clarity** – ensuring outputs are direct, interpretable, and pedagogically sound.
2. **Security and Ethical Guidelines** – filtering unsafe or harmful generations during and after training.
3. **Limitations, Disclaimers, and Knowledge Boundaries** – transparently communicating uncertainty and scope.
4. **Handling Complex and Sensitive Topics** – balancing informativeness with responsible guardrails.
5. **Safety and Respectful Engagement** – maintaining politeness, inclusivity, and cultural sensitivity.
6. **Confidentiality and Responsible Use** – preventing leakage of private training data, proprietary prompts, or internal reasoning traces.

This SFT approach enables Alpie-Core to deliver reliable, aligned, and context-aware responses, generalizing across global and Indian contexts while staying within safe and responsible use guidelines.

## 4. Model Features

2. **OpenAI-Compatible API** – Seamless integration with OpenAI client libraries (see the example after this list)
3. **65K Context Length** – Handles very large inputs and conversations
4. **16,384 Max Output Length** – Enables extremely long generations
5. **4-Bit Quantization** – Memory-efficient and optimized for deployment
6. **High Throughput Inference** – Powered by vLLM for efficient large-scale serving
7. **Low Latency Inference** – Fast response times optimized for production
8. **Customizable Safety & Moderation Filters** – Built-in guardrails for safer outputs
9. **Supports Function Calling / Tool Use** – Enables structured outputs and external API integration
10. **Instruction Following** – Optimized for reasoning and stepwise chain-of-thought answers
11. **Education & Research Ready** – Tailored for competitive exams, STEM reasoning, and knowledge-intensive tasks
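Since serving is vLLM-based with an OpenAI-compatible API, clients can reuse the standard OpenAI SDK. A minimal sketch, assuming a locally hosted endpoint (the `base_url`, `api_key`, and served model name are illustrative assumptions, not an official hosted service):

```python
# Assumes an OpenAI-compatible server was started locally, e.g.:
#   vllm serve 169Pi/Alpie-Core
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # illustrative endpoint
response = client.chat.completions.create(
    model="169Pi/Alpie-Core",
    messages=[{"role": "user", "content": "What is 17 * 24? Think step by step."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```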
 
## 5. Key Highlights

1. **First 4-bit Reasoning Model from India**: Competitive globally with frontier models
2. **Benchmark Competitiveness**: Outperforms or matches 70B+ models across reasoning, math, and coding
3. **STEM & Coding Strength**: Excellent on GSM8K, MATH-500, HumanEval, and SWE-Bench Verified
4. **Efficiency & Deployment**: 16 GB VRAM footprint; runs on commodity GPUs with vLLM (see the sketch after this list)
5. **Extended Context Length**: 65K tokens for research papers, conversations, and multi-document reasoning
6. **Environmental Benefits**: ~298–835 kg CO₂e total training footprint, 2–3× more efficient than FP16 training
7. **Open-Source Commitment**: Released under Apache 2.0 for global use
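To illustrate the deployment claim, a minimal offline-inference sketch with vLLM follows. It assumes the repo id resolves to weights vLLM can load directly; a LoRA adapter may first need to be merged into its base model:

```python
from vllm import LLM, SamplingParams

# Offline batch inference; 65K context window per the model card
llm = LLM(model="169Pi/Alpie-Core", max_model_len=65536)
params = SamplingParams(temperature=0.6, max_tokens=512)

outputs = llm.generate(["Prove that the sum of two even integers is even."], params)
print(outputs[0].outputs[0].text)
```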
 
 
 
 
## 6. Benchmark Results

![BBH Benchmark](BBH.png)

| Benchmark | Alpie-Core (32B-4bit) | DeepSeek-V2 (236B) | Qwen2.5 72B | Llama 3.1 405B | Llama 3.1 70B | Gemma-3 27B-PT | Mistral-Small-24B-Base-2501 |
|-----------|----------------------|-------------------|-------------|---------------|---------------|----------------|----------------------------|
| MBPP (pass@1) | **75.20%** | 65.0% | 72.6% | 68.4% | - | 65.6% | 69.64% |
| HumanEval (pass@1) | **57.23%** | 43.3% | 53.0% | 54.9% | - | 48.8% | - |
These results demonstrate Alpie-Core’s ability to rival or surpass leading proprietary and open-source models, despite being 4-bit quantized.

### SWE-Bench Verified Performance
 
 
- **Quantization**: 4-bit NF4 + Double Quantization + FP16 compute
- **Dataset Domains**: Mathematics, coding, reasoning, science, general knowledge, competitive exams, Indian context + law, multilingual (Hindi and Hinglish)
- **Synthetic Data Advantage**: +15-20% performance boost in STEM & coding domains
- **Training Strategy**: Multi-stage distillation → SFT → safety alignment
- **Synthetic Data Source**: LLM-generated, curated with multi-turn reasoning traces for STEM/coding
 
## 8. Environmental Impact

Total training footprint ranges from ~298 kg CO₂e (realistic) to ~835 kg CO₂e (conservative worst-case).

*This makes Alpie-Core one of the most carbon-efficient reasoning models released to date.*

## 9. Use Cases

Best for **STEM**, **complex mathematical reasoning**, **coding**, and the **Indian context**.

1. **STEM**: Excels at solving advanced problems in science, technology, engineering, and mathematics with high accuracy.
2. **Complex Mathematical Reasoning**: Handles multi-step logical and quantitative reasoning tasks with strong reliability.
3. **Coding**: Supports software development, debugging, algorithmic problem-solving, and structured reasoning in code.
4. **Indian Context**: Provides culturally aware insights, competitive exam assistance (JEE, NEET, UPSC), and multilingual support in Hindi/Hinglish.
5. **Research Assistants**: Handles long contexts (65K tokens) for academic and legal research.

## 10. Safety and Limitations

- Fixed knowledge cutoff without real-time information retrieval
- Occasional struggles with complex multi-hop mathematical reasoning
- Potential hallucinations in factual question-answering; as with all LLMs, outputs should not be relied on for medical or legal advice without expert oversight
- Biases: training on synthetic and curated datasets reduces bias, but some risks may persist

### Mitigations
- Safety classifiers and output filtering systems
- Model-assisted safety pipeline using RLHF
- Comprehensive adversarial testing by domain experts
## 11. How to Use

### Non-Streaming Inference
 
import torch

# Load LoRA adapter configuration to find the base model
peft_model_id = "169Pi/Alpie-Core"
config = PeftConfig.from_pretrained(peft_model_id)

# Load the base model
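Only part of the original snippet survives in this hunk. For orientation, here is a minimal end-to-end sketch of the same non-streaming pattern (the 4-bit load settings mirror the model card; the prompt and generation parameters are illustrative assumptions, not the card's exact code):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel, PeftConfig

# Load LoRA adapter configuration to find the base model
peft_model_id = "169Pi/Alpie-Core"
config = PeftConfig.from_pretrained(peft_model_id)

# Load the base model recorded in the adapter config, in 4-bit NF4
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(peft_model_id)

# Attach the LoRA adapter
model = PeftModel.from_pretrained(base_model, peft_model_id)

# Illustrative prompt and generation settings
prompt = "Solve step by step: if 3x + 5 = 20, what is x?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```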
 
import torch

# Load LoRA adapter configuration to find the base model
peft_model_id = "169Pi/Alpie-Core"
config = PeftConfig.from_pretrained(peft_model_id)

# Load the base model
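This second fragment is the loading preamble of the streaming variant. A hedged sketch of token streaming with the stock transformers `TextIteratorStreamer`, reusing `model` and `tokenizer` from the sketch above (the prompt and parameters are illustrative):

```python
from threading import Thread
from transformers import TextIteratorStreamer

# Stream decoded tokens as they are generated
streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
inputs = tokenizer("Explain NF4 quantization in one paragraph.", return_tensors="pt").to(model.device)

# generate() blocks, so run it in a background thread and consume the streamer
thread = Thread(target=model.generate, kwargs=dict(**inputs, streamer=streamer, max_new_tokens=256))
thread.start()
for text_chunk in streamer:
    print(text_chunk, end="", flush=True)
thread.join()
```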
 
```bibtex
@misc{alpie2025core,
  title  = {Alpie-Core: A 4-bit Quantized Reasoning Model Surpassing Full-Precision Benchmarks},
  author = {169Pi AI},
  year   = {2025},
  url    = {https://huggingface.co/alpie/Alpie-Core}
}
```
 
## 13. Community & Contributions

This model is released under the Apache 2.0 license, and we warmly welcome the community to build on, download, and extend it.

1. **Issues & Discussions**: Report bugs, suggest features, or start conversations on the Hugging Face model page.
2. **Contributions**: Pull requests are welcome for error fixes, performance improvements, and extended functionality.
3. **Fine-tuning Results**: Share your experiments, benchmarks, and downstream applications with the community.
4. **Collaboration**: We encourage researchers, developers, and organizations to join in shaping the future of this model.

Together, we can continue to improve accessibility, safety, and performance for real-world AI applications.

## 14. License

Apache 2.0 License – permissive, allowing free use, modification, and distribution for both research and commercial purposes.

## 15. Acknowledgements / Credits

We would like to thank DeepSeek for their original model, which served as the foundation for this work. Our team fine-tuned the model and implemented 4-bit quantization, achieving improved efficiency and accuracy for downstream tasks. This model is built with respect for the contributions of the original authors and aims to provide a safe, high-performance solution for reasoning and inference.

We are also grateful to the Hugging Face ecosystem (Transformers, PEFT, vLLM, bitsandbytes), the open-source community datasets (MMLU, GSM8K, SWE-Bench, and others), and the support of various cloud providers. Finally, we acknowledge the broader AI research community and companies whose innovations and insights continue to inspire our work.

## 16. Contact
For technical inquiries and support: **contact@169pi.com**

---

Alpie-Core represents a milestone for open-source AI from India and one of the first efforts globally to show that 4-bit reasoning models can rival frontier-scale systems. We hope this release empowers developers, researchers, and organizations worldwide to build more efficient, inclusive, and impactful AI.

*For technical details, training methodology, and comprehensive evaluation results, please refer to our technical report.*