deepanshupillm committed
Commit bae016e · verified · 1 Parent(s): f39c6c7

Update README.md

Files changed (1)
  1. README.md +18 -0
README.md CHANGED
```diff
@@ -17,6 +17,7 @@ pipeline_tag: text-generation
 ---
 # Alpie-Core: 4-bit Quantized Reasoning Model
 
+📄 **[Technical Report: Alpie_Core.pdf](./Alpie_Core.pdf)**
 
 ## 1. Introduction
 
@@ -79,6 +80,14 @@ This SFT approach enables Alpie-Core to deliver reliable, aligned, and context-a
 
 ## 6. Benchmark Results
 
+![GSM8K Benchmark](Benchmark_GSM8K.png)
+
+![AIME Benchmark](Benchmark_AIME.png)
+
+![BBH Benchmark](Benchmark_BBH.png)
+
+![Combined Benchmark](combined_benchmark.png)
+
 | Benchmark | Alpie-Core (32B-4bit) | DeepSeek-V2 (236B) | Qwen2.5 72B | Llama 3.1 405B | Llama 3.1 70B | Gemma-3 27B-PT | Mistral-Small-24B-Base-2501 |
 |-----------|----------------------|-------------------|-------------|---------------|---------------|----------------|----------------------------|
 | MMLU (5-shot) | **81.28%** | 78.4% | 85.0% | 84.4% | 79.3% | 78.6% | 80.73% |
@@ -90,6 +99,8 @@ This SFT approach enables Alpie-Core to deliver reliable, aligned, and context-a
 
 ### SWE-Bench Verified Performance
 
+![SWE-Bench Performance](swe.png)
+
 | Rank | Model | Accuracy (%) | Performance vs Alpie |
 |------|-------|-------------|---------------------|
 | **1** | **Alpie Core** | **57.8** | **Alpie** |
@@ -102,6 +113,8 @@ This SFT approach enables Alpie-Core to deliver reliable, aligned, and context-a
 
 ### Humanity's Last Exam Leaderboard Performance
 
+![Humanity's Last Exam](Humanity's_Last_Exam_(Text_Only).png)
+
 | Rank | Model | Accuracy (%) | Performance vs Alpie |
 |------|-------|-------------|---------------------|
 | 1 | GPT 4.5 Preview | 5.8 | Above Alpie |
@@ -126,6 +139,7 @@ This SFT approach enables Alpie-Core to deliver reliable, aligned, and context-a
 | CommonSenseQA | **87.06%** | Commonsense |
 | AGIEval | **64.98%** | General Intelligence |
 | Winogrande | **79.53%** | Commonsense Reasoning |
+| MATH-500 | **70.00%** | Advanced Mathematics |
 
 ## 7. Training Details
 
@@ -298,6 +312,10 @@ Apache 2.0 – Free for research and commercial use
 
 We would like to thank **DeepSeek** for their original model, which served as the foundation for this work. Our team fine-tuned the model and implemented **4-bit quantization**, achieving improved efficiency and accuracy for downstream tasks. This model is built with respect to the contributions of the original authors and aims to provide a safe, high-performance solution for reasoning and inference.
 
+## 15. Contact
+
+For technical inquiries and support: **contact@169pi.com**
+
 ---
 
 *For technical details, training methodology, and comprehensive evaluation results, please refer to our technical report.*
```
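The README above centers on 4-bit quantized inference, so a minimal loading sketch may help readers who want to try the model. This is a hedged sketch, not the authors' documented setup: the repo id `169Pi/Alpie-Core` is a hypothetical placeholder, and it assumes the published weights load through the standard `transformers` plus `bitsandbytes` NF4 path rather than the exact quantization pipeline the team describes in their technical report.

```python
# Hypothetical sketch: loading a 4-bit quantized reasoning model for
# inference with transformers + bitsandbytes. The repo id below is a
# placeholder, not a confirmed location of the Alpie-Core weights.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "169Pi/Alpie-Core"  # placeholder repo id

# NF4 quantization config: weights stay 4-bit, matmuls run in bfloat16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # requires the accelerate package
)

prompt = "Solve step by step: what is 17 * 24?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

At 4 bits per weight, a 32B-parameter model needs roughly 16 GB for weights, about a quarter of its fp16 footprint, which is the efficiency gain the acknowledgment paragraph alludes to.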