deepanshupillm committed
Commit bae016e · verified · 1 Parent(s): f39c6c7

Update README.md

Files changed (1)
  1. README.md +18 -0
README.md CHANGED
```diff
@@ -17,6 +17,7 @@ pipeline_tag: text-generation
 ---
 # Alpie-Core: 4-bit Quantized Reasoning Model
 
+📄 **[Technical Report: Alpie_Core.pdf](./Alpie_Core.pdf)**
 
 ## 1. Introduction
 
@@ -79,6 +80,14 @@ This SFT approach enables Alpie-Core to deliver reliable, aligned, and context-a
 
 ## 6. Benchmark Results
 
+![GSM8K Benchmark](Benchmark_GSM8K.png)
+
+![AIME Benchmark](Benchmark_AIME.png)
+
+![BBH Benchmark](Benchmark_BBH.png)
+
+![Combined Benchmark](combined_benchmark.png)
+
 | Benchmark | Alpie-Core (32B-4bit) | DeepSeek-V2 (236B) | Qwen2.5 72B | Llama 3.1 405B | Llama 3.1 70B | Gemma-3 27B-PT | Mistral-Small-24B-Base-2501 |
 |-----------|----------------------|-------------------|-------------|---------------|---------------|----------------|----------------------------|
 | MMLU (5-shot) | **81.28%** | 78.4% | 85.0% | 84.4% | 79.3% | 78.6% | 80.73% |
@@ -90,6 +99,8 @@ This SFT approach enables Alpie-Core to deliver reliable, aligned, and context-a
 
 ### SWE-Bench Verified Performance
 
+![SWE-Bench Performance](swe.png)
+
 | Rank | Model | Accuracy (%) | Performance vs Alpie |
 |------|-------|-------------|---------------------|
 | **1** | **Alpie Core** | **57.8** | **Alpie** |
@@ -102,6 +113,8 @@ This SFT approach enables Alpie-Core to deliver reliable, aligned, and context-a
 
 ### Humanity's Last Exam Leaderboard Performance
 
+![Humanity's Last Exam](Humanity's_Last_Exam_(Text_Only).png)
+
 | Rank | Model | Accuracy (%) | Performance vs Alpie |
 |------|-------|-------------|---------------------|
 | 1 | GPT 4.5 Preview | 5.8 | Above Alpie |
@@ -126,6 +139,7 @@ This SFT approach enables Alpie-Core to deliver reliable, aligned, and context-a
 | CommonSenseQA | **87.06%** | Commonsense |
 | AGIEval | **64.98%** | General Intelligence |
 | Winogrande | **79.53%** | Commonsense Reasoning |
+| MATH-500 | **70.00%** | Advanced Mathematics |
 
 ## 7. Training Details
 
@@ -298,6 +312,10 @@ Apache 2.0 – Free for research and commercial use
 
 We would like to thank **DeepSeek** for their original model, which served as the foundation for this work. Our team fine-tuned the model and implemented **4-bit quantization**, achieving improved efficiency and accuracy for downstream tasks. This model is built with respect to the contributions of the original authors and aims to provide a safe, high-performance solution for reasoning and inference.
 
+## 15. Contact
+
+For technical inquiries and support: **contact@169pi.com**
+
 ---
 
 *For technical details, training methodology, and comprehensive evaluation results, please refer to our technical report.*
```
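The README above centers on 4-bit quantized inference, so a minimal loading sketch may help readers who want to try the model. This is a hedged sketch, not the authors' documented setup: the repo id `169Pi/Alpie-Core` is a hypothetical placeholder, and it assumes the published weights load through the standard `transformers` plus `bitsandbytes` NF4 path rather than the exact quantization pipeline the team describes in their technical report.

```python
# Hypothetical sketch: loading a 4-bit quantized reasoning model for
# inference with transformers + bitsandbytes. The repo id below is a
# placeholder, not a confirmed location of the Alpie-Core weights.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "169Pi/Alpie-Core"  # placeholder repo id

# NF4 quantization config: weights stay 4-bit, matmuls run in bfloat16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # requires the accelerate package
)

prompt = "Solve step by step: what is 17 * 24?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

At 4 bits per weight, a 32B-parameter model needs roughly 16 GB for weights, about a quarter of its fp16 footprint, which is the efficiency gain the acknowledgment paragraph alludes to.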