Update README.md
README.md
CHANGED
@@ -133,7 +133,6 @@ This SFT approach enables Alpie-Core to deliver reliable, aligned, and context-aware
 ## 7. Training Details
 
 - **Hardware**: 8× NVIDIA H100-80GB GPUs
-- **Training Duration**: 408 hours
 - **Fine-tuning Method**: LoRA/QLoRA with the following configuration:
   - LoRA Alpha: 8
   - LoRA Dropout: 0.05
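The LoRA hyperparameters retained in this hunk map onto a Hugging Face `peft` configuration roughly as follows. This is a minimal sketch: only `lora_alpha=8` and `lora_dropout=0.05` come from the README; the rank, target modules, and model id are illustrative assumptions.

```python
# Minimal sketch of the LoRA setup described above, using Hugging Face peft.
# Only lora_alpha=8 and lora_dropout=0.05 come from the README; the rank,
# target modules, and model id below are illustrative assumptions.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

lora_config = LoraConfig(
    r=16,                                  # assumption: rank not shown in this hunk
    lora_alpha=8,                          # from the README
    lora_dropout=0.05,                     # from the README
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)

base = AutoModelForCausalLM.from_pretrained("org/alpie-core-base")  # hypothetical id
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # LoRA trains only the small adapter subset
```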
@@ -144,6 +143,8 @@ This SFT approach enables Alpie-Core to deliver reliable, aligned, and context-aware
 
 ## 8. Environmental Impact
 
+
+
 **Carbon Footprint**: We estimated the environmental impact of training Alpie-Core (32B) on 8× NVIDIA H100-80GB GPUs by calculating carbon emissions from GPU energy consumption. The calculation follows the formula:
 
 CO₂e (kg) = Grid CO₂ Factor (kg/kWh) × Runtime (hours) × Power per GPU (kW) × Number of GPUs
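Plugging in the numbers available in this README gives a quick sanity check of the formula. A minimal worked example: the 408-hour runtime is the training-duration figure deleted in the first hunk and the GPU count comes from the hardware line, while the grid factor and per-GPU power draw are illustrative assumptions.

```python
# Worked example of the CO₂e formula above. Runtime (408 h) and GPU count (8)
# come from this README; the grid factor and per-GPU draw are assumptions.
GRID_CO2_FACTOR = 0.4   # kg CO₂e per kWh -- assumption; varies widely by region
RUNTIME_HOURS   = 408   # from the README's (removed) training-duration line
POWER_PER_GPU   = 0.7   # kW -- assumption: H100 SXM is rated at 700 W TDP
NUM_GPUS        = 8     # from the README's hardware line

co2e_kg = GRID_CO2_FACTOR * RUNTIME_HOURS * POWER_PER_GPU * NUM_GPUS
print(f"Estimated training emissions: {co2e_kg:,.0f} kg CO₂e")  # ~914 kg here
```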
@@ -280,7 +281,6 @@ with torch.no_grad():
 ### Deployment Options
 - **Transformers**: Python, PyTorch integration
 - **vLLM**: High-throughput inference
-- **LMDeploy/Ollama/TensorRT-LLM**: Production deployments
 
 ## 12. Citation
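For the vLLM route kept in this hunk, serving reduces to a few lines. A minimal sketch, assuming a published Hub checkpoint; the model id below is a placeholder.

```python
# Minimal sketch of high-throughput inference with vLLM, per the option above.
# The model id is a placeholder for the released Alpie-Core checkpoint.
from vllm import LLM, SamplingParams

llm = LLM(model="org/alpie-core")  # hypothetical Hub id
params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Summarize the QLoRA training recipe."], params)
print(outputs[0].outputs[0].text)
```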
@@ -297,6 +297,10 @@ with torch.no_grad():
 
 Apache 2.0 – Free for research and commercial use
 
+## 14. Acknowledgements / Credits
+
+We would like to thank **DeepSeek** for their original model, which served as the foundation for this work. Our team fine-tuned the model and implemented **4-bit quantization**, achieving improved efficiency and accuracy for downstream tasks. This model is built with respect for the contributions of the original authors and aims to provide a safe, high-performance solution for reasoning and inference.
+
 ---
 
 *For technical details, training methodology, and comprehensive evaluation results, please refer to our technical report.*
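The 4-bit quantization credited in the new acknowledgements section is typically done with bitsandbytes through `transformers`. A minimal sketch under common QLoRA-style defaults; the quant type, compute dtype, and model id are assumptions, not the authors' published settings.

```python
# Minimal sketch of 4-bit loading via bitsandbytes, as the acknowledgements
# describe. NF4 + bfloat16 are common QLoRA-style defaults, assumed here;
# the model id is a placeholder.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # assumption
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumption
)
model = AutoModelForCausalLM.from_pretrained(
    "org/alpie-core",                       # hypothetical Hub id
    quantization_config=bnb_config,
    device_map="auto",
)
```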