---
license: mit
base_model: microsoft/Phi-3-mini-128k-instruct
tags:
- phi-3
- fine-tuned
- distributed-training
- pytorch
language:
- en
---
# Fine-tuned Phi-3-mini Model
This is a fine-tuned version of microsoft/Phi-3-mini-128k-instruct, trained with a distributed (Ray-based) setup.
## Model Details
- **Base Model**: microsoft/Phi-3-mini-128k-instruct
- **Training Method**: Distributed fine-tuning with Ray
- **Shards Used**: 2
- **Parameters**: ~3.8B
## Training Information
The model was fine-tuned with a distributed approach that splits the work across two shards, coordinated by Ray. The base Phi-3-mini architecture and parameter count are unchanged; fine-tuning only updates the weights for the target tasks.
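
The training code itself is not included in this repository. The sketch below only illustrates what a data-parallel fine-tuning setup with two Ray Train workers might look like; the mapping of "2 shards" to two workers, the use of Ray Train specifically, and the elided training loop are assumptions.

```python
# Illustrative only: a minimal Ray Train (data-parallel) setup with two workers.
# The actual training script, dataset, and hyperparameters are not part of this repo.
from ray.train import ScalingConfig
from ray.train.torch import TorchTrainer

def train_loop_per_worker():
    import ray.train.torch
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-128k-instruct")
    model = ray.train.torch.prepare_model(model)  # wrap for distributed data parallel
    # ... build the tokenized DataLoader and run a standard PyTorch training loop here ...

trainer = TorchTrainer(
    train_loop_per_worker,
    scaling_config=ScalingConfig(num_workers=2, use_gpu=True),  # "2 shards" read as 2 workers (assumption)
)
result = trainer.fit()
```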
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the fine-tuned model and its tokenizer from the Hub
tokenizer = AutoTokenizer.from_pretrained("a-k-aAiMGoD/phi3-mini-distributed-fine-tune")
model = AutoModelForCausalLM.from_pretrained("a-k-aAiMGoD/phi3-mini-distributed-fine-tune")

# Example usage
text = "Hello, how are you?"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)  # cap the number of generated tokens
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
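
Because the base model is instruction-tuned, prompts are usually formatted with the tokenizer's chat template rather than passed as raw text. A minimal sketch, assuming the chat template is inherited unchanged from the base model:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("a-k-aAiMGoD/phi3-mini-distributed-fine-tune")
model = AutoModelForCausalLM.from_pretrained("a-k-aAiMGoD/phi3-mini-distributed-fine-tune")

# Format the prompt with the chat template and add the generation prompt token(s)
messages = [{"role": "user", "content": "Hello, how are you?"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")

outputs = model.generate(inputs, max_new_tokens=100)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```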
## Training Configuration
- Distributed across 2 shards
- Parallelized with Ray
- Intended for large-scale deployment
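
For multi-GPU deployment, the checkpoint can be loaded with its weights spread across the available devices. A minimal sketch, assuming the `accelerate` package is installed and bf16-capable GPUs are available:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# device_map="auto" (via accelerate) places the layers across available GPUs;
# bfloat16 roughly halves memory use compared to fp32.
model = AutoModelForCausalLM.from_pretrained(
    "a-k-aAiMGoD/phi3-mini-distributed-fine-tune",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("a-k-aAiMGoD/phi3-mini-distributed-fine-tune")
```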