Qwen2.5-7B English-Kannada Translation Model
A fine-tuned translation model based on Qwen2.5-7B-Instruct, specialized for translating between English and Kannada (ಕನ್ನಡ).
Model Description
This model is a fine-tuned version of Qwen/Qwen2.5-7B-Instruct trained on English-Kannada translation pairs. Kannada is a Dravidian language spoken primarily in the Indian state of Karnataka.
Training ran across 4x NVIDIA A100-SXM4-40GB GPUs for 6h 48m 10s, processing 64,603,656 tokens. The frameworks used were transformers, peft, and trl, with accelerate for the distributed setup.
How to Use
Basic Usage
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("RakshithFury/Qwen2.5-7b-en-kn-translate")
model = AutoModelForCausalLM.from_pretrained("RakshithFury/Qwen2.5-7b-en-kn-translate")
model = model.to("cuda:0")

# Sentences to translate (replace with your own).
sentences = ["What is the meaning of life?"]

for sentence in sentences:
    messages = [
        {"role": "user", "content": "Translate the following English sentence to Kannada: " + sentence},
    ]
    # Build the chat-formatted prompt and move the tensors to the model's device.
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)
    # do_sample=True is required for temperature/min_p to take effect.
    outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.5, min_p=0.1)
    # Decode only the newly generated tokens, skipping the prompt.
    res = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
    print(res)
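Loading a 7B model in full fp32 precision needs roughly 28 GB of GPU memory. If that is tight, a half-precision load usually works; here is a minimal variant of the load step, assuming an Ampere-or-newer GPU that supports bfloat16:

import torch
from transformers import AutoModelForCausalLM

# Load the weights in bfloat16 to roughly halve memory use
# (assumes the GPU supports bfloat16, e.g. A100 or newer).
model = AutoModelForCausalLM.from_pretrained(
    "RakshithFury/Qwen2.5-7b-en-kn-translate",
    torch_dtype=torch.bfloat16,
).to("cuda:0")

With the prompt above, the input "What is the meaning of life?" should come back as ಜೀವನದ ಅರ್ಥ ಏನು? (see Example 1 below).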
Training Details
Training Data
Trained on 500,000 English-Kannada translation pairs.
- Dataset: https://www.kaggle.com/datasets/parvmodi/english-to-kannada-machine-translation-dataset
- Size: 500,000 samples (64,603,656 tokens) sampled randomly from the above corpus
Training Procedure
- Frameworks: transformers, trl, peft
- Distributed training: Yes, using DDP through accelerate
- LoRA: Yes
Training Hyperparameters
- Batch size: per_device_batch_size=4, gradient_accumulation=1, num_gpus=4, i.e. an effective batch size of 4 x 1 x 4 = 16
- Epochs: 1
- Optimizer: AdamW
- Learning rate: 2e-4
- LoRA rank: 8
- LoRA alpha: 16
The final training loss is 0.5036, with a token accuracy of 87%.
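The card does not ship the training script; below is a minimal sketch of how this setup could be reproduced with trl + peft under the hyperparameters above. The dataset construction and chat formatting of the pairs are assumptions, not the author's actual code.

# Minimal reproduction sketch (not the author's actual script).
# Hyperparameters mirror the list above; the dataset formatting
# below is an assumption about how the EN-KN pairs were structured.
from datasets import Dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Chat-formatted translation pairs (toy example; load the real corpus here).
train_dataset = Dataset.from_list([
    {"messages": [
        {"role": "user", "content": "Translate the following English sentence to Kannada: What is the meaning of life?"},
        {"role": "assistant", "content": "ಜೀವನದ ಅರ್ಥ ಏನು?"},
    ]},
])

peft_config = LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM")

args = SFTConfig(
    output_dir="qwen2.5-7b-en-kn-translate",
    per_device_train_batch_size=4,   # x 4 GPUs x 1 accumulation = 16 effective
    gradient_accumulation_steps=1,
    num_train_epochs=1,
    learning_rate=2e-4,
    optim="adamw_torch",
)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-7B-Instruct",
    args=args,
    train_dataset=train_dataset,
    peft_config=peft_config,
)
trainer.train()

For the 4-GPU DDP setup, the script would be run with accelerate launch --num_processes 4 train.py (script name assumed).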
Eval Data
- Dataset: https://www.kaggle.com/datasets/parvmodi/english-to-kannada-machine-translation-dataset
- Size: 100,000 samples (62,014,459 tokens) sampled randomly from the above corpus, ensuring none of the examples appear in the training data.
The eval loss has not yet saturated; there is still scope for it to decrease with further training.
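The exact sampling procedure is not documented here; one straightforward way to draw disjoint train/eval splits of this size is sketched below (the corpus file path and column name are assumptions):

# Hedged sketch of disjoint train/eval sampling; the file name and the
# "english" column are assumptions, not from this card.
from datasets import load_dataset

corpus = load_dataset("csv", data_files="en_kn_corpus.csv", split="train")
corpus = corpus.shuffle(seed=42)

train_split = corpus.select(range(500_000))
eval_split = corpus.select(range(500_000, 600_000))

# Sanity check: no English source sentence is shared across splits.
train_sources = set(train_split["english"])
assert all(s not in train_sources for s in eval_split["english"])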
Hardware
- GPU: 4x NVIDIA A100-SXM4-40GB
- CPU count: 128
- Logical CPU count: 256
Example Translations
Example 1
English sentence:
What is the meaning of life?
Default model:
生命周期 ನೀಡಲು ಎಂದು ತೆರೆಯಿರಿ?
Finetuned model:
ಜೀವನದ ಅರ್ಥ ಏನು?
Example 2
English sentence:
My biggest problem is deciding what I should wear.
Default model:
ನ ಹೊಸ ಸಮಸ್ಯೆಯು ನನ್ನ ವೈರಾಗ್ಯವನ್ನು ತಿಳಿದೇಕ್ಕಾಗಿ ಎಂಬ ವೈಸೀನಿಯನ್ನು ಒಡ್ಡುವುದು.
Finetuned model:
ನಾನು ಏನು ಧರಿಸಬೇಕೆಂದು ನಿರ್ಧರಿಸುವುದು ನನಗೆ ಅತಿ ದೊಡ್ಡ ಸಮಸ್ಯೆ.
Example 3
English sentence:
It was probably the first thing I remembered from my early childhood.
Default model:
ಯಾವುದೇ ಈಗ ಹಲವಾರು ವರ್ಷಗಳ ಕ್ಕೆ ಪ್ರೊಜೆಕ್ಟ್ನಲ್ಲಿ ಮನೆಯಲ್ಲಿ ಬರುತ್ತಿರುವ ಮುಖ್ಯ ಚಿತ್ರಗಳು ನಂತರ ಮುಂದೆ ಹೆಚ್ಚು ವರ್ಷಗಳ ಕ್ಕೆ ಸೆಟ್ಪಡಿಸಲಾಗಿದೆ.
Finetuned model:
ಬೆಳೆದ ಮೊದಲ ವರ್ಷದಲ್ಲಿ ನನಗೆ ಸಂಭವನೀಯವಾಗಿ ಮರೆಯಲಾಗದ ಒಂದು ಘಟನೆ.
Example 4
English sentence:
Captain America is my favorite Avenger
Default model:
ಕप्टन ಅಮೆರಿಕಾ ಎಂದರೆ ನನ್ನ ಪ್ರಯತ್ನಿತ ಏವ್ನೇಂಟಿನ ಸೊನ್ನೋತ್ತಮ ವಿಷಯವಾಗಿದೆ.
Finetuned model:
ನಾನು ನಿರ್ದೇಶಕ ಅವರ ಪ್ರಿಯ ಸ್ಟಾರ್ ಆಗಿದ್ದೇನೆ ಕ್ಯಾಪ್ಟನ್ ಅಮೆರಿಕಾ.
CO2 Emission Related to Experiments
Experiments were conducted on private infrastructure with a carbon efficiency of 0.432 kgCO2/kWh.
Total emissions are estimated at 0.76 kgCO2, which is equivalent to:
- 3.07 km driven by an average ICE car
- 0.38 kg of coal burned
- 0.01 tree seedlings sequestering carbon for 10 years
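As a sanity check, these equivalences follow from simple ratios. The per-unit factors below are back-calculated from the figures above and are approximate:

# Back-of-envelope check of the equivalence figures above. The per-unit
# factors are approximations inferred from this card's own numbers
# (roughly in line with common ML CO2 calculator defaults).
total_kg_co2 = 0.76

kg_co2_per_km_ice_car = 0.2476    # assumed average ICE car emissions per km
kg_co2_per_kg_coal = 2.0          # assumed CO2 released per kg of coal
kg_co2_per_seedling_10y = 60.5    # assumed 10-year sequestration per seedling

print(f"{total_kg_co2 / kg_co2_per_km_ice_car:.2f} km driven")      # ~3.07
print(f"{total_kg_co2 / kg_co2_per_kg_coal:.2f} kg coal burned")    # ~0.38
print(f"{total_kg_co2 / kg_co2_per_seedling_10y:.2f} seedlings")    # ~0.01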
Limitations and Bias
Known Limitations
- The model may struggle with:
  - Complex sentences containing uncommon or out-of-domain words (e.g., "Avenger"; see Example 4)
  - Very long sentences
Citation
If you use this model in your research, please cite:
@misc{RakshithFury/Qwen2.5-7b-en-kn-translate,
  author = {Rakshith Rao},
  title = {Qwen2.5-7B English-Kannada Translation Model},
  year = {2024},
  publisher = {Hugging Face},
  url = {https://huggingface.co/RakshithFury/Qwen2.5-7b-en-kn-translate}
}
Base Model Citation
@article{qwen2.5,
  title = {Qwen2.5: A Party of Foundation Models},
  author = {Qwen Team},
  year = {2024},
  journal = {arXiv preprint arXiv:2412.xxxxx}
}
Contact
For questions or feedback:
- Email: rakshithdrao@gmail.com
- LinkedIn: https://www.linkedin.com/in/12rakshith-rao/

