Model Card for oleksiizirka/llama-3-8b-openemis-bot
Model Details
Model Description:
This is the model card for a fine-tuned version of Llama 3.1 with 8 billion parameters. The model has been fine-tuned on client-specific data and quantized to the GGUF format to reduce memory usage and improve inference performance. It was developed to better handle domain-specific tasks for the client's project.
- Developed by: Oleksii Zirka
- Model type: Decoder-only transformer
- Language(s) (NLP): English, Arabic, French, Russian
- License: Apache 2.0
- Finetuned from model: Llama 3.1 (8 billion parameters)
Uses
Direct Use
The model can be used to assist with domain-specific NLP tasks such as data quality validation, report generation, and user query handling in education management systems.
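For illustration, a minimal sketch of issuing such a query through the transformers text-generation pipeline is shown below; the prompt wording is a hypothetical example, not a prescribed template.

from transformers import pipeline

# Build a text-generation pipeline around the fine-tuned model
generator = pipeline("text-generation", model="oleksiizirka/llama-3-8b-openemis-bot")

# Hypothetical data-quality validation query (example phrasing only)
prompt = "Check the following enrolment record for missing or inconsistent fields: ..."
result = generator(prompt, max_new_tokens=128)
print(result[0]["generated_text"])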
Downstream Use
Applicable in industry-specific tasks for handling language variations, improving data validation, and generating insights from structured educational data.
Out-of-Scope Use
The model is not designed for offensive, illegal, or discriminatory purposes.
Bias, Risks, and Limitations
Since the model is fine-tuned on client-specific data, it might not generalize well to unrelated domains or data. Users should ensure proper usage and review results for bias or unintended outputs in their use case.
How to Get Started with the Model
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("oleksiizirka/llama-3-8b-openemis-bot")
model = AutoModelForCausalLM.from_pretrained("oleksiizirka/llama-3-8b-openemis-bot")

# Tokenize a prompt, generate a completion, and decode it back to text
inputs = tokenizer("Your input text here", return_tensors="pt")
outputs = model.generate(**inputs)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
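Since the model is also described as quantized to GGUF (see Model Description), it can alternatively be run with a GGUF-compatible runtime such as llama-cpp-python. The sketch below assumes a GGUF file has been downloaded locally; the filename is a hypothetical example and depends on what is actually published in the repository.

from llama_cpp import Llama

# Load a locally downloaded GGUF quantization of the model
# (the filename below is a hypothetical example)
llm = Llama(model_path="llama-3-8b-openemis-bot.Q4_K_M.gguf", n_ctx=4096)

# Generate a completion for a prompt
output = llm("Your input text here", max_tokens=128)
print(output["choices"][0]["text"])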
Training Details
Training Data
The model was fine-tuned on educational data provided by the client, which includes examination, reporting, and validation tasks relevant to OpenEMIS deployments. This ensures improved handling of specific queries related to educational data management.
Training Procedure
- Preprocessing: Data was preprocessed to remove noise and structured to fit the task-specific requirements of the client’s domain.
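As a rough illustration of this step, the sketch below converts raw records into instruction/response pairs for fine-tuning; the field names, cleaning rules, and example record are hypothetical, since the client data itself is not public.

import json
import re

def clean(text):
    # Strip markup remnants and collapse whitespace (illustrative "noise removal")
    text = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"\s+", " ", text).strip()

# Hypothetical raw records with a question and an expected answer field
raw_records = [
    {"question": "How many students were examined in 2023?", "answer": "12,450"},
]

# Write instruction/response pairs as JSONL for training
with open("train.jsonl", "w", encoding="utf-8") as f:
    for record in raw_records:
        pair = {
            "instruction": clean(record["question"]),
            "response": clean(record["answer"]),
        }
        f.write(json.dumps(pair, ensure_ascii=False) + "\n")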
Training Hyperparameters
- Training regime: Trained for three epochs to avoid overfitting on a relatively small dataset (~8,000 examples).
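A minimal sketch of what such a run could look like with the Hugging Face Trainer is shown below. Aside from the three epochs stated above, everything else (the base checkpoint name, batch size, learning rate, sequence length, and the train.jsonl file from the preprocessing sketch) is an assumption, not the configuration actually used.

from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "meta-llama/Llama-3.1-8B"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(base)

# ~8,000 instruction/response pairs, e.g. the train.jsonl produced above
dataset = load_dataset("json", data_files="train.jsonl", split="train")
dataset = dataset.map(
    lambda ex: tokenizer(ex["instruction"] + "\n" + ex["response"],
                         truncation=True, max_length=512),
    remove_columns=dataset.column_names,
)

# Three epochs as stated above; batch size and learning rate are illustrative assumptions
args = TrainingArguments(
    output_dir="llama-3-8b-openemis-bot",
    num_train_epochs=3,
    per_device_train_batch_size=2,
    learning_rate=2e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()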
Evaluation
Testing Data, Factors & Metrics
- Testing Data: The model was tested on the client's dataset using various representative queries.
- Factors: Focus on domain-specific tasks.
- Metrics: Accuracy of query handling and data validation tasks.
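As a sketch of how such an accuracy figure could be computed, the snippet below compares model outputs against expected answers for a set of held-out queries; the test cases and the matching rule are hypothetical and only illustrate the idea of the metric.

from transformers import pipeline

generator = pipeline("text-generation", model="oleksiizirka/llama-3-8b-openemis-bot")

def generate_answer(query):
    out = generator(query, max_new_tokens=32, return_full_text=False)
    return out[0]["generated_text"]

# Hypothetical held-out test cases: (query, expected answer)
test_cases = [
    ("Is the enrolment date 2023-13-01 valid? Answer yes or no.", "no"),
    ("Is the enrolment date 2023-09-01 valid? Answer yes or no.", "yes"),
]

# Count a query as handled correctly if the expected answer appears in the output
correct = sum(expected.lower() in generate_answer(query).lower()
              for query, expected in test_cases)
print(f"Accuracy: {correct / len(test_cases):.2%}")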
Results
The fine-tuned model performed well on client-specific validation metrics, but generalization to external datasets may be limited.
Environmental Impact
- Hardware Type: Tesla T4 GPU
- Cloud Provider: AWS
- Compute Region: Singapore
Technical Specifications
Model Architecture and Objective
The model uses the Llama 3.1 architecture with 8 billion parameters, fine-tuned on domain-specific data for better performance in education management tasks.
Compute Infrastructure
- Hardware: Tesla T4 GPU
- Software: Python 3.12, PyTorch, HuggingFace Transformers, AWS SageMaker
Citation
You can use the following citations to credit this model:
BibTeX:
@misc{oleksiizirka_llama_3_8b_openemis_bot,
  author       = {Oleksii Zirka},
  title        = {Llama 3.1 Fine-Tuned Model for OpenEMIS Data},
  year         = {2024},
  howpublished = {\url{https://huggingface.co/oleksiizirka/llama-3-8b-openemis-bot}},
}
APA:
Zirka, O. (2024). Llama 3.1 Fine-Tuned Model for OpenEMIS Data. Hugging Face. https://huggingface.co/oleksiizirka/llama-3-8b-openemis-bot
Model Card Contact
For inquiries, please reach out to Oleksii Zirka at oleksii.zirka@kordit.com.