Model Card for oleksiizirka/llama-3-8b-openemis-bot

Model Details

Model Description:
This is the model card for a fine-tuned version of Llama 3.1 with 8 billion parameters. The model has been fine-tuned on client-specific data and quantized to the GGUF format to reduce memory usage and improve inference performance. It was developed to handle domain-specific tasks for the client's project.

  • Developed by: Oleksii Zirka
  • Model type: Decoder-only transformer
  • Language(s) (NLP): English, Arabic, French, Russian
  • License: Apache 2.0
  • Finetuned from model: Llama 3.1 (8 billion parameters)

Uses

Direct Use

The model can assist with domain-specific NLP tasks such as data quality validation, report generation, and user query handling in education management systems, as in the sketch below.
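
As a minimal illustration of this kind of use, the sketch below prompts the model with a data-quality query. It assumes the fine-tune kept Llama 3.1's chat template; the record fields in the prompt are invented for the example.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "oleksiizirka/llama-3-8b-openemis-bot"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Hypothetical data-quality check; the record fields are illustrative only.
messages = [
    {"role": "system", "content": "You validate OpenEMIS student records."},
    {"role": "user", "content": "Check this record for inconsistencies: enrolment_date=2010-09-01, date_of_birth=2012-03-15."},
]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))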

Downstream Use

The model can be integrated into downstream, industry-specific applications for handling language variation, improving data validation, and generating insights from structured educational data.

Out-of-Scope Use

The model is not designed for offensive, illegal, or discriminatory purposes.

Bias, Risks, and Limitations

Since the model is fine-tuned on client-specific data, it may not generalize well to unrelated domains or data. Users should review outputs for bias or unintended content before relying on them in their use case.


How to Get Started with the Model

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("oleksiizirka/llama-3-8b-openemis-bot")
model = AutoModelForCausalLM.from_pretrained("oleksiizirka/llama-3-8b-openemis-bot")

inputs = tokenizer("Your input text here", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
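
If you prefer the quantized GGUF weights mentioned in the description, they can be loaded with llama-cpp-python. This is a minimal sketch: the repo_id and filename below are assumptions and should be checked against the GGUF files that are actually published.

# Hedged sketch: loading a GGUF quantization with llama-cpp-python.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="oleksiizirka/llama-3-8b-openemis-bot",  # assumption: where the GGUF files live
    filename="*Q4_K_M.gguf",                         # assumption: quantization level to fetch
    n_ctx=4096,
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarise enrolment anomalies in this dataset."}]
)
print(out["choices"][0]["message"]["content"])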

Training Details

Training Data

The model was fine-tuned on educational data provided by the client, covering examination, reporting, and validation tasks relevant to OpenEMIS deployments. This improves its handling of queries related to educational data management.

Training Procedure

  • Preprocessing: Data was preprocessed to remove noise and structured to fit the task-specific requirements of the client’s domain.

Training Hyperparameters

  • Training regime: three epochs, chosen to avoid overfitting on a relatively small dataset (~8,000 examples); see the sketch below for a comparable setup.
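
The exact fine-tuning recipe is not published with this card. The sketch below shows a typical parameter-efficient setup (4-bit base weights plus LoRA adapters) sized for the Tesla T4 listed under Compute Infrastructure; only the three-epoch regime comes from the card, and every other value, including the base checkpoint, is an assumption.

# Hedged sketch of a LoRA fine-tuning setup; num_train_epochs=3 comes from
# this card, all other values are illustrative assumptions.
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model

bnb = BitsAndBytesConfig(load_in_4bit=True)  # assumption: 4-bit base to fit a 16 GB T4
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",      # assumption: which Llama 3.1 8B checkpoint was used
    quantization_config=bnb,
)
lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)

args = TrainingArguments(
    output_dir="llama-3-8b-openemis-bot",
    num_train_epochs=3,              # from the card: three epochs on ~8,000 examples
    per_device_train_batch_size=2,   # assumption
    gradient_accumulation_steps=8,   # assumption
    learning_rate=2e-4,              # assumption
    fp16=True,
)
# A Trainer (or trl's SFTTrainer) would then be built from `model`, `args`,
# and the client dataset, which is not publicly available.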

Evaluation

Testing Data, Factors & Metrics

  • Testing Data: The model was tested on the client's dataset using representative queries.
  • Factors: performance across the domain-specific task types described above (validation, reporting, query handling).
  • Metrics: accuracy on query-handling and data-validation tasks; a minimal illustration of such scoring follows below.
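
The evaluation harness is not released; as a rough, hypothetical illustration, accuracy over a set of test queries could be computed along these lines (the field names and the exact-match rule are assumptions):

# Hedged sketch: exact-match accuracy over test queries. The "prompt" and
# "expected" fields are illustrative; the client's actual scoring is not public.
def exact_match_accuracy(examples, generate_fn):
    correct = 0
    for ex in examples:
        prediction = generate_fn(ex["prompt"]).strip().lower()
        if prediction == ex["expected"].strip().lower():
            correct += 1
    return correct / len(examples)

# Usage: accuracy = exact_match_accuracy(test_examples, lambda p: run_model(p))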

Results

The fine-tuned model performed well on client-specific validation metrics, but generalization to external datasets may be limited.


Environmental Impact

  • Hardware Type: Tesla T4 GPU
  • Cloud Provider: AWS
  • Compute Region: Singapore

Technical Specifications

Model Architecture and Objective

The model uses the Llama 3.1 architecture with 8 billion parameters, fine-tuned on domain-specific data for better performance in education management tasks.

Compute Infrastructure

  • Hardware: Tesla T4 GPU
  • Software: Python 3.12, PyTorch, Hugging Face Transformers, AWS SageMaker

Citation

You can use the following citations to credit this model:

BibTeX:

@misc{oleksiizirka_llama_3_8b_openemis_bot,
  author = {Oleksii Zirka},
  title = {Llama 3.1 Fine-Tuned Model for OpenEMIS Data},
  year = {2024},
  howpublished = {\url{https://huggingface.co/oleksiizirka/llama-3-8b-openemis-bot}},
}

APA:

Zirka, O. (2024). Llama 3.1 Fine-Tuned Model for OpenEMIS Data. Hugging Face. https://huggingface.co/oleksiizirka/llama-3-8b-openemis-bot

Model Card Contact

For inquiries, please reach out to Oleksii Zirka at oleksii.zirka@kordit.com.
