Model Card for oleksiizirka/llama-3-8b-openemis-bot

Model Details

Model Description:
This is the model card for a fine-tuned version of Llama 3.1 with 8 billion parameters. The model has been fine-tuned on client-specific data and quantized to the GGUF format to reduce memory usage and improve inference performance. It was developed to handle domain-specific tasks for the client's project.

  • Developed by: Oleksii Zirka
  • Model type: Decoder-only transformer
  • Language(s) (NLP): English, Arabic, French, Russian
  • License: Apache 2.0
  • Finetuned from model: Llama 3.1 (8 billion parameters)

Uses

Direct Use

The model can assist with domain-specific NLP tasks such as data quality validation, report generation, and user query handling in education management systems, as in the sketch below.
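
As a minimal illustration of this kind of use, the sketch below prompts the model with a data-quality query. It assumes the fine-tune kept Llama 3.1's chat template; the record fields in the prompt are invented for the example.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "oleksiizirka/llama-3-8b-openemis-bot"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Hypothetical data-quality check; the record fields are illustrative only.
messages = [
    {"role": "system", "content": "You validate OpenEMIS student records."},
    {"role": "user", "content": "Check this record for inconsistencies: enrolment_date=2010-09-01, date_of_birth=2012-03-15."},
]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))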

Downstream Use

The model can be integrated into downstream, industry-specific applications for handling language variation, improving data validation, and generating insights from structured educational data.

Out-of-Scope Use

The model is not designed for offensive, illegal, or discriminatory purposes.

Bias, Risks, and Limitations

Since the model is fine-tuned on client-specific data, it may not generalize well to unrelated domains or data. Users should review outputs for bias or unintended content before relying on them in their use case.


How to Get Started with the Model

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("oleksiizirka/llama-3-8b-openemis-bot")
model = AutoModelForCausalLM.from_pretrained("oleksiizirka/llama-3-8b-openemis-bot")

inputs = tokenizer("Your input text here", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
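
If you prefer the quantized GGUF weights mentioned in the description, they can be loaded with llama-cpp-python. This is a minimal sketch: the repo_id and filename below are assumptions and should be checked against the GGUF files that are actually published.

# Hedged sketch: loading a GGUF quantization with llama-cpp-python.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="oleksiizirka/llama-3-8b-openemis-bot",  # assumption: where the GGUF files live
    filename="*Q4_K_M.gguf",                         # assumption: quantization level to fetch
    n_ctx=4096,
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarise enrolment anomalies in this dataset."}]
)
print(out["choices"][0]["message"]["content"])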

Training Details

Training Data

The model was fine-tuned on educational data provided by the client, covering examination, reporting, and validation tasks relevant to OpenEMIS deployments. This improves its handling of queries related to educational data management.

Training Procedure

  • Preprocessing: Data was preprocessed to remove noise and structured to fit the task-specific requirements of the client’s domain.

Training Hyperparameters

  • Training regime: three epochs, chosen to avoid overfitting on a relatively small dataset (~8,000 examples); see the sketch below for a comparable setup.
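
The exact fine-tuning recipe is not published with this card. The sketch below shows a typical parameter-efficient setup (4-bit base weights plus LoRA adapters) sized for the Tesla T4 listed under Compute Infrastructure; only the three-epoch regime comes from the card, and every other value, including the base checkpoint, is an assumption.

# Hedged sketch of a LoRA fine-tuning setup; num_train_epochs=3 comes from
# this card, all other values are illustrative assumptions.
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model

bnb = BitsAndBytesConfig(load_in_4bit=True)  # assumption: 4-bit base to fit a 16 GB T4
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",      # assumption: which Llama 3.1 8B checkpoint was used
    quantization_config=bnb,
)
lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)

args = TrainingArguments(
    output_dir="llama-3-8b-openemis-bot",
    num_train_epochs=3,              # from the card: three epochs on ~8,000 examples
    per_device_train_batch_size=2,   # assumption
    gradient_accumulation_steps=8,   # assumption
    learning_rate=2e-4,              # assumption
    fp16=True,
)
# A Trainer (or trl's SFTTrainer) would then be built from `model`, `args`,
# and the client dataset, which is not publicly available.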

Evaluation

Testing Data, Factors & Metrics

  • Testing Data: The model was tested on the client's dataset using representative queries.
  • Factors: performance across the domain-specific task types described above (validation, reporting, query handling).
  • Metrics: accuracy on query-handling and data-validation tasks; a minimal illustration of such scoring follows below.
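
The evaluation harness is not released; as a rough, hypothetical illustration, accuracy over a set of test queries could be computed along these lines (the field names and the exact-match rule are assumptions):

# Hedged sketch: exact-match accuracy over test queries. The "prompt" and
# "expected" fields are illustrative; the client's actual scoring is not public.
def exact_match_accuracy(examples, generate_fn):
    correct = 0
    for ex in examples:
        prediction = generate_fn(ex["prompt"]).strip().lower()
        if prediction == ex["expected"].strip().lower():
            correct += 1
    return correct / len(examples)

# Usage: accuracy = exact_match_accuracy(test_examples, lambda p: run_model(p))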

Results

The fine-tuned model performed well on client-specific validation metrics, but generalization to external datasets may be limited.


Environmental Impact

  • Hardware Type: Tesla T4 GPU
  • Cloud Provider: AWS
  • Compute Region: Singapore

Technical Specifications

Model Architecture and Objective

The model uses the Llama 3.1 architecture with 8 billion parameters, fine-tuned on domain-specific data for better performance in education management tasks.

Compute Infrastructure

  • Hardware: Tesla T4 GPU
  • Software: Python 3.12, PyTorch, Hugging Face Transformers, AWS SageMaker

Citation

You can use the following citations to credit this model:

BibTeX:

@misc{oleksiizirka_llama_3_8b_openemis_bot,
  author = {Oleksii Zirka},
  title = {Llama 3.1 Fine-Tuned Model for OpenEMIS Data},
  year = {2024},
  howpublished = {\url{https://huggingface.co/oleksiizirka/llama-3-8b-openemis-bot}},
}

APA:

Zirka, O. (2024). Llama 3.1 Fine-Tuned Model for OpenEMIS Data. Hugging Face. https://huggingface.co/oleksiizirka/llama-3-8b-openemis-bot

Model Card Contact

For inquiries, please reach out to Oleksii Zirka at oleksii.zirka@kordit.com.
