
🧠 Fine-tuned TinyLlama QA Model – With QLoRA Adapters

This project fine-tunes the TinyLlama-1.1B-Chat-v1.0 model on a custom QA dataset and deploys it as a memory-efficient chatbot via QLoRA (Quantized LoRA), making it well suited to low-resource environments without sacrificing response quality.

📦 Base Model

  • Name: TinyLlama/TinyLlama-1.1B-Chat-v1.0
  • Size: ~1.1B parameters
  • Architecture: Causal Language Model (Chat format)
  • Token Format: [INST] Question [/INST] Answer

Why this base model:

  • Small size (1.1B) enables fast training & inference
  • Already instruction-tuned for chat
  • Compatible with LoRA + 4-bit quantization
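
As a quick illustration of the token format above, a prompt can be assembled like so (a minimal sketch; `build_prompt` and the sample question are illustrative, not part of this repo):

```python
def build_prompt(question: str) -> str:
    # Wrap the user's question in the [INST] tags used during fine-tuning.
    # The model's answer is generated after the closing [/INST] tag.
    return f"[INST] {question} [/INST]"

print(build_prompt("What is QLoRA?"))
# [INST] What is QLoRA? [/INST]
```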

🧠 Training Strategy

| Feature | Value |
| --- | --- |
| Quantization | 4-bit (bnb NF4) |
| Optimizer Backend | bitsandbytes |
| Gradient Accumulation | 4 steps |
| Token Format | Chat template / [INST] |
| Epochs | 10 |
| Batch Size | 2 |
| Mixed Precision | FP16 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |

Key Features:

✅ QLoRA: 4-bit quantized base model + 16-bit LoRA adapters.
✅ Memory Efficient: ~10MB adapter weights (vs. the full 1.1B-parameter model).
✅ Early Stopping: patience=3 to prevent overfitting.
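
A minimal sketch of how this QLoRA setup could be reproduced with `transformers`, `peft`, and `bitsandbytes`, using the hyperparameters from the table above (the LoRA alpha/dropout values and the output directory are not stated in this card and are assumptions):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization of the base model (the "Quantization" row above).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    quantization_config=bnb_config,
    device_map="auto",
)

# 16-bit LoRA adapters on the projections listed under "Target Modules"
# (r=32 per the Model Summary below).
lora_config = LoraConfig(
    r=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable

# Hyperparameters from the table above; output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="tinyllama-qa-qlora",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    num_train_epochs=10,
    fp16=True,
)
```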

🎯 Inference Modes

| Mode | Trigger | Description |
| --- | --- | --- |
| Exact Match | QA.json question match | Static response |
| Partial Match | Substring match with known questions | Approximated answer |
| Generative | No exact or partial match | Generates via TinyLlama + QLoRA adapters |

Benefits:

  • Only adapter weights are updated
  • Base model remains untouched
  • TinyLlama + QLoRA = 💥 high performance + low resource usage
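
The three inference modes above amount to a simple dispatch: try the static store first, then fall back to generation. A minimal sketch, assuming QA.json maps question strings to answer strings and `generate_fn` wraps the model's `generate` call (both names are hypothetical):

```python
import json

# Hypothetical QA store; the card references a QA.json of known Q/A pairs.
with open("QA.json") as f:
    qa_pairs = json.load(f)  # assumed shape: {question: answer}

def answer(question: str, generate_fn) -> str:
    q = question.strip().lower()
    # 1. Exact match: return the stored answer verbatim.
    for known_q, known_a in qa_pairs.items():
        if q == known_q.strip().lower():
            return known_a
    # 2. Partial match: substring overlap with a known question.
    for known_q, known_a in qa_pairs.items():
        if q in known_q.lower() or known_q.lower() in q:
            return known_a
    # 3. Generative fallback: hand the prompt to TinyLlama + QLoRA adapters.
    return generate_fn(f"[INST] {question} [/INST]")
```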

🗃 Model Summary

| Component | Detail |
| --- | --- |
| Base Model | TinyLlama/TinyLlama-1.1B-Chat-v1.0 |
| Architecture | Causal LM (Transformer) |
| Adapters | QLoRA (r=32) |
| Max Sequence | 512 tokens |
| Fine-Tuning | QLoRA (4-bit NF4 + LoRA adapters) |
| Format | [INST] {question} [/INST] {answer} (Llama-2 chat style) |
| Output | Personal QA generation / chatbot |
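
For inference, the ~10MB adapters are attached on top of the frozen 4-bit base with `peft`. A minimal sketch (the adapter path is a placeholder for this repo's adapter weights):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
adapter_id = "path/to/adapter"  # placeholder: this repo's QLoRA adapter weights

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
# Attach the LoRA adapters; the 4-bit base model remains untouched.
model = PeftModel.from_pretrained(base, adapter_id)

inputs = tokenizer("[INST] What is QLoRA? [/INST]", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```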