🦙 Llama 3.2 1B - QLoRA Fine-Tuned Model

This repository contains a QLoRA fine-tuned adapter for meta-llama/Llama-3.2-1B, trained on the Alpaca dataset to improve instruction following and conversational responses.

The model loads in 4-bit precision via bitsandbytes, so it runs comfortably on low-VRAM GPUs (6–8 GB).


🚀 Quick Start

1️⃣ Install Dependencies

pip install -U transformers datasets peft bitsandbytes accelerate
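
bitsandbytes 4-bit loading requires a CUDA-capable GPU, so it is worth verifying the environment before loading the model. A minimal sanity check (the output will vary by machine):

import torch
import transformers

print("transformers:", transformers.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))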

2️⃣ Load 4-bit Base Model

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "meta-llama/Llama-3.2-1B"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # run matmuls in bf16
    bnb_4bit_use_double_quant=True,         # quantize the quantization constants too
)

tokenizer = AutoTokenizer.from_pretrained(model_name)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    dtype=torch.bfloat16,   # use torch_dtype= on older transformers versions
    device_map="auto"       # place layers on the available GPU(s)
)

print("🔥 Base Model Loaded in 4-bit Mode")

3️⃣ Attach LoRA Adapter (This Model)

from peft import PeftModel

lora_model = PeftModel.from_pretrained(model, "omkarwazulkar/LLama3.2-1B-QLoRA")
lora_model.eval()

print("🟥 LoRA Adapter Attached Successfully")

4️⃣ Text Generation Function

def generate(prompt):
    inputs = tokenizer(prompt, return_tensors="pt").to(lora_model.device)
    with torch.no_grad():
        output = lora_model.generate(
            **inputs,
            max_new_tokens=200,
            do_sample=True,   # required for temperature/top_p to take effect
            temperature=0.7,
            top_p=0.9,
            pad_token_id=tokenizer.eos_token_id
        )
    return tokenizer.decode(output[0], skip_special_tokens=True)

print(generate("Explain quantum computing simply."))
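
For interactive use you may prefer to stream tokens as they are generated rather than waiting for the full completion. A sketch using transformers' built-in TextStreamer with the same sampling settings as generate above (generate_stream is just an illustrative helper name):

from transformers import TextStreamer

def generate_stream(prompt):
    inputs = tokenizer(prompt, return_tensors="pt").to(lora_model.device)
    streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
    lora_model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
        streamer=streamer,  # prints tokens to stdout as they arrive
    )

generate_stream("Explain quantum computing simply.")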

5️⃣ Test on Alpaca Dataset Samples

from datasets import load_dataset

dataset = load_dataset("tatsu-lab/alpaca", split="train")

n = 3  # number of Alpaca samples to preview

for i in range(n):
    ex = dataset[i]
    if ex["input"]:
        prompt = f"Instruction: {ex['instruction']}\nInput: {ex['input']}\n\nAnswer:"
    else:
        prompt = f"Instruction: {ex['instruction']}\n\nAnswer:"

    print(f"===== SAMPLE {i} =====")
    print(generate(prompt))
    print("=======================")