LFM2-2.6B-CodeAgent-LoRA

A LoRA adapter for LiquidAI/LFM2-2.6B-Exp, fine-tuned to follow the smolagents CodeAgent format.

Model Description

This adapter teaches LFM2-2.6B-Exp to respond in the structured Thought + Code format required by smolagents CodeAgent:

Thought: I need to calculate this.
```python
result = 2 + 2
final_answer(result)
```
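
At run time, smolagents extracts the fenced Python block from this response and executes it, with final_answer() ending the turn. The sketch below illustrates that contract in isolation; it is a simplification, not smolagents' actual parser or executor:

```python
import re

# A response in the trained format (the fence is assembled piecewise so the
# example stays self-contained).
fence = "```"
response = (
    "Thought: I need to calculate this.\n"
    f"{fence}python\n"
    "result = 2 + 2\n"
    "final_answer(result)\n"
    f"{fence}"
)

# Pull out the Python block.
code = re.search(r"```python\n(.*?)```", response, re.DOTALL).group(1)

# Capture the answer by injecting a final_answer() callable, then execute.
captured = {}
exec(code, {"final_answer": lambda value: captured.setdefault("answer", value)})
print(captured["answer"])  # 4
```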

Key Features

  • Base Model: LiquidAI/LFM2-2.6B-Exp (2.6B parameter hybrid architecture with LIV convolution + GQA)
  • Format Compliance: 100% with the minimal prompt
  • Answer Accuracy: 80% on evaluation tasks
  • Adapter Size: ~49MB (LoRA rank=8, alpha=16)

Training Details

Training Data

  • 130 successful CodeAgent trajectories generated using Claude 3.5 Sonnet as the teacher model
  • Tasks include mathematical reasoning, string manipulation, and general problem-solving
  • Each trajectory demonstrates the Thought -> Code -> Observation -> final_answer pattern (an illustrative sample is sketched below)
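
The trajectory dataset itself is not part of this repository. Purely for illustration, a single-step trajectory serialized as chat messages might look like the following; the schema and field names are assumptions, not the published training format:

```python
# Hypothetical single-step trajectory (schema and fields are assumptions).
trajectory = {
    "messages": [
        {"role": "user", "content": "What is the sum of the digits of 9876?"},
        {
            "role": "assistant",
            "content": (
                "Thought: I will sum the digits of 9876 in Python.\n"
                "```python\n"
                "digits = [int(d) for d in str(9876)]\n"
                "final_answer(sum(digits))\n"
                "```"
            ),
        },
    ]
}
```

Multi-step trajectories additionally carry Observation messages between code executions.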

Training Configuration

| Parameter            | Value                   |
|----------------------|-------------------------|
| LoRA Rank            | 8                       |
| LoRA Alpha           | 16                      |
| Target Modules       | q_proj, v_proj          |
| Trainable Parameters | 12.2M (0.47% of base)   |
| Training Steps       | 30                      |
| Learning Rate        | 2e-4                    |
| Batch Size           | 4                       |
| Max Sequence Length  | 2048                    |
| Hardware             | NVIDIA RTX 3090 (24GB)  |
| Training Time        | ~5.5 hours              |
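
The training script is not published with the adapter. A minimal PEFT setup matching the table above might look like this (TRL's trainer, mentioned in the Acknowledgments, would wrap the resulting model):

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "LiquidAI/LFM2-2.6B-Exp",
    torch_dtype="bfloat16",
)

# LoRA hyperparameters taken from the table above.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # ~12.2M trainable (~0.47% of base)
```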

Training Framework

Fine-tuning used Hugging Face TRL with PEFT LoRA adapters (see Acknowledgments).
Evaluation Results

Prompt Mode Comparison

| Prompt Mode | Format Compliance | Answer Accuracy |
|-------------|-------------------|-----------------|
| Minimal     | 100%              | 80%             |
| Default     | 80%               | 80%             |
| None        | 0%                | 0%              |

The model performs best with the minimal prompt (~95 tokens), demonstrating successful prompt distillation.
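
The evaluation harness is not included. A compliance check consistent with the table might simply test each response for a Thought: line followed by a fenced Python block that calls final_answer; the exact criteria below are assumptions:

```python
import re

# "Format compliant" here means: a Thought line, then a fenced Python block
# that calls final_answer(). These criteria are assumptions, not the
# published harness.
PATTERN = re.compile(r"Thought:.*?```python\n.*?final_answer\(.*?```", re.DOTALL)

def is_compliant(response: str) -> bool:
    return bool(PATTERN.search(response))

def format_compliance(responses: list[str]) -> float:
    return sum(is_compliant(r) for r in responses) / len(responses)
```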

Minimal Prompt Template

You are a CodeAgent that solves tasks by writing and executing Python code.

Always respond with Thought + Python code block. Example:

Thought: I need to calculate this.
```python
result = 2 + 2
final_answer(result)
```

Call final_answer(result) when done. Now Begin!
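
Given the numbers above, the template presumably belongs in the system message when calling the model directly (this assumes LFM2's chat template accepts a system role):

```python
MINIMAL_PROMPT = (
    "You are a CodeAgent that solves tasks by writing and executing Python code.\n\n"
    "Always respond with Thought + Python code block. Example:\n\n"
    "Thought: I need to calculate this.\n"
    "```python\n"
    "result = 2 + 2\n"
    "final_answer(result)\n"
    "```\n\n"
    "Call final_answer(result) when done. Now Begin!"
)

messages = [
    {"role": "system", "content": MINIMAL_PROMPT},
    {"role": "user", "content": "What is 15 * 23?"},
]
```

The messages list can then be passed through apply_chat_template exactly as in the PEFT example below.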

Usage

With PEFT

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "LiquidAI/LFM2-2.6B-Exp",
    device_map="auto",
    torch_dtype="bfloat16",
)
tokenizer = AutoTokenizer.from_pretrained("LiquidAI/LFM2-2.6B-Exp")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "krzysztofwos/LFM2-2.6B-CodeAgent-LoRA")

# Generate
messages = [{"role": "user", "content": "What is 15 * 23?"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,  # enable sampling so temperature/min_p take effect
    temperature=0.3,
    min_p=0.15,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

With smolagents

from smolagents import CodeAgent, FinalAnswerTool, TransformersModel

model = TransformersModel(
    model_id="LiquidAI/LFM2-2.6B-Exp",
    peft_model="krzysztofwos/LFM2-2.6B-CodeAgent-LoRA",
)

agent = CodeAgent(
    tools=[FinalAnswerTool()],
    model=model,
)

result = agent.run("What is 15 * 23?")
print(result)
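
Note that CodeAgent renders its own default system prompt, which corresponds to the Default row in the evaluation table. Swapping in the minimal template (MINIMAL_PROMPT from the section above) is possible via the agent's prompt_templates in recent smolagents versions; treat this as a version-dependent sketch:

```python
# Replace the default CodeAgent system prompt with the minimal template.
# How prompt_templates is consulted can vary across smolagents versions,
# so verify this against your installed version.
agent.prompt_templates["system_prompt"] = MINIMAL_PROMPT

result = agent.run("What is 15 * 23?")
print(result)
```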

Intended Use

  • Code-assisted problem solving
  • Mathematical reasoning tasks
  • Automated code generation following structured formats
  • Research into prompt distillation and small model fine-tuning

Limitations

  • Requires specific prompt format: Works best with minimal prompt template
  • Limited reasoning depth: As a 2.6B-parameter model, it has constrained reasoning capabilities compared to larger models
  • English only: Trained on English-language tasks

Citation

If you use this model, please cite:

@misc{lfm2_2.6b_codeagent_lora,
  author = {krzysztofwos},
  title = {LFM2-2.6B-CodeAgent-LoRA},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/krzysztofwos/LFM2-2.6B-CodeAgent-LoRA}
}

Acknowledgments

  • LiquidAI for the LFM2-2.6B-Exp base model
  • Hugging Face for smolagents, TRL, and PEFT
  • Training performed as part of CodeAgent prompt distillation research