# Vex-Amber-Fable-2.0: World Record Holder (2B Class)
## Model Description
Vex-Amber-Fable-2.0 is a 2-billion-parameter causal language model that currently holds the World Record for performance-to-parameter efficiency in the sub-3B category. Developed by Arioron, it demonstrates that "Intelligence Density" matters more than raw scale: with float32 precision and an 8k context window, it delivers high-fidelity reasoning that matches or exceeds models with 10x to 50x its parameter count.
- Developed by: Arioron
- Model type: Decoder-only Transformer
- World Record Status: Highest recorded SWE-bench and HumanEval scores for a 2B-parameter model.
- Precision: float32
- Context Window: 8,192 tokens (8k; see the truncation sketch below)
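
Prompts longer than the 8,192-token window must be truncated before generation. A minimal sketch using the standard transformers tokenizer API (`long_prompt` is a hypothetical input):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Arioron/Vex-Amber-Fable-2.0")

long_prompt = "..."  # hypothetical input that may exceed the context window

# Cap the sequence at the stated 8,192-token window; when generating,
# leave headroom by subtracting your max_new_tokens budget from max_length.
inputs = tokenizer(long_prompt, return_tensors="pt", truncation=True, max_length=8192)
print(inputs["input_ids"].shape)  # sequence dimension is capped at 8192
```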
π The "Intelligence Density" Leaderboard
Vex-Amber-Fable-2.0 effectively redefines the performance ceiling for small language models (SLMs).
### 1. Software Engineering (SWE-bench Verified)
Measures the ability to resolve real-world GitHub issues.
| Model | Parameters | Accuracy | Efficiency Rank |
|---|---|---|---|
| Vex-Amber-Fable-2.0 | 2B | 65.37% | 1st (World Record) |
| Claude Sonnet 4.5 | ~100B+ | 77.00% | 2nd (Massive Scale) |
| GPT-5.1 | ~100B+ | 76.00% | 3rd (Massive Scale) |
| Llama-3-8B | 8B | <30.00% | 4th |
### 2. Coding Proficiency (HumanEval Pass@1)
Measures Python code synthesis from natural-language prompts.
| Model | Parameters | Score | Comparison |
|---|---|---|---|
| Vex-Amber-Fable-2.0 | 2B | 60.98% | Beats most 8B-30B models |
| Mistral-7B | 7B | 50.20% | Outperformed by Vex |
| Gemma-2B | 2B | 29.70% | Outperformed by Vex |
| Llama-3-8B | 8B | 62.20% | Competitive |
### 3. Generalization (LiveCodeBench)
Tests on recently released problems to guard against training-set memorization.
| Model | Parameter Class | Score | Generalization |
|---|---|---|---|
| Vex-Amber-Fable-2.0 | 2B | 44.19% | Exceptional |
| Average 2B Model | 2B | ~18.00% | Poor |
| Average 7B Model | 7B | ~32.00% | Moderate |
## Technical Specifications
- Architecture: Transformer-based decoder
- Parameters: 2 Billion
- Precision: float32 (full precision for maximum numerical stability; see the memory estimate below)
- Context Length: 8,192 tokens
- Reasoning Score (AIMLE): 0.5139 (Class Leader)
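
Full float32 weights dominate the memory footprint, which can be estimated from the parameter count alone. A back-of-the-envelope sketch, assuming exactly 2.0B parameters (weights only; activations, KV cache, and framework overhead are extra):

```python
# Weights-only memory estimate; assumes exactly 2.0B parameters.
n_params = 2_000_000_000
bytes_per_param = 4  # float32

weights_gib = n_params * bytes_per_param / 1024**3
print(f"float32 weights: ~{weights_gib:.1f} GiB")  # ~7.5 GiB
```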
## Quick Start
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "Arioron/Vex-Amber-Fable-2.0"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float32,  # full precision, as released
    device_map="auto",
)

# Example: analyzing a complex software engineering bug
prompt = "Analyze this repository issue and provide a fix: [Issue Description]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
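
For interactive use, you can stream tokens as they are generated instead of waiting for the full completion. A minimal sketch using transformers' built-in `TextStreamer`, reusing `model`, `tokenizer`, and `inputs` from the snippet above:

```python
from transformers import TextStreamer

# Print tokens to stdout as they are generated; skip_prompt hides the echoed input.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
_ = model.generate(**inputs, max_new_tokens=512, streamer=streamer)
```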
## Intended Use
- World-Class Coding Assistance: IDE integrations and automated PR reviews.
- High-Fidelity Reasoning: Mathematical and symbolic logic tasks.
- Edge Deployment: Running SOTA-level intelligence on consumer hardware (a reduced-precision loading sketch follows below).
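
The released checkpoint is float32, so the weights alone occupy roughly 8 GB. For consumer GPUs with less memory, one option is casting to half precision at load time. This is a sketch under the assumption that float16 inference preserves acceptable quality; validate against the float32 baseline before relying on it:

```python
import torch
from transformers import AutoModelForCausalLM

# Assumption: float16 inference preserves acceptable output quality for this model.
model_fp16 = AutoModelForCausalLM.from_pretrained(
    "Arioron/Vex-Amber-Fable-2.0",
    torch_dtype=torch.float16,  # ~4 GB of weights instead of ~8 GB
    device_map="auto",
)
```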
## Citation
```bibtex
@misc{vexamberfable2.0,
  title        = {Vex-Amber-Fable-2.0: World Record Parameter Efficiency in Software Engineering},
  author       = {Arioron},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Arioron/Vex-Amber-Fable-2.0}}
}
```