## Overview

RuvLTRA Medium hits the sweet spot between capability and resource usage, making it ideal for desktop applications, development workstations, and moderate-scale deployments.
## Model Card
| Property | Value |
|---|---|
| Parameters | 1.1 Billion |
| Quantization | Q4_K_M |
| Context | 8,192 tokens |
| Size | ~669 MB |
| Min RAM | 2 GB |
| Recommended RAM | 4 GB |
## 🚀 Quick Start

```bash
# Download
wget https://huggingface.co/ruv/ruvltra-medium/resolve/main/ruvltra-1.1b-q4_k_m.gguf

# Run inference
./llama-cli -m ruvltra-1.1b-q4_k_m.gguf \
  -p "Explain quantum computing in simple terms:" \
  -n 512 -c 8192
```
## 💡 Use Cases

- Development: Code assistance and generation
- Writing: Content creation and editing
- Analysis: Document summarization (see the sketch after this list)
- Chat: Conversational AI applications
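As a minimal sketch of the summarization use case, the snippet below runs a one-shot prompt through llama-cpp-python (set up in the Integration section below); the file name and prompt template are illustrative assumptions, not a fixed API.

```python
from llama_cpp import Llama

# Load the quantized model with the full 8K context window
llm = Llama(model_path="ruvltra-1.1b-q4_k_m.gguf", n_ctx=8192)

# Illustrative prompt template; "report.txt" is a placeholder document
with open("report.txt") as f:
    document = f.read()
prompt = f"Summarize the following document in three sentences:\n\n{document}\n\nSummary:"

output = llm(prompt, max_tokens=256)
print(output["choices"][0]["text"])
```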
## 🔧 Integration

### Rust

```rust
use ruvllm::hub::ModelDownloader;

// Download the model from the Hugging Face Hub and get its local path
let path = ModelDownloader::new()
    .download("ruv/ruvltra-medium", None)
    .await?;
```
### Python

```python
from llama_cpp import Llama
from huggingface_hub import hf_hub_download

# Fetch the GGUF weights from the Hub, then load them with the full context window
model_path = hf_hub_download("ruv/ruvltra-medium", "ruvltra-1.1b-q4_k_m.gguf")
llm = Llama(model_path=model_path, n_ctx=8192)
```
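Once loaded, the model can be queried through llama-cpp-python's chat-completion helper; the prompt below is illustrative.

```python
# Ask a question via the OpenAI-style chat-completion interface
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain quantum computing in simple terms."}],
    max_tokens=256,
)
print(response["choices"][0]["message"]["content"])
```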
### OpenAI-Compatible Server

```bash
python -m llama_cpp.server \
  --model ruvltra-1.1b-q4_k_m.gguf \
  --host 0.0.0.0 --port 8000
```
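The server speaks the standard OpenAI REST API, so any OpenAI-compatible client can call it. Below is a minimal sketch using the official openai Python package; the api_key is a placeholder (the local server does not require one by default), and the model name is an assumption about how the server labels the loaded model.

```python
from openai import OpenAI

# Point the client at the local server; the API key is a dummy placeholder
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="ruvltra-1.1b-q4_k_m.gguf",  # assumed model label; adjust to what the server reports
    messages=[{"role": "user", "content": "Explain quantum computing in simple terms."}],
)
print(response.choices[0].message.content)
```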
## Performance

| Platform | Throughput (tokens/sec) |
|---|---|
| M2 Pro (Metal) | 65 |
| RTX 4080 (CUDA) | 95 |
| i9-13900K (CPU) | 25 |
License: Apache 2.0 | GitHub: ruvnet/ruvector