# RuvLTRA Medium


βš–οΈ Balanced Model for General-Purpose Tasks


## Overview

RuvLTRA Medium hits the sweet spot between capability and resource usage, making it well suited to desktop applications, development workstations, and moderate-scale deployments.

## Model Card

| Property | Value |
|----------|-------|
| Parameters | 1.1 billion |
| Architecture | llama |
| Quantization | Q4_K_M (4-bit) |
| Context length | 8,192 tokens |
| File size | ~669 MB |
| Minimum RAM | 2 GB |
| Recommended RAM | 4 GB |

## πŸš€ Quick Start

```bash
# Download the quantized weights
wget https://huggingface.co/ruv/ruvltra-medium/resolve/main/ruvltra-1.1b-q4_k_m.gguf

# Run inference (-n: max tokens to generate, -c: context window size)
./llama-cli -m ruvltra-1.1b-q4_k_m.gguf \
  -p "Explain quantum computing in simple terms:" \
  -n 512 -c 8192
```

## πŸ’‘ Use Cases

- **Development:** Code assistance and generation
- **Writing:** Content creation and editing
- **Analysis:** Document summarization
- **Chat:** Conversational AI applications (see the sketch after this list)
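
For the chat use case, one option is llama-cpp-python's chat-completion API. A minimal sketch, assuming the GGUF file from the Quick Start sits in the working directory; the prompts are illustrative:

```python
from llama_cpp import Llama

# Load the quantized model with the full 8K context window
llm = Llama(model_path="ruvltra-1.1b-q4_k_m.gguf", n_ctx=8192)

# Multi-turn chat via the OpenAI-style chat-completion API
response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Give three tips for writing good commit messages."},
    ],
    max_tokens=256,
)
print(response["choices"][0]["message"]["content"])
```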

## πŸ”§ Integration

### Rust

```rust
use ruvllm::hub::ModelDownloader;

// `download` is async; an async runtime (tokio assumed here) is required.
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // `None` selects the repository's default GGUF file.
    let _path = ModelDownloader::new()
        .download("ruv/ruvltra-medium", None)
        .await?;
    Ok(())
}
```

### Python

```python
from llama_cpp import Llama
from huggingface_hub import hf_hub_download

# Fetch the quantized weights from the Hub (cached locally on repeat calls)
model_path = hf_hub_download("ruv/ruvltra-medium", "ruvltra-1.1b-q4_k_m.gguf")
llm = Llama(model_path=model_path, n_ctx=8192)
```
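
Once loaded, the model can be called directly for text completion. A minimal sketch; the prompt and sampling parameters below are illustrative:

```python
# Plain text completion; returns an OpenAI-style response dict
output = llm(
    "Explain quantum computing in simple terms:",
    max_tokens=256,
    temperature=0.7,
)
print(output["choices"][0]["text"])
```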

### OpenAI-Compatible Server

```bash
# Requires: pip install 'llama-cpp-python[server]'
python -m llama_cpp.server \
  --model ruvltra-1.1b-q4_k_m.gguf \
  --host 0.0.0.0 --port 8000
```
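
The server exposes OpenAI-compatible routes such as `/v1/chat/completions`. A minimal sketch of a client request, assuming the server above is running locally; the prompt is illustrative:

```python
import requests

# POST an OpenAI-style chat request to the local server
resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "messages": [{"role": "user", "content": "Summarize what a GGUF file is."}],
        "max_tokens": 128,
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```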

## Performance

| Platform | Throughput |
|----------|------------|
| M2 Pro (Metal) | 65 tok/s |
| RTX 4080 (CUDA) | 95 tok/s |
| i9-13900K (CPU) | 25 tok/s |

License: Apache 2.0 | GitHub: ruvnet/ruvector
