Memo: Production-Grade Transformers + Safetensors Implementation
Overview
Memo moves from toy logic to production-grade machine learning infrastructure. The implementation builds on Transformers and Safetensors to support enterprise-level video generation with security controls, performance optimization, and scalability.
What This Guarantees
- Transformers-based - Real ML understanding, not toy logic
- Safetensors-only - No pickle-based weight files, which avoids arbitrary code execution at load time
- Production-ready - Enterprise architecture with proper error handling
- Memory optimized - xFormers, attention slicing, CPU offload
- Tier-based scaling - Free/Pro/Enterprise configurations
- Security compliant - Audit trails and validation
Architecture
Core Components
- Bangla Text Parser (`models/text/bangla_parser.py`)
  - Transformer-based scene extraction using `google/mt5-small`
  - Proper tokenization with memory optimization
  - Deterministic output with controlled parameters
- Scene Planner (`core/scene_planner.py`)
  - ML-based scene planning (no more toy logic)
  - Intelligent timing and pacing calculations
  - Visual style determination
- Stable Diffusion Generator (`models/image/sd_generator.py`)
  - Safetensors-only model loading (`use_safetensors=True`)
  - Memory optimizations (xFormers, attention slicing, CPU offload)
  - LoRA support with safetensors validation
  - LCM acceleration for faster inference
- Model Tier System (`config/model_tiers.py`)
  - Free Tier: Basic 512x512, 15 steps, no LoRA
  - Pro Tier: 768x768, 25 steps, scene LoRA, LCM
  - Enterprise Tier: 1024x1024, 30 steps, custom LoRA
- Training Pipeline (`scripts/train_scene_lora.py`)
  - MANDATORY `save_safetensors=True`
  - Transformers integration with PEFT
  - Security-first training with proper validation
- Production API (`api/main.py`)
  - FastAPI endpoint with tier-based routing
  - Background processing for long-running tasks
  - Security validation endpoints
Security Implementation
Model Weight Security
- ONLY .safetensors files allowed - No .bin, .ckpt, or pickle files
- Model signature verification
- File format enforcement
- Memory-safe loading practices
LoRA Configuration (data/lora/README.md)
- ONLY .safetensors files - No .bin, .ckpt, or other formats allowed
- Model signatures required
- Version tracking and audit trails
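The format and signature rules above could be enforced before any weights are loaded. The following is a minimal sketch, not the project's actual implementation: it pins each file to a SHA-256 digest kept in a manifest, which is a checksum check rather than a true cryptographic signature. The names `sha256_of_file` and `verify_pinned_digest` and the manifest layout (file name mapped to hex digest) are assumptions made for illustration.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_pinned_digest(path: Path, manifest: dict[str, str]) -> bool:
    """Hypothetical check: accept only .safetensors files whose digest
    matches the value recorded for that file name in the manifest."""
    if path.suffix != ".safetensors":
        return False
    expected = manifest.get(path.name)
    if expected is None:
        return False
    # compare_digest avoids leaking where the two strings first differ
    return hmac.compare_digest(sha256_of_file(path), expected)
```

A real deployment would more likely verify a signature over the manifest itself, so that an attacker who can replace a weight file cannot also replace its recorded digest.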
Usage Examples
Basic Scene Planning
```python
from core.scene_planner import plan_scenes

# Bangla input, in English: "Today was a very beautiful day."
scenes = plan_scenes(
    text_bn="আজকের দিনটি খুব সুন্দর ছিল।",
    duration=15
)
```
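To make the timing side of scene planning concrete, here is a rough sketch of how a total duration might be divided across sentences. It is not how `plan_scenes` works internally: a plain proportional heuristic stands in for the ML-based planner, and the helper names `split_sentences` and `allocate_durations` are hypothetical.

```python
def split_sentences(text_bn: str) -> list[str]:
    """Split Bangla text on the danda (।), dropping empty pieces."""
    return [part.strip() for part in text_bn.split("।") if part.strip()]


def allocate_durations(text_bn: str, duration: float) -> list[tuple[str, float]]:
    """Hypothetical pacing rule: give each sentence a share of the total
    duration proportional to its character count."""
    sentences = split_sentences(text_bn)
    total_chars = sum(len(s) for s in sentences)
    if total_chars == 0:
        return []
    return [(s, duration * len(s) / total_chars) for s in sentences]
```

With this rule the per-sentence durations always sum to the requested total, so a 15-second request stays 15 seconds however many sentences the input contains.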
Tier-Based Generation
```python
from config.model_tiers import get_tier_config
from models.image.sd_generator import get_generator

config = get_tier_config("pro")
generator = get_generator(lora_path=config.lora_path, use_lcm=config.lcm_enabled)
```
Security Validation
```python
from config.model_tiers import validate_model_weights_security

result = validate_model_weights_security("data/lora/memo-scene-lora.safetensors")
```
Model Tiers
| Tier | Resolution | Inference Steps | LoRA | LCM | Credits/min | Memory |
|---|---|---|---|---|---|---|
| Free | 512×512 | 15 | No | β | $5.0 | 4GB |
| Pro | 768×768 | 25 | Yes | Yes | $15.0 | 8GB |
| Enterprise | 1024×1024 | 30 | Yes | β | $50.0 | 16GB |
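One way to picture the tier table in code is as a small immutable record per tier. This is a sketch under assumptions, not the contents of `config/model_tiers.py`: the `TierConfig` fields, the `TIERS` mapping, and `lookup_tier` are hypothetical names. The LoRA flags follow the Core Components list (Free has no LoRA, Pro and Enterprise do), and LCM is left out of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierConfig:
    """Hypothetical container for the per-tier settings in the table."""
    name: str
    resolution: int        # square output size in pixels
    inference_steps: int
    uses_lora: bool
    memory_gb: int


# Values copied from the tier table; LCM is intentionally not modeled here.
TIERS = {
    "free": TierConfig("free", 512, 15, False, 4),
    "pro": TierConfig("pro", 768, 25, True, 8),
    "enterprise": TierConfig("enterprise", 1024, 30, True, 16),
}


def lookup_tier(name: str) -> TierConfig:
    """Return the config for a tier name, case-insensitively."""
    try:
        return TIERS[name.lower()]
    except KeyError:
        raise ValueError(f"unknown tier: {name!r}") from None
```

Freezing the dataclass keeps a tier's limits from being changed by accident at request time; an unknown tier name fails loudly instead of silently falling back to a default.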
Installation
```bash
# Clone the repository and enter it
git clone https://huggingface.co/likhonsheikh/memo
cd memo
# Install dependencies
pip install -r requirements.txt
# Run the demonstration
python demo.py
# Start the API server
python api/main.py
```
API Usage
Health Check
```bash
curl http://localhost:8000/health
```
Generate Video
```bash
curl -X POST "http://localhost:8000/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "আজকের দিনটি খুব সুন্দর ছিল।",
    "duration": 15,
    "tier": "pro"
  }'
```
Check Status
```bash
# Replace {request_id} with the ID of your generation request
curl http://localhost:8000/status/{request_id}
```
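The same generate call can be made from Python using only the standard library. The sketch below mirrors the curl request above (same URL, header, and JSON fields). The helper names `build_generate_request` and `send` are hypothetical, and the shape of the server's response is not documented here, so the code makes no assumption about its fields.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"


def build_generate_request(text: str, duration: int, tier: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /generate endpoint."""
    payload = json.dumps(
        {"text": text, "duration": duration, "tier": tier},
        ensure_ascii=False,  # keep Bangla text readable in the body
    ).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send a request and decode the JSON body of the response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `send(build_generate_request(text, 15, "pro"))` against a running server returns the decoded JSON reply. Building the request separately from sending it makes the payload easy to inspect or test without a server.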
Training Custom LoRA
```python
from scripts.train_scene_lora import SceneLoRATrainer, TrainingConfig

config = TrainingConfig(
    base_model="google/mt5-small",
    rank=32,
    alpha=64,
    save_safetensors=True,  # MANDATORY
)
trainer = SceneLoRATrainer(config)
trainer.load_model()
trainer.setup_lora()
# training_data is your prepared dataset; it is not defined in this snippet
trainer.train(training_data)
```
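The `rank` and `alpha` values above have a simple numeric meaning in standard LoRA: the low-rank update is scaled by `alpha / rank`, and each adapted linear layer gains `rank * (d_in + d_out)` trainable parameters. The sketch below works through that arithmetic. The 512-wide layer is an illustrative assumption, not a statement about which layers the trainer adapts.

```python
def lora_scaling(rank: int, alpha: float) -> float:
    """Standard LoRA scaling factor applied to the low-rank update."""
    return alpha / rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added to one linear layer: A is (rank x d_in)
    and B is (d_out x rank)."""
    return rank * (d_in + d_out)


scaling = lora_scaling(rank=32, alpha=64)                  # 2.0
added = lora_param_count(d_in=512, d_out=512, rank=32)     # 32768
full = 512 * 512                                           # 262144
fraction = added / full                                    # 0.125
```

For this illustrative layer the adapter trains one eighth as many parameters as the full weight matrix would.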
Performance Features
- Memory Optimization: xFormers, attention slicing, CPU offload
- FP16 Precision: Roughly halves weight memory relative to FP32, usually with little loss in output quality
- LCM Acceleration: Faster inference when available
- Device Mapping: Optimal GPU/CPU utilization
- Background Processing: Async handling of long-running tasks
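The FP16 saving follows from bytes per parameter: 4 bytes in FP32 against 2 bytes in FP16. A small sketch of that arithmetic is below. The one-billion-parameter figure is hypothetical and chosen only to make the numbers easy to read. The estimate covers weights alone, not activations or other runtime buffers.

```python
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}


def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Memory needed for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


params = 1_000_000_000  # hypothetical model size, for illustration only
fp32 = weight_memory_gib(params, "fp32")
fp16 = weight_memory_gib(params, "fp16")
saving = 1 - fp16 / fp32   # 0.5
```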
Security Validation
```python
from config.model_tiers import validate_model_weights_security

# Validate any model file
result = validate_model_weights_security("path/to/model.safetensors")
print(f"Secure: {result['is_secure']}")
print(f"Format: {result['format']}")
print(f"Issues: {result['issues']}")
```
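For readers curious what a check like this might involve, here is a standalone sketch that produces the same three fields (`is_secure`, `format`, `issues`). It is an assumption about one reasonable approach, not the body of `validate_model_weights_security`: the function name `inspect_weights_file` and the list of rejected suffixes are hypothetical. The header check relies on the published safetensors layout, in which a file begins with an 8-byte little-endian length followed by that many bytes of UTF-8 JSON.

```python
import json
import struct
from pathlib import Path

# Suffixes this sketch rejects outright (an illustrative list).
UNSAFE_SUFFIXES = {".bin", ".ckpt", ".pt", ".pth", ".pkl", ".pickle"}


def inspect_weights_file(path: str) -> dict:
    """Hypothetical validator returning is_secure, format, and issues."""
    file_path = Path(path)
    suffix = file_path.suffix.lower()
    issues = []
    if suffix in UNSAFE_SUFFIXES:
        issues.append(f"disallowed format: {suffix}")
    elif suffix != ".safetensors":
        issues.append(f"unrecognized extension: {suffix or '(none)'}")
    elif not file_path.is_file():
        issues.append("file not found")
    else:
        issues.extend(_check_header(file_path))
    return {"is_secure": not issues, "format": suffix.lstrip("."), "issues": issues}


def _check_header(file_path: Path) -> list:
    """Check the 8-byte length prefix and the JSON header that follows it."""
    with file_path.open("rb") as handle:
        prefix = handle.read(8)
        if len(prefix) < 8:
            return ["file too short for a safetensors header"]
        (header_len,) = struct.unpack("<Q", prefix)
        if header_len > file_path.stat().st_size - 8:
            return ["declared header length exceeds file size"]
        try:
            header = json.loads(handle.read(header_len).decode("utf-8"))
        except (UnicodeDecodeError, json.JSONDecodeError):
            return ["header is not valid UTF-8 JSON"]
    return [] if isinstance(header, dict) else ["header is not a JSON object"]
```

Checking the header as well as the extension catches files that were merely renamed to `.safetensors`. It does not verify the tensor data itself.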
File Structure
```
Memo/
├── requirements.txt            # Production dependencies
├── models/
│   ├── text/
│   │   └── bangla_parser.py    # Transformer-based Bangla parser
│   └── image/
│       └── sd_generator.py     # Stable Diffusion + Safetensors
├── core/
│   └── scene_planner.py        # ML-based scene planning
├── data/
│   └── lora/
│       └── README.md           # LoRA configuration (safetensors only)
├── scripts/
│   └── train_scene_lora.py     # Training with safetensors output
├── config/
│   └── model_tiers.py          # Tier management system
├── api/
│   └── main.py                 # Production API endpoint
└── demo.py                     # Complete system demonstration
```
What This Doesn't Do
- Make GPUs cheap
- Fix bad prompts
- Read your mind
- Guarantee perfect results
Production Readiness
This implementation is now:
- Correct - Uses proper ML frameworks (transformers, safetensors)
- Modern - 2025-grade architecture with security best practices
- Secure - Zero tolerance for unsafe model formats
- Scalable - Tier-based resource management
- Defensible - Production-grade security and validation
License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Support
For support, email support@memo.ai or join our Discord community.
If your API claims "state-of-the-art" without these features, you're lying. Memo now actually delivers on that promise with proper Transformers + Safetensors integration.