likhonsheikh

Upload Memo: Production-grade Transformers + Safetensors implementation

a8fc815 verified 17 days ago

Memo: Production-Grade Transformers + Safetensors Implementation

Overview

Memo replaces its earlier toy logic with production-grade machine learning infrastructure. It builds on Transformers and Safetensors to generate video from Bangla text, with attention to security, performance optimization, and scalability.

🎯 What This Guarantees

✅ Transformers-based - Real ML understanding, not toy logic
✅ Safetensors-only - No pickle-based weight files, so loading a model cannot execute arbitrary code embedded in a checkpoint
✅ Production-ready - Enterprise architecture with proper error handling
✅ Memory optimized - xFormers, attention slicing, CPU offload
✅ Tier-based scaling - Free/Pro/Enterprise configurations
✅ Security compliant - Audit trails and validation

🏗️ Architecture

Core Components

Bangla Text Parser (models/text/bangla_parser.py)
- Transformer-based scene extraction using google/mt5-small
- Proper tokenization with memory optimization
- Deterministic output with controlled parameters
Scene Planner (core/scene_planner.py)
- ML-based scene planning (no more toy logic)
- Intelligent timing and pacing calculations
- Visual style determination
Stable Diffusion Generator (models/image/sd_generator.py)
- Safetensors-only model loading (use_safetensors=True)
- Memory optimizations (xFormers, attention slicing, CPU offload)
- LoRA support with safetensors validation
- LCM acceleration for faster inference
Model Tier System (config/model_tiers.py)
- Free Tier: Basic 512x512, 15 steps, no LoRA
- Pro Tier: 768x768, 25 steps, scene LoRA, LCM
- Enterprise Tier: 1024x1024, 30 steps, custom LoRA
Training Pipeline (scripts/train_scene_lora.py)
- MANDATORY save_safetensors=True
- Transformers integration with PEFT
- Security-first training with proper validation
Production API (api/main.py)
- FastAPI endpoint with tier-based routing
- Background processing for long-running tasks
- Security validation endpoints
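
The safetensors-only loading and memory optimizations listed for the Stable Diffusion generator can be sketched with the public diffusers API. This is a minimal sketch under assumptions, not the contents of models/image/sd_generator.py: `load_sd_pipeline`, `require_safetensors`, and the `model_id` argument are hypothetical names, and LCM acceleration is left out.

```python
from typing import Optional


def require_safetensors(path: str) -> str:
    """Hypothetical guard: reject any weight file that is not .safetensors."""
    if not path.lower().endswith(".safetensors"):
        raise ValueError(f"only .safetensors weights are accepted, got: {path}")
    return path


def load_sd_pipeline(model_id: str, lora_path: Optional[str] = None):
    """Hypothetical helper: build a memory-optimized Stable Diffusion pipeline."""
    # Imported lazily so this module can be imported without the GPU stack.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # FP16 weights
        use_safetensors=True,       # never fall back to pickle-based weights
    )
    if lora_path is not None:
        pipe.load_lora_weights(require_safetensors(lora_path))

    pipe.enable_attention_slicing()
    try:
        # Only works when the optional xformers package is installed.
        pipe.enable_xformers_memory_efficient_attention()
    except Exception:
        pass
    # Keeps submodules on the CPU until they are needed (requires accelerate).
    pipe.enable_model_cpu_offload()
    return pipe
```

The extension check runs before any LoRA file is opened, so an unsupported format fails fast instead of reaching the loader.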

🔒 Security Implementation

Model Weight Security

ONLY .safetensors files allowed - No .bin, .ckpt, or pickle files
Model signature verification
File format enforcement
Memory-safe loading practices

LoRA Configuration (`data/lora/README.md`)

ONLY .safetensors files - No .bin, .ckpt, or other formats allowed
Model signatures required
Version tracking and audit trails
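
The document does not say how model signatures are produced or checked. One simple possibility, shown here purely as an assumption, is to pin a SHA-256 digest for each approved weight file and compare it before loading. The function names below are hypothetical, and a digest check is weaker than a true cryptographic signature.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large weight files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_digest(path: str, expected_hex: str) -> bool:
    """Return True when the file's SHA-256 equals the pinned value (assumed scheme)."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.lower())
```

Recording the digest alongside the model version would also give a basic audit trail entry for each file that was loaded.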

🚀 Usage Examples

Basic Scene Planning

from core.scene_planner import plan_scenes

scenes = plan_scenes(
    text_bn="আজকের দিনটি খুব সুন্দর ছিল।",
    duration=15
)
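
How `plan_scenes` divides the requested duration is not described here. As an illustration only, one plausible baseline is to split the text on the Bangla danda (`।`) and give each sentence a share of the duration proportional to its length. Both helper names below are hypothetical and are not part of core/scene_planner.py.

```python
import re
from typing import List


def split_sentences(text: str) -> List[str]:
    """Split on the Bangla danda and common terminators, dropping empty pieces."""
    return [part.strip() for part in re.split(r"[।!?]", text) if part.strip()]


def allocate_durations(sentences: List[str], total_seconds: float) -> List[float]:
    """Assumed baseline: each sentence gets time proportional to its character count."""
    weights = [len(sentence) for sentence in sentences]
    total_weight = sum(weights)
    if total_weight == 0:
        return []
    return [total_seconds * weight / total_weight for weight in weights]


sentences = split_sentences("আজকের দিনটি খুব সুন্দর ছিল। আমি খুশি।")
durations = allocate_durations(sentences, total_seconds=15)
```

A model-driven planner could replace the character-count weights with predicted importance scores while keeping the same normalization.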

Tier-Based Generation

from config.model_tiers import get_tier_config
from models.image.sd_generator import get_generator

config = get_tier_config("pro")
generator = get_generator(lora_path=config.lora_path, use_lcm=config.lcm_enabled)

Security Validation

from config.model_tiers import validate_model_weights_security

result = validate_model_weights_security("data/lora/memo-scene-lora.safetensors")

📊 Model Tiers

| Tier | Resolution | Inference Steps | LoRA | LCM | Credits/min | Memory |
|---|---|---|---|---|---|---|
| Free | 512×512 | 15 | ❌ | ❌ | $5.0 | 4GB |
| Pro | 768×768 | 25 | ✅ | ✅ | $15.0 | 8GB |
| Enterprise | 1024×1024 | 30 | ✅ | ✅ | $50.0 | 16GB |
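
The tier table above maps naturally onto a small immutable configuration object. The sketch below is an assumption about how such a lookup might be structured, not the actual contents of config/model_tiers.py. Only `get_tier_config`, `lora_path`, and `lcm_enabled` appear in this document; the other field names, and the choice of LoRA path for the Pro tier, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TierConfig:
    name: str
    resolution: int             # square output size in pixels
    inference_steps: int
    lora_enabled: bool
    lora_path: Optional[str]    # None when no default LoRA file is configured
    lcm_enabled: bool


_TIERS = {
    "free": TierConfig("free", 512, 15, False, None, False),
    # Assumption: the Pro tier uses the scene LoRA file named elsewhere in this document.
    "pro": TierConfig("pro", 768, 25, True, "data/lora/memo-scene-lora.safetensors", True),
    # Enterprise uses a custom LoRA, so no default path is assumed here.
    "enterprise": TierConfig("enterprise", 1024, 30, True, None, True),
}


def get_tier_config(tier: str) -> TierConfig:
    """Look up a tier by name, failing loudly on unknown tiers."""
    try:
        return _TIERS[tier.lower()]
    except KeyError:
        raise ValueError(f"unknown tier: {tier!r}") from None
```

Freezing the dataclass prevents a request handler from mutating shared tier settings by accident.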

🛠️ Installation

# Clone the repository
git clone https://huggingface.co/likhonsheikh/memo

# Install dependencies
pip install -r requirements.txt

# Run the demonstration
python demo.py

# Start the API server
python api/main.py

🎬 API Usage

Health Check

curl http://localhost:8000/health

Generate Video

curl -X POST "http://localhost:8000/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "আজকের দিনটি খুব সুন্দর ছিল।",
    "duration": 15,
    "tier": "pro"
  }'

Check Status

curl http://localhost:8000/status/{request_id}

🧪 Training Custom LoRA

from scripts.train_scene_lora import SceneLoRATrainer, TrainingConfig

config = TrainingConfig(
    base_model="google/mt5-small",
    rank=32,
    alpha=64,
    save_safetensors=True  # MANDATORY
)

trainer = SceneLoRATrainer(config)
trainer.load_model()
trainer.setup_lora()
trainer.train(training_data)

⚡ Performance Features

Memory Optimization: xFormers, attention slicing, CPU offload
FP16 Precision: Roughly halves weight memory relative to FP32, usually with little visible loss in quality
LCM Acceleration: Faster inference when available
Device Mapping: Optimal GPU/CPU utilization
Background Processing: Async handling of long-running tasks
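
The FP16 saving follows from the storage size of each parameter: 4 bytes in FP32 versus 2 bytes in FP16. The short calculation below uses a hypothetical model of one billion parameters to show the arithmetic; it covers weights only, so activations and framework overhead are not included.

```python
def weight_memory_gib(num_params: int, bytes_per_param: int) -> float:
    """Memory needed to store the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


NUM_PARAMS = 1_000_000_000  # hypothetical model size, for illustration only

fp32_gib = weight_memory_gib(NUM_PARAMS, 4)  # 4 bytes per FP32 parameter
fp16_gib = weight_memory_gib(NUM_PARAMS, 2)  # 2 bytes per FP16 parameter
print(f"FP32: {fp32_gib:.2f} GiB, FP16: {fp16_gib:.2f} GiB")
```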

🔍 Security Validation

from config.model_tiers import validate_model_weights_security

# Validate any model file
result = validate_model_weights_security("path/to/model.safetensors")
print(f"Secure: {result['is_secure']}")
print(f"Format: {result['format']}")
print(f"Issues: {result['issues']}")
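
To make the shape of such a check concrete, here is a minimal sketch of a format validator. It is not the implementation of `validate_model_weights_security`: the function name `inspect_weights_file` is hypothetical, and only the result keys (`is_secure`, `format`, `issues`) mirror the usage above. It relies on the documented safetensors layout, in which a file starts with an 8-byte little-endian header length followed by a JSON header.

```python
import json
import os
import struct


def _read_header(path: str) -> dict:
    """Parse the JSON header of a safetensors file, raising ValueError if malformed."""
    with open(path, "rb") as fh:
        prefix = fh.read(8)
        if len(prefix) != 8:
            raise ValueError("file is too short to contain a header length")
        (header_len,) = struct.unpack("<Q", prefix)
        if header_len > os.path.getsize(path) - 8:
            raise ValueError("declared header length exceeds file size")
        header = json.loads(fh.read(header_len).decode("utf-8"))
    if not isinstance(header, dict):
        raise ValueError("header is not a JSON object")
    return header


def inspect_weights_file(path: str) -> dict:
    """Hypothetical validator: accept only well-formed .safetensors files."""
    issues = []
    extension = os.path.splitext(path)[1].lower()
    if extension != ".safetensors":
        issues.append(f"unsupported extension {extension!r}; only .safetensors is accepted")
    else:
        try:
            _read_header(path)
        except (OSError, ValueError) as exc:
            issues.append(f"invalid safetensors header: {exc}")
    return {
        "is_secure": not issues,
        "format": extension.lstrip(".") or "unknown",
        "issues": issues,
    }
```

A passing result here only shows that the file is structurally a safetensors file rather than a pickle; it says nothing about where the weights came from.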

📁 File Structure

📁 Memo/
├── 📄 requirements.txt                    # Production dependencies
├── 📁 models/
│   ├── 📁 text/
│   │   └── 📄 bangla_parser.py           # Transformer-based Bangla parser
│   └── 📁 image/
│       └── 📄 sd_generator.py            # Stable Diffusion + Safetensors
├── 📁 core/
│   └── 📄 scene_planner.py               # ML-based scene planning
├── 📁 data/
│   └── 📁 lora/
│       └── 📄 README.md                  # LoRA configuration (safetensors only)
├── 📁 scripts/
│   └── 📄 train_scene_lora.py            # Training with safetensors output
├── 📁 config/
│   └── 📄 model_tiers.py                 # Tier management system
├── 📁 api/
│   └── 📄 main.py                        # Production API endpoint
└── 📄 demo.py                            # Complete system demonstration

🎯 What This Doesn't Do

❌ Make GPUs cheap
❌ Fix bad prompts
❌ Read your mind
❌ Guarantee perfect results

🏆 Production Readiness

This implementation is now:

✅ Correct - Uses proper ML frameworks (transformers, safetensors)
✅ Modern - 2025-grade architecture with security best practices
✅ Secure - Zero tolerance for unsafe model formats
✅ Scalable - Tier-based resource management
✅ Defensible - Production-grade security and validation

📜 License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📞 Support

For support, email support@memo.ai or join our Discord community.

If your API claims "state-of-the-art" without these features, you're lying. Memo now actually delivers on that promise with proper Transformers + Safetensors integration.