---
license: mit
base_model: microsoft/Phi-3-mini-4k-instruct
tags:
- text-classification
- domain-classification
- phi-3
- lora
- peft
- api-routing
- llm-routing
language:
- en
metrics:
- accuracy
- f1
library_name: peft
pipeline_tag: text-classification
datasets:
- custom
widget:
- text: "Write a Python function to calculate factorial"
  example_title: "Coding Query"
- text: "Generate an OpenAPI specification for a user management API"
  example_title: "API Generation"
- text: "What is quantum mechanics?"
  example_title: "Science Query"
- text: "Analyze sales data to find trends"
  example_title: "Data Analysis"
- text: "Write a poem about the ocean"
  example_title: "Creative Content"
---

# Phi-3 Domain Classifier for Intelligent API Routing

**🎯 96.5% Accuracy | 15 Domain Categories | Production-Ready**

A fine-tuned Phi-3-mini model for classifying user queries into specific domains, enabling intelligent routing to specialized LLM providers in API management systems.

## 🚀 Key Features

- ✅ **High Accuracy**: 96.5% on the test set
- ✅ **Fast Inference**: ~35-45ms per query
- ✅ **Lightweight**: Only ~100MB of LoRA adapters
- ✅ **15 Domains**: Comprehensive coverage
- ✅ **Production-Ready**: Battle-tested on real queries

## 📊 Performance Metrics

| Metric | Score |
|--------|-------|
| **Accuracy** | 96.50% |
| **F1 Score (Weighted)** | 0.9649 |
| **F1 Score (Macro)** | 0.9679 |
| **Precision (Macro)** | 0.97 |
| **Recall (Macro)** | 0.97 |

### Per-Domain Performance

| Domain | Precision | Recall | F1-Score |
|--------|-----------|--------|----------|
| coding | 0.86 | 0.92 | 0.89 |
| api_generation | 1.00 | 0.90 | 0.95 |
| mathematics | 1.00 | 1.00 | 1.00 |
| data_analysis | 0.92 | 1.00 | 0.96 |
| science | 1.00 | 1.00 | 1.00 |
| medicine | 0.93 | 1.00 | 0.96 |
| business | 0.88 | 1.00 | 0.93 |
| law | 0.91 | 1.00 | 0.95 |
| technology | 1.00 | 1.00 | 1.00 |
| literature | 1.00 | 1.00 | 1.00 |
| creative_content | 1.00 | 1.00 | 1.00 |
| education | 1.00 | 0.93 | 0.96 |
| general_knowledge | 1.00 | 0.84 | 0.91 |
| ambiguous | 1.00 | 1.00 | 1.00 |
| sensitive | 1.00 | 1.00 | 1.00 |

## 🎯 Supported Domains

1. **coding** - Programming, algorithms, code generation
2. **api_generation** - OpenAPI specs, API design, REST/GraphQL
3. **mathematics** - Math problems, equations, calculations
4. **data_analysis** - Data science, statistics, analysis
5. **science** - Physics, chemistry, biology, scientific concepts
6. **medicine** - Medical queries, health information
7. **business** - Business strategy, finance, management
8. **law** - Legal questions, regulations, compliance
9. **technology** - Tech concepts, hardware, software
10. **literature** - Books, writing, literary analysis
11. **creative_content** - Creative writing, poetry, storytelling
12. **education** - Teaching, learning, academic topics
13. **general_knowledge** - General Q&A, trivia
14. **ambiguous** - Unclear or multi-domain queries
15. **sensitive** - Sensitive topics requiring careful handling

## 🔧 Usage

### Basic Classification

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch
import json

# Load the base model and attach the LoRA adapters
base_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)
model = PeftModel.from_pretrained(
    base_model,
    "YOUR_USERNAME/phi3-domain-classifier"
)
tokenizer = AutoTokenizer.from_pretrained(
    "YOUR_USERNAME/phi3-domain-classifier",
    trust_remote_code=True
)

# Configure for inference
model.config.use_cache = False
model.eval()

# Classify a query
def classify_domain(query):
    messages = [
        {"role": "system", "content": "You are a domain classifier. Respond with JSON."},
        {"role": "user", "content": f"Classify this query: {query}"}
    ]
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_tensors="pt"
    ).to(model.device)

    with torch.no_grad():
        outputs = model.generate(
            inputs,
            max_new_tokens=100,
            temperature=0.1,
            do_sample=True,
            pad_token_id=tokenizer.pad_token_id,
            eos_token_id=tokenizer.eos_token_id,
            use_cache=False
        )

    response = tokenizer.decode(
        outputs[0][inputs.shape[-1]:],
        skip_special_tokens=True
    )
    return json.loads(response)

# Example
result = classify_domain("Write a Python function to calculate factorial")
print(result)
# Output: {"primary_domain": "coding", "confidence": "high"}
```
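The model is trained to emit a JSON object, but as noted under Limitations below, generation can occasionally produce malformed output or a label outside the 15 supported domains. A defensive wrapper along the following lines can help in production; this is an illustrative sketch built on the `classify_domain` function above (the `DOMAINS` constant and `classify_domain_safe` helper are not part of the released adapter):

```python
# The 15 labels listed under "Supported Domains", kept as a set for validation
DOMAINS = {
    "coding", "api_generation", "mathematics", "data_analysis", "science",
    "medicine", "business", "law", "technology", "literature",
    "creative_content", "education", "general_knowledge", "ambiguous", "sensitive",
}

def classify_domain_safe(query, fallback="ambiguous"):
    """Classify a query, falling back to a default label on bad model output."""
    try:
        result = classify_domain(query)
    except json.JSONDecodeError:
        # Malformed JSON from the model; treat the query as unclassified
        return {"primary_domain": fallback, "confidence": "low"}
    if result.get("primary_domain") not in DOMAINS:
        # Label outside the supported taxonomy; downgrade to the fallback
        return {"primary_domain": fallback, "confidence": "low"}
    return result
```

Falling back to `ambiguous` keeps routing deterministic: the router below can map that label to a general-purpose provider.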
### API Router Integration

```python
class SmartAPIRouter:
    """Route queries to specialized LLM providers."""

    def __init__(self):
        # classify_domain is the function defined in the basic example above
        self.classifier = classify_domain
        self.provider_mapping = {
            "coding": "anthropic",          # Claude for code
            "api_generation": "anthropic",  # Claude for APIs
            "mathematics": "anthropic",     # Claude for math
            "creative_content": "openai",   # GPT-4 for creativity
            "general_knowledge": "openai",  # GPT-4 for general Q&A
            # ... customize as needed
        }

    def route(self, query):
        result = self.classifier(query)
        domain = result["primary_domain"]
        provider = self.provider_mapping.get(domain, "openai")
        return {
            "domain": domain,
            "routed_to": provider,
            "confidence": result["confidence"]
        }

# Usage
router = SmartAPIRouter()
routing_info = router.route("Explain quantum entanglement")
# Routes to the appropriate LLM provider based on domain
```

## 📦 Model Details

### Architecture

- **Base Model**: microsoft/Phi-3-mini-4k-instruct (3.8B parameters)
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **LoRA Rank**: 32
- **LoRA Alpha**: 64
- **Target Modules**: qkv_proj, o_proj, gate_up_proj, down_proj
- **Trainable Parameters**: ~100M (2.6% of total)

### Training Configuration

- **Epochs**: 15
- **Batch Size**: 4 (per device)
- **Gradient Accumulation**: 8 steps (effective batch size: 32)
- **Learning Rate**: 5e-5
- **LR Schedule**: Cosine with 5% warmup
- **Optimizer**: AdamW (fused)
- **Precision**: BF16
- **Label Smoothing**: 0.1
- **Gradient Clipping**: 0.5
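For readers reproducing the setup, the architecture and hyperparameters above translate roughly into the following PEFT/Transformers configuration. This is a sketch reconstructed from the numbers in this card, not the released training script; dataset loading and prompt formatting are omitted because the training data is custom, and the `lora_dropout` value is an assumption (it is not stated here).

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA setup matching the "Architecture" section above
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    target_modules=["qkv_proj", "o_proj", "gate_up_proj", "down_proj"],
    lora_dropout=0.05,  # assumption: dropout is not listed in this card
    task_type="CAUSAL_LM",
)

# Hyperparameters matching the "Training Configuration" section above
training_args = TrainingArguments(
    output_dir="phi3-domain-classifier",
    num_train_epochs=15,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,  # effective batch size 32
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    optim="adamw_torch_fused",
    bf16=True,
    label_smoothing_factor=0.1,
    max_grad_norm=0.5,
)
```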
### Training Hardware

- **GPU**: NVIDIA A40 (48GB VRAM)
- **Training Time**: ~7 hours
- **Framework**: PyTorch 2.0+ with Transformers

### Training Data

- **Total Samples**: Custom dataset of domain-labeled queries
- **Train/Val/Test Split**: 70/15/15
- **Domains**: 15 categories
- **Format**: Instruction-following with JSON output

## 🎯 Use Cases

### 1. Intelligent API Gateway
Route user queries to the most appropriate LLM provider based on domain expertise.

### 2. Multi-LLM Orchestration
Distribute workload across multiple LLM providers based on their strengths.

### 3. Cost Optimization
Route simple queries to cheaper models and complex queries to premium providers.

### 4. Query Analytics
Analyze and categorize user query patterns for insights.

### 5. Content Moderation
Identify sensitive or ambiguous queries for special handling.

## 🔒 Limitations

- **Language**: Optimized for English queries only
- **Context Length**: Limited to 4K tokens (a Phi-3-mini constraint)
- **Domain Coverage**: Fixed set of 15 domains; custom domains require retraining
- **Ambiguous Queries**: May struggle with highly ambiguous or multi-domain queries
- **JSON Output**: Expects a structured JSON response; parsing may fail on malformed output (see the defensive wrapper under Usage)

## ⚖️ Ethical Considerations

- **Bias**: The model may inherit biases from its training data
- **Sensitive Content**: Has a dedicated "sensitive" category but should not replace human review
- **Privacy**: No personal data was used in training; user queries are not logged by the model
- **Transparency**: Classification decisions are explainable through domain labels

## 📄 License

MIT License - free for commercial and non-commercial use.

## 🙏 Acknowledgments

- Base model: Microsoft Phi-3 team
- Fine-tuning: HuggingFace PEFT library
- Training infrastructure: NVIDIA A40 GPU

## 📚 Citation

If you use this model in your research or application, please cite:

```bibtex
@misc{phi3-domain-classifier,
  author = {Your Name},
  title = {Phi-3 Domain Classifier for Intelligent API Routing},
  year = {2024},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/YOUR_USERNAME/phi3-domain-classifier}},
}
```

## 📞 Contact

For questions, issues, or collaboration:

- **HuggingFace**: [@YOUR_USERNAME](https://huggingface.co/YOUR_USERNAME)
- **GitHub**: [ovindumandith](https://github.com/ovindumandith)
- **Email**: your.email@example.com

## 🔄 Version History

- **v1.0** (2024-12-09): Initial release
  - 96.5% accuracy on 15-domain classification
  - Production-ready LoRA adapter
  - Optimized for API routing use cases

---

**Built using Phi-3 and PEFT**