prithivMLmods/shoe-type-detection

@@ -2,8 +2,31 @@
 license: apache-2.0
 datasets:
 - prithivMLmods/Shoe-Net-10K
 ---
 ```py
 Classification Report:
               precision    recall  f1-score   support
@@ -20,3 +43,87 @@ weighted avg     0.9202    0.9197    0.9194     10000
 ```
 ![download.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/e5c_wP09atj7GhXoxUnHW.png)

 license: apache-2.0
 datasets:
 - prithivMLmods/Shoe-Net-10K
+language:
+- en
+base_model:
+- google/siglip2-base-patch16-512
+pipeline_tag: image-classification
+library_name: transformers
+tags:
+- SigLIP2
+- Ballet Flat
+- Boat
+- Sneaker
+- Clog
+- Brogue
 ---
+![44.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/_6WAhmO9_W74Sz2AhwytE.png)
+# shoe-type-detection
+> shoe-type-detection is a vision-language encoder model fine-tuned from `google/siglip2-base-patch16-512` for **multi-class image classification**. It is trained to classify shoe images into five types: **Ballet Flats**, **Boat Shoes**, **Brogues**, **Clogs**, and **Sneakers**. The model uses the `SiglipForImageClassification` architecture.
+> [!note]
+> SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features
+> [https://arxiv.org/pdf/2502.14786](https://arxiv.org/pdf/2502.14786)
 ```py
 Classification Report:
               precision    recall  f1-score   support
 ```
 ![download.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/e5c_wP09atj7GhXoxUnHW.png)
+---
+## Label Space: 5 Classes
+```
+Class 0: Ballet Flat
+Class 1: Boat
+Class 2: Brogue
+Class 3: Clog
+Class 4: Sneaker
+```
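To make the index-to-name mapping concrete, here is a minimal sketch in plain Python that turns a vector of five logits into a labeled prediction. The logit values are made up for illustration, and `softmax` and `top_label` are hypothetical helper names, not part of the model's API.

```python
import math

# Class indices as listed in the label space above.
ID2LABEL = {0: "Ballet Flat", 1: "Boat", 2: "Brogue", 3: "Clog", 4: "Sneaker"}


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Hypothetical logits, not real model output.
example_logits = [0.2, -1.0, 0.5, -0.3, 3.1]
label, prob = top_label(example_logits)
print(label, round(prob, 3))
```

The same mapping applies to the real model: position `i` in the logits tensor corresponds to `Class i` in the list above.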
+---
+## Install Dependencies
+```bash
+pip install -q transformers torch pillow gradio hf_xet
+```
+---
+## Inference Code
+```python
+import gradio as gr
+from transformers import AutoImageProcessor, SiglipForImageClassification
+from PIL import Image
+import torch
+# Load model and processor from the Hugging Face Hub
+model_name = "prithivMLmods/shoe-type-detection"
+model = SiglipForImageClassification.from_pretrained(model_name)
+processor = AutoImageProcessor.from_pretrained(model_name)
+# Updated label mapping
+id2label = {
+    "0": "Ballet Flat",
+    "1": "Boat",
+    "2": "Brogue",
+    "3": "Clog",
+    "4": "Sneaker"
+}
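+def classify_image(image):
+    """Classify a shoe image (NumPy array from Gradio) and return label -> probability."""
+    # Gradio passes a NumPy array; convert it to an RGB PIL image
+    image = Image.fromarray(image).convert("RGB")
+    inputs = processor(images=image, return_tensors="pt")
+    # Run inference without tracking gradients
+    with torch.no_grad():
+        outputs = model(**inputs)
+        logits = outputs.logits
+        # Softmax over the class dimension, then flatten the single-image batch to a list
+        probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()
+    # Map each class index to its label and round the scores for display
+    prediction = {
+        id2label[str(i)]: round(probs[i], 3) for i in range(len(probs))
+    }
+    return prediction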
+# Gradio Interface
+iface = gr.Interface(
+    fn=classify_image,
+    inputs=gr.Image(type="numpy"),
+    outputs=gr.Label(num_top_classes=5, label="Shoe Type Classification"),
+    title="Shoe Type Detection",
+    description="Upload an image of a shoe to classify it as Ballet Flat, Boat, Brogue, Clog, or Sneaker."
+)
+if __name__ == "__main__":
+    iface.launch()
+```
+---
+## Intended Use
+`shoe-type-detection` is designed for:
+* **E-Commerce Automation** – Automate product tagging and classification in online retail platforms.
+* **Footwear Inventory Management** – Efficiently organize and categorize large volumes of shoe images.
+* **Retail Intelligence** – Enable AI-powered search and filtering based on shoe types.
+* **Smart Surveillance** – Identify and analyze footwear types in surveillance footage for retail analytics.
+* **Fashion and Apparel Research** – Analyze trends in shoe types and customer preferences using image datasets.
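
For the product-tagging use case, one possible pattern is to accept a prediction only when the top probability clears a threshold and to route everything else to manual review. The sketch below assumes prediction dictionaries shaped like the output of `classify_image`; the `tag_product` helper, the 0.80 threshold, the file names, and the sample scores are illustrative assumptions, not values from the model card.

```python
def tag_product(prediction, threshold=0.80):
    """Return the top label if its score meets the threshold, else None.

    `prediction` maps label -> probability, like the dict returned by
    classify_image. The default threshold is an arbitrary placeholder
    and should be tuned on held-out data.
    """
    if not prediction:
        return None
    label, score = max(prediction.items(), key=lambda item: item[1])
    return label if score >= threshold else None


# Hypothetical scores for two catalog images (not real model output).
catalog = {
    "sku-001.jpg": {"Ballet Flat": 0.02, "Boat": 0.01, "Brogue": 0.03,
                    "Clog": 0.01, "Sneaker": 0.93},
    "sku-002.jpg": {"Ballet Flat": 0.10, "Boat": 0.45, "Brogue": 0.40,
                    "Clog": 0.03, "Sneaker": 0.02},
}

# Confident predictions become tags; the rest are flagged for review.
tags = {name: tag_product(scores) for name, scores in catalog.items()}
needs_review = [name for name, tag in tags.items() if tag is None]
print(tags)
print(needs_review)
```

Returning `None` instead of a low-confidence label keeps ambiguous items, such as the Boat versus Brogue case in the second sample, out of automated tagging.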