# Machine Learning Model Comparison - Classification Project

This project compares several supervised machine learning algorithms on structured classification tasks. Each model was evaluated on accuracy, speed, and practical usability.
## Models Included

| No. | Model Name | Type |
|---|---|---|
| 1 | Logistic Regression | Linear Model |
| 2 | Random Forest | Ensemble (Bagging) |
| 3 | K-Nearest Neighbors | Instance-Based (Lazy) |
| 4 | XGBoost | Gradient Boosting |
| 5 | Support Vector Machine | Margin-based Classifier |
| 6 | ANN (MLPClassifier) | Neural Network |
| 7 | LightGBM | Gradient Boosting (Histogram) |
| 8 | Naive Bayes | Probabilistic |
## Accuracy Summary

| Model | Accuracy | Speed |
|---|---|---|
| Logistic Regression | ~84% | Very Fast |
| Random Forest | ~95% | Medium |
| KNN | ~84% | Slow |
| XGBoost | ~90% | Medium |
| SVM | ~85% | Medium |
| ANN (MLP) | ~51% | Medium |
| LightGBM | ~90% | Fastest |
| Naive Bayes | ~80% | Extremely Fast |
## Model Descriptions

### 1. Logistic Regression
- A linear model that predicts class probabilities using a sigmoid function.
- ✅ Best for interpretable and quick binary classification.
- ❌ Not ideal for non-linear or complex patterns.
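A minimal sketch of fitting this model with scikit-learn; the synthetic data and settings below are illustrative assumptions, not this project's actual pipeline:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the project's structured data
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X, y)
# predict_proba exposes the sigmoid-based class probabilities
print(clf.predict_proba(X[:3]))
```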
### 2. Random Forest
- An ensemble of decision trees with majority voting.
- ✅ Excellent accuracy and robustness.
- ❌ Slower and harder to interpret than simpler models.
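A hedged sketch of the same idea with scikit-learn's `RandomForestClassifier` (synthetic data, illustrative hyperparameters):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# 100 bagged trees; class predictions are made by majority vote
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.predict(X[:3]))
```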
### 3. K-Nearest Neighbors (KNN)
- A lazy learner that predicts based on the nearest data points.
- ✅ Simple and training-free.
- ❌ Very slow for large datasets; sensitive to noise.
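A short sketch that puts KNN behind a scaler, since its distance computations are scale-sensitive (data and parameters here are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Scaling matters: KNN distances are dominated by large-range features
clf = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)).fit(X, y)
print(clf.predict(X[:3]))
```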
### 4. XGBoost
- A boosting algorithm that builds trees sequentially to minimize error.
- ✅ High accuracy, regularization, built-in feature importance.
- ❌ Slightly complex tuning; slower than simpler models.
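An illustrative sketch using the `xgboost` package's sklearn-style API; the hyperparameters are placeholders, not this project's tuned values:

```python
from sklearn.datasets import make_classification
from xgboost import XGBClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Sequentially boosted trees; learning_rate and max_depth act as regularizers
clf = XGBClassifier(n_estimators=200, learning_rate=0.1, max_depth=4).fit(X, y)
print(clf.feature_importances_)  # built-in feature importance
```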
### 5. Support Vector Machine (SVM)
- Separates classes by finding the maximum-margin hyperplane.
- ✅ Excellent for high-dimensional or non-linear data.
- ❌ Doesn't scale well; requires feature scaling.
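A sketch pairing `SVC` with the feature scaling the note above calls for (synthetic data, assumed settings):

```python
from sklearn.datasets import make_classification
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# RBF kernel handles non-linear boundaries; scaling is essential for SVMs
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0)).fit(X, y)
print(clf.predict(X[:3]))
```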
### 6. ANN (MLPClassifier - sklearn)
- A basic feedforward neural network with hidden layers.
- ✅ Capable of learning complex patterns.
- ❌ Low accuracy in this project; needs better tuning and data scaling.
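A sketch of `MLPClassifier` with input scaling, one of the tuning steps suggested above (the layer sizes and iteration count are assumptions):

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Unscaled inputs are a common cause of poor MLP accuracy
clf = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0),
).fit(X, y)
print(clf.predict(X[:3]))
```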
### 7. LightGBM
- A gradient boosting framework optimized for speed and memory.
- ✅ Faster than XGBoost; supports categorical features directly.
- ❌ Can overfit small datasets if not tuned well.
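An illustrative `LGBMClassifier` sketch; `num_leaves` is the usual knob for the overfitting concern noted above (the values here are assumptions):

```python
from lightgbm import LGBMClassifier
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Smaller num_leaves (plus min_child_samples) helps avoid overfitting small data
clf = LGBMClassifier(n_estimators=200, num_leaves=31).fit(X, y)
print(clf.predict(X[:3]))
```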
### 8. Naive Bayes (GaussianNB)
- A probabilistic classifier assuming feature independence.
- ✅ Fastest model; works well for text and high-dimensional data.
- ❌ The feature-independence assumption rarely holds; weak on complex patterns.
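A minimal `GaussianNB` sketch on synthetic data (illustrative only):

```python
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Fits one Gaussian per feature per class; no iterative training, hence the speed
clf = GaussianNB().fit(X, y)
print(clf.predict_proba(X[:3]))
```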
## Recommendation Summary

| Best For | Model |
|---|---|
| Highest Accuracy | Random Forest |
| Fastest Training | Naive Bayes |
| Best for Large Data | LightGBM |
| Best Baseline | Logistic Regression |
| Best for Clean Data | SVM |
| Best for Speed + Accuracy | XGBoost |
## Resources Included

- `model.pkl` files for each classifier
- `cart.docx` with graphs, charts, and performance analysis
- This `README.md` as the model card

For more information, see the `cart.docx` file.
## How to Use

Load any of the bundled `.pkl` files with joblib and call `predict` on a 2D feature array:

```python
from joblib import load

model = load("XGBoost_model.pkl")  # any of the saved .pkl classifiers

# The model expects a 2D array of numeric features, one row per sample
sample = [[0.5, 1.2, 3.4]]  # placeholder values; use your dataset's feature order
prediction = model.predict(sample)
print(prediction)
```
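If you need class probabilities rather than hard labels, most of these estimators also expose scikit-learn's `predict_proba` (note that `SVC` only supports it when trained with `probability=True`):

```python
# Class probabilities instead of hard labels, reusing model and sample from above
probabilities = model.predict_proba(sample)
print(probabilities)
```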