JashuXo committed on
Commit dd8886e · verified · 1 Parent(s): 6708b9e

Update README.md

Files changed (1)
  1. README.md +121 -3
README.md CHANGED
@@ -1,3 +1,121 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ language:
+ - en
+ metrics:
+ - accuracy
+ - f1
+ - r_squared
+ - mse
+ tags:
+ - knn
+ - nearest-neighbors
+ - tabular
+ - classification
+ - regression
+ - cpu
+ - low-latency
+ - ann
+ - distance-weighted
+ - production-ready
+ ---
+ ---
+ language:
+ - en
+ tags:
+ - tabular
+ - nearest-neighbors
+ - knn
+ - classification
+ - regression
+ - cpu
+ - low-latency
+ - interpretable
+ library_name: smart-knn
+ license: mit
+ pipeline_tag: tabular-classification
+ model_name: SmartKNN v2
+ ---
+
+ # SmartKNN v2
+
+ **SmartKNN v2** is a high-performance, CPU-first nearest-neighbors model designed for **low-latency production inference** on real-world tabular data.
+
+ In internal evaluation, it delivers **accuracy competitive with gradient-boosted models** while maintaining **sub-millisecond single-prediction latency (p95)** on CPU-only systems.
+
+ SmartKNN v2 is part of the **SmartEco** ecosystem.
+
+ ---
+
+ ## Model Details
+
+ - **Model type:** Distance-weighted K-Nearest Neighbors
+ - **Tasks:** Classification, Regression
+ - **Backend:** Adaptive (Brute-force + ANN)
+ - **Hardware:** CPU-only (GPU not required)
+ - **Focus:** Low latency, interpretability, production readiness
+
+ Unlike classical KNN, SmartKNN v2 learns feature importance, adapts its execution strategy to the dataset size, and uses optimized distance kernels for fast inference.
+
+ ---
+
+ ## What’s New in v2
+
+ - Full classification support restored
+ - ANN backend introduced for scalable neighbor search
+ - Automatic backend selection (small → brute, large → ANN)
+ - Distance-weighted voting for improved accuracy
+ - Interpretable neighbor influence statistics
+ - Foundation for adaptive-K strategies
+
+ ---
+
+ ## Architecture Overview
+
+ - Feature Weighting
+ - Backend Selector
+ - Brute Backend (small datasets)
+ - ANN Backend (large datasets)
+ - Distance Kernel
+ - Weighted Voting
+ - Prediction
+
+
+ This hybrid architecture is designed to keep latency consistently low across dataset sizes.
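
The stages listed above can be illustrated with a minimal NumPy sketch. It is not SmartKNN's actual implementation or API: `select_backend`, `weighted_knn_predict`, and the `ANN_THRESHOLD` cutoff are hypothetical names and values chosen for illustration, the feature weights are supplied by hand rather than learned, and only the brute-force path is shown.

```python
import numpy as np

# Hypothetical cutoff: the README does not say where the backend switches.
ANN_THRESHOLD = 50_000


def select_backend(n_samples, threshold=ANN_THRESHOLD):
    """Pick a search strategy from the training-set size (illustrative rule)."""
    return "brute" if n_samples < threshold else "ann"


def weighted_knn_predict(X_train, y_train, x, feature_weights, k=3, eps=1e-8):
    """Classify one query with feature-weighted, distance-weighted brute-force KNN."""
    diff = X_train - x
    # Distance kernel: Euclidean distance with a per-feature weight.
    dists = np.sqrt((feature_weights * diff ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]
    # Weighted voting: closer neighbors count more (eps avoids division by zero).
    votes = 1.0 / (dists[nearest] + eps)
    labels = y_train[nearest]
    classes = np.unique(labels)
    scores = np.array([votes[labels == c].sum() for c in classes])
    return classes[np.argmax(scores)]


X_train = np.array([[0.0, 0.0], [0.1, 5.0], [1.0, 0.2], [1.1, 4.8]])
y_train = np.array([0, 0, 1, 1])
feature_weights = np.array([1.0, 0.0])  # assumed weights: second feature ignored

backend = select_backend(len(X_train))
prediction = weighted_knn_predict(
    X_train, y_train, np.array([0.9, 5.0]), feature_weights, k=3
)
```

In this toy data the query's three nearest neighbors under the assumed weights are two class-1 points and one class-0 point, and the inverse-distance votes favor class 1. An ANN backend would replace only the neighbor search step; the voting logic would stay the same.
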
+
+ ---
+
+ ## Performance (Internal Evaluation)
+
+ > Public benchmarks will be released soon.
+
+ From internal testing on real-world tabular datasets:
+
+ - Accuracy comparable to XGBoost / LightGBM / CatBoost
+ - Single-prediction latency:
+   - Median: sub-millisecond
+   - p95: consistently low on CPU
+ - Predictable batch inference scaling
+
+ SmartKNN v2 has **not yet reached its performance ceiling**. Future releases will further optimize speed and accuracy.
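
Until public benchmarks are available, median and p95 single-prediction latency can be measured locally along these lines. This is a sketch under stated assumptions: `latency_percentiles` is a hypothetical helper, and `nearest_index` is a stand-in brute-force predictor on random data, not SmartKNN itself. No timings are quoted because they depend on the machine.

```python
import time

import numpy as np


def latency_percentiles(predict_one, queries, warmup=10):
    """Time single-row predictions; return (median, p95) in milliseconds."""
    # Warm-up calls so one-time costs do not skew the measurement.
    for q in queries[:warmup]:
        predict_one(q)
    timings_ms = []
    for q in queries:
        start = time.perf_counter()
        predict_one(q)
        timings_ms.append((time.perf_counter() - start) * 1e3)
    return (
        float(np.percentile(timings_ms, 50)),
        float(np.percentile(timings_ms, 95)),
    )


# Stand-in data and predictor (assumptions for illustration only).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(1000, 8))
queries = rng.normal(size=(200, 8))


def nearest_index(q):
    """Return the index of the closest training row (brute force)."""
    return int(np.argmin(((X_train - q) ** 2).sum(axis=1)))


median_ms, p95_ms = latency_percentiles(nearest_index, queries)
print(f"median={median_ms:.4f} ms, p95={p95_ms:.4f} ms")
```

To time a real model, pass its single-row prediction function as `predict_one` and use held-out rows as `queries`.
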
+
+ ---
+
+ ## Limitations
+
+ - Not designed for unstructured data (text, images)
+ - ANN backend focuses on CPU efficiency, not GPU acceleration
+ - Best suited for tabular datasets
+
+ ---
+
+ ## Future Work
+
+ - Adaptive-K accuracy optimization
+ - Kernel-level speed improvements
+ - Custom ANN backend
+
+ ## Links
+
+ - Website: https://thatipamula-jashwanth.github.io/SmartEco/
+ - Source Code: https://github.com/thatipamula-jashwanth/smart-knn