---
language:
- en
tags:
- tabular
- nearest-neighbors
- knn
- classification
- regression
- cpu
- low-latency
- interpretable
library_name: smart-knn
license: mit
pipeline_tag: tabular-classification
model_name: SmartKNN v2
---

SmartKNN v2

SmartKNN v2 is a high-performance, CPU-first nearest-neighbors model designed for low-latency production inference on real-world tabular data.

In internal testing, it delivers accuracy competitive with gradient-boosted models while maintaining sub-millisecond single-prediction latency (p95) on CPU-only systems.

SmartKNN v2 is part of the SmartEco ecosystem.

Model Details

Model type: Distance-weighted K-Nearest Neighbors
Tasks: Classification, Regression
Backend: Adaptive (Brute-force + ANN)
Hardware: CPU-only (GPU not required)
Focus: Low latency, interpretability, production readiness

Unlike classical KNN, SmartKNN v2 learns feature importance, adapts execution strategy based on data size, and uses optimized distance kernels for fast inference.
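To illustrate how learned feature importance can change which neighbors count as "near", here is a minimal sketch of a feature-weighted Euclidean distance. The function name `weighted_euclidean` and the weight values are assumptions made for this example; the card does not describe how SmartKNN v2 learns its weights or what its kernels look like internally.

```python
import math

def weighted_euclidean(a, b, weights):
    """Euclidean distance with each squared feature difference scaled by a weight."""
    if not (len(a) == len(b) == len(weights)):
        raise ValueError("a, b and weights must have the same length")
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

# Hypothetical weights: feature 0 is informative, feature 1 is mostly noise.
weights = [1.0, 0.01]
query = [0.0, 0.0]
near_on_important = [0.1, 5.0]  # close on feature 0, far on the noisy feature
far_on_important = [2.0, 0.0]   # far on feature 0, identical on the noisy feature

d_near = weighted_euclidean(query, near_on_important, weights)
d_far = weighted_euclidean(query, far_on_important, weights)

# With uniform weights the ordering flips, because the noisy feature dominates.
d_near_plain = weighted_euclidean(query, near_on_important, [1.0, 1.0])
d_far_plain = weighted_euclidean(query, far_on_important, [1.0, 1.0])
```

With these illustrative weights, the point that agrees on the informative feature ranks as the closer neighbor, whereas plain Euclidean distance would rank it farther away.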

What’s New in v2

Full classification support restored
ANN backend introduced for scalable neighbor search
Automatic backend selection (small → brute, large → ANN)
Distance-weighted voting for improved accuracy
Interpretable neighbor influence statistics
Foundation for adaptive-K strategies
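Distance-weighted voting and neighbor influence can be sketched as follows. This is a generic inverse-distance scheme, not necessarily the weighting SmartKNN v2 uses; the names `weighted_vote` and `weighted_mean`, the `eps` smoothing term, and the shape of the influence dictionary are all assumptions for illustration.

```python
from collections import defaultdict

def weighted_vote(neighbors, eps=1e-9):
    """Classification: neighbors is a list of (distance, label) pairs.

    Returns the winning label and each label's share of the total vote weight.
    """
    scores = defaultdict(float)
    for dist, label in neighbors:
        scores[label] += 1.0 / (dist + eps)  # closer neighbors count more
    total = sum(scores.values())
    influence = {label: score / total for label, score in scores.items()}
    return max(scores, key=scores.get), influence

def weighted_mean(neighbors, eps=1e-9):
    """Regression: neighbors is a list of (distance, target) pairs."""
    weights = [1.0 / (dist + eps) for dist, _ in neighbors]
    return sum(w * y for w, (_, y) in zip(weights, neighbors)) / sum(weights)

# One very close "a" outweighs two more distant "b" neighbors.
label, influence = weighted_vote([(0.1, "a"), (1.0, "b"), (1.0, "b")])
estimate = weighted_mean([(0.5, 10.0), (0.5, 20.0)])
```

A plain majority vote would pick "b" in this example; inverse-distance weighting picks "a", and the influence shares give a simple per-label explanation of why.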

Architecture Overview

Feature Weighting
→ Backend Selector
→ Brute Backend (small datasets) or ANN Backend (large datasets)
→ Distance Kernel
→ Weighted Voting
→ Prediction

This hybrid architecture is designed to keep latency consistently low across dataset sizes.
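The backend selection step can be pictured with a small sketch. The cutoff `ANN_THRESHOLD` is a made-up value, since the card does not say where the switch from brute force to ANN happens, and `select_backend` and `brute_force_neighbors` are hypothetical names rather than part of the library. Only the exact (brute-force) path is shown; an ANN index is outside the scope of this example.

```python
import heapq

# Hypothetical cutoff; the real switching rule is not documented here.
ANN_THRESHOLD = 50_000

def select_backend(n_samples, threshold=ANN_THRESHOLD):
    """Pick exact search for small training sets, approximate search for large ones."""
    return "brute" if n_samples < threshold else "ann"

def squared_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def brute_force_neighbors(query, rows, k, distance):
    """Exact search: score every row and keep the k smallest (distance, index) pairs."""
    scored = ((distance(query, row), idx) for idx, row in enumerate(rows))
    return heapq.nsmallest(k, scored)

rows = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.2, 0.1]]
backend = select_backend(len(rows))
neighbors = brute_force_neighbors([0.0, 0.0], rows, k=2, distance=squared_euclidean)
neighbor_ids = [idx for _, idx in neighbors]
```

Brute force scans every row per query, so its cost grows linearly with the training set; that is the usual motivation for handing large datasets to an approximate index instead.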

Performance (Internal Evaluation)

Public benchmarks will be released soon.

From internal testing on real-world tabular datasets:

Accuracy comparable to XGBoost / LightGBM / CatBoost
Single-prediction latency:
- Median: sub-millisecond
- p95: consistently low on CPU
Predictable batch inference scaling
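Until public benchmarks are available, median and p95 single-prediction latency can be measured locally with a harness along these lines. This is a sketch under assumptions: `measure_latency_ms` is a hypothetical helper, and `dummy_predict` is a stand-in for whatever single-row prediction call the model under test exposes.

```python
import statistics
import time

def measure_latency_ms(predict_one, queries, warmup=10):
    """Time one prediction per query and report median and p95 in milliseconds."""
    for q in queries[:warmup]:
        predict_one(q)  # warm caches before timing
    samples = []
    for q in queries:
        start = time.perf_counter()
        predict_one(q)
        samples.append((time.perf_counter() - start) * 1000.0)
    cuts = statistics.quantiles(samples, n=100)  # 99 percentile cut points
    return {"median_ms": statistics.median(samples), "p95_ms": cuts[94]}

def dummy_predict(row):
    # Stand-in predictor; replace with the model's single-row predict call.
    return sum(row)

queries = [[float(i), 1.0] for i in range(200)]
stats = measure_latency_ms(dummy_predict, queries)
```

Reporting p95 alongside the median matters for production use, because tail latency rather than typical latency usually determines whether a service meets its response-time budget.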

SmartKNN v2 has not yet reached its performance ceiling. Future releases will further optimize speed and accuracy.

Limitations

Not designed for unstructured data (text, images)
ANN backend focuses on CPU efficiency, not GPU acceleration
Best suited for tabular datasets

Future Work

Adaptive-K accuracy optimization
Kernel-level speed improvements
Custom ANN backend

JashuXo/SmartKNN_v2