WavesFM v1.0

WavesFM is a multimodal wireless foundation model that processes raw IQ streams and image-like wireless modalities (spectrograms and CSI). It uses a single Vision Transformer (ViT) backbone with modality-specific input embeddings and a masked wireless modeling pretraining objective. The model is evaluated across RF fingerprinting, interference detection/classification, human activity sensing, RF signal classification, and 5G NR positioning.

This card summarizes the model as described in:
"Multimodal Wireless Foundation Models" (Aboulfotouh, Abou-Zeid), arXiv:2511.15162.

Model details

  • Architecture: ViT encoder with modality-specific input embeddings and lightweight task heads.
  • Modalities: raw IQ streams and image-like inputs (spectrograms, CSI).
  • Encoder config (paper): patch size 16x16, IQ segment length 16, 8 blocks, embed dim 256.
  • Params (paper): ~6.32M encoder, ~0.79M decoder (decoder used only in pretraining).
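
The encoder size above can be sanity-checked with simple arithmetic. The sketch below counts the parameters of a standard ViT block, assuming an MLP expansion ratio of 4 and bias terms on every linear layer. Neither assumption is stated in this card, and the helper name `vit_block_params` is hypothetical. Input embeddings, positional embeddings, and task heads are left out.

```python
def vit_block_params(dim: int, mlp_ratio: int = 4) -> int:
    """Parameter count of one standard ViT block (assumed layout, not from the card)."""
    attn = (3 * dim * dim + 3 * dim) + (dim * dim + dim)  # QKV + output projection
    hidden = mlp_ratio * dim                              # assumed MLP width
    mlp = (dim * hidden + hidden) + (hidden * dim + dim)  # two linear layers
    norms = 2 * (2 * dim)                                 # two LayerNorms (scale + shift)
    return attn + mlp + norms


EMBED_DIM = 256  # from the card
NUM_BLOCKS = 8   # from the card

per_block = vit_block_params(EMBED_DIM)
encoder_blocks_total = NUM_BLOCKS * per_block
print(f"per block: {per_block / 1e6:.2f}M, "
      f"{NUM_BLOCKS} blocks: {encoder_blocks_total / 1e6:.2f}M")
```

Under these assumptions the eight blocks come to roughly 6.32M parameters, close to the reported encoder size. A single block of the same width comes to roughly 0.79M, which is close to the reported decoder size, although the card does not state the decoder's depth or width.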

Pretraining

  • Objective: masked wireless modeling (MAE-style reconstruction).
  • Datasets:
    • Spectrogram dataset: 3,200 samples from over-the-air SDR captures across multiple signal types (WiFi, LTE, Bluetooth, 5G-NR, ISM).
    • IQ dataset: 3,200 samples from a 4-antenna MIMO indoor testbed with varied modulations/technologies and TX/RX configurations.
  • Setup (paper): 800 epochs, 40-epoch warmup, 70% masking ratio for both modalities, Adam with lr 1e-3 and cosine annealing.
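
The masking ratio and learning-rate schedule above can be sketched in a few lines. This is a minimal illustration, not the authors' training code. It assumes uniform random token masking, a linear warmup shape, and a cosine decay that ends at zero, none of which are specified in this card. The function names `random_mask` and `lr_at_epoch` are hypothetical.

```python
import math
import random


def random_mask(num_tokens: int, mask_ratio: float = 0.7, seed: int = 0) -> list[bool]:
    """Per-token flags: True means the token is masked and must be reconstructed."""
    num_masked = round(num_tokens * mask_ratio)
    rng = random.Random(seed)
    masked_ids = set(rng.sample(range(num_tokens), num_masked))
    return [i in masked_ids for i in range(num_tokens)]


def lr_at_epoch(epoch: int, base_lr: float = 1e-3,
                warmup: int = 40, total: int = 800) -> float:
    """Linear warmup (assumed shape), then cosine annealing to zero (assumed floor)."""
    if epoch < warmup:
        return base_lr * (epoch + 1) / warmup
    progress = (epoch - warmup) / (total - warmup)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


mask = random_mask(num_tokens=100)  # 100 tokens is an arbitrary example size
print(sum(mask), "of", len(mask), "tokens masked")
print([f"{lr_at_epoch(e):.2e}" for e in (0, 39, 40, 420, 799)])
```

With a 70% ratio the encoder sees only about 30% of the tokens per sample, which is what keeps MAE-style pretraining cheap relative to processing the full sequence.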

Fine-tuning regimes

  • LP (Linear probing): encoder frozen; train task head + input projections.
  • FT2 (Partial fine-tuning): last 2 encoder blocks unfrozen.
  • LoRA: rank 32, alpha 32, adapters in attention projections; encoder frozen.
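
To make the LoRA setting concrete, the sketch below applies the usual low-rank update, y = W x + (alpha / rank) · B (A x), in plain Python. It is an illustration under assumptions: the card does not say which attention projections carry adapters or how they are initialized, so the zero initialization of B follows the common LoRA convention, and all function names are hypothetical.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_forward(W, A, B, x, rank, alpha):
    """y = W x + (alpha / rank) * B (A x); W stays frozen, only A and B train."""
    scale = alpha / rank
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]


def lora_param_count(d_in, d_out, rank):
    """Trainable parameters added by one adapter pair (A: rank x d_in, B: d_out x rank)."""
    return rank * (d_in + d_out)


RANK, ALPHA, DIM = 32, 32, 256  # values from the card
lora_scale = ALPHA / RANK
extra_per_projection = lora_param_count(DIM, DIM, RANK)
print("scale:", lora_scale, "| extra params per 256x256 projection:", extra_per_projection)

# Tiny rank-2 demonstration with made-up numbers.
W = [[1.0, 2.0, 0.0], [0.0, 1.0, -1.0]]  # frozen weight, 2 x 3
A = [[0.5, 0.0, 0.5], [0.0, 1.0, 0.0]]   # rank x d_in
B_zero = [[0.0, 0.0], [0.0, 0.0]]        # d_out x rank, zero-initialized
B_trained = [[1.0, 0.0], [0.0, 1.0]]     # stand-in for B after some training
x = [1.0, 2.0, 3.0]

y_init = lora_forward(W, A, B_zero, x, rank=2, alpha=2)
y_adapted = lora_forward(W, A, B_trained, x, rank=2, alpha=2)
print(y_init, y_adapted)
```

Because alpha equals rank here, the scaling factor is 1, so the adapter output is added to the frozen projection without rescaling. With B at zero the adapted layer reproduces the frozen layer exactly.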

Downstream tasks

  • RF Fingerprinting (RFP) - IQ, device identification (mean per-class accuracy).
  • Interference Detection (INTD) / Classification (INTC) - IQ (mean per-class accuracy).
  • Human Activity Sensing (HAS) - CSI (mean per-class accuracy).
  • RF Signal Classification (RFS) - spectrograms (mean per-class accuracy).
  • 5G NR Positioning (POS) - CSI (mean localization error in meters).
  • DeepMIMO LOS/NLOS Classification - CSI (mean per-class accuracy).
  • DeepMIMO Beam Prediction - CSI (mean per-class accuracy).
  • RADCOM Signal & Modulation Classification - IQ (mean per-class accuracy).
  • UWB Indoor Positioning and Tracking - CIR (mean position error).
  • UWB Industrial Localization - CIR (mean position error).
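
The two metric families listed above can be sketched as follows. The card does not give formulas, so this assumes the common definitions: mean per-class accuracy as the unweighted average of per-class recall, and position error as the mean Euclidean distance between true and predicted coordinates. The function names and the sample data are hypothetical.

```python
import math
from collections import defaultdict


def mean_per_class_accuracy(y_true, y_pred):
    """Average of per-class recall, so every class counts equally."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)


def mean_position_error(true_xy, pred_xy):
    """Mean Euclidean distance between true and predicted positions (same units as input)."""
    return sum(math.dist(t, p) for t, p in zip(true_xy, pred_xy)) / len(true_xy)


# Made-up example: class 0 has three samples, class 1 has one.
labels = [0, 0, 0, 1]
preds = [0, 0, 1, 1]
acc = mean_per_class_accuracy(labels, preds)

# Made-up positions in meters.
err = mean_position_error([(0.0, 0.0), (1.0, 1.0)], [(3.0, 4.0), (1.0, 1.0)])
print(f"mean per-class accuracy: {acc:.3f}, mean position error: {err:.2f} m")
```

Averaging per class rather than per sample keeps a majority class from dominating the score, which matters when device or activity classes are imbalanced.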

For up-to-date reproduction commands and dataset protocols, see:

  • Benchmarks: to be added
  • Reproduce: to be added