NeuronSpark-0.9B

Introduction

NeuronSpark-0.9B is a 0.87-billion-parameter language model built entirely on Spiking Neural Networks (SNNs). Unlike conventional Transformer-based LLMs, which rely on attention mechanisms, NeuronSpark replaces the entire computation backbone with biologically inspired spiking neurons. It performs language modeling through membrane potential dynamics, surrogate gradient training, and adaptive computation (PonderNet).

This is the pretrained base model (85,000 steps on a small subset of the Seq-Monkey corpus).

Note on training data: Because of limited compute resources (a single DGX Spark), this model was trained for only ~85K steps on a small fraction (~1.4B tokens) of the full ~10B-token Seq-Monkey corpus. Even with this little data, the model acquires basic language generation ability, which supports the architectural viability of pure SNN language models. We plan to continue scaling with more data and compute in future work.

For the instruction-tuned chat version, see NeuronSpark-0.9B-Chat.

Model Details

| Attribute | Value |
| --- | --- |
| Parameters | 874M |
| Architecture | SNN Hidden State Space Model |
| Hidden Dimension (D) | 896 |
| Layers | 20 |
| SNN Timesteps (K) | 16 (PonderNet adaptive) |
| State Expansion (N) | 8 |
| FFN Dimension | 2688 |
| Vocabulary | 6144 (custom BPE) |
| Context Length | 512 tokens |
| Training Data | Seq-Monkey (small subset, Chinese) |
| Training Tokens | ~1.4B (of ~10B available) |
| Precision | bfloat16 |
| License | Apache 2.0 |

Architecture Highlights

  • Pure SNN: No attention, no standard MLP — all computation via PLIF (Parametric Leaky Integrate-and-Fire) neurons
  • Membrane Potential Leakage Activation: PLIFNode outputs (1-β)·V_post (leak current), naturally emphasizing fast-responding neurons over slow-memory neurons
  • Selective State Space: Hidden neurons with input-dependent dynamic β(t), α(t), V_th(t) — analogous to selective state space models (Mamba)
  • PonderNet Adaptive K: Each token dynamically decides how many SNN timesteps to use (from 1 to K), with geometric distribution weighting
  • Triton Fused Kernels: Custom PLIF forward/backward kernels, with a single-pass sequential scan replacing a 3-phase approach
  • Pre-LN Residual Stream: Continuous residual flow with RMSNorm, matching the Qwen3/LLaMA architecture pattern
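
The leak-current activation and the adaptive-K weighting listed above can be illustrated with a small scalar sketch. This is not the model's implementation: the integration rule `beta * v + (1 - beta) * x`, the soft reset, the constant halting probability, and the names `plif_step`, `ponder_weights`, and `run_token` are all assumptions made for illustration. Only the output form (1-β)·V_post and the geometric weighting over up to K timesteps come from the description above.

```python
def plif_step(v_prev, x, beta, v_th):
    """One PLIF timestep for a single neuron (assumed update rule)."""
    v_pre = beta * v_prev + (1.0 - beta) * x   # leaky integration (assumption)
    spike = 1.0 if v_pre >= v_th else 0.0      # fire when the threshold is reached
    v_post = v_pre - spike * v_th              # soft reset after a spike (assumption)
    leak_out = (1.0 - beta) * v_post           # leak-current activation: (1 - beta) * V_post
    return v_post, leak_out, spike


def ponder_weights(halt_probs):
    """Halting weights p_k = lam_k * prod_{j<k} (1 - lam_j).

    The final step absorbs the remaining probability mass so the
    weights always sum to 1.
    """
    weights, remaining = [], 1.0
    for lam in halt_probs[:-1]:
        weights.append(remaining * lam)
        remaining *= 1.0 - lam
    weights.append(remaining)
    return weights


def run_token(x, beta, v_th, halt_probs):
    """Run len(halt_probs) timesteps on a constant input and mix the leak outputs."""
    v, outputs = 0.0, []
    for _ in halt_probs:
        v, out, _ = plif_step(v, x, beta, v_th)
        outputs.append(out)
    weights = ponder_weights(halt_probs)
    return sum(w * o for w, o in zip(weights, outputs))


if __name__ == "__main__":
    lam = [0.3] * 16  # illustrative constant halting probability, K = 16
    fast = run_token(x=0.5, beta=0.2, v_th=1.0, halt_probs=lam)
    slow = run_token(x=0.5, beta=0.95, v_th=1.0, halt_probs=lam)
    print(f"fast neuron (beta=0.2):  {fast:.4f}")
    print(f"slow neuron (beta=0.95): {slow:.4f}")
```

Under these assumptions, the neuron with the smaller β produces the larger weighted output for the same input, which is the sense in which the leak activation favors fast-responding neurons. In the model, β and V_th are input-dependent and the halting probabilities are learned per token; they are held constant here only to keep the sketch short.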

Quickstart

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Brain2nd/NeuronSpark-0.9B",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("Brain2nd/NeuronSpark-0.9B")

# Text completion
text = f"{tokenizer.bos_token}人工智能的发展"
input_ids = tokenizer(text, return_tensors="pt")["input_ids"]

output_ids = model.generate(
    input_ids,
    max_new_tokens=128,
    temperature=0.8,
    top_k=50,
    eos_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

Example Output:

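人工智能的发展,为人类的未来发展提供了新的机遇。在未来,人工智能将是未来人工智能发展的重要方向。

(English gloss: "The development of artificial intelligence has provided new opportunities for humanity's future development. In the future, artificial intelligence will be an important direction for the future development of artificial intelligence.")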
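
With a 512-token context length and `max_new_tokens=128`, a long prompt may need trimming before generation. The helper below is a minimal sketch that assumes the context length covers prompt and generated tokens together, and that the first token (the BOS token prepended in the Quickstart) should be kept. `fit_prompt` is a hypothetical name and is not part of the model's code.

```python
def fit_prompt(input_ids, max_new_tokens, context_length=512, keep_first=True):
    """Trim a list of token ids so prompt + generation fits in the context.

    Keeps the most recent tokens; with keep_first=True the first token
    (assumed to be BOS) is preserved as well.
    """
    budget = context_length - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens leaves no room for a prompt")
    if len(input_ids) <= budget:
        return list(input_ids)
    if keep_first:
        # Reserve one slot for the first token, fill the rest from the tail.
        tail = input_ids[len(input_ids) - (budget - 1):]
        return [input_ids[0]] + list(tail)
    return list(input_ids[len(input_ids) - budget:])


if __name__ == "__main__":
    ids = list(range(600))  # stand-in for a long tokenized prompt
    trimmed = fit_prompt(ids, max_new_tokens=128)
    print(len(trimmed), trimmed[0], trimmed[-1])
```

To use it with the Quickstart, convert `input_ids[0].tolist()` to a list, trim it, and wrap the result back into a tensor of shape `(1, n)` before calling `generate`.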

Requirements

pip install torch transformers spikingjelly safetensors
# For Triton kernels (GPU): pip install triton

Training

Trained on a single NVIDIA DGX Spark (GB10, 128GB unified memory) with 4-GPU DDP. Due to compute constraints, training used only a small subset of the full corpus (~85K steps, ~1.4B tokens of ~10B available). Even with this limited data budget, the model acquires basic language generation ability, demonstrating the architectural viability of pure SNN language modeling.

torchrun --nproc_per_node=4 train_ddp.py \
    --D 896 --D_ff 2688 --K 16 --num_layers 20 \
    --batch_size 8 --accumulation_steps 8 \
    --learning_rate 2e-4 --warmup_iters 1000
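
As a sanity check on the reported token budget, the flags above can be combined with the 512-token context length. This is a back-of-the-envelope sketch, not taken from the training code. It assumes every sequence is full length and that a "step" processes one micro-batch per process, without multiplying by the accumulation factor; `tokens_seen` is a hypothetical helper.

```python
def tokens_seen(steps, per_device_batch, world_size, seq_len, accumulation_steps=1):
    """Upper bound on tokens processed, assuming every sequence is full length."""
    return steps * per_device_batch * world_size * seq_len * accumulation_steps


STEPS = 85_000   # reported training steps
BATCH = 8        # --batch_size
WORLD = 4        # --nproc_per_node
SEQ_LEN = 512    # context length from the model details

# Assumption A: a step is one micro-batch per process.
per_micro_step = tokens_seen(STEPS, BATCH, WORLD, SEQ_LEN)

# Assumption B: a step is one optimizer update (8 accumulated micro-batches).
per_optimizer_step = tokens_seen(STEPS, BATCH, WORLD, SEQ_LEN, accumulation_steps=8)

if __name__ == "__main__":
    print(f"A: {per_micro_step / 1e9:.2f}B tokens")
    print(f"B: {per_optimizer_step / 1e9:.2f}B tokens")
```

Assumption A gives about 1.39B tokens, which matches the reported ~1.4B. Assumption B gives about 11.1B, more than the ~10B-token corpus, so the reported figures appear to count micro-batch steps. This is an inference from the published numbers rather than something the source states.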

Citation

@misc{neuronspark2025,
    title={NeuronSpark: A Spiking Neural Network Language Model with Selective State Space Dynamics},
    author={Zhengzheng Tang},
    year={2025},
    url={https://github.com/Brain2nd/NeuronSpark}
}

Contact
