# NeuronSpark-0.9B-Chat

## Introduction
NeuronSpark-0.9B-Chat is the instruction-tuned chat version of NeuronSpark-0.9B, a 0.87-billion-parameter language model built entirely on Spiking Neural Networks (SNNs). It was fine-tuned on a small subset of the BelleGroup train_3.5M_CN Chinese instruction dataset to enable basic dialogue capabilities.
**Note on training data:** Due to limited compute resources, both pretraining and SFT used only small subsets of their respective datasets (pretraining: ~1.4B of ~10B tokens; SFT: ~6.5K steps over a dataset of ~3.5M samples). Despite this minimal data budget, the model produces coherent Chinese dialogue, validating that pure SNN architectures can learn language from scratch. We plan to scale training with more data and compute in future work.
For the pretrained base model, see NeuronSpark-0.9B.
## Model Details
| Attribute | Value |
|---|---|
| Parameters | 874M |
| Architecture | SNN Hidden State Space Model |
| Hidden Dimension (D) | 896 |
| Layers | 20 |
| SNN Timesteps (K) | 16 (PonderNet adaptive) |
| State Expansion (N) | 8 |
| FFN Dimension | 2688 |
| Vocabulary | 6144 (custom BPE) |
| Context Length | 512 tokens |
| Base Model | NeuronSpark-0.9B (pretrained 85K steps) |
| SFT Data | BelleGroup train_3.5M_CN |
| SFT Steps | 6,500 |
| Chat Template | ChatML |
| License | Apache 2.0 |
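The table lists ChatML as the chat template. As a rough illustration of what that format looks like, here is a minimal sketch that builds the turn markup by hand; the function name `to_chatml` is illustrative, and the template actually shipped with the tokenizer (applied via `apply_chat_template` in the Quickstart below) may differ in details:

```python
def to_chatml(messages, add_generation_prompt=True):
    """Format a message list in ChatML style (illustrative sketch)."""
    # Each turn becomes "<|im_start|>{role}\n{content}<|im_end|>\n"
    text = "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here
        text += "<|im_start|>assistant\n"
    return text

messages = [
    {"role": "system", "content": "你是一个AI助手"},       # "You are an AI assistant"
    {"role": "user", "content": "中国的首都是哪里?"},      # "What is the capital of China?"
]
print(to_chatml(messages))
```

In practice you should always use `tokenizer.apply_chat_template` rather than hand-rolled formatting, so the prompt exactly matches what the model saw during SFT.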
## Quickstart
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Brain2nd/NeuronSpark-0.9B-Chat",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("Brain2nd/NeuronSpark-0.9B-Chat")

# Chat
messages = [
    {"role": "system", "content": "你是一个AI助手"},       # "You are an AI assistant"
    {"role": "user", "content": "中国的首都是哪里?"},      # "What is the capital of China?"
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
input_ids = tokenizer(text, return_tensors="pt")["input_ids"]

output_ids = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,  # required for temperature/top_k to take effect
    temperature=0.1,
    top_k=10,
    eos_token_id=tokenizer.eos_token_id,
)

# Extract assistant response
full_text = tokenizer.decode(output_ids[0], skip_special_tokens=False)
response = full_text.split("assistant\n")[-1].replace("<|im_end|>", "").strip()
print(response)
```
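The string split at the end of the quickstart can be made a little more robust by keying on the full ChatML markers instead of the bare word `assistant`. A small helper along these lines (the name `extract_assistant_reply` is illustrative, and it assumes the standard `<|im_start|>`/`<|im_end|>` markers):

```python
def extract_assistant_reply(full_text: str) -> str:
    """Pull the last assistant turn out of decoded ChatML text.

    Illustrative sketch: assumes the ChatML markers shown in the
    quickstart; not part of the released tokenizer API.
    """
    # Keep everything after the last opened assistant turn...
    reply = full_text.split("<|im_start|>assistant")[-1]
    # ...and drop anything after the turn's end marker.
    reply = reply.split("<|im_end|>")[0]
    return reply.strip()
```

This avoids false splits if the word "assistant" happens to appear inside the generated text itself.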
Example Output (English glosses in parentheses):

```text
Q: 中国的首都是哪里?        (What is the capital of China?)
A: 中国的首都在北京。        (China's capital is Beijing.)

Q: 你好呀                   (Hi there)
A: 请问您需要什么样的帮助?   (What kind of help do you need?)
```
## Architecture Highlights
- Pure SNN: No attention, no standard MLP — all computation via PLIF (Parametric Leaky Integrate-and-Fire) neurons
- Membrane Potential Leakage Activation: PLIFNode outputs the leak current (1-β)·V_post, naturally emphasizing fast-responding neurons
- Selective State Space: Hidden neurons with input-dependent dynamic β(t), α(t), V_th(t)
- PonderNet Adaptive K: Each token dynamically decides how many SNN timesteps to use
- Triton Fused Kernels: Custom PLIF forward/backward kernels for efficient parallel scan
- ChatML Template: Compatible with standard chat formatting
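The leakage-activation idea above can be illustrated with a schematic single-neuron update. This is a hedged sketch only: the function name `plif_step`, the soft reset, and the fixed scalar β are illustrative assumptions, not the released Triton kernels (in the real model β is learnable and input-dependent):

```python
def plif_step(v, x, beta, v_th=1.0):
    """One schematic PLIF timestep (illustrative, not the released kernel).

    v:    membrane potential carried across timesteps
    x:    input current at this timestep
    beta: leak factor in (0, 1); learnable in the real model
    """
    v = beta * v + x                   # leaky integration
    spike = 1.0 if v >= v_th else 0.0  # fire when the threshold is crossed
    leak_out = (1.0 - beta) * v        # membrane-potential leakage activation
    v = v - spike * v_th               # soft reset after a spike
    return v, spike, leak_out

# Drive one neuron for K = 16 timesteps with a constant input current
v = 0.0
for t in range(16):
    v, spike, out = plif_step(v, x=0.3, beta=0.9)
```

A smaller β leaks faster, so the (1-β)·V output weights fast-responding neurons more heavily, which is the intuition behind the "emphasizing fast-responding neurons" claim.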
## Requirements

```bash
pip install torch transformers spikingjelly safetensors
```
## Limitations
- Context length: 512 tokens (limited by training configuration)
- Knowledge: Trained on Chinese corpus only; limited factual accuracy
- Repetition: May generate repetitive text for complex queries
- Scale: 0.9B parameters — significantly smaller than state-of-the-art chat models
This is a research model demonstrating that SNN architectures can achieve basic language understanding and dialogue, even with very limited training data. It is not intended for production use. We plan to continue scaling with more data and compute.
## Citation

```bibtex
@misc{neuronspark2025,
      title={NeuronSpark: A Spiking Neural Network Language Model with Selective State Space Dynamics},
      author={Zhengzheng Tang},
      year={2025},
      url={https://github.com/Brain2nd/NeuronSpark}
}
```
## Contact
- Author: Zhengzheng Tang
- Email: zztangbu@bu.edu
- GitHub: Brain2nd/NeuronSpark