# NeuronSpark-0.9B-Chat

## Introduction
NeuronSpark-0.9B-Chat is the instruction-tuned chat version of NeuronSpark-0.9B, a 0.87-billion-parameter language model built entirely on Spiking Neural Networks (SNNs). It was fine-tuned on a small subset of the BelleGroup train_3.5M_CN Chinese instruction dataset to enable basic dialogue capabilities.
**Note on training data:** Due to limited compute resources, both pretraining and SFT used only small subsets of their respective datasets (pretraining: ~1.4B of ~10B tokens; SFT: ~6.5K steps over a dataset of ~3.5M samples). Despite this minimal data budget, the model produces coherent Chinese dialogue, validating that pure SNN architectures can learn language from scratch. We plan to scale training with more data and compute in future work.
For the pretrained base model, see NeuronSpark-0.9B.
## Model Details
| Attribute | Value |
|---|---|
| Parameters | 874M |
| Architecture | SNN Hidden State Space Model |
| Hidden Dimension (D) | 896 |
| Layers | 20 |
| SNN Timesteps (K) | 16 (PonderNet adaptive) |
| State Expansion (N) | 8 |
| FFN Dimension | 2688 |
| Vocabulary | 6144 (custom BPE) |
| Context Length | 512 tokens |
| Base Model | NeuronSpark-0.9B (pretrained 85K steps) |
| SFT Data | BelleGroup train_3.5M_CN |
| SFT Steps | 6,500 |
| Chat Template | ChatML |
| License | Apache 2.0 |
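The table lists ChatML as the chat template. As a rough illustration of what that format looks like, here is a minimal sketch that builds the turn markup by hand; the function name `to_chatml` is illustrative, and the template actually shipped with the tokenizer (applied via `apply_chat_template` in the Quickstart below) may differ in details:

```python
def to_chatml(messages, add_generation_prompt=True):
    """Format a message list in ChatML style (illustrative sketch)."""
    # Each turn becomes "<|im_start|>{role}\n{content}<|im_end|>\n"
    text = "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here
        text += "<|im_start|>assistant\n"
    return text

messages = [
    {"role": "system", "content": "你是一个AI助手"},       # "You are an AI assistant"
    {"role": "user", "content": "中国的首都是哪里?"},      # "What is the capital of China?"
]
print(to_chatml(messages))
```

In practice you should always use `tokenizer.apply_chat_template` rather than hand-rolled formatting, so the prompt exactly matches what the model saw during SFT.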
## Quickstart
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Brain2nd/NeuronSpark-0.9B-Chat",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("Brain2nd/NeuronSpark-0.9B-Chat")

# Chat
messages = [
    {"role": "system", "content": "你是一个AI助手"},       # "You are an AI assistant"
    {"role": "user", "content": "中国的首都是哪里?"},      # "What is the capital of China?"
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
input_ids = tokenizer(text, return_tensors="pt")["input_ids"]

output_ids = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,  # required for temperature/top_k to take effect
    temperature=0.1,
    top_k=10,
    eos_token_id=tokenizer.eos_token_id,
)

# Extract assistant response
full_text = tokenizer.decode(output_ids[0], skip_special_tokens=False)
response = full_text.split("assistant\n")[-1].replace("<|im_end|>", "").strip()
print(response)
```
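The string split at the end of the quickstart can be made a little more robust by keying on the full ChatML markers instead of the bare word `assistant`. A small helper along these lines (the name `extract_assistant_reply` is illustrative, and it assumes the standard `<|im_start|>`/`<|im_end|>` markers):

```python
def extract_assistant_reply(full_text: str) -> str:
    """Pull the last assistant turn out of decoded ChatML text.

    Illustrative sketch: assumes the ChatML markers shown in the
    quickstart; not part of the released tokenizer API.
    """
    # Keep everything after the last opened assistant turn...
    reply = full_text.split("<|im_start|>assistant")[-1]
    # ...and drop anything after the turn's end marker.
    reply = reply.split("<|im_end|>")[0]
    return reply.strip()
```

This avoids false splits if the word "assistant" happens to appear inside the generated text itself.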
Example Output (English glosses in parentheses):

```text
Q: 中国的首都是哪里?        (What is the capital of China?)
A: 中国的首都在北京。        (China's capital is Beijing.)

Q: 你好呀                   (Hi there)
A: 请问您需要什么样的帮助?   (What kind of help do you need?)
```
## Architecture Highlights
- Pure SNN: No attention, no standard MLP — all computation via PLIF (Parametric Leaky Integrate-and-Fire) neurons
- Membrane Potential Leakage Activation: PLIFNode outputs the leak current (1-β)·V_post, naturally emphasizing fast-responding neurons
- Selective State Space: Hidden neurons with input-dependent dynamic β(t), α(t), V_th(t)
- PonderNet Adaptive K: Each token dynamically decides how many SNN timesteps to use
- Triton Fused Kernels: Custom PLIF forward/backward kernels for efficient parallel scan
- ChatML Template: Compatible with standard chat formatting
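The leakage-activation idea above can be illustrated with a schematic single-neuron update. This is a hedged sketch only: the function name `plif_step`, the soft reset, and the fixed scalar β are illustrative assumptions, not the released Triton kernels (in the real model β is learnable and input-dependent):

```python
def plif_step(v, x, beta, v_th=1.0):
    """One schematic PLIF timestep (illustrative, not the released kernel).

    v:    membrane potential carried across timesteps
    x:    input current at this timestep
    beta: leak factor in (0, 1); learnable in the real model
    """
    v = beta * v + x                   # leaky integration
    spike = 1.0 if v >= v_th else 0.0  # fire when the threshold is crossed
    leak_out = (1.0 - beta) * v        # membrane-potential leakage activation
    v = v - spike * v_th               # soft reset after a spike
    return v, spike, leak_out

# Drive one neuron for K = 16 timesteps with a constant input current
v = 0.0
for t in range(16):
    v, spike, out = plif_step(v, x=0.3, beta=0.9)
```

A smaller β leaks faster, so the (1-β)·V output weights fast-responding neurons more heavily, which is the intuition behind the "emphasizing fast-responding neurons" claim.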
## Requirements

```bash
pip install torch transformers spikingjelly safetensors
```
## Limitations
- Context length: 512 tokens (limited by training configuration)
- Knowledge: Trained on Chinese corpus only; limited factual accuracy
- Repetition: May generate repetitive text for complex queries
- Scale: 0.9B parameters — significantly smaller than state-of-the-art chat models
This is a research model demonstrating that SNN architectures can achieve basic language understanding and dialogue, even with very limited training data. It is not intended for production use. We plan to continue scaling with more data and compute.
## Citation

```bibtex
@misc{neuronspark2025,
      title={NeuronSpark: A Spiking Neural Network Language Model with Selective State Space Dynamics},
      author={Zhengzheng Tang},
      year={2025},
      url={https://github.com/Brain2nd/NeuronSpark}
}
```
## Contact
- Author: Zhengzheng Tang
- Email: zztangbu@bu.edu
- GitHub: Brain2nd/NeuronSpark