---
language: en
license: mit
library_name: transformers
tags:
  - text-generation
  - shakespeare
  - transformer
  - pytorch
pipeline_tag: text-generation
model_type: kimi-k2
---

# nanokimi-mini

This repository contains the nanoKimi model pre-trained on the Shakespeare dataset. An upgraded version of nanoKimi trained on OpenWebText will be uploaded to Hugging Face in the coming days.

## Model Details

- **Architecture:** 12 layers, 12 attention heads, 768 embedding dimensions
- **Training Data:** Shakespeare dataset
- **Features:** Mixture of Experts (8 experts), latent attention
- **Model Type:** Kimi-K2
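To illustrate the Mixture-of-Experts feature listed above, here is a minimal sketch of a top-1 gated MoE feed-forward block with 8 experts and a 768-dimensional embedding. This is an illustrative example only, not the implementation in `src/`; the class name, top-1 routing, and 4x hidden expansion are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoEFeedForward(nn.Module):
    """Illustrative top-1 gated mixture-of-experts feed-forward block."""

    def __init__(self, n_embd=768, n_experts=8, hidden_mult=4):
        super().__init__()
        # Gate scores each token against every expert
        self.gate = nn.Linear(n_embd, n_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(n_embd, hidden_mult * n_embd),
                nn.GELU(),
                nn.Linear(hidden_mult * n_embd, n_embd),
            )
            for _ in range(n_experts)
        ])

    def forward(self, x):
        # x: (batch, seq, n_embd) -> flatten to one row per token
        b, t, c = x.shape
        flat = x.reshape(-1, c)
        scores = F.softmax(self.gate(flat), dim=-1)   # (tokens, n_experts)
        top_w, top_idx = scores.max(dim=-1)           # top-1 routing decision
        out = torch.zeros_like(flat)
        for i, expert in enumerate(self.experts):
            mask = top_idx == i
            if mask.any():
                # Each token is processed only by its selected expert,
                # scaled by the gate weight
                out[mask] = top_w[mask, None] * expert(flat[mask])
        return out.reshape(b, t, c)

moe = MoEFeedForward()
y = moe(torch.randn(2, 16, 768))
print(y.shape)  # torch.Size([2, 16, 768])
```

Only one expert runs per token, so capacity grows with the expert count while per-token compute stays roughly constant.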

## Files

- `pytorch_model.bin` – model weights
- `config.json` – model configuration
- `src/` – source code for the model architecture
- `modeling_kimik2.py` – Hugging Face wrapper

## Usage

```python
import torch
import json
from huggingface_hub import hf_hub_download

# Download the config and weights from the Hub
config_path = hf_hub_download(repo_id="sohv/nanokimi-mini", filename="config.json")
weights_path = hf_hub_download(repo_id="sohv/nanokimi-mini", filename="pytorch_model.bin")

# Load config and weights
with open(config_path) as f:
    config = json.load(f)

weights = torch.load(weights_path, map_location="cpu")
print("Model downloaded successfully!")
```
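The snippet above only downloads the files; to generate text you would load the weights into the architecture under `src/` and run an autoregressive sampling loop. The real model class lives in this repository, so the sketch below uses a hypothetical stand-in module purely to show the loop's shape; the `block_size`, temperature, and top-k values are assumptions.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def sample(model, idx, max_new_tokens, temperature=0.8, top_k=40, block_size=256):
    """Generic autoregressive sampling loop; `model(idx)` must return
    logits of shape (batch, seq, vocab_size)."""
    for _ in range(max_new_tokens):
        idx_cond = idx[:, -block_size:]              # crop to the context window
        logits = model(idx_cond)[:, -1, :] / temperature
        if top_k is not None:
            # Keep only the top-k logits; mask the rest out
            v, _ = torch.topk(logits, min(top_k, logits.size(-1)))
            logits[logits < v[:, [-1]]] = -float("inf")
        probs = F.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        idx = torch.cat([idx, next_id], dim=1)       # append and continue
    return idx

# Hypothetical stand-in model to demonstrate the call shape only --
# the actual nanoKimi model is defined in src/
class DummyLM(torch.nn.Module):
    def __init__(self, vocab=65, n_embd=32):
        super().__init__()
        self.emb = torch.nn.Embedding(vocab, n_embd)
        self.head = torch.nn.Linear(n_embd, vocab)

    def forward(self, idx):
        return self.head(self.emb(idx))

out = sample(DummyLM(), torch.zeros(1, 1, dtype=torch.long), max_new_tokens=20)
print(out.shape)  # torch.Size([1, 21])
```

With the real model, `idx` would be encoded from a text prompt and the returned token ids decoded back to characters.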

## License

MIT License

## Contact

Raise an issue in the Files and versions tab, or reach out to me here for any feedback or enquiries.