
ssz

ssz1111


AI & ML interests

None yet

Recent Activity

reacted to vicgalle's post with 🤝 15 days ago

Can you merge models of different sizes? ⚗️ Well, yes, if the models are somewhat compatible. Here is an experiment I did. I wanted to merge two of the best-performing models: https://huggingface.co/mlabonne/NeuralBeagle14-7B and https://huggingface.co/jeonsworld/CarbonVillain-en-10.7B-v4

Here is my recipe:

1. Expand the layers of NeuralBeagle to 10.7B à la frankenmerge.
2. DPO-tune the previous model with a high-quality preference dataset, https://huggingface.co/datasets/argilla/distilabel-intel-orca-dpo-pairs
3. Merge the previous model with CarbonVillain (needs --allow-crimes in mergekit! 🔪)

And here is the resulting model, CarbonBeagle-11B, which ranked top in the leaderboard for its size class: https://huggingface.co/vicgalle/CarbonBeagle-11B
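Step 1 of the recipe (expanding a 7B model by stacking its own layers) could be sketched roughly as follows. The post does not give the layer ranges used, so the 32-layer count and the 0-24 / 8-32 split below are assumptions chosen only for illustration; the dictionary keys follow the layout of a mergekit passthrough config, and the helper names are hypothetical.

```python
import json

BASE_MODEL = "mlabonne/NeuralBeagle14-7B"  # model named in the post
NUM_LAYERS = 32  # assumption: a Mistral-7B-style model with 32 decoder layers


def upscale_config(model, num_layers, keep_head, skip_tail):
    """Build a passthrough-style config that stacks two overlapping slices of one model.

    Hypothetical helper: the first slice keeps layers [0, keep_head),
    the second keeps layers [skip_tail, num_layers).
    """
    slices = [
        {"sources": [{"model": model, "layer_range": [0, keep_head]}]},
        {"sources": [{"model": model, "layer_range": [skip_tail, num_layers]}]},
    ]
    return {"slices": slices, "merge_method": "passthrough", "dtype": "bfloat16"}


def total_layers(config):
    """Count the layers in the stacked model, treating each range as half-open."""
    return sum(
        src["layer_range"][1] - src["layer_range"][0]
        for s in config["slices"]
        for src in s["sources"]
    )


# Assumed split: layers 0-24 followed by layers 8-32 of the same model.
config = upscale_config(BASE_MODEL, NUM_LAYERS, keep_head=24, skip_tail=8)
print(json.dumps(config, indent=2))
print("layers in expanded model:", total_layers(config))
```

Under these assumptions the stacked model has 48 layers instead of 32, which is how a 7B-class model grows toward the 10.7B range. The structure would still need to be written out as YAML and run through mergekit, and steps 2 and 3 (DPO tuning and the final merge) are not covered by this sketch.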

reacted to vicgalle's post with 🤯 15 days ago


upvoted an article 15 days ago

Merge Large Language Models with mergekit



liked 3 models about 1 month ago

ssz1111/CANOE-Qwen2.5-7B

Text Generation • 8B • Updated Apr 15, 2025 • 7 • 1

ssz1111/CANOE-LLaMA3-8B

Text Generation • 8B • Updated Apr 15, 2025 • 8 • 1

ssz1111/FaithLens

8B • Updated Dec 23, 2025 • 21.6k • 2

liked a model 5 months ago

ssz1111/CANOE-Qwen2.5-14B

Text Generation • 15B • Updated Apr 15, 2025 • 1 • 2

liked 4 models 6 months ago

ssz1111/NOVA-LLaMA-3-8B-AlpacaGPT4-15percent

Text Generation • Updated May 12, 2025 • 1 • 1

ssz1111/NOVA-LLaMA-3-8B-Alpaca-15percent

Text Generation • Updated May 12, 2025 • 1 • 1

ssz1111/NOVA-LLaMA-3-8B-AlpacaGPT4-5percent

Text Generation • Updated May 12, 2025 • 1 • 1

ssz1111/NOVA-LLaMA-3-8B-Alpaca-5percent

Text Generation • Updated May 12, 2025 • 1 • 1

liked 6 models about 1 year ago

ssz1111/GATEAU-1k-10k

Updated Dec 13, 2024 • 1

ssz1111/GATEAU-1k-100k

Updated Dec 13, 2024 • 1

ssz1111/GATEAU-3k-100k

Updated Dec 13, 2024 • 1 • 1

ssz1111/GATEAU-5k-10k

Updated Dec 13, 2024 • 1

ssz1111/GATEAU-3k-10k

Updated Dec 13, 2024 • 1

ssz1111/GATEAU-5k-100k

Updated Dec 13, 2024 • 1 • 1