---

# Intermediate Checkpoints Release

For the first time among Korean-targeted LLMs, we’re releasing **intermediate checkpoints** from the Tri family—**0.5B**, **1.9B**, and **7B**—to advance research on LLM training dynamics. Checkpoints are released at regular token intervals—**≈20B tokens (0.5B), ≈40B (1.9B), and ≈160B (7B & 70B)**—enabling consistent analysis of training dynamics. Each checkpoint is published under its own **branch name**.

- You can grab the **Tri-7B** model here: [https://huggingface.co/trillionlabs/Tri-7B](https://huggingface.co/trillionlabs/Tri-7B).
- You can grab the **Tri-70B** preview model here: [https://huggingface.co/trillionlabs/Tri-70B-preview-SFT](https://huggingface.co/trillionlabs/Tri-70B-preview-SFT).

We’re also sharing the **0.5B** and **1.9B** runs—originally produced for system bring-up but now available as valuable artifacts for analyzing training behavior at smaller scales.
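Since each intermediate checkpoint lives on its own branch of the model repo, one way to work with them is to list the repo's branches and pass a branch name as the `revision` argument when loading. A minimal sketch, assuming the `huggingface_hub` and `transformers` packages are installed (the actual per-step branch names are whatever the repo publishes, so we discover them rather than hard-code them):

```python
# Sketch: discovering intermediate-checkpoint branches on the Hub.
from huggingface_hub import list_repo_refs

repo_id = "trillionlabs/Tri-7B"

# Each intermediate checkpoint is published as a branch of the model repo.
refs = list_repo_refs(repo_id)
branch_names = [branch.name for branch in refs.branches]
print(branch_names)

# To load one specific checkpoint, pass its branch name as `revision`.
# (Commented out here because it downloads the full model weights.)
#
# from transformers import AutoModelForCausalLM, AutoTokenizer
# revision = branch_names[0]  # pick the training step you want to analyze
# tokenizer = AutoTokenizer.from_pretrained(repo_id, revision=revision)
# model = AutoModelForCausalLM.from_pretrained(repo_id, revision=revision)
```

Loading several successive branches this way lets you track how model behavior evolves across the ≈160B-token intervals without re-downloading anything but the weights for each step.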