ranarag committed · Commit 30de792 · verified · 1 Parent(s): 8cd3226

model card update

README.md CHANGED
@@ -6,10 +6,10 @@ tags:
  - granite-4.0
  ---

- # Granite-4.0-H-Micro
+ # Granite-4.0-Micro

  **Model Summary:**
- Granite-4.0-H-Micro is a 3B parameter long-context instruct model finetuned from *Granite-4.0-H-Micro-Base* using a combination of open source instruction datasets with permissive license and internally collected synthetic datasets. This model is developed using a diverse set of techniques with a structured chat format, including supervised finetuning, model alignment using reinforcement learning, and model merging. Granite 4.0 instruct models feature improved *instruction following (IF)* and *tool-calling* capabilities, making them more effective in enterprise applications.
+ Granite-4.0-Micro is a 3B parameter long-context instruct model finetuned from *Granite-4.0-Micro-Base* using a combination of open source instruction datasets with permissive license and internally collected synthetic datasets. This model is developed using a diverse set of techniques with a structured chat format, including supervised finetuning, model alignment using reinforcement learning, and model merging. Granite 4.0 instruct models feature improved *instruction following (IF)* and *tool-calling* capabilities, making them more effective in enterprise applications.

  - **Developers:** Granite Team, IBM
  - **HF Collection:** [Granite 4.0 Language Models HF Collection](https://huggingface.co/collections/ibm-granite/granite-40-language-models-6811a18b820ef362d9e5a82c)
@@ -22,7 +22,7 @@ Granite-4.0-H-Micro is a 3B parameter long-context instruct model finetuned from
  English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese. Users may finetune Granite 4.0 models for languages beyond these languages.

  **Intended use:**
- The model is designed to respond to general instructions and can be used to build AI assistants for multiple domains, including business applications.
+ The model is designed to follow general instructions and can serve as the foundation for AI assistants across diverse domains, including business applications, as well as for LLM agents equipped with tool-use capabilities.

  *Capabilities*
  * Summarization
@@ -38,7 +38,7 @@ The model is designed to respond to general instructions and can be used to buil
  -->

  **Generation:**
- This is a simple example of how to use Granite-4.0-H-Micro model.
+ This is a simple example of how to use Granite-4.0-Micro model.

  Install the following libraries:

@@ -54,7 +54,7 @@ import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer

  device = "cuda"
- model_path = "ibm-granite/granite-4.0-h-micro"
+ model_path = "ibm-granite/granite-4.0-micro"
  tokenizer = AutoTokenizer.from_pretrained(model_path)
  # drop device_map if running on CPU
  model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
@@ -82,11 +82,21 @@ Expected output:
  ```

  **Tool-calling:**
- Granite-4.0-H-Micro comes with enhanced tool calling capabilities, enabling seamless integration with external functions and APIs. To define a list of tools please follow OpenAI's function [definition schema](https://platform.openai.com/docs/guides/function-calling?api-mode=responses#defining-functions).
+ Granite-4.0-Micro comes with enhanced tool calling capabilities, enabling seamless integration with external functions and APIs. To define a list of tools please follow OpenAI's function [definition schema](https://platform.openai.com/docs/guides/function-calling?api-mode=responses#defining-functions).

- This is an example of how to use Granite-4.0-H-Micro model tool-calling ability:
+ This is an example of how to use Granite-4.0-Micro model tool-calling ability:

  ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ device = "cuda"
+ model_path = "ibm-granite/granite-4.0-micro"
+ tokenizer = AutoTokenizer.from_pretrained(model_path)
+ # drop device_map if running on CPU
+ model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
+ model.eval()
+
  tools = [
  {
  "type": "function",
@@ -170,7 +180,7 @@ For each tool call, return a json object with function name and arguments within
  </tr></thead>
  <tbody>
  <tr>
- <td style="text-align:left; background-color: #DAE8FF; color: #2D2D2D;">Granite-4.0-H-Micro</td>
+ <td style="text-align:left; background-color: #DAE8FF; color: #2D2D2D;">Granite-4.0-Micro</td>
  <td style="text-align:center; background-color: #DAE8FF; color: #2D2D2D;"></td>
  <td style="text-align:center; background-color: #DAE8FF; color: #2D2D2D;"></td>
  <td style="text-align:center; background-color: #DAE8FF; color: #2D2D2D;"></td>
@@ -243,7 +253,7 @@ For each tool call, return a json object with function name and arguments within
  </tr></thead>
  <tbody>
  <tr>
- <td style="text-align:left; background-color: #DAE8FF; color: black;">Granite-4.0-H-Micro</td>
+ <td style="text-align:left; background-color: #DAE8FF; color: black;">Granite-4.0-Micro</td>
  <td style="text-align:center; background-color: #DAE8FF; color: black;"></td>
  <td style="text-align:center; background-color: #DAE8FF; color: black;"></td>
  <td style="text-align:center; background-color: #DAE8FF; color: black;"></td>
@@ -285,7 +295,8 @@ For each tool call, return a json object with function name and arguments within
  </tbody></table> -->

  **Model Architecture:**
- Granite-4.0-H-Micro baseline is built on a decoder-only dense transformer architecture. Core components of this architecture are: GQA, Mamba2, MLP with SwiGLU, RMSNorm, and shared input/output embeddings.
+
+ Granite-4.0-Micro baseline is built on a decoder-only dense transformer architecture. Core components of this architecture are: GQA, RoPE, MLP with SwiGLU, RMSNorm, and shared input/output embeddings.

  <table>
  <thead>
@@ -299,58 +310,58 @@ Granite-4.0-H-Micro baseline is built on a decoder-only dense transformer archit
  <tbody>
  <tr>
  <td style="text-align:left; background-color: #FFFFFF; color: black;">Embedding size</td>
- <td style="text-align:center; background-color: #FFFFFF; color: black;">2560</td>
- <td style="text-align:center; background-color: #DAE8FF; color: black;">2048</td>
+ <td style="text-align:center; background-color: #DAE8FF; color: black;">2560</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">2048</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">1536</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">4096</td>
  </tr>
  <tr>
  <td style="text-align:left; background-color: #FFFFFF; color: black;">Number of layers</td>
- <td style="text-align:center; background-color: #FFFFFF; color: black;">40 attention</td>
- <td style="text-align:center; background-color: #DAE8FF; color: black;">4 attention / 36 Mamba2</td>
+ <td style="text-align:center; background-color: #DAE8FF; color: black;">40 attention</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">4 attention / 36 Mamba2</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">4 attention / 36 Mamba2</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">4 attention / 36 Mamba2</td>
  </tr>
  <tr>
  <td style="text-align:left; background-color: #FFFFFF; color: black;">Attention head size</td>
- <td style="text-align:center; background-color: #FFFFFF; color: black;">64</td>
  <td style="text-align:center; background-color: #DAE8FF; color: black;">64</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">64</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">128</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">128</td>
  </tr>
  <tr>
  <td style="text-align:left; background-color: #FFFFFF; color: black;">Number of attention heads</td>
- <td style="text-align:center; background-color: #FFFFFF; color: black;">40</td>
- <td style="text-align:center; background-color: #DAE8FF; color: black;">32</td>
+ <td style="text-align:center; background-color: #DAE8FF; color: black;">40</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">32</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">12</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">32</td>
  </tr>
  <tr>
  <td style="text-align:left; background-color: #FFFFFF; color: black;">Number of KV heads</td>
- <td style="text-align:center; background-color: #FFFFFF; color: black;">8</td>
  <td style="text-align:center; background-color: #DAE8FF; color: black;">8</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">8</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">4</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">8</td>
  </tr>
  <tr>
  <td style="text-align:left; background-color: #FFFFFF; color: black;">Mamba2 state size</td>
- <td style="text-align:center; background-color: #FFFFFF; color: black;">-</td>
- <td style="text-align:center; background-color: #DAE8FF; color: black;">128</td>
+ <td style="text-align:center; background-color: #DAE8FF; color: black;">-</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">128</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">128</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">128</td>
  </tr>
  <tr>
  <td style="text-align:left; background-color: #FFFFFF; color: black;">Number of Mamba2 heads</td>
- <td style="text-align:center; background-color: #FFFFFF; color: black;">-</td>
- <td style="text-align:center; background-color: #DAE8FF; color: black;">64</td>
+ <td style="text-align:center; background-color: #DAE8FF; color: black;">-</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">64</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">48</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">128</td>
  </tr>

  <tr>
  <td style="text-align:left; background-color: #FFFFFF; color: black;">MLP / Shared expert hidden size</td>
- <td style="text-align:center; background-color: #FFFFFF; color: black;">8192</td>
  <td style="text-align:center; background-color: #DAE8FF; color: black;">8192</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">8192</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">1024</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">1536</td>
  </tr>
@@ -358,59 +369,59 @@ Granite-4.0-H-Micro baseline is built on a decoder-only dense transformer archit

  <tr>
  <td style="text-align:left; background-color: #FFFFFF; color: black;">Num. Experts</td>
- <td style="text-align:center; background-color: #FFFFFF; color: black;">-</td>
  <td style="text-align:center; background-color: #DAE8FF; color: black;">-</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">-</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">64</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">72</td>
  </tr>
  <tr>
  <td style="text-align:left; background-color: #FFFFFF; color: black;">Num. active Experts</td>
- <td style="text-align:center; background-color: #FFFFFF; color: black;">-</td>
  <td style="text-align:center; background-color: #DAE8FF; color: black;">-</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">-</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">6</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">10</td>
  </tr>
  <tr>
  <td style="text-align:left; background-color: #FFFFFF; color: black;">Expert hidden size</td>
- <td style="text-align:center; background-color: #FFFFFF; color: black;">-</td>
  <td style="text-align:center; background-color: #DAE8FF; color: black;">-</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">-</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">512</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">768</td>
  </tr>

  <tr>
  <td style="text-align:left; background-color: #FFFFFF; color: black;">MLP activation</td>
- <td style="text-align:center; background-color: #FFFFFF; color: black;">SwiGLU</td>
  <td style="text-align:center; background-color: #DAE8FF; color: black;">SwiGLU</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">SwiGLU</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">SwiGLU</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">SwiGLU</td>
  </tr>

  <tr>
  <td style="text-align:left; background-color: #FFFFFF; color: black;">Sequence length</td>
- <td style="text-align:center; background-color: #FFFFFF; color: black;">128K</td>
  <td style="text-align:center; background-color: #DAE8FF; color: black;">128K</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">128K</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">128K</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">128K</td>
  </tr>
  <tr>
  <td style="text-align:left; background-color: #FFFFFF; color: black;">Position embedding</td>
- <td style="text-align:center; background-color: #FFFFFF; color: black;">RoPE</td>
- <td style="text-align:center; background-color: #DAE8FF; color: black;">NoPE</td>
+ <td style="text-align:center; background-color: #DAE8FF; color: black;">RoPE</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">NoPE</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">NoPE</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">NoPE</td>
  </tr>
  <tr>
  <td style="text-align:left; background-color: #FFFFFF; color: black;"># Parameters</td>
- <td style="text-align:center; background-color: #FFFFFF; color: black;">3B</td>
  <td style="text-align:center; background-color: #DAE8FF; color: black;">3B</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">3B</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">7B</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">32B</td>
  </tr>
  <tr>
  <td style="text-align:left; background-color: #FFFFFF; color: black;"># Active parameters</td>
- <td style="text-align:center; background-color: #FFFFFF; color: black;">3B</td>
  <td style="text-align:center; background-color: #DAE8FF; color: black;">3B</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">3B</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">2B</td>
  <td style="text-align:center; background-color: #FFFFFF; color: black;">9B</td>
  </tr>
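The tool-calling hunk above is truncated before the prompt is actually assembled. As a minimal sketch of how such a `tools` list is typically consumed via `transformers` chat templating (not part of this commit: the `get_weather` tool, the sample message, and the `tools=` kwarg of `apply_chat_template`, which requires a reasonably recent transformers release, are all illustrative assumptions):

```python
# Hedged sketch, not from the commit: passing a tool schema through the
# chat template of the renamed checkpoint. The tool itself is hypothetical.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "ibm-granite/granite-4.0-micro"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="cuda")
model.eval()

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool, for illustration only
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]
messages = [{"role": "user", "content": "What is the weather in Madrid?"}]

# The chat template serializes the tool schemas into the prompt, so the model
# can emit a JSON tool call instead of a free-form answer.
input_ids = tokenizer.apply_chat_template(
    messages, tools=tools, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0, input_ids.shape[-1]:], skip_special_tokens=True))
```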
config.json CHANGED
@@ -9,76 +9,76 @@
  "embedding_multiplier": 12,
  "eos_token_id": 100257,
  "hidden_act": "silu",
- "hidden_size": 2048,
+ "hidden_size": 2560,
  "initializer_range": 0.1,
  "intermediate_size": 8192,
  "layer_types": [
- "mamba",
- "mamba",
- "mamba",
- "mamba",
- "mamba",
- "attention",
- "mamba",
- "mamba",
- "mamba",
- "mamba",
- "mamba",
- "mamba",
- "mamba",
- "mamba",
- "mamba",
- "attention",
- "mamba",
- "mamba",
- "mamba",
- "mamba",
- "mamba",
- "mamba",
- "mamba",
- "mamba",
- "mamba",
- "attention",
- "mamba",
- "mamba",
- "mamba",
- "mamba",
- "mamba",
- "mamba",
- "mamba",
- "mamba",
- "mamba",
- "attention",
- "mamba",
- "mamba",
- "mamba",
- "mamba"
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention",
+ "attention"
  ],
- "logits_scaling": 8,
+ "logits_scaling": 10,
  "mamba_chunk_size": 256,
  "mamba_conv_bias": true,
  "mamba_d_conv": 4,
- "mamba_d_head": 64,
- "mamba_d_state": 128,
+ "mamba_d_head": 40,
+ "mamba_d_state": 256,
  "mamba_expand": 2,
  "mamba_n_groups": 1,
- "mamba_n_heads": 64,
+ "mamba_n_heads": 128,
  "mamba_proj_bias": false,
  "max_position_embeddings": 131072,
  "model_type": "granitemoehybrid",
  "normalization_function": "rmsnorm",
- "num_attention_heads": 32,
+ "num_attention_heads": 40,
  "num_experts_per_tok": 0,
  "num_hidden_layers": 40,
  "num_key_value_heads": 8,
  "num_local_experts": 0,
  "output_router_logits": false,
  "pad_token_id": 100256,
- "position_embedding_type": "nope",
+ "position_embedding_type": "rope",
  "residual_multiplier": 0.22,
  "rms_norm_eps": 1e-05,
  "rope_scaling": null,
- "rope_theta": 10000,
+ "rope_theta": 10000000,
  "router_aux_loss_coef": 0.01,
  "shared_intermediate_size": 8192,
  "tie_word_embeddings": true,
model-00001-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:883f134c083852597a3033cfc7de8357acde530b3fe941a567f507431f5c9216
- size 4990606832
+ oid sha256:131a793b45ddeee8e7c83a77846c350b52316158917ef589c006c4a10f2de952
+ size 4918145376
model-00002-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:987e8aee10c40446c71f4c91d689672f1805f94712738f7a44d06a7dee5e787f
- size 1392237216
+ oid sha256:6b6383a5aca723940d9e4be63fa1575e9c44941e7f5d489b923c0eaeada2c02f
+ size 1887565800
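Both `.safetensors` entries are Git LFS pointers, so the shard swap can be checked offline by re-hashing a downloaded file against the new `oid` and `size`. A minimal sketch for the first shard (the local filename is an assumption):

```python
# Minimal sketch: validate a downloaded shard against its Git LFS pointer.
import hashlib
import os

def lfs_digest(path, chunk_size=1 << 20):
    """Return (sha256 hex digest, byte size) of a file, hashed in streaming chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk_size):
            digest.update(block)
    return digest.hexdigest(), os.path.getsize(path)

oid, size = lfs_digest("model-00001-of-00002.safetensors")
assert oid == "131a793b45ddeee8e7c83a77846c350b52316158917ef589c006c4a10f2de952"
assert size == 4918145376
print("shard matches the updated LFS pointer")
```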
model.safetensors.index.json CHANGED
@@ -1,92 +1,64 @@
  {
  "metadata": {
  "total_parameters": 0,
- "total_size": 6382792192
  },
  "weight_map": {
  "model.embed_tokens.weight": "model-00001-of-00002.safetensors",
  "model.layers.0.input_layernorm.weight": "model-00001-of-00002.safetensors",
- "model.layers.0.mamba.A_log": "model-00001-of-00002.safetensors",
- "model.layers.0.mamba.D": "model-00001-of-00002.safetensors",
- "model.layers.0.mamba.conv1d.bias": "model-00001-of-00002.safetensors",
- "model.layers.0.mamba.conv1d.weight": "model-00001-of-00002.safetensors",
- "model.layers.0.mamba.dt_bias": "model-00001-of-00002.safetensors",
- "model.layers.0.mamba.in_proj.weight": "model-00001-of-00002.safetensors",
- "model.layers.0.mamba.norm.weight": "model-00001-of-00002.safetensors",
- "model.layers.0.mamba.out_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",

  "model.layers.0.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.0.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.1.input_layernorm.weight": "model-00001-of-00002.safetensors",
- "model.layers.1.mamba.A_log": "model-00001-of-00002.safetensors",
- "model.layers.1.mamba.D": "model-00001-of-00002.safetensors",
- "model.layers.1.mamba.conv1d.bias": "model-00001-of-00002.safetensors",
- "model.layers.1.mamba.conv1d.weight": "model-00001-of-00002.safetensors",
- "model.layers.1.mamba.dt_bias": "model-00001-of-00002.safetensors",
- "model.layers.1.mamba.in_proj.weight": "model-00001-of-00002.safetensors",
- "model.layers.1.mamba.norm.weight": "model-00001-of-00002.safetensors",
- "model.layers.1.mamba.out_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",

  "model.layers.1.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.1.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.10.input_layernorm.weight": "model-00001-of-00002.safetensors",
- "model.layers.10.mamba.A_log": "model-00001-of-00002.safetensors",
- "model.layers.10.mamba.D": "model-00001-of-00002.safetensors",
- "model.layers.10.mamba.conv1d.bias": "model-00001-of-00002.safetensors",
- "model.layers.10.mamba.conv1d.weight": "model-00001-of-00002.safetensors",
- "model.layers.10.mamba.dt_bias": "model-00001-of-00002.safetensors",
- "model.layers.10.mamba.in_proj.weight": "model-00001-of-00002.safetensors",
- "model.layers.10.mamba.norm.weight": "model-00001-of-00002.safetensors",
- "model.layers.10.mamba.out_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.10.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",

  "model.layers.10.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.10.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.11.input_layernorm.weight": "model-00001-of-00002.safetensors",
- "model.layers.11.mamba.A_log": "model-00001-of-00002.safetensors",
- "model.layers.11.mamba.D": "model-00001-of-00002.safetensors",
- "model.layers.11.mamba.conv1d.bias": "model-00001-of-00002.safetensors",
- "model.layers.11.mamba.conv1d.weight": "model-00001-of-00002.safetensors",
- "model.layers.11.mamba.dt_bias": "model-00001-of-00002.safetensors",
- "model.layers.11.mamba.in_proj.weight": "model-00001-of-00002.safetensors",
- "model.layers.11.mamba.norm.weight": "model-00001-of-00002.safetensors",
- "model.layers.11.mamba.out_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.11.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",

  "model.layers.11.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.11.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.12.input_layernorm.weight": "model-00001-of-00002.safetensors",
- "model.layers.12.mamba.A_log": "model-00001-of-00002.safetensors",
- "model.layers.12.mamba.D": "model-00001-of-00002.safetensors",
- "model.layers.12.mamba.conv1d.bias": "model-00001-of-00002.safetensors",
- "model.layers.12.mamba.conv1d.weight": "model-00001-of-00002.safetensors",
- "model.layers.12.mamba.dt_bias": "model-00001-of-00002.safetensors",
- "model.layers.12.mamba.in_proj.weight": "model-00001-of-00002.safetensors",
- "model.layers.12.mamba.norm.weight": "model-00001-of-00002.safetensors",
- "model.layers.12.mamba.out_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.12.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",

  "model.layers.12.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.12.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.13.input_layernorm.weight": "model-00001-of-00002.safetensors",
- "model.layers.13.mamba.A_log": "model-00001-of-00002.safetensors",
- "model.layers.13.mamba.D": "model-00001-of-00002.safetensors",
- "model.layers.13.mamba.conv1d.bias": "model-00001-of-00002.safetensors",
- "model.layers.13.mamba.conv1d.weight": "model-00001-of-00002.safetensors",
- "model.layers.13.mamba.dt_bias": "model-00001-of-00002.safetensors",
- "model.layers.13.mamba.in_proj.weight": "model-00001-of-00002.safetensors",
- "model.layers.13.mamba.norm.weight": "model-00001-of-00002.safetensors",
- "model.layers.13.mamba.out_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.13.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",

  "model.layers.13.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.13.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.14.input_layernorm.weight": "model-00001-of-00002.safetensors",
- "model.layers.14.mamba.A_log": "model-00001-of-00002.safetensors",
- "model.layers.14.mamba.D": "model-00001-of-00002.safetensors",
- "model.layers.14.mamba.conv1d.bias": "model-00001-of-00002.safetensors",
- "model.layers.14.mamba.conv1d.weight": "model-00001-of-00002.safetensors",
- "model.layers.14.mamba.dt_bias": "model-00001-of-00002.safetensors",
- "model.layers.14.mamba.in_proj.weight": "model-00001-of-00002.safetensors",
- "model.layers.14.mamba.norm.weight": "model-00001-of-00002.safetensors",
- "model.layers.14.mamba.out_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.14.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",

  "model.layers.14.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.14.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.15.input_layernorm.weight": "model-00001-of-00002.safetensors",
@@ -98,123 +70,83 @@
  "model.layers.15.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.15.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.16.input_layernorm.weight": "model-00001-of-00002.safetensors",
- "model.layers.16.mamba.A_log": "model-00001-of-00002.safetensors",
- "model.layers.16.mamba.D": "model-00001-of-00002.safetensors",
- "model.layers.16.mamba.conv1d.bias": "model-00001-of-00002.safetensors",
- "model.layers.16.mamba.conv1d.weight": "model-00001-of-00002.safetensors",
- "model.layers.16.mamba.dt_bias": "model-00001-of-00002.safetensors",
- "model.layers.16.mamba.in_proj.weight": "model-00001-of-00002.safetensors",
- "model.layers.16.mamba.norm.weight": "model-00001-of-00002.safetensors",
- "model.layers.16.mamba.out_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.16.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",

  "model.layers.16.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.16.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.17.input_layernorm.weight": "model-00001-of-00002.safetensors",
- "model.layers.17.mamba.A_log": "model-00001-of-00002.safetensors",
- "model.layers.17.mamba.D": "model-00001-of-00002.safetensors",
- "model.layers.17.mamba.conv1d.bias": "model-00001-of-00002.safetensors",
- "model.layers.17.mamba.conv1d.weight": "model-00001-of-00002.safetensors",
- "model.layers.17.mamba.dt_bias": "model-00001-of-00002.safetensors",
- "model.layers.17.mamba.in_proj.weight": "model-00001-of-00002.safetensors",
- "model.layers.17.mamba.norm.weight": "model-00001-of-00002.safetensors",
- "model.layers.17.mamba.out_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.17.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",

  "model.layers.17.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.17.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.18.input_layernorm.weight": "model-00001-of-00002.safetensors",
- "model.layers.18.mamba.A_log": "model-00001-of-00002.safetensors",
- "model.layers.18.mamba.D": "model-00001-of-00002.safetensors",
- "model.layers.18.mamba.conv1d.bias": "model-00001-of-00002.safetensors",
- "model.layers.18.mamba.conv1d.weight": "model-00001-of-00002.safetensors",
- "model.layers.18.mamba.dt_bias": "model-00001-of-00002.safetensors",
- "model.layers.18.mamba.in_proj.weight": "model-00001-of-00002.safetensors",
- "model.layers.18.mamba.norm.weight": "model-00001-of-00002.safetensors",
- "model.layers.18.mamba.out_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.18.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",

  "model.layers.18.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.18.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.19.input_layernorm.weight": "model-00001-of-00002.safetensors",
- "model.layers.19.mamba.A_log": "model-00001-of-00002.safetensors",
- "model.layers.19.mamba.D": "model-00001-of-00002.safetensors",
- "model.layers.19.mamba.conv1d.bias": "model-00001-of-00002.safetensors",
- "model.layers.19.mamba.conv1d.weight": "model-00001-of-00002.safetensors",
- "model.layers.19.mamba.dt_bias": "model-00001-of-00002.safetensors",
- "model.layers.19.mamba.in_proj.weight": "model-00001-of-00002.safetensors",
- "model.layers.19.mamba.norm.weight": "model-00001-of-00002.safetensors",
- "model.layers.19.mamba.out_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.19.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",

  "model.layers.19.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.19.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.2.input_layernorm.weight": "model-00001-of-00002.safetensors",
- "model.layers.2.mamba.A_log": "model-00001-of-00002.safetensors",
- "model.layers.2.mamba.D": "model-00001-of-00002.safetensors",
- "model.layers.2.mamba.conv1d.bias": "model-00001-of-00002.safetensors",
- "model.layers.2.mamba.conv1d.weight": "model-00001-of-00002.safetensors",
- "model.layers.2.mamba.dt_bias": "model-00001-of-00002.safetensors",
- "model.layers.2.mamba.in_proj.weight": "model-00001-of-00002.safetensors",
- "model.layers.2.mamba.norm.weight": "model-00001-of-00002.safetensors",
- "model.layers.2.mamba.out_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.2.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",

  "model.layers.2.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.2.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.20.input_layernorm.weight": "model-00001-of-00002.safetensors",
- "model.layers.20.mamba.A_log": "model-00001-of-00002.safetensors",
- "model.layers.20.mamba.D": "model-00001-of-00002.safetensors",
- "model.layers.20.mamba.conv1d.bias": "model-00001-of-00002.safetensors",
- "model.layers.20.mamba.conv1d.weight": "model-00001-of-00002.safetensors",
- "model.layers.20.mamba.dt_bias": "model-00001-of-00002.safetensors",
- "model.layers.20.mamba.in_proj.weight": "model-00001-of-00002.safetensors",
- "model.layers.20.mamba.norm.weight": "model-00001-of-00002.safetensors",
- "model.layers.20.mamba.out_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.20.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",

  "model.layers.20.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.20.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.21.input_layernorm.weight": "model-00001-of-00002.safetensors",
- "model.layers.21.mamba.A_log": "model-00001-of-00002.safetensors",
- "model.layers.21.mamba.D": "model-00001-of-00002.safetensors",
- "model.layers.21.mamba.conv1d.bias": "model-00001-of-00002.safetensors",
- "model.layers.21.mamba.conv1d.weight": "model-00001-of-00002.safetensors",
- "model.layers.21.mamba.dt_bias": "model-00001-of-00002.safetensors",
- "model.layers.21.mamba.in_proj.weight": "model-00001-of-00002.safetensors",
- "model.layers.21.mamba.norm.weight": "model-00001-of-00002.safetensors",
- "model.layers.21.mamba.out_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.21.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",

  "model.layers.21.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.21.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.22.input_layernorm.weight": "model-00001-of-00002.safetensors",
- "model.layers.22.mamba.A_log": "model-00001-of-00002.safetensors",
- "model.layers.22.mamba.D": "model-00001-of-00002.safetensors",
- "model.layers.22.mamba.conv1d.bias": "model-00001-of-00002.safetensors",
- "model.layers.22.mamba.conv1d.weight": "model-00001-of-00002.safetensors",
- "model.layers.22.mamba.dt_bias": "model-00001-of-00002.safetensors",
- "model.layers.22.mamba.in_proj.weight": "model-00001-of-00002.safetensors",
- "model.layers.22.mamba.norm.weight": "model-00001-of-00002.safetensors",
- "model.layers.22.mamba.out_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.22.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",

  "model.layers.22.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.22.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.23.input_layernorm.weight": "model-00001-of-00002.safetensors",
- "model.layers.23.mamba.A_log": "model-00001-of-00002.safetensors",
- "model.layers.23.mamba.D": "model-00001-of-00002.safetensors",
- "model.layers.23.mamba.conv1d.bias": "model-00001-of-00002.safetensors",
- "model.layers.23.mamba.conv1d.weight": "model-00001-of-00002.safetensors",
- "model.layers.23.mamba.dt_bias": "model-00001-of-00002.safetensors",
- "model.layers.23.mamba.in_proj.weight": "model-00001-of-00002.safetensors",
- "model.layers.23.mamba.norm.weight": "model-00001-of-00002.safetensors",
- "model.layers.23.mamba.out_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.23.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",

  "model.layers.23.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.23.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.24.input_layernorm.weight": "model-00001-of-00002.safetensors",
- "model.layers.24.mamba.A_log": "model-00001-of-00002.safetensors",
- "model.layers.24.mamba.D": "model-00001-of-00002.safetensors",
- "model.layers.24.mamba.conv1d.bias": "model-00001-of-00002.safetensors",
- "model.layers.24.mamba.conv1d.weight": "model-00001-of-00002.safetensors",
- "model.layers.24.mamba.dt_bias": "model-00001-of-00002.safetensors",
- "model.layers.24.mamba.in_proj.weight": "model-00001-of-00002.safetensors",
- "model.layers.24.mamba.norm.weight": "model-00001-of-00002.safetensors",
- "model.layers.24.mamba.out_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.24.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",

  "model.layers.24.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.24.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.25.input_layernorm.weight": "model-00001-of-00002.safetensors",
@@ -226,123 +158,83 @@
  "model.layers.25.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.25.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.26.input_layernorm.weight": "model-00001-of-00002.safetensors",
- "model.layers.26.mamba.A_log": "model-00001-of-00002.safetensors",
- "model.layers.26.mamba.D": "model-00001-of-00002.safetensors",
- "model.layers.26.mamba.conv1d.bias": "model-00001-of-00002.safetensors",
- "model.layers.26.mamba.conv1d.weight": "model-00001-of-00002.safetensors",
- "model.layers.26.mamba.dt_bias": "model-00001-of-00002.safetensors",
- "model.layers.26.mamba.in_proj.weight": "model-00001-of-00002.safetensors",
- "model.layers.26.mamba.norm.weight": "model-00001-of-00002.safetensors",
- "model.layers.26.mamba.out_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.26.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",

  "model.layers.26.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.26.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.27.input_layernorm.weight": "model-00001-of-00002.safetensors",
- "model.layers.27.mamba.A_log": "model-00001-of-00002.safetensors",
- "model.layers.27.mamba.D": "model-00001-of-00002.safetensors",
- "model.layers.27.mamba.conv1d.bias": "model-00001-of-00002.safetensors",
- "model.layers.27.mamba.conv1d.weight": "model-00001-of-00002.safetensors",
- "model.layers.27.mamba.dt_bias": "model-00001-of-00002.safetensors",
- "model.layers.27.mamba.in_proj.weight": "model-00001-of-00002.safetensors",
- "model.layers.27.mamba.norm.weight": "model-00001-of-00002.safetensors",
- "model.layers.27.mamba.out_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.27.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",

  "model.layers.27.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.27.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.28.input_layernorm.weight": "model-00001-of-00002.safetensors",
- "model.layers.28.mamba.A_log": "model-00001-of-00002.safetensors",
- "model.layers.28.mamba.D": "model-00001-of-00002.safetensors",
- "model.layers.28.mamba.conv1d.bias": "model-00001-of-00002.safetensors",
- "model.layers.28.mamba.conv1d.weight": "model-00001-of-00002.safetensors",
- "model.layers.28.mamba.dt_bias": "model-00001-of-00002.safetensors",
- "model.layers.28.mamba.in_proj.weight": "model-00001-of-00002.safetensors",
- "model.layers.28.mamba.norm.weight": "model-00001-of-00002.safetensors",
- "model.layers.28.mamba.out_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.28.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
- "model.layers.28.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
- "model.layers.28.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
- "model.layers.29.input_layernorm.weight": "model-00001-of-00002.safetensors",
- "model.layers.29.mamba.A_log": "model-00001-of-00002.safetensors",
- "model.layers.29.mamba.D": "model-00001-of-00002.safetensors",
- "model.layers.29.mamba.conv1d.bias": "model-00001-of-00002.safetensors",
- "model.layers.29.mamba.conv1d.weight": "model-00001-of-00002.safetensors",
- "model.layers.29.mamba.dt_bias": "model-00001-of-00002.safetensors",
- "model.layers.29.mamba.in_proj.weight": "model-00001-of-00002.safetensors",
- "model.layers.29.mamba.norm.weight": "model-00001-of-00002.safetensors",
- "model.layers.29.mamba.out_proj.weight": "model-00001-of-00002.safetensors",
- "model.layers.29.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
- "model.layers.29.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
- "model.layers.29.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.3.input_layernorm.weight": "model-00001-of-00002.safetensors",
- "model.layers.3.mamba.A_log": "model-00001-of-00002.safetensors",
- "model.layers.3.mamba.D": "model-00001-of-00002.safetensors",
- "model.layers.3.mamba.conv1d.bias": "model-00001-of-00002.safetensors",
- "model.layers.3.mamba.conv1d.weight": "model-00001-of-00002.safetensors",
- "model.layers.3.mamba.dt_bias": "model-00001-of-00002.safetensors",
- "model.layers.3.mamba.in_proj.weight": "model-00001-of-00002.safetensors",
- "model.layers.3.mamba.norm.weight": "model-00001-of-00002.safetensors",
- "model.layers.3.mamba.out_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.3.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",

  "model.layers.3.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.3.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
- "model.layers.30.input_layernorm.weight": "model-00001-of-00002.safetensors",
- "model.layers.30.mamba.A_log": "model-00001-of-00002.safetensors",
- "model.layers.30.mamba.D": "model-00001-of-00002.safetensors",
- "model.layers.30.mamba.conv1d.bias": "model-00001-of-00002.safetensors",
- "model.layers.30.mamba.conv1d.weight": "model-00001-of-00002.safetensors",
- "model.layers.30.mamba.dt_bias": "model-00001-of-00002.safetensors",
- "model.layers.30.mamba.in_proj.weight": "model-00002-of-00002.safetensors",
- "model.layers.30.mamba.norm.weight": "model-00002-of-00002.safetensors",
- "model.layers.30.mamba.out_proj.weight": "model-00002-of-00002.safetensors",
- "model.layers.30.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
- "model.layers.30.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
- "model.layers.30.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.31.input_layernorm.weight": "model-00002-of-00002.safetensors",
- "model.layers.31.mamba.A_log": "model-00002-of-00002.safetensors",
- "model.layers.31.mamba.D": "model-00002-of-00002.safetensors",
- "model.layers.31.mamba.conv1d.bias": "model-00002-of-00002.safetensors",
- "model.layers.31.mamba.conv1d.weight": "model-00002-of-00002.safetensors",
- "model.layers.31.mamba.dt_bias": "model-00002-of-00002.safetensors",
- "model.layers.31.mamba.in_proj.weight": "model-00002-of-00002.safetensors",
- "model.layers.31.mamba.norm.weight": "model-00002-of-00002.safetensors",
- "model.layers.31.mamba.out_proj.weight": "model-00002-of-00002.safetensors",
  "model.layers.31.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",

  "model.layers.31.shared_mlp.input_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.31.shared_mlp.output_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.32.input_layernorm.weight": "model-00002-of-00002.safetensors",
- "model.layers.32.mamba.A_log": "model-00002-of-00002.safetensors",
- "model.layers.32.mamba.D": "model-00002-of-00002.safetensors",
- "model.layers.32.mamba.conv1d.bias": "model-00002-of-00002.safetensors",
- "model.layers.32.mamba.conv1d.weight": "model-00002-of-00002.safetensors",
- "model.layers.32.mamba.dt_bias": "model-00002-of-00002.safetensors",
- "model.layers.32.mamba.in_proj.weight": "model-00002-of-00002.safetensors",
- "model.layers.32.mamba.norm.weight": "model-00002-of-00002.safetensors",
- "model.layers.32.mamba.out_proj.weight": "model-00002-of-00002.safetensors",
  "model.layers.32.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",

  "model.layers.32.shared_mlp.input_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.32.shared_mlp.output_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.33.input_layernorm.weight": "model-00002-of-00002.safetensors",
- "model.layers.33.mamba.A_log": "model-00002-of-00002.safetensors",
- "model.layers.33.mamba.D": "model-00002-of-00002.safetensors",
- "model.layers.33.mamba.conv1d.bias": "model-00002-of-00002.safetensors",
- "model.layers.33.mamba.conv1d.weight": "model-00002-of-00002.safetensors",
- "model.layers.33.mamba.dt_bias": "model-00002-of-00002.safetensors",
- "model.layers.33.mamba.in_proj.weight": "model-00002-of-00002.safetensors",
- "model.layers.33.mamba.norm.weight": "model-00002-of-00002.safetensors",
- "model.layers.33.mamba.out_proj.weight": "model-00002-of-00002.safetensors",
  "model.layers.33.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",

  "model.layers.33.shared_mlp.input_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.33.shared_mlp.output_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.34.input_layernorm.weight": "model-00002-of-00002.safetensors",
- "model.layers.34.mamba.A_log": "model-00002-of-00002.safetensors",
- "model.layers.34.mamba.D": "model-00002-of-00002.safetensors",
- "model.layers.34.mamba.conv1d.bias": "model-00002-of-00002.safetensors",
- "model.layers.34.mamba.conv1d.weight": "model-00002-of-00002.safetensors",
- "model.layers.34.mamba.dt_bias": "model-00002-of-00002.safetensors",
- "model.layers.34.mamba.in_proj.weight": "model-00002-of-00002.safetensors",
- "model.layers.34.mamba.norm.weight": "model-00002-of-00002.safetensors",
- "model.layers.34.mamba.out_proj.weight": "model-00002-of-00002.safetensors",
  "model.layers.34.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",

  "model.layers.34.shared_mlp.input_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.34.shared_mlp.output_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.35.input_layernorm.weight": "model-00002-of-00002.safetensors",
@@ -354,63 +246,43 @@
  "model.layers.35.shared_mlp.input_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.35.shared_mlp.output_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.36.input_layernorm.weight": "model-00002-of-00002.safetensors",
- "model.layers.36.mamba.A_log": "model-00002-of-00002.safetensors",
- "model.layers.36.mamba.D": "model-00002-of-00002.safetensors",
- "model.layers.36.mamba.conv1d.bias": "model-00002-of-00002.safetensors",
- "model.layers.36.mamba.conv1d.weight": "model-00002-of-00002.safetensors",
- "model.layers.36.mamba.dt_bias": "model-00002-of-00002.safetensors",
- "model.layers.36.mamba.in_proj.weight": "model-00002-of-00002.safetensors",
- "model.layers.36.mamba.norm.weight": "model-00002-of-00002.safetensors",
- "model.layers.36.mamba.out_proj.weight": "model-00002-of-00002.safetensors",
  "model.layers.36.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",

  "model.layers.36.shared_mlp.input_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.36.shared_mlp.output_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.37.input_layernorm.weight": "model-00002-of-00002.safetensors",
- "model.layers.37.mamba.A_log": "model-00002-of-00002.safetensors",
- "model.layers.37.mamba.D": "model-00002-of-00002.safetensors",
- "model.layers.37.mamba.conv1d.bias": "model-00002-of-00002.safetensors",
- "model.layers.37.mamba.conv1d.weight": "model-00002-of-00002.safetensors",
- "model.layers.37.mamba.dt_bias": "model-00002-of-00002.safetensors",
- "model.layers.37.mamba.in_proj.weight": "model-00002-of-00002.safetensors",
- "model.layers.37.mamba.norm.weight": "model-00002-of-00002.safetensors",
- "model.layers.37.mamba.out_proj.weight": "model-00002-of-00002.safetensors",
  "model.layers.37.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",

  "model.layers.37.shared_mlp.input_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.37.shared_mlp.output_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.38.input_layernorm.weight": "model-00002-of-00002.safetensors",
- "model.layers.38.mamba.A_log": "model-00002-of-00002.safetensors",
- "model.layers.38.mamba.D": "model-00002-of-00002.safetensors",
- "model.layers.38.mamba.conv1d.bias": "model-00002-of-00002.safetensors",
- "model.layers.38.mamba.conv1d.weight": "model-00002-of-00002.safetensors",
- "model.layers.38.mamba.dt_bias": "model-00002-of-00002.safetensors",
- "model.layers.38.mamba.in_proj.weight": "model-00002-of-00002.safetensors",
- "model.layers.38.mamba.norm.weight": "model-00002-of-00002.safetensors",
- "model.layers.38.mamba.out_proj.weight": "model-00002-of-00002.safetensors",
  "model.layers.38.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",

  "model.layers.38.shared_mlp.input_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.38.shared_mlp.output_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.39.input_layernorm.weight": "model-00002-of-00002.safetensors",
- "model.layers.39.mamba.A_log": "model-00002-of-00002.safetensors",
- "model.layers.39.mamba.D": "model-00002-of-00002.safetensors",
- "model.layers.39.mamba.conv1d.bias": "model-00002-of-00002.safetensors",
- "model.layers.39.mamba.conv1d.weight": "model-00002-of-00002.safetensors",
- "model.layers.39.mamba.dt_bias": "model-00002-of-00002.safetensors",
- "model.layers.39.mamba.in_proj.weight": "model-00002-of-00002.safetensors",
- "model.layers.39.mamba.norm.weight": "model-00002-of-00002.safetensors",
- "model.layers.39.mamba.out_proj.weight": "model-00002-of-00002.safetensors",
  "model.layers.39.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",

  "model.layers.39.shared_mlp.input_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.39.shared_mlp.output_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.4.input_layernorm.weight": "model-00001-of-00002.safetensors",
- "model.layers.4.mamba.A_log": "model-00001-of-00002.safetensors",
- "model.layers.4.mamba.D": "model-00001-of-00002.safetensors",
- "model.layers.4.mamba.conv1d.bias": "model-00001-of-00002.safetensors",
- "model.layers.4.mamba.conv1d.weight": "model-00001-of-00002.safetensors",
- "model.layers.4.mamba.dt_bias": "model-00001-of-00002.safetensors",
- "model.layers.4.mamba.in_proj.weight": "model-00001-of-00002.safetensors",
- "model.layers.4.mamba.norm.weight": "model-00001-of-00002.safetensors",
- "model.layers.4.mamba.out_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.4.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",

  "model.layers.4.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.4.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.5.input_layernorm.weight": "model-00001-of-00002.safetensors",
@@ -422,51 +294,35 @@
  "model.layers.5.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
423
  "model.layers.5.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
424
  "model.layers.6.input_layernorm.weight": "model-00001-of-00002.safetensors",
425
- "model.layers.6.mamba.A_log": "model-00001-of-00002.safetensors",
426
- "model.layers.6.mamba.D": "model-00001-of-00002.safetensors",
427
- "model.layers.6.mamba.conv1d.bias": "model-00001-of-00002.safetensors",
428
- "model.layers.6.mamba.conv1d.weight": "model-00001-of-00002.safetensors",
429
- "model.layers.6.mamba.dt_bias": "model-00001-of-00002.safetensors",
430
- "model.layers.6.mamba.in_proj.weight": "model-00001-of-00002.safetensors",
431
- "model.layers.6.mamba.norm.weight": "model-00001-of-00002.safetensors",
432
- "model.layers.6.mamba.out_proj.weight": "model-00001-of-00002.safetensors",
433
  "model.layers.6.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
 
 
 
 
434
  "model.layers.6.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
435
  "model.layers.6.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
436
  "model.layers.7.input_layernorm.weight": "model-00001-of-00002.safetensors",
437
- "model.layers.7.mamba.A_log": "model-00001-of-00002.safetensors",
438
- "model.layers.7.mamba.D": "model-00001-of-00002.safetensors",
439
- "model.layers.7.mamba.conv1d.bias": "model-00001-of-00002.safetensors",
440
- "model.layers.7.mamba.conv1d.weight": "model-00001-of-00002.safetensors",
441
- "model.layers.7.mamba.dt_bias": "model-00001-of-00002.safetensors",
442
- "model.layers.7.mamba.in_proj.weight": "model-00001-of-00002.safetensors",
443
- "model.layers.7.mamba.norm.weight": "model-00001-of-00002.safetensors",
444
- "model.layers.7.mamba.out_proj.weight": "model-00001-of-00002.safetensors",
445
  "model.layers.7.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
 
 
 
 
446
  "model.layers.7.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
447
  "model.layers.7.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
448
  "model.layers.8.input_layernorm.weight": "model-00001-of-00002.safetensors",
449
- "model.layers.8.mamba.A_log": "model-00001-of-00002.safetensors",
450
- "model.layers.8.mamba.D": "model-00001-of-00002.safetensors",
451
- "model.layers.8.mamba.conv1d.bias": "model-00001-of-00002.safetensors",
452
- "model.layers.8.mamba.conv1d.weight": "model-00001-of-00002.safetensors",
453
- "model.layers.8.mamba.dt_bias": "model-00001-of-00002.safetensors",
454
- "model.layers.8.mamba.in_proj.weight": "model-00001-of-00002.safetensors",
455
- "model.layers.8.mamba.norm.weight": "model-00001-of-00002.safetensors",
456
- "model.layers.8.mamba.out_proj.weight": "model-00001-of-00002.safetensors",
457
  "model.layers.8.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
 
 
 
 
458
  "model.layers.8.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
459
  "model.layers.8.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
460
  "model.layers.9.input_layernorm.weight": "model-00001-of-00002.safetensors",
461
- "model.layers.9.mamba.A_log": "model-00001-of-00002.safetensors",
462
- "model.layers.9.mamba.D": "model-00001-of-00002.safetensors",
463
- "model.layers.9.mamba.conv1d.bias": "model-00001-of-00002.safetensors",
464
- "model.layers.9.mamba.conv1d.weight": "model-00001-of-00002.safetensors",
465
- "model.layers.9.mamba.dt_bias": "model-00001-of-00002.safetensors",
466
- "model.layers.9.mamba.in_proj.weight": "model-00001-of-00002.safetensors",
467
- "model.layers.9.mamba.norm.weight": "model-00001-of-00002.safetensors",
468
- "model.layers.9.mamba.out_proj.weight": "model-00001-of-00002.safetensors",
469
  "model.layers.9.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
 
 
 
 
470
  "model.layers.9.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
471
  "model.layers.9.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
472
  "model.norm.weight": "model-00002-of-00002.safetensors"
 
  {
  "metadata": {
  "total_parameters": 0,
+ "total_size": 6805672960
  },
  "weight_map": {
  "model.embed_tokens.weight": "model-00001-of-00002.safetensors",
  "model.layers.0.input_layernorm.weight": "model-00001-of-00002.safetensors",
  "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.0.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.0.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.1.input_layernorm.weight": "model-00001-of-00002.safetensors",
  "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.1.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.1.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.10.input_layernorm.weight": "model-00001-of-00002.safetensors",
  "model.layers.10.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.10.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.10.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.10.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.10.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.10.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.10.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.11.input_layernorm.weight": "model-00001-of-00002.safetensors",
  "model.layers.11.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.11.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.11.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.11.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.11.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.11.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.11.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.12.input_layernorm.weight": "model-00001-of-00002.safetensors",
  "model.layers.12.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.12.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.12.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.12.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.12.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.12.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.12.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.13.input_layernorm.weight": "model-00001-of-00002.safetensors",
  "model.layers.13.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.13.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.13.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.13.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.13.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.13.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.13.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.14.input_layernorm.weight": "model-00001-of-00002.safetensors",
  "model.layers.14.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.14.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.14.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.14.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.14.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.14.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.14.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.15.input_layernorm.weight": "model-00001-of-00002.safetensors",

  "model.layers.15.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.15.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.16.input_layernorm.weight": "model-00001-of-00002.safetensors",
  "model.layers.16.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.16.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.16.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.16.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.16.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.16.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.16.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.17.input_layernorm.weight": "model-00001-of-00002.safetensors",
  "model.layers.17.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.17.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.17.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.17.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.17.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.17.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.17.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.18.input_layernorm.weight": "model-00001-of-00002.safetensors",
  "model.layers.18.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.18.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.18.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.18.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.18.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.18.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.18.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.19.input_layernorm.weight": "model-00001-of-00002.safetensors",
  "model.layers.19.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.19.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.19.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.19.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.19.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.19.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.19.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.2.input_layernorm.weight": "model-00001-of-00002.safetensors",
  "model.layers.2.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.2.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.2.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.2.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.2.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.2.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.2.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.20.input_layernorm.weight": "model-00001-of-00002.safetensors",
  "model.layers.20.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.20.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.20.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.20.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.20.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.20.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.20.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.21.input_layernorm.weight": "model-00001-of-00002.safetensors",
  "model.layers.21.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.21.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.21.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.21.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.21.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.21.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.21.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.22.input_layernorm.weight": "model-00001-of-00002.safetensors",
  "model.layers.22.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.22.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.22.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.22.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.22.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.22.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.22.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.23.input_layernorm.weight": "model-00001-of-00002.safetensors",
  "model.layers.23.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.23.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.23.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.23.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.23.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.23.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.23.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.24.input_layernorm.weight": "model-00001-of-00002.safetensors",
  "model.layers.24.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.24.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.24.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.24.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.24.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.24.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.24.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.25.input_layernorm.weight": "model-00001-of-00002.safetensors",

  "model.layers.25.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.25.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.26.input_layernorm.weight": "model-00001-of-00002.safetensors",
  "model.layers.26.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.26.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.26.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.26.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.26.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.26.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.26.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.27.input_layernorm.weight": "model-00001-of-00002.safetensors",
  "model.layers.27.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.27.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.27.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.27.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.27.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.27.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.27.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.28.input_layernorm.weight": "model-00001-of-00002.safetensors",
  "model.layers.28.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.28.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.28.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.28.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.28.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.28.shared_mlp.input_linear.weight": "model-00002-of-00002.safetensors",
+ "model.layers.28.shared_mlp.output_linear.weight": "model-00002-of-00002.safetensors",
+ "model.layers.29.input_layernorm.weight": "model-00002-of-00002.safetensors",
+ "model.layers.29.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+ "model.layers.29.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.29.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.29.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.29.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.29.shared_mlp.input_linear.weight": "model-00002-of-00002.safetensors",
+ "model.layers.29.shared_mlp.output_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.3.input_layernorm.weight": "model-00001-of-00002.safetensors",
  "model.layers.3.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.3.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.3.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.3.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.3.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.3.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.3.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
+ "model.layers.30.input_layernorm.weight": "model-00002-of-00002.safetensors",
+ "model.layers.30.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+ "model.layers.30.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.30.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.30.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.30.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.30.shared_mlp.input_linear.weight": "model-00002-of-00002.safetensors",
+ "model.layers.30.shared_mlp.output_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.31.input_layernorm.weight": "model-00002-of-00002.safetensors",
  "model.layers.31.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+ "model.layers.31.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.31.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.31.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.31.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
  "model.layers.31.shared_mlp.input_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.31.shared_mlp.output_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.32.input_layernorm.weight": "model-00002-of-00002.safetensors",
  "model.layers.32.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+ "model.layers.32.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.32.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.32.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.32.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
  "model.layers.32.shared_mlp.input_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.32.shared_mlp.output_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.33.input_layernorm.weight": "model-00002-of-00002.safetensors",
  "model.layers.33.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+ "model.layers.33.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.33.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.33.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.33.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
  "model.layers.33.shared_mlp.input_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.33.shared_mlp.output_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.34.input_layernorm.weight": "model-00002-of-00002.safetensors",
  "model.layers.34.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+ "model.layers.34.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.34.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.34.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.34.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
  "model.layers.34.shared_mlp.input_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.34.shared_mlp.output_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.35.input_layernorm.weight": "model-00002-of-00002.safetensors",

  "model.layers.35.shared_mlp.input_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.35.shared_mlp.output_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.36.input_layernorm.weight": "model-00002-of-00002.safetensors",
  "model.layers.36.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+ "model.layers.36.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.36.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.36.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.36.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
  "model.layers.36.shared_mlp.input_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.36.shared_mlp.output_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.37.input_layernorm.weight": "model-00002-of-00002.safetensors",
  "model.layers.37.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+ "model.layers.37.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.37.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.37.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.37.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
  "model.layers.37.shared_mlp.input_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.37.shared_mlp.output_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.38.input_layernorm.weight": "model-00002-of-00002.safetensors",
  "model.layers.38.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+ "model.layers.38.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.38.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.38.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.38.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
  "model.layers.38.shared_mlp.input_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.38.shared_mlp.output_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.39.input_layernorm.weight": "model-00002-of-00002.safetensors",
  "model.layers.39.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+ "model.layers.39.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.39.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.39.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.39.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
  "model.layers.39.shared_mlp.input_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.39.shared_mlp.output_linear.weight": "model-00002-of-00002.safetensors",
  "model.layers.4.input_layernorm.weight": "model-00001-of-00002.safetensors",
  "model.layers.4.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.4.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.4.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.4.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.4.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.4.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.4.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.5.input_layernorm.weight": "model-00001-of-00002.safetensors",

  "model.layers.5.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.5.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.6.input_layernorm.weight": "model-00001-of-00002.safetensors",
  "model.layers.6.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.6.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.6.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.6.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.6.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.6.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.6.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.7.input_layernorm.weight": "model-00001-of-00002.safetensors",
  "model.layers.7.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.7.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.7.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.7.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.7.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.7.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.7.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.8.input_layernorm.weight": "model-00001-of-00002.safetensors",
  "model.layers.8.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.8.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.8.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.8.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.8.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.8.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.8.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.9.input_layernorm.weight": "model-00001-of-00002.safetensors",
  "model.layers.9.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.9.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.9.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.9.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.9.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
  "model.layers.9.shared_mlp.input_linear.weight": "model-00001-of-00002.safetensors",
  "model.layers.9.shared_mlp.output_linear.weight": "model-00001-of-00002.safetensors",
  "model.norm.weight": "model-00002-of-00002.safetensors"
model.sig CHANGED
@@ -1 +1 @@
- {"mediaType": "application/vnd.dev.sigstore.bundle.v0.3+json", "verificationMaterial": {"certificate": {"rawBytes": "MIIC4zCCAmqgAwIBAgIUIz7dSNWJmRlLaNf6Fwo+6glMUP8wCgYIKoZIzj0EAwMwNzEVMBMGA1UEChMMc2lnc3RvcmUuZGV2MR4wHAYDVQQDExVzaWdzdG9yZS1pbnRlcm1lZGlhdGUwHhcNMjUwOTI0MTcyNjIyWhcNMjUwOTI0MTczNjIyWjAAMFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAE2THKXDyLV2izTRJcUQuCYtJt3M000+BFHvRCQi8KpX8ooaJ85Qlbk8IkFFRcnN+gUyufcUd7IU0fFUpXkVXpNqOCAYkwggGFMA4GA1UdDwEB/wQEAwIHgDATBgNVHSUEDDAKBggrBgEFBQcDAzAdBgNVHQ4EFgQUew1PUIxqbFo4LwkNL+XWILpfECMwHwYDVR0jBBgwFoAU39Ppz1YkEZb5qNjpKFWixi4YZD8wJAYDVR0RAQH/BBowGIEWR3Jhbml0ZS12ZXJpZnlAaWJtLmNvbTA0BgorBgEEAYO/MAEBBCZodHRwczovL3NpZ3N0b3JlLnZlcmlmeS5pYm0uY29tL29hdXRoMjA2BgorBgEEAYO/MAEIBCgMJmh0dHBzOi8vc2lnc3RvcmUudmVyaWZ5LmlibS5jb20vb2F1dGgyMIGJBgorBgEEAdZ5AgQCBHsEeQB3AHUA3T0wasbHETJjGR4cmWc3AqJKXrjePK3/h4pygC8p7o4AAAGZfMMOvAAABAMARjBEAiAa7dyMoArio18kl4lqol47m6ZOHaHB5WrOS7z9aigYHwIgFtgPA7+NlKvv3TczaYA3cXW2rJVoGmEbSKQXuGo6Q8cwCgYIKoZIzj0EAwMDZwAwZAIwBZddrDWSaIdLnAza8Omu1EE1W3J2FuVnHXigSOn3RHYK6qeILLVbM9KPmPUHjExwAjBecEluky0GBi56zuco8LDJNO907MjuFFhdmixjREhc2TizbZnwsHRCcU9MP8AHYg0="}, "tlogEntries": [{"logIndex": "556560144", "logId": {"keyId": "wNI9atQGlz+VWfO6LRygH4QUfY/8W4RFwiT5i5WRgB0="}, "kindVersion": {"kind": "dsse", "version": "0.0.1"}, "integratedTime": "1758734782", "inclusionPromise": {"signedEntryTimestamp": "MEQCIFMfntFVx1PEs4/Ev0eUJV9uPQjuaYQ6SLAyFRpKP3s6AiA6Z5y0MhzZErw96zwp+dzhx8C+3XUdGI0aQGw+X0HipA=="}, "inclusionProof": {"logIndex": "434655882", "rootHash": "2UOKekUF3cU9B/AGxzhY5GY4NQ6du3fDa7m1VtLpBR4=", "treeSize": "434655888", "hashes": ["GgE8dyvpboODk+Wl1RpdSjH8r+F2dAZGhonXqLpriHQ=", "Rb/zqdvprTOvH7p4quLe5Sk/hC+iWHcK9ayxzRcz0C8=", "S2Xx7wRYWQEKIaF+wmoxfv5pHUWnDQhUm+0YzgcR+F4=", "8ihB5zwIKThwprjf9doSf9UcFLLhs116j2xGWxNCNYA=", "cIsHv2cUgKxxTENXhz6XbwF6n6p25+rvH4MLAyXUTp0=", "6IPaN568kpWGmiDUsygOPWzAm9lEM7zSiJFaO/sU1WM=", "QO5bqvUMa8UFsF8Jcs1qRGNGzcTnHjLD1SNL+5y8uE8=", "V3XSrFA6fS5Zb3qaN0gQIOAcARCDwTRi2Aky06MUF98=", "58t9iQVfZHZM98IeZympYWj5Ec2lSv6GfJD4Dzllr0A=", "kIpC8HZ1RqnIJoiaLXMnAF27OjCtAmXbe2DJqVL303o=", "2Ums0BJj8X63OztoWu5/Iu3n3DySpmrs46my98VBQuA=", "xpJKkCRds0GSfols2CmZTiYCRJjEXZ4F1M6bU2bnjrw=", "rO8wDSOjmY8VkspFqYaJS4TV5HxywICMlHM8gTxXkAA=", "1mfy94KpcItqshH9+gwqV6jccupcaMpVsF28New8zDY=", "vS7O4ozHIQZJWBiov+mkpI27GE8zAmVCEkRcP3NDyNE="], "checkpoint": {"envelope": "rekor.sigstore.dev - 1193050959916656506\n434655888\n2UOKekUF3cU9B/AGxzhY5GY4NQ6du3fDa7m1VtLpBR4=\n\n\u2014 rekor.sigstore.dev wNI9ajBEAiBqR1kCvpyVEX499F28chSgRlqL0DflQ67T69NWWrn1twIgXyIxwioSVFsRK0zSJBIjO35KN7T4BgNJ6kUizZkyDTc=\n"}}, "canonicalizedBody": 
"eyJhcGlWZXJzaW9uIjoiMC4wLjEiLCJraW5kIjoiZHNzZSIsInNwZWMiOnsiZW52ZWxvcGVIYXNoIjp7ImFsZ29yaXRobSI6InNoYTI1NiIsInZhbHVlIjoiNmM0OGZiZjcwYWM3NzMxY2ZmMmM2MGEyNTQzYTY2YTIwNjk5MTFjOTU2MGVhN2QwOTQzYThjYjE2MDU0MzY1YSJ9LCJwYXlsb2FkSGFzaCI6eyJhbGdvcml0aG0iOiJzaGEyNTYiLCJ2YWx1ZSI6IjZhODkzNjM2N2U4MDBlYzEyMTI0YTZmZmQ4MWFjZDVmMTg2NjY3OTc4YjE3ZGE4NGQ1ZWRkZTNhYTBiNmQzNTcifSwic2lnbmF0dXJlcyI6W3sic2lnbmF0dXJlIjoiTUVRQ0lISGdRVW9BUlNQTVd5aVIvLzJXblpqV0FIRHA3cWtVbEZzMXNZaFNjT1lrQWlBV3p3Z0RqQlV3elZITWtoN1lMekUzRjBxV0V3YjNsenR5cXRpL3pYbG5GZz09IiwidmVyaWZpZXIiOiJMUzB0TFMxQ1JVZEpUaUJEUlZKVVNVWkpRMEZVUlMwdExTMHRDazFKU1VNMGVrTkRRVzF4WjBGM1NVSkJaMGxWU1hvM1pGTk9WMHB0VW14TVlVNW1Oa1ozYnlzMloyeE5WVkE0ZDBObldVbExiMXBKZW1vd1JVRjNUWGNLVG5wRlZrMUNUVWRCTVZWRlEyaE5UV015Ykc1ak0xSjJZMjFWZFZwSFZqSk5ValIzU0VGWlJGWlJVVVJGZUZaNllWZGtlbVJIT1hsYVV6RndZbTVTYkFwamJURnNXa2RzYUdSSFZYZElhR05PVFdwVmQwOVVTVEJOVkdONVRtcEplVmRvWTA1TmFsVjNUMVJKTUUxVVkzcE9ha2w1VjJwQlFVMUdhM2RGZDFsSUNrdHZXa2w2YWpCRFFWRlpTVXR2V2tsNmFqQkVRVkZqUkZGblFVVXlWRWhMV0VSNVRGWXlhWHBVVWtwalZWRjFRMWwwU25RelRUQXdNQ3RDUmtoMlVrTUtVV2s0UzNCWU9HOXZZVW80TlZGc1ltczRTV3RHUmxKamJrNHJaMVY1ZFdaalZXUTNTVlV3WmtaVmNGaHJWbGh3VG5GUFEwRlphM2RuWjBkR1RVRTBSd3BCTVZWa1JIZEZRaTkzVVVWQmQwbElaMFJCVkVKblRsWklVMVZGUkVSQlMwSm5aM0pDWjBWR1FsRmpSRUY2UVdSQ1owNVdTRkUwUlVablVWVmxkekZRQ2xWSmVIRmlSbTgwVEhkclRrd3JXRmRKVEhCbVJVTk5kMGgzV1VSV1VqQnFRa0puZDBadlFWVXpPVkJ3ZWpGWmEwVmFZalZ4VG1wd1MwWlhhWGhwTkZrS1drUTRkMHBCV1VSV1VqQlNRVkZJTDBKQ2IzZEhTVVZYVWpOS2FHSnRiREJhVXpFeVdsaEtjRnB1YkVGaFYwcDBURzFPZG1KVVFUQkNaMjl5UW1kRlJRcEJXVTh2VFVGRlFrSkRXbTlrU0ZKM1kzcHZka3d6VG5CYU0wNHdZak5LYkV4dVdteGpiV3h0WlZNMWNGbHRNSFZaTWpsMFRESTVhR1JZVW05TmFrRXlDa0puYjNKQ1owVkZRVmxQTDAxQlJVbENRMmROU20xb01HUklRbnBQYVRoMll6SnNibU16VW5aamJWVjFaRzFXZVdGWFdqVk1iV3hwWWxNMWFtSXlNSFlLWWpKR01XUkhaM2xOU1VkS1FtZHZja0puUlVWQlpGbzFRV2RSUTBKSWMwVmxVVUl6UVVoVlFUTlVNSGRoYzJKSVJWUktha2RTTkdOdFYyTXpRWEZLU3dwWWNtcGxVRXN6TDJnMGNIbG5Remh3TjI4MFFVRkJSMXBtVFUxUGRrRkJRVUpCVFVGU2FrSkZRV2xCWVRka2VVMXZRWEpwYnpFNGEydzBiSEZ2YkRRM0NtMDJXazlJWVVoQ05WZHlUMU0zZWpsaGFXZFpTSGRKWjBaMFoxQkJOeXRPYkV0MmRqTlVZM3BoV1VFelkxaFhNbkpLVm05SGJVVmlVMHRSV0hWSGJ6WUtVVGhqZDBObldVbExiMXBKZW1vd1JVRjNUVVJhZDBGM1drRkpkMEphWkdSeVJGZFRZVWxrVEc1QmVtRTRUMjExTVVWRk1WY3pTakpHZFZadVNGaHBad3BUVDI0elVraFpTelp4WlVsTVRGWmlUVGxMVUcxUVZVaHFSWGgzUVdwQ1pXTkZiSFZyZVRCSFFtazFObnAxWTI4NFRFUktUazg1TURkTmFuVkdSbWhrQ20xcGVHcFNSV2hqTWxScGVtSmFibmR6U0ZKRFkxVTVUVkE0UVVoWlp6QTlDaTB0TFMwdFJVNUVJRU5GVWxSSlJrbERRVlJGTFMwdExTMEsifV19fQ=="}], "timestampVerificationData": {}}, "dsseEnvelope": {"payload": 
"ewogICJfdHlwZSI6ICJodHRwczovL2luLXRvdG8uaW8vU3RhdGVtZW50L3YxIiwKICAic3ViamVjdCI6IFsKICAgIHsKICAgICAgIm5hbWUiOiAiZ3Jhbml0ZS00LjAtaC1taWNybyIsCiAgICAgICJkaWdlc3QiOiB7CiAgICAgICAgInNoYTI1NiI6ICIyOWYxNjQ0YWMyMzBiNmUyNjQwYTRjZDlkYzc0OTVhM2VkYTVhZjRlOTY5NTcwYjZkMWE1YWEwYzhkZDQwNzk0IgogICAgICB9CiAgICB9CiAgXSwKICAicHJlZGljYXRlVHlwZSI6ICJodHRwczovL21vZGVsX3NpZ25pbmcvc2lnbmF0dXJlL3YxLjAiLAogICJwcmVkaWNhdGUiOiB7CiAgICAic2VyaWFsaXphdGlvbiI6IHsKICAgICAgIm1ldGhvZCI6ICJmaWxlcyIsCiAgICAgICJhbGxvd19zeW1saW5rcyI6IGZhbHNlLAogICAgICAiaGFzaF90eXBlIjogInNoYTI1NiIKICAgIH0sCiAgICAicmVzb3VyY2VzIjogWwogICAgICB7CiAgICAgICAgIm5hbWUiOiAiUkVBRE1FLm1kIiwKICAgICAgICAiZGlnZXN0IjogIjQwZmRhMjExMDJkMGNlNWQ5ZDE3YmUwNDVkZGIwZmM5YzllN2Y0ZDIyYmQ4NzcxMTJkMTZmMDI5NDlkZjBkNWIiLAogICAgICAgICJhbGdvcml0aG0iOiAic2hhMjU2IgogICAgICB9LAogICAgICB7CiAgICAgICAgIm5hbWUiOiAiY2hhdF90ZW1wbGF0ZS5qaW5qYSIsCiAgICAgICAgImRpZ2VzdCI6ICJmZWQyNzU2ZDJkMjRlMTI3Yjk1MWRjZjEzOWQwYjAzYWI3ZGI4ZWYyM2E0NTYxMjhlYmM5YzJkYjQ5MDFkNDc2IiwKICAgICAgICAiYWxnb3JpdGhtIjogInNoYTI1NiIKICAgICAgfSwKICAgICAgewogICAgICAgICJuYW1lIjogImNvbmZpZy5qc29uIiwKICAgICAgICAiZGlnZXN0IjogImJkODI4YjAwZWRjYzI1ZjkzZDZmYTU1MjEwYWNlYmNkMDU4MDU0YzRkNWI1ZmE2YzUxNDZiMGEwZDE4MTcyZWUiLAogICAgICAgICJhbGdvcml0aG0iOiAic2hhMjU2IgogICAgICB9LAogICAgICB7CiAgICAgICAgIm5hbWUiOiAiZ2VuZXJhdGlvbl9jb25maWcuanNvbiIsCiAgICAgICAgImRpZ2VzdCI6ICI3YzA0Y2I5ZDJiYTc3MWY3NTI4ZmJhNWE3MTA0OTk5Y2RhZjc1NjZkMDJiNWZiZDU4NDcyODI5ZjYyNzE2MTc3IiwKICAgICAgICAiYWxnb3JpdGhtIjogInNoYTI1NiIKICAgICAgfSwKICAgICAgewogICAgICAgICJuYW1lIjogIm1lcmdlcy50eHQiLAogICAgICAgICJkaWdlc3QiOiAiYjZmZTQyNGUzMzQ5MDNmN2ZiODRkM2ExMDZkOTczMDQ1NWY0NzQ0YjlmZTNjMjFlZTEzNmQ5N2EwMGU3MjUwMiIsCiAgICAgICAgImFsZ29yaXRobSI6ICJzaGEyNTYiCiAgICAgIH0sCiAgICAgIHsKICAgICAgICAibmFtZSI6ICJtb2RlbC0wMDAwMS1vZi0wMDAwMi5zYWZldGVuc29ycyIsCiAgICAgICAgImRpZ2VzdCI6ICI4ODNmMTM0YzA4Mzg1MjU5N2EzMDMzY2ZjN2RlODM1N2FjZGU1MzBiM2ZlOTQxYTU2N2Y1MDc0MzFmNWM5MjE2IiwKICAgICAgICAiYWxnb3JpdGhtIjogInNoYTI1NiIKICAgICAgfSwKICAgICAgewogICAgICAgICJuYW1lIjogIm1vZGVsLTAwMDAyLW9mLTAwMDAyLnNhZmV0ZW5zb3JzIiwKICAgICAgICAiZGlnZXN0IjogIjk4N2U4YWVlMTBjNDA0NDZjNzFmNGM5MWQ2ODk2NzJmMTgwNWY5NDcxMjczOGY3YTQ0ZDA2YTdkZWU1ZTc4N2YiLAogICAgICAgICJhbGdvcml0aG0iOiAic2hhMjU2IgogICAgICB9LAogICAgICB7CiAgICAgICAgIm5hbWUiOiAibW9kZWwuc2FmZXRlbnNvcnMuaW5kZXguanNvbiIsCiAgICAgICAgImRpZ2VzdCI6ICI5ZTk1YThjZWZhOWMyOTA2MzZmNDRmNWNmMzlkZGNmYmI0NTMxYzIwMTUwNmViZTljYWU3YjVjNzQ2YzYwYTA5IiwKICAgICAgICAiYWxnb3JpdGhtIjogInNoYTI1NiIKICAgICAgfSwKICAgICAgewogICAgICAgICJuYW1lIjogInNwZWNpYWxfdG9rZW5zX21hcC5qc29uIiwKICAgICAgICAiZGlnZXN0IjogImMwODY3NmM0OWZkNzk2OWEzMTMwZjcyYmU2ZDRiZjM0ZGE2NmFhNDg0YTZlMjFkZmZlMzU5ODkzYTFiZDVmMmUiLAogICAgICAgICJhbGdvcml0aG0iOiAic2hhMjU2IgogICAgICB9LAogICAgICB7CiAgICAgICAgIm5hbWUiOiAidG9rZW5pemVyLmpzb24iLAogICAgICAgICJkaWdlc3QiOiAiZTJiYWQ2NjQzOTUzOGNiNGQ1YTc1ODA2ODA5MzI0MzJlZDllY2U5ZDNiODU3N2U2NzU1MTJiZGYxMTU5OTI1MyIsCiAgICAgICAgImFsZ29yaXRobSI6ICJzaGEyNTYiCiAgICAgIH0sCiAgICAgIHsKICAgICAgICAibmFtZSI6ICJ0b2tlbml6ZXJfY29uZmlnLmpzb24iLAogICAgICAgICJkaWdlc3QiOiAiYTVlYzVkYWFiMTJiYTA5MGE5MGYzZGQxNjljOGY5YzI3NTU1NzAxM2E4N2I5YzEyNThkYzdjYjQ5N2EzNWM4NiIsCiAgICAgICAgImFsZ29yaXRobSI6ICJzaGEyNTYiCiAgICAgIH0sCiAgICAgIHsKICAgICAgICAibmFtZSI6ICJ2b2NhYi5qc29uIiwKICAgICAgICAiZGlnZXN0IjogIjhhZjcxMDc2ZGU4YjBiNjI2ZWVkMGY0Yzk4NGZhZjBhN2MwNjI0NzkxNjRiMmEzMTMwOGE5NDg1MjRkNGY2OWMiLAogICAgICAgICJhbGdvcml0aG0iOiAic2hhMjU2IgogICAgICB9CiAgICBdCiAgfQp9", "payloadType": "application/vnd.in-toto+json", "signatures": [{"sig": "MEQCIHHgQUoARSPMWyiR//2WnZjWAHDp7qkUlFs1sYhScOYkAiAWzwgDjBUwzVHMkh7YLzE3F0qWEwb3lztyqti/zXlnFg=="}]}}
 
+ {"mediaType": "application/vnd.dev.sigstore.bundle.v0.3+json", "verificationMaterial": {"certificate": {"rawBytes": "MIIC5TCCAmugAwIBAgIUWH9wwFzpevLsYTY93zfV0FDUi8swCgYIKoZIzj0EAwMwNzEVMBMGA1UEChMMc2lnc3RvcmUuZGV2MR4wHAYDVQQDExVzaWdzdG9yZS1pbnRlcm1lZGlhdGUwHhcNMjUwOTI1MTkwNjAzWhcNMjUwOTI1MTkxNjAzWjAAMFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEEB8QY4c/PfLCF746taVJnu2iBsauQ1XSF6jo96ar+ZotfbSTwkgH9tSmzX3pR/NAgow925Oc2o6IlJtmWmUav6OCAYowggGGMA4GA1UdDwEB/wQEAwIHgDATBgNVHSUEDDAKBggrBgEFBQcDAzAdBgNVHQ4EFgQU1LIqbA2OhZfKwqJfyDJ69Pi8osEwHwYDVR0jBBgwFoAU39Ppz1YkEZb5qNjpKFWixi4YZD8wJAYDVR0RAQH/BBowGIEWR3Jhbml0ZS12ZXJpZnlAaWJtLmNvbTA0BgorBgEEAYO/MAEBBCZodHRwczovL3NpZ3N0b3JlLnZlcmlmeS5pYm0uY29tL29hdXRoMjA2BgorBgEEAYO/MAEIBCgMJmh0dHBzOi8vc2lnc3RvcmUudmVyaWZ5LmlibS5jb20vb2F1dGgyMIGKBgorBgEEAdZ5AgQCBHwEegB4AHYA3T0wasbHETJjGR4cmWc3AqJKXrjePK3/h4pygC8p7o4AAAGZgkSwwAAABAMARzBFAiEA4GJ05xun3DQ9ZCHrv1pacsDPBYVOq5oPkbCUXhHpg8ICICrsgoDPtoWSa7ylIBTCdq4G6w9oJEsO8ynouNttcpaaMAoGCCqGSM49BAMDA2gAMGUCMEUPpumWESosi2Hm5RkMlB4jL0BIyuKinDaU1bQSzSpn6f8JNNBLQffF42V+EnaaKwIxAO/WNK/SmYBxOpvhNt1QCbQSp94dOCkYxYp/tjTPEeEtVcSnphGrAbW7AJlHiomkpw=="}, "tlogEntries": [{"logIndex": "561461472", "logId": {"keyId": "wNI9atQGlz+VWfO6LRygH4QUfY/8W4RFwiT5i5WRgB0="}, "kindVersion": {"kind": "dsse", "version": "0.0.1"}, "integratedTime": "1758827164", "inclusionPromise": {"signedEntryTimestamp": "MEUCIQDEW2hQH1gP00xKYpJJRX8DxkXsmJh+rVBhRWriEQDzbgIgWbTY8wpHAOtxa7E5LQBE6LjuJeXgKy4RCDjA9kyofho="}, "inclusionProof": {"logIndex": "439557210", "rootHash": "tgjs68I6j9oSmIOxvOJR1wZUAMFUfy9DhA2W7s8YmyM=", "treeSize": "439557212", "hashes": ["jaMPDrARWqsH6EK0Dw6bR1VyQ1I2hy1MbB6yWYONUcU=", "R7tp8/8EVVUGywLt38rPZszS5v2rhF1StEizGu5aJfg=", "d0M1DenVFHpuphG+LfasnYu+B4Mk8d4DQwJhU7Zzi74=", "+CPUJ6rUVtKi/18CdjSESAcbJnyHUbbCd2xLRRPWO2A=", "cgaBbHz4ezZg1/G6gdyU6EInaVzB4IGLmBce34K4qK0=", "j4/0lQnXbtFc2kPaDCFOe5CVbTyJs9D5Wjfb4VEYXsg=", "xxvTHr7VsxGhfygAVJOlChIHhV1XnCsWlC+GsvWcLF0=", "ubEVkHajzdKuV4PpXOfz+PEER9LTlr+ZP3vfbPlih5Q=", "G7BxuwFumlbbybuh/z58+H2m1f18lCqshZPbPATnMSE=", "69FNNnHmKTte3DWvJGucN3Ql6eH+wiPhRoejCigTz5U=", "rzUmQ4MG+pdNvwZxrIgx8Ptz8EMePrdTuyV1gno22jE=", "AmhPJRi9wKH8isay1YeiXop4PG6Yg9wQmdn/3pf7TcA=", "/buh7NyeUdCgJPdaMbTAtiQpMkNHhk+xMsI2shkTieM=", "1mfy94KpcItqshH9+gwqV6jccupcaMpVsF28New8zDY=", "vS7O4ozHIQZJWBiov+mkpI27GE8zAmVCEkRcP3NDyNE="], "checkpoint": {"envelope": "rekor.sigstore.dev - 1193050959916656506\n439557212\ntgjs68I6j9oSmIOxvOJR1wZUAMFUfy9DhA2W7s8YmyM=\n\n\u2014 rekor.sigstore.dev wNI9ajBEAiBi5xLYhyq+t70db0tObJSs8IoZ1Eml5k1qHXPYttUgJAIgLSWUChdjVr3a+JWoi0cqAotePk0+h6S/ralUqrR3Jxg=\n"}}, "canonicalizedBody": 
"eyJhcGlWZXJzaW9uIjoiMC4wLjEiLCJraW5kIjoiZHNzZSIsInNwZWMiOnsiZW52ZWxvcGVIYXNoIjp7ImFsZ29yaXRobSI6InNoYTI1NiIsInZhbHVlIjoiNGMzNzk0ZTlkMmE1ODZkMThiZGRjY2Q5ZGVkMjQ1OGQzODUyZDZkZGIyYmRkOWU4NGNlOTBmNGY4OGFmZmQxNiJ9LCJwYXlsb2FkSGFzaCI6eyJhbGdvcml0aG0iOiJzaGEyNTYiLCJ2YWx1ZSI6IjY4MGZlOGJlMTRmYWZiNjFkNjlmNTRmODU3MTVmNDNmMjk4YWQwMDlmZTkyYjI3ZDY4NzFmYWI0ZDhmNTk3MjkifSwic2lnbmF0dXJlcyI6W3sic2lnbmF0dXJlIjoiTUVRQ0lCMjBnd2doNzFmZVRyVUVOSDg0UURLM3k3bm1zUG9pSUxYWlRqT1ZDMmhDQWlBNVN3VEZyQ29uMkZkODlHWFQ1cStZVVF5NEttOHI4VGRXek4ybXVuQTRBUT09IiwidmVyaWZpZXIiOiJMUzB0TFMxQ1JVZEpUaUJEUlZKVVNVWkpRMEZVUlMwdExTMHRDazFKU1VNMVZFTkRRVzExWjBGM1NVSkJaMGxWVjBnNWQzZEdlbkJsZGt4eldWUlpPVE42WmxZd1JrUlZhVGh6ZDBObldVbExiMXBKZW1vd1JVRjNUWGNLVG5wRlZrMUNUVWRCTVZWRlEyaE5UV015Ykc1ak0xSjJZMjFWZFZwSFZqSk5ValIzU0VGWlJGWlJVVVJGZUZaNllWZGtlbVJIT1hsYVV6RndZbTVTYkFwamJURnNXa2RzYUdSSFZYZElhR05PVFdwVmQwOVVTVEZOVkd0M1RtcEJlbGRvWTA1TmFsVjNUMVJKTVUxVWEzaE9ha0Y2VjJwQlFVMUdhM2RGZDFsSUNrdHZXa2w2YWpCRFFWRlpTVXR2V2tsNmFqQkVRVkZqUkZGblFVVkZRamhSV1RSakwxQm1URU5HTnpRMmRHRldTbTUxTW1sQ2MyRjFVVEZZVTBZMmFtOEtPVFpoY2l0YWIzUm1ZbE5VZDJ0blNEbDBVMjE2V0ROd1VpOU9RV2R2ZHpreU5VOWpNbTgyU1d4S2RHMVhiVlZoZGpaUFEwRlpiM2RuWjBkSFRVRTBSd3BCTVZWa1JIZEZRaTkzVVVWQmQwbElaMFJCVkVKblRsWklVMVZGUkVSQlMwSm5aM0pDWjBWR1FsRmpSRUY2UVdSQ1owNVdTRkUwUlVablVWVXhURWx4Q21KQk1rOW9XbVpMZDNGS1pubEVTalk1VUdrNGIzTkZkMGgzV1VSV1VqQnFRa0puZDBadlFWVXpPVkJ3ZWpGWmEwVmFZalZ4VG1wd1MwWlhhWGhwTkZrS1drUTRkMHBCV1VSV1VqQlNRVkZJTDBKQ2IzZEhTVVZYVWpOS2FHSnRiREJhVXpFeVdsaEtjRnB1YkVGaFYwcDBURzFPZG1KVVFUQkNaMjl5UW1kRlJRcEJXVTh2VFVGRlFrSkRXbTlrU0ZKM1kzcHZka3d6VG5CYU0wNHdZak5LYkV4dVdteGpiV3h0WlZNMWNGbHRNSFZaTWpsMFRESTVhR1JZVW05TmFrRXlDa0puYjNKQ1owVkZRVmxQTDAxQlJVbENRMmROU20xb01HUklRbnBQYVRoMll6SnNibU16VW5aamJWVjFaRzFXZVdGWFdqVk1iV3hwWWxNMWFtSXlNSFlLWWpKR01XUkhaM2xOU1VkTFFtZHZja0puUlVWQlpGbzFRV2RSUTBKSWQwVmxaMEkwUVVoWlFUTlVNSGRoYzJKSVJWUktha2RTTkdOdFYyTXpRWEZLU3dwWWNtcGxVRXN6TDJnMGNIbG5Remh3TjI4MFFVRkJSMXBuYTFOM2QwRkJRVUpCVFVGU2VrSkdRV2xGUVRSSFNqQTFlSFZ1TTBSUk9WcERTSEoyTVhCaENtTnpSRkJDV1ZaUGNUVnZVR3RpUTFWWWFFaHdaemhKUTBsRGNuTm5iMFJRZEc5WFUyRTNlV3hKUWxSRFpIRTBSelozT1c5S1JYTlBPSGx1YjNWT2RIUUtZM0JoWVUxQmIwZERRM0ZIVTAwME9VSkJUVVJCTW1kQlRVZFZRMDFGVlZCd2RXMVhSVk52YzJreVNHMDFVbXROYkVJMGFrd3dRa2w1ZFV0cGJrUmhWUW94WWxGVGVsTndialptT0VwT1RrSk1VV1ptUmpReVZpdEZibUZoUzNkSmVFRlBMMWRPU3k5VGJWbENlRTl3ZG1oT2RERlJRMkpSVTNBNU5HUlBRMnRaQ25oWmNDOTBhbFJRUldWRmRGWmpVMjV3YUVkeVFXSlhOMEZLYkVocGIyMXJjSGM5UFFvdExTMHRMVVZPUkNCRFJWSlVTVVpKUTBGVVJTMHRMUzB0Q2c9PSJ9XX19"}], "timestampVerificationData": {}}, "dsseEnvelope": {"payload": 
"ewogICJfdHlwZSI6ICJodHRwczovL2luLXRvdG8uaW8vU3RhdGVtZW50L3YxIiwKICAic3ViamVjdCI6IFsKICAgIHsKICAgICAgIm5hbWUiOiAiZ3Jhbml0ZS00LjAtbWljcm8iLAogICAgICAiZGlnZXN0IjogewogICAgICAgICJzaGEyNTYiOiAiMThjNjI1MmI5MjU4NGQ1YTRjOGFlMmMxNDU5OTRiOTAxNWZlOTgzYjhiODRkYzdjMDczMmQ4YTJkNGVkZTRlNiIKICAgICAgfQogICAgfQogIF0sCiAgInByZWRpY2F0ZVR5cGUiOiAiaHR0cHM6Ly9tb2RlbF9zaWduaW5nL3NpZ25hdHVyZS92MS4wIiwKICAicHJlZGljYXRlIjogewogICAgInNlcmlhbGl6YXRpb24iOiB7CiAgICAgICJoYXNoX3R5cGUiOiAic2hhMjU2IiwKICAgICAgImFsbG93X3N5bWxpbmtzIjogZmFsc2UsCiAgICAgICJtZXRob2QiOiAiZmlsZXMiCiAgICB9LAogICAgInJlc291cmNlcyI6IFsKICAgICAgewogICAgICAgICJkaWdlc3QiOiAiNDFiYjQ4MTBjMmNmYjQzNTRkOGRiODQ1NmY2MjBjYmU4OWY2MTBiNzY4NzY2ZDk4YWY4OTE0OWUxOTI3YjgxNCIsCiAgICAgICAgIm5hbWUiOiAiUkVBRE1FLm1kIiwKICAgICAgICAiYWxnb3JpdGhtIjogInNoYTI1NiIKICAgICAgfSwKICAgICAgewogICAgICAgICJkaWdlc3QiOiAiZmVkMjc1NmQyZDI0ZTEyN2I5NTFkY2YxMzlkMGIwM2FiN2RiOGVmMjNhNDU2MTI4ZWJjOWMyZGI0OTAxZDQ3NiIsCiAgICAgICAgIm5hbWUiOiAiY2hhdF90ZW1wbGF0ZS5qaW5qYSIsCiAgICAgICAgImFsZ29yaXRobSI6ICJzaGEyNTYiCiAgICAgIH0sCiAgICAgIHsKICAgICAgICAiZGlnZXN0IjogImUxMWEyNDQzNDc2OWYxOTI2YzJiNzNlYTRiNmJmNzExNTUzM2UzNjQ0ZDU1M2Q1YTM0Nzc5NzA4MmU2NWMwZjIiLAogICAgICAgICJuYW1lIjogImNvbmZpZy5qc29uIiwKICAgICAgICAiYWxnb3JpdGhtIjogInNoYTI1NiIKICAgICAgfSwKICAgICAgewogICAgICAgICJkaWdlc3QiOiAiN2MwNGNiOWQyYmE3NzFmNzUyOGZiYTVhNzEwNDk5OWNkYWY3NTY2ZDAyYjVmYmQ1ODQ3MjgyOWY2MjcxNjE3NyIsCiAgICAgICAgIm5hbWUiOiAiZ2VuZXJhdGlvbl9jb25maWcuanNvbiIsCiAgICAgICAgImFsZ29yaXRobSI6ICJzaGEyNTYiCiAgICAgIH0sCiAgICAgIHsKICAgICAgICAiZGlnZXN0IjogImI2ZmU0MjRlMzM0OTAzZjdmYjg0ZDNhMTA2ZDk3MzA0NTVmNDc0NGI5ZmUzYzIxZWUxMzZkOTdhMDBlNzI1MDIiLAogICAgICAgICJuYW1lIjogIm1lcmdlcy50eHQiLAogICAgICAgICJhbGdvcml0aG0iOiAic2hhMjU2IgogICAgICB9LAogICAgICB7CiAgICAgICAgImRpZ2VzdCI6ICIxMzFhNzkzYjQ1ZGRlZWU4ZTdjODNhNzc4NDZjMzUwYjUyMzE2MTU4OTE3ZWY1ODljMDA2YzRhMTBmMmRlOTUyIiwKICAgICAgICAibmFtZSI6ICJtb2RlbC0wMDAwMS1vZi0wMDAwMi5zYWZldGVuc29ycyIsCiAgICAgICAgImFsZ29yaXRobSI6ICJzaGEyNTYiCiAgICAgIH0sCiAgICAgIHsKICAgICAgICAiZGlnZXN0IjogIjZiNjM4M2E1YWNhNzIzOTQwZDllNGJlNjNmYTE1NzVlOWM0NDk0MWU3ZjVkNDg5YjkyM2MwZWFlYWRhMmMwMmYiLAogICAgICAgICJuYW1lIjogIm1vZGVsLTAwMDAyLW9mLTAwMDAyLnNhZmV0ZW5zb3JzIiwKICAgICAgICAiYWxnb3JpdGhtIjogInNoYTI1NiIKICAgICAgfSwKICAgICAgewogICAgICAgICJkaWdlc3QiOiAiMjA4OTU5ZjViZWIyZGQzNjIwZmI1Y2YyN2Q3M2ZiODNkYWIwYmU3NGQ3OTQ4NmVmMGEyOGUwNzQ5YTRiYjBkYSIsCiAgICAgICAgIm5hbWUiOiAibW9kZWwuc2FmZXRlbnNvcnMuaW5kZXguanNvbiIsCiAgICAgICAgImFsZ29yaXRobSI6ICJzaGEyNTYiCiAgICAgIH0sCiAgICAgIHsKICAgICAgICAiZGlnZXN0IjogImMwODY3NmM0OWZkNzk2OWEzMTMwZjcyYmU2ZDRiZjM0ZGE2NmFhNDg0YTZlMjFkZmZlMzU5ODkzYTFiZDVmMmUiLAogICAgICAgICJuYW1lIjogInNwZWNpYWxfdG9rZW5zX21hcC5qc29uIiwKICAgICAgICAiYWxnb3JpdGhtIjogInNoYTI1NiIKICAgICAgfSwKICAgICAgewogICAgICAgICJkaWdlc3QiOiAiZTJiYWQ2NjQzOTUzOGNiNGQ1YTc1ODA2ODA5MzI0MzJlZDllY2U5ZDNiODU3N2U2NzU1MTJiZGYxMTU5OTI1MyIsCiAgICAgICAgIm5hbWUiOiAidG9rZW5pemVyLmpzb24iLAogICAgICAgICJhbGdvcml0aG0iOiAic2hhMjU2IgogICAgICB9LAogICAgICB7CiAgICAgICAgImRpZ2VzdCI6ICJhNWVjNWRhYWIxMmJhMDkwYTkwZjNkZDE2OWM4ZjljMjc1NTU3MDEzYTg3YjljMTI1OGRjN2NiNDk3YTM1Yzg2IiwKICAgICAgICAibmFtZSI6ICJ0b2tlbml6ZXJfY29uZmlnLmpzb24iLAogICAgICAgICJhbGdvcml0aG0iOiAic2hhMjU2IgogICAgICB9LAogICAgICB7CiAgICAgICAgImRpZ2VzdCI6ICI4YWY3MTA3NmRlOGIwYjYyNmVlZDBmNGM5ODRmYWYwYTdjMDYyNDc5MTY0YjJhMzEzMDhhOTQ4NTI0ZDRmNjljIiwKICAgICAgICAibmFtZSI6ICJ2b2NhYi5qc29uIiwKICAgICAgICAiYWxnb3JpdGhtIjogInNoYTI1NiIKICAgICAgfQogICAgXQogIH0KfQ==", "payloadType": "application/vnd.in-toto+json", "signatures": [{"sig": "MEQCIB20gwgh71feTrUENH84QDK3y7nmsPoiILXZTjOVC2hCAiA5SwTFrCon2Fd89GXT5q+YUQy4Km8r8TdWzN2munA4AQ=="}]}}