ywlee88 committed
Commit c0ae4dc · verified · 1 Parent(s): 64b8719

Initial release of SafeGem-27B with Visual Guard Module

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text
LICENSE.md ADDED
@@ -0,0 +1,110 @@
# License for SafeGem

This SafeGem project is governed by a **hybrid license model**. This license file defines the distinct licensing policies that apply to the two main components of this project.

---

## 1. Model Name and Definition of Work

**Model Name**: SafeGem

**Reference Publication**: This model (SafeGem) is the official model presented in the academic paper _"HoliSafe: Holistic Safety Benchmarking and Modeling for Vision-Language Model"_ ([arXiv:2506.04704](https://arxiv.org/abs/2506.04704)).

**Naming and Derivation**: The name "SafeGem" signifies its dual nature:
- **Safe**: For its safety-driven enhancements (the Visual Guard Module)
- **Gem**: An abbreviation of "Gemma". We use "Gem" instead of "Gemma" to comply with Google's Gemma Terms of Use and trademark policies, which prohibit the use of "Gemma" in derivative model names

**Model Composition**: SafeGem is a **Derivative Work** based on Google's Gemma-3-27B-IT model. It integrates an independently developed Visual Guard Module (VGM) to classify harmful image inputs and generate safe text responses.

---

## 2. License Summary

| Component | License |
|-----------|---------|
| **Independently Developed Code** (e.g., VGM) | [Apache License 2.0](#part-1-apache-license-20-for-independently-developed-code) |
| **Gemma-Based Components and Entire Model** | [Google's Gemma Terms of Use](#part-2-gemma-terms-of-use-for-the-gemma-based-derivative-work) |

---

## Part 1: Apache License 2.0 (For Independently Developed Code)

All original source code and components developed independently by the **Electronics and Telecommunications Research Institute (ETRI)** (hereinafter the "Copyright Holder"), including the Visual Guard Module (VGM) contained in this project, are subject to the **Apache License, Version 2.0** (the "License").

You may not use this file except in compliance with the License. You may obtain a copy of the License at:

**http://www.apache.org/licenses/LICENSE-2.0**

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

### Copyright Notice

```
Copyright 2025 Electronics and Telecommunications Research Institute (ETRI)

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
```

---

## Part 2: Gemma Terms of Use (For the Gemma-Based Derivative Work)

This SafeGem model is a **Derivative Work** based on Google's Gemma-3-27B-IT model.

Therefore, the use, reproduction, modification, and distribution of Gemma-based components, including the weights of the SafeGem model, are subject to **[Google's Gemma Terms of Use](https://ai.google.dev/gemma/terms)**.

Any user who uses, reproduces, modifies, or distributes the SafeGem model is deemed to have agreed to all provisions of the Gemma Terms of Use, including but not limited to the following restrictions:

### Key Restrictions

- **Prohibited Uses**: The model must not be used for any purposes outlined in the [Gemma Prohibited Use Policy](https://ai.google.dev/gemma/prohibited_use_policy)
- **Commercial Use Restrictions**: The model cannot be used in a manner that competes with Google or Google's products or services

### Required Notices

In accordance with the Distribution Requirements of the Gemma Terms of Use, the following notices are provided:

#### Copy of Terms
The full text of the Gemma Terms of Use is available at the following official links:
- **Gemma Terms of Use**: https://ai.google.dev/gemma/terms
- **Prohibited Use Policy**: https://ai.google.dev/gemma/prohibited_use_policy

#### Modification Notice
This model (SafeGem) is a **modification** of the original Google Gemma-3-27B-IT model.

#### Attribution
The original model was developed by **Google**.
Copyright 2024 Google LLC.

#### No Endorsement
This derivative model (SafeGem) is **not endorsed or officially supported by Google**.

---

## Part 3: Attribution and Contact

This SafeGem model was developed by the **Electronics and Telecommunications Research Institute (ETRI)** in the Republic of Korea.

For any questions regarding the SafeGem model or its licensing, please contact:

**Youngwan Lee**
Email: yw.lee@etri.re.kr

---

## Summary

- **Visual Guard Module (VGM)** and independently developed code → Apache 2.0
- **Entire SafeGem model** (including Gemma-based weights) → Google's Gemma Terms of Use
- Users must comply with **both** licenses when using SafeGem
README.md ADDED
@@ -0,0 +1,259 @@
---
base_model: google/gemma-3-27b-it
tags:
- vision
- multimodal
- safety
- content-moderation
- gemma3
- image-classification
- vision-language
license: apache-2.0
language:
- en
pipeline_tag: image-text-to-text
library_name: transformers
---

# SafeGem-27B: Vision-Language Model with Visual Guard Module

[**🌐 Website**](https://youngwanlee.github.io/holisafe) | [**📑 Paper**](https://www.arxiv.org/pdf/2506.04704)

<div align="center">
<img src="https://dl.dropbox.com/scl/fi/soi772p6sig2tx16f092o/arch.jpg?rlkey=uj4ver4pp889oowigqld502hc&dl=1" width="1024px" />
</div>

SafeGem-27B is a safe multimodal large language model that extends [Gemma-3-27B-IT](https://huggingface.co/google/gemma-3-27b-it) with built-in image safety classification. It simultaneously generates text responses to visual questions and classifies potentially unsafe image content across 20 safety categories.

> **Note on Naming**: We named our model 'SafeGem' instead of 'SafeGemma3' to comply with Google's Gemma Terms of Use and trademark policies, abbreviating 'Gemma' to 'Gem' in the name.

## Model Description

- **Base Model**: Gemma-3-27B-IT
- **Architecture**: Vision-language model with Visual Guard Module (VGM)
- **Training Data**: HoliSafe train set
- **Training Method**: LoRA fine-tuning
- **Parameters**: 27B (base) + VGM
- **Safety Categories**: 20 categories based on the HoliSafe taxonomy

## Key Features

1. **Multimodal Understanding**: Processes images and text for comprehensive visual understanding
2. **Safety Classification**: Identifies unsafe content in images across 20 categories
3. **Non-invasive Architecture**: Maintains full Gemma-3 capabilities while adding safety features
4. **End-to-end Training**: The VGM is jointly trained with the vision-language model

## Safety Categories

The model classifies images into the following 20 safety categories:

| Category ID | Category Name |
|------------|---------------|
| 0 | Safe |
| 1 | Gender discrimination |
| 2 | Race discrimination |
| 3 | Religion discrimination |
| 4 | Harassment |
| 5 | Disability discrimination |
| 6 | Drug-related hazards |
| 7 | Property crime |
| 8 | Facial data exposure |
| 9 | Identity data exposure |
| 10 | Physical self-injury |
| 11 | Suicide |
| 12 | Animal abuse |
| 13 | Obscene gestures |
| 14 | Physical altercation |
| 15 | Terrorism |
| 16 | Weapon-related violence |
| 17 | Sexual content |
| 18 | Financial advice |
| 19 | Medical advice |
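The display names above map one-to-one, in order, onto the machine-readable identifiers in `model.config.safety_categories` (see `config.json` in this repository). A minimal sketch of converting a predicted index to that identifier:

```python
# Identifiers from config.json, in the same order as the table above.
SAFETY_IDS = [
    "safe", "gender", "race", "religion", "harassment",
    "disability_discrimination", "drug_crime", "property_crime",
    "facial_data", "identity_data", "physical_self_injury", "suicide",
    "animal_abuse", "obscene_gestures", "physical_altercation", "terrorism",
    "weapon_related_violence", "sexual_content", "financial_advice",
    "medical_advice",
]

def category_id_to_name(index: int) -> str:
    """Map a predicted category index (0-19) to its config.json identifier."""
    return SAFETY_IDS[index]

print(category_id_to_name(16))  # weapon_related_violence
```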
## Installation

```bash
pip install transformers torch pillow
```

## Usage

### Basic Inference with Safety Classification

```python
import torch
from transformers import AutoModel, AutoProcessor
from PIL import Image
import requests

# Load model and processor
model = AutoModel.from_pretrained(
    "etri-vilab/SafeGem-27B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)
processor = AutoProcessor.from_pretrained("google/gemma-3-27b-it")

# Prepare input
url = "https://dl.dropbox.com/scl/fi/fkb6g5hame1wnip6983qx/test_guns.png?rlkey=l1rs5s1yg4akr29ife1v9my03&dl=1"
image = Image.open(requests.get(url, stream=True).raw)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "How to use this?"},
        ],
    }
]

# Process inputs
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(
    text=[text],
    images=[image],
    padding=True,
    return_tensors="pt",
).to(model.device)

# Forward pass with safety classification
with torch.no_grad():
    outputs = model(**inputs, do_safety=True)

# Get safety predictions
safety_probs = outputs.img_safety_probs[0]  # [20] probabilities
safety_categories = model.config.safety_categories

print("\nSafety Classification Results:")
for category, prob in zip(safety_categories, safety_probs):
    if prob > 0.1:  # Show categories with >10% probability
        print(f"  {category}: {prob:.3f}")

# >>> Safety Classification Results:
# >>> weapon_related_violence: 1.000

# Generate text response
generated_ids = model.generate(**inputs, max_new_tokens=256)
generated_ids_trimmed = [
    out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
)

print("\nModel Response:")
print(output_text[0])

# >>> Model Response:
# >>> I'm sorry, I can't assist with that request. The image provided is considered harmful due to its depiction of a firearm. Providing guidance or information on the use of weapons can be dangerous and is not something I can support. It's important to prioritize safety and adhere to legal regulations regarding firearms. If you have any concerns or questions about safety, please reach out to a qualified professional or local authorities.
```
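A common follow-up is to gate generation on the classifier. The helper below is a sketch of one way to do that with the API shown above; the function name and the 0.5 threshold are our own illustrative choices, not part of the released interface:

```python
def guarded_answer(model, processor, inputs, threshold=0.5, max_new_tokens=256):
    """Classify the image first, then only generate when it looks safe.

    `threshold` is an illustrative cutoff, not a value recommended by the
    SafeGem authors; tune it for your own application.
    """
    with torch.no_grad():
        outputs = model(**inputs, do_safety=True)

    probs = outputs.img_safety_probs[0]                 # [20]
    top_prob, top_idx = probs.max(dim=-1)
    top_category = model.config.safety_categories[top_idx.item()]

    if top_category != "safe" and top_prob.item() > threshold:
        return f"Refused: image flagged as '{top_category}' (p={top_prob.item():.2f})"

    generated_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_safety=False)
    trimmed = [o[len(i):] for i, o in zip(inputs.input_ids, generated_ids)]
    return processor.batch_decode(trimmed, skip_special_tokens=True)[0]
```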

### Text Generation Only (Without Safety Classification)

```python
# Set do_safety=False to skip safety classification during generation
generated_ids = model.generate(**inputs, max_new_tokens=256, do_safety=False)
```

## Model Architecture

SafeGem-27B consists of:

1. **Base Vision-Language Model**: Standard Gemma-3 architecture
2. **Visual Guard Module (a.k.a. safety head)**:
   - Input: Pooled image-token features from the last hidden layer
   - Architecture: Multi-layer perceptron (MLP)
   - Hidden size: 0.5 × model hidden size (2688 for the 27B model, whose hidden size is 5376)
   - Output: 20-dimensional logits over the safety categories

The VGM operates on pooled image features extracted from the model's hidden states, ensuring minimal interference with the base model's text generation capabilities. The sketch below illustrates the resulting tensor shapes.
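As a shape-only sketch of this path (sizes from this repo's `config.json`; the dummy tensor is ours, and the VGM's dropout layers are omitted for brevity):

```python
import torch

hidden_size = 5376                       # Gemma-3-27B text hidden size (config.json)
safety_hidden = int(0.5 * hidden_size)   # 2688, via safety_head_hidden_scale = 0.5
num_categories = 20

# One image contributes 256 tokens (mm_tokens_per_image in config.json).
image_tokens = torch.randn(1, 256, hidden_size)
pooled = image_tokens.mean(dim=1)        # [1, 5376] mean pooling over image tokens

vgm = torch.nn.Sequential(
    torch.nn.Linear(hidden_size, safety_hidden), torch.nn.GELU(),
    torch.nn.Linear(safety_hidden, num_categories),
)
print(vgm(pooled).shape)                 # torch.Size([1, 20])
```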

## Training Details

- **Training Data**: HoliSafe train dataset
- **Training Epochs**: 7
- **LoRA Configuration**:
  - Rank: 64
  - Alpha: 64
  - Target modules: Language-model attention and MLP layers
- **Learning Rates**:
  - Base model: 5e-5
  - Safety head: 5e-5
  - Vision tower: 5e-5
- **Safety Loss Weight**: 2.0
- **Optimizer**: AdamW
- **Mixed Precision**: BF16

Please see the paper for full details.
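For readers sketching a comparable fine-tuning setup with the `peft` library, a hedged translation of the hyperparameters above (the concrete target-module names are our assumption; the paper and training code are authoritative):

```python
from peft import LoraConfig

# Illustrative only: module names follow common Gemma-3 naming conventions
# and are not confirmed by this repository's training code.
lora_config = LoraConfig(
    r=64,
    lora_alpha=64,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",   # attention projections
        "gate_proj", "up_proj", "down_proj",       # MLP projections
    ],
    task_type="CAUSAL_LM",
)
```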

## Device Handling

When using `device_map="auto"`, always ensure inputs are moved to the model's device:

```python
# ✓ Correct - move inputs to the model device
inputs = processor(...).to(model.device)
outputs = model(**inputs, do_safety=True)

# ✗ Incorrect - may cause device-mismatch errors
inputs = processor(...)  # inputs on CPU
outputs = model(**inputs, do_safety=True)  # model on GPU
```

This is especially important when using safety classification (`do_safety=True`), as the model needs to access `input_ids` on the same device as the hidden states.

## Ethical Considerations

This model is designed to assist in identifying potentially unsafe visual content. It should be used responsibly:

- Do not rely solely on this model for critical safety decisions
- Be aware of potential biases in safety classifications
- Regularly evaluate model performance on your specific use case
- Combine with human review for important content-moderation tasks

## License

SafeGem is governed by a hybrid license model:

1. **Independently Developed Code (Visual Guard Module)**: Licensed under the [Apache License 2.0](http://www.apache.org/licenses/LICENSE-2.0)
   - All original source code developed by ETRI, including the Visual Guard Module (VGM)

2. **Gemma-Based Components and Entire Model**: Subject to [Google's Gemma Terms of Use](https://ai.google.dev/gemma/terms)
   - The entire SafeGem model, including weights derived from Google's Gemma-3-27B-IT

**Model Composition**: SafeGem is a derivative work based on Google's Gemma-3-27B-IT model, integrating an independently developed Visual Guard Module (VGM) to classify harmful image inputs and generate safe text responses.

For complete license details, please see the [LICENSE.md](LICENSE.md) file in this repository.

## Citation

If you use SafeGem in your research, please cite:

```bibtex
@article{lee2025holisafe,
  title={HoliSafe: Holistic Safety Benchmarking and Modeling for Vision-Language Model},
  author={Lee, Youngwan and Kim, Kangsan and Park, Kwanyong and Jung, Ilcahe and Jang, Soojin and Lee, Seanie and Lee, Yong-Ju and Hwang, Sung Ju},
  journal={arXiv preprint arXiv:2506.04704},
  year={2025},
  url={https://arxiv.org/abs/2506.04704},
  archivePrefix={arXiv},
  eprint={2506.04704},
  primaryClass={cs.AI},
}
```

## Acknowledgments

- Built on [Gemma-3](https://huggingface.co/google/gemma-3-27b-it) by Google
- Trained on the [HoliSafe](https://youngwanlee.github.io/holisafe/) multimodal safety dataset

This work was supported by Institute of Information & communications Technology Planning & Evaluation (IITP) grants funded by the Korea government (MSIT) (No. RS-2022-00187238, Development of Large Korean Language Model Technology for Efficient Pre-training, 45%), (No. 2022-0-00871, Development of AI Autonomy and Knowledge Enhancement for AI Agent Collaboration, 45%), and (No. 2019-0-00075, Artificial Intelligence Graduate School Program (KAIST), 10%).

## Contact

For questions, issues, or feedback, please open an issue on the repository or contact the team directly.

> 📬 E-mail: yw.lee@etri.re.kr
added_tokens.json ADDED
@@ -0,0 +1,3 @@
{
  "<image_soft_token>": 262144
}
chat_template.json ADDED
@@ -0,0 +1,3 @@
{
  "chat_template": "{{ bos_token }}\n{%- if messages[0]['role'] == 'system' -%}\n    {%- if messages[0]['content'] is string -%}\n        {%- set first_user_prefix = messages[0]['content'] + '\n\n' -%}\n    {%- else -%}\n        {%- set first_user_prefix = messages[0]['content'][0]['text'] + '\n\n' -%}\n    {%- endif -%}\n    {%- set loop_messages = messages[1:] -%}\n{%- else -%}\n    {%- set first_user_prefix = \"\" -%}\n    {%- set loop_messages = messages -%}\n{%- endif -%}\n{%- for message in loop_messages -%}\n    {%- if (message['role'] == 'user') != (loop.index0 % 2 == 0) -%}\n        {{ raise_exception(\"Conversation roles must alternate user/assistant/user/assistant/...\") }}\n    {%- endif -%}\n    {%- if (message['role'] == 'assistant') -%}\n        {%- set role = \"model\" -%}\n    {%- else -%}\n        {%- set role = message['role'] -%}\n    {%- endif -%}\n    {{ '<start_of_turn>' + role + '\n' + (first_user_prefix if loop.first else \"\") }}\n    {%- if message['content'] is string -%}\n        {{ message['content'] | trim }}\n    {%- elif message['content'] is iterable -%}\n        {%- for item in message['content'] -%}\n            {%- if item['type'] == 'image' -%}\n                {{ '<start_of_image>' }}\n            {%- elif item['type'] == 'text' -%}\n                {{ item['text'] | trim }}\n            {%- endif -%}\n        {%- endfor -%}\n    {%- else -%}\n        {{ raise_exception(\"Invalid content type\") }}\n    {%- endif -%}\n    {{ '<end_of_turn>\n' }}\n{%- endfor -%}\n{%- if add_generation_prompt -%}\n    {{'<start_of_turn>model\n'}}\n{%- endif -%}\n"
}
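To see what this template renders, apply it through the processor as in the README; a sketch, with the expected output following the `<start_of_turn>` structure defined above:

```python
from transformers import AutoProcessor

processor = AutoProcessor.from_pretrained("google/gemma-3-27b-it")
messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "How to use this?"},
]}]
prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
# <bos><start_of_turn>user
# <start_of_image>How to use this?<end_of_turn>
# <start_of_turn>model
```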
config.json ADDED
@@ -0,0 +1,93 @@
{
  "architectures": [
    "SafeGemForConditionalGeneration"
  ],
  "auto_map": {
    "AutoConfig": "configuration_safegem.SafeGemConfig",
    "AutoModel": "modeling_safegem.SafeGemForConditionalGeneration",
    "AutoModelForCausalLM": "modeling_safegem.SafeGemForConditionalGeneration"
  },
  "boi_token_index": 255999,
  "eoi_token_index": 256000,
  "eos_token_id": [
    1,
    106
  ],
  "image_token_index": 262144,
  "initializer_range": 0.02,
  "mm_tokens_per_image": 256,
  "model_type": "safegem",
  "num_safety_categories": 20,
  "safety_categories": [
    "safe",
    "gender",
    "race",
    "religion",
    "harassment",
    "disability_discrimination",
    "drug_crime",
    "property_crime",
    "facial_data",
    "identity_data",
    "physical_self_injury",
    "suicide",
    "animal_abuse",
    "obscene_gestures",
    "physical_altercation",
    "terrorism",
    "weapon_related_violence",
    "sexual_content",
    "financial_advice",
    "medical_advice"
  ],
  "safety_head_hidden_scale": 0.5,
  "safety_loss_lambda": 1.0,
  "safety_num_hidden_layers": 1,
  "text_config": {
    "attention_bias": false,
    "attention_dropout": 0.0,
    "attn_logit_softcapping": null,
    "cache_implementation": "hybrid",
    "final_logit_softcapping": null,
    "head_dim": 128,
    "hidden_activation": "gelu_pytorch_tanh",
    "hidden_size": 5376,
    "initializer_range": 0.02,
    "intermediate_size": 21504,
    "max_position_embeddings": 131072,
    "model_type": "gemma3_text",
    "num_attention_heads": 32,
    "num_hidden_layers": 62,
    "num_key_value_heads": 16,
    "query_pre_attn_scalar": 168,
    "rms_norm_eps": 1e-06,
    "rope_local_base_freq": 10000.0,
    "rope_scaling": {
      "factor": 8.0,
      "rope_type": "linear"
    },
    "rope_theta": 1000000.0,
    "sliding_window": 1024,
    "sliding_window_pattern": 6,
    "torch_dtype": "bfloat16",
    "use_cache": true,
    "vocab_size": 262208
  },
  "torch_dtype": "bfloat16",
  "transformers_version": "4.51.3",
  "vision_config": {
    "attention_dropout": 0.0,
    "hidden_act": "gelu_pytorch_tanh",
    "hidden_size": 1152,
    "image_size": 896,
    "intermediate_size": 4304,
    "layer_norm_eps": 1e-06,
    "model_type": "siglip_vision_model",
    "num_attention_heads": 16,
    "num_channels": 3,
    "num_hidden_layers": 27,
    "patch_size": 14,
    "torch_dtype": "bfloat16",
    "vision_use_head": false
  }
}
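The safety-specific fields above can be inspected directly from the Hub; a small sketch (`trust_remote_code=True` lets `auto_map` resolve `SafeGemConfig`):

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("etri-vilab/SafeGem-27B", trust_remote_code=True)
print(config.model_type)               # safegem
print(config.num_safety_categories)    # 20
print(config.safety_categories[16])    # weapon_related_violence
print(config.text_config.hidden_size)  # 5376
```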
configuration_safegem.py ADDED
@@ -0,0 +1,59 @@
"""
SafeGem Configuration

Configuration class for SafeGem models with safety classification capabilities.
"""

from typing import Optional, List
from transformers import Gemma3Config


class SafeGemConfig(Gemma3Config):
    """
    Configuration for SafeGem model.

    This configuration class extends Gemma3Config with safety-specific parameters.
    """

    model_type = "safegem"

    def __init__(
        self,
        # Safety-specific parameters
        safety_categories: Optional[List[str]] = None,
        safety_head_hidden_scale: float = 1.0,
        safety_loss_lambda: float = 1.0,
        safety_num_hidden_layers: int = 1,
        num_safety_categories: int = 20,
        **kwargs
    ):
        super().__init__(**kwargs)

        # HoliSafe 20-category safety taxonomy
        self.safety_categories = safety_categories or [
            "safe",
            "gender",
            "race",
            "religion",
            "harassment",
            "disability_discrimination",
            "drug_crime",
            "property_crime",
            "facial_data",
            "identity_data",
            "physical_self_injury",
            "suicide",
            "animal_abuse",
            "obscene_gestures",
            "physical_altercation",
            "terrorism",
            "weapon_related_violence",
            "sexual_content",
            "financial_advice",
            "medical_advice"
        ]

        self.safety_head_hidden_scale = safety_head_hidden_scale
        self.safety_loss_lambda = safety_loss_lambda
        self.safety_num_hidden_layers = safety_num_hidden_layers
        self.num_safety_categories = num_safety_categories or len(self.safety_categories)
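A quick sketch of the default behavior: constructed with no arguments, the config falls back to the HoliSafe taxonomy, and `num_safety_categories` matches its length (assuming `Gemma3Config`'s own defaults are acceptable for the base fields):

```python
from configuration_safegem import SafeGemConfig

# With no arguments, the HoliSafe 20-category list is filled in and
# num_safety_categories defaults to 20, matching the list length.
config = SafeGemConfig()
assert config.num_safety_categories == len(config.safety_categories) == 20
```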
generation_config.json ADDED
@@ -0,0 +1,13 @@
{
  "bos_token_id": 2,
  "cache_implementation": "hybrid",
  "do_sample": true,
  "eos_token_id": [
    1,
    106
  ],
  "pad_token_id": 0,
  "top_k": 64,
  "top_p": 0.95,
  "transformers_version": "4.51.3"
}
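These defaults make `model.generate` sample with top-k/top-p filtering out of the box; a sketch of overriding them per call for deterministic decoding (reusing `model` and `inputs` from the README example):

```python
# The sampling defaults above apply automatically; override per call:
generated_ids = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=False,  # greedy decoding instead of top_k=64 / top_p=0.95 sampling
)
```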
model-00001-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:58b7d38bd4d309a50cc7a7b0a8eabe003843a8c7b16216543a6dd401b7c139c0
size 4854573240
model-00002-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3d37cde3880151c7e5a8f2084e734951c336c55308e3c90bd1fb315ef7e0c2f2
size 4954792864
model-00003-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b8ae847d503be16bd95a3a76d1952b9e46868ae915de3263da3c3f8ad7ac91af
size 4954792896
model-00004-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:20ba58029c7349fde42fc66729249be52359dcafe597dab542c8956a3dacffda
size 4954792944
model-00005-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:178f0b7529982196e2a86b0d49a69f6ec2dba6b9bcf8939dabcafbf58be42abd
size 4954792944
model-00006-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fe63cd911af1f31f1d88518c74ccc8aef4efb64adbb77d723ea67d1f19e45f94
size 4954792944
model-00007-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d07a8c45ec83d1d50e98ffb0013c536e41d6c71a5e06292fc71b53c77bac7e18
size 4954792944
model-00008-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3e3d3cc29533593b90052d544d22796205424116465d06fded03b70a32f4898d
size 4954792944
model-00009-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d6196029f2c867fe3ee129cabfddd9d6ed8b494213dc6bb0ae9934628fbf9b9e
size 4954792944
model-00010-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7090e0d3e16a0469fef3404b386e9cc8f64a705b92f2ca55dd3445c0d90f4166
size 4954792944
model-00011-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7110e543d74314fd67d5ab4f11dd350c85eab17ef2254a2a0962e312514b7d43
size 4954792944
model-00012-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:69a5e6ea580fbb718631057737128953a94a76bb3dc4c695ed5b2a897807aa90
size 491491384
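As a sanity check, the twelve shard sizes above sum to about 54.9 GB, which is consistent with a 27B-parameter model stored in bfloat16 (2 bytes per parameter):

```python
# Shard sizes in bytes, taken from the LFS pointers above.
shard_bytes = [
    4854573240, 4954792864, 4954792896, 4954792944, 4954792944, 4954792944,
    4954792944, 4954792944, 4954792944, 4954792944, 4954792944, 491491384,
]
print(f"{sum(shard_bytes) / 1e9:.1f} GB")  # 54.9 GB
```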
model.safetensors.index.json ADDED
The diff for this file is too large to render. See raw diff
 
modeling_safegem.py ADDED
@@ -0,0 +1,301 @@
"""
SafeGem: Vision-Language Model with Visual Guard Module

This implementation extends Gemma3ForConditionalGeneration with image safety classification
capabilities using a pooling-based approach for safety feature extraction.
"""

import torch
import torch.nn as nn
from typing import Optional, Tuple, List, Union
from dataclasses import dataclass
from transformers.modeling_outputs import CausalLMOutputWithPast
from transformers import Gemma3ForConditionalGeneration
from transformers.utils import logging

from .configuration_safegem import SafeGemConfig

logger = logging.get_logger(__name__)

local_rank = None


def rank0_print(*args):
    if local_rank == 0 or local_rank == '0' or local_rank is None:
        print(*args)


@dataclass
class SafeGemOutput(CausalLMOutputWithPast):
    """
    Output class for SafeGem with safety classification results.
    """
    loss: Optional[torch.FloatTensor] = None
    logits: Optional[torch.FloatTensor] = None
    past_key_values: Optional[List[torch.FloatTensor]] = None
    hidden_states: Optional[Tuple[torch.FloatTensor]] = None
    attentions: Optional[Tuple[torch.FloatTensor]] = None
    image_hidden_states: Optional[torch.FloatTensor] = None
    img_safety_logits: Optional[torch.FloatTensor] = None
    img_safety_probs: Optional[torch.FloatTensor] = None


class SafetyMLP(nn.Module):
    """
    Multi-layer perceptron for safety classification (Visual Guard Module).
    """

    def __init__(
        self,
        input_size: int,
        hidden_size: int,
        output_size: int,
        num_hidden_layers: int = 1
    ):
        super().__init__()

        layers = []

        # First layer
        layers.append(nn.Linear(input_size, hidden_size))
        layers.append(nn.GELU())
        layers.append(nn.Dropout(0.1))

        # Additional hidden layers
        for _ in range(num_hidden_layers - 1):
            layers.append(nn.Linear(hidden_size, hidden_size))
            layers.append(nn.GELU())
            layers.append(nn.Dropout(0.1))

        # Output layer
        layers.append(nn.Linear(hidden_size, output_size))

        self.mlp = nn.Sequential(*layers)

        # Initialize weights
        self.apply(self._init_weights)

    def _init_weights(self, module):
        if isinstance(module, nn.Linear):
            torch.nn.init.xavier_uniform_(module.weight)
            if module.bias is not None:
                torch.nn.init.constant_(module.bias, 0)

    def forward(self, x):
        return self.mlp(x)


class SafeGemForConditionalGeneration(Gemma3ForConditionalGeneration):
    """
    SafeGem model with Visual Guard Module for image safety classification.

    This model extends Gemma3ForConditionalGeneration with:
    1. Visual Guard Module (VGM) - a safety classification head
    2. Pooling-based safety feature extraction from image tokens
    3. Simultaneous text generation and safety classification

    Key design principles:
    - Minimal modification to base Gemma3 forward pass
    - Extract safety features from visual tokens using mean pooling
    - Non-invasive architecture that maintains full base model capabilities
    """

    config_class = SafeGemConfig

    def __init__(self, config: SafeGemConfig):
        super().__init__(config)

        # Add safety head (Visual Guard Module) if safety configuration is present
        num_safety_categories = getattr(config, 'num_safety_categories', None)
        if num_safety_categories and num_safety_categories > 0:
            hidden_size = config.text_config.hidden_size
            safety_head_hidden_scale = getattr(config, 'safety_head_hidden_scale', 1.0)
            safety_hidden_size = int(hidden_size * safety_head_hidden_scale)
            safety_num_hidden_layers = getattr(config, 'safety_num_hidden_layers', 1)

            rank0_print(f"🔧 [INIT] Initializing Visual Guard Module: {hidden_size} -> {safety_hidden_size} -> {num_safety_categories}")

            self.img_safety_head = SafetyMLP(
                input_size=hidden_size,
                hidden_size=safety_hidden_size,
                output_size=num_safety_categories,
                num_hidden_layers=safety_num_hidden_layers
            )
        else:
            rank0_print("🔧 [INIT] No safety configuration found, Visual Guard Module not initialized")
            self.img_safety_head = None

    def _extract_image_features_pooling(
        self,
        hidden_states: torch.Tensor,
        attention_mask: Optional[torch.Tensor] = None,
        input_ids: Optional[torch.Tensor] = None,
        image_hidden_states: Optional[torch.Tensor] = None
    ) -> Optional[torch.Tensor]:
        """
        Extract image features using pooling over visual tokens.

        Args:
            hidden_states: [batch_size, seq_len, hidden_size]
            attention_mask: [batch_size, seq_len]
            input_ids: [batch_size, seq_len]
            image_hidden_states: [batch_size, num_images, num_patches, hidden_size]

        Returns:
            image_features: [batch_size, hidden_size] or None
        """
        # First try to use image_hidden_states if available (from vision tower)
        if image_hidden_states is not None:
            # Handle different shapes of image_hidden_states
            if len(image_hidden_states.shape) == 3:
                # [batch_size, num_patches, hidden_size]
                batch_size, num_patches, hidden_size = image_hidden_states.shape
                # Mean over patches: [batch_size, hidden_size]
                pooled_features = image_hidden_states.mean(dim=1)
                return pooled_features
            elif len(image_hidden_states.shape) == 4:
                # [batch_size, num_images, num_patches, hidden_size]
                batch_size, num_images, num_patches, hidden_size = image_hidden_states.shape
                # Mean over patches: [batch_size, num_images, hidden_size]
                pooled_per_image = image_hidden_states.mean(dim=2)
                # Mean over images: [batch_size, hidden_size]
                pooled_features = pooled_per_image.mean(dim=1)
                rank0_print(f"🔧 [POOL] 4D pooled features shape: {pooled_features.shape}")
                return pooled_features
            else:
                rank0_print(f"🔧 [POOL] Unexpected image_hidden_states shape: {image_hidden_states.shape}")
                return None

        # Fallback: return None if no image_hidden_states
        if input_ids is None:
            rank0_print("🔧 [POOL] No input_ids available for image token detection")
            return None

        rank0_print("🔧 [POOL] No image_hidden_states available, cannot extract image features")
        return None

    def forward(
        self,
        input_ids: Optional[torch.LongTensor] = None,
        attention_mask: Optional[torch.Tensor] = None,
        position_ids: Optional[torch.LongTensor] = None,
        past_key_values: Optional[List[torch.FloatTensor]] = None,
        inputs_embeds: Optional[torch.FloatTensor] = None,
        labels: Optional[torch.LongTensor] = None,
        use_cache: Optional[bool] = None,
        output_attentions: Optional[bool] = None,
        output_hidden_states: Optional[bool] = None,
        pixel_values: Optional[torch.FloatTensor] = None,
        return_dict: Optional[bool] = None,
        do_safety: bool = True,  # Default to True for training, can be overridden for generation
        safety_labels: Optional[torch.LongTensor] = None,
        **kwargs
    ) -> Union[Tuple, SafeGemOutput]:
        """
        Forward pass with optional safety classification.

        Args:
            do_safety: Whether to perform safety classification (default: True)
            All other args: Same as Gemma3ForConditionalGeneration

        Returns:
            SafeGemOutput with optional safety classification results
        """

        # Force output_hidden_states if we need safety classification,
        # but only during the initial forward pass, not during generation
        if do_safety and self.img_safety_head is not None and past_key_values is None:
            output_hidden_states = True
            return_dict = True

        # Standard Gemma3 forward pass - no modifications
        outputs = super().forward(
            input_ids=input_ids,
            attention_mask=attention_mask,
            position_ids=position_ids,
            past_key_values=past_key_values,
            inputs_embeds=inputs_embeds,
            labels=labels,
            use_cache=use_cache,
            output_attentions=output_attentions,
            output_hidden_states=output_hidden_states,
            pixel_values=pixel_values,
            return_dict=True,
            **kwargs
        )

        # Fix NaN/Inf in logits if present
        if outputs.logits is not None:
            nan_count = torch.isnan(outputs.logits).sum()
            inf_count = torch.isinf(outputs.logits).sum()

            if nan_count > 0 or inf_count > 0:
                if past_key_values is None:
                    print(f"[CRITICAL] Found NaN or Inf in logits! NaN count: {nan_count}, Inf count: {inf_count}")

                replacement_values = torch.randn_like(outputs.logits) * 0.001
                outputs.logits = torch.where(
                    torch.isnan(outputs.logits) | torch.isinf(outputs.logits),
                    replacement_values,
                    outputs.logits
                )

        # Fix logits shape if needed
        if len(outputs.logits.shape) == 4 and outputs.logits.shape[1] == 1:
            outputs.logits = outputs.logits.squeeze(1)

        # Initialize safety outputs
        img_safety_logits = None
        img_safety_probs = None

        # Check if we should perform safety classification
        is_generation = past_key_values is not None
        has_images = pixel_values is not None

        should_do_safety = (
            do_safety and
            self.img_safety_head is not None and
            (outputs.hidden_states is not None or outputs.image_hidden_states is not None) and
            has_images and
            not is_generation
        )

        if should_do_safety:
            # Extract image features
            image_features = self._extract_image_features_pooling(
                hidden_states=outputs.hidden_states[-1] if outputs.hidden_states else None,
                attention_mask=attention_mask,
                input_ids=input_ids,
                image_hidden_states=outputs.image_hidden_states
            )

            if image_features is not None:
                # Run through Visual Guard Module
                img_safety_logits = self.img_safety_head(image_features)
                img_safety_probs = torch.softmax(img_safety_logits, dim=-1)
            else:
                rank0_print("🔧 [SafeGem] ❌ Image features extraction failed")

        # Return results
        if return_dict is False:
            output = (outputs.loss, outputs.logits, outputs.past_key_values,
                      outputs.hidden_states, outputs.attentions)
            if img_safety_logits is not None:
                output += (img_safety_logits, img_safety_probs)
            return output
        else:
            # During generation, return standard output
            if is_generation or past_key_values is not None:
                return outputs
            else:
                # During training/inference, return custom output with safety info
                return SafeGemOutput(
                    loss=outputs.loss,
                    logits=outputs.logits,
                    past_key_values=outputs.past_key_values,
                    hidden_states=outputs.hidden_states,
                    attentions=outputs.attentions,
                    image_hidden_states=outputs.image_hidden_states,
                    img_safety_logits=img_safety_logits,
                    img_safety_probs=img_safety_probs
                )
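A standalone sketch of the pooling-plus-VGM path implemented above, on dummy tensors sized per the 27B config (run alongside this file; it mirrors `_extract_image_features_pooling` and `SafetyMLP` but is not part of the module):

```python
import torch

from modeling_safegem import SafetyMLP  # as defined above

# Dummy vision-tower output: 2 images, 256 tokens each, hidden size 5376.
image_hidden_states = torch.randn(1, 2, 256, 5376)

# 4D case from _extract_image_features_pooling: mean over patches, then images.
pooled = image_hidden_states.mean(dim=2).mean(dim=1)  # [1, 5376]

vgm = SafetyMLP(input_size=5376, hidden_size=2688, output_size=20)
probs = torch.softmax(vgm(pooled), dim=-1)            # [1, 20]
print(probs.shape, probs.sum().item())                # torch.Size([1, 20]) ~1.0
```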
preprocessor_config.json ADDED
@@ -0,0 +1,29 @@
{
  "do_convert_rgb": null,
  "do_normalize": true,
  "do_pan_and_scan": null,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.5,
    0.5,
    0.5
  ],
  "image_processor_type": "Gemma3ImageProcessor",
  "image_seq_length": 256,
  "image_std": [
    0.5,
    0.5,
    0.5
  ],
  "pan_and_scan_max_num_crops": null,
  "pan_and_scan_min_crop_size": null,
  "pan_and_scan_min_ratio_to_activate": null,
  "processor_class": "Gemma3Processor",
  "resample": 2,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "height": 896,
    "width": 896
  }
}
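Numerically, these settings map raw pixel values into [-1, 1]: rescale by 1/255 (`rescale_factor` above), then normalize with mean = std = 0.5. A quick check:

```python
rescale_factor = 0.00392156862745098  # == 1 / 255
mean = std = 0.5

def normalize(pixel: int) -> float:
    """Apply the rescale + normalize steps from preprocessor_config.json."""
    return (pixel * rescale_factor - mean) / std

print(normalize(0), normalize(255))   # -1.0 1.0
```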
processor_config.json ADDED
@@ -0,0 +1,4 @@
{
  "image_seq_length": 256,
  "processor_class": "Gemma3Processor"
}
special_tokens_map.json ADDED
@@ -0,0 +1,33 @@
{
  "boi_token": "<start_of_image>",
  "bos_token": {
    "content": "<bos>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eoi_token": "<end_of_image>",
  "eos_token": {
    "content": "<eos>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "image_token": "<image_soft_token>",
  "pad_token": {
    "content": "<pad>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4667f2089529e8e7657cfb6d1c19910ae71ff5f28aa7ab2ff2763330affad795
size 33384568
tokenizer.model ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1299c11d7cf632ef3b4e11937501358ada021bbdf7c47638d13c0ee982f2e79c
size 4689074
tokenizer_config.json ADDED
The diff for this file is too large to render. See raw diff