spow12 committed
Commit 3799fd9 · verified · 1 Parent(s): 9d4aaf1

Add vLLM example

Files changed (1):
  1. README.md +101 -0
README.md CHANGED
@@ -304,6 +304,107 @@ Role: Popular, Shopkeeper, University Student, Waitstaff
右の人は、髪が黒くて長くて、後ろで結んでいるわ。髪には赤いリボンがついていて、髪に色を添えているわ。目は大きくて、少し緑がかった感じ。服装は青い着物を着ていて、下には黒いショーツを履いているわ。座っている姿勢が少し恥ずかしいような、でも楽しそうな雰囲気ね。
どう?説明に不足した点があったら言ってね。"""
```

## Using vLLM

Currently, the stable release of vLLM does not support the Hugging Face Pixtral model, but support is being worked on in the development version.

First, install the latest vLLM development build. See the [installation documentation](https://docs.vllm.ai/en/latest/getting_started/installation.html) for details.

```bash
pip install https://vllm-wheels.s3.us-west-2.amazonaws.com/nightly/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl
```
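
To check that a development build is what actually got installed, you can print the package version; the exact `.dev` version string will vary from nightly to nightly:

```python
import vllm

# Nightly wheels report a `.dev`-suffixed version string.
print(vllm.__version__)
```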

You can then launch an OpenAI-compatible server with the command below.

Note that you need to specify a chat template; copy it from the model processor's chat template (one way to extract it is sketched after the command).

```bash
export OMP_NUM_THREADS=8
export VLLM_ALLOW_LONG_MAX_MODEL_LEN=1

# Change --chat-template to the path of your copy of the template.
# --allowed-local-media-path can be removed if you don't plan to use local images.
CUDA_VISIBLE_DEVICES=1 vllm serve spow12/ChatWaifu_2.0_vision \
    --chat-template ./chat_templates/chatwaifu_vision.jinja \
    --dtype bfloat16 \
    --trust-remote-code \
    --api-key token_abc123 \
    --max-seq-len-to-capture 32768 \
    --max-model-len 16384 \
    --tensor-parallel-size 1 \
    --pipeline-parallel-size 1 \
    --port 5500 \
    --served-model-name chat_model \
    --limit-mm-per-prompt image=4 \
    --allowed-local-media-path ./data/
```
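
The template file itself can be produced with a sketch like the one below, assuming a recent `transformers` release in which the processor exposes its template via the `chat_template` attribute:

```python
import os
from transformers import AutoProcessor

# Load the model's processor and dump its chat template to a .jinja file.
processor = AutoProcessor.from_pretrained(
    "spow12/ChatWaifu_2.0_vision",
    trust_remote_code=True,  # mirrors the --trust-remote-code flag above
)
os.makedirs("./chat_templates", exist_ok=True)
with open("./chat_templates/chatwaifu_vision.jinja", "w") as f:
    f.write(processor.chat_template)
```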

Once the OpenAI-compatible server is up, you can query it with the OpenAI Python client:

```python
import sys
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:5500/v1",
    api_key='token_abc123',
)

def add_completion(user_message, chat_history: list):
    # Append the new user turn (a full message dict) unless one is already pending.
    if chat_history[-1]['role'] != 'user':
        chat_history.append(user_message)
    completion = client.chat.completions.create(
        model="chat_model",
        messages=chat_history,
        temperature=0.75,
        max_tokens=512,
        stop=['[/INST]', '<|im_end|>', '</s>'],
        stream=True,
        stream_options={
            "include_usage": True
        },
        extra_body={
            "min_p": 0.05,
            "repetition_penalty": 1.1,
        }
    )
    completion_str = ""
    for chunk in completion:
        try:
            content = chunk.choices[0].delta.content
            if isinstance(content, str):
                completion_str += content
                print(content, end='')  # Print without newline
                sys.stdout.flush()  # Ensure content is printed immediately
        except IndexError:
            # The final usage-only chunk has an empty choices list.
            pass
    chat_history.append({
        'role': 'assistant',
        'content': completion_str
    })
    return chat_history

# `system`, `url_natume`, and `url_mako` are defined in the earlier examples above.
history = [
    {
        'content': system,
        'role': 'system'
    },
]
user_content = {
    "role": "user",
    "content": [
        {
            'type': 'image_url',
            'image_url': {'url': url_natume}
        },
        {
            'type': 'image_url',
            'image_url': {'url': url_mako}
        },
        {"type": "text", "text": "ユーザー: この二人の外見を説明してみて。"},
    ]
}
history = add_completion(user_content, history)
```
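
Because the server was started with `--allowed-local-media-path ./data/`, images in that directory can also be referenced with `file://` URLs instead of remote links. A minimal sketch with a hypothetical `./data/example.png`, reusing `add_completion` from above:

```python
import os

# Hypothetical local file; the absolute path must resolve inside the
# directory passed to --allowed-local-media-path (./data/ above).
url_local = "file://" + os.path.abspath("./data/example.png")

local_turn = {
    "role": "user",
    "content": [
        {"type": "image_url", "image_url": {"url": url_local}},
        # Example prompt: "User: Describe this image."
        {"type": "text", "text": "ユーザー: この画像について説明して。"},
    ],
}
history = add_completion(local_turn, history)
```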

## Dataset