Update README.md
README.md (changed)
@@ -104,7 +104,7 @@ print("generate_text:", generate_text)

### Using vLLM

-
+Install [vllm](https://github.com/vllm-project/vllm/tree/main) from the GitHub repository, using the Python-only [build](https://docs.vllm.ai/en/latest/getting_started/installation/gpu.html#set-up-using-python-only-build-without-compilation) (without compilation).

```bash
# 80G * 16 GPU
@@ -112,7 +112,7 @@ vllm serve baidu/ERNIE-4.5-300B-A47B-PT --trust-remote-code
```

```bash
-# FP8 online quantification 80G *
+# FP8 online quantization 80G * 16 GPU
vllm serve baidu/ERNIE-4.5-300B-A47B-PT --trust-remote-code --quantization fp8
```
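For context on the serving commands in this change: `vllm serve` exposes an OpenAI-compatible HTTP API, so a deployment started with either command above can be queried with a standard client. Below is a minimal sketch, assuming the server runs at the default `http://localhost:8000`, no API key is configured, and the `openai` Python package is installed; the prompt is illustrative.

```python
# Minimal sketch: send a chat request to the OpenAI-compatible endpoint
# exposed by `vllm serve`. Assumes the default address http://localhost:8000;
# "EMPTY" is a placeholder key for a server started without --api-key.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    # By default the served model name matches the path passed to `vllm serve`.
    model="baidu/ERNIE-4.5-300B-A47B-PT",
    messages=[{"role": "user", "content": "Introduce yourself in one sentence."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```

The FP8 deployment is queried the same way; `--quantization fp8` only changes how the server loads and runs the weights, not the client-facing API.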