misc(readme): add example queries
README.md
CHANGED
@@ -23,6 +23,39 @@ which you can query using the `OpenAi` Libraries or directly through `cURL` for
| /api/v1/audio/transcriptions | Transcription endpoint to interact with the model |
| /docs | Visual documentation |

## Specifications

| spec | value | description |
@@ -33,3 +66,6 @@ which you can query using the `OpenAi` Libraries or directly through `cURL` for
| KV cache data type | `float8` (e4m3) | Key-Value cache is stored on the GPU using `float8` (`float8_e4m3`) precision to save space |
| PyTorch Compile | ✅ | Enable the use of `torch.compile` to further optimize the model's execution |
| CUDA Graphs | ✅ | Enable the use of so-called "[CUDA Graphs](https://developer.nvidia.com/blog/cuda-graphs/)" to reduce the overhead of executing GPU computations |
| /api/v1/audio/transcriptions | Transcription endpoint to interact with the model |
| /docs | Visual documentation |

## Getting started

- **Getting text output from an audio file**

```bash
curl http://localhost:8000/api/v1/audio/transcriptions \
  --request POST \
  --header 'Content-Type: multipart/form-data' \
  -F 'file=@</path/to/audio/file>' \
  -F 'response_format=text'
```

- **Getting JSON output from an audio file**

```bash
curl http://localhost:8000/api/v1/audio/transcriptions \
  --request POST \
  --header 'Content-Type: multipart/form-data' \
  -F 'file=@</path/to/audio/file>' \
  -F 'response_format=json'
```

- **Getting segmented JSON output from an audio file**

```bash
curl http://localhost:8000/api/v1/audio/transcriptions \
  --request POST \
  --header 'Content-Type: multipart/form-data' \
  -F 'file=@</path/to/audio/file>' \
  -F 'response_format=verbose_json'
```
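
The three requests above differ only in the value of `response_format`, so they can be wrapped in one small shell function. The sketch below is illustrative and not part of the repository: the `transcribe` function name and the `BASE_URL` variable are hypothetical, and the accepted formats are simply the three values used in the examples above.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not shipped with this repository.
# BASE_URL is an assumption taken from the examples above; override it as needed.
BASE_URL="${BASE_URL:-http://localhost:8000}"

# Usage: transcribe <audio-file> [text|json|verbose_json]
transcribe() {
  local audio_file="$1"
  local response_format="${2:-text}"

  # Only accept the formats shown in the examples above.
  case "$response_format" in
    text|json|verbose_json) ;;
    *)
      echo "unsupported response_format: $response_format" >&2
      return 2
      ;;
  esac

  # -F sends the fields as multipart/form-data; curl then issues a POST and
  # sets the Content-Type header (including the boundary) on its own.
  curl --silent --show-error --fail \
    "$BASE_URL/api/v1/audio/transcriptions" \
    -F "file=@${audio_file}" \
    -F "response_format=${response_format}"
}
```

Once the function is defined, `transcribe ./sample.wav verbose_json` would send the same request as the last example above, with `./sample.wav` standing in for your own audio file.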
## Specifications

| spec | value | description |

| KV cache data type | `float8` (e4m3) | Key-Value cache is stored on the GPU using `float8` (`float8_e4m3`) precision to save space |
| PyTorch Compile | ✅ | Enable the use of `torch.compile` to further optimize the model's execution |
| CUDA Graphs | ✅ | Enable the use of so-called "[CUDA Graphs](https://developer.nvidia.com/blog/cuda-graphs/)" to reduce the overhead of executing GPU computations |