#### M3 Ultra 512GB RAM connected to MBP 128GB RAM using [Inferencer app v1.7.3](https://inferencer.com) with LAN distributed compute
* Expect ~13.7 tokens/s @ 1000 tokens
* Example memory usage: MBP ~20GB + Mac Studio ~430GB
* More RAM available for larger context window using this method
##### Quantized with a modified version of [MLX](https://github.com/ml-explore/mlx) 0.28
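
For reference, a stock MLX quantization pass is usually run through the `mlx_lm.convert` utility. A minimal sketch, assuming the standard `mlx-lm` CLI; the model path, bit width, and group size below are placeholders, and the author's modified MLX 0.28 build may use different flags or defaults:

```shell
# Hypothetical sketch of a standard mlx-lm quantization pass (not the
# author's modified MLX build). Paths and quantization settings are
# assumptions for illustration only.
pip install mlx-lm

# Convert a Hugging Face checkpoint to a quantized MLX model.
python -m mlx_lm.convert \
    --hf-path <original-hf-model> \
    --mlx-path ./model-4bit \
    -q --q-bits 4 --q-group-size 64
```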