---
pipeline_tag: text-to-image
inference: false
library_name: tensorrt
license: other
license_name: stabilityai-ai-community
license_link: LICENSE.md
tags:
- tensorrt
- sd3.5-medium
- text-to-image
- onnx
extra_gated_prompt: >-
  By clicking "Agree", you agree to the [License
  Agreement](https://huggingface.co/stabilityai/stable-diffusion-3.5-large/blob/main/LICENSE.md)
  and acknowledge Stability AI's [Privacy
  Policy](https://stability.ai/privacy-policy).
extra_gated_fields:
  Name: text
  Email: text
  Country: country
  Organization or Affiliation: text
  Receive email updates and promotions on Stability AI products, services, and research?:
    type: select
    options:
      - 'Yes'
      - 'No'
  What do you intend to use the model for?:
    type: select
    options:
      - Research
      - Personal use
      - Creative Professional
      - Startup
      - Enterprise
  I agree to the License Agreement and acknowledge Stability AI's Privacy Policy: checkbox
language:
- en
---
# Stable Diffusion 3.5 Medium TensorRT
## Introduction
This repository hosts the **TensorRT-optimized version** of **Stable Diffusion 3.5 Medium**, developed in collaboration between [Stability AI](https://stability.ai) and [NVIDIA](https://huggingface.co/nvidia). This implementation leverages NVIDIA's TensorRT deep learning inference library to deliver significant performance improvements while maintaining the exceptional image quality of the original model.
Stable Diffusion 3.5 Medium is a Multimodal Diffusion Transformer (MMDiT) text-to-image model that features improved performance in image quality, typography, complex prompt understanding, and resource-efficiency. The TensorRT optimization makes these capabilities accessible for production deployment and real-time applications.
## Model Details
### Model Description
This repository holds the ONNX exports of the CLIP-G, CLIP-L, T5, MMDiT, and VAE models in BF16 precision.
## Performance using TensorRT 10.13
### Timings for 30 steps at 1024x1024
| Accelerator | Precision | CLIP-G | CLIP-L | T5 | MMDiT x 30 | VAE Decoder | Total |
|-------------|-----------|------------|--------------|--------------|-----------------------|---------------------|------------------------|
| H100 | BF16 | 16.52 ms | 6.83 ms | 8.46 ms | 2358.34 ms | 72.58 ms | 2496.63 ms |
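The MMDiT denoising loop dominates end-to-end latency. A small back-of-the-envelope sketch in Python, using only the H100 numbers from the table above, shows the per-step transformer cost and its share of the summed component timings:

```python
# Component timings from the H100 BF16 row above (milliseconds).
timings = {
    "CLIP-G": 16.52,
    "CLIP-L": 6.83,
    "T5": 8.46,
    "MMDiT x 30": 2358.34,
    "VAE Decoder": 72.58,
}

# Per-step cost of the 30-step denoising loop.
mmdit_per_step_ms = timings["MMDiT x 30"] / 30

# Share of the summed component latency spent in the transformer.
component_sum_ms = sum(timings.values())
mmdit_share = timings["MMDiT x 30"] / component_sum_ms

print(f"MMDiT per step: {mmdit_per_step_ms:.2f} ms")       # ~78.6 ms
print(f"MMDiT share of component sum: {mmdit_share:.1%}")  # ~95.8%
```

Because the transformer accounts for roughly 95% of the measured component time, reducing step count (or accelerating the MMDiT engine) has far more impact on total latency than optimizing the text encoders or VAE.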
## Usage Example
1. Follow the [setup instructions](https://github.com/NVIDIA/TensorRT/blob/release/sd35/demo/Diffusion/README.md) to launch a TensorRT NGC container.
```shell
git clone https://github.com/NVIDIA/TensorRT.git
cd TensorRT
git checkout release/sd35
docker run --rm -it --gpus all -v $PWD:/workspace nvcr.io/nvidia/pytorch:25.01-py3 /bin/bash
```
2. Install libraries and requirements
```shell
cd demo/Diffusion
python3 -m pip install --upgrade pip
pip3 install -r requirements.txt
python3 -m pip install --pre --upgrade --extra-index-url https://pypi.nvidia.com tensorrt-cu12
```
3. Generate a HuggingFace user access token
To download the Stable Diffusion 3.5 Medium checkpoints, please request access on the [Stable Diffusion 3.5 Medium](https://huggingface.co/stabilityai/stable-diffusion-3.5-medium) page.
You will then need to obtain a `read` access token for the HuggingFace Hub and export it as shown below. See [instructions](https://huggingface.co/docs/hub/security-tokens).
```bash
export HF_TOKEN=<your access token>
```
4. Perform TensorRT-optimized inference:
- **Stable Diffusion 3.5 Medium in BF16 precision**
```shell
python3 demo_txt2img_sd35.py \
"a beautiful photograph of Mt. Fuji during cherry blossom" \
--version=3.5-medium \
--bf16 \
--download-onnx-models \
--denoising-steps=30 \
--guidance-scale 3.5 \
--build-static-batch \
--use-cuda-graph \
--hf-token=$HF_TOKEN
```
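The `--guidance-scale 3.5` flag sets the strength of classifier-free guidance, which combines the conditional and unconditional noise predictions at every denoising step. A minimal numeric sketch of the standard CFG formula, with toy values rather than the demo's internal code:

```python
def apply_cfg(uncond, cond, scale):
    """Classifier-free guidance: push the conditional noise prediction away
    from the unconditional one by `scale`. scale=1.0 recovers the conditional
    prediction; larger values follow the prompt more strongly."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]

# Toy noise predictions for two latent elements.
uncond = [0.10, -0.20]
cond = [0.30, 0.10]

print(apply_cfg(uncond, cond, 1.0))  # scale=1.0: matches the conditional prediction
print(apply_cfg(uncond, cond, 3.5))  # scale=3.5 amplifies: roughly [0.80, 0.85]
```

Lower scales give the sampler more freedom; higher scales adhere more tightly to the prompt at some cost to image diversity, which is why the example above uses a moderate value of 3.5.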