huggingface/diffusers

mirror of https://github.com/huggingface/diffusers.git synced 2026-06-05 00:53:09 +08:00

YiYi Xu 10dc589a94

[modular] simplify components manager doc (#13088)

* simplify components manager doc

* Apply suggestion from @yiyixuxu

* Apply suggestion from @stevhliu

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Apply suggestion from @stevhliu

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-160-103.ec2.internal>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

2026-02-06 09:55:34 -10:00

ComponentsManager

The [ComponentsManager] is a model registry and management system for Modular Diffusers. It adds and tracks models, stores useful metadata (model size, device placement, adapters), and supports offloading.

This guide will show you how to use [ComponentsManager] to manage components and device memory.

Connect to a pipeline

Create a [ComponentsManager] and pass it to a [ModularPipeline] with either [~ModularPipeline.from_pretrained] or [~ModularPipelineBlocks.init_pipeline].

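import torch
from diffusers import ComponentsManager, ModularPipeline

# Attach the manager when loading the pipeline with from_pretrained
manager = ComponentsManager()
pipe = ModularPipeline.from_pretrained("Tongyi-MAI/Z-Image-Turbo", components_manager=manager)
# Load the components in bfloat16; they are registered in the manager
pipe.load_components(torch_dtype=torch.bfloat16)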

import torch
from diffusers import ComponentsManager, ModularPipelineBlocks

# Alternatively, attach the manager when initializing a pipeline from blocks
manager = ComponentsManager()
blocks = ModularPipelineBlocks.from_pretrained("diffusers/Florence2-image-Annotator", trust_remote_code=True)
pipe = blocks.init_pipeline(components_manager=manager)
# Load the components in bfloat16; they are registered in the manager
pipe.load_components(torch_dtype=torch.bfloat16)

Components loaded by the pipeline are automatically registered in the manager. You can inspect them right away.

Inspect components

Print the [ComponentsManager] to see all registered components, including their class, device placement, dtype, memory size, and load ID.

The output below corresponds to the from_pretrained example above.

Components:
=============================================================================================================================
Models:
-----------------------------------------------------------------------------------------------------------------------------
Name_ID                      | Class                    | Device: act(exec) | Dtype          | Size (GB) | Load ID
-----------------------------------------------------------------------------------------------------------------------------
text_encoder_140458257514752 | Qwen3Model               | cpu               | torch.bfloat16 | 7.49      | Tongyi-MAI/Z-Image-Turbo|text_encoder|null|null
vae_140458257515376          | AutoencoderKL            | cpu               | torch.bfloat16 | 0.16      | Tongyi-MAI/Z-Image-Turbo|vae|null|null
transformer_140458257515616  | ZImageTransformer2DModel | cpu               | torch.bfloat16 | 11.46     | Tongyi-MAI/Z-Image-Turbo|transformer|null|null
-----------------------------------------------------------------------------------------------------------------------------

Other Components:
-----------------------------------------------------------------------------------------------------------------------------
ID                           | Class                           | Collection
-----------------------------------------------------------------------------------------------------------------------------
scheduler_140461023555264    | FlowMatchEulerDiscreteScheduler | N/A
tokenizer_140458256346432    | Qwen2Tokenizer                  | N/A
-----------------------------------------------------------------------------------------------------------------------------

The table shows models (with device, dtype, and memory info) separately from other components like schedulers and tokenizers. If any models have LoRA adapters, IP-Adapters, or quantization applied, that information is displayed in an additional section at the bottom.
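The Load ID column packs several fields into one string separated by `|`, with `null` marking an unset field. The sketch below splits such a string so the parts can be inspected individually. The field names (`repo`, `subfolder`, `variant`, `revision`) are an assumption inferred from the output above and are not confirmed by this guide, and `parse_load_id` is a hypothetical helper, not part of diffusers.

```python
from typing import NamedTuple, Optional


class LoadID(NamedTuple):
    # Field names are assumptions inferred from the table output,
    # not an API documented in this guide.
    repo: Optional[str]
    subfolder: Optional[str]
    variant: Optional[str]
    revision: Optional[str]


def parse_load_id(load_id: str) -> LoadID:
    """Split a '|'-separated load ID, mapping the literal 'null' to None."""
    parts = [None if part == "null" else part for part in load_id.split("|")]
    if len(parts) != 4:
        raise ValueError(f"expected 4 fields, got {len(parts)}: {load_id!r}")
    return LoadID(*parts)


# Load ID copied from the text_encoder row of the table above.
parsed = parse_load_id("Tongyi-MAI/Z-Image-Turbo|text_encoder|null|null")
print(parsed.repo, parsed.subfolder, parsed.variant, parsed.revision)
```

Splitting the string this way makes it easier to check whether two registered components were loaded from the same source.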

Offloading

The [~ComponentsManager.enable_auto_cpu_offload] method is a global offloading strategy that works across all models regardless of which pipeline is using them. Once enabled, device placement is handled automatically, even when you add or remove components.

manager.enable_auto_cpu_offload(device="cuda")

All models start on the CPU. [ComponentsManager] moves each model to the target device right before it is needed, and moves other models back to the CPU when GPU memory runs low.
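To make the behavior described above concrete, here is a toy simulation in plain Python. It is a sketch under stated assumptions, not the diffusers implementation: `ToyOffloader`, the 16 GB budget, and the least-recently-used eviction rule are all hypothetical choices made for illustration, and the real manager may select models to offload differently. The sizes come from the table earlier in this guide.

```python
from collections import OrderedDict

# Sizes in GB, taken from the "Size (GB)" column of the table in this guide.
MODEL_SIZES_GB = {"text_encoder": 7.49, "vae": 0.16, "transformer": 11.46}


class ToyOffloader:
    """Hypothetical stand-in for an auto-offload policy (not the diffusers code)."""

    def __init__(self, sizes_gb, gpu_budget_gb):
        self.sizes_gb = sizes_gb
        self.gpu_budget_gb = gpu_budget_gb
        self.on_gpu = OrderedDict()  # model name -> size, least recently used first

    def use(self, name):
        """Ensure `name` is on the GPU, evicting other models if it does not fit."""
        if name in self.on_gpu:
            self.on_gpu.move_to_end(name)  # mark as most recently used
            return
        size = self.sizes_gb[name]
        if size > self.gpu_budget_gb:
            raise MemoryError(f"{name} ({size} GB) exceeds the GPU budget")
        # Assumed policy: evict least recently used models until the new one fits.
        while self.on_gpu and sum(self.on_gpu.values()) + size > self.gpu_budget_gb:
            self.on_gpu.popitem(last=False)
        self.on_gpu[name] = size


total_gb = sum(MODEL_SIZES_GB.values())
offloader = ToyOffloader(MODEL_SIZES_GB, gpu_budget_gb=16.0)  # hypothetical budget
for step in ["text_encoder", "transformer", "vae"]:
    offloader.use(step)
    print(f"after {step}: on GPU = {list(offloader.on_gpu)}")
```

The three models together need more memory than the assumed budget, so the text encoder is moved off the GPU when the transformer is needed, while the small VAE fits alongside the transformer.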

Call [~ComponentsManager.disable_auto_cpu_offload] to disable offloading.

manager.disable_auto_cpu_offload()
