diffusers

mirror of https://github.com/huggingface/diffusers.git synced 2026-06-05 00:53:09 +08:00

Author	SHA1	Message	Date
彼彼	0ceddf7dca	[docs] add docs for JoyAI-Image-Edit (#13726 ) add docs	2026-05-12 16:33:22 +09:00
Sayak Paul	e5cf820fc3	[docs] add magcache to caching api listing (#13714 ) add magcache to caching api listing	2026-05-12 06:45:26 +09:00
Ting-Yun Chang	4ca863323d	Add LoRA support for Cosmos Predict 2.5 and fix pipeline to match official Cosmos repo (#13664 ) * support lora for cosmos 2.5 * Fix inconsistencies with cosmos official repo in VAE encoding, text encoder attention implementation, and timestep scaling * Support f_min and f_max in linear_scheduler warmup * Add requirements and dataset preprocessing scripts to run examples * Add LoRA training scripts * Add LoRA eval scripts * add assets for blogpost * Fix(scheduler): device mismatch from upstream `b114620` - move rk and b to device before torch.stack * Always upcast to fp32 * Directly inherit from LoraBaseMixin * remove flash-attn2 * Use _keep_in_fp32_modules instead of autocast * remove the get_latent_shape_cthw method and fix style * simplify the eval script to make it more user-friendly * overwrite scheduling_unipc_multistep.py with main's version * remove network_alphas and add # Copied from * remove figures and assets * revert scheduler * revert fp32 upcast and support bs > 1 --------- Co-authored-by: Ting-Yun Chang <tingyunc@nvidia.com>	2026-05-07 16:50:13 -10:00
Sayak Paul	5bd51bd189	Update attention_backends.md to update FA3 minimum support to Ampere (#13283 ) * Update attention_backends.md * Update docs/source/en/optimization/attention_backends.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2026-05-07 19:56:46 +09:00
Gong Junmin	1a8a17b71b	Add ACE-Step pipeline for text-to-music generation (#13095 ) * Add ACE-Step pipeline for text-to-music generation Rebased on origin/main from the original pr-13095 branch (3 commits squashed). - AceStepDiTModel: Diffusion Transformer with RoPE, GQA, sliding window, AdaLN timestep conditioning, and cross-attention. - AceStepConditionEncoder: fuses text / lyric / timbre into a single cross-attention sequence. - AceStepPipeline: text2music / cover / repaint / extract / lego / complete. - Conversion script for the original checkpoint layout. - Docs + tests. * Fix ACE-Step pipeline audio quality and auto-detect turbo/base/sft variants The PR's original inference produced low-quality audio on turbo because the pipeline (a) mangled the SFT prompt format, (b) applied classifier-free guidance with the wrong unconditional embedding (empty-string encoded vs. the learned `null_condition_emb`), and (c) hardcoded turbo defaults even when loading a base/SFT checkpoint. Changes: * Converter preserves `null_condition_emb` (stored under the condition encoder) and propagates `is_turbo`/`model_version` into the transformer config so the pipeline can route per-variant defaults. * `AceStepConditionEncoder` registers `null_condition_emb` as a learned parameter matching the original module. * Pipeline auto-detects variant via `is_turbo`/`model_version` and picks defaults that match `acestep/inference.py`: * turbo: steps=8, shift=3.0, guidance_scale=1.0 (no CFG) * base/SFT: steps=27, shift=1.0, guidance_scale=7.0 * Base/SFT timestep schedule uses the linear+shift transform from `acestep/models/base/modeling_acestep_v15_base.py`; turbo still uses the hardcoded 8-step `SHIFT_TIMESTEPS` table. * CFG reuses the learned `null_condition_emb` and batches the conditional+unconditional forwards into a single transformer call. * `SFT_GEN_PROMPT` matches the newline layout in `acestep/constants.py` so the text encoder sees the same prompt distribution it was trained on. 
DiT parity vs. the original ACE-Step 1.5 turbo DiT is bit-identical (max_abs=0.0 in fp32 eager/SDPA across 4 seed/shape cases) — see scripts/dit_parity_test.py. * Add ACE-Step parity test scripts Two developer-facing parity harnesses live under scripts/: * dit_parity_test.py — loads the same converted turbo weights into the original AceStepDiTModel and the diffusers AceStepDiTModel, drives identical (hidden_states, timestep, timestep_r, encoder_hidden_states, context_latents) inputs, and asserts max-abs-diff ≤ 1e-5 in fp32 eager/SDPA. Currently passes bit-identical (max_abs=0) across four shape/seed cases including batched + odd-length paths. * audio_parity_jieyue.py — full end-to-end audio parity. Given the same JSON example, runs both the original ACE-Step 1.5 pipeline and the diffusers AceStepPipeline at matched seed/precision (bf16 + FA2 by default) and saves side-by-side .wav files for listening verification. Supports text2music / cover / repaint × turbo / base / sft via a --matrix mode that writes 18 wavs named {variant}_{task}_{official,diffusers}.wav. * Route SFT parity to acestep-v15-sft checkpoint On jieyue the release tree has a dedicated SFT checkpoint at checkpoints/acestep-v15-sft with its own modeling_acestep_v15_base.py shipped under acestep/models/sft/. Point the SFT row of the parity matrix at that checkpoint / module so we're testing the actual SFT weights, not the plain base ones. * audio_parity_jieyue: fix doubled 'acestep-' in cache path; --converted-root flag Previously the converted-pipeline cache dir was `/tmp/acestep-<variant>-diffusers` but <variant> already starts with "acestep-", giving `/tmp/acestep-acestep-v15-turbo-diffusers`. Drop the prefix. On jieyue the overlay rootfs (including /tmp) only has a few GB free; a full turbo conversion needs ~5 GB per variant. Add --converted-root (env ACESTEP_CONVERTED_ROOT) so the cache can live on vepfs. 
* audio_parity_jieyue: two-phase matrix bootstraps cover/repaint from text2music The ACE-Step release bundle on jieyue doesn't ship sample .wav/.mp3 files, so matrix mode had no default --src-audio and would skip cover/repaint entirely. Run text2music first for every variant, then reuse the TURBO official text2music output as the shared source for the cover/repaint rows. Users can still override with --src-audio. * audio_parity_jieyue: seed the diffusers generator on the pipeline device The ORIGINAL ACE-Step pipeline seeds on the execution device (`torch.Generator(device=device).manual_seed(seed)`), i.e. the CUDA RNG stream when running on GPU. Previously the parity harness seeded the diffusers side with a CPU generator, so even though the seed integer matched, the two sides drew different noise from the outset and the final outputs were essentially uncorrelated. Use the execution-device generator on both sides for a fair comparison. * Fix ACE-Step pipeline: switch to APG guidance + peak normalization Two issues found after the first jieyue audio parity run: 1. The original base/SFT pipeline uses APG (Adaptive Projected Guidance, acestep/models/common/apg_guidance.py) with a stateful momentum buffer and norm/projection steps — NOT vanilla CFG. Using vanilla CFG produced uncorrelated outputs vs. the reference (pearson ~0.0 on 20 s samples); this PR ports `_apg_forward` + `_APGMomentumBuffer` and plugs them into the denoising loop when `guidance_scale > 1`. Momentum is instantiated once per pipeline call (persists across denoising steps) to match the reference semantics. 2. The post-VAE "anti-clipping normalization" in this pipeline was `audio /= std * 5` with a `std<1 -> std=1` guard. The original post-processing in acestep/core/generation/handler/generate_music_decode.py is simple peak normalization: `if audio.abs().max() > 1: audio /= peak`. 
The std-based proxy both (a) let clips with peak < 1 leak through unchanged (over-quiet) and (b) failed to bring clipping peaks to exactly 1 in a bunch of base/SFT cases (observed max=1.000, std=0.200 repeatedly in the first parity run). Switch to peak normalization on both sides. Tested via scripts/audio_parity_jieyue.py on A800; re-run pending to confirm the base/SFT correlation improvements. * Fix ACE-Step chunk mask values to match the original pipeline The DiT receives `context_latents = concat(src_latents, chunk_mask)` on the channel dim, and was trained with chunk_mask values drawn from the three sentinels documented in acestep/inference.py: 2.0 -> model-decided (default for text2music / cover / full-generation) 1.0 -> keep this latent frame from src_latents (repaint preserved region) 0.0 -> explicitly repaint this frame (only inside the repaint window) Previously _build_chunk_mask returned all-1.0 for text2music (and cover / lego), and an inverted 0/1 mask for repaint (1 inside the window, 0 outside). Either case puts context_latents out of distribution. Switch text2music / cover to the 2.0 sentinel and flip the repaint mask so it's 1.0 outside / 0.0 inside. Update the repaint src_latents zero-out to multiply by the new mask (was `1 - chunk_mask`) so the zero region still lines up with the repaint window. * Add direct invoker for ACE-Step generate_music (ground truth) Our earlier audio_parity_jieyue.py reconstructs the original pipeline by calling AceStepConditionGenerationModel.generate_audio() directly, which silently skips a lot of the real handler plumbing (conditioning masks, silence-latent tiling, cover/repaint pre-processing, etc.). That made the 'official' wavs we saved sound wrong — flat, drone-like, not music. This new script calls acestep.inference.generate_music end-to-end through the real AceStepHandler, with LM + CoT explicitly disabled so we still have a deterministic comparison. 
Use it to generate the ground-truth 'official' wav for a given JSON example, then separately run the diffusers pipeline with the same inputs and diff the two. * run_official_generate_music: call initialize_service to bind a DiT variant AceStepHandler() is a shell — you have to call handler.initialize_service( project_root=..., config_path=..., device=..., use_flash_attention=..., ...) before generate_music will work. Mirror what cli.py does at the equivalent spot (around cli.py:1400). * Fix silence-reference for ACE-Step timbre encoder The root cause for the flat / drone-like outputs I was seeing (including in my 'official' reconstruction): when no reference_audio is provided the pipeline was feeding literal zeros to the timbre encoder. The real handler feeds a slice of the learned `silence_latent` tensor. The handler also transposes silence_latent on load (see acestep/core/generation/handler/init_service_loader.py:214: self.silence_latent = torch.load(...).transpose(1, 2) ) converting [1, 64, 15000] -> [1, 15000, 64] so that `silence_latent[:, :750, :]` yields the expected [1, 750, 64] shape. Changes: * Converter: load silence_latent.pt, transpose to [1, T, C], bake into the condition_encoder safetensors under key `silence_latent`. (Also keeps the raw .pt file at the pipeline root for debugging.) * AceStepConditionEncoder: register `silence_latent` as a persistent buffer so from_pretrained loads it alongside the trained weights. * Pipeline: when reference_audio is None, slice `condition_encoder.silence_latent[:, :timbre_fix_frame, :]` and broadcast across the batch instead of zeros. Emits a loud warning (and falls back to zeros) if the buffer is all-zero — that means the checkpoint was produced by an older converter and should be rebuilt. * audio_parity_jieyue.py: the reference path now matches the handler's silence-latent slicing. 
Without this fix, every variant/task combo produced drone-like audio even when my numeric DiT-forward parity claimed they were identical. * Fix three more ACE-Step pipeline bugs I found by dumping real inputs Instrumented the live generate_audio call in the real ACE-Step handler and observed the exact tensors it sees — my diffusers pipeline was wrong in three independent ways: 1. src_latents for text2music should be silence_latent tiled to latent_length, NOT zeros. The handler fills no-target cases from silence_latent_tiled (observed std=0.96). Zeros are OOD for the DiT context_latents concat and produce drone-like outputs. 2. chunk_mask values cap at 1.0 (not 2.0). The handler starts with a bool tensor (True inside the generate span, False outside); the chunk_mask_modes=auto -> 2.0 override does NOT take effect because the underlying tensor is bool, so setting entry = 2.0 casts to True. After the later .to(dtype) float cast, the DiT sees 1.0/0.0 — exactly what I observed in the captured tensor (unique values = [True]). 3. Default shift is 1.0 for ALL variants, including turbo. I was defaulting turbo to shift=3.0 which picks a different SHIFT_TIMESTEPS table (the 8-step schedule is keyed by shift, not variant). Also: * Added _silence_latent_tiled() helper that slices / tiles the learned silence_latent (now loaded as a buffer on the condition encoder) to the requested latent length. * Repaint path now substitutes silence_latent (not raw zeros) inside the repaint window — matches conditioning_masks.py. * audio_parity_jieyue.py mirrors the same src/chunk/shift choices on its 'original' leg for apples-to-apples parity once the buggy reconstruction is removed from the picture. * Add peak+loudness post-normalization to AceStepPipeline The real pipeline normalizes audio in two stages (see acestep/audio_utils.py:72 normalize_audio + generate_music_decode.py): 1. if peak > 1: audio /= peak (anti-clip) 2. 
audio *= target_amp / peak (target_amp = 10 ** (-1/20) ~ 0.891) Step 2 is loudness normalization to -1 dBFS. Without it diffusers outputs had peak=1.0 vs the real 0.891: same music content (pearson was ~0.86 already), just 1.12x louder. Add step 2 after the existing anti-clip step. * Match acestep/inference.py inference_steps=8 for ALL variants GenerationParams.inference_steps default is 8 for turbo AND base/SFT. I had base/SFT defaulting to 27 here, so every base/SFT parity run was comparing a 27-step diffusers trajectory against an 8-step real trajectory. Different number of denoising steps means different audio even at fixed seed. This likely explains the lower base/SFT correlation in my earlier jieyue runs (turbo was 0.86, base/SFT were 0.32-0.34). Aligning step counts should bring base/SFT closer to turbo parity. * Address PR #13095 review: rename classes + reuse diffusers primitives Response to dg845's PR comments batch 1+2. DiT parity harness still bit-identical (max_abs=0 on fp32 / SDPA across 4 shape cases). Transformer file: * Rename AceStepDiTModel -> AceStepTransformer1DModel (alias kept). * Rename AceStepDiTLayer -> AceStepTransformerBlock (alias kept). * Inherit AttentionMixin + CacheMixin on the DiT model. * Swap in diffusers.models.normalization.RMSNorm for the hand-rolled AceStepRMSNorm (weight-key-compatible). * Swap the hand-rolled rotary embedding + apply_rotary for diffusers' get_1d_rotary_pos_embed + apply_rotary_emb (use_real_unbind_dim=-2 to match the cat-half convention ACE-Step inherits from Qwen3). * Use get_timestep_embedding with flip_sin_to_cos=True, which keeps the (cos, sin) ordering of the original sinusoidal. State-dict-compatible. * Drop max_position_embeddings arg from DiT config (RoPE computes freqs per call based on seq_len); converter drops it. * Gradient-checkpoint call now takes just the layer module (matches the Flux2 idiom). 
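The two-stage post-normalization described in this commit (anti-clip first, then scaling the peak to a -1 dBFS target) can be sketched roughly as follows. This is a minimal illustration on plain Python lists, not the diffusers or ACE-Step code: `normalize_audio` and `TARGET_AMP` are hypothetical names, and the sketch assumes stage 2 is applied to every non-silent clip.

```python
# Hypothetical helper for illustration; not the diffusers or ACE-Step implementation.
TARGET_AMP = 10 ** (-1 / 20)  # -1 dBFS expressed as a linear amplitude, about 0.891


def normalize_audio(samples, target_amp=TARGET_AMP):
    """Return a normalized copy of a list of float samples."""
    peak = max((abs(s) for s in samples), default=0.0)
    # Stage 1 (anti-clip): rescale only when the signal exceeds full scale.
    if peak > 1.0:
        samples = [s / peak for s in samples]
        peak = 1.0
    # Silent input has no peak to scale, so return it unchanged.
    if peak == 0.0:
        return list(samples)
    # Stage 2 (loudness): scale so the peak lands on target_amp.
    gain = target_amp / peak
    return [s * gain for s in samples]
```

With this shape, a clip whose peak is 2.0 and a clip whose peak is 0.2 both end up with a peak near 0.891, which is the behaviour the commit message attributes to the reference pipeline.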
Pipeline modeling file (pipelines/ace_step/modeling_ace_step.py): * Moved _pack_sequences + AceStepEncoderLayer here — they aren't used by the DiT, so they shouldn't live in the transformer file. * AceStepLyricEncoder + AceStepTimbreEncoder set _supports_gradient_checkpointing = True and wrap encoder-layer calls through the checkpointing func when enabled. * Use diffusers RMSNorm + the RoPE helper from the transformer file (shared single implementation). Converter (scripts/convert_ace_step_to_diffusers.py): * model_index.json now carries AceStepTransformer1DModel. * Drop max_position_embeddings / use_sliding_window from the emitted configs. No numerical regressions: scripts/dit_parity_test.py PASSES with max_abs=0.0 on fp32/SDPA across short, long, batched, and padding-path shape variants. * Address PR #13095 review: pipeline polish + converter HF-hub support Response to dg845 review comments on the pipeline side. DiT parity still bit-identical (max_abs=0 across 4 shape cases). Pipeline (pipelines/ace_step/pipeline_ace_step.py): * Add `sample_rate` + `latents_per_second` properties sourced from the VAE config so the pipeline no longer hardcodes 48000 / 25 / 1920. Propagates through prepare_latents, chunk_mask window math, and the audio-duration round-trip. * Add `do_classifier_free_guidance` property (matches LTX2 et al.). * Add `check_inputs(...)` called from `__call__` before allocating noise. Validates prompt type, lyrics type, task_type, step count, guidance scale, shift, cfg interval bounds and repaint window ordering. * Add `callback_on_step_end` + `callback_on_step_end_tensor_inputs` — the modern callback form. The legacy `callback` / `callback_steps` pair is kept for back-compat. Setting `pipe._interrupt = True` inside the callback stops the loop early. * Expose `encode_audio(audio)` as a public helper that wraps the tiled VAE encode + (B, T, D) transpose the pipeline performs internally. 
Converter (scripts/convert_ace_step_to_diffusers.py): * Accept a Hugging Face Hub repo id for `--checkpoint_dir`; resolves it via `huggingface_hub.snapshot_download` when the argument isn't a local path. Exports: * Register `AceStepTransformer1DModel` in the top-level __init__, models/__init__, models/transformers/__init__, and dummy_pt_objects so `from diffusers import AceStepTransformer1DModel` works and the pipeline loader resolves the new class name from model_index.json. Deferred for a follow-up (commented inline in the PR): full `Attention + AttnProcessor + dispatch_attention_fn` refactor and `FlowMatchEulerDiscreteScheduler` migration — both would benefit from a dedicated parity re-run and review. * Fix stale ACE-Step 1.0-era docs / class names in the 1.5 integration Docs and docstrings still carried a mix of 1.0 paper title, non-existent `ACE-Step/ACE-Step-v1-5-turbo` hub id, `shift=3.0` turbo default, and the old `AceStepDiTModel` class name. Cleaned up to match the actual 1.5 release: * pipelines/ace_step.md: correct citation title ("ACE-Step 1.5: Pushing the Boundaries of Open-Source Music Generation"), correct repo (`ace-step/ACE-Step-1.5`), new variants table with real HF ids (`Ace-Step1.5` / `acestep-v15-base` / `acestep-v15-sft`) and their per-variant step/CFG defaults, drop the wrong `shift=3.0` tip. * models/ace_step_transformer.md: page renamed to `AceStepTransformer1DModel` with a short 1.5-specific description; `AceStepDiTModel` noted as a backwards-compat alias. * pipeline_ace_step.py: import, docstring, `Args`, and `__init__` annotation reference `AceStepTransformer1DModel`; example model id now `ACE-Step/Ace-Step1.5`; `_variant_defaults` docstring and the `__call__` variant-fallback comment no longer claim `shift=3.0` / `27 steps` — real defaults are 8 steps / shift=1.0 across all variants, guidance=1.0 (turbo) vs 7.0 (base+sft). 
* Address PR #13095 review: VAE tiling on AutoencoderOobleck + Timesteps class Two more deferred review threads from dg845 addressed: * Move tiled encode/decode onto AutoencoderOobleck (https://github.com/huggingface/diffusers/pull/13095#discussion_r2785513647). AutoencoderOobleck now carries `use_tiling` + `tile_sample_min_length` / `tile_sample_overlap` / `tile_latent_min_length` / `tile_latent_overlap` attributes and private `_tiled_encode` / `_tiled_decode` methods; the existing `encode` / `_decode` dispatch to them when tiling is enabled and the input exceeds the threshold. `AutoencoderMixin.enable_tiling()` is already inherited. AceStepPipeline's private `_tiled_encode` / `_tiled_decode` and the `use_tiled_decode` `__call__` arg are gone; `__init__` now calls `self.vae.enable_tiling()` so the long-audio memory behaviour is preserved by default. Users can opt out with `pipe.vae.disable_tiling()`. Note: the VAE-side tiling concatenates encoder features (h) and samples the posterior once, instead of the old per-tile `.sample()` calls. This is the standard diffusers pattern; numerically differs only in the structure of the noise across tile boundaries. * Use the Timesteps nn.Module for the sinusoid (https://github.com/huggingface/diffusers/pull/13095#discussion_r2785420234). `AceStepTimestepEmbedding` wraps `Timesteps(in_channels, flip_sin_to_cos= True, downscale_freq_shift=0)` instead of calling `get_timestep_embedding` directly — reviewer asked for the Module form. * Address PR #13095 review: refactor AceStepAttention to Attention + AttnProcessor Splits the monolithic AceStepAttention into the diffusers standard Attention + AttnProcessor layout: - AceStepAttention (torch.nn.Module, AttentionModuleMixin) holds the to_q/to_k/to_v/to_out projections and norm_q/norm_k RMSNorms. - AceStepAttnProcessor2_0 runs the attention dispatch through dispatch_attention_fn so users can pick flash / sage / native backends via model.set_attention_backend(...) 
or the attention_backend context manager. GQA (Q has 16 heads / K,V have 8) is preserved by passing enable_gqa=True to dispatch_attention_fn instead of repeat_interleave; fusion is disabled (_supports_qkv_fusion = False) because Q and K,V have different output sizes. The converter is updated to rename the six attention sub-keys (q_proj -> to_q, k_proj -> to_k, v_proj -> to_v, o_proj -> to_out.0, q_norm -> norm_q, k_norm -> norm_k) on both the DiT decoder path and the condition encoder path, since AceStepLyricEncoder / AceStepTimbreEncoder share the same AceStepAttention class. Addresses review comments r2785433213 and r2785450463. * Address PR #13095 review: migrate to FlowMatchEulerDiscreteScheduler Replace the hand-rolled flow-matching Euler loop with `FlowMatchEulerDiscreteScheduler`. ACE-Step still computes its own shifted / turbo sigma schedule via `_get_timestep_schedule`, but now passes it to `scheduler.set_timesteps(sigmas=...)` and delegates the ODE step to `scheduler.step()`. The scheduler is configured with `num_train_timesteps=1` and `shift=1.0` so `scheduler.timesteps` stays in `[0, 1]` (the convention the DiT was trained on) and the scheduler doesn't re-shift already-shifted sigmas. The scheduler's appended terminal `sigma=0` reproduces the old loop's final-step "project to x0" case exactly: `prev = x + (0 - t_curr) * v`. Parity on jieyue (seed=42, bf16 + flash-attn, turbo text2music, 8 steps): waveform Pearson = 0.999999 spectral Pearson = 1.000000 max \|diff\| = 2.5e-3 (fp32 step-math vs previous bf16 step-math) fp32 Euler-loop A/B against the hand-rolled path: max \|diff\| = 3.6e-7. 
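The scheduler migration above comes down to an explicit Euler update over a sigma schedule that ends in an appended `sigma=0`. The sketch below illustrates that update rule in plain Python. It is not `FlowMatchEulerDiscreteScheduler` itself: `shifted_sigmas` and `euler_sample` are hypothetical names, and the shift transform `shift * s / (1 + (shift - 1) * s)` is assumed to be the one the commit refers to.

```python
def shifted_sigmas(num_steps, shift=1.0):
    """Sigmas running from 1 toward 0, followed by an appended terminal 0."""
    # Equivalent to linspace(1, 0, num_steps + 1)[:-1].
    base = [1.0 - i / num_steps for i in range(num_steps)]
    # Assumed shift transform; shift=1.0 leaves the schedule unchanged.
    sigmas = [shift * s / (1.0 + (shift - 1.0) * s) for s in base]
    return sigmas + [0.0]


def euler_sample(x, velocity_fn, sigmas):
    """Explicit Euler integration: x <- x + (sigma_next - sigma) * v."""
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        v = velocity_fn(x, sigma)
        x = [xi + (sigma_next - sigma) * vi for xi, vi in zip(x, v)]
    return x
```

On the last step `sigma_next` is 0, so the update reduces to `x + (0 - sigma) * v`, matching the "project to x0" case quoted in the commit message.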
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Address PR #13095 review: move DiT tests + drop stale test kwargs - Move the DiT transformer tests out of the pipeline test file into a new tests/models/transformers/test_models_transformer_ace_step.py that follows the standard BaseModelTesterConfig + ModelTesterMixin scaffold (matches test_models_transformer_longcat_audio_dit.py). - Drop `max_position_embeddings` from the remaining AceStepDiTModel and AceStepConditionEncoder test fixtures — neither constructor accepts that argument anymore. - Drop `use_sliding_window` from the same fixtures — also no longer a constructor argument (the actual `sliding_window` int kwarg is kept). - Wire `FlowMatchEulerDiscreteScheduler(num_train_timesteps=1, shift=1.0)` into `get_dummy_components()` now that the pipeline requires it. Resolves https://github.com/huggingface/diffusers/pull/13095#discussion_r3115653554, r3115664850, r3115673059, r3115676580, r3115680700. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Address PR #13095 review from dg845 (2026-04-23) Fixes 5 review threads + style: 1. Converter now builds `AceStepPipeline` in memory and calls `save_pretrained`. Previously the hand-written `model_index.json` was missing the `scheduler` entry — fresh converter output couldn't be loaded by `AceStepPipeline.from_pretrained` (r3127767785). This also makes the converter robust to future `__init__` signature changes. 2. `latent_length` uses `math.ceil(...)` instead of `int(...)` so non-integer products (e.g. `latents_per_second=2.0, audio_duration=0.4 → 0.8`) round up to `1` instead of truncating to `0` and crashing shape checks (r3127790939). 3. Add `_callback_tensor_inputs = ["latents"]` on `AceStepPipeline` so the standard diffusers callback tests pick up the right tensor (r3127795954). 4. `AceStepConditionEncoder.silence_latent` no longer hard-codes the channel dim to 64. 
The placeholder buffer now uses the `timbre_hidden_dim` constructor argument, so smaller test configs with `timbre_hidden_dim != 64` load without shape errors (r3127812932). 5. Revert `self.vae.enable_tiling()` from `AceStepPipeline.__init__`. Users can call `pipe.vae.enable_tiling()` themselves for long-form generation; that matches the opt-in convention used by the rest of diffusers (r3127777296). 6. `ruff check --fix` + `ruff format` over all ACE-Step sources (the style fix dg845 asked for via `@bot /style`). Also: converter now accepts sharded `model.safetensors.index.json` layouts alongside the single-file `model.safetensors`, so the 5B XL turbo variant converts without a pre-processing step. Parity on jieyue (seed=42, bf16 + flash-attn, turbo text2music 160s, fresh converter output loaded via `from_pretrained`): waveform Pearson = 0.999954 spectral Pearson = 0.999977 max \|a-b\| bf16 = 4.3e-02 (dominated by the VAE tiling default flip) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Address PR #13095 review from yiyixuxu (2026-04-23) Code-level (22 threads): 1. Delete 3 dev/parity scripts (`scripts/audio_parity_jieyue.py`, `scripts/dit_parity_test.py`, `scripts/run_official_generate_music.py`) that shouldn't have been committed. 2. Rename `AutoencoderOobleck._encode_one` → `_encode` to match the convention used by other diffusers VAEs. 3. Delete the hard-coded `SHIFT_TIMESTEPS` / `VALID_SHIFTS` table in `pipeline_ace_step.py`: the per-shift turbo schedules are recovered exactly by `linspace(1, 0, N+1)[:-1]` plus the flow-match shift formula that the non-turbo branch already uses, so a single code path covers both. 4. Drop the backwards-compat `AceStepDiTModel` / `AceStepDiTLayer` aliases and every reference (top-level `__init__`, `models/__init__`, `transformers/__init__`, dummy objects, tests, docs toctree, model card). `AceStepTransformer1DModel` is the only exported name now. 5. 
Remove the unused `attention_mask` / `encoder_attention_mask` args from `AceStepTransformer1DModel.forward`; the model rebuilds its masks from the sequence shape and never consumed them. 6. In the DiT forward and both encoders, pass `None` instead of an all-zero `full_attn_mask` / `encoder_4d_mask` to non-sliding attention layers — SDPA dispatches to a faster kernel when the mask is None. 7. Inline the shared `_run_encoder_layers` helper directly into `AceStepLyricEncoder.forward` / `AceStepTimbreEncoder.forward` so layer calls are visible at the forward boundary (diffusers style). 8. Move `is_turbo` / `sample_rate` / `latents_per_second` from `@property`s that re-read module configs each call to cached attributes populated in `__init__` (Flux2-style), with a default-ACE-Step fallback when `self.vae` is offloaded. Drop the now-unused `SAMPLE_RATE = 48000` module-level constant and the three property definitions. 9. Warn + coerce `guidance_scale` to 1.0 on turbo (guidance-distilled) checkpoints, following `pipeline_flux2_klein`. Prevents over-guided audio when users forward their base/sft CFG settings to a turbo pipe. 10. Remove the `logger.warning(...)` paths that triggered on `silence_latent` missing/zero — those only fired for author-side unconverted checkpoints and tests; end users always load converted weights where the buffer is baked in. 11. Drop the redundant `with torch.no_grad():` wrappers inside `encode_prompt` — the pipeline's `__call__` runs under `torch.no_grad` already. 12. Strip "reviewer comment on PR #13095" attribution comments from three docstrings (here and everywhere). Parity on jieyue (seed=42, bf16 + flash-attn, XL turbo 160s text2music): waveform Pearson = 0.9747 spectral Pearson = 0.9895 The shift comes from full-attention layers switching `attn_mask=0_tensor` → `attn_mask=None`, which dispatches to a different SDPA kernel on bf16. 
The two outputs are algebraically equivalent for fp32 eager; on bf16+FA the delta is dominated by kernel-level ULPs, well within the sampler-noise band (ear-check on the 160s example confirms no audible regression). Still open — AudioTokenizer/Detokenizer (deferred) + APG guider follow-up (dims differ from `diffusers.guiders.adaptive_projected_guidance`, not a drop-in; worth a separate PR). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Address ACE-Step audio token and APG review * Fix ACE-Step docs CI * Address ACE-Step pipeline cleanup review * Fix ACE-Step flash attention sliding windows * Add ACE-Step callback properties * Address ACE-Step final review comments --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: YiYi Xu <yixu310@gmail.com> Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>	2026-04-30 18:30:44 -10:00
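The `latent_length` rounding fix mentioned in the ACE-Step commit above (`math.ceil(...)` instead of `int(...)`) is easy to check in isolation. The function below is a hypothetical stand-in for the pipeline's internal computation and only illustrates the rounding behaviour.

```python
import math


def latent_length(audio_duration, latents_per_second):
    """Number of latent frames needed to cover audio_duration seconds."""
    # Rounding up keeps very short clips from collapsing to zero frames.
    return math.ceil(audio_duration * latents_per_second)
```

For the example from the commit message, `latent_length(0.4, 2.0)` gives 1, whereas `int(0.4 * 2.0)` truncates to 0.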
Álvaro Somoza	303c1d8b04	[Ernie-Image] Add lora support (#13575 ) add lora support Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>	2026-04-30 22:02:09 +05:30
Sayak Paul	a5bc04696b	NucleusMoE docs (#13661 ) up	2026-04-30 15:17:04 +05:30
songh11	50cb2db4ad	feat: support ring attention with arbitrary KV sequence lengths (#13545 ) * feat: support ring attention with arbitrary KV sequence lengths * fix: align ring_anything with ulysses_anything (size gather + unshard) * docs: document ring_anything mode * fix: merge hook branches, add ring_anything comment + guard * docs: address ring_anything review comments * docs: update ring_anything guidance * docs: refine ring_anything guidance per review * fix: address ring_anything style check --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2026-04-30 14:12:28 +05:30
Alexander Ivanov	2173c554ea	[docs] fix typo in AutoencoderOobleck docs (#13642 ) (#13645 )	2026-04-29 09:51:15 -07:00
Sayak Paul	656da843e8	[docs] add a mention of torchao and other backends in speed memory docs. (#13499 ) add a mention of torchao and other backends in speed memory docs.	2026-04-18 09:22:22 +05:30
Lancer	947bc23ba4	[chore] Add diffusers-format example to LongCatAudioDiTPipeline (#13483 ) * [chore] Add diffusers-format example and seed parameter to LongCatAudioDiTPipeline Signed-off-by: Lancer <maruixiang6688@gmail.com> * Apply style fixes * Apply suggestions from code review Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * upd Signed-off-by: Lancer <maruixiang6688@gmail.com> * Apply style fixes --------- Signed-off-by: Lancer <maruixiang6688@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>	2026-04-15 21:52:15 -07:00
Lancer	c41a3c3ed8	[Feat] Adds LongCat-AudioDiT pipeline (#13390 ) * Add LongCat-AudioDiT pipeline Signed-off-by: Lancer <maruixiang6688@gmail.com> * upd Signed-off-by: Lancer <maruixiang6688@gmail.com> * upd * Apply suggestions from code review Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * upd Signed-off-by: Lancer <maruixiang6688@gmail.com> * upd Signed-off-by: Lancer <maruixiang6688@gmail.com> * upd Signed-off-by: Lancer <maruixiang6688@gmail.com> * upd Signed-off-by: Lancer <maruixiang6688@gmail.com> * Apply style fixes * upd Signed-off-by: Lancer <maruixiang6688@gmail.com> * upd Signed-off-by: Lancer <maruixiang6688@gmail.com> * Apply style fixes * upd Signed-off-by: Lancer <maruixiang6688@gmail.com> * Apply style fixes * upd Signed-off-by: Lancer <maruixiang6688@gmail.com> --------- Signed-off-by: Lancer <maruixiang6688@gmail.com> Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2026-04-15 00:47:38 -07:00
HsiaWinter	dc8d903217	Add ernie image (#13432 ) * Add ERNIE-Image * Update doc * Update doc * Change from Custom-Attention to Diffusers Style Attention * Change from Custom-Attention to Diffusers Style Attention * Make compatible with SGLang * Optimize the loading and offload strategy of the PE module * Update doc files and config-related content * Fix issues raised in official feedback * Optimize code per official suggestions * Update code * update * update * Apply style fixes * update * update * Apply style fixes --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2026-04-10 17:06:31 -10:00
Xyc2016	251676dfda	Fix grammar in LoRA documentation (#13423 ) Fix grammar in LoRA documentation (LoRA's → LoRAs, trigger it → trigger them)	2026-04-10 09:18:30 -07:00
Samuel Meddin	3e53a383e1	Fix typos and grammar errors in documentation (#13391 ) - Fix 'allows to generate' -> 'allows you to generate' in controlling_generation.md - Fix 'it's refiner' -> 'its refiner' (possessive) in sdxl.md - Fix 'it's state' -> 'its state' (possessive) in reusing_seeds.md - Fix missing word 'you'll a function' -> 'you'll create a function' in sdxl.md	2026-04-02 13:42:32 -07:00
YiYi Xu	cf6af6b4f8	[docs] add auto docstring and parameter templates documentation for m… (#13382 ) * [docs] add auto docstring and parameter templates documentation for modular diffusers Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Update docs/source/en/modular_diffusers/auto_docstring.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/modular_diffusers/auto_docstring.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/modular_diffusers/auto_docstring.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/modular_diffusers/auto_docstring.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/modular_diffusers/auto_docstring.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/modular_diffusers/auto_docstring.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/modular_diffusers/auto_docstring.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/modular_diffusers/auto_docstring.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/_toctree.yml Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * up --------- Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-161-123.ec2.internal> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2026-04-02 10:34:45 -10:00
Steven Liu	e365d749a1	[docs] deprecate pipelines (#13157 ) * deprecate * fix * fix * fix * fix * remove deprecated .md files * update links * fix	2026-04-01 10:16:23 -07:00
Pranav Thombre	7e463ea4cc	[docs] Add NeMo Automodel training guide (#13306 ) * [docs] Add NeMo Automodel training guide Signed-off-by: Pranav Prashant Thombre <pthombre@nvidia.com> * Update docs/source/en/training/nemo_automodel.md Co-authored-by: Steven Liu
<59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/training/nemo_automodel.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * adding contacts into the readme * Apply suggestion from @stevhliu Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestion from @stevhliu Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestion from @stevhliu Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestion from @stevhliu Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestion from @stevhliu Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestion from @stevhliu Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestion from @stevhliu Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestion from @stevhliu Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestion from @stevhliu Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestion from @stevhliu Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestion from @stevhliu Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Address CR comments Signed-off-by: Pranav Prashant Thombre <pthombre@nvidia.com> * Update docs/source/en/training/nemo_automodel.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/training/nemo_automodel.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Signed-off-by: Pranav Prashant Thombre <pthombre@nvidia.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: linnan wang <wangnan318@gmail.com>	2026-03-30 10:21:58 -07:00
Howard Zhang	1fe2125802	remove str option for quantization config in torchao (#13291 ) * remove str option for quantization config in torchao * Apply style fixes * minor fixes * Added AOBaseConfig docs to torchao.md * minor fixes for removing str option torchao * minor change to add back int and uint check * minor fixes * minor fixes to tests * Update
tests/quantization/torchao/test_torchao.py Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Update docs/source/en/quantization/torchao.md Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Update tests/quantization/torchao/test_torchao.py Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * version=2 update to test_torchao.py --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2026-03-27 08:52:37 +05:30
dg845	7298f5be93	Update LTX-2 Docs to Cover LTX-2.3 Models (#13337 ) * Update LTX-2 docs to cover multimodal guidance and prompt enhancement * Apply suggestions from code review Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply reviewer feedback --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2026-03-26 17:51:29 -07:00
Steven Liu	cbf4d9a3c3	[docs] kernels (#13139 ) * kernels * feedback	2026-03-25 09:31:54 -07:00
Kashif Rasul	762ae059fa	[LLADA2] documentation fixes (#13333 ) documentation fixes	2026-03-25 17:49:31 +05:30
Kashif Rasul	5d207e756e	[Discrete Diffusion] Add LLaDA2 pipeline (#13226 ) * feat: add LLaDA2 and BlockRefinement pipelines for discrete text diffusion Add support for LLaDA2/LLaDA2.1 discrete diffusion text generation: - BlockRefinementPipeline: block-wise iterative refinement with confidence-based token commitment, supporting editing threshold for LLaDA2.1 models - LLaDA2Pipeline: convenience wrapper with LLaDA2-specific defaults - DiscreteDiffusionPipelineMixin: shared SAR sampling utilities (top-k, top-p, temperature) and prompt/prefix helpers - compute_confidence_aware_loss: CAP-style training loss - Examples: sampling scripts for LLaDA2 and block refinement, training scripts with Qwen causal LM - Docs and tests included * feat: add BlockRefinementScheduler for commit-by-confidence scheduling Extract the confidence-based token commit logic from BlockRefinementPipeline into a dedicated BlockRefinementScheduler, following diffusers conventions. The scheduler owns: - Transfer schedule computation (get_num_transfer_tokens) - Timestep management (set_timesteps) - Step logic: confidence-based mask-filling and optional token editing The pipeline now delegates scheduling to self.scheduler.step() and accepts a scheduler parameter in __init__. * test: add unit tests for BlockRefinementScheduler 12 tests covering set_timesteps, get_num_transfer_tokens, step logic (confidence-based commits, threshold behavior, editing, prompt masking, batched inputs, tuple output). 
* docs: add toctree entries and standalone scheduler doc page - Add BlockRefinement and LLaDA2 to docs sidebar navigation - Add BlockRefinementScheduler to schedulers sidebar navigation - Move scheduler autodoc to its own page under api/schedulers/ * feat: add --revision flag and fix dtype deprecation in sample_llada2.py - Add --revision argument for loading model revisions from the Hub - Replace deprecated torch_dtype with dtype for transformers 5.x compat * fix: use 1/0 attention mask instead of 0/-inf for LLaDA2 compat LLaDA2 models expect a boolean-style (1/0) attention mask, not an additive (0/-inf) mask. The model internally converts to additive, so passing 0/-inf caused double-masking and gibberish output. * refactor: consolidate training scripts into single train_block_refinement.py - Remove toy train_block_refinement_cap.py (self-contained demo with tiny model) - Rename train_block_refinement_qwen_cap.py to train_block_refinement.py (already works with any causal LM via AutoModelForCausalLM) - Fix torch_dtype deprecation and update README with correct script names * fix formatting * docs: improve LLaDA2 and BlockRefinement documentation - Add usage examples with real model IDs and working code - Add recommended parameters table for LLaDA2.1 quality/speed modes - Note that editing is LLaDA2.1-only (not for LLaDA2.0 models) - Remove misleading config defaults section from BlockRefinement docs * feat: set LLaDA2Pipeline defaults to recommended model parameters - threshold: 0.95 -> 0.7 (quality mode) - max_post_steps: 0 -> 16 (recommended for LLaDA2.1, harmless for 2.0) - eos_early_stop: False -> True (stop at EOS token) block_length=32, steps=32, temperature=0.0 were already correct. editing_threshold remains None (users enable for LLaDA2.1 models). * feat: default editing_threshold=0.5 for LLaDA2.1 quality mode LLaDA2.1 is the current generation. Users with LLaDA2.0 models can disable editing by passing editing_threshold=None. 
* fix: align sampling utilities with official LLaDA2 implementation - top_p filtering: add shift-right to preserve at least one token above threshold (matches official code line 1210) - temperature ordering: apply scaling before top-k/top-p filtering so filtering operates on scaled logits (matches official code lines 1232-1235) - greedy branch: return argmax directly when temperature=0 without filtering (matches official code lines 1226-1230) * refactor: remove duplicate prompt encoding, reuse mixin's _prepare_input_ids LLaDA2Pipeline._prepare_prompt_ids was a near-copy of DiscreteDiffusionPipelineMixin._prepare_input_ids. Remove the duplicate and call the mixin method directly. Also simplify _extract_input_ids since we always pass return_dict=True. * formatting * fix: replace deprecated torch_dtype with dtype in examples and docstrings - Update EXAMPLE_DOC_STRING to use dtype= and LLaDA2.1-mini model ID - Fix sample_block_refinement.py to use dtype= * remove BlockRefinementPipeline * cleanup * fix readme * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update 
src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * removed DiscreteDiffusionPipelineMixin * add support for 2d masks for flash attn * Update src/diffusers/training_utils.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/training_utils.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * fix issues from review * added tests * formatting * add check_eos_finished to scheduler * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/schedulers/scheduling_block_refinement.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update 
src/diffusers/schedulers/scheduling_block_refinement.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * fix renaming issues and types * remove duplicate check * Update docs/source/en/api/pipelines/llada2.md Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> --------- Co-authored-by: YiYi Xu <yixu310@gmail.com> Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>	2026-03-25 16:17:50 +05:30
ddavidchick	ef309a1bb0	Add KVAE 1.0 (#13033 ) * add kvae2d * add kvae3d video * add docs for kvae2d and kvae3d video * style fixes * fix kvae3d docs * fix normalization * fix kvae video for code style * fix kvae video * kvae minor fixes * add gradient ckpting for kvaes * get rid of inplace ops kvae video * add tests for KVAEs * kvae2d normalization style change * kvaes fix style * update dummy_pt_objects test for kvaes --------- Co-authored-by: YiYi Xu <yixu310@gmail.com>	2026-03-23 12:56:49 -10:00
Sayak Paul	0b35834351	[core] fa4 support. (#13280 ) * start fa4 support. * up * specify minimum version	2026-03-20 17:28:09 +05:30
YiYi Xu	a13e5cf9fc	[agents]support skills (#13269 ) * support skills * update * Apply suggestions from code review Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * update baSeed on new best practice * Update .ai/skills/parity-testing/pitfalls.md Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * update --------- Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-161-123.ec2.internal> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-160-103.ec2.internal> Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>	2026-03-19 18:07:41 -10:00
Steven Liu	ed31974c3e	[docs] updates (#13248 ) * fixes * few more links * update zh * fix	2026-03-16 13:24:57 -07:00
YiYi Xu	e5aa719241	Add AGENTS.md (#13259 ) * add a draft * add * up * Apply suggestions from code review Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2026-03-14 08:35:12 -10:00
Sayak Paul	764f7ede33	[core] Flux2 klein kv followups (#13264 ) * implement Flux2Transformer2DModelOutput. * add output class to docs. * add Flux2KleinKV to docs. * add pipeline tests for klein kv.	2026-03-13 10:05:11 +05:30
Miguel Martin	0a2c26d0a4	Update Documentation for NVIDIA Cosmos (#13251 ) * fix docs * update main example	2026-03-11 09:14:56 -07:00
Shenghai Yuan	bd7a7a0b95	Optimize Helios docs (#13222 ) optimize helios docs	2026-03-08 13:54:16 -10:00
Ando	8ec0a5ccad	feat: implement rae autoencoder. (#13046 ) * feat: implement three RAE encoders(dinov2, siglip2, mae) * feat: finish first version of autoencoder_rae * fix formatting * make fix-copies * initial doc * fix latent_mean / latent_var init types to accept config-friendly inputs * use mean and std convention * cleanup * add rae to diffusers script * use imports * use attention * remove unneeded class * example traiing script * input 
and ground truth sizes have to be the same * fix argument * move loss to training script * cleanup * simplify mixins * fix training script * fix entrypoint for instantiating the AutoencoderRAE * added encoder_image_size config * undo last change * fixes from pretrained weights * cleanups * address reviews * fix train script to use pretrained * fix conversion script review * latebt normalization buffers are now always registered with no-op defaults * Update examples/research_projects/autoencoder_rae/README.md Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Update src/diffusers/models/autoencoders/autoencoder_rae.py Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * use image url * Encoder is frozen * fix slow test * remove config * use ModelTesterMixin and AutoencoderTesterMixin * make quality * strip final layernorm when converting * _strip_final_layernorm_affine for training script * fix test * add dispatch forward and update conversion script * update training script * error out as soon as possible and add comments * Update src/diffusers/models/autoencoders/autoencoder_rae.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * use buffer * inline * Update src/diffusers/models/autoencoders/autoencoder_rae.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * remove optional * _noising takes a generator * Update src/diffusers/models/autoencoders/autoencoder_rae.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * fix api * rename * remove unittest * use randn_tensor * fix device map on multigpu * check if the key is missing in the original state dict and only then add to the allow_missing set * remove initialize_weights --------- Co-authored-by: wangyuqi <wangyuqi@MBP-FJDQNJTWYN-0208.local> Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>	2026-03-05 20:17:14 +05:30
Shenghai Yuan	ae5881ba77	Fix Helios paper link in documentation (#13213 ) * Fix Helios paper link in documentation Updated the link to the Helios paper for accuracy. * Fix reference link in HeliosTransformer3DModel documentation Updated the reference link for the Helios Transformer model paper. * Update Helios research paper link in documentation * Update Helios research paper link in documentation	2026-03-05 18:58:13 +05:30
dg845	ab6040ab2d	Add LTX2 Condition Pipeline (#13058 ) * LTX2 condition pipeline initial commit * Fix pipeline import error * Implement LTX-2-style general image conditioning * Blend denoising output and clean latents in sample space instead of velocity space * make style and make quality * make fix-copies * Rename LTX2VideoCondition image to frames * Update LTX2ConditionPipeline example * Remove support for image and video in __call__ * Put 
latent_idx_from_index logic inline * Improve comment on using the conditioning mask in denoising loop * Apply suggestions from code review Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com> * make fix-copies * Migrate to Python 3.9+ style type annotations without explicit typing imports * Forward kwargs from preprocess/postprocess_video to preprocess/postprocess resp. * Center crop LTX-2 conditions following original code * Duplicate video and audio position ids if using CFG * make style and make quality * Remove unused index_type arg to preprocess_conditions * Add # Copied from for _normalize_latents * Fix _normalize_latents # Copied from statement * Add LTX-2 condition pipeline docs * Remove TODOs * Support only unpacked latents (5D for video, 4D for audio) * Remove # Copied from for prepare_audio_latents --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>	2026-03-05 00:42:55 -08:00
dg845	33f785b444	Add Helios-14B Video Generation Pipelines (#13208 ) * [1/N] add helios * fix test * make fix-copies * change script path * fix cus script * update docs * fix documented check * update links for docs and examples * change default config * small refactor * add test * Update src/diffusers/models/transformers/transformer_helios.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * remove register_buffer for _scale_cache * fix non-cuda devices error * remove "handle the case when timestep is 2D" * refactor HeliosMultiTermMemoryPatch and process_input_hidden_states * Update src/diffusers/pipelines/helios/pipeline_helios.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/models/transformers/transformer_helios.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/pipelines/helios/pipeline_helios.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * fix calculate_shift * Update src/diffusers/pipelines/helios/pipeline_helios.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * rewritten `einops` in pure `torch` * fix: pass patch_size to apply_schedule_shift instead of hardcoding * remove the logics of 'vae_decode_type' * move some validation into check_inputs() * rename helios scheduler & merge all into one step() * add some details to doc * move dmd step() logics from pipeline to scheduler * change to Python 3.9+ style type * fix NoneType error * refactor DMD scheduler's set_timestep * change rope related vars name * fix stage2 sample * fix dmd sample * Update src/diffusers/models/transformers/transformer_helios.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/models/transformers/transformer_helios.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * remove redundant & refactor norm_out * Update src/diffusers/pipelines/helios/pipeline_helios.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * change "is_keep_x0" to "keep_first_frame" * use a 
more intuitive name * refactor dynamic_time_shifting * remove use_dynamic_shifting args * remove usage of UniPCMultistepScheduler * separate stage2 sample to HeliosPyramidPipeline * Update src/diffusers/models/transformers/transformer_helios.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/models/transformers/transformer_helios.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/models/transformers/transformer_helios.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/models/transformers/transformer_helios.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * fix transformer * use a more intuitive name * update example script * fix requirements * remove redudant attention mask * fix * optimize pipelines * make style . * update TYPE_CHECKING * change to use torch.split Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * derive memory patch sizes from patch_size multiples * remove some hardcoding * move some checks into check_inputs * refactor sample_block_noise * optimize encoding chunks logits for v2v * use num_history_latent_frames = sum(history_sizes) * Update src/diffusers/pipelines/helios/pipeline_helios.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * remove redudant optimized_scale * Update src/diffusers/pipelines/helios/pipeline_helios_pyramid.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * use more descriptive name * optimize history_latents * remove not used "num_inference_steps" * removed redudant "pyramid_num_stages" * add "is_cfg_zero_star" and "is_distilled" to HeliosPyramidPipeline * remove redudant * change example scripts name * change example scripts name * correct docs * update example * update docs * Update tests/models/transformers/test_models_transformer_helios.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update tests/models/transformers/test_models_transformer_helios.py Co-authored-by: dg845 
<58458699+dg845@users.noreply.github.com> * separate HeliosDMDScheduler * fix numerical stability issue: * Update src/diffusers/schedulers/scheduling_helios_dmd.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/schedulers/scheduling_helios_dmd.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/schedulers/scheduling_helios_dmd.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/schedulers/scheduling_helios_dmd.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/schedulers/scheduling_helios_dmd.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * remove redudant * small refactor * remove use_interpolate_prompt logits * simplified model test * fallbackt to BaseModelTesterConfig * remove _maybe_expand_t2v_lora_for_i2v * fix HeliosLoraLoaderMixin * update docs * use randn_tensor for test * fix doc typo * optimize code * mark torch.compile xfail * change paper name * Make get_dummy_inputs deterministic using self.generator * Set less strict threshold for test_save_load_float16 test for Helios pipeline * make style and make quality * Preparation for merging * add torch.Generator * Fix HeliosPipelineOutput doc path * Fix Helios related (optimize docs & remove redudant) (#13210) * fix docs * remove redudant * remove redudant * fix group offload * Removed fixes for group offload --------- Co-authored-by: yuanshenghai <yuanshenghai@bytedance.com> Co-authored-by: Shenghai Yuan <140951558+SHYuanBest@users.noreply.github.com> Co-authored-by: YiYi Xu <yixu310@gmail.com> Co-authored-by: SHYuanBest <shyuan-cs@hotmail.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2026-03-04 21:31:43 +05:30
Sayak Paul	4a2833c1c2	[Modular] implement requirements validation for custom blocks (#12196 ) * feat: implement requirements validation for custom blocks. * up * unify. * up * add tests * Apply suggestions from code review Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> * reviewer feedback. * [docs] validation for custom blocks (#13156) validation * move to tmp_path fixture. * propagate to conditional and loopsequential blocks. * up * remove collected tests --------- Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2026-03-04 12:19:08 +05:30
Dhruv Nair	3fd14f1acf	[AutoModel] Allow registering `auto_map` to model config (#13186 ) * update * update	2026-03-02 22:13:25 +05:30
YiYi Xu	680076fcc0	[Modular] update the auto pipeline blocks doc (#13148 ) * update * Apply suggestion from @yiyixuxu * Update docs/source/en/modular_diffusers/auto_pipeline_blocks.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/modular_diffusers/auto_pipeline_blocks.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/modular_diffusers/auto_pipeline_blocks.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/modular_diffusers/auto_pipeline_blocks.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * add to api --------- Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-160-103.ec2.internal> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-161-123.ec2.internal>	2026-02-27 10:50:35 -10:00
Miguel Martin	212db7b999	Cosmos Transfer2.5 Auto-Regressive Inference Pipeline (#13114 ) * AR * address comments * address comments 2	2026-02-25 14:42:29 -10:00
Sayak Paul	aac94befce	[docs] Fix torchrun command argument order in docs (#13181 ) Fix torchrun command argument order in docs	2026-02-24 08:31:39 -08:00
Steven Liu	6875490c3b	[docs] add docs for qwenimagelayered (#13158 ) * add example * feedback	2026-02-18 11:02:15 -08:00
Dhruv Nair	59e7a46928	[Pipelines] Remove k-diffusion (#13152 ) * remove k-diffusion * fix copies	2026-02-16 13:54:24 +05:30
YiYi Xu	c919ec0611	[Modular] add explicit workflow support (#13028 ) * up * up up * update outputs * style * add modular_auto_docstring! 
* more auto docstring * style * up up up * more more * up * address feedbacks * add TODO in the description for empty docstring * refactor based on dhruv's feedback: remove the class method * add template method * up * up up up * apply auto docstring * make style * rmove space in make docstring * Apply suggestions from code review * revert change in z * fix * Apply style fixes * include auto-docstring check in the modular ci. (#13004) * initial support: workflow * up up * treeat loop sequential pipeline blocks as leaf * update qwen image docstring note * add workflow support for sdxl * add a test suit * add test for qwen-image * refactor flux a bit, seperate modular_blocks into modular_blocks_flux and modular_blocks_flux_kontext + support workflow * refactor flux2: seperate blocks for klein_base + workflow * qwen: remove import support for stuff other than the default blocks * add workflow support for wan * sdxl: remove some imports: * refactor z * update flux2 auto core denoise * add workflow test for z and flux2 * Apply suggestions from code review * Apply suggestions from code review * add test for flux * add workflow test for flux * add test for flux-klein * sdxl: modular_blocks.py -> modular_blocks_stable_diffusion_xl.py * style * up * add auto docstring * workflow_names -> available_workflows * fix workflow test for klein base * Apply suggestions from code review Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> * fix workflow tests * qwen: edit -> image_conditioned to be consistent with flux kontext/2 such * remove Optional * update type hints * update guider update_components * fix more * update docstring auto again --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-160-103.ec2.internal> Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-161-123.ec2.internal> Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>	
2026-02-14 16:18:48 -10:00
YiYi Xu	3c7506b294	[Modular] update doc for `ModularPipeline` (#13100 ) * update create pipeline section * update more * update more * more * add a section on running pipeline moduarly * refactor update_components, remove support for spec * style * bullet points * update the pipeline block * small fix in state doc * update sequential doc * fix link * small update on quikstart * add a note on how to run pipeline without the componen4ts manager * Apply suggestions from code review Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * remove the supported models mention * update more * up * revert type hint changes --------- Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-160-103.ec2.internal> Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-161-123.ec2.internal> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2026-02-14 11:43:28 -10:00
Dhruv Nair	bedc67c75f	[Docs] Add guide for AutoModel with custom code (#13099 ) update	2026-02-10 12:19:44 +05:30
dg845	baaa8d040b	LTX 2 Improve `encode_video` by Accepting More Input Types (#13057 ) * Support different pipeline outputs for LTX 2 encode_video * Update examples to use improved encode_video function * Fix comment * Address review comments * make style and make quality * Have non-iterator video inputs respect video_chunks_number * make style and make quality * Add warning when encode_video receives a non-denormalized np.ndarray * make style 
and make quality --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2026-02-08 19:40:34 -08:00
YiYi Xu	10dc589a94	[modular]simplify components manager doc (#13088 ) * simplify components manager doc * Apply suggestion from @yiyixuxu * Apply suggestion from @stevhliu Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestion from @stevhliu Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: 
yiyi@huggingface.co <yiyi@ip-26-0-160-103.ec2.internal> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2026-02-06 09:55:34 -10:00
CalamitousFelicitousness	99e2cfff27	Feature/zimage inpaint pipeline (#13006 ) * Add ZImageInpaintPipeline Updated the pipeline structure to include ZImageInpaintPipeline alongside ZImagePipeline and ZImageImg2ImgPipeline. Implemented the ZImageInpaintPipeline class for inpainting tasks, including necessary methods for encoding prompts, preparing masked latents, and denoising. Enhanced the auto_pipeline to map the new ZImageInpaintPipeline for inpainting generation tasks. Added unit tests for ZImageInpaintPipeline to ensure functionality and performance. Updated dummy objects to include ZImageInpaintPipeline for testing purposes. * Add documentation and improve test stability for ZImageInpaintPipeline - Add torch.empty fix for x_pad_token and cap_pad_token in test - Add # Copied from annotations for encode_prompt methods - Add documentation with usage example and autodoc directive * Address PR review feedback for ZImageInpaintPipeline Add batch size validation and callback handling fixes per review, using diffusers conventions rather than suggested code verbatim. * Update src/diffusers/pipelines/z_image/pipeline_z_image_inpaint.py Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com> * Update src/diffusers/pipelines/z_image/pipeline_z_image_inpaint.py Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com> * Add input validation and fix XLA support for ZImageInpaintPipeline - Add missing is_torch_xla_available import for TPU support - Add xm.mark_step() in denoising loop for proper XLA execution - Add check_inputs() method for comprehensive input validation - Call check_inputs() at the start of __call__ Addresses PR review feedback from @asomoza. * Cleanup --------- Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>	2026-02-05 11:48:25 -03:00
Sayak Paul	90818e82b3	[docs] Fix syntax error in quantization configuration (#13076 ) Fix syntax error in quantization configuration	2026-02-04 08:31:03 -08:00
Alan Ponnachan	430c557b6a	Add support for Magcache (#12744 ) * add magcache * formatting * add magcache support with calibration mode * add imports * improvements * Apply style fixes * fix kandinsky errors * add tests and documentation * Apply style fixes * improvements * Apply style fixes * make fix-copies. 
* minor fixes --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2026-02-04 13:45:12 +05:30


1269 Commits