mirror of
https://github.com/huggingface/diffusers.git
synced 2026-06-05 00:53:09 +08:00
07f80875830748bd31fe577fc99615deb9471ba0
1269 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
0ceddf7dca |
[docs] add docs for JoyAI-Image-Edit (#13726)
add docs |
||
|
|
e5cf820fc3 |
[docs] add magcache to caching api listing (#13714)
add magcache to caching api listing |
||
|
|
4ca863323d |
Add LoRA support for Cosmos Predict 2.5 and fix pipeline to match official Cosmos repo (#13664)
* support lora for cosmos 2.5
* Fix inconsistencies with cosmos official repo in VAE encoding, text encoder attention implementation, and timestep scaling
* Support f_min and f_max in linear_scheduler warmup
* Add requirements and dataset preprocessing scripts to run examples
* Add LoRA training scripts
* Add LoRA eval scripts
* add assets for blogpost
* Fix(scheduler): device mismatch from upstream
|
||
|
|
5bd51bd189 |
Update attention_backends.md to update FA3 minimum support to Ampere (#13283)
* Update attention_backends.md * Update docs/source/en/optimization/attention_backends.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> |
||
|
|
1a8a17b71b |
Add ACE-Step pipeline for text-to-music generation (#13095)
* Add ACE-Step pipeline for text-to-music generation
Rebased on origin/main from the original pr-13095 branch (3 commits squashed).
- AceStepDiTModel: Diffusion Transformer with RoPE, GQA, sliding window,
AdaLN timestep conditioning, and cross-attention.
- AceStepConditionEncoder: fuses text / lyric / timbre into a single
cross-attention sequence.
- AceStepPipeline: text2music / cover / repaint / extract / lego / complete.
- Conversion script for the original checkpoint layout.
- Docs + tests.
* Fix ACE-Step pipeline audio quality and auto-detect turbo/base/sft variants
The PR's original inference produced low-quality audio on turbo because the
pipeline (a) mangled the SFT prompt format, (b) applied classifier-free guidance
with the wrong unconditional embedding (empty-string encoded vs. the learned
`null_condition_emb`), and (c) hardcoded turbo defaults even when loading a
base/SFT checkpoint.
Changes:
* Converter preserves `null_condition_emb` (stored under the condition encoder)
and propagates `is_turbo`/`model_version` into the transformer config so the
pipeline can route per-variant defaults.
* `AceStepConditionEncoder` registers `null_condition_emb` as a learned
parameter matching the original module.
* Pipeline auto-detects variant via `is_turbo`/`model_version` and picks
defaults that match `acestep/inference.py`:
* turbo: steps=8, shift=3.0, guidance_scale=1.0 (no CFG)
* base/SFT: steps=27, shift=1.0, guidance_scale=7.0
* Base/SFT timestep schedule uses the linear+shift transform from
`acestep/models/base/modeling_acestep_v15_base.py`; turbo still uses the
hardcoded 8-step `SHIFT_TIMESTEPS` table.
* CFG reuses the learned `null_condition_emb` and batches the
conditional+unconditional forwards into a single transformer call.
* `SFT_GEN_PROMPT` matches the newline layout in `acestep/constants.py` so the
text encoder sees the same prompt distribution it was trained on.
DiT parity vs. the original ACE-Step 1.5 turbo DiT is bit-identical
(max_abs=0.0 in fp32 eager/SDPA across 4 seed/shape cases) — see
scripts/dit_parity_test.py.
* Add ACE-Step parity test scripts
Two developer-facing parity harnesses live under scripts/:
* dit_parity_test.py — loads the same converted turbo weights into the
original AceStepDiTModel and the diffusers AceStepDiTModel, drives
identical (hidden_states, timestep, timestep_r, encoder_hidden_states,
context_latents) inputs, and asserts max-abs-diff ≤ 1e-5 in fp32
eager/SDPA. Currently passes bit-identical (max_abs=0) across four
shape/seed cases including batched + odd-length paths.
* audio_parity_jieyue.py — full end-to-end audio parity. Given the same
JSON example, runs both the original ACE-Step 1.5 pipeline and the
diffusers AceStepPipeline at matched seed/precision (bf16 + FA2 by
default) and saves side-by-side .wav files for listening verification.
Supports text2music / cover / repaint × turbo / base / sft via a
--matrix mode that writes 18 wavs named
{variant}_{task}_{official,diffusers}.wav.
* Route SFT parity to acestep-v15-sft checkpoint
On jieyue the release tree has a dedicated SFT checkpoint at
checkpoints/acestep-v15-sft with its own modeling_acestep_v15_base.py
shipped under acestep/models/sft/. Point the SFT row of the parity matrix
at that checkpoint / module so we're testing the actual SFT weights, not
the plain base ones.
* audio_parity_jieyue: fix doubled 'acestep-' in cache path; --converted-root flag
Previously the converted-pipeline cache dir was
`/tmp/acestep-<variant>-diffusers` but <variant> already starts with
"acestep-", giving `/tmp/acestep-acestep-v15-turbo-diffusers`. Drop the
prefix.
On jieyue the overlay rootfs (including /tmp) only has a few GB free; a
full turbo conversion needs ~5 GB per variant. Add --converted-root (env
ACESTEP_CONVERTED_ROOT) so the cache can live on vepfs.
* audio_parity_jieyue: two-phase matrix bootstraps cover/repaint from text2music
The ACE-Step release bundle on jieyue doesn't ship sample .wav/.mp3
files, so matrix mode had no default --src-audio and would skip
cover/repaint entirely. Run text2music first for every variant, then
reuse the TURBO official text2music output as the shared source for the
cover/repaint rows. Users can still override with --src-audio.
* audio_parity_jieyue: seed the diffusers generator on the pipeline device
The ORIGINAL ACE-Step pipeline seeds on the execution device
(`torch.Generator(device=device).manual_seed(seed)`), i.e. the CUDA RNG
stream when running on GPU. Previously the parity harness seeded the
diffusers side with a CPU generator, so even though the seed integer
matched, the two sides drew different noise from the outset and the
final outputs were essentially uncorrelated. Use the execution-device
generator on both sides for a fair comparison.
* Fix ACE-Step pipeline: switch to APG guidance + peak normalization
Two issues found after the first jieyue audio parity run:
1. The original base/SFT pipeline uses APG (Adaptive Projected Guidance,
acestep/models/common/apg_guidance.py) with a stateful momentum
buffer and norm/projection steps — NOT vanilla CFG. Using vanilla CFG
produced uncorrelated outputs vs. the reference (pearson ~0.0 on
20 s samples); this PR ports `_apg_forward` + `_APGMomentumBuffer`
and plugs them into the denoising loop when `guidance_scale > 1`.
Momentum is instantiated once per pipeline call (persists across
denoising steps) to match the reference semantics.
2. The post-VAE "anti-clipping normalization" in this pipeline was
`audio /= std * 5` with a `std<1 -> std=1` guard. The original
post-processing in
acestep/core/generation/handler/generate_music_decode.py is simple
peak normalization: `if audio.abs().max() > 1: audio /= peak`. The
std-based proxy both (a) let clips with peak < 1 leak through
unchanged (over-quiet) and (b) failed to bring clipping peaks to
exactly 1 in a bunch of base/SFT cases (observed max=1.000, std=0.200
repeatedly in the first parity run). Switch to peak normalization on
both sides.
Tested via scripts/audio_parity_jieyue.py on A800; re-run pending to
confirm the base/SFT correlation improvements.
* Fix ACE-Step chunk mask values to match the original pipeline
The DiT receives `context_latents = concat(src_latents, chunk_mask)` on the
channel dim, and was trained with chunk_mask values drawn from the three
sentinels documented in acestep/inference.py:
2.0 -> model-decided (default for text2music / cover / full-generation)
1.0 -> keep this latent frame from src_latents (repaint preserved region)
0.0 -> explicitly repaint this frame (only inside the repaint window)
Previously _build_chunk_mask returned all-1.0 for text2music (and cover /
lego), and an inverted 0/1 mask for repaint (1 inside the window, 0 outside).
Either case puts context_latents out of distribution. Switch text2music /
cover to the 2.0 sentinel and flip the repaint mask so it's 1.0 outside /
0.0 inside. Update the repaint src_latents zero-out to multiply by the new
mask (was `1 - chunk_mask`) so the zero region still lines up with the
repaint window.
* Add direct invoker for ACE-Step generate_music (ground truth)
Our earlier audio_parity_jieyue.py reconstructs the original pipeline by
calling AceStepConditionGenerationModel.generate_audio() directly, which
silently skips a lot of the real handler plumbing (conditioning masks,
silence-latent tiling, cover/repaint pre-processing, etc.). That made the
'official' wavs we saved sound wrong — flat, drone-like, not music.
This new script calls acestep.inference.generate_music end-to-end through
the real AceStepHandler, with LM + CoT explicitly disabled so we still have
a deterministic comparison. Use it to generate the ground-truth 'official'
wav for a given JSON example, then separately run the diffusers pipeline
with the same inputs and diff the two.
* run_official_generate_music: call initialize_service to bind a DiT variant
AceStepHandler() is a shell — you have to call handler.initialize_service(
project_root=..., config_path=..., device=..., use_flash_attention=..., ...)
before generate_music will work. Mirror what cli.py does at the equivalent
spot (around cli.py:1400).
* Fix silence-reference for ACE-Step timbre encoder
The root cause for the flat / drone-like outputs I was seeing (including
in my 'official' reconstruction): when no reference_audio is provided the
pipeline was feeding literal zeros to the timbre encoder. The real
handler feeds a slice of the learned `silence_latent` tensor.
The handler also transposes silence_latent on load (see
acestep/core/generation/handler/init_service_loader.py:214:
self.silence_latent = torch.load(...).transpose(1, 2)
) converting [1, 64, 15000] -> [1, 15000, 64] so that
`silence_latent[:, :750, :]` yields the expected [1, 750, 64] shape.
Changes:
* Converter: load silence_latent.pt, transpose to [1, T, C], bake into
the condition_encoder safetensors under key `silence_latent`.
(Also keeps the raw .pt file at the pipeline root for debugging.)
* AceStepConditionEncoder: register `silence_latent` as a persistent
buffer so from_pretrained loads it alongside the trained weights.
* Pipeline: when reference_audio is None, slice
`condition_encoder.silence_latent[:, :timbre_fix_frame, :]` and
broadcast across the batch instead of zeros. Emits a loud warning
(and falls back to zeros) if the buffer is all-zero — that means the
checkpoint was produced by an older converter and should be rebuilt.
* audio_parity_jieyue.py: the reference path now matches the handler's
silence-latent slicing.
Without this fix, every variant/task combo produced drone-like audio
even when my numeric DiT-forward parity claimed they were identical.
* Fix three more ACE-Step pipeline bugs I found by dumping real inputs
Instrumented the live generate_audio call in the real ACE-Step handler and
observed the exact tensors it sees — my diffusers pipeline was wrong in
three independent ways:
1. src_latents for text2music should be silence_latent tiled to
latent_length, NOT zeros. The handler fills no-target cases from
silence_latent_tiled (observed std=0.96). Zeros are OOD for the DiT
context_latents concat and produce drone-like outputs.
2. chunk_mask values cap at 1.0 (not 2.0). The handler starts with a
bool tensor (True inside the generate span, False outside); the
chunk_mask_modes=auto -> 2.0 override does NOT take effect because
the underlying tensor is bool, so setting entry = 2.0 casts to True.
After the later .to(dtype) float cast, the DiT sees 1.0/0.0 — exactly
what I observed in the captured tensor (unique values = [True]).
3. Default shift is 1.0 for ALL variants, including turbo. I was
defaulting turbo to shift=3.0 which picks a different SHIFT_TIMESTEPS
table (the 8-step schedule is keyed by shift, not variant).
Also:
* Added _silence_latent_tiled() helper that slices / tiles the learned
silence_latent (now loaded as a buffer on the condition encoder) to
the requested latent length.
* Repaint path now substitutes silence_latent (not raw zeros) inside
the repaint window — matches conditioning_masks.py.
* audio_parity_jieyue.py mirrors the same src/chunk/shift choices on
its 'original' leg for apples-to-apples parity once the buggy
reconstruction is removed from the picture.
* Add peak+loudness post-normalization to AceStepPipeline
The real pipeline normalizes audio in two stages (see
acestep/audio_utils.py:72 normalize_audio + generate_music_decode.py):
1. if peak > 1: audio /= peak (anti-clip)
2. audio *= target_amp / peak (target_amp = 10 ** (-1/20) ~ 0.891)
Step 2 is loudness normalization to -1 dBFS. Without it diffusers outputs
had peak=1.0 vs the real 0.891 — same music content (pearson was ~0.86
already), just 1.12x louder. Add step 2 after the existing anti-clip step.
* Match acestep/inference.py inference_steps=8 for ALL variants
GenerationParams.inference_steps default is 8 — turbo AND base/SFT. I had
base/SFT defaulting to 27 here, so every base/SFT parity run was comparing
a 27-step diffusers trajectory against an 8-step real trajectory. Different
number of denoising steps means different audio even at fixed seed.
This likely explains the lower base/SFT correlation in my earlier jieyue
runs (turbo was 0.86, base/SFT were 0.32-0.34). Aligning step counts
should bring base/SFT closer to turbo parity.
* Address PR #13095 review: rename classes + reuse diffusers primitives
Response to dg845's PR comments batch 1+2. DiT parity harness still bit-identical
(max_abs=0 on fp32 / SDPA across 4 shape cases).
Transformer file:
* Rename AceStepDiTModel -> AceStepTransformer1DModel (alias kept).
* Rename AceStepDiTLayer -> AceStepTransformerBlock (alias kept).
* Inherit AttentionMixin + CacheMixin on the DiT model.
* Swap in diffusers.models.normalization.RMSNorm for the hand-rolled
AceStepRMSNorm (weight-key-compatible).
* Swap the hand-rolled rotary embedding + apply_rotary for diffusers'
get_1d_rotary_pos_embed + apply_rotary_emb (use_real_unbind_dim=-2 to
match the cat-half convention ACE-Step inherits from Qwen3).
* Use get_timestep_embedding with flip_sin_to_cos=True — keeps the
(cos, sin) ordering of the original sinusoidal. State-dict-compatible.
* Drop max_position_embeddings arg from DiT config (RoPE computes freqs
per call based on seq_len); converter drops it.
* Gradient-checkpoint call now takes just the layer module (matches the
Flux2 idiom).
Pipeline modeling file (pipelines/ace_step/modeling_ace_step.py):
* Moved _pack_sequences + AceStepEncoderLayer here — they aren't used
by the DiT, so they shouldn't live in the transformer file.
* AceStepLyricEncoder + AceStepTimbreEncoder set
_supports_gradient_checkpointing = True and wrap encoder-layer calls
through the checkpointing func when enabled.
* Use diffusers RMSNorm + the RoPE helper from the transformer file
(shared single implementation).
Converter (scripts/convert_ace_step_to_diffusers.py):
* model_index.json now carries AceStepTransformer1DModel.
* Drop max_position_embeddings / use_sliding_window from the emitted
configs.
No numerical regressions: scripts/dit_parity_test.py PASSES with
max_abs=0.0 on fp32/SDPA across short, long, batched, and
padding-path shape variants.
* Address PR #13095 review: pipeline polish + converter HF-hub support
Response to dg845 review comments on the pipeline side. DiT parity still
bit-identical (max_abs=0 across 4 shape cases).
Pipeline (pipelines/ace_step/pipeline_ace_step.py):
* Add `sample_rate` + `latents_per_second` properties sourced from the
VAE config so the pipeline no longer hardcodes 48000 / 25 / 1920.
Propagates through prepare_latents, chunk_mask window math, and the
audio-duration round-trip.
* Add `do_classifier_free_guidance` property (matches LTX2 et al.).
* Add `check_inputs(...)` called from `__call__` before allocating noise.
Validates prompt type, lyrics type, task_type, step count, guidance
scale, shift, cfg interval bounds and repaint window ordering.
* Add `callback_on_step_end` + `callback_on_step_end_tensor_inputs` —
the modern callback form. The legacy `callback` / `callback_steps`
pair is kept for back-compat. Setting `pipe._interrupt = True` inside
the callback stops the loop early.
* Expose `encode_audio(audio)` as a public helper that wraps the tiled
VAE encode + (B, T, D) transpose the pipeline performs internally.
Converter (scripts/convert_ace_step_to_diffusers.py):
* Accept a Hugging Face Hub repo id for `--checkpoint_dir`; resolves it
via `huggingface_hub.snapshot_download` when the argument isn't a
local path.
Exports:
* Register `AceStepTransformer1DModel` in the top-level __init__,
models/__init__, models/transformers/__init__, and dummy_pt_objects so
`from diffusers import AceStepTransformer1DModel` works and the
pipeline loader resolves the new class name from model_index.json.
Deferred for a follow-up (commented inline in the PR): full
`Attention + AttnProcessor + dispatch_attention_fn` refactor and
`FlowMatchEulerDiscreteScheduler` migration — both would benefit from a
dedicated parity re-run and review.
* Fix stale ACE-Step 1.0-era docs / class names in the 1.5 integration
Docs and docstrings still carried a mix of 1.0 paper title, non-existent
`ACE-Step/ACE-Step-v1-5-turbo` hub id, `shift=3.0` turbo default, and
the old `AceStepDiTModel` class name. Cleaned up to match the actual
1.5 release:
* pipelines/ace_step.md: correct citation title ("ACE-Step 1.5: Pushing
the Boundaries of Open-Source Music Generation"), correct repo
(`ace-step/ACE-Step-1.5`), new variants table with real HF ids
(`Ace-Step1.5` / `acestep-v15-base` / `acestep-v15-sft`) and their
per-variant step/CFG defaults, drop the wrong `shift=3.0` tip.
* models/ace_step_transformer.md: page renamed to
`AceStepTransformer1DModel` with a short 1.5-specific description;
`AceStepDiTModel` noted as a backwards-compat alias.
* pipeline_ace_step.py: import, docstring, `Args`, and `__init__`
annotation reference `AceStepTransformer1DModel`; example model id
now `ACE-Step/Ace-Step1.5`; `_variant_defaults` docstring and the
`__call__` variant-fallback comment no longer claim `shift=3.0` /
`27 steps` — real defaults are 8 steps / shift=1.0 across all
variants, guidance=1.0 (turbo) vs 7.0 (base+sft).
* Address PR #13095 review: VAE tiling on AutoencoderOobleck + Timesteps class
Two more deferred review threads from dg845 addressed:
* Move tiled encode/decode onto AutoencoderOobleck
(https://github.com/huggingface/diffusers/pull/13095#discussion_r2785513647).
AutoencoderOobleck now carries `use_tiling` + `tile_sample_min_length` /
`tile_sample_overlap` / `tile_latent_min_length` / `tile_latent_overlap`
attributes and private `_tiled_encode` / `_tiled_decode` methods; the
existing `encode` / `_decode` dispatch to them when tiling is enabled and
the input exceeds the threshold. `AutoencoderMixin.enable_tiling()` is
already inherited.
AceStepPipeline's private `_tiled_encode` / `_tiled_decode` and the
`use_tiled_decode` `__call__` arg are gone; `__init__` now calls
`self.vae.enable_tiling()` so the long-audio memory behaviour is preserved
by default. Users can opt out with `pipe.vae.disable_tiling()`.
Note: the VAE-side tiling concatenates encoder features (h) and samples
the posterior once, instead of the old per-tile `.sample()` calls. This
is the standard diffusers pattern; numerically differs only in the
structure of the noise across tile boundaries.
* Use the Timesteps nn.Module for the sinusoid
(https://github.com/huggingface/diffusers/pull/13095#discussion_r2785420234).
`AceStepTimestepEmbedding` wraps `Timesteps(in_channels, flip_sin_to_cos=
True, downscale_freq_shift=0)` instead of calling `get_timestep_embedding`
directly — reviewer asked for the Module form.
* Address PR #13095 review: refactor AceStepAttention to Attention + AttnProcessor
Splits the monolithic AceStepAttention into the diffusers standard
Attention + AttnProcessor layout:
- AceStepAttention (torch.nn.Module, AttentionModuleMixin) holds the
to_q/to_k/to_v/to_out projections and norm_q/norm_k RMSNorms.
- AceStepAttnProcessor2_0 runs the attention dispatch through
dispatch_attention_fn so users can pick flash / sage / native backends
via model.set_attention_backend(...) or the attention_backend context
manager.
GQA (Q has 16 heads / K,V have 8) is preserved by passing enable_gqa=True
to dispatch_attention_fn instead of repeat_interleave; fusion is disabled
(_supports_qkv_fusion = False) because Q and K,V have different output
sizes.
The converter is updated to rename the six attention sub-keys
(q_proj -> to_q, k_proj -> to_k, v_proj -> to_v, o_proj -> to_out.0,
q_norm -> norm_q, k_norm -> norm_k) on both the DiT decoder path and the
condition encoder path, since AceStepLyricEncoder / AceStepTimbreEncoder
share the same AceStepAttention class.
Addresses review comments r2785433213 and r2785450463.
* Address PR #13095 review: migrate to FlowMatchEulerDiscreteScheduler
Replace the hand-rolled flow-matching Euler loop with
`FlowMatchEulerDiscreteScheduler`. ACE-Step still computes its own shifted /
turbo sigma schedule via `_get_timestep_schedule`, but now passes it to
`scheduler.set_timesteps(sigmas=...)` and delegates the ODE step to
`scheduler.step()`. The scheduler is configured with `num_train_timesteps=1`
and `shift=1.0` so `scheduler.timesteps` stays in `[0, 1]` (the convention the
DiT was trained on) and the scheduler doesn't re-shift already-shifted sigmas.
The scheduler's appended terminal `sigma=0` reproduces the old loop's
final-step "project to x0" case exactly: `prev = x + (0 - t_curr) * v`.
Parity on jieyue (seed=42, bf16 + flash-attn, turbo text2music, 8 steps):
waveform Pearson = 0.999999
spectral Pearson = 1.000000
max |diff| = 2.5e-3 (fp32 step-math vs previous bf16 step-math)
fp32 Euler-loop A/B against the hand-rolled path: max |diff| = 3.6e-7.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Address PR #13095 review: move DiT tests + drop stale test kwargs
- Move the DiT transformer tests out of the pipeline test file into a new
tests/models/transformers/test_models_transformer_ace_step.py that follows
the standard BaseModelTesterConfig + ModelTesterMixin scaffold (matches
test_models_transformer_longcat_audio_dit.py).
- Drop `max_position_embeddings` from the remaining AceStepDiTModel and
AceStepConditionEncoder test fixtures — neither constructor accepts that
argument anymore.
- Drop `use_sliding_window` from the same fixtures — also no longer a
constructor argument (the actual `sliding_window` int kwarg is kept).
- Wire `FlowMatchEulerDiscreteScheduler(num_train_timesteps=1, shift=1.0)`
into `get_dummy_components()` now that the pipeline requires it.
Resolves https://github.com/huggingface/diffusers/pull/13095#discussion_r3115653554,
r3115664850, r3115673059, r3115676580, r3115680700.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Address PR #13095 review from dg845 (2026-04-23)
Fixes 5 review threads + style:
1. Converter now builds `AceStepPipeline` in memory and calls
`save_pretrained`. Previously the hand-written `model_index.json` was
missing the `scheduler` entry — fresh converter output couldn't be loaded
by `AceStepPipeline.from_pretrained` (r3127767785). This also makes the
converter robust to future `__init__` signature changes.
2. `latent_length` uses `math.ceil(...)` instead of `int(...)` so non-integer
products (e.g. `latents_per_second=2.0, audio_duration=0.4 → 0.8`) round up
to `1` instead of truncating to `0` and crashing shape checks (r3127790939).
3. Add `_callback_tensor_inputs = ["latents"]` on `AceStepPipeline` so the
standard diffusers callback tests pick up the right tensor (r3127795954).
4. `AceStepConditionEncoder.silence_latent` no longer hard-codes the channel
dim to 64. The placeholder buffer now uses the `timbre_hidden_dim`
constructor argument, so smaller test configs with `timbre_hidden_dim != 64`
load without shape errors (r3127812932).
5. Revert `self.vae.enable_tiling()` from `AceStepPipeline.__init__`. Users can
call `pipe.vae.enable_tiling()` themselves for long-form generation; that
matches the opt-in convention used by the rest of diffusers (r3127777296).
6. `ruff check --fix` + `ruff format` over all ACE-Step sources (the style fix
dg845 asked for via `@bot /style`).
Also: converter now accepts sharded `model.safetensors.index.json` layouts
alongside the single-file `model.safetensors`, so the 5B XL turbo variant
converts without a pre-processing step.
Parity on jieyue (seed=42, bf16 + flash-attn, turbo text2music 160s, fresh
converter output loaded via `from_pretrained`):
waveform Pearson = 0.999954
spectral Pearson = 0.999977
max |a-b| bf16 = 4.3e-02 (dominated by the VAE tiling default flip)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Address PR #13095 review from yiyixuxu (2026-04-23)
Code-level (22 threads):
1. Delete 3 dev/parity scripts (`scripts/audio_parity_jieyue.py`,
`scripts/dit_parity_test.py`, `scripts/run_official_generate_music.py`)
that shouldn't have been committed.
2. Rename `AutoencoderOobleck._encode_one` → `_encode` to match the convention
used by other diffusers VAEs.
3. Delete the hard-coded `SHIFT_TIMESTEPS` / `VALID_SHIFTS` table in
`pipeline_ace_step.py`: the per-shift turbo schedules are recovered
exactly by `linspace(1, 0, N+1)[:-1]` plus the flow-match shift formula
that the non-turbo branch already uses, so a single code path covers both.
4. Drop the backwards-compat `AceStepDiTModel` / `AceStepDiTLayer` aliases
and every reference (top-level `__init__`, `models/__init__`,
`transformers/__init__`, dummy objects, tests, docs toctree, model card).
`AceStepTransformer1DModel` is the only exported name now.
5. Remove the unused `attention_mask` / `encoder_attention_mask` args from
`AceStepTransformer1DModel.forward`; the model rebuilds its masks from
the sequence shape and never consumed them.
6. In the DiT forward and both encoders, pass `None` instead of an all-zero
`full_attn_mask` / `encoder_4d_mask` to non-sliding attention layers — SDPA
dispatches to a faster kernel when the mask is None.
7. Inline the shared `_run_encoder_layers` helper directly into
`AceStepLyricEncoder.forward` / `AceStepTimbreEncoder.forward` so layer
calls are visible at the forward boundary (diffusers style).
8. Move `is_turbo` / `sample_rate` / `latents_per_second` from `@property`s
that re-read module configs each call to cached attributes populated in
`__init__` (Flux2-style), with a default-ACE-Step fallback when
`self.vae` is offloaded. Drop the now-unused `SAMPLE_RATE = 48000`
module-level constant and the three property definitions.
9. Warn + coerce `guidance_scale` to 1.0 on turbo (guidance-distilled)
checkpoints, following `pipeline_flux2_klein`. Prevents over-guided
audio when users forward their base/sft CFG settings to a turbo pipe.
10. Remove the `logger.warning(...)` paths that triggered on
`silence_latent` missing/zero — those only fired for author-side
unconverted checkpoints and tests; end users always load converted
weights where the buffer is baked in.
11. Drop the redundant `with torch.no_grad():` wrappers inside
`encode_prompt` — the pipeline's `__call__` runs under `torch.no_grad`
already.
12. Strip "reviewer comment on PR #13095" attribution comments from three
docstrings (here and everywhere).
Parity on jieyue (seed=42, bf16 + flash-attn, XL turbo 160s text2music):
waveform Pearson = 0.9747
spectral Pearson = 0.9895
The shift comes from full-attention layers switching `attn_mask=0_tensor` →
`attn_mask=None`, which dispatches to a different SDPA kernel on bf16. The
two outputs are algebraically equivalent for fp32 eager; on bf16+FA the
delta is dominated by kernel-level ULPs, well within the sampler-noise
band (ear-check on the 160s example confirms no audible regression).
Still open — AudioTokenizer/Detokenizer (deferred) + APG guider follow-up
(dims differ from `diffusers.guiders.adaptive_projected_guidance`, not a
drop-in; worth a separate PR).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Address ACE-Step audio token and APG review
* Fix ACE-Step docs CI
* Address ACE-Step pipeline cleanup review
* Fix ACE-Step flash attention sliding windows
* Add ACE-Step callback properties
* Address ACE-Step final review comments
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
|
||
|
|
303c1d8b04 |
[Ernie-Image] Add lora support (#13575)
add lora support Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> |
||
|
|
a5bc04696b |
NucleusMoE docs (#13661)
up |
||
|
|
50cb2db4ad |
feat: support ring attention with arbitrary KV sequence lengths (#13545)
* feat: support ring attention with arbitrary KV sequence lengths * fix: align ring_anything with ulysses_anything (size gather + unshard) * docs: document ring_anything mode * fix: merge hook branches, add ring_anything comment + guard * docs: address ring_anything review comments * docs: update ring_anything guidance * docs: refine ring_anything guidance per review * fix: address ring_anything style check --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> |
||
|
|
2173c554ea | [docs] fix typo in AutoencoderOobleck docs (#13642) (#13645) | ||
|
|
656da843e8 |
[docs] add a mention of torchao and other backends in speed memory docs. (#13499)
add a mention of torchao and other backends in speed memory docs. |
||
|
|
947bc23ba4 |
[chore] Add diffusers-format example to LongCatAudioDiTPipeline (#13483)
* [chore] Add diffusers-format example and seed parameter to LongCatAudioDiTPipeline Signed-off-by: Lancer <maruixiang6688@gmail.com> * Apply style fixes * Apply suggestions from code review Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * upd Signed-off-by: Lancer <maruixiang6688@gmail.com> * Apply style fixes --------- Signed-off-by: Lancer <maruixiang6688@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> |
||
|
|
c41a3c3ed8 |
[Feat] Adds LongCat-AudioDiT pipeline (#13390)
* Add LongCat-AudioDiT pipeline Signed-off-by: Lancer <maruixiang6688@gmail.com> * upd Signed-off-by: Lancer <maruixiang6688@gmail.com> * upd * Apply suggestions from code review Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * upd Signed-off-by: Lancer <maruixiang6688@gmail.com> * upd Signed-off-by: Lancer <maruixiang6688@gmail.com> * upd Signed-off-by: Lancer <maruixiang6688@gmail.com> * upd Signed-off-by: Lancer <maruixiang6688@gmail.com> * Apply style fixes * upd Signed-off-by: Lancer <maruixiang6688@gmail.com> * upd Signed-off-by: Lancer <maruixiang6688@gmail.com> * Apply style fixes * upd Signed-off-by: Lancer <maruixiang6688@gmail.com> * Apply style fixes * upd Signed-off-by: Lancer <maruixiang6688@gmail.com> --------- Signed-off-by: Lancer <maruixiang6688@gmail.com> Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> |
||
|
|
dc8d903217 |
Add ernie image (#13432)
Some checks failed
CodeQL Security Analysis For Github Actions / CodeQL Analysis (push) Failing after 10m35s
Build documentation / build (push) Failing after 10m47s
Run dependency tests / check_dependencies (push) Has been cancelled
Run Torch dependency tests / check_torch_dependencies (push) Has been cancelled
Fast GPU Tests on main / Setup Torch Pipelines CUDA Slow Tests Matrix (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (lora) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (models) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (others) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (schedulers) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (single_file) (push) Has been cancelled
Fast GPU Tests on main / PyTorch Compile CUDA tests (push) Has been cancelled
Fast GPU Tests on main / PyTorch xformers CUDA tests (push) Has been cancelled
Fast GPU Tests on main / Examples PyTorch CUDA tests on Ubuntu (push) Has been cancelled
Fast tests on main / Fast PyTorch CPU tests on Ubuntu (push) Has been cancelled
Fast tests on main / PyTorch Example CPU tests on Ubuntu (push) Has been cancelled
Secret Leaks / trufflehog (push) Has been cancelled
Update Diffusers metadata / update_metadata (push) Has been cancelled
Fast GPU Tests on main / Torch Pipelines CUDA Tests (push) Has been cancelled
Nightly and release tests on main/release branch / Setup Torch Pipelines CUDA Slow Tests Matrix (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch Pipelines CUDA Tests (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (examples) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (lora) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (models) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (others) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (schedulers) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (single_file) (push) Has been cancelled
Nightly and release tests on main/release branch / PyTorch Compile CUDA tests (push) Has been cancelled
Nightly and release tests on main/release branch / Torch tests on big GPU (push) Has been cancelled
Nightly and release tests on main/release branch / Torch Minimum Version CUDA Tests (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[] backend:nvidia_modelopt test_location:modelopt]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[] backend:optimum_quanto test_location:quanto]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[] backend:torchao test_location:torchao]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[peft kernels] backend:gguf test_location:gguf]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[peft] backend:bitsandbytes test_location:bnb]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (push) Has been cancelled
Nightly and release tests on main/release branch / Generate Consolidated Test Report (push) Has been cancelled
Test, build, and push Docker images / test-build-docker-images (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-doc-builder) (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-pytorch-cpu) (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-pytorch-cuda) (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-pytorch-minimum-cuda) (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-pytorch-xformers-cuda) (push) Has been cancelled
Stale Bot / Close Stale Issues (push) Has been cancelled
* Add ERNIE-Image * Update doc * Update doc * Change from Custom-Attention to Diffusers Style Attention * Change from Custom-Attention to Diffusers Style Attention * 兼容SGLang * 优化PE模块的加载与offload策略 * 更新Doc文件与config配置相关内容 * Fix官方反馈的内容 * 根据官方建议优化代码 * Update code * update * update * Apply style fixes * update * update * Apply style fixes --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> |
||
|
|
251676dfda |
Fix grammar in LoRA documentation (#13423)
Fix grammar in LoRA documentation (LoRA's → LoRAs, trigger it → trigger them) |
||
|
|
3e53a383e1 |
Fix typos and grammar errors in documentation (#13391)
- Fix 'allows to generate' -> 'allows you to generate' in controlling_generation.md - Fix 'it's refiner' -> 'its refiner' (possessive) in sdxl.md - Fix 'it's state' -> 'its state' (possessive) in reusing_seeds.md - Fix missing word 'you'll a function' -> 'you'll create a function' in sdxl.md |
||
|
|
cf6af6b4f8 |
[docs] add auto docstring and parameter templates documentation for m… (#13382)
* [docs] add auto docstring and parameter templates documentation for modular diffusers Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Update docs/source/en/modular_diffusers/auto_docstring.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/modular_diffusers/auto_docstring.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/modular_diffusers/auto_docstring.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/modular_diffusers/auto_docstring.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/modular_diffusers/auto_docstring.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/modular_diffusers/auto_docstring.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/modular_diffusers/auto_docstring.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/modular_diffusers/auto_docstring.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/_toctree.yml Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * up --------- Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-161-123.ec2.internal> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> |
||
|
|
e365d749a1 |
[docs] deprecate pipelines (#13157)
Some checks failed
Build documentation / build (push) Has been cancelled
CodeQL Security Analysis For Github Actions / CodeQL Analysis (push) Has been cancelled
Run dependency tests / check_dependencies (push) Has been cancelled
Run Torch dependency tests / check_torch_dependencies (push) Has been cancelled
Fast GPU Tests on main / Setup Torch Pipelines CUDA Slow Tests Matrix (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (lora) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (models) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (others) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (schedulers) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (single_file) (push) Has been cancelled
Fast GPU Tests on main / PyTorch Compile CUDA tests (push) Has been cancelled
Fast GPU Tests on main / PyTorch xformers CUDA tests (push) Has been cancelled
Fast GPU Tests on main / Examples PyTorch CUDA tests on Ubuntu (push) Has been cancelled
Fast tests on main / Fast PyTorch CPU tests on Ubuntu (push) Has been cancelled
Fast tests on main / PyTorch Example CPU tests on Ubuntu (push) Has been cancelled
Secret Leaks / trufflehog (push) Has been cancelled
Update Diffusers metadata / update_metadata (push) Has been cancelled
Nightly and release tests on main/release branch / Setup Torch Pipelines CUDA Slow Tests Matrix (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (examples) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (lora) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (models) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (others) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (schedulers) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (single_file) (push) Has been cancelled
Nightly and release tests on main/release branch / PyTorch Compile CUDA tests (push) Has been cancelled
Nightly and release tests on main/release branch / Torch tests on big GPU (push) Has been cancelled
Nightly and release tests on main/release branch / Torch Minimum Version CUDA Tests (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[] backend:nvidia_modelopt test_location:modelopt]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[] backend:optimum_quanto test_location:quanto]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[] backend:torchao test_location:torchao]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[peft kernels] backend:gguf test_location:gguf]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[peft] backend:bitsandbytes test_location:bnb]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (push) Has been cancelled
Test, build, and push Docker images / test-build-docker-images (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-doc-builder) (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-pytorch-cpu) (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-pytorch-cuda) (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-pytorch-minimum-cuda) (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-pytorch-xformers-cuda) (push) Has been cancelled
Fast GPU Tests on main / Torch Pipelines CUDA Tests (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch Pipelines CUDA Tests (push) Has been cancelled
Nightly and release tests on main/release branch / Generate Consolidated Test Report (push) Has been cancelled
Stale Bot / Close Stale Issues (push) Has been cancelled
* deprecate * fix * fix * fix * fix * remove deprecated .md files * update links * fix |
||
|
|
7e463ea4cc |
[docs] Add NeMo Automodel training guide (#13306)
Some checks failed
Build documentation / build (push) Has been cancelled
CodeQL Security Analysis For Github Actions / CodeQL Analysis (push) Has been cancelled
Run dependency tests / check_dependencies (push) Has been cancelled
Run Torch dependency tests / check_torch_dependencies (push) Has been cancelled
Secret Leaks / trufflehog (push) Has been cancelled
Update Diffusers metadata / update_metadata (push) Has been cancelled
Nightly and release tests on main/release branch / Setup Torch Pipelines CUDA Slow Tests Matrix (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch Pipelines CUDA Tests (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (examples) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (lora) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (models) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (others) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (schedulers) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (single_file) (push) Has been cancelled
Nightly and release tests on main/release branch / PyTorch Compile CUDA tests (push) Has been cancelled
Nightly and release tests on main/release branch / Torch tests on big GPU (push) Has been cancelled
Nightly and release tests on main/release branch / Torch Minimum Version CUDA Tests (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[] backend:nvidia_modelopt test_location:modelopt]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[] backend:optimum_quanto test_location:quanto]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[] backend:torchao test_location:torchao]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[peft kernels] backend:gguf test_location:gguf]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[peft] backend:bitsandbytes test_location:bnb]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (push) Has been cancelled
Nightly and release tests on main/release branch / Generate Consolidated Test Report (push) Has been cancelled
Test, build, and push Docker images / test-build-docker-images (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-doc-builder) (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-pytorch-cpu) (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-pytorch-cuda) (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-pytorch-minimum-cuda) (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-pytorch-xformers-cuda) (push) Has been cancelled
* [docs] Add NeMo Automodel training guide Signed-off-by: Pranav Prashant Thombre <pthombre@nvidia.com> * Update docs/source/en/training/nemo_automodel.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/training/nemo_automodel.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * adding contacts into the readme * Apply suggestion from @stevhliu Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestion from @stevhliu Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestion from @stevhliu Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestion from @stevhliu Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestion from @stevhliu Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestion from @stevhliu Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestion from @stevhliu Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestion from @stevhliu Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestion from @stevhliu Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestion from @stevhliu Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestion from @stevhliu Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Address CR comments Signed-off-by: Pranav Prashant Thombre <pthombre@nvidia.com> * Update docs/source/en/training/nemo_automodel.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/training/nemo_automodel.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Signed-off-by: Pranav Prashant Thombre <pthombre@nvidia.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: linnan wang <wangnan318@gmail.com> |
||
|
|
1fe2125802 |
remove str option for quantization config in torchao (#13291)
Some checks failed
CodeQL Security Analysis For Github Actions / CodeQL Analysis (push) Has been cancelled
Run dependency tests / check_dependencies (push) Has been cancelled
Run Torch dependency tests / check_torch_dependencies (push) Has been cancelled
Secret Leaks / trufflehog (push) Has been cancelled
Update Diffusers metadata / update_metadata (push) Has been cancelled
Build documentation / build (push) Has been cancelled
Fast GPU Tests on main / Setup Torch Pipelines CUDA Slow Tests Matrix (push) Has been cancelled
Fast GPU Tests on main / Torch Pipelines CUDA Tests (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (lora) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (models) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (others) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (schedulers) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (single_file) (push) Has been cancelled
Fast GPU Tests on main / PyTorch Compile CUDA tests (push) Has been cancelled
Fast GPU Tests on main / PyTorch xformers CUDA tests (push) Has been cancelled
Fast GPU Tests on main / Examples PyTorch CUDA tests on Ubuntu (push) Has been cancelled
Fast tests on main / Fast PyTorch CPU tests on Ubuntu (push) Has been cancelled
Fast tests on main / PyTorch Example CPU tests on Ubuntu (push) Has been cancelled
Stale Bot / Close Stale Issues (push) Has been cancelled
* remove str option for quantization config in torchao * Apply style fixes * minor fixes * Added AOBaseConfig docs to torchao.md * minor fixes for removing str option torchao * minor change to add back int and uint check * minor fixes * minor fixes to tests * Update tests/quantization/torchao/test_torchao.py Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Update docs/source/en/quantization/torchao.md Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Update tests/quantization/torchao/test_torchao.py Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * version=2 update to test_torchao.py --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> |
||
|
|
7298f5be93 |
Update LTX-2 Docs to Cover LTX-2.3 Models (#13337)
* Update LTX-2 docs to cover multimodal guidance and prompt enhancement * Apply suggestions from code review Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply reviewer feedback --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> |
||
|
|
cbf4d9a3c3 |
[docs] kernels (#13139)
* kernels * feedback |
||
|
|
762ae059fa |
[LLADA2] documentation fixes (#13333)
documentation fixes |
||
|
|
5d207e756e |
[Discrete Diffusion] Add LLaDA2 pipeline (#13226)
* feat: add LLaDA2 and BlockRefinement pipelines for discrete text diffusion Add support for LLaDA2/LLaDA2.1 discrete diffusion text generation: - BlockRefinementPipeline: block-wise iterative refinement with confidence-based token commitment, supporting editing threshold for LLaDA2.1 models - LLaDA2Pipeline: convenience wrapper with LLaDA2-specific defaults - DiscreteDiffusionPipelineMixin: shared SAR sampling utilities (top-k, top-p, temperature) and prompt/prefix helpers - compute_confidence_aware_loss: CAP-style training loss - Examples: sampling scripts for LLaDA2 and block refinement, training scripts with Qwen causal LM - Docs and tests included * feat: add BlockRefinementScheduler for commit-by-confidence scheduling Extract the confidence-based token commit logic from BlockRefinementPipeline into a dedicated BlockRefinementScheduler, following diffusers conventions. The scheduler owns: - Transfer schedule computation (get_num_transfer_tokens) - Timestep management (set_timesteps) - Step logic: confidence-based mask-filling and optional token editing The pipeline now delegates scheduling to self.scheduler.step() and accepts a scheduler parameter in __init__. * test: add unit tests for BlockRefinementScheduler 12 tests covering set_timesteps, get_num_transfer_tokens, step logic (confidence-based commits, threshold behavior, editing, prompt masking, batched inputs, tuple output). * docs: add toctree entries and standalone scheduler doc page - Add BlockRefinement and LLaDA2 to docs sidebar navigation - Add BlockRefinementScheduler to schedulers sidebar navigation - Move scheduler autodoc to its own page under api/schedulers/ * feat: add --revision flag and fix dtype deprecation in sample_llada2.py - Add --revision argument for loading model revisions from the Hub - Replace deprecated torch_dtype with dtype for transformers 5.x compat * fix: use 1/0 attention mask instead of 0/-inf for LLaDA2 compat LLaDA2 models expect a boolean-style (1/0) attention mask, not an additive (0/-inf) mask. The model internally converts to additive, so passing 0/-inf caused double-masking and gibberish output. * refactor: consolidate training scripts into single train_block_refinement.py - Remove toy train_block_refinement_cap.py (self-contained demo with tiny model) - Rename train_block_refinement_qwen_cap.py to train_block_refinement.py (already works with any causal LM via AutoModelForCausalLM) - Fix torch_dtype deprecation and update README with correct script names * fix formatting * docs: improve LLaDA2 and BlockRefinement documentation - Add usage examples with real model IDs and working code - Add recommended parameters table for LLaDA2.1 quality/speed modes - Note that editing is LLaDA2.1-only (not for LLaDA2.0 models) - Remove misleading config defaults section from BlockRefinement docs * feat: set LLaDA2Pipeline defaults to recommended model parameters - threshold: 0.95 -> 0.7 (quality mode) - max_post_steps: 0 -> 16 (recommended for LLaDA2.1, harmless for 2.0) - eos_early_stop: False -> True (stop at EOS token) block_length=32, steps=32, temperature=0.0 were already correct. editing_threshold remains None (users enable for LLaDA2.1 models). * feat: default editing_threshold=0.5 for LLaDA2.1 quality mode LLaDA2.1 is the current generation. Users with LLaDA2.0 models can disable editing by passing editing_threshold=None. * fix: align sampling utilities with official LLaDA2 implementation - top_p filtering: add shift-right to preserve at least one token above threshold (matches official code line 1210) - temperature ordering: apply scaling before top-k/top-p filtering so filtering operates on scaled logits (matches official code lines 1232-1235) - greedy branch: return argmax directly when temperature=0 without filtering (matches official code lines 1226-1230) * refactor: remove duplicate prompt encoding, reuse mixin's _prepare_input_ids LLaDA2Pipeline._prepare_prompt_ids was a near-copy of DiscreteDiffusionPipelineMixin._prepare_input_ids. Remove the duplicate and call the mixin method directly. Also simplify _extract_input_ids since we always pass return_dict=True. * formatting * fix: replace deprecated torch_dtype with dtype in examples and docstrings - Update EXAMPLE_DOC_STRING to use dtype= and LLaDA2.1-mini model ID - Fix sample_block_refinement.py to use dtype= * remove BlockRefinementPipeline * cleanup * fix readme * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * removed DiscreteDiffusionPipelineMixin * add support for 2d masks for flash attn * Update src/diffusers/training_utils.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/training_utils.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * fix issues from review * added tests * formatting * add check_eos_finished to scheduler * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/schedulers/scheduling_block_refinement.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/schedulers/scheduling_block_refinement.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * fix renaming issues and types * remove duplicate check * Update docs/source/en/api/pipelines/llada2.md Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/pipelines/llada2/pipeline_llada2.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> --------- Co-authored-by: YiYi Xu <yixu310@gmail.com> Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> |
||
|
|
ef309a1bb0 |
Add KVAE 1.0 (#13033)
Some checks failed
Build documentation / build (push) Has been cancelled
CodeQL Security Analysis For Github Actions / CodeQL Analysis (push) Has been cancelled
Run dependency tests / check_dependencies (push) Has been cancelled
Run Torch dependency tests / check_torch_dependencies (push) Has been cancelled
Fast GPU Tests on main / Setup Torch Pipelines CUDA Slow Tests Matrix (push) Has been cancelled
Fast GPU Tests on main / Torch Pipelines CUDA Tests (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (lora) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (models) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (others) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (schedulers) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (single_file) (push) Has been cancelled
Fast GPU Tests on main / PyTorch Compile CUDA tests (push) Has been cancelled
Fast GPU Tests on main / PyTorch xformers CUDA tests (push) Has been cancelled
Fast GPU Tests on main / Examples PyTorch CUDA tests on Ubuntu (push) Has been cancelled
Fast tests on main / Fast PyTorch CPU tests on Ubuntu (push) Has been cancelled
Fast tests on main / PyTorch Example CPU tests on Ubuntu (push) Has been cancelled
Secret Leaks / trufflehog (push) Has been cancelled
Update Diffusers metadata / update_metadata (push) Has been cancelled
* add kvae2d * add kvae3d video * add docs for kvae2d and kvae3d video * style fixes * fix kvae3d docs * fix normalzation * fix kvae video for code style * fix kvae video * kvae minor fixes * add gradient ckpting for kvaes * get rid of inplace ops kvae video * add tests for KVAEs * kvae2d normalization style change * kvaes fix style * update dummy_pt_objects test for kvaes --------- Co-authored-by: YiYi Xu <yixu310@gmail.com> |
||
|
|
0b35834351 |
[core] fa4 support. (#13280)
Some checks failed
Build documentation / build (push) Has been cancelled
CodeQL Security Analysis For Github Actions / CodeQL Analysis (push) Has been cancelled
Run dependency tests / check_dependencies (push) Has been cancelled
Run Torch dependency tests / check_torch_dependencies (push) Has been cancelled
Fast GPU Tests on main / Setup Torch Pipelines CUDA Slow Tests Matrix (push) Has been cancelled
Fast GPU Tests on main / Torch Pipelines CUDA Tests (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (lora) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (models) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (others) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (schedulers) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (single_file) (push) Has been cancelled
Fast GPU Tests on main / PyTorch Compile CUDA tests (push) Has been cancelled
Fast GPU Tests on main / PyTorch xformers CUDA tests (push) Has been cancelled
Fast GPU Tests on main / Examples PyTorch CUDA tests on Ubuntu (push) Has been cancelled
Fast tests on main / Fast PyTorch CPU tests on Ubuntu (push) Has been cancelled
Fast tests on main / PyTorch Example CPU tests on Ubuntu (push) Has been cancelled
Secret Leaks / trufflehog (push) Has been cancelled
Update Diffusers metadata / update_metadata (push) Has been cancelled
Stale Bot / Close Stale Issues (push) Has been cancelled
Nightly and release tests on main/release branch / Setup Torch Pipelines CUDA Slow Tests Matrix (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch Pipelines CUDA Tests (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (examples) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (lora) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (models) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (others) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (schedulers) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (single_file) (push) Has been cancelled
Nightly and release tests on main/release branch / PyTorch Compile CUDA tests (push) Has been cancelled
Nightly and release tests on main/release branch / Torch tests on big GPU (push) Has been cancelled
Nightly and release tests on main/release branch / Torch Minimum Version CUDA Tests (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[] backend:nvidia_modelopt test_location:modelopt]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[] backend:optimum_quanto test_location:quanto]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[] backend:torchao test_location:torchao]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[peft kernels] backend:gguf test_location:gguf]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[peft] backend:bitsandbytes test_location:bnb]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (push) Has been cancelled
Nightly and release tests on main/release branch / Generate Consolidated Test Report (push) Has been cancelled
Test, build, and push Docker images / test-build-docker-images (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-doc-builder) (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-pytorch-cpu) (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-pytorch-cuda) (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-pytorch-minimum-cuda) (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-pytorch-xformers-cuda) (push) Has been cancelled
* start fa4 support. * up * specify minimum version |
||
|
|
a13e5cf9fc |
[agents]support skills (#13269)
Some checks failed
Build documentation / build (push) Has been cancelled
CodeQL Security Analysis For Github Actions / CodeQL Analysis (push) Has been cancelled
Run dependency tests / check_dependencies (push) Has been cancelled
Run Torch dependency tests / check_torch_dependencies (push) Has been cancelled
Secret Leaks / trufflehog (push) Has been cancelled
Update Diffusers metadata / update_metadata (push) Has been cancelled
* support skills * update * Apply suggestions from code review Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * update baSeed on new best practice * Update .ai/skills/parity-testing/pitfalls.md Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * update --------- Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-161-123.ec2.internal> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-160-103.ec2.internal> Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> |
||
|
|
ed31974c3e |
[docs] updates (#13248)
Some checks failed
Build documentation / build (push) Has been cancelled
CodeQL Security Analysis For Github Actions / CodeQL Analysis (push) Has been cancelled
Run dependency tests / check_dependencies (push) Has been cancelled
Run Torch dependency tests / check_torch_dependencies (push) Has been cancelled
Secret Leaks / trufflehog (push) Has been cancelled
Update Diffusers metadata / update_metadata (push) Has been cancelled
* fixes * few more links * update zh * fix |
||
|
|
e5aa719241 |
Add AGENTS.md (#13259)
Some checks failed
Build documentation / build (push) Has been cancelled
CodeQL Security Analysis For Github Actions / CodeQL Analysis (push) Has been cancelled
Mirror Community Pipeline / mirror_community_pipeline (push) Has been cancelled
Run dependency tests / check_dependencies (push) Has been cancelled
Run Torch dependency tests / check_torch_dependencies (push) Has been cancelled
Fast GPU Tests on main / Setup Torch Pipelines CUDA Slow Tests Matrix (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (lora) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (models) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (others) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (schedulers) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (single_file) (push) Has been cancelled
Fast GPU Tests on main / PyTorch Compile CUDA tests (push) Has been cancelled
Fast GPU Tests on main / PyTorch xformers CUDA tests (push) Has been cancelled
Fast GPU Tests on main / Examples PyTorch CUDA tests on Ubuntu (push) Has been cancelled
Fast tests on main / Fast PyTorch CPU tests on Ubuntu (push) Has been cancelled
Fast tests on main / PyTorch Example CPU tests on Ubuntu (push) Has been cancelled
Secret Leaks / trufflehog (push) Has been cancelled
Update Diffusers metadata / update_metadata (push) Has been cancelled
Benchmarking tests / Torch Core Models CUDA Benchmarking Tests (push) Has been cancelled
Stale Bot / Close Stale Issues (push) Has been cancelled
Nightly and release tests on main/release branch / Setup Torch Pipelines CUDA Slow Tests Matrix (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (examples) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (lora) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (models) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (others) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (schedulers) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (single_file) (push) Has been cancelled
Nightly and release tests on main/release branch / PyTorch Compile CUDA tests (push) Has been cancelled
Nightly and release tests on main/release branch / Torch tests on big GPU (push) Has been cancelled
Nightly and release tests on main/release branch / Torch Minimum Version CUDA Tests (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[] backend:nvidia_modelopt test_location:modelopt]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[] backend:optimum_quanto test_location:quanto]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[] backend:torchao test_location:torchao]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[peft kernels] backend:gguf test_location:gguf]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[peft] backend:bitsandbytes test_location:bnb]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (push) Has been cancelled
Test, build, and push Docker images / test-build-docker-images (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-doc-builder) (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-pytorch-cpu) (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-pytorch-cuda) (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-pytorch-minimum-cuda) (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-pytorch-xformers-cuda) (push) Has been cancelled
Fast GPU Tests on main / Torch Pipelines CUDA Tests (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch Pipelines CUDA Tests (push) Has been cancelled
Nightly and release tests on main/release branch / Generate Consolidated Test Report (push) Has been cancelled
* add a draft * add * up * Apply suggestions from code review Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> |
||
|
|
764f7ede33 |
[core] Flux2 klein kv followups (#13264)
* implement Flux2Transformer2DModelOutput. * add output class to docs. * add Flux2KleinKV to docs. * add pipeline tests for klein kv. |
||
|
|
0a2c26d0a4 |
Update Documentation for NVIDIA Cosmos (#13251)
* fix docs * update main example |
||
|
|
bd7a7a0b95 |
Optimize Helios docs (#13222)
optimize helios docs |
||
|
|
8ec0a5ccad |
feat: implement rae autoencoder. (#13046)
Some checks failed
Build documentation / build (push) Has been cancelled
CodeQL Security Analysis For Github Actions / CodeQL Analysis (push) Has been cancelled
Run dependency tests / check_dependencies (push) Has been cancelled
Run Torch dependency tests / check_torch_dependencies (push) Has been cancelled
Fast GPU Tests on main / Setup Torch Pipelines CUDA Slow Tests Matrix (push) Has been cancelled
Fast GPU Tests on main / Torch Pipelines CUDA Tests (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (lora) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (models) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (others) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (schedulers) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (single_file) (push) Has been cancelled
Fast GPU Tests on main / PyTorch Compile CUDA tests (push) Has been cancelled
Fast GPU Tests on main / PyTorch xformers CUDA tests (push) Has been cancelled
Fast GPU Tests on main / Examples PyTorch CUDA tests on Ubuntu (push) Has been cancelled
Fast tests on main / Fast PyTorch CPU tests on Ubuntu (push) Has been cancelled
Fast tests on main / PyTorch Example CPU tests on Ubuntu (push) Has been cancelled
Secret Leaks / trufflehog (push) Has been cancelled
Update Diffusers metadata / update_metadata (push) Has been cancelled
* feat: implement three RAE encoders(dinov2, siglip2, mae) * feat: finish first version of autoencoder_rae * fix formatting * make fix-copies * initial doc * fix latent_mean / latent_var init types to accept config-friendly inputs * use mean and std convention * cleanup * add rae to diffusers script * use imports * use attention * remove unneeded class * example traiing script * input and ground truth sizes have to be the same * fix argument * move loss to training script * cleanup * simplify mixins * fix training script * fix entrypoint for instantiating the AutoencoderRAE * added encoder_image_size config * undo last change * fixes from pretrained weights * cleanups * address reviews * fix train script to use pretrained * fix conversion script review * latebt normalization buffers are now always registered with no-op defaults * Update examples/research_projects/autoencoder_rae/README.md Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Update src/diffusers/models/autoencoders/autoencoder_rae.py Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * use image url * Encoder is frozen * fix slow test * remove config * use ModelTesterMixin and AutoencoderTesterMixin * make quality * strip final layernorm when converting * _strip_final_layernorm_affine for training script * fix test * add dispatch forward and update conversion script * update training script * error out as soon as possible and add comments * Update src/diffusers/models/autoencoders/autoencoder_rae.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * use buffer * inline * Update src/diffusers/models/autoencoders/autoencoder_rae.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * remove optional * _noising takes a generator * Update src/diffusers/models/autoencoders/autoencoder_rae.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * fix api * rename * remove unittest * use randn_tensor * fix device map on multigpu * check if the key is missing in the original state dict and only then add to the allow_missing set * remove initialize_weights --------- Co-authored-by: wangyuqi <wangyuqi@MBP-FJDQNJTWYN-0208.local> Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> |
||
|
|
ae5881ba77 |
Fix Helios paper link in documentation (#13213)
* Fix Helios paper link in documentation Updated the link to the Helios paper for accuracy. * Fix reference link in HeliosTransformer3DModel documentation Updated the reference link for the Helios Transformer model paper. * Update Helios research paper link in documentation * Update Helios research paper link in documentation |
||
|
|
ab6040ab2d |
Add LTX2 Condition Pipeline (#13058)
Some checks failed
Build documentation / build (push) Has been cancelled
CodeQL Security Analysis For Github Actions / CodeQL Analysis (push) Has been cancelled
Run dependency tests / check_dependencies (push) Has been cancelled
Run Torch dependency tests / check_torch_dependencies (push) Has been cancelled
Fast GPU Tests on main / Setup Torch Pipelines CUDA Slow Tests Matrix (push) Has been cancelled
Fast GPU Tests on main / Torch Pipelines CUDA Tests (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (lora) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (models) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (others) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (schedulers) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (single_file) (push) Has been cancelled
Fast GPU Tests on main / PyTorch Compile CUDA tests (push) Has been cancelled
Fast GPU Tests on main / PyTorch xformers CUDA tests (push) Has been cancelled
Fast GPU Tests on main / Examples PyTorch CUDA tests on Ubuntu (push) Has been cancelled
Fast tests on main / Fast PyTorch CPU tests on Ubuntu (push) Has been cancelled
Fast tests on main / PyTorch Example CPU tests on Ubuntu (push) Has been cancelled
Secret Leaks / trufflehog (push) Has been cancelled
Update Diffusers metadata / update_metadata (push) Has been cancelled
* LTX2 condition pipeline initial commit * Fix pipeline import error * Implement LTX-2-style general image conditioning * Blend denoising output and clean latents in sample space instead of velocity space * make style and make quality * make fix-copies * Rename LTX2VideoCondition image to frames * Update LTX2ConditionPipeline example * Remove support for image and video in __call__ * Put latent_idx_from_index logic inline * Improve comment on using the conditioning mask in denoising loop * Apply suggestions from code review Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com> * make fix-copies * Migrate to Python 3.9+ style type annotations without explicit typing imports * Forward kwargs from preprocess/postprocess_video to preprocess/postprocess resp. * Center crop LTX-2 conditions following original code * Duplicate video and audio position ids if using CFG * make style and make quality * Remove unused index_type arg to preprocess_conditions * Add # Copied from for _normalize_latents * Fix _normalize_latents # Copied from statement * Add LTX-2 condition pipeline docs * Remove TODOs * Support only unpacked latents (5D for video, 4D for audio) * Remove # Copied from for prepare_audio_latents --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com> |
||
|
|
33f785b444 |
Add Helios-14B Video Generation Pipelines (#13208)
* [1/N] add helios * fix test * make fix-copies * change script path * fix cus script * update docs * fix documented check * update links for docs and examples * change default config * small refactor * add test * Update src/diffusers/models/transformers/transformer_helios.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * remove register_buffer for _scale_cache * fix non-cuda devices error * remove "handle the case when timestep is 2D" * refactor HeliosMultiTermMemoryPatch and process_input_hidden_states * Update src/diffusers/pipelines/helios/pipeline_helios.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/models/transformers/transformer_helios.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/pipelines/helios/pipeline_helios.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * fix calculate_shift * Update src/diffusers/pipelines/helios/pipeline_helios.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * rewritten `einops` in pure `torch` * fix: pass patch_size to apply_schedule_shift instead of hardcoding * remove the logics of 'vae_decode_type' * move some validation into check_inputs() * rename helios scheduler & merge all into one step() * add some details to doc * move dmd step() logics from pipeline to scheduler * change to Python 3.9+ style type * fix NoneType error * refactor DMD scheduler's set_timestep * change rope related vars name * fix stage2 sample * fix dmd sample * Update src/diffusers/models/transformers/transformer_helios.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/models/transformers/transformer_helios.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * remove redundant & refactor norm_out * Update src/diffusers/pipelines/helios/pipeline_helios.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * change "is_keep_x0" to "keep_first_frame" * use a more intuitive name * refactor dynamic_time_shifting * remove use_dynamic_shifting args * remove usage of UniPCMultistepScheduler * separate stage2 sample to HeliosPyramidPipeline * Update src/diffusers/models/transformers/transformer_helios.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/models/transformers/transformer_helios.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/models/transformers/transformer_helios.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * Update src/diffusers/models/transformers/transformer_helios.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * fix transformer * use a more intuitive name * update example script * fix requirements * remove redudant attention mask * fix * optimize pipelines * make style . * update TYPE_CHECKING * change to use torch.split Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * derive memory patch sizes from patch_size multiples * remove some hardcoding * move some checks into check_inputs * refactor sample_block_noise * optimize encoding chunks logits for v2v * use num_history_latent_frames = sum(history_sizes) * Update src/diffusers/pipelines/helios/pipeline_helios.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * remove redudant optimized_scale * Update src/diffusers/pipelines/helios/pipeline_helios_pyramid.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * use more descriptive name * optimize history_latents * remove not used "num_inference_steps" * removed redudant "pyramid_num_stages" * add "is_cfg_zero_star" and "is_distilled" to HeliosPyramidPipeline * remove redudant * change example scripts name * change example scripts name * correct docs * update example * update docs * Update tests/models/transformers/test_models_transformer_helios.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update tests/models/transformers/test_models_transformer_helios.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * separate HeliosDMDScheduler * fix numerical stability issue: * Update src/diffusers/schedulers/scheduling_helios_dmd.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/schedulers/scheduling_helios_dmd.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/schedulers/scheduling_helios_dmd.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/schedulers/scheduling_helios_dmd.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * Update src/diffusers/schedulers/scheduling_helios_dmd.py Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> * remove redudant * small refactor * remove use_interpolate_prompt logits * simplified model test * fallbackt to BaseModelTesterConfig * remove _maybe_expand_t2v_lora_for_i2v * fix HeliosLoraLoaderMixin * update docs * use randn_tensor for test * fix doc typo * optimize code * mark torch.compile xfail * change paper name * Make get_dummy_inputs deterministic using self.generator * Set less strict threshold for test_save_load_float16 test for Helios pipeline * make style and make quality * Preparation for merging * add torch.Generator * Fix HeliosPipelineOutput doc path * Fix Helios related (optimize docs & remove redudant) (#13210) * fix docs * remove redudant * remove redudant * fix group offload * Removed fixes for group offload --------- Co-authored-by: yuanshenghai <yuanshenghai@bytedance.com> Co-authored-by: Shenghai Yuan <140951558+SHYuanBest@users.noreply.github.com> Co-authored-by: YiYi Xu <yixu310@gmail.com> Co-authored-by: SHYuanBest <shyuan-cs@hotmail.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> |
||
|
|
4a2833c1c2 |
[Modular] implement requirements validation for custom blocks (#12196)
* feat: implement requirements validation for custom blocks. * up * unify. * up * add tests * Apply suggestions from code review Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> * reviewer feedback. * [docs] validation for custom blocks (#13156) validation * move to tmp_path fixture. * propagate to conditional and loopsequential blocks. * up * remove collected tests --------- Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> |
||
|
|
3fd14f1acf |
[AutoModel] Allow registering auto_map to model config (#13186)
* update * update |
||
|
|
680076fcc0 |
[Modular] update the auto pipeline blocks doc (#13148)
* update * Apply suggestion from @yiyixuxu * Update docs/source/en/modular_diffusers/auto_pipeline_blocks.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/modular_diffusers/auto_pipeline_blocks.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/modular_diffusers/auto_pipeline_blocks.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/modular_diffusers/auto_pipeline_blocks.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * add to api --------- Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-160-103.ec2.internal> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-161-123.ec2.internal> |
||
|
|
212db7b999 |
Cosmos Transfer2.5 Auto-Regressive Inference Pipeline (#13114)
* AR * address comments * address comments 2 |
||
|
|
aac94befce |
[docs] Fix torchrun command argument order in docs (#13181)
Some checks failed
Build documentation / build (push) Has been cancelled
CodeQL Security Analysis For Github Actions / CodeQL Analysis (push) Has been cancelled
Run dependency tests / check_dependencies (push) Has been cancelled
Run Torch dependency tests / check_torch_dependencies (push) Has been cancelled
Secret Leaks / trufflehog (push) Has been cancelled
Update Diffusers metadata / update_metadata (push) Has been cancelled
Stale Bot / Close Stale Issues (push) Has been cancelled
Nightly and release tests on main/release branch / Setup Torch Pipelines CUDA Slow Tests Matrix (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch Pipelines CUDA Tests (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (examples) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (lora) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (models) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (others) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (schedulers) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (single_file) (push) Has been cancelled
Nightly and release tests on main/release branch / PyTorch Compile CUDA tests (push) Has been cancelled
Nightly and release tests on main/release branch / Torch tests on big GPU (push) Has been cancelled
Nightly and release tests on main/release branch / Torch Minimum Version CUDA Tests (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[] backend:nvidia_modelopt test_location:modelopt]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[] backend:optimum_quanto test_location:quanto]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[] backend:torchao test_location:torchao]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[peft kernels] backend:gguf test_location:gguf]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[peft] backend:bitsandbytes test_location:bnb]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (push) Has been cancelled
Nightly and release tests on main/release branch / Generate Consolidated Test Report (push) Has been cancelled
Test, build, and push Docker images / test-build-docker-images (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-doc-builder) (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-pytorch-cpu) (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-pytorch-cuda) (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-pytorch-minimum-cuda) (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-pytorch-xformers-cuda) (push) Has been cancelled
Fix torchrun command argument order in docs |
||
|
|
6875490c3b |
[docs] add docs for qwenimagelayered (#13158)
Some checks failed
Build documentation / build (push) Has been cancelled
CodeQL Security Analysis For Github Actions / CodeQL Analysis (push) Has been cancelled
Run dependency tests / check_dependencies (push) Has been cancelled
Run Torch dependency tests / check_torch_dependencies (push) Has been cancelled
Fast GPU Tests on main / Setup Torch Pipelines CUDA Slow Tests Matrix (push) Has been cancelled
Fast GPU Tests on main / Torch Pipelines CUDA Tests (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (lora) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (models) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (others) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (schedulers) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (single_file) (push) Has been cancelled
Fast GPU Tests on main / PyTorch Compile CUDA tests (push) Has been cancelled
Fast GPU Tests on main / PyTorch xformers CUDA tests (push) Has been cancelled
Fast GPU Tests on main / Examples PyTorch CUDA tests on Ubuntu (push) Has been cancelled
Fast tests on main / Fast PyTorch CPU tests on Ubuntu (push) Has been cancelled
Fast tests on main / PyTorch Example CPU tests on Ubuntu (push) Has been cancelled
Secret Leaks / trufflehog (push) Has been cancelled
Update Diffusers metadata / update_metadata (push) Has been cancelled
Nightly and release tests on main/release branch / Setup Torch Pipelines CUDA Slow Tests Matrix (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch Pipelines CUDA Tests (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (examples) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (lora) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (models) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (others) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (schedulers) (push) Has been cancelled
Nightly and release tests on main/release branch / Nightly Torch CUDA Tests (single_file) (push) Has been cancelled
Nightly and release tests on main/release branch / PyTorch Compile CUDA tests (push) Has been cancelled
Nightly and release tests on main/release branch / Torch tests on big GPU (push) Has been cancelled
Nightly and release tests on main/release branch / Torch Minimum Version CUDA Tests (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[] backend:nvidia_modelopt test_location:modelopt]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[] backend:optimum_quanto test_location:quanto]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[] backend:torchao test_location:torchao]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[peft kernels] backend:gguf test_location:gguf]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (map[additional_deps:[peft] backend:bitsandbytes test_location:bnb]) (push) Has been cancelled
Nightly and release tests on main/release branch / Torch quantization nightly tests (push) Has been cancelled
Nightly and release tests on main/release branch / Generate Consolidated Test Report (push) Has been cancelled
Test, build, and push Docker images / test-build-docker-images (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-doc-builder) (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-pytorch-cpu) (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-pytorch-cuda) (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-pytorch-minimum-cuda) (push) Has been cancelled
Test, build, and push Docker images / build-and-push-docker-images (diffusers-pytorch-xformers-cuda) (push) Has been cancelled
* add example * feedback |
||
|
|
59e7a46928 |
[Pipelines] Remove k-diffusion (#13152)
* remove k-diffusion * fix copies |
||
|
|
c919ec0611 |
[Modular] add explicit workflow support (#13028)
Some checks failed
Build documentation / build (push) Has been cancelled
CodeQL Security Analysis For Github Actions / CodeQL Analysis (push) Has been cancelled
Run dependency tests / check_dependencies (push) Has been cancelled
Run Torch dependency tests / check_torch_dependencies (push) Has been cancelled
Fast GPU Tests on main / Setup Torch Pipelines CUDA Slow Tests Matrix (push) Has been cancelled
Fast GPU Tests on main / Torch Pipelines CUDA Tests (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (lora) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (models) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (others) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (schedulers) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (single_file) (push) Has been cancelled
Fast GPU Tests on main / PyTorch Compile CUDA tests (push) Has been cancelled
Fast GPU Tests on main / PyTorch xformers CUDA tests (push) Has been cancelled
Fast GPU Tests on main / Examples PyTorch CUDA tests on Ubuntu (push) Has been cancelled
Fast tests on main / Fast PyTorch CPU tests on Ubuntu (push) Has been cancelled
Fast tests on main / PyTorch Example CPU tests on Ubuntu (push) Has been cancelled
Secret Leaks / trufflehog (push) Has been cancelled
Update Diffusers metadata / update_metadata (push) Has been cancelled
* up * up up * update outputs * style * add modular_auto_docstring! * more auto docstring * style * up up up * more more * up * address feedbacks * add TODO in the description for empty docstring * refactor based on dhruv's feedback: remove the class method * add template method * up * up up up * apply auto docstring * make style * rmove space in make docstring * Apply suggestions from code review * revert change in z * fix * Apply style fixes * include auto-docstring check in the modular ci. (#13004) * initial support: workflow * up up * treeat loop sequential pipeline blocks as leaf * update qwen image docstring note * add workflow support for sdxl * add a test suit * add test for qwen-image * refactor flux a bit, seperate modular_blocks into modular_blocks_flux and modular_blocks_flux_kontext + support workflow * refactor flux2: seperate blocks for klein_base + workflow * qwen: remove import support for stuff other than the default blocks * add workflow support for wan * sdxl: remove some imports: * refactor z * update flux2 auto core denoise * add workflow test for z and flux2 * Apply suggestions from code review * Apply suggestions from code review * add test for flux * add workflow test for flux * add test for flux-klein * sdxl: modular_blocks.py -> modular_blocks_stable_diffusion_xl.py * style * up * add auto docstring * workflow_names -> available_workflows * fix workflow test for klein base * Apply suggestions from code review Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> * fix workflow tests * qwen: edit -> image_conditioned to be consistent with flux kontext/2 such * remove Optional * update type hints * update guider update_components * fix more * update docstring auto again --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-160-103.ec2.internal> Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-161-123.ec2.internal> Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> |
||
|
|
3c7506b294 |
[Modular] update doc for ModularPipeline (#13100)
* update create pipeline section * update more * update more * more * add a section on running pipeline moduarly * refactor update_components, remove support for spec * style * bullet points * update the pipeline block * small fix in state doc * update sequential doc * fix link * small update on quikstart * add a note on how to run pipeline without the componen4ts manager * Apply suggestions from code review Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * remove the supported models mention * update more * up * revert type hint changes --------- Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-160-103.ec2.internal> Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-161-123.ec2.internal> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> |
||
|
|
bedc67c75f |
[Docs] Add guide for AutoModel with custom code (#13099)
update |
||
|
|
baaa8d040b |
LTX 2 Improve encode_video by Accepting More Input Types (#13057)
Some checks failed
Build documentation / build (push) Has been cancelled
CodeQL Security Analysis For Github Actions / CodeQL Analysis (push) Has been cancelled
Run dependency tests / check_dependencies (push) Has been cancelled
Run Torch dependency tests / check_torch_dependencies (push) Has been cancelled
Fast GPU Tests on main / Setup Torch Pipelines CUDA Slow Tests Matrix (push) Has been cancelled
Fast GPU Tests on main / Torch Pipelines CUDA Tests (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (lora) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (models) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (others) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (schedulers) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (single_file) (push) Has been cancelled
Fast GPU Tests on main / PyTorch Compile CUDA tests (push) Has been cancelled
Fast GPU Tests on main / PyTorch xformers CUDA tests (push) Has been cancelled
Fast GPU Tests on main / Examples PyTorch CUDA tests on Ubuntu (push) Has been cancelled
Fast tests on main / Fast PyTorch CPU tests on Ubuntu (push) Has been cancelled
Fast tests on main / PyTorch Example CPU tests on Ubuntu (push) Has been cancelled
Secret Leaks / trufflehog (push) Has been cancelled
Update Diffusers metadata / update_metadata (push) Has been cancelled
* Support different pipeline outputs for LTX 2 encode_video * Update examples to use improved encode_video function * Fix comment * Address review comments * make style and make quality * Have non-iterator video inputs respect video_chunks_number * make style and make quality * Add warning when encode_video receives a non-denormalized np.ndarray * make style and make quality --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> |
||
|
|
10dc589a94 |
[modular]simplify components manager doc (#13088)
Some checks failed
Build documentation / build (push) Has been cancelled
CodeQL Security Analysis For Github Actions / CodeQL Analysis (push) Has been cancelled
Run dependency tests / check_dependencies (push) Has been cancelled
Run Torch dependency tests / check_torch_dependencies (push) Has been cancelled
Fast GPU Tests on main / Setup Torch Pipelines CUDA Slow Tests Matrix (push) Has been cancelled
Fast GPU Tests on main / Torch Pipelines CUDA Tests (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (lora) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (models) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (others) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (schedulers) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (single_file) (push) Has been cancelled
Fast GPU Tests on main / PyTorch Compile CUDA tests (push) Has been cancelled
Fast GPU Tests on main / PyTorch xformers CUDA tests (push) Has been cancelled
Fast GPU Tests on main / Examples PyTorch CUDA tests on Ubuntu (push) Has been cancelled
Fast tests on main / Fast PyTorch CPU tests on Ubuntu (push) Has been cancelled
Fast tests on main / PyTorch Example CPU tests on Ubuntu (push) Has been cancelled
Secret Leaks / trufflehog (push) Has been cancelled
Update Diffusers metadata / update_metadata (push) Has been cancelled
Stale Bot / Close Stale Issues (push) Has been cancelled
* simplify components manager doc * Apply suggestion from @yiyixuxu * Apply suggestion from @stevhliu Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestion from @stevhliu Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-160-103.ec2.internal> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> |
||
|
|
99e2cfff27 |
Feature/zimage inpaint pipeline (#13006)
* Add ZImageInpaintPipeline
Updated the pipeline structure to include ZImageInpaintPipeline
alongside ZImagePipeline and ZImageImg2ImgPipeline.
Implemented the ZImageInpaintPipeline class for inpainting
tasks, including necessary methods for encoding prompts,
preparing masked latents, and denoising.
Enhanced the auto_pipeline to map the new ZImageInpaintPipeline
for inpainting generation tasks.
Added unit tests for ZImageInpaintPipeline to ensure
functionality and performance.
Updated dummy objects to include ZImageInpaintPipeline for
testing purposes.
* Add documentation and improve test stability for ZImageInpaintPipeline
- Add torch.empty fix for x_pad_token and cap_pad_token in test
- Add # Copied from annotations for encode_prompt methods
- Add documentation with usage example and autodoc directive
* Address PR review feedback for ZImageInpaintPipeline
Add batch size validation and callback handling fixes per review,
using diffusers conventions rather than suggested code verbatim.
* Update src/diffusers/pipelines/z_image/pipeline_z_image_inpaint.py
Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
* Update src/diffusers/pipelines/z_image/pipeline_z_image_inpaint.py
Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
* Add input validation and fix XLA support for ZImageInpaintPipeline
- Add missing is_torch_xla_available import for TPU support
- Add xm.mark_step() in denoising loop for proper XLA execution
- Add check_inputs() method for comprehensive input validation
- Call check_inputs() at the start of __call__
Addresses PR review feedback from @asomoza.
* Cleanup
---------
Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
|
||
|
|
90818e82b3 |
[docs] Fix syntax error in quantization configuration (#13076)
Fix syntax error in quantization configuration |
||
|
|
430c557b6a |
Add support for Magcache (#12744)
Some checks failed
Build documentation / build (push) Has been cancelled
CodeQL Security Analysis For Github Actions / CodeQL Analysis (push) Has been cancelled
Run dependency tests / check_dependencies (push) Has been cancelled
Run Torch dependency tests / check_torch_dependencies (push) Has been cancelled
Fast GPU Tests on main / Setup Torch Pipelines CUDA Slow Tests Matrix (push) Has been cancelled
Fast GPU Tests on main / Torch Pipelines CUDA Tests (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (lora) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (models) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (others) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (schedulers) (push) Has been cancelled
Fast GPU Tests on main / Torch CUDA Tests (single_file) (push) Has been cancelled
Fast GPU Tests on main / PyTorch Compile CUDA tests (push) Has been cancelled
Fast GPU Tests on main / PyTorch xformers CUDA tests (push) Has been cancelled
Fast GPU Tests on main / Examples PyTorch CUDA tests on Ubuntu (push) Has been cancelled
Fast tests on main / Fast PyTorch CPU tests on Ubuntu (push) Has been cancelled
Fast tests on main / PyTorch Example CPU tests on Ubuntu (push) Has been cancelled
Secret Leaks / trufflehog (push) Has been cancelled
Update Diffusers metadata / update_metadata (push) Has been cancelled
Stale Bot / Close Stale Issues (push) Has been cancelled
* add magcache * formatting * add magcache support with calibration mode * add imports * improvements * Apply style fixes * fix kandinsky errors * add tests and documentation * Apply style fixes * improvements * Apply style fixes * make fix-copies. * minor fixes --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> |