[agents] add a section on tests in the ai skill and integration guides. (#13752)

* add a section on tests in the ai skill and integration guides.
* up
2026-05-28 00:39:35 +08:00 · 2026-05-15 07:51:58 +09:00
parent b9b7df3627
commit 62ec337e30
3 changed files with 44 additions and 7 deletions
--- a/.ai/skills/model-integration/SKILL.md
+++ b/.ai/skills/model-integration/SKILL.md
@@ -73,10 +73,37 @@ See [../../models.md](../../models.md) for the attention pattern, implementation

 **Don't combine structural changes with behavioral changes.** Restructuring code to fit diffusers APIs (ModelMixin, ConfigMixin, etc.) is unavoidable. But don't also "improve" the algorithm, refactor computation order, or rename internal variables for aesthetics. Keep numerical logic as close to the reference as possible, even if it looks unclean. For standard → modular, this is stricter: copy loop logic verbatim and only restructure into blocks. Clean up in a separate commit after parity is confirmed.

-### Test setup
+### Testing

- Slow tests gated with `@slow` and `RUN_SLOW=1`
- All model-level tests must use the `BaseModelTesterConfig`, `ModelTesterMixin`, `MemoryTesterMixin`, `AttentionTesterMixin`, `LoraTesterMixin`, and `TrainingTesterMixin` classes initially to write the tests. Any additional tests should be added after discussions with the maintainers. Use `tests/models/transformers/test_models_transformer_flux.py` as a reference.
+A new pipeline needs up to two test layers: pipeline-level tests (always) and model-level tests (only if a new model is introduced). Integration/slow tests and LoRA tests are **not** added in the initial PR; they come later, after discussion with maintainers.
+
+**General rules (apply to both layers):**
+- Keep component sizes tiny so the suite runs fast — small `num_layers`, small hidden/attention dims, low resolution, few frames. Reference `tests/pipelines/wan/test_wan.py` (`get_dummy_components` and `get_dummy_inputs`) for the size scale to target.
+- No LoRA tests in the initial PR (no `LoraTesterMixin`, no `tests/lora/test_lora_layers_<model>.py`).
+- No integration / slow tests in the initial PR — don't add anything gated on `@slow` / `RUN_SLOW=1` yet.
+
+#### Pipeline-level tests
+
+- Location: `tests/pipelines/<model>/test_<model>.py` (one file per pipeline variant, e.g. T2V, I2V).
+- Subclass both `PipelineTesterMixin` (from `..test_pipelines_common`) and `unittest.TestCase`.
+- Set `pipeline_class`, `params`, `batch_params`, `image_params` from `..pipeline_params`, and any `required_optional_params` / capability flags (`test_xformers_attention`, `supports_dduf`, etc.) that apply.
+- Implement `get_dummy_components()` (build every sub-module with a tiny config, calling `torch.manual_seed(0)` before constructing each one) and `get_dummy_inputs(device, seed=0)`.
+- Skip any inherited tests that don't apply with `@unittest.skip("Test not supported")` rather than deleting them.
+- Reference: `tests/pipelines/wan/test_wan.py`.
+
+#### Model-level tests
+
+Only required if the pipeline introduces a new model class (transformer, VAE, etc.). Don't write these by hand — generate them (example command below):
+
+```bash
+python utils/generate_model_tests.py src/diffusers/models/transformers/transformer_<model>.py
+```
+
+- Run with **no `--include` flags** initially. The generator auto-detects mixins/attributes and emits the always-on testers (`ModelTesterMixin`, `MemoryTesterMixin`, `TorchCompileTesterMixin`, plus `AttentionTesterMixin` / `ContextParallelTesterMixin` / `TrainingTesterMixin` as applicable). Optional testers (quantization, caching, single-file, IP adapter, etc.) are added later, after maintainer discussion.
+- The generator writes to `tests/models/transformers/test_models_transformer_<model>.py` (or the matching `unets/` / `autoencoders/` subdir).
+- Fill in the `TODO`s in the generated `<Model>TesterConfig`: `pretrained_model_name_or_path`, `get_init_dict()` (tiny config), `get_dummy_inputs()`, `input_shape`, `output_shape`. Keep init dims small for speed.
+- Do **not** add `LoraTesterMixin` at the start, even if the model subclasses `PeftAdapterMixin` — strip it from the generated file for the initial PR.
+- Reference: `tests/models/transformers/test_models_transformer_flux.py`.

 ---

--- a/.ai/skills/parity-testing/SKILL.md
+++ b/.ai/skills/parity-testing/SKILL.md
@@ -7,6 +7,8 @@ description: >
  visual artifacts — as these are usually parity bugs.
 ---

+> **Note**: Parity testing is **separate from** the unit-level tests that ship in `tests/`. If you are integrating a new model, the model-level test suite under `tests/models/` is still required. Follow the **Model-level tests** section in [`../model-integration/SKILL.md`](../model-integration/SKILL.md) (generate via `utils/generate_model_tests.py`, no `--include` flags initially, no `LoraTesterMixin`). Parity tests verify numerical correctness during development; the generated test suite is what CI runs.
+
 ## Setup — gather before starting

 Before writing any test code, gather:
--- a/docs/source/en/conceptual/contribution.md
+++ b/docs/source/en/conceptual/contribution.md
@@ -570,11 +570,19 @@ For documentation strings, 🧨 Diffusers follows the [Google style](https://goo

 ## Coding with AI agents

-The repository keeps AI-agent configuration in `.ai/` and exposes local agent files via symlinks.
+The repository keeps AI-agent configuration in [`.ai/`](https://github.com/huggingface/diffusers/tree/main/.ai) and exposes local agent files via symlinks. If you use a coding agent (Claude Code, OpenAI Codex, etc.) to help with a contribution, point it at this directory — it contains the project conventions and on-demand task knowledge maintainers expect contributors to follow.

- **Source of truth** — edit files under `.ai/` (`AGENTS.md` for coding guidelines, `skills/` for on-demand task knowledge)
- **Don't edit** generated root-level `AGENTS.md`, `CLAUDE.md`, or `.agents/skills`/`.claude/skills` — they are symlinks
- Setup commands:
+- **Read-only for contributors** — `.ai/` is maintained by the core maintainers. Please do not edit files under `.ai/` (or the generated root-level `AGENTS.md`, `CLAUDE.md`, `.agents/skills`, `.claude/skills`, which are symlinks) in your PR. If you find something missing or wrong, open an issue or flag it on the PR and a maintainer will update it.
+- **Guidelines** (loaded into every agent session):
+  - [`.ai/AGENTS.md`](https://github.com/huggingface/diffusers/blob/main/.ai/AGENTS.md) — top-level coding guidelines
+  - [`.ai/models.md`](https://github.com/huggingface/diffusers/blob/main/.ai/models.md) — attention pattern, model implementation rules, common conventions
+  - [`.ai/pipelines.md`](https://github.com/huggingface/diffusers/blob/main/.ai/pipelines.md) — pipeline conventions
+  - [`.ai/modular.md`](https://github.com/huggingface/diffusers/blob/main/.ai/modular.md) — modular pipeline conventions and conversion checklist
+  - [`.ai/review-rules.md`](https://github.com/huggingface/diffusers/blob/main/.ai/review-rules.md) — what reviewers look for
+- **Skills** (under [`.ai/skills/`](https://github.com/huggingface/diffusers/tree/main/.ai/skills), loaded on demand for specific tasks):
+  - `model-integration` — adding a new model or pipeline to diffusers end-to-end (file structure, integration checklist, testing layout, weight conversion)
+  - `parity-testing` — verifying numerical parity between the diffusers implementation and a reference implementation
+- **Setup commands**:
  - `make codex` — symlink guidelines + skills for OpenAI Codex
  - `make claude` — symlink guidelines + skills for Claude Code
  - `make clean-ai` — remove all generated symlinks
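The pipeline-level testing rules added to `.ai/skills/model-integration/SKILL.md` in this commit ask contributors to skip inherited tests that don't apply with `@unittest.skip("Test not supported")` rather than delete them. A minimal sketch of that pattern, using only the standard library, might look like the following. `FakePipelineTesterMixin`, `MyPipelineFastTests`, `run_pipeline`, `supports_xformers`, and `run_suite` are hypothetical stand-ins for illustration; they are not the real `PipelineTesterMixin` or any diffusers API.

```python
import unittest


class FakePipelineTesterMixin:
    """Hypothetical stand-in for a shared tester mixin (not the diffusers class)."""

    def test_inference_runs(self):
        # A check that applies to every pipeline using this mixin.
        self.assertEqual(self.run_pipeline(), "ok")

    def test_xformers_attention(self):
        # A check that only makes sense for pipelines with this capability.
        self.assertTrue(self.supports_xformers)


class MyPipelineFastTests(FakePipelineTesterMixin, unittest.TestCase):
    """Hypothetical pipeline test class: mixin first, then unittest.TestCase."""

    supports_xformers = False

    def run_pipeline(self):
        # Placeholder for building tiny components and running the pipeline.
        return "ok"

    # Override the inherited test and skip it instead of removing it,
    # so the unsupported case stays visible in test reports.
    @unittest.skip("Test not supported")
    def test_xformers_attention(self):
        pass


def run_suite():
    """Run the hypothetical test class and return the unittest.TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MyPipelineFastTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Calling `run_suite()` reports the overridden test as skipped with the reason string instead of as a failure, which keeps a record that the inherited check was considered and deliberately disabled.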