mirror of
https://github.com/JuliaLang/julia.git
synced 2026-05-28 03:10:33 +08:00
## Macro expansion performance issue Noted in #61425, but here's a recap: Currently, we represent multi-provenance (a tree originating from source text and >=1 macro expansions) by setting its nodes' `.source` attributes to a tuple of NodeIds. This is slow: when we macroexpand, we need to copy, unpack, and repack the tuple associated with every subtree coming from the expansion. It also duplicates information we rarely ever need, since this is only used for debuginfo production and certain pretty-printing. We're usually only interested in the first item in the tuple. ## Changes ### `.macro_source` This change represents the same information (tuple elements 2:end) with a single NodeId `.macro_source` on multiple nodes in a tree's provenance. This fixes the performance issue noted in the past, but also improves serializability of SyntaxTree if we end up putting them on disk. This representation is similar to lowered `DebugInfo`, where `st.source` is `di.linetable` and `st.macro_source` is one of `di.edges`. ### Provenance API improvements I needed to rewrite `flattened_provenance` and friends to use `.macro_source`, so I've also written down what these functions should do. Some reverse-engineering was needed, and some functions have changed their behaviour slightly, so @aviatesk let me know if there's some functionality that's missing. `flattened_provenance` should be the same. ### `copy_ast`, `copy_node`, etc. Cleaned up given that copying needs to have special behaviour with `.source` and `.macro_source`. `mktree` now copies a subtree for lowering purposes (extends provenance, recurses on children) and `copy_ast` (actually copies `.source` and `.macro_source`) is restricted to the foreign-graph case. ### Benchmarks Script and GC-time disclaimer is the same as in #61425. Improvement depends on how macro-heavy the code is. <details> <summary> Before </summary> ``` julia> include("jlbench.jl"); @allocated main(["./test/char.jl"]) File: ./test/char.jl Module: Main Number of top-level expressions: 40 Number of runs: 10 1...2...3...4...5...6...7...8...9...10... Best times (over 10 runs): expand_forms_1: 43.383 ms expand_forms_2: 30.849 ms resolve_scopes: 23.396 ms convert_closures: 29.841 ms linearize_ir: 34.429 ms to_lowered_expr: 46.547 ms Total: 208.445 ms Total(flisp): 248.6 ms Percentage breakdown: expand_forms_1: 20.8% expand_forms_2: 14.8% resolve_scopes: 11.2% convert_closures: 14.3% linearize_ir: 16.5% to_lowered_expr: 22.3% 1964449696 julia> include("jlbench.jl"); @allocated main(["./test/atomics.jl"]) File: ./test/atomics.jl Module: Main Number of top-level expressions: 392 Number of runs: 10 1...2...3...4...5...6...7...8...9...10... Best times (over 10 runs): expand_forms_1: 112.906 ms expand_forms_2: 90.099 ms resolve_scopes: 55.319 ms convert_closures: 18.995 ms linearize_ir: 83.808 ms to_lowered_expr: 83.458 ms Total: 444.585 ms Total(flisp): 378.801 ms Percentage breakdown: expand_forms_1: 25.4% expand_forms_2: 20.3% resolve_scopes: 12.4% convert_closures: 4.3% linearize_ir: 18.9% to_lowered_expr: 18.8% 3792501984 julia> include("jlbench.jl"); @allocated main(["./test/numbers.jl"]) File: ./test/numbers.jl Module: Main Number of top-level expressions: 416 Number of runs: 10 1...2...3...4...5...6...7...8...9...10... Best times (over 10 runs): expand_forms_1: 777.836 ms expand_forms_2: 558.305 ms resolve_scopes: 378.443 ms convert_closures: 211.51 ms linearize_ir: 592.592 ms to_lowered_expr: 736.205 ms Total: 3254.891 ms Total(flisp): 2588.61 ms Percentage breakdown: expand_forms_1: 23.9% expand_forms_2: 17.2% resolve_scopes: 11.6% convert_closures: 6.5% linearize_ir: 18.2% to_lowered_expr: 22.6% 20309895520 ``` </details> <details> <summary> After </summary> ``` julia> include("jlbench.jl"); @allocated main(["./test/char.jl"]) File: ./test/char.jl Module: Main Number of top-level expressions: 40 Number of runs: 10 1...2...3...4...5...6...7...8...9...10... Best times (over 10 runs): expand_forms_1: 28.689 ms expand_forms_2: 32.89 ms resolve_scopes: 22.794 ms convert_closures: 18.304 ms linearize_ir: 37.902 ms to_lowered_expr: 26.57 ms Total: 167.149 ms Total(flisp): 243.393 ms Percentage breakdown: expand_forms_1: 17.2% expand_forms_2: 19.7% resolve_scopes: 13.6% convert_closures: 11.0% linearize_ir: 22.7% to_lowered_expr: 15.9% 1696645648 julia> include("jlbench.jl"); @allocated main(["./test/atomics.jl"]) File: ./test/atomics.jl Module: Main Number of top-level expressions: 392 Number of runs: 10 1...2...3...4...5...6...7...8...9...10... Best times (over 10 runs): expand_forms_1: 73.671 ms expand_forms_2: 86.262 ms resolve_scopes: 52.325 ms convert_closures: 19.826 ms linearize_ir: 81.106 ms to_lowered_expr: 58.467 ms Total: 371.658 ms Total(flisp): 375.771 ms Percentage breakdown: expand_forms_1: 19.8% expand_forms_2: 23.2% resolve_scopes: 14.1% convert_closures: 5.3% linearize_ir: 21.8% to_lowered_expr: 15.7% 3688971328 julia> include("jlbench.jl"); @allocated main(["./test/numbers.jl"]) File: ./test/numbers.jl Module: Main Number of top-level expressions: 416 Number of runs: 10 1...2...3...4...5...6...7...8...9...10... Best times (over 10 runs): expand_forms_1: 585.484 ms expand_forms_2: 571.274 ms resolve_scopes: 364.441 ms convert_closures: 142.187 ms linearize_ir: 615.537 ms to_lowered_expr: 461.66 ms Total: 2740.583 ms Total(flisp): 2560.689 ms Percentage breakdown: expand_forms_1: 21.4% expand_forms_2: 20.8% resolve_scopes: 13.3% convert_closures: 5.2% linearize_ir: 22.5% to_lowered_expr: 16.8% 20738718400 ``` </details> If I really cheat (minimum time over 10 runs like above, `const DEBUG = false` in JuliaLowering), it is reliably faster than flisp now :). There's still much more work to do in the practical case; memory use in particular needs improvement.