diff --git "a/layout_opt/log-neuron-cc.txt" "b/layout_opt/log-neuron-cc.txt" new file mode 100644--- /dev/null +++ "b/layout_opt/log-neuron-cc.txt" @@ -0,0 +1,1958 @@ +2025-09-05T19:18:35Z INFO 1939 [root]: /opt/conda/bin/neuronx-cc compile /models/mistral-7b-v0.3-instruct-neuronx/layout_opt/model/graph.hlo --framework XLA --target trn1 --output /models/mistral-7b-v0.3-instruct-neuronx/layout_opt/graph.neff --model-type=transformer -O1 --lnc=1 '--internal-hlo2tensorizer-options=--experimental-unsafe-fp8e4m3fn-as-fp8e4m3 --verify-hlo=false' --logfile=/models/mistral-7b-v0.3-instruct-neuronx/layout_opt/log-neuron-cc.txt --verbose=35 +2025-09-05T19:18:41Z INFO 1939 [root]: NeuronX Compiler version 2.20.9961.0+0acef03a Python version 3.10.17 HWM version 2.20.0.9961+0acef03a NumPy version 1.26.4 +2025-09-05T19:18:41Z INFO 1971 [root]: XLA detected +2025-09-05T19:18:41Z INFO 1971 [root]: Pipeline: HLOToTensorizer Frontend StaticIOTranspose WalrusDriver BIRLinker Kelper NeffWrapper +2025-09-05T19:18:41Z INFO 1971 [root]: Intermediate files stored in /opt/vllm/neuronxcc-gss0wi_w, output in /opt/vllm +2025-09-05T19:18:41Z INFO 1971 [pipeline.Pipeline.0]: Job Pipeline len(in_states) 1 +2025-09-05T19:18:41Z INFO 1971 [pipeline.Pipeline.0]: Processing input #0 +2025-09-05T19:18:41Z INFO 1971 [pipeline.Pipeline.0]: Running pipeline Pipeline.0 +2025-09-05T19:18:41Z INFO 1971 [pipeline.Pipeline.0]: Starting job job.HLOToTensorizer.0 +2025-09-05T19:18:41Z INFO 1971 [job.HLOToTensorizer.0]: Job HLOToTensorizer len(in_states) 1 +2025-09-05T19:18:41Z INFO 1971 [job.HLOToTensorizer.0]: Processing input #0 +2025-09-05T19:18:41Z INFO 1971 [job.HLOToTensorizer.0]: IR signature: b7a76fc5f3f76d1d69d57e0e784721bafd07e3a61734f6594e8c815123a8a771 for graph.hlo +2025-09-05T19:18:41Z INFO 1971 [job.HLOToTensorizer.0]: Executing: /opt/conda/lib/python3.10/site-packages/neuronxcc/starfish/bin/hlo2penguin --input /models/mistral-7b-v0.3-instruct-neuronx/layout_opt/model/graph.hlo --out-dir ./ --output penguin.py --remat --max-costly-ops=2 --max-live-in-size=5 --max-remat-chain-size=10 --max-mem-multiple=1.8 --min-def-use-distance=500 --remat-policy=transformer --allow-same-pass-remat=true --layers-per-module=1 --partition --emit-tensor-level-dropout-ops --experimental-unsafe-fp8e4m3fn-as-fp8e4m3 --verify-hlo=false --native-to-custom-softmax --partitioner-opts='--transformer' +2025-09-05T19:18:41Z INFO 1971 [job.HLOToTensorizer.0]: DEBUG: needsModular? No. macCnt 0 num non-trivial Ops 290 +INFO: Switching to single-module compile. PrePartitionPipe skipped. +INFO: Found memory bound graph +INFO: Number of Native SoftmaxDx's detected and replaced: 0 +INFO: Number of Native Softmax's detected and replaced: 0 +Replaced 0 dropout sequences with OffloadedDropout +INFO: HloMacCount has found 0 +INFO: Traffic has found 7248289792 +INFO: AIF 0 +HLO Ops used in computation: parameter reshape transpose tuple +Warning: Could not open file debug_info_hlo_partitions.json +2025-09-05 19:18:41.368875: W hilo/hlo2penguin/utils/DumpDebugInfo.cc:52] Truncating long HLO operator name %last = tuple(%p68, %transpose.290, %transpose.291, %transpose.292, %transpose.293, %transpose.294, %transpose.295, %transpose.296, %transpose.297, %transpose.298, %transpose.299, %transpose.300, %transpose.301, %transpose.302, %transpose.303, %transpose.304, %transpose.305, %transpose.306, %transpose.307, %transpose.308, %transpose.309, %transpose.310, %transpose.311, %transpose.312, %transpose.313, %transpose.314, %transpose.315, %transpose.316, %transpose.317, %transpose.318, %transpose.319, %transpos... to 512 characters in the compiler's debug metadata +Invoking RemoveOptimizationBarriers pass + +2025-09-05T19:18:41Z INFO 1971 [job.HLOToTensorizer.0]: IR signature: 2e0dd598350cceee9da216d825f016375db09d3b92ca63d286d74eb4016c4d07 for sg0000/HLOToTensorizer +2025-09-05T19:18:41Z INFO 1971 [job.HLOToTensorizer.0]: Job #0 finished +2025-09-05T19:18:41Z INFO 1971 [pipeline.Pipeline.0]: Finished job job.HLOToTensorizer.0 +2025-09-05T19:18:41Z INFO 1971 [pipeline.Pipeline.0]: Starting job job.Frontend.0 +2025-09-05T19:18:41Z INFO 1971 [job.Frontend.0]: Job Frontend len(in_states) 1 +2025-09-05T19:18:41Z INFO 1971 [job.Frontend.0]: Processing input #0 +2025-09-05T19:18:41Z INFO 1971 [job.Frontend.0]: Start model loading +2025-09-05T19:18:41Z INFO 1971 [job.Frontend.0]: Start tensorization +2025-09-05T19:18:41Z INFO 1971 [job.Frontend.0]: Num jobs: 32 +2025-09-05T19:18:41Z USER 1971 [root/Tensorizer/Tensorizer]: Running Tensorizer +2025-09-05T19:18:41Z INFO 1971 [Tensorizer]: Frontend did not find netlist info. Switching to flat flow. +2025-09-05T19:18:41Z INFO 1971 [Tensorizer]: Building model from Penguin script "penguin.py"... +2025-09-05T19:18:41Z INFO 1971 [Tensorizer]: Tensorizer options: --run-pg-layout-and-tiling --enable-dse-after-mask-propagation --disable-concat-delinearizer --num-neuroncores-per-sengine=1 --num-neuroncores-per-sengine=1 --internal_dynamic_dma_scratch_size_per_partition=16384 --disable-bitcasted-transpose --dont-verify-after-all --fp32-cast=matmult-bf16 --mm-transpose-type=fp32 --disable-expensive-checks --disable-max-stride-tiling --hbm-scratchpad-page-size-in-bytes=536870912 --enable-replication --max-local-tensor-tile-size-in-bytes=32768 --tensor-layout-p-order=0 --tensor-layout-b-order=1 --enable-advanced-delinearization --weight-coalescing-threshold=512 --enable-bir-converter=enable --enable-tritium-loopfusion --enable-softmax-kernel --model-type-transformer --enable-isl-in-injective-check --enable-dge-on-io-dma --enable-dge-on-indirect-dma --enable-dge-on-vector-indirect-dma --keep-rng-tensor-op +2025-09-05T19:18:41Z INFO 1971 [sg0000/Tensorizer/DoNothing]: Running DoNothing +2025-09-05T19:18:41Z INFO 1971 [sg0000/Tensorizer/DoNothing]: Finished (changed=True) +2025-09-05T19:18:41Z INFO 1971 [sg0000/Tensorizer/DoNothing]: DoNothing finished after 0.000 seconds +2025-09-05T19:18:41Z INFO 1971 [sg0000/Tensorizer/LegalizeOpLevelAlias]: Running LegalizeOpLevelAlias +2025-09-05T19:18:41Z INFO 1971 [sg0000/Tensorizer/LegalizeOpLevelAlias]: Finished (changed=False) +2025-09-05T19:18:41Z INFO 1971 [sg0000/Tensorizer/LegalizeOpLevelAlias]: LegalizeOpLevelAlias finished after 0.003 seconds +2025-09-05T19:18:41Z INFO 1971 [sg0000/Tensorizer/OptimizeAliasedCopyChain]: Running OptimizeAliasedCopyChain +2025-09-05T19:18:41Z INFO 1971 [sg0000/Tensorizer/OptimizeAliasedCopyChain]: Finished (changed=False) +2025-09-05T19:18:41Z INFO 1971 [sg0000/Tensorizer/OptimizeAliasedCopyChain]: OptimizeAliasedCopyChain finished after 0.004 seconds +2025-09-05T19:18:41Z INFO 1971 [sg0000/Tensorizer/AliasDependencyInduction]: Running AliasDependencyInduction +2025-09-05T19:18:41Z INFO 1971 [sg0000/Tensorizer/AliasDependencyInduction]: Finished (changed=True) +2025-09-05T19:18:41Z INFO 1971 [sg0000/Tensorizer/AliasDependencyInduction]: AliasDependencyInduction finished after 0.032 seconds +2025-09-05T19:18:41Z INFO 1971 [sg0000/Tensorizer/TransformConvOp]: Running TransformConvOp +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TransformConvOp]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TransformConvOp]: TransformConvOp finished after 0.011 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LowerTensorOp]: Running LowerTensorOp +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LowerTensorOp]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LowerTensorOp]: LowerTensorOp finished after 0.004 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyReset]: Running AliasDependencyReset +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyElimination]: Running AliasDependencyElimination +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyElimination]: Finished (changed=True) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyElimination]: AliasDependencyElimination finished after 0.003 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyInduction]: Running AliasDependencyInduction +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyInduction]: Finished (changed=True) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyInduction]: AliasDependencyInduction finished after 0.033 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyReset]: AliasDependencyReset finished after 0.047 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TensorOpSimplifier]: Running TensorOpSimplifier +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TensorOpSimplifier]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TensorOpSimplifier]: TensorOpSimplifier finished after 0.014 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/CanonicalizeIR]: Running CanonicalizeIR +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/CanonicalizeIR]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/CanonicalizeIR]: CanonicalizeIR finished after 0.003 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LegalizeCCOpLayout]: Running LegalizeCCOpLayout +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LegalizeCCOpLayout]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LegalizeCCOpLayout]: LegalizeCCOpLayout finished after 0.004 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ResolveComplicatePredicates]: Running ResolveComplicatePredicates +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ResolveComplicatePredicates]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ResolveComplicatePredicates]: ResolveComplicatePredicates finished after 0.003 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AffinePredicateResolution]: Running AffinePredicateResolution +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AffinePredicateResolution]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AffinePredicateResolution]: AffinePredicateResolution finished after 0.003 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/EliminateDivs]: Running EliminateDivs +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/EliminateDivs]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/EliminateDivs]: EliminateDivs finished after 0.003 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/PerfectLoopNest]: Running PerfectLoopNest +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/PerfectLoopNest]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/PerfectLoopNest]: PerfectLoopNest finished after 0.003 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Running Simplifier +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Finished (changed=True) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Simplifier finished after 0.068 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/GenericAccessSimplifier]: Running GenericAccessSimplifier +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/GenericAccessSimplifier]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/GenericAccessSimplifier]: GenericAccessSimplifier finished after 0.003 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TCTransform]: Running TCTransform +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TCTransform]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TCTransform]: TCTransform finished after 0.003 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/CommuteConcat]: Running CommuteConcat +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/CommuteConcat]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/CommuteConcat]: CommuteConcat finished after 0.003 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ExpandBatchNorm]: Running ExpandBatchNorm +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ExpandBatchNorm]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ExpandBatchNorm]: ExpandBatchNorm finished after 0.006 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TCTransform]: Running TCTransform +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TCTransform]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TCTransform]: TCTransform finished after 0.003 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/EliminateDivs]: Running EliminateDivs +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/EliminateDivs]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/EliminateDivs]: EliminateDivs finished after 0.003 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/GenericAccessSimplifier]: Running GenericAccessSimplifier +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/GenericAccessSimplifier]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/GenericAccessSimplifier]: GenericAccessSimplifier finished after 0.003 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TensorOpTransform]: Running TensorOpTransform +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TensorOpTransform]: Finished (changed=True) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TensorOpTransform]: TensorOpTransform finished after 0.060 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LateLowerTensorOp]: Running LateLowerTensorOp +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LateLowerTensorOp]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LateLowerTensorOp]: LateLowerTensorOp finished after 0.005 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyReset]: Running AliasDependencyReset +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyElimination]: Running AliasDependencyElimination +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyElimination]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyElimination]: AliasDependencyElimination finished after 0.000 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyInduction]: Running AliasDependencyInduction +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyInduction]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyInduction]: AliasDependencyInduction finished after 0.005 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyReset]: AliasDependencyReset finished after 0.016 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MemcpyElimination]: Running MemcpyElimination +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MemcpyElimination]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MemcpyElimination]: MemcpyElimination finished after 0.001 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LoopFusion]: Running LoopFusion +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LoopFusion]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LoopFusion]: LoopFusion finished after 0.003 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Rematerialization]: Running Rematerialization +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Rematerialization]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Rematerialization]: Rematerialization finished after 0.002 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Running Simplifier +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Simplifier finished after 0.001 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Running Delinearization +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Delinearization finished after 0.002 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DeadStoreElimination]: Running DeadStoreElimination +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DeadStoreElimination]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DeadStoreElimination]: DeadStoreElimination finished after 0.002 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Running Simplifier +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Simplifier finished after 0.001 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: Running LICM +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: LICM finished after 0.001 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Running Delinearization +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Delinearization finished after 0.002 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LoopFusion]: Running LoopFusion +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LoopFusion]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LoopFusion]: LoopFusion finished after 0.002 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/SimplifySlice]: Running SimplifySlice +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/SimplifySlice]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/SimplifySlice]: SimplifySlice finished after 0.001 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: Running LICM +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: LICM finished after 0.001 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Running Simplifier +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Simplifier finished after 0.001 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ValueNumbering]: Running ValueNumbering +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ValueNumbering]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ValueNumbering]: ValueNumbering finished after 0.001 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: Running LICM +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: LICM finished after 0.001 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/PadElimination]: Running PadElimination +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/PadElimination]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/PadElimination]: PadElimination finished after 0.001 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Running Delinearization +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Delinearization finished after 0.002 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LoopFusion]: Running LoopFusion +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LoopFusion]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LoopFusion]: LoopFusion finished after 0.003 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/GenericAccessSimplifier]: Running GenericAccessSimplifier +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/GenericAccessSimplifier]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/GenericAccessSimplifier]: GenericAccessSimplifier finished after 0.001 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Running Simplifier +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Simplifier finished after 0.001 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: Running LICM +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: LICM finished after 0.001 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ValueNumbering]: Running ValueNumbering +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ValueNumbering]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ValueNumbering]: ValueNumbering finished after 0.001 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TCTransform]: Running TCTransform +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TCTransform]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TCTransform]: TCTransform finished after 0.001 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/CommuteConcat]: Running CommuteConcat +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/CommuteConcat]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/CommuteConcat]: CommuteConcat finished after 0.001 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/RecognizeOpIdiom]: Running RecognizeOpIdiom +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/RecognizeOpIdiom]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/RecognizeOpIdiom]: RecognizeOpIdiom finished after 0.001 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MaskPropagation]: Running MaskPropagation +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MaskPropagation]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MaskPropagation]: MaskPropagation finished after 0.002 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DeadStoreElimination]: Running DeadStoreElimination +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DeadStoreElimination]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DeadStoreElimination]: DeadStoreElimination finished after 0.002 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Recompute]: Running Recompute +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Recompute]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Recompute]: Recompute finished after 0.000 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DeadCodeElimination]: Running DeadCodeElimination +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DeadCodeElimination]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DeadCodeElimination]: DeadCodeElimination finished after 0.001 seconds +2025-09-05T19:18:42Z INFO 1971 [Tensorizer]: After optimization: 290 statements +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DoNothing]: Running DoNothing +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DoNothing]: Finished (changed=True) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DoNothing]: DoNothing finished after 0.000 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MutateDataType]: Running MutateDataType +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MutateDataType]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MutateDataType]: MutateDataType finished after 0.003 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AutoCastTCInputs]: Running AutoCastTCInputs +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AutoCastTCInputs]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AutoCastTCInputs]: AutoCastTCInputs finished after 0.002 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/GenericAccessSimplifier]: Running GenericAccessSimplifier +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/GenericAccessSimplifier]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/GenericAccessSimplifier]: GenericAccessSimplifier finished after 0.001 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Running Simplifier +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Simplifier finished after 0.001 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DelinearIndices]: Running DelinearIndices +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DelinearIndices]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DelinearIndices]: DelinearIndices finished after 0.002 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Running Delinearization +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Delinearization finished after 0.002 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DelinearIndices]: Running DelinearIndices +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DelinearIndices]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DelinearIndices]: DelinearIndices finished after 0.002 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DeadCodeElimination]: Running DeadCodeElimination +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DeadCodeElimination]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DeadCodeElimination]: DeadCodeElimination finished after 0.001 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LateLowerReshapeOp]: Running LateLowerReshapeOp +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LateLowerReshapeOp]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LateLowerReshapeOp]: LateLowerReshapeOp finished after 0.003 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InferIntrinsicOnCC]: Running InferIntrinsicOnCC +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InferIntrinsicOnCC]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InferIntrinsicOnCC]: InferIntrinsicOnCC finished after 0.007 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ResolveAccessConflict]: Running ResolveAccessConflict +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ResolveAccessConflict]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ResolveAccessConflict]: ResolveAccessConflict finished after 0.004 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: Running LICM +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: LICM finished after 0.001 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LocalLayoutOpt]: Running LocalLayoutOpt +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LocalLayoutOpt]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LocalLayoutOpt]: LocalLayoutOpt finished after 0.038 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DelinearIndices]: Running DelinearIndices +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DelinearIndices]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DelinearIndices]: DelinearIndices finished after 0.002 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/PGLayoutTilingPipeline]: Running PGLayoutTilingPipeline +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LayoutPreprocessingAndAnalysis]: Running LayoutPreprocessingAndAnalysis +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LayoutPreprocessing]: Running LayoutPreprocessing +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Running Delinearization +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Delinearization finished after 0.002 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LayoutPreprocessing]: Finished (changed=True) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LayoutPreprocessing]: LayoutPreprocessing finished after 0.019 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LayoutRequirementAnalysis]: Running LayoutRequirementAnalysis +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LayoutRequirementAnalysis]: LayoutRequirementAnalysis finished after 0.005 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LayoutPreprocessingAndAnalysis]: LayoutPreprocessingAndAnalysis finished after 0.035 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InferNonlocalTensors]: Running InferNonlocalTensors +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InferNonlocalTensors]: prefer_non_broadcast_par: True +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InferNonlocalTensors]: prefer_non_broadcast_par: True +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InferNonlocalTensors]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InferNonlocalTensors]: InferNonlocalTensors finished after 0.017 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/PAGLayoutOpt]: Running PAGLayoutOpt +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ParAxesAnnotation]: Running ParAxesAnnotation +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LayoutSearchAlgorithm]: prefer_non_broadcast_par: True +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ParAxesAnnotation]: Finished (changed=True) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ParAxesAnnotation]: ParAxesAnnotation finished after 0.016 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InsertLocalTransposes]: Running InsertLocalTransposes +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InsertLocalTransposes]: Finished (changed=True) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InsertLocalTransposes]: InsertLocalTransposes finished after 0.004 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/PAGLayoutOpt]: PAGLayoutOpt finished after 0.031 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MaskPropagation]: Running MaskPropagation +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MaskPropagation]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MaskPropagation]: MaskPropagation finished after 0.002 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/CanonicalizeDAGForPGTiling]: Running CanonicalizeDAGForPGTiling +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/CanonicalizeDAGForPGTiling]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/CanonicalizeDAGForPGTiling]: CanonicalizeDAGForPGTiling finished after 0.002 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/PGTiling]: Running PGTiling +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AGOrderingAnalysisPass]: Running AGOrderingAnalysisPass +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AGOrderingAnalysisPass]: AGOrderingAnalysisPass finished after 0.048 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/StaticTransposeLocalTensor]: Running StaticTransposeLocalTensor +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/StaticTransposeLocalTensor]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/StaticTransposeLocalTensor]: StaticTransposeLocalTensor finished after 0.003 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/PComputeCutting]: Running PComputeCutting +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/PComputeCutting]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/PComputeCutting]: PComputeCutting finished after 0.007 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/BFComputeCutting]: Running BFComputeCutting +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/BFComputeCutting]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/BFComputeCutting]: BFComputeCutting finished after 0.003 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LoopSplitting]: Running LoopSplitting +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LoopSplitting]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LoopSplitting]: LoopSplitting finished after 0.001 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MacroGeneration]: Running MacroGeneration +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MacroGeneration]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MacroGeneration]: MacroGeneration finished after 0.019 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/PGTiling]: PGTiling finished after 0.114 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InsertIOTransposes]: Running InsertIOTransposes +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InsertIOTransposes]: Finished (changed=True) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InsertIOTransposes]: InsertIOTransposes finished after 0.002 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InsertOffloadedTransposes]: Running InsertOffloadedTransposes +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InsertOffloadedTransposes]: Finished (changed=False) +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InsertOffloadedTransposes]: InsertOffloadedTransposes finished after 0.001 seconds +2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DramToDramTranspose]: Running DramToDramTranspose +2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/DramToDramTranspose]: Finished (changed=True) +2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/DramToDramTranspose]: DramToDramTranspose finished after 14.818 seconds +2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/PGLayoutTilingPipeline]: PGLayoutTilingPipeline finished after 15.072 seconds +2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingProfiler]: Running TilingProfiler +2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: +20 MACROS WITH LARGEST INSTRUCTION COUNTS: +2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 4096: transpose_128x128 +2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingProfiler]: Finished (changed=False) +2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingProfiler]: TilingProfiler finished after 0.174 seconds +2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/FlattenMacroLoop]: Running FlattenMacroLoop +2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/FlattenMacroLoop]: Finished (changed=True) +2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/FlattenMacroLoop]: FlattenMacroLoop finished after 0.101 seconds +2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/InferNeuronTensor]: Running InferNeuronTensor +2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/InferNeuronTensor]: Finished (changed=True) +2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/InferNeuronTensor]: InferNeuronTensor finished after 0.654 seconds +2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifier]: Running NeuronSimplifier +2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifier]: Finished (changed=False) +2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifier]: NeuronSimplifier finished after 0.101 seconds +2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/LICM]: Running LICM +2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/LICM]: Finished (changed=False) +2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/LICM]: LICM finished after 0.030 seconds +2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/RewriteReplicationMatmul]: Running RewriteReplicationMatmul +2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/RewriteReplicationMatmul]: Finished (changed=False) +2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/RewriteReplicationMatmul]: RewriteReplicationMatmul finished after 0.023 seconds +2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/FlattenMacroLoop]: Running FlattenMacroLoop +2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/FlattenMacroLoop]: Finished (changed=False) +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/FlattenMacroLoop]: FlattenMacroLoop finished after 0.073 seconds +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/SimplifyMacroPredicates]: Running SimplifyMacroPredicates +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/SimplifyMacroPredicates]: Finished (changed=False) +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/SimplifyMacroPredicates]: SimplifyMacroPredicates finished after 0.084 seconds +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/DataLocalityOpt]: Running DataLocalityOpt +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/DataLocalityOpt]: Finished (changed=True) +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/DataLocalityOpt]: DataLocalityOpt finished after 0.148 seconds +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/DMATilingProfiler]: Running DMATilingProfiler +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: +20 MACROS WITH LARGEST INSTRUCTION COUNTS: +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 4096: transpose_128x128 +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/DMATilingProfiler]: Finished (changed=False) +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/DMATilingProfiler]: DMATilingProfiler finished after 0.029 seconds +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifier]: Running NeuronSimplifier +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifier]: Finished (changed=False) +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifier]: NeuronSimplifier finished after 0.107 seconds +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/LegalizeSundaMacro]: Running LegalizeSundaMacro +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/LegalizeSundaMacro]: Finished (changed=False) +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/LegalizeSundaMacro]: LegalizeSundaMacro finished after 0.053 seconds +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifier]: Running NeuronSimplifier +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifier]: Finished (changed=False) +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifier]: NeuronSimplifier finished after 0.106 seconds +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PerfectLoopNest]: Running PerfectLoopNest +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PerfectLoopNest]: Finished (changed=False) +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PerfectLoopNest]: PerfectLoopNest finished after 0.022 seconds +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/FlattenMacroLoop]: Running FlattenMacroLoop +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/FlattenMacroLoop]: Finished (changed=True) +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/FlattenMacroLoop]: FlattenMacroLoop finished after 0.080 seconds +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/RewriteWeights]: Running RewriteWeights +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/RewriteWeights]: Finished (changed=False) +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/RewriteWeights]: RewriteWeights finished after 0.018 seconds +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/ReshapeWeights]: Running ReshapeWeights +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/ReshapeWeights]: Finished (changed=False) +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/ReshapeWeights]: ReshapeWeights finished after 0.006 seconds +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/FlattenMacroLoop]: Running FlattenMacroLoop +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/FlattenMacroLoop]: Finished (changed=False) +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/FlattenMacroLoop]: FlattenMacroLoop finished after 0.066 seconds +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/SimplifyMacroPredicates]: Running SimplifyMacroPredicates +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/SimplifyMacroPredicates]: Finished (changed=False) +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/SimplifyMacroPredicates]: SimplifyMacroPredicates finished after 0.087 seconds +2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/InferInitValue]: Running InferInitValue +2025-09-05T19:19:00Z INFO 1971 [sg0000/Tensorizer/InferInitValue]: Finished (changed=True) +2025-09-05T19:19:00Z INFO 1971 [sg0000/Tensorizer/InferInitValue]: InferInitValue finished after 0.437 seconds +2025-09-05T19:19:00Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifier]: Running NeuronSimplifier +2025-09-05T19:19:00Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifier]: Finished (changed=False) +2025-09-05T19:19:00Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifier]: NeuronSimplifier finished after 0.109 seconds +2025-09-05T19:19:00Z INFO 1971 [sg0000/Tensorizer/SimplifyTensor]: Running SimplifyTensor +2025-09-05T19:19:00Z INFO 1971 [sg0000/Tensorizer/SimplifyTensor]: Finished (changed=False) +2025-09-05T19:19:00Z INFO 1971 [sg0000/Tensorizer/SimplifyTensor]: SimplifyTensor finished after 0.068 seconds +2025-09-05T19:19:00Z INFO 1971 [sg0000/Tensorizer/LICM]: Running LICM +2025-09-05T19:19:00Z INFO 1971 [sg0000/Tensorizer/LICM]: Finished (changed=False) +2025-09-05T19:19:00Z INFO 1971 [sg0000/Tensorizer/LICM]: LICM finished after 0.031 seconds +2025-09-05T19:19:00Z INFO 1971 [sg0000/Tensorizer/SundaISel]: Running SundaISel +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/SundaISel]: Finished (changed=True) +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/SundaISel]: SundaISel finished after 0.472 seconds +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronAliasDependencyReset]: Running NeuronAliasDependencyReset +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/AliasDependencyElimination]: Running AliasDependencyElimination +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/AliasDependencyElimination]: Finished (changed=False) +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/AliasDependencyElimination]: AliasDependencyElimination finished after 0.000 seconds +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronAliasDependencyInduction]: Running NeuronAliasDependencyInduction +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronAliasDependencyInduction]: Finished (changed=True) +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronAliasDependencyInduction]: NeuronAliasDependencyInduction finished after 0.039 seconds +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronAliasDependencyReset]: NeuronAliasDependencyReset finished after 0.051 seconds +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/LowerComplexBroadcast]: Running LowerComplexBroadcast +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/LowerComplexBroadcast]: Finished (changed=False) +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/LowerComplexBroadcast]: LowerComplexBroadcast finished after 0.022 seconds +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronLoopInterchange]: Running NeuronLoopInterchange +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronLoopInterchange]: Finished (changed=False) +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronLoopInterchange]: NeuronLoopInterchange finished after 0.018 seconds +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifyPredicates]: Running NeuronSimplifyPredicates +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifyPredicates]: Finished (changed=False) +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifyPredicates]: NeuronSimplifyPredicates finished after 0.014 seconds +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronLoopFusion]: Running NeuronLoopFusion +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronLoopFusion]: Finished (changed=True) +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronLoopFusion]: NeuronLoopFusion finished after 0.076 seconds +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronLoopInterchange]: Running NeuronLoopInterchange +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronLoopInterchange]: Finished (changed=False) +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronLoopInterchange]: NeuronLoopInterchange finished after 0.018 seconds +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronLICM]: Running NeuronLICM +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronLICM]: Finished (changed=False) +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronLICM]: NeuronLICM finished after 0.070 seconds +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/FactorizeBlkDims]: Running FactorizeBlkDims +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/FactorizeBlkDims]: Finished (changed=False) +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/FactorizeBlkDims]: FactorizeBlkDims finished after 0.092 seconds +2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronInstComb]: Running NeuronInstComb +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/NeuronInstComb]: Finished (changed=True) +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/NeuronInstComb]: NeuronInstComb finished after 1.250 seconds +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/NeuronValueNumbering]: Running NeuronValueNumbering +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/NeuronValueNumbering]: Finished (changed=False) +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/NeuronValueNumbering]: NeuronValueNumbering finished after 0.035 seconds +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/NeuronInstComb]: Running NeuronInstComb +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/NeuronInstComb]: Finished (changed=False) +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/NeuronInstComb]: NeuronInstComb finished after 0.016 seconds +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/VectorizeDMA]: Running VectorizeDMA +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/VectorizeDMA]: Finished (changed=False) +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/VectorizeDMA]: VectorizeDMA finished after 0.023 seconds +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifyPredicates]: Running NeuronSimplifyPredicates +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifyPredicates]: Finished (changed=False) +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifyPredicates]: NeuronSimplifyPredicates finished after 0.009 seconds +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/LegalizePartitionReduce]: Running LegalizePartitionReduce +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/LegalizePartitionReduce]: Finished (changed=False) +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/LegalizePartitionReduce]: LegalizePartitionReduce finished after 0.008 seconds +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/DeConcat]: Running DeConcat +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/DeConcat]: Finished (changed=False) +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/DeConcat]: DeConcat finished after 0.001 seconds +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/FactorizeThreadAxesInFreeDims]: Running FactorizeThreadAxesInFreeDims +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/FactorizeThreadAxesInFreeDims]: Finished (changed=False) +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/FactorizeThreadAxesInFreeDims]: FactorizeThreadAxesInFreeDims finished after 0.017 seconds +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/PartialSimdFusion]: Running PartialSimdFusion +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/PartialSimdFusion]: Finished (changed=False) +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/PartialSimdFusion]: PartialSimdFusion finished after 0.007 seconds +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/TritiumFusion]: Running TritiumFusion +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/TritiumFusion]: Finished (changed=False) +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/TritiumFusion]: TritiumFusion finished after 0.008 seconds +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/CCOpFusion]: Running CCOpFusion +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/CCOpFusion]: Finished (changed=False) +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/CCOpFusion]: CCOpFusion finished after 0.069 seconds +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/VectorizeMatMult]: Running VectorizeMatMult +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/VectorizeMatMult]: Finished (changed=False) +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/VectorizeMatMult]: VectorizeMatMult finished after 0.004 seconds +2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/PartialLoopFusion]: Running PartialLoopFusion +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/PartialLoopFusion]: Finished (changed=False) +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/PartialLoopFusion]: PartialLoopFusion finished after 0.122 seconds +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/NeuronLICM]: Running NeuronLICM +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/NeuronLICM]: Finished (changed=False) +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/NeuronLICM]: NeuronLICM finished after 0.038 seconds +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LowerTranspose]: Running LowerTranspose +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LowerTranspose]: Finished (changed=True) +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LowerTranspose]: LowerTranspose finished after 0.415 seconds +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LowerBroadcast]: Running LowerBroadcast +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LowerBroadcast]: Finished (changed=False) +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LowerBroadcast]: LowerBroadcast finished after 0.016 seconds +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LateNeuronInstComb]: Running LateNeuronInstComb +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LateNeuronInstComb]: Finished (changed=True) +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LateNeuronInstComb]: LateNeuronInstComb finished after 0.107 seconds +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/SplitAccGrp]: Running SplitAccGrp +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/SplitAccGrp]: Finished (changed=False) +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/SplitAccGrp]: SplitAccGrp finished after 0.013 seconds +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/SpillPSum]: Running SpillPSum +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/SpillPSum]: Finished (changed=False) +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/SpillPSum]: SpillPSum finished after 0.120 seconds +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LowerIntrinsics]: Running LowerIntrinsics +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LowerIntrinsics]: Finished (changed=False) +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LowerIntrinsics]: LowerIntrinsics finished after 0.015 seconds +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/InlineNativeKernels]: Running InlineNativeKernels +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/InlineNativeKernels]: Finished (changed=False) +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/InlineNativeKernels]: InlineNativeKernels finished after 0.012 seconds +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LegalizeType]: Running LegalizeType +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LegalizeType]: Finished (changed=True) +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LegalizeType]: LegalizeType finished after 0.089 seconds +2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/NeuronLICM]: Running NeuronLICM +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/NeuronLICM]: Finished (changed=False) +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/NeuronLICM]: NeuronLICM finished after 0.063 seconds +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/InferPSumTensor]: Running InferPSumTensor +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/InferPSumTensor]: Finished (changed=False) +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/InferPSumTensor]: InferPSumTensor finished after 0.137 seconds +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/WeightCoalescing]: Running WeightCoalescing +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/WeightCoalescing]: Finished (changed=False) +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/WeightCoalescing]: WeightCoalescing finished after 0.013 seconds +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/LegalizeSundaAccess]: Running LegalizeSundaAccess +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/LegalizeSundaAccess]: Finished (changed=False) +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/LegalizeSundaAccess]: LegalizeSundaAccess finished after 0.114 seconds +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/RelaxPredicates]: Running RelaxPredicates +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/RelaxPredicates]: Finished (changed=False) +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/RelaxPredicates]: RelaxPredicates finished after 0.029 seconds +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/TensorInitialization]: Running TensorInitialization +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/TensorInitialization]: Finished (changed=False) +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/TensorInitialization]: TensorInitialization finished after 0.015 seconds +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifyPredicates]: Running NeuronSimplifyPredicates +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifyPredicates]: Finished (changed=False) +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifyPredicates]: NeuronSimplifyPredicates finished after 0.014 seconds +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/ExpandISAMacro]: Running ExpandISAMacro +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/ExpandISAMacro]: Finished (changed=False) +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/ExpandISAMacro]: ExpandISAMacro finished after 0.025 seconds +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/SimplifyNeuronTensor]: Running SimplifyNeuronTensor +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/SimplifyNeuronTensor]: Finished (changed=False) +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/SimplifyNeuronTensor]: SimplifyNeuronTensor finished after 0.050 seconds +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/DMALocalityOpt]: Running DMALocalityOpt +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/DMALocalityOpt]: Finished (changed=False) +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/DMALocalityOpt]: DMALocalityOpt finished after 0.010 seconds +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/DataStreaming]: Running DataStreaming +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/DataStreaming]: Finished (changed=False) +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/DataStreaming]: DataStreaming finished after 0.026 seconds +2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/SFKVectorizer]: Running SFKVectorizer +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/SFKVectorizer]: Finished (changed=True) +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/SFKVectorizer]: SFKVectorizer finished after 2.728 seconds +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/LateLegalizeInst]: Running LateLegalizeInst +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/LateLegalizeInst]: Finished (changed=False) +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/LateLegalizeInst]: LateLegalizeInst finished after 0.050 seconds +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/CoalesceCCOp]: Running CoalesceCCOp +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/CoalesceCCOp]: Finished (changed=False) +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/CoalesceCCOp]: CoalesceCCOp finished after 0.015 seconds +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/SimpleAllReduceTiling]: Running SimpleAllReduceTiling +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/SimpleAllReduceTiling]: Finished (changed=False) +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/SimpleAllReduceTiling]: SimpleAllReduceTiling finished after 0.015 seconds +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DMAProfiler]: Running DMAProfiler +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DMAProfiler]: Top 10 (estimated) latency DMAs: +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 583.920us (128.000MiB, est bw: 229.856GB/s, 1.157% of tot. time) for bfloat16<128 x 4096> TongaSB partitions[1] bfloat16 (128, 128, 4096) %'19629.24343'[T_i0,i0.128,i1.4096] = load bfloat16<128 x 4096> {'CrossPassTensor': ''}bfloat16 (128, 128, 4096) %'input289'[T_i0,i0.128,i1.4096] # id=23881, src_id=None, , instances=128 # dl = tensor_op_name: t4051_pftranspose_19629 | hlo_id: 2097 | [[i0.128];[i1.4096]] -> [[i0.128];[i1.4096]] +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 349.240us (128.000MiB, est bw: 384.314GB/s, 0.692% of tot. time) for bfloat16<128 x 4096> bfloat16 (128, 128, 32, 128) %'output289'[16i0_18312_0+i0_18312_1,i0.128,i2.32,i1.128] = store bfloat16<128 x 4096> TongaSB partitions[1] bfloat16 (128, 128, 4096) %'t4051_pftranspose_19629'[16i0_18312_0+i0_18312_1,i0.128,i1.128+128i2.32] # id=21501, src_id=None, , instances=128 # dl = tensor_op_name: _transpose.578 | hlo_id: 2097 | [[i0.128];[i1.128, i2.32]] -> [[i0.128];[i1.128, i2.32]] +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 262.710us (56.000MiB, est bw: 223.518GB/s, 0.521% of tot. time) for bfloat16<128 x 3584> TongaSB partitions[2] bfloat16 (32, 2, 128, 3584) %'18341.23903'[T_i0,T_i2_26087,i0.128,i1.3584] = load bfloat16<128 x 3584> {'CrossPassTensor': ''}bfloat16 (32, 128, 2, 3584) %'input6'[T_i0,i0.128,T_i2_26087,i1.3584] # id=22057, src_id=None, , instances=64 # dl = tensor_op_name: t2070_pftranspose_18341 | hlo_id: 1531 | [[i0.128];[i1.3584]] -> [[i0.128];[i1.3584]] +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 262.710us (56.000MiB, est bw: 223.518GB/s, 0.521% of tot. time) for bfloat16<128 x 3584> TongaSB partitions[2] bfloat16 (32, 2, 128, 3584) %'18382.23917'[T_i0,T_i2_26095,i0.128,i1.3584] = load bfloat16<128 x 3584> {'CrossPassTensor': ''}bfloat16 (32, 128, 2, 3584) %'input15'[T_i0,i0.128,T_i2_26095,i1.3584] # id=22115, src_id=None, , instances=64 # dl = tensor_op_name: t2133_pftranspose_18382 | hlo_id: 1549 | [[i0.128];[i1.3584]] -> [[i0.128];[i1.3584]] +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 262.710us (56.000MiB, est bw: 223.518GB/s, 0.521% of tot. time) for bfloat16<128 x 3584> TongaSB partitions[2] bfloat16 (32, 2, 128, 3584) %'18423.23931'[T_i0,T_i2_26103,i0.128,i1.3584] = load bfloat16<128 x 3584> {'CrossPassTensor': ''}bfloat16 (32, 128, 2, 3584) %'input24'[T_i0,i0.128,T_i2_26103,i1.3584] # id=22173, src_id=None, , instances=64 # dl = tensor_op_name: t2196_pftranspose_18423 | hlo_id: 1567 | [[i0.128];[i1.3584]] -> [[i0.128];[i1.3584]] +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 262.710us (56.000MiB, est bw: 223.518GB/s, 0.521% of tot. time) for bfloat16<128 x 3584> TongaSB partitions[2] bfloat16 (32, 2, 128, 3584) %'18464.23945'[T_i0,T_i2_26111,i0.128,i1.3584] = load bfloat16<128 x 3584> {'CrossPassTensor': ''}bfloat16 (32, 128, 2, 3584) %'input33'[T_i0,i0.128,T_i2_26111,i1.3584] # id=22231, src_id=None, , instances=64 # dl = tensor_op_name: t2259_pftranspose_18464 | hlo_id: 1585 | [[i0.128];[i1.3584]] -> [[i0.128];[i1.3584]] +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 262.710us (56.000MiB, est bw: 223.518GB/s, 0.521% of tot. time) for bfloat16<128 x 3584> TongaSB partitions[2] bfloat16 (32, 2, 128, 3584) %'18505.23959'[T_i0,T_i2_26119,i0.128,i1.3584] = load bfloat16<128 x 3584> {'CrossPassTensor': ''}bfloat16 (32, 128, 2, 3584) %'input42'[T_i0,i0.128,T_i2_26119,i1.3584] # id=22289, src_id=None, , instances=64 # dl = tensor_op_name: t2322_pftranspose_18505 | hlo_id: 1603 | [[i0.128];[i1.3584]] -> [[i0.128];[i1.3584]] +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 262.710us (56.000MiB, est bw: 223.518GB/s, 0.521% of tot. time) for bfloat16<128 x 3584> TongaSB partitions[2] bfloat16 (32, 2, 128, 3584) %'18546.23973'[T_i0,T_i2_26127,i0.128,i1.3584] = load bfloat16<128 x 3584> {'CrossPassTensor': ''}bfloat16 (32, 128, 2, 3584) %'input51'[T_i0,i0.128,T_i2_26127,i1.3584] # id=22347, src_id=None, , instances=64 # dl = tensor_op_name: t2385_pftranspose_18546 | hlo_id: 1621 | [[i0.128];[i1.3584]] -> [[i0.128];[i1.3584]] +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 262.710us (56.000MiB, est bw: 223.518GB/s, 0.521% of tot. time) for bfloat16<128 x 3584> TongaSB partitions[2] bfloat16 (32, 2, 128, 3584) %'18587.23987'[T_i0,T_i2_26135,i0.128,i1.3584] = load bfloat16<128 x 3584> {'CrossPassTensor': ''}bfloat16 (32, 128, 2, 3584) %'input60'[T_i0,i0.128,T_i2_26135,i1.3584] # id=22405, src_id=None, , instances=64 # dl = tensor_op_name: t2448_pftranspose_18587 | hlo_id: 1639 | [[i0.128];[i1.3584]] -> [[i0.128];[i1.3584]] +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 262.710us (56.000MiB, est bw: 223.518GB/s, 0.521% of tot. time) for bfloat16<128 x 3584> TongaSB partitions[2] bfloat16 (32, 2, 128, 3584) %'18628.24001'[T_i0,T_i2_26143,i0.128,i1.3584] = load bfloat16<128 x 3584> {'CrossPassTensor': ''}bfloat16 (32, 128, 2, 3584) %'input69'[T_i0,i0.128,T_i2_26143,i1.3584] # id=22463, src_id=None, , instances=64 # dl = tensor_op_name: t2511_pftranspose_18628 | hlo_id: 1657 | [[i0.128];[i1.3584]] -> [[i0.128];[i1.3584]] +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DMAProfiler]: Finished (changed=False) +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DMAProfiler]: DMAProfiler finished after 0.029 seconds +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/OptimizeNKIKernels]: Running OptimizeNKIKernels +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/OptimizeNKIKernels]: Finished (changed=False) +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/OptimizeNKIKernels]: OptimizeNKIKernels finished after 0.015 seconds +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/CCOpFusion]: Running CCOpFusion +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/CCOpFusion]: Finished (changed=True) +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/CCOpFusion]: CCOpFusion finished after 0.304 seconds +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/StaticProfiler]: Running StaticProfiler +2025-09-05T19:19:07Z WARNING 1971 [sg0000/Tensorizer/StaticProfiler]: matmul-based transposes inserted by penguin takes up 100.00 percent of all matmul computation +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/StaticProfiler]: Finished (changed=False) +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/StaticProfiler]: StaticProfiler finished after 0.035 seconds +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/SplitAPUnionSets]: Running SplitAPUnionSets +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/SplitAPUnionSets]: Finished (changed=True) +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/SplitAPUnionSets]: SplitAPUnionSets finished after 0.130 seconds +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/LateLegalizePostSplit]: Running LateLegalizePostSplit +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/LateLegalizePostSplit]: Finished (changed=False) +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/LateLegalizePostSplit]: LateLegalizePostSplit finished after 0.032 seconds +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DumpGraphAndMetadata]: Running DumpGraphAndMetadata +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DumpGraphAndMetadata]: Finished (changed=False) +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DumpGraphAndMetadata]: DumpGraphAndMetadata finished after 0.038 seconds +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/ZeroSizeTensorElimination]: Running ZeroSizeTensorElimination +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/ZeroSizeTensorElimination]: Finished (changed=False) +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/ZeroSizeTensorElimination]: ZeroSizeTensorElimination finished after 0.000 seconds +2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/BirCodeGenLoop]: Running BirCodeGenLoop +2025-09-05T19:19:08Z INFO 1971 [sg0000/Tensorizer/BirCodeGenLoop]: Finished (changed=False) +2025-09-05T19:19:08Z INFO 1971 [sg0000/Tensorizer/BirCodeGenLoop]: BirCodeGenLoop finished after 0.618 seconds +2025-09-05T19:19:09Z INFO 1971 [Tensorizer]: BirCodeGen estimate #instances=285991 in sg0000 +2025-09-05T19:19:09Z INFO 1971 [Tensorizer]: IR signature: 868d010c2395728dde8cb4de874491d6109b3553bfa434107fadd7604cffd131 for nc00/sg0000/TensorizerBIR +2025-09-05T19:19:09Z INFO 1971 [Tensorizer]: Weights total number of bytes: 131072 +2025-09-05T19:19:09Z INFO 1971 [Tensorizer]: Successfully built model. +2025-09-05T19:19:09Z USER 1971 [root/Tensorizer/Tensorizer]: Tensorizer finished after 27.551 seconds +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: End tensorization +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input0 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input1 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input2 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input3 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input4 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input5 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input6 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input7 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input8 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input9 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input10 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input11 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input12 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input13 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input14 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input15 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input16 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input17 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input18 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input19 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input20 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input21 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input22 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input23 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input24 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input25 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input26 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input27 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input28 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input29 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input30 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input31 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input32 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input33 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input34 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input35 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input36 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input37 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input38 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input39 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input40 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input41 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input42 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input43 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input44 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input45 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input46 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input47 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input48 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input49 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input50 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input51 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input52 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input53 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input54 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input55 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input56 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input57 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input58 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input59 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input60 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input61 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input62 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input63 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input64 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input65 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input66 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input67 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input68 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input69 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input70 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input71 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input72 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input73 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input74 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input75 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input76 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input77 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input78 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input79 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input80 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input81 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input82 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input83 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input84 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input85 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input86 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input87 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input88 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input89 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input90 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input91 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input92 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input93 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input94 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input95 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input96 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input97 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input98 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input99 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input100 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input101 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input102 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input103 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input104 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input105 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input106 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input107 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input108 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input109 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input110 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input111 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input112 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input113 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input114 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input115 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input116 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input117 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input118 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input119 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input120 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input121 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input122 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input123 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input124 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input125 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input126 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input127 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input128 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input129 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input130 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input131 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input132 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input133 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input134 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input135 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input136 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input137 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input138 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input139 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input140 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input141 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input142 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input143 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input144 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input145 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input146 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input147 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input148 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input149 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input150 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input151 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input152 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input153 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input154 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input155 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input156 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input157 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input158 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input159 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input160 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input161 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input162 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input163 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input164 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input165 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input166 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input167 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input168 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input169 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input170 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input171 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input172 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input173 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input174 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input175 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input176 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input177 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input178 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input179 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input180 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input181 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input182 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input183 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input184 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input185 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input186 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input187 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input188 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input189 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input190 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input191 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input192 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input193 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input194 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input195 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input196 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input197 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input198 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input199 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input200 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input201 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input202 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input203 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input204 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input205 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input206 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input207 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input208 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input209 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input210 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input211 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input212 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input213 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input214 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input215 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input216 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input217 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input218 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input219 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input220 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input221 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input222 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input223 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input224 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input225 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input226 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input227 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input228 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input229 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input230 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input231 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input232 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input233 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input234 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input235 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input236 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input237 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input238 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input239 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input240 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input241 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input242 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input243 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input244 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input245 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input246 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input247 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input248 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input249 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input250 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input251 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input252 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input253 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input254 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input255 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input256 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input257 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input258 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input259 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input260 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input261 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input262 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input263 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input264 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input265 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input266 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input267 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input268 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input269 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input270 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input271 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input272 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input273 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input274 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input275 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input276 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input277 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input278 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input279 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input280 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input281 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input282 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input283 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input284 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input285 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input286 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input287 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input288 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input289 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input290 +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: wrote bir.json +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: wrote tensor_map.json +2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Job #0 finished +2025-09-05T19:19:09Z INFO 1971 [pipeline.Pipeline.0]: Finished job job.Frontend.0 +2025-09-05T19:19:09Z INFO 1971 [pipeline.Pipeline.0]: Starting job job.StaticIOTranspose.0 +2025-09-05T19:19:09Z INFO 1971 [pipeline.Pipeline.0]: Finished job job.StaticIOTranspose.0 +2025-09-05T19:19:09Z INFO 1971 [pipeline.Pipeline.0]: Starting job job.WalrusDriver.0 +2025-09-05T19:19:09Z INFO 1971 [job.WalrusDriver.0]: BackendDriver has 1 states with 1 core LNC +2025-09-05T19:19:09Z INFO 1971 [job.WalrusDriver.0]: BackendDriver: no partitions found. Switching to flat flow. +2025-09-05T19:19:09Z INFO 1971 [job.WalrusDriver.0]: Job WalrusDriver len(in_states) 1 +2025-09-05T19:19:09Z INFO 1971 [job.WalrusDriver.0]: Processing input #0 +2025-09-05T19:19:09Z INFO 1971 [job.WalrusDriver.0]: BackendDriver in_state.num_states 1 with 1 core LNC +2025-09-05T19:19:09Z INFO 1971 [job.WalrusDriver.0]: Executing /opt/conda/lib/python3.10/site-packages/neuronxcc/starfish/bin/walrus_driver --optlevel 2 --allocator coloring --verbose 35 --logfile-verbose 20 --logfile /models/mistral-7b-v0.3-instruct-neuronx/layout_opt/log-neuron-cc.txt --execute-repetition 1 -i bir.json --min_split_size 10240 --skip_split_vns '' --no_split_dram --split_huge_dram_tensor 1.0 --preprocessing_only --max_tensorizer_distance 64 --pack_same_shape_only --instruction_fetch_latency 511 --max-partitions 1 --policy 3 --auxflag 0 --interleave none --schedule-delayed-latency 1 --postsched-mm-accum-reorder=false --max-load-color-rotation --max-load-lower-bound 0.14 --mm-reorder-opt --force-prefetch-follow-incoming-order -1 --allreduce-buffer-size 500 --dram-page-size 512 --dram-rotation-size -1 --allreduce-rotation-dis 8 --repeat-load-thres 4 --enable-mm-transpose-remat-optimization=true --save-len-thres 512 --save-dma-cnt-thres 32 --relaxed-order=true --enable-anti-dependence-reduction=false --num-semaphores-per-queue 16 --numcores 1 --act-root-json /opt/conda/lib/python3.10/site-packages/neuronxcc/pwp/pwp_bin_trainium/act_info.json --dve-root-json /opt/conda/lib/python3.10/site-packages/neuronxcc/dve/dve_bin_gen2/dve_info.json --unified-backend-and-legacy-codegen --tensor-map tensor_map.json --enable-verifier=true --enable-birsim=false --enable-birsim-sync-only=false --enable-data-race-checker=false --enable-new-backend=true --inject-error=NONE --dge-levels vector_dynamic_offsets,scalar_dynamic_offset,io --dynamic-dma-scratch-size-per-partition=16384 --neff-output-filename /models/mistral-7b-v0.3-instruct-neuronx/layout_opt/graph.neff +2025-09-05T19:19:09Z INFO 1971 [job.WalrusDriver.0]: Working directory is /opt/vllm/neuronxcc-gss0wi_w/sg00 +2025-09-05T19:19:09Z INFO 1971 [job.WalrusDriver.0]: propagate_exit=True +2025-09-05T19:19:09Z INFO 1971 [job.WalrusDriver.0]: use_logger=False +2025-09-05T19:19:09Z INFO 1971 [job.WalrusDriver.0]: expose_stderr=True +2025-09-05T19:19:09Z INFO 2007 [Logging]: Logging to ../../../../models/mistral-7b-v0.3-instruct-neuronx/layout_opt/log-neuron-cc.txt at level 'INFO' +2025-09-05T19:19:09Z INFO 2007 [BackendDriver]: max_allowed_parallelism=32 +2025-09-05T19:19:09Z INFO 2007 [BackendDriver]: Backend driver mtBackend: false numModules: 1 Cwd: "/opt/vllm/neuronxcc-gss0wi_w/sg00" +2025-09-05T19:19:09Z INFO 2007 [BackendDriver]: DynamicDMA is enabled +2025-09-05T19:19:09Z INFO 2007 [BackendDriver]: DynamicDMA levels being enabled: io, scalar_dynamic_offset, vector_dynamic_offsets, +2025-09-05T19:19:09Z USER 2007 [BackendPassManager]: Running mod_parallel_pass +2025-09-05T19:19:09Z INFO 2007 [BackendPassManager]: Inputs to mod_parallel_pass: modules=1 functions=1 allocs=1455 blocks=1 instructions=775 Max writers: 1 Max Readers: 290 +2025-09-05T19:19:09Z USER 2007 [ModuleForkPass]: Running do_nothing +2025-09-05T19:19:09Z INFO 2007 [ModuleForkPass]: Inputs to do_nothing: modules=1 functions=1 allocs=1455 blocks=1 instructions=775 Max writers: 1 Max Readers: 290 +2025-09-05T19:19:09Z USER 2007 [ModuleForkPass]: do_nothing finished after 0.002 seconds +2025-09-05T19:19:09Z INFO 2007 [ModuleForkPass]: curr_vmrss: 175mb, ru_maxrss: 419mb (delta=0mb) +2025-09-05T19:19:09Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 1455 memory location(s), 1 block(s), and 775 instruction(s). Max writers: 1 Max Readers: 290 +2025-09-05T19:19:09Z USER 2007 [ModuleForkPass]: Running birverifier +2025-09-05T19:19:09Z INFO 2007 [ModuleForkPass]: Inputs to birverifier: modules=1 functions=1 allocs=1455 blocks=1 instructions=775 Max writers: 1 Max Readers: 290 +2025-09-05T19:19:10Z USER 2007 [ModuleForkPass]: birverifier finished after 0.275 seconds +2025-09-05T19:19:10Z INFO 2007 [ModuleForkPass]: curr_vmrss: 961mb, ru_maxrss: 961mb (delta=542mb) +2025-09-05T19:19:10Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 1455 memory location(s), 1 block(s), and 775 instruction(s). Max writers: 1 Max Readers: 290 +2025-09-05T19:19:10Z USER 2007 [BackendPassManager]: mod_parallel_pass finished after 0.282 seconds +2025-09-05T19:19:10Z INFO 2007 [BackendPassManager]: curr_vmrss: 961mb, ru_maxrss: 961mb (delta=542mb) +2025-09-05T19:19:10Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 1455 memory location(s), 1 block(s), and 775 instruction(s). Max writers: 1 Max Readers: 290 +2025-09-05T19:19:10Z USER 2007 [BackendPassManager]: Running subgraph_parallel_pass +2025-09-05T19:19:10Z INFO 2007 [BackendPassManager]: Inputs to subgraph_parallel_pass: modules=1 functions=1 allocs=1455 blocks=1 instructions=775 Max writers: 1 Max Readers: 290 +2025-09-05T19:19:10Z USER 2007 [SubgraphForkPass]: Running lnc_verifier +2025-09-05T19:19:10Z INFO 2007 [SubgraphForkPass]: Inputs to lnc_verifier: modules=1 functions=1 allocs=1455 blocks=1 instructions=775 Max writers: 1 Max Readers: 290 +2025-09-05T19:19:10Z USER 2007 [SubgraphForkPass]: lnc_verifier finished after 0.000 seconds +2025-09-05T19:19:10Z INFO 2007 [SubgraphForkPass]: curr_vmrss: 961mb, ru_maxrss: 961mb (delta=0mb) +2025-09-05T19:19:10Z INFO 2007 [SubgraphForkPass]: Output has 1 module(s), 1 function(s), 1455 memory location(s), 1 block(s), and 775 instruction(s). Max writers: 1 Max Readers: 290 +2025-09-05T19:19:10Z USER 2007 [BackendPassManager]: subgraph_parallel_pass finished after 0.001 seconds +2025-09-05T19:19:10Z INFO 2007 [BackendPassManager]: curr_vmrss: 961mb, ru_maxrss: 961mb (delta=0mb) +2025-09-05T19:19:10Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 1455 memory location(s), 1 block(s), and 775 instruction(s). Max writers: 1 Max Readers: 290 +2025-09-05T19:19:10Z USER 2007 [BackendPassManager]: Running mod_parallel_pass +2025-09-05T19:19:10Z INFO 2007 [BackendPassManager]: Inputs to mod_parallel_pass: modules=1 functions=1 allocs=1455 blocks=1 instructions=775 Max writers: 1 Max Readers: 290 +2025-09-05T19:19:10Z USER 2007 [ModuleForkPass]: Running expand_replication +2025-09-05T19:19:10Z INFO 2007 [ModuleForkPass]: Inputs to expand_replication: modules=1 functions=1 allocs=1455 blocks=1 instructions=775 Max writers: 1 Max Readers: 290 +2025-09-05T19:19:10Z INFO 2007 [ExpandReplication]: Found 0 replicated matmults +2025-09-05T19:19:10Z USER 2007 [ModuleForkPass]: expand_replication finished after 0.001 seconds +2025-09-05T19:19:10Z INFO 2007 [ModuleForkPass]: curr_vmrss: 961mb, ru_maxrss: 961mb (delta=0mb) +2025-09-05T19:19:10Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 1455 memory location(s), 1 block(s), and 775 instruction(s). Max writers: 1 Max Readers: 290 +2025-09-05T19:19:10Z USER 2007 [ModuleForkPass]: Running unroll +2025-09-05T19:19:10Z INFO 2007 [ModuleForkPass]: Inputs to unroll: modules=1 functions=1 allocs=1455 blocks=1 instructions=775 Max writers: 1 Max Readers: 290 +2025-09-05T19:19:10Z INFO 2007 [Unroll]: INFO (Unroll) Start unrolling at Fri Sep 5 19:19:10 2025 +2025-09-05T19:19:12Z INFO 2007 [Unroll]: INFO (Unroll) DONE unrolling Fri Sep 5 19:19:10 2025 + +2025-09-05T19:19:12Z INFO 2007 [Unroll]: sg0000 Instruction count after Unroll: +2025-09-05T19:19:12Z INFO 2007 [Unroll]: Total count: 285701 +2025-09-05T19:19:12Z INFO 2007 [Unroll]: Matmult: 217153 +2025-09-05T19:19:12Z INFO 2007 [Unroll]: GenericCopy: 54337 +2025-09-05T19:19:12Z INFO 2007 [Unroll]: Load: 7106 +2025-09-05T19:19:12Z INFO 2007 [Unroll]: Save: 7105 +2025-09-05T19:19:12Z INFO 2007 [Unroll]: Unrolled DGE count with Dynamic AP: 0 +2025-09-05T19:19:12Z USER 2007 [ModuleForkPass]: unroll finished after 2.471 seconds +2025-09-05T19:19:12Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2540mb, ru_maxrss: 2541mb (delta=1580mb) +2025-09-05T19:19:12Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69807 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:12Z USER 2007 [BackendPassManager]: mod_parallel_pass finished after 2.543 seconds +2025-09-05T19:19:12Z INFO 2007 [BackendPassManager]: curr_vmrss: 1679mb, ru_maxrss: 2541mb (delta=1580mb) +2025-09-05T19:19:12Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 69807 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:12Z USER 2007 [BackendPassManager]: Running subgraph_parallel_pass +2025-09-05T19:19:12Z INFO 2007 [BackendPassManager]: Inputs to subgraph_parallel_pass: modules=1 functions=1 allocs=69807 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:12Z USER 2007 [SubgraphForkPass]: Running dead_code_elim +2025-09-05T19:19:12Z INFO 2007 [SubgraphForkPass]: Inputs to dead_code_elim: modules=1 functions=1 allocs=69807 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:12Z INFO 2007 [DeadCodeElim]: eliminateDeadStore removed 0 instructions +2025-09-05T19:19:12Z INFO 2007 [DeadCodeElim]: remove_must_alias_dmacopy removed 0 DMAcopys +2025-09-05T19:19:12Z INFO 2007 [DeadCodeElim]: remove_redundant_alias_dmacopy removed 0 DMAcopys +2025-09-05T19:19:12Z INFO 2007 [DeadCodeElim]: remove_redundant_internal2internal_dmacopy removed 0 DMAcopys +2025-09-05T19:19:12Z USER 2007 [SubgraphForkPass]: dead_code_elim finished after 0.257 seconds +2025-09-05T19:19:12Z INFO 2007 [SubgraphForkPass]: curr_vmrss: 1699mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:12Z INFO 2007 [SubgraphForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:12Z USER 2007 [BackendPassManager]: subgraph_parallel_pass finished after 0.266 seconds +2025-09-05T19:19:12Z INFO 2007 [BackendPassManager]: curr_vmrss: 1699mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:12Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:12Z USER 2007 [BackendPassManager]: Running mod_parallel_pass +2025-09-05T19:19:12Z INFO 2007 [BackendPassManager]: Inputs to mod_parallel_pass: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:12Z USER 2007 [ModuleForkPass]: Running birverifier +2025-09-05T19:19:12Z INFO 2007 [ModuleForkPass]: Inputs to birverifier: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:13Z USER 2007 [ModuleForkPass]: birverifier finished after 0.232 seconds +2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1753mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:13Z USER 2007 [BackendPassManager]: mod_parallel_pass finished after 0.239 seconds +2025-09-05T19:19:13Z INFO 2007 [BackendPassManager]: curr_vmrss: 1753mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:13Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:13Z USER 2007 [BackendPassManager]: Running subgraph_parallel_pass +2025-09-05T19:19:13Z INFO 2007 [BackendPassManager]: Inputs to subgraph_parallel_pass: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:13Z USER 2007 [SubgraphForkPass]: Running lnc_verifier +2025-09-05T19:19:13Z INFO 2007 [SubgraphForkPass]: Inputs to lnc_verifier: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:13Z USER 2007 [SubgraphForkPass]: lnc_verifier finished after 0.003 seconds +2025-09-05T19:19:13Z INFO 2007 [SubgraphForkPass]: curr_vmrss: 1753mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:13Z INFO 2007 [SubgraphForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:13Z USER 2007 [BackendPassManager]: subgraph_parallel_pass finished after 0.010 seconds +2025-09-05T19:19:13Z INFO 2007 [BackendPassManager]: curr_vmrss: 1753mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:13Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:13Z USER 2007 [BackendPassManager]: Running mod_parallel_pass +2025-09-05T19:19:13Z INFO 2007 [BackendPassManager]: Inputs to mod_parallel_pass: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:13Z USER 2007 [ModuleForkPass]: Running instruction_reorder +2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: Inputs to instruction_reorder: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:13Z USER 2007 [ModuleForkPass]: instruction_reorder finished after 0.027 seconds +2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1753mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:13Z USER 2007 [ModuleForkPass]: Running psum_legalization +2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: Inputs to psum_legalization: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:13Z USER 2007 [ModuleForkPass]: psum_legalization finished after 0.031 seconds +2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1753mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:13Z USER 2007 [ModuleForkPass]: Running legalize_cce_dma +2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: Inputs to legalize_cce_dma: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:13Z USER 2007 [ModuleForkPass]: legalize_cce_dma finished after 0.014 seconds +2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1753mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:13Z USER 2007 [ModuleForkPass]: Running error_injector +2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: Inputs to error_injector: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:13Z WARNING 2007 [ErrorInjector]: Unrecognized injected error value "0" +2025-09-05T19:19:13Z USER 2007 [ModuleForkPass]: error_injector finished after 0.003 seconds +2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1753mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:13Z USER 2007 [ModuleForkPass]: Running vn_splitter +2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: Inputs to vn_splitter: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:13Z INFO 2007 [VNSplitter]: INFO (VNSplitter) Collected all the internal vnodes: size = 0 +2025-09-05T19:19:13Z INFO 2007 [VNSplitter]: INFO (VNSplitter) Done with analyze and splitting: total dead nodes = 0 +2025-09-05T19:19:13Z INFO 2007 [PerformanceProfiler]: number of tensorizer non-local-tensor caused reload left 0 +2025-09-05T19:19:13Z INFO 2007 [PerformanceProfiler]: number of tensorizer non-local-tensor caused spill left 0 +2025-09-05T19:19:13Z INFO 2007 [VNSplitterPass]: INFO (VNSplitter) Time: 0.004 seconds +2025-09-05T19:19:13Z INFO 2007 [VNSplitterPass]: INFO (VerticalFusion) Time: 0.073 seconds +2025-09-05T19:19:13Z INFO 2007 [VNSplitterPass]: INFO (ShrinkDN) Time: 0.075 seconds +2025-09-05T19:19:13Z USER 2007 [ModuleForkPass]: vn_splitter finished after 0.186 seconds +2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1764mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:13Z USER 2007 [ModuleForkPass]: Running constant_propagate +2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: Inputs to constant_propagate: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:13Z INFO 2007 [ConstantPropagate]: [Constant_propagate for select] directly remove instruction number: 0 +2025-09-05T19:19:13Z INFO 2007 [ConstantPropagate]: eliminateDeadStore removed 0 instructions +2025-09-05T19:19:13Z INFO 2007 [ConstantPropagate]: remove_must_alias_dmacopy removed 0 DMAcopys +2025-09-05T19:19:13Z INFO 2007 [ConstantPropagate]: remove_redundant_alias_dmacopy removed 0 DMAcopys +2025-09-05T19:19:13Z INFO 2007 [ConstantPropagate]: remove_redundant_internal2internal_dmacopy removed 0 DMAcopys +2025-09-05T19:19:13Z INFO 2007 [ConstantPropagate]: [Constant_propagate for Affineselect] directly remove instruction number: 0 +2025-09-05T19:19:13Z INFO 2007 [ConstantPropagate]: eliminateDeadStore removed 0 instructions +2025-09-05T19:19:13Z INFO 2007 [ConstantPropagate]: remove_must_alias_dmacopy removed 0 DMAcopys +2025-09-05T19:19:13Z INFO 2007 [ConstantPropagate]: remove_redundant_alias_dmacopy removed 0 DMAcopys +2025-09-05T19:19:13Z INFO 2007 [ConstantPropagate]: remove_redundant_internal2internal_dmacopy removed 0 DMAcopys +2025-09-05T19:19:13Z USER 2007 [ModuleForkPass]: constant_propagate finished after 0.482 seconds +2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1768mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:13Z USER 2007 [ModuleForkPass]: Running lower_ac +2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: Inputs to lower_ac: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:14Z INFO 2007 [LowerAC]: INFO (LowerAC) Lowered 0 loads, 0 saves, 0 copies. +2025-09-05T19:19:14Z USER 2007 [ModuleForkPass]: lower_ac finished after 0.026 seconds +2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1768mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:14Z USER 2007 [ModuleForkPass]: Running input_dma_coalescing +2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: Inputs to input_dma_coalescing: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:14Z INFO 2007 [DMAOptimizationBase]: DMA input Coalescing combined 0 input loads +2025-09-05T19:19:14Z USER 2007 [ModuleForkPass]: input_dma_coalescing finished after 0.078 seconds +2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1768mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:14Z USER 2007 [ModuleForkPass]: Running remat_optimization +2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: Inputs to remat_optimization: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:14Z INFO 2007 [RematOpt]: Removed 0 remat instructions +2025-09-05T19:19:14Z USER 2007 [ModuleForkPass]: remat_optimization finished after 0.174 seconds +2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1772mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:14Z USER 2007 [ModuleForkPass]: Running early_peephole_opts +2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: Inputs to early_peephole_opts: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:14Z INFO 2007 [EarlyPeepholeOpts]: PeepholeOpts enabled? ActivationAccumulate: true +2025-09-05T19:19:14Z INFO 2007 [EarlyPeepholeOpts]: Activation Accumulate: 0 +2025-09-05T19:19:14Z USER 2007 [ModuleForkPass]: early_peephole_opts finished after 0.048 seconds +2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1772mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:14Z USER 2007 [ModuleForkPass]: Running coalesce_multichannel_cc_ops +2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: Inputs to coalesce_multichannel_cc_ops: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:14Z USER 2007 [ModuleForkPass]: coalesce_multichannel_cc_ops finished after 0.015 seconds +2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1772mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:14Z USER 2007 [ModuleForkPass]: Running infer_stream_ids +2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: Inputs to infer_stream_ids: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:14Z USER 2007 [ModuleForkPass]: infer_stream_ids finished after 0.015 seconds +2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1772mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:14Z USER 2007 [ModuleForkPass]: Running pre_sched +2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: Inputs to pre_sched: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:14Z INFO 2007 [PreSched]: Start PRE scheduling 2 cores: 1 at: Fri Sep 5 19:19:14 2025 +2025-09-05T19:19:14Z INFO 2007 [LayerSpiller]: LayerSpill: Start... +2025-09-05T19:19:14Z INFO 2007 [LayerSpiller]: LayerSpill: Found 0 Splits CCs +2025-09-05T19:19:14Z INFO 2007 [LayerSpiller]: Grouped CCs to 0 clusters. +2025-09-05T19:19:14Z INFO 2007 [LayerSpiller]: LayerSpill: To Spill 0 multi-layer tensors +2025-09-05T19:19:14Z INFO 2007 [LayerSpiller]: LayerSpill: set uninit flag on 0 insts +2025-09-05T19:19:14Z INFO 2007 [LayerSpiller]: LayerSpill: Done. +2025-09-05T19:19:14Z INFO 2007 [PreSched]: Start split live ranges Fri Sep 5 19:19:14 2025 +2025-09-05T19:19:14Z INFO 2007 [PreSched]: Num_Splits: 0 +2025-09-05T19:19:14Z INFO 2007 [PreSched]: End split live ranges Fri Sep 5 19:19:14 2025 +2025-09-05T19:19:14Z INFO 2007 [PreSched]: Strt remove redundncies Fri Sep 5 19:19:14 2025 +2025-09-05T19:19:14Z INFO 2007 [PreSched]: remove_redundant_memsets +2025-09-05T19:19:14Z INFO 2007 [PreSched]: remove_redundant_memsets: 0 +2025-09-05T19:19:14Z INFO 2007 [PreSched]: remove_redundant_loads +2025-09-05T19:19:14Z INFO 2007 [PreSched]: remove_redundant_loads: 0 +2025-09-05T19:19:14Z INFO 2007 [PreSched]: End remove redundncies Fri Sep 5 19:19:14 2025 +2025-09-05T19:19:14Z INFO 2007 [PreSched]: Start DCE Fri Sep 5 19:19:14 2025 +2025-09-05T19:19:14Z INFO 2007 [PreSched]: eliminateDeadStore removed 0 instructions +2025-09-05T19:19:14Z INFO 2007 [PreSched]: remove_must_alias_dmacopy removed 0 DMAcopys +2025-09-05T19:19:14Z INFO 2007 [PreSched]: remove_redundant_alias_dmacopy removed 0 DMAcopys +2025-09-05T19:19:14Z INFO 2007 [PreSched]: remove_redundant_internal2internal_dmacopy removed 0 DMAcopys +2025-09-05T19:19:14Z INFO 2007 [PreSched]: End DCE Fri Sep 5 19:19:14 2025 +2025-09-05T19:19:14Z INFO 2007 [PreSched]: Start build flow dependencies Fri Sep 5 19:19:14 2025 +2025-09-05T19:19:14Z INFO 2007 [build_flow_deps]: Start build fdeps. Invocation: 1Fri Sep 5 19:19:14 2025 +2025-09-05T19:19:14Z INFO 2007 [build_flow_deps]: Allocs: 69132 instructions: 285701 +2025-09-05T19:19:15Z INFO 2007 [build_flow_deps]: Build fdeps inserted 714949 edges +2025-09-05T19:19:15Z INFO 2007 [build_flow_deps]: Done build fdeps 714949 Fri Sep 5 19:19:15 2025 +2025-09-05T19:19:15Z INFO 2007 [PreSched]: End build flow dependencies Fri Sep 5 19:19:15 2025 +2025-09-05T19:19:15Z INFO 2007 [PreSched]: Start remove useless insts Fri Sep 5 19:19:15 2025 +2025-09-05T19:19:15Z INFO 2007 [PreSched]: remove_useless_insts +2025-09-05T19:19:16Z INFO 2007 [PreSched]: remove Useless Instructions: 0 +2025-09-05T19:19:16Z INFO 2007 [PreSched]: End remove useless insts Fri Sep 5 19:19:16 2025 +2025-09-05T19:19:16Z INFO 2007 [PreSched]: Start scratchpad optimization Fri Sep 5 19:19:16 2025 +2025-09-05T19:19:16Z INFO 2007 [PreSched]: End scratchpad optimization Fri Sep 5 19:19:16 2025 +2025-09-05T19:19:16Z INFO 2007 [PreSched]: DONE PRE scheduling Fri Sep 5 19:19:16 2025 +2025-09-05T19:19:16Z USER 2007 [ModuleForkPass]: pre_sched finished after 1.914 seconds +2025-09-05T19:19:16Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1884mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:16Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:16Z USER 2007 [ModuleForkPass]: Running tensor_copy_elim +2025-09-05T19:19:16Z INFO 2007 [ModuleForkPass]: Inputs to tensor_copy_elim: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:16Z INFO 2007 [TensorCopyElim]: Tensor CP elimination: 0 +2025-09-05T19:19:16Z INFO 2007 [TensorCopyElim]: eliminateDeadStore removed 0 instructions +2025-09-05T19:19:16Z INFO 2007 [TensorCopyElim]: remove_must_alias_dmacopy removed 0 DMAcopys +2025-09-05T19:19:16Z INFO 2007 [TensorCopyElim]: remove_redundant_alias_dmacopy removed 0 DMAcopys +2025-09-05T19:19:16Z INFO 2007 [TensorCopyElim]: remove_redundant_internal2internal_dmacopy removed 0 DMAcopys +2025-09-05T19:19:16Z USER 2007 [ModuleForkPass]: tensor_copy_elim finished after 0.341 seconds +2025-09-05T19:19:16Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1884mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:16Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:16Z USER 2007 [ModuleForkPass]: Running dynamic_dma_setup +2025-09-05T19:19:16Z INFO 2007 [ModuleForkPass]: Inputs to dynamic_dma_setup: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:16Z USER 2007 [ModuleForkPass]: dynamic_dma_setup finished after 0.004 seconds +2025-09-05T19:19:16Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1884mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:16Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69133 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:16Z USER 2007 [ModuleForkPass]: Running runtime_memory_reservation +2025-09-05T19:19:16Z INFO 2007 [ModuleForkPass]: Inputs to runtime_memory_reservation: modules=1 functions=1 allocs=69133 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:16Z USER 2007 [ModuleForkPass]: runtime_memory_reservation finished after 0.004 seconds +2025-09-05T19:19:16Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1884mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:16Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69133 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:16Z USER 2007 [ModuleForkPass]: Running coloring_allocator_psum +2025-09-05T19:19:16Z INFO 2007 [ModuleForkPass]: Inputs to coloring_allocator_psum: modules=1 functions=1 allocs=69133 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:16Z INFO 2007 [ColoringAllocator::Rep]: Allocating functions +2025-09-05T19:19:16Z INFO 2007 [ColoringAllocator::Rep]: linearize and check +2025-09-05T19:19:16Z INFO 2007 [PSUM_Allocator]: allocating PSUM +2025-09-05T19:19:16Z INFO 2007 [PSUM_Allocator]: main loop +2025-09-05T19:19:16Z INFO 2007 [PSUM_Allocator]: renumber locations +2025-09-05T19:19:16Z INFO 2007 [PSUM_Allocator]: size = 54337 +2025-09-05T19:19:16Z INFO 2007 [PSUM_Allocator]: build_no_bitmap start +2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: 100% PSUM demand before spilling +2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: PSUM high-water mark = 8 tensors +2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: found 182784 edges +2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: mean: 6.72779 +2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: median: 6.99997 +2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: adjacency vectors require 1462272 bytes +2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: build_no_bitmap done +2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: find costs +2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: best-of-n loop, heuristic = 0, allow_psum_spill_within_accum_group = false +2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: simplify interference graph +2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: initialize low and high +2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: lo = 54337 +2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: hi = 0 +2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: inf = 0 +2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: total = 54337 +2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: simplify +2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: new candidates = 0 +2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: select ranges +2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: no more spills +2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: PSUM score = 0 (lower is better) +2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: spilling from PSUM cost about 0 cycles +2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: 100% PSUM utilization after allocation +2025-09-05T19:19:17Z USER 2007 [ModuleForkPass]: coloring_allocator_psum finished after 0.597 seconds +2025-09-05T19:19:17Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1901mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:17Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69133 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:17Z USER 2007 [ModuleForkPass]: Running dma_optimization_psum +2025-09-05T19:19:17Z INFO 2007 [ModuleForkPass]: Inputs to dma_optimization_psum: modules=1 functions=1 allocs=69133 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:17Z INFO 2007 [DMAOptimizationBase]: [psum spill optimization]: removed 0 spill/reload instructions +2025-09-05T19:19:17Z INFO 2007 [DMAOptimizationBase]: [psum spill optimization]: removed 0 spill/reload memory locations +2025-09-05T19:19:17Z USER 2007 [ModuleForkPass]: dma_optimization_psum finished after 0.203 seconds +2025-09-05T19:19:17Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1901mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:17Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69133 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:17Z USER 2007 [ModuleForkPass]: Running address_rotation_psum +2025-09-05T19:19:17Z INFO 2007 [ModuleForkPass]: Inputs to address_rotation_psum: modules=1 functions=1 allocs=69133 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:18Z INFO 2007 [DMAOptimizationBase]: PSUM Rotation rotated 0 PSUM Banks +2025-09-05T19:19:18Z INFO 2007 [DMAOptimizationBase]: PSUM Rotation rotated 0 PSUM Banks +2025-09-05T19:19:19Z INFO 2007 [DMAOptimizationBase]: PSUM Rotation rotated 0 PSUM Banks +2025-09-05T19:19:19Z USER 2007 [ModuleForkPass]: address_rotation_psum finished after 1.984 seconds +2025-09-05T19:19:19Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1907mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:19Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69133 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:19Z USER 2007 [ModuleForkPass]: Running coloring_allocator_sb +2025-09-05T19:19:19Z INFO 2007 [ModuleForkPass]: Inputs to coloring_allocator_sb: modules=1 functions=1 allocs=69133 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:19Z INFO 2007 [ColoringAllocator::Rep]: INFO: Pre GCA DRAM bytes loaded 7114104832 +2025-09-05T19:19:19Z INFO 2007 [ColoringAllocator::Rep]: INFO: Pre GCA average loaded DMA size 7875 bytes +2025-09-05T19:19:19Z INFO 2007 [ColoringAllocator::Rep]: INFO: Pre GCA DRAM bytes saved 7114072064 +2025-09-05T19:19:19Z INFO 2007 [ColoringAllocator::Rep]: INFO: Pre GCA average saved DMA size 7822 bytes +2025-09-05T19:19:19Z INFO 2007 [ColoringAllocator::Rep]: INFO: Post GCA DRAM bytes DMACopyed 0 +2025-09-05T19:19:19Z INFO 2007 [ColoringAllocator::Rep]: INFO: Post GCA average DMACopyed DMA size 0 bytes +2025-09-05T19:19:19Z INFO 2007 [ColoringAllocator::Rep]: Allocating functions +2025-09-05T19:19:19Z INFO 2007 [ColoringAllocator::Rep]: linearize and check +2025-09-05T19:19:19Z INFO 2007 [SB_Allocator]: allocating SB +2025-09-05T19:19:19Z INFO 2007 [SB_Allocator]: main loop +2025-09-05T19:19:19Z INFO 2007 [SB_Allocator]: renumber locations +2025-09-05T19:19:19Z INFO 2007 [SB_Allocator]: size = 14212 +2025-09-05T19:19:19Z INFO 2007 [SB_Allocator]: find partners +2025-09-05T19:19:19Z INFO 2007 [SB_Allocator]: found 54337 accumulation groups +2025-09-05T19:19:19Z INFO 2007 [SB_Allocator]: largest = 19629.23884_i1022 +2025-09-05T19:19:19Z INFO 2007 [SB_Allocator]: tensors = 2 +2025-09-05T19:19:19Z INFO 2007 [SB_Allocator]: requires 8448 bytes/partition +2025-09-05T19:19:19Z INFO 2007 [SB_Allocator]: expanding partners +2025-09-05T19:19:19Z INFO 2007 []: find first defs for local +2025-09-05T19:19:19Z INFO 2007 []: find first defs for global +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: find loads +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: 1 pin count +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: 6081 remat count +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: 1 pinned tensors will require about 16384 bytes/partition +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: build interference graph +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: pass 1 int-tree +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Num intervals 14212 Num locations 14212 +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: IntervalTree Build Done +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: info.neighbors init Done +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: info.neighbors partners Done +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: IntervalTree readback Done +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: edge: 31236 +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: mean: 4.39572 +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: median: 2.00087 +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: find costs +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: best-of-n loop, heuristic = 0 +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: simplify interference graph +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: initialize safe & unsafe +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: safe = 14210 +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: unsafe = 1 +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: inf = 0 +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: total = 14211 +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: simplify +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: simplify_step3_sorted2 #Unsafe 0 #Pinned 0 #Safe 0 minCost 1.79769e+308 maxCost 2.22507e-308 locations 14212 +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: new candidates = 0 +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: select ranges +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Total: 14211 +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Spilled: 0.000 (0) +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Allocated: 1.000 (14211) +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Rover zone: 0.989 (14051) +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Pre-rover zone: 0.009 (128) +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Post-rover zone: 0.002 (32) +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Slice zone: 0.000 (0) +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Blocks nothing: 0.000 (0) +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Blocks medium: 0.000 (0) +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Blocks tall: 1.000 (14211) +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Visited until tall blocking (mean): 0.997 +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Visited until tall blocking (median): 1.000 +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Visited until tall blocking (p95): 1.000 +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Success +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: SB spills = 0 tensors +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: size = 0 bytes/partition +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: remats = 0 tensors +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: unpinned = 0 tensors +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: size = 0 bytes/partition +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: SB score = 0 +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: spilling from SB cost about 0 cycles +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: 16384 bytes/partition (100%) successfully pinned +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: pinning saved approximately 9010 cycles +2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: 0% SB utilization after allocation +2025-09-05T19:19:20Z INFO 2007 [ColoringAllocator::Rep]: INFO: Post GCA DRAM bytes loaded 7114104832 +2025-09-05T19:19:20Z INFO 2007 [ColoringAllocator::Rep]: INFO: Post GCA average loaded DMA size 7875 bytes +2025-09-05T19:19:20Z INFO 2007 [ColoringAllocator::Rep]: INFO: Post GCA DRAM bytes saved 7114072064 +2025-09-05T19:19:20Z INFO 2007 [ColoringAllocator::Rep]: INFO: Post GCA average saved DMA size 7822 bytes +2025-09-05T19:19:20Z INFO 2007 [ColoringAllocator::Rep]: INFO: Post GCA DRAM bytes DMACopyed 0 +2025-09-05T19:19:20Z INFO 2007 [ColoringAllocator::Rep]: INFO: Post GCA average DMACopyed DMA size 0 bytes +2025-09-05T19:19:20Z USER 2007 [ModuleForkPass]: coloring_allocator_sb finished after 1.103 seconds +2025-09-05T19:19:20Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1911mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:20Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69133 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:20Z USER 2007 [ModuleForkPass]: Running address_rotation_sb +2025-09-05T19:19:20Z INFO 2007 [ModuleForkPass]: Inputs to address_rotation_sb: modules=1 functions=1 allocs=69133 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:20Z INFO 2007 [DMAOptimizationBase]: SB Rotation rotated 0 Sb address +2025-09-05T19:19:20Z USER 2007 [ModuleForkPass]: address_rotation_sb finished after 0.290 seconds +2025-09-05T19:19:20Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1912mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:20Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69133 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:20Z USER 2007 [ModuleForkPass]: Running dma_optimization_sb +2025-09-05T19:19:20Z INFO 2007 [ModuleForkPass]: Inputs to dma_optimization_sb: modules=1 functions=1 allocs=69133 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:20Z INFO 2007 [DMAOptimizationBase]: DMA optimization In bytes loaded or saved 14228176896, 50.0001% input load, 49.9999% output write, 0% spill/reload [sg0000] +2025-09-05T19:19:20Z INFO 2007 [DMAOptimizationBase]: [DMA optimization]Reload_just_for_save Optimization removed 0 memlocs +2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: removed 0 identical load +2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: adjusted 0 DMACopy remat +2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: adjusted 0 DMACopy remat +2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: sub-graph will get execute 1 times +2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [Load Merging]: removed 0 remat/cloned instructions +2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [Load shrink]: shrinked 0 GCA remat/cloned instructions +2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [Load Merging + Load shrink] reduced input/const loading DMA traffic 0, 0% out of total dma traffic(7.1141e+09) +2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [spill optimization round 0]: removed 0 spill/reload instructions +2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [spill optimization round 0]: removed 0 spill/reload memory locations +2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [Spill Optimization] reduced DMA traffic 0, -nan% out of total spill/reload dma traffic +2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [Allocation optimization]: removed 0 spill/reload instructions +2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [Allocation optimization]: removed 0 spill/reload memory locations +2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [Re-allocation Optimization] reduced DMA traffic 0, -nan% out of total spill/reload dma traffic +2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [spill optimization round 0]: removed 0 spill/reload instructions +2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [spill optimization round 0]: removed 0 spill/reload memory locations +2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [Spill Optimization] reduced DMA traffic 0, -nan% out of total spill/reload dma traffic +2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [remove extra save] removed 0 memlocs and 0 instructions +2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [remove_memset_spill]: removed 0 spill/reload instructions +2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [remove_memset_spill]: removed 0 spill/reload memory locations +2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: eliminateDeadStore removed 0 instructions +2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: DMA SpillSave Coalescing Round 0 combined 0 SpillSaves and Reloads +2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: average loaded DMA size 7875 bytes +2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: average saved DMA size 7822 bytes +2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: INFO: Post DMA coalescing DRAM bytes loaded 7114104832 +2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: INFO: Post DMA coalescing average loaded DMA size 7875 bytes +2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: INFO: Post DMA coalescing DRAM bytes saved 7114072064 +2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: INFO: Post DMA coalescing average saved DMA size 7822 bytes +2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: [DMA optimization]Reload_just_for_save Optimization removed 0 memlocs +2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: [Experiment partial DMA access] reduced DMA traffic 0, -nan% out of total spill/reload dma traffic +2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: [DMA optimization] reduced DMA traffic 0, 0% out of total dma traffic +2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: DMA optimization Out bytes loaded or saved 14228176896, 50.0001% input load, 49.9999% output write, 0% spill/reload [sg0000] +2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: INFO: Post DMA optimization DRAM bytes loaded 7114104832 +2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: INFO: Post DMA optimization average loaded DMA size 7875 bytes +2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: INFO: Post DMA optimization DRAM bytes saved 7114072064 +2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: INFO: Post DMA optimization average saved DMA size 7822 bytes +2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: INFO: Post DMA optimization DRAM bytes DMAcopyed 0 +2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: INFO: Post DMA optimization average DMAcopyed DMA size 0 bytes +2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: INFO: Post DMA optimization average DMA size 7848 bytes +2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: INFO: Finished set_spill_canreadUninit(module); +2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: DMA optimization re-enable optimization +2025-09-05T19:19:22Z USER 2007 [ModuleForkPass]: dma_optimization_sb finished after 1.621 seconds +2025-09-05T19:19:22Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1932mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:22Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:22Z USER 2007 [ModuleForkPass]: Running address_rotation_sb +2025-09-05T19:19:22Z INFO 2007 [ModuleForkPass]: Inputs to address_rotation_sb: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: SB Rotation rotated 5919 Sb address +2025-09-05T19:19:23Z INFO 2007 [DMAOptimizationBase]: SB Rotation rotated 4699 Sb address +2025-09-05T19:19:23Z INFO 2007 [DMAOptimizationBase]: SB Rotation rotated 0 Sb address +2025-09-05T19:19:23Z INFO 2007 [DMAOptimizationBase]: SB Rotation rotated 0 Sb address +2025-09-05T19:19:23Z INFO 2007 [DMAOptimizationBase]: SB Rotation rotated 1769 Sb address +2025-09-05T19:19:24Z INFO 2007 [DMAOptimizationBase]: SB Rotation rotated 0 Sb address +2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: address_rotation_sb finished after 1.665 seconds +2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1932mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: Running coloring_allocator_dram +2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Inputs to coloring_allocator_dram: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:24Z INFO 2007 [ColoringAllocator::Rep]: Allocating functions +2025-09-05T19:19:24Z INFO 2007 [ColoringAllocator::Rep]: linearize and check +2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: allocating spills in DRAM pre_link mode for address space Local +2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: reserved space = 14496612352 bytes +2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: spill space = 0 bytes +2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: aligned spill space = 0 bytes +2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: dram space = 107374182400 bytes +2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: renumber locations +2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: size = 0 +2025-09-05T19:19:24Z INFO 2007 []: find first defs for local +2025-09-05T19:19:24Z INFO 2007 []: find first defs for global +2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: Num intervals 0 Num locations 0 +2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: IntervalTree Build Done +2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: info.neighbors init Done +2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: IntervalTree readback Done +2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: simplify interference graph +2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: initialize low and high +2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: lo = 0 +2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: hi = 0 +2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: total = 0 +2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: simplify +2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: new candidates = 0 +2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: select ranges +2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: CC buffer size limit 524288000 +2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: allreduce_dram_hwm 0 +2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: Real CC buffer size 0 +2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: DRAM hwm after allocation: 0 +2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: DRAM allocation successful +2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: coloring_allocator_dram finished after 0.469 seconds +2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1937mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: Running address_rotation_dram +2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Inputs to address_rotation_dram: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:24Z INFO 2007 [DMAOptimizationBase]: Runtime page size at 512MB +2025-09-05T19:19:24Z INFO 2007 [DMAOptimizationBase]: DRAM hwm before rotation 0 +2025-09-05T19:19:24Z INFO 2007 [DMAOptimizationBase]: allreduce buffer size 524288000 +2025-09-05T19:19:24Z INFO 2007 [DMAOptimizationBase]: allreduce hwm 0 +2025-09-05T19:19:24Z INFO 2007 [DMAOptimizationBase]: Real CC buffer size 0 +2025-09-05T19:19:24Z INFO 2007 [DMAOptimizationBase]: DRAM hwm after rotation 0 +2025-09-05T19:19:24Z INFO 2007 [DMAOptimizationBase]: DRAM Rotation rotated 0 Dram address +2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: address_rotation_dram finished after 0.171 seconds +2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1939mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: Running tensorcopy_accel +2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Inputs to tensorcopy_accel: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:24Z INFO 2007 [TensorCopyAccel::Impl]: Running peephole optimization pass +2025-09-05T19:19:24Z INFO 2007 [TensorCopyAccel::Impl]: Accelerated 0 out of 54337 tensorcopy in Function: sg0000 average acceleration factor: -nan +2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: tensorcopy_accel finished after 0.025 seconds +2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1939mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: Running peephole_opts +2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Inputs to peephole_opts: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:24Z INFO 2007 [PeepholeOpts]: PeepholeOpts enabled? Recip: true Tsp: true Tc: false SplitSelect: true SimplifyMemset true +2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: peephole_opts finished after 0.055 seconds +2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1939mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: Running lower_kernel +2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Inputs to lower_kernel: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:24Z INFO 2007 [LowerKernel]: Started running LowerKernel +2025-09-05T19:19:24Z INFO 2007 [LowerKernel]: Start of kernel lowering pass, number of insts: 285701, number of allocs: 69132 +2025-09-05T19:19:24Z INFO 2007 [LowerKernel]: Scan BKs time (s): 0.013202 +2025-09-05T19:19:24Z INFO 2007 [LowerKernel]: Lower BKs time (s): 3e-06 +2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: lower_kernel finished after 0.019 seconds +2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1939mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: Running lower_nki_kernel +2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Inputs to lower_nki_kernel: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: lower_nki_kernel finished after 0.017 seconds +2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1939mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: Running dynamic_dma_cleanup +2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Inputs to dynamic_dma_cleanup: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: dynamic_dma_cleanup finished after 0.029 seconds +2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1941mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: Running birverifier +2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Inputs to birverifier: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:25Z USER 2007 [ModuleForkPass]: birverifier finished after 0.195 seconds +2025-09-05T19:19:25Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1941mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:25Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:25Z USER 2007 [ModuleForkPass]: Running dynamic_dma_scan +2025-09-05T19:19:25Z INFO 2007 [ModuleForkPass]: Inputs to dynamic_dma_scan: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:25Z USER 2007 [ModuleForkPass]: dynamic_dma_scan finished after 0.027 seconds +2025-09-05T19:19:25Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1941mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:25Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:25Z USER 2007 [ModuleForkPass]: Running build_fdeps +2025-09-05T19:19:25Z INFO 2007 [ModuleForkPass]: Inputs to build_fdeps: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:25Z INFO 2007 [build_flow_deps]: Start build fdeps. Invocation: 2Fri Sep 5 19:19:25 2025 +2025-09-05T19:19:25Z INFO 2007 [build_flow_deps]: Allocs: 69132 instructions: 285701 +2025-09-05T19:19:26Z INFO 2007 [build_flow_deps]: Build fdeps inserted 714949 edges +2025-09-05T19:19:26Z INFO 2007 [build_flow_deps]: Done build fdeps 714949 Fri Sep 5 19:19:26 2025 +2025-09-05T19:19:26Z USER 2007 [ModuleForkPass]: build_fdeps finished after 1.075 seconds +2025-09-05T19:19:26Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1963mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:26Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:26Z USER 2007 [ModuleForkPass]: Running remove_redundancies +2025-09-05T19:19:26Z INFO 2007 [ModuleForkPass]: Inputs to remove_redundancies: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:26Z INFO 2007 [RemoveRedundancies]: remove_clobbered_writes +2025-09-05T19:19:26Z INFO 2007 [RemoveRedundancies]: remove_clobbered_writes: 0 +2025-09-05T19:19:26Z INFO 2007 [RemoveRedundancies]: remove_useless_insts +2025-09-05T19:19:26Z INFO 2007 [RemoveRedundancies]: remove Useless Instructions: 0 +2025-09-05T19:19:26Z USER 2007 [ModuleForkPass]: remove_redundancies finished after 0.090 seconds +2025-09-05T19:19:26Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1963mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:26Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:26Z USER 2007 [ModuleForkPass]: Running anti_dependency_analyzer +2025-09-05T19:19:26Z INFO 2007 [ModuleForkPass]: Inputs to anti_dependency_analyzer: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:26Z INFO 2007 [AntiDependencyAnalyzer]: Batch size: 1000 +2025-09-05T19:19:26Z INFO 2007 [AntiDependencyAnalyzer]: Analysis types: {DRAM,ALIAS,PSUM,SB} +2025-09-05T19:19:26Z INFO 2007 [AntiDependencyAnalyzer]: DRAM size: 17179869184 num-bins: 16 bin-size: 1073741824 +2025-09-05T19:19:27Z USER 2007 [ModuleForkPass]: anti_dependency_analyzer finished after 0.989 seconds +2025-09-05T19:19:27Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2058mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:27Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:27Z USER 2007 [ModuleForkPass]: Running tensor_copy_elim +2025-09-05T19:19:27Z INFO 2007 [ModuleForkPass]: Inputs to tensor_copy_elim: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:27Z INFO 2007 [TensorCopyElim]: Tensor CP elimination: 0 +2025-09-05T19:19:27Z INFO 2007 [TensorCopyElim]: eliminateDeadStore removed 0 instructions +2025-09-05T19:19:27Z USER 2007 [ModuleForkPass]: tensor_copy_elim finished after 0.240 seconds +2025-09-05T19:19:27Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2071mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:27Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:27Z USER 2007 [ModuleForkPass]: Running prefetch_scheduling_before_sched +2025-09-05T19:19:27Z INFO 2007 [ModuleForkPass]: Inputs to prefetch_scheduling_before_sched: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:27Z USER 2007 [ModuleForkPass]: prefetch_scheduling_before_sched finished after 0.003 seconds +2025-09-05T19:19:27Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2071mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:27Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:27Z USER 2007 [ModuleForkPass]: Running post_sched +2025-09-05T19:19:27Z INFO 2007 [ModuleForkPass]: Inputs to post_sched: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:27Z INFO 2007 [post_scheduler]: Start PosT ScheD 3 sunda Fri Sep 5 19:19:27 2025 +2025-09-05T19:19:29Z INFO 2007 [post_scheduler]: Time-aware hwm post-sched +2025-09-05T19:19:31Z INFO 2007 [post_scheduler]: Time-aware simulation time: 59414448 +2025-09-05T19:19:32Z INFO 2007 [post_scheduler]: Done PosT ScheD Fri Sep 5 19:19:32 2025 +2025-09-05T19:19:32Z USER 2007 [ModuleForkPass]: post_sched finished after 4.715 seconds +2025-09-05T19:19:32Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2468mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:32Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:32Z USER 2007 [ModuleForkPass]: Running expand_scheduling_units +2025-09-05T19:19:32Z INFO 2007 [ModuleForkPass]: Inputs to expand_scheduling_units: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:32Z USER 2007 [ModuleForkPass]: expand_scheduling_units finished after 0.027 seconds +2025-09-05T19:19:32Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2218mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:32Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:32Z USER 2007 [ModuleForkPass]: Running address_rotation_sb +2025-09-05T19:19:32Z INFO 2007 [ModuleForkPass]: Inputs to address_rotation_sb: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:34Z INFO 2007 [DMAOptimizationBase]: PSUM Rotation rotated 9197 PSUM Banks +2025-09-05T19:19:34Z INFO 2007 [DMAOptimizationBase]: PSUM Rotation rotated 5583 PSUM Banks +2025-09-05T19:19:35Z INFO 2007 [DMAOptimizationBase]: PSUM Rotation rotated 0 PSUM Banks +2025-09-05T19:19:35Z INFO 2007 [DMAOptimizationBase]: SB Rotation rotated 2554 Sb address +2025-09-05T19:19:36Z INFO 2007 [DMAOptimizationBase]: SB Rotation rotated 2329 Sb address +2025-09-05T19:19:36Z INFO 2007 [DMAOptimizationBase]: SB Rotation rotated 0 Sb address +2025-09-05T19:19:36Z INFO 2007 [DMAOptimizationBase]: SB Rotation rotated 0 Sb address +2025-09-05T19:19:37Z INFO 2007 [DMAOptimizationBase]: SB Rotation rotated 59 Sb address +2025-09-05T19:19:37Z INFO 2007 [DMAOptimizationBase]: moved 0 MM forward +2025-09-05T19:19:37Z INFO 2007 [DMAOptimizationBase]: SB Rotation rotated 0 Sb address +2025-09-05T19:19:38Z INFO 2007 [DMAOptimizationBase]: SB Rotation rotated 0 Sb address +2025-09-05T19:19:38Z USER 2007 [ModuleForkPass]: address_rotation_sb finished after 5.695 seconds +2025-09-05T19:19:38Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2254mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:38Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:38Z USER 2007 [ModuleForkPass]: Running anti_dependency_analyzer +2025-09-05T19:19:38Z INFO 2007 [ModuleForkPass]: Inputs to anti_dependency_analyzer: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:38Z INFO 2007 [AntiDependencyAnalyzer]: Batch size: 1000 +2025-09-05T19:19:38Z INFO 2007 [AntiDependencyAnalyzer]: Analysis types: {DRAM,ALIAS,PSUM,SB} +2025-09-05T19:19:38Z INFO 2007 [AntiDependencyAnalyzer]: DRAM size: 17179869184 num-bins: 16 bin-size: 1073741824 +2025-09-05T19:19:39Z USER 2007 [ModuleForkPass]: anti_dependency_analyzer finished after 1.349 seconds +2025-09-05T19:19:39Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2285mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:39Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:39Z USER 2007 [ModuleForkPass]: Running anti_dependency_analyzer +2025-09-05T19:19:39Z INFO 2007 [ModuleForkPass]: Inputs to anti_dependency_analyzer: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:39Z INFO 2007 [AntiDependencyAnalyzer]: Batch size: 1000 +2025-09-05T19:19:39Z INFO 2007 [AntiDependencyAnalyzer]: Analysis types: {DRAM,ALIAS} +2025-09-05T19:19:39Z INFO 2007 [AntiDependencyAnalyzer]: DRAM size: 17179869184 num-bins: 16 bin-size: 1073741824 +2025-09-05T19:19:39Z USER 2007 [ModuleForkPass]: anti_dependency_analyzer finished after 0.188 seconds +2025-09-05T19:19:39Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2287mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:39Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:39Z USER 2007 [ModuleForkPass]: Running dep_opt +2025-09-05T19:19:39Z INFO 2007 [ModuleForkPass]: Inputs to dep_opt: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:39Z INFO 2007 [build_flow_deps]: Start build fdeps. Invocation: 3Fri Sep 5 19:19:39 2025 +2025-09-05T19:19:39Z INFO 2007 [build_flow_deps]: Allocs: 69132 instructions: 285701 +2025-09-05T19:19:40Z INFO 2007 [build_flow_deps]: Build fdeps inserted 701433 edges +2025-09-05T19:19:40Z INFO 2007 [build_flow_deps]: Done build fdeps 701433 Fri Sep 5 19:19:40 2025 +2025-09-05T19:19:41Z USER 2007 [ModuleForkPass]: dep_opt finished after 1.572 seconds +2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2248mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:41Z USER 2007 [ModuleForkPass]: Running report_stats +2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: Inputs to report_stats: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:41Z INFO 2007 [ReportStats]: Data Movement Statistics: sg0000 +┌─────────────┬────────────────────────────┬───────┬────────────┐ +│ Instruction │ Kind │ Count │ Bytes │ +├─────────────┼────────────────────────────┼───────┼────────────┤ +│ Load │ Const -> Internal │ 1 │ 32768 │ +│ Load │ ExternalInput -> Internal │ 7105 │ 7114072064 │ +│ Save │ Internal -> ExternalOutput │ 7105 │ 7114072064 │ +└─────────────┴────────────────────────────┴───────┴────────────┘ + +2025-09-05T19:19:41Z INFO 2007 [ReportStats]: +┌─────────────────────┬───────┐ +│ Bytes per partition │ Count │ +├─────────────────────┼───────┤ +│ 64 │ 65 │ +│ 256 │ 66 │ +│ 7168 │ 4096 │ +│ 8192 │ 9984 │ +└─────────────────────┴───────┘ + +2025-09-05T19:19:41Z INFO 2007 [ReportStats]: MM Stats: #MatMults 217153 #MatMult-Transposes 217153 +2025-09-05T19:19:41Z INFO 2007 [ReportStats]: IO Tensor size combined: 14496579584 +2025-09-05T19:19:41Z INFO 2007 [ReportStats]: IO Tensor Statistics: +┌────────────────────┬────────────────┬──────────┬──────────────┐ +│ Largest IO Tensors │ Kind │ Src Type │ Size (Bytes) │ +├────────────────────┼────────────────┼──────────┼──────────────┤ +│ output289 │ ExternalOutput │ bfloat16 │ 134217728 │ +│ input0 │ ExternalInput │ bfloat16 │ 134217728 │ +│ output0 │ ExternalOutput │ bfloat16 │ 134217728 │ +│ input289 │ ExternalInput │ bfloat16 │ 134217728 │ +│ input7 │ ExternalInput │ bfloat16 │ 58720256 │ +│ input9 │ ExternalInput │ bfloat16 │ 58720256 │ +│ input24 │ ExternalInput │ bfloat16 │ 58720256 │ +│ input18 │ ExternalInput │ bfloat16 │ 58720256 │ +│ input15 │ ExternalInput │ bfloat16 │ 58720256 │ +│ input27 │ ExternalInput │ bfloat16 │ 58720256 │ +└────────────────────┴────────────────┴──────────┴──────────────┘ + +2025-09-05T19:19:41Z INFO 2007 [ReportStats]: Large (Internal) Tensor Statistics: +┌────────────────────────────┬──────────┬──────────┬──────────────┐ +│ Largest Tensors │ Kind │ Src Type │ Size (Bytes) │ +├────────────────────────────┼──────────┼──────────┼──────────────┤ +│ DynamicDMAScratchLoc ��� Internal │ uint8 │ 2097152 │ +│ t2035_pftranspose_18320_i5 │ Internal │ bfloat16 │ 1048576 │ +│ t2035_pftranspose_18320_i2 │ Internal │ bfloat16 │ 1048576 │ +│ t2035_pftranspose_18320_i1 │ Internal │ bfloat16 │ 1048576 │ +│ t2035_pftranspose_18320_i3 │ Internal │ bfloat16 │ 1048576 │ +│ t2035_pftranspose_18320_i4 │ Internal │ bfloat16 │ 1048576 │ +│ t2035_pftranspose_18320_i6 │ Internal │ bfloat16 │ 1048576 │ +│ t2035_pftranspose_18320_i9 │ Internal │ bfloat16 │ 1048576 │ +│ t2035_pftranspose_18320_i8 │ Internal │ bfloat16 │ 1048576 │ +│ t2035_pftranspose_18320_i7 │ Internal │ bfloat16 │ 1048576 │ +└────────────────────────────┴──────────┴──────────┴──────────────┘ + +2025-09-05T19:19:41Z USER 2007 [ModuleForkPass]: report_stats finished after 0.059 seconds +2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2248mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:41Z USER 2007 [BackendPassManager]: mod_parallel_pass finished after 28.058 seconds +2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: curr_vmrss: 2248mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:41Z USER 2007 [BackendPassManager]: Running assign_trigger_engine +2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: Inputs to assign_trigger_engine: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:41Z INFO 2007 [AssignTriggerEngine]: Assigned trigger engine for 0 DMA instructions. Moved 0 DMA instructions to CC's engines. +2025-09-05T19:19:41Z USER 2007 [BackendPassManager]: assign_trigger_engine finished after 0.097 seconds +2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: curr_vmrss: 2248mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:41Z USER 2007 [BackendPassManager]: Running subgraph_parallel_pass +2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: Inputs to subgraph_parallel_pass: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:41Z USER 2007 [SubgraphForkPass]: Running lower_local_collectives +2025-09-05T19:19:41Z INFO 2007 [SubgraphForkPass]: Inputs to lower_local_collectives: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:41Z USER 2007 [SubgraphForkPass]: lower_local_collectives finished after 0.004 seconds +2025-09-05T19:19:41Z INFO 2007 [SubgraphForkPass]: curr_vmrss: 2248mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:41Z INFO 2007 [SubgraphForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:41Z USER 2007 [SubgraphForkPass]: Running extend_shared_lifetimes +2025-09-05T19:19:41Z INFO 2007 [SubgraphForkPass]: Inputs to extend_shared_lifetimes: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:41Z USER 2007 [SubgraphForkPass]: extend_shared_lifetimes finished after 0.003 seconds +2025-09-05T19:19:41Z INFO 2007 [SubgraphForkPass]: curr_vmrss: 2248mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:41Z INFO 2007 [SubgraphForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:41Z USER 2007 [SubgraphForkPass]: Running dead_code_elim +2025-09-05T19:19:41Z INFO 2007 [SubgraphForkPass]: Inputs to dead_code_elim: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:41Z INFO 2007 [DeadCodeElim]: eliminateDeadStore removed 0 instructions +2025-09-05T19:19:41Z USER 2007 [SubgraphForkPass]: dead_code_elim finished after 0.245 seconds +2025-09-05T19:19:41Z INFO 2007 [SubgraphForkPass]: curr_vmrss: 2249mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:41Z INFO 2007 [SubgraphForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:41Z USER 2007 [BackendPassManager]: subgraph_parallel_pass finished after 0.266 seconds +2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: curr_vmrss: 2249mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:41Z USER 2007 [BackendPassManager]: Running assign_hwdge_engine +2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: Inputs to assign_hwdge_engine: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:41Z USER 2007 [BackendPassManager]: assign_hwdge_engine finished after 0.027 seconds +2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: curr_vmrss: 2249mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:41Z USER 2007 [BackendPassManager]: Running mod_parallel_pass +2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: Inputs to mod_parallel_pass: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:41Z USER 2007 [ModuleForkPass]: Running alloc_queues +2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: Inputs to alloc_queues: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:41Z INFO 2007 [AllocQueues]: DMACopy transpose will be triggered from multiple engines +2025-09-05T19:19:41Z INFO 2007 [AllocQueues]: Alloc Queue info: +┌─────────────────┬────────────────┬────────┬────────────┬──────────────────┐ +│ Name │ DMAQueue::Type │ Engine │ Num Queues │ Num instructions │ +├─────────────────┼────────────────┼────────┼────────────┼──────────────────┤ +│ qSPSpillReload0 │ data │ SP │ 16 │ 1 │ +│ qPoolDynamic │ dynamic │ Pool │ 16 │ 14210 │ +└─────────────────┴────────────────┴────────┴────────────┴──────────────────┘ + +2025-09-05T19:19:41Z USER 2007 [ModuleForkPass]: alloc_queues finished after 0.030 seconds +2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2249mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:41Z USER 2007 [ModuleForkPass]: Running chain_dma_transposes +2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: Inputs to chain_dma_transposes: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:41Z USER 2007 [ModuleForkPass]: chain_dma_transposes finished after 0.003 seconds +2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2249mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:41Z USER 2007 [ModuleForkPass]: Running prefetch_scheduling_after_sched +2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: Inputs to prefetch_scheduling_after_sched: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:41Z USER 2007 [ModuleForkPass]: prefetch_scheduling_after_sched finished after 0.003 seconds +2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2249mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:41Z USER 2007 [ModuleForkPass]: Running lower_control +2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: Inputs to lower_control: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:41Z INFO 2007 [LowerControl]: EraseInterBbDeps removed 0 inter-BB deps +2025-09-05T19:19:41Z USER 2007 [ModuleForkPass]: lower_control finished after 0.197 seconds +2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2249mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:41Z USER 2007 [BackendPassManager]: mod_parallel_pass finished after 0.252 seconds +2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: curr_vmrss: 2249mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:41Z USER 2007 [BackendPassManager]: Running nc_parallel_pass +2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: Inputs to nc_parallel_pass: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:41Z USER 2007 [CoreForkPass]: Running dep_reduction +2025-09-05T19:19:41Z INFO 2007 [CoreForkPass]: Inputs to dep_reduction: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:41Z INFO 2007 [DepReduction]: Start Dependency Reduction +2025-09-05T19:19:42Z INFO 2007 [DepReduction]: Processing async instrs... +2025-09-05T19:19:42Z INFO 2007 [DepReduction]: Processing secondary edges per engine... +2025-09-05T19:19:42Z INFO 2007 [DepReduction]: Processing secondary edges per engine, Done. Num edges removed 484672 +2025-09-05T19:19:42Z INFO 2007 [DepReduction]: Processing redundant descendants, Done. Num edges removed 497220 +2025-09-05T19:19:42Z INFO 2007 [DepReduction]: Processing async instrs, Done. Num edges removed 497220 +2025-09-05T19:19:43Z INFO 2007 [DepReduction]: Num Async removed: 0 +2025-09-05T19:19:43Z INFO 2007 [DepReduction]: Finished dependency reduction: 1173793 removed, new total 115343 +2025-09-05T19:19:43Z INFO 2007 [DepReduction]: Finished Dependency Reduction +2025-09-05T19:19:43Z USER 2007 [CoreForkPass]: dep_reduction finished after 1.714 seconds +2025-09-05T19:19:43Z INFO 2007 [CoreForkPass]: curr_vmrss: 2263mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:43Z INFO 2007 [CoreForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:43Z USER 2007 [CoreForkPass]: Running lower_dynamic_dma +2025-09-05T19:19:43Z INFO 2007 [CoreForkPass]: Inputs to lower_dynamic_dma: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:43Z USER 2007 [CoreForkPass]: lower_dynamic_dma finished after 0.048 seconds +2025-09-05T19:19:43Z INFO 2007 [CoreForkPass]: curr_vmrss: 2263mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:43Z INFO 2007 [CoreForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:43Z USER 2007 [CoreForkPass]: Running legalize_dynamic_dma +2025-09-05T19:19:43Z INFO 2007 [CoreForkPass]: Inputs to legalize_dynamic_dma: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:43Z INFO 2007 [LegalizeDynamicDMA]: Legalize Dynamic DMA scanned 0 DGE instructions +2025-09-05T19:19:43Z INFO 2007 [LegalizeDynamicDMA]: After Legalize Dynamic DMA, 0 DGE instructions were scanned +2025-09-05T19:19:43Z INFO 2007 [LegalizeDynamicDMA]: +┌───────────┬───────────────────────────────┬────────────────────────────┐ +│ Sub-Pass │ Illegal Instructions Detected │ New Instructions Generated │ +├───────────┼───────────────────────────────┼────────────────────────────┤ +│ Peeling │ 0 │ 0 │ +│ Unrolling │ 0 │ 0 │ +│ Splitting │ 0 │ 0 │ +└───────────┴───────────────────────────────┴────────────────────────────┘ + +2025-09-05T19:19:43Z USER 2007 [CoreForkPass]: legalize_dynamic_dma finished after 0.100 seconds +2025-09-05T19:19:43Z INFO 2007 [CoreForkPass]: curr_vmrss: 2263mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:43Z INFO 2007 [CoreForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:43Z USER 2007 [CoreForkPass]: Running lower_dma +2025-09-05T19:19:43Z INFO 2007 [CoreForkPass]: Inputs to lower_dma: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:43Z INFO 2007 [LowerDMA]: lower_dma metrics start + IO + Copy (DGE/DMA) + 128 partition : 14145/14145 (100% DGE) + power-of-2 partition : 14210/14210 (100% DGE) + > 3 dimensional : 0/0 + non-integer desc size : 0/0 + total : 14210/14210 (100% DGE) + Cast (DGE/DMA) + 128 partition : 0/0 + power-of-2 partition : 0/0 + > 3 dimensional : 0/0 + non-integer desc size : 0/0 + total : 0/0 + Spill/Reload + Copy (DGE/DMA) + 128 partition : 0/1 (0% DGE) + power-of-2 partition : 0/1 (0% DGE) + > 3 dimensional : 0/0 + non-integer desc size : 0/0 + total : 0/1 (0% DGE) + Cast (DGE/DMA) + 128 partition : 0/0 + power-of-2 partition : 0/0 + > 3 dimensional : 0/0 + non-integer desc size : 0/0 + total : 0/0 + CopyMode + CCE : 0 + Transpose : 0 + Replicate : 0 + Dynamic (DGE/DMA) + scalar : 0/0 + vector : 0/0 + Opcode + ReadVarAddr : 0 + IndirectLoad : 0 + IndirectSave : 0 + IndirectSaveAccumulate : 0 + DstReduceDGE : 0 +lower_dma metrics end +2025-09-05T19:19:43Z USER 2007 [CoreForkPass]: lower_dma finished after 0.145 seconds +2025-09-05T19:19:43Z INFO 2007 [CoreForkPass]: curr_vmrss: 2263mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:43Z INFO 2007 [CoreForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285710 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:43Z USER 2007 [CoreForkPass]: Running coalesce_dma_blocks +2025-09-05T19:19:43Z INFO 2007 [CoreForkPass]: Inputs to coalesce_dma_blocks: modules=1 functions=1 allocs=69132 blocks=1 instructions=285710 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:44Z INFO 2007 [CoalesceDmaBlocks]: Coaleseced 0 DMA triggers +2025-09-05T19:19:44Z USER 2007 [CoreForkPass]: coalesce_dma_blocks finished after 0.133 seconds +2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: curr_vmrss: 2264mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285710 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:44Z USER 2007 [CoreForkPass]: Running expand_all_engine +2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: Inputs to expand_all_engine: modules=1 functions=1 allocs=69132 blocks=1 instructions=285710 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:44Z USER 2007 [CoreForkPass]: expand_all_engine finished after 0.050 seconds +2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: curr_vmrss: 2264mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285710 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:44Z USER 2007 [CoreForkPass]: Running alloc_semaphores +2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: Inputs to alloc_semaphores: modules=1 functions=1 allocs=69132 blocks=1 instructions=285710 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:44Z USER 2007 [CoreForkPass]: alloc_semaphores finished after 0.242 seconds +2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: curr_vmrss: 2264mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285710 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:44Z USER 2007 [CoreForkPass]: Running expand_inst_late +2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: Inputs to expand_inst_late: modules=1 functions=1 allocs=69132 blocks=1 instructions=285710 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:44Z USER 2007 [CoreForkPass]: expand_inst_late finished after 0.262 seconds +2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: curr_vmrss: 2264mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285710 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:44Z USER 2007 [CoreForkPass]: Running seq_inst_opt +2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: Inputs to seq_inst_opt: modules=1 functions=1 allocs=69132 blocks=1 instructions=285710 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:44Z INFO 2007 [SeqInstOpt]: Removing 0 unnecessary InstRegisterMove instruction(s) from Block1 +2025-09-05T19:19:44Z USER 2007 [CoreForkPass]: seq_inst_opt finished after 0.027 seconds +2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: curr_vmrss: 2264mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285710 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:44Z USER 2007 [CoreForkPass]: Running lower_sync +2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: Inputs to lower_sync: modules=1 functions=1 allocs=69132 blocks=1 instructions=285710 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:44Z USER 2007 [CoreForkPass]: lower_sync finished after 0.102 seconds +2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: curr_vmrss: 2264mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 300967 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:44Z USER 2007 [CoreForkPass]: Running lower_act +2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: Inputs to lower_act: modules=1 functions=1 allocs=69132 blocks=1 instructions=300967 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:44Z USER 2007 [CoreForkPass]: lower_act finished after 0.036 seconds +2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: curr_vmrss: 2264mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 300968 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:44Z USER 2007 [CoreForkPass]: Running lower_dve +2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: Inputs to lower_dve: modules=1 functions=1 allocs=69132 blocks=1 instructions=300968 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:44Z INFO 2007 [LowerDVE]: Loading DVE opcodes table dve_info.json from /opt/conda/lib/python3.10/site-packages/neuronxcc/dve/dve_bin_gen2/dve_info.json +2025-09-05T19:19:45Z USER 2007 [CoreForkPass]: lower_dve finished after 0.303 seconds +2025-09-05T19:19:45Z INFO 2007 [CoreForkPass]: curr_vmrss: 2296mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:45Z INFO 2007 [CoreForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 300968 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:45Z USER 2007 [CoreForkPass]: Running lower_ap +2025-09-05T19:19:45Z INFO 2007 [CoreForkPass]: Inputs to lower_ap: modules=1 functions=1 allocs=69132 blocks=1 instructions=300968 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:45Z USER 2007 [CoreForkPass]: lower_ap finished after 0.066 seconds +2025-09-05T19:19:45Z INFO 2007 [CoreForkPass]: curr_vmrss: 2177mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:45Z INFO 2007 [CoreForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 300968 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:45Z USER 2007 [CoreForkPass]: Running coloring_allocator_reg +2025-09-05T19:19:45Z INFO 2007 [CoreForkPass]: Inputs to coloring_allocator_reg: modules=1 functions=1 allocs=69132 blocks=1 instructions=300968 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:45Z INFO 2007 [ColoringAllocator::Rep]: Allocating functions +2025-09-05T19:19:45Z INFO 2007 [ColoringAllocator::Rep]: linearize and check +2025-09-05T19:19:45Z INFO 2007 [REG_Allocator]: allocating REG +2025-09-05T19:19:45Z INFO 2007 [REG_Allocator]: main loop iteration 1 +2025-09-05T19:19:45Z USER 2007 [CoreForkPass]: coloring_allocator_reg finished after 0.044 seconds +2025-09-05T19:19:45Z INFO 2007 [CoreForkPass]: curr_vmrss: 2190mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:45Z INFO 2007 [CoreForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 300968 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:45Z USER 2007 [BackendPassManager]: nc_parallel_pass finished after 3.389 seconds +2025-09-05T19:19:45Z INFO 2007 [BackendPassManager]: curr_vmrss: 2190mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:45Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 300968 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:45Z USER 2007 [BackendPassManager]: Running mod_parallel_pass +2025-09-05T19:19:45Z INFO 2007 [BackendPassManager]: Inputs to mod_parallel_pass: modules=1 functions=1 allocs=69132 blocks=1 instructions=300968 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:45Z USER 2007 [ModuleForkPass]: Running birverifier +2025-09-05T19:19:45Z INFO 2007 [ModuleForkPass]: Inputs to birverifier: modules=1 functions=1 allocs=69132 blocks=1 instructions=300968 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:45Z USER 2007 [ModuleForkPass]: birverifier finished after 0.254 seconds +2025-09-05T19:19:45Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2190mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:45Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 300968 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:45Z USER 2007 [BackendPassManager]: mod_parallel_pass finished after 0.261 seconds +2025-09-05T19:19:45Z INFO 2007 [BackendPassManager]: curr_vmrss: 2190mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:45Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 300968 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:45Z USER 2007 [BackendPassManager]: Running subgraph_parallel_pass +2025-09-05T19:19:45Z INFO 2007 [BackendPassManager]: Inputs to subgraph_parallel_pass: modules=1 functions=1 allocs=69132 blocks=1 instructions=300968 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:45Z USER 2007 [SubgraphForkPass]: Running lnc_verifier +2025-09-05T19:19:45Z INFO 2007 [SubgraphForkPass]: Inputs to lnc_verifier: modules=1 functions=1 allocs=69132 blocks=1 instructions=300968 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:45Z USER 2007 [SubgraphForkPass]: lnc_verifier finished after 0.003 seconds +2025-09-05T19:19:45Z INFO 2007 [SubgraphForkPass]: curr_vmrss: 2190mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:45Z INFO 2007 [SubgraphForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 300968 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:45Z USER 2007 [BackendPassManager]: subgraph_parallel_pass finished after 0.010 seconds +2025-09-05T19:19:45Z INFO 2007 [BackendPassManager]: curr_vmrss: 2190mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:45Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 300968 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:45Z USER 2007 [BackendPassManager]: Running mod_parallel_pass +2025-09-05T19:19:45Z INFO 2007 [BackendPassManager]: Inputs to mod_parallel_pass: modules=1 functions=1 allocs=69132 blocks=1 instructions=300968 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:45Z USER 2007 [ModuleForkPass]: Running codegen +2025-09-05T19:19:45Z INFO 2007 [ModuleForkPass]: Inputs to codegen: modules=1 functions=1 allocs=69132 blocks=1 instructions=300968 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:45Z INFO 2007 [Codegen]: Total compiler allocated DRAM tensors: 0 GB +2025-09-05T19:19:45Z INFO 2007 [Codegen]: Total un-allocated DRAM tensors by kind: +2025-09-05T19:19:45Z INFO 2007 [Codegen]: +┌───────────────┬─────────────┐ +│ TensorKind │ Size (GB) │ +├───────────────┼─────────────┤ +│ ExternalInput │ 6.7505 │ +│ Const │ 3.05176e-05 │ +└───────────────┴─────────────┘ + +2025-09-05T19:19:45Z INFO 2007 [Codegen]: Total runtime managed DRAM tensors: 6.75053 GB +2025-09-05T19:19:46Z INFO 2007 [Codegen]: Instruction Stats: +2025-09-05T19:19:46Z INFO 2007 [Codegen]: +┌─────────────────────┬────────┐ +│ Opcode │ Count │ +├─────────────────────┼────────┤ +│ LDWEIGHTS │ 217153 │ +│ MATMUL │ 217153 │ +│ ACTIVATE │ 54337 │ +│ EVENT_SEMAPHORE │ 15257 │ +│ UNKNOWN(0xd4) │ 14210 │ +│ NOP │ 8 │ +│ PSEUDO_BRANCH_LABEL │ 5 │ +│ ACT_TABLE_LOAD │ 1 │ +│ PSEUDO_DMA_TRIGGER │ 1 │ +└─────────────────────┴────────┘ + +2025-09-05T19:19:46Z INFO 2007 [Codegen]: +┌────────────┬────────┐ +│ Engine │ Count │ +├────────────┼────────┤ +│ Unassigned │ 0 │ +│ GPSIMD │ 20924 │ +│ Scalar │ 56947 │ +│ Tensor │ 440254 │ +│ SyncDMA │ 0 │ +│ Vector │ 2 │ +│ Sync │ 3 │ +│ All │ 0 │ +└────────────┴────────┘ + +2025-09-05T19:19:46Z INFO 2007 [Codegen]: Total instructions: 518130 (0.030883 GB) +2025-09-05T19:19:46Z INFO 2007 [Codegen]: Total DynamicDMA instruction count: 14210 +2025-09-05T19:19:46Z USER 2007 [Codegen]: isa_gen finished after 1.105 seconds +2025-09-05T19:19:46Z INFO 2007 [Codegen]: Number of DMA descriptors on each queue instance: +┌─────────────────┬────────────────┐ +│ Queue Instance │ RT Descriptors │ +├─────────────────┼────────────────┤ +│ qSPSpillReload0 │ 256 │ +└─────────────────┴────────────────┘ + +Total descriptors: 256 (3.8147e-06 GB) +2025-09-05T19:19:46Z INFO 2007 [Codegen]: Number of DMA engines used by each queue: +┌─────────────────┬─────────────────────┐ +│ Queue │ DMA Engines │ +├─────────────────┼─────────────────────┤ +│ qSPSpillReload0 │ 16 │ +│ qPoolDynamic │ 16 │ +├─────────────────┼─────────────────────┤ +│ TOTAL │ 32 (must be <= 176) │ +└─────────────────┴─────────────────────┘ + +2025-09-05T19:19:46Z INFO 2007 [Codegen]: Tensors with largest descriptor count: +┌──────────────────────┬──────────┬──────────┬──────────────────┐ +│ Tensor Name │ Kind │ Src Type │ Descriptor Count │ +├──────────────────────┼──────────┼──────────┼──────────────────┤ +│ identity_local_22027 │ Internal │ bfloat16 │ 1 │ +│ identity_22025 │ Const │ bfloat16 │ 1 │ +└──────────────────────┴──────────┴──────────┴──────────────────┘ + +2025-09-05T19:19:46Z USER 2007 [Codegen]: dma_desc_gen finished after 0.004 seconds +2025-09-05T19:19:46Z INFO 2007 [Codegen]: Estimated peak DRAM usage: 6.78141 GB +2025-09-05T19:19:46Z INFO 2007 [Codegen]: Generating debug info +2025-09-05T19:19:47Z USER 2007 [Codegen]: debug_info_gen finished after 0.568 seconds +2025-09-05T19:19:47Z USER 2007 [ModuleForkPass]: codegen finished after 1.742 seconds +2025-09-05T19:19:47Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2385mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:47Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 300968 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:47Z USER 2007 [BackendPassManager]: mod_parallel_pass finished after 1.775 seconds +2025-09-05T19:19:47Z INFO 2007 [BackendPassManager]: curr_vmrss: 2204mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:47Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 300968 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:47Z USER 2007 [BackendPassManager]: Running neff_packager +2025-09-05T19:19:47Z INFO 2007 [BackendPassManager]: Inputs to neff_packager: modules=1 functions=1 allocs=69132 blocks=1 instructions=300968 Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:47Z WARNING 2007 [NeffFileWriter]: writeKelp missing file /local/p4clients/pkgbuild-const/workspace/build/KaenaCompiler/KaenaCompiler-2.x.169490.0/AL2_x86_64/DEV.STD.PTHREAD/build/private/_skbuild/linux-x86_64-3.10/cmake-build/neuronxcc/walrus/neff_packager/MetricMetadata.json +2025-09-05T19:19:47Z INFO 2007 [NeffFileWriter]: Neff will be written to: /models/mistral-7b-v0.3-instruct-neuronx/layout_opt/graph.neff +2025-09-05T19:19:47Z INFO 2007 [NeffFileWriter]: IR signature: b43fb3a95505587f066b6d2d419d7150 for neff artifacts +2025-09-05T19:19:47Z USER 2007 [BackendPassManager]: neff_packager finished after 0.299 seconds +2025-09-05T19:19:47Z INFO 2007 [BackendPassManager]: curr_vmrss: 2205mb, ru_maxrss: 2541mb (delta=0mb) +2025-09-05T19:19:47Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 300968 instruction(s). Max writers: 128 Max Readers: 217153 +2025-09-05T19:19:47Z INFO 2007 [BackendDriver]: HBM scratchpad usage summary (post-allocation): +┌──────┬───────────┬────────────────────────────────────────────────────────────┬─────────────┐ +│ Core │ Subgraph │ Description │ Value │ +├──────┼───────────┼────────────────────────────────────────────────────────────┼─────────────┤ +│ nc00 │ module │ Peak scratchpad usage: local │ 0.000000 GB │ +│ nc00 │ module │ Total size of allocated tensors: local │ 0.000000 GB │ +│ nc00 │ Max │ Peak scratchpad usage: local │ 0.000000 GB │ +│ nc00 │ Post-link │ Peak scratchpad usage after intermediate tensor allocation │ 0.000000 GB │ +│ nc00 │ Post-link │ Total size of allocated intermediate tensors │ 0.000000 GB │ +├──────┼───────────┼────────────────────────────────────────────────────────────┼─────────────┤ +│ Max │ Max │ Peak scratchpad usage │ 0.000000 GB │ +│ Max │ Max │ Peak scratchpad usage (page-aligned) │ 0.000000 GB │ +└──────┴───────────┴──────────────────────────���─────────────────────────────────┴─────────────┘ + +2025-09-05T19:19:47Z INFO 2007 [BackendDriver]: Backend completed successfully, tearing down. +2025-09-05T19:19:48Z INFO 1971 [job.WalrusDriver.0]: Job #0 finished +2025-09-05T19:19:48Z INFO 1971 [pipeline.Pipeline.0]: Finished job job.WalrusDriver.0 +2025-09-05T19:19:48Z INFO 1971 [pipeline.Pipeline.0]: Starting job job.BIRLinker.0 +2025-09-05T19:19:48Z INFO 1971 [job.BIRLinker.0]: Replay this job by calling: /opt/conda/bin/neuronx-cc compile --framework XLA --state '{"model": ["/models/mistral-7b-v0.3-instruct-neuronx/layout_opt/model/graph.hlo"], "tensormap": "tensor_map.json", "bir": "bir.json", "lorean_sg_key": null, "input_name_map": null, "output_name_map": null, "constant_tensors": null, "state_dir": "/opt/vllm/neuronxcc-gss0wi_w/sg00", "state_id": "sg00"}' --pipeline BIRLinker +2025-09-05T19:19:48Z INFO 1971 [job.BIRLinker.0]: BIRLinker cwd: /opt/vllm/neuronxcc-gss0wi_w +2025-09-05T19:19:48Z INFO 1971 [job.BIRLinker.0]: Linking not needed. Netlist doesnt exist +2025-09-05T19:19:48Z INFO 1971 [pipeline.Pipeline.0]: Finished job job.BIRLinker.0 +2025-09-05T19:19:48Z INFO 1971 [pipeline.Pipeline.0]: Starting job job.Kelper.0 +2025-09-05T19:19:48Z INFO 1971 [job.Kelper.0]: Skipping neff generation which was already performed by neff_packager +2025-09-05T19:19:48Z INFO 1971 [pipeline.Pipeline.0]: Finished job job.Kelper.0 +2025-09-05T19:19:48Z INFO 1971 [pipeline.Pipeline.0]: Starting job job.NeffWrapper.0 +2025-09-05T19:19:48Z INFO 1971 [job.NeffWrapper.0]: Job NeffWrapper len(in_states) 1 +2025-09-05T19:19:48Z INFO 1971 [job.NeffWrapper.0]: Processing input #0 +2025-09-05T19:19:48Z INFO 1971 [job.NeffWrapper.0]: Start NeffWrapper +2025-09-05T19:19:48Z INFO 1971 [job.NeffWrapper.0]: Executing: /opt/conda/lib/python3.10/site-packages/neuronxcc/starfish/bin/hlo-neff-wrapper --hlo /models/mistral-7b-v0.3-instruct-neuronx/layout_opt/model/graph.hlo --neff /models/mistral-7b-v0.3-instruct-neuronx/layout_opt/graph.neff --io_transposes /opt/vllm/neuronxcc-gss0wi_w/io_transposes.json --output /models/mistral-7b-v0.3-instruct-neuronx/layout_opt/wrapped_neff.hlo --netlist /opt/vllm/neuronxcc-gss0wi_w/hlo_netlist.json +2025-09-05T19:19:48Z INFO 1971 [job.NeffWrapper.0]: Could not open file: /opt/vllm/neuronxcc-gss0wi_w/hlo_netlist.json +There are no io transposes nor zero-sized parameters. Output will not be produced. +Hlo neff wrapper finished successfully. Have a wonderful day :D + +2025-09-05T19:19:48Z INFO 1971 [job.NeffWrapper.0]: Job #0 finished +2025-09-05T19:19:48Z INFO 1971 [pipeline.Pipeline.0]: Finished job job.NeffWrapper.0 +2025-09-05T19:19:48Z INFO 1971 [pipeline.Pipeline.0]: Finished pipeline Pipeline +2025-09-05T19:19:48Z INFO 1971 [pipeline.Pipeline.0]: Job #0 finished +2025-09-05T19:19:48Z INFO 1939 [root]: Subcommand returned with exitcode=0