File size: 210,358 Bytes
2025-08-09T18:44:43Z INFO 67541 [root]: /opt/aws_neuronx_venv_pytorch_2_7_nxd_inference/bin/neuronx-cc compile /home/ubuntu/qwen3/layout_opt/model/graph.hlo --framework XLA --target trn1 --output /home/ubuntu/qwen3/layout_opt/graph.neff --model-type=transformer -O1 --lnc=1 '--internal-hlo2tensorizer-options=--experimental-unsafe-fp8e4m3fn-as-fp8e4m3 --verify-hlo=false' --logfile=/home/ubuntu/qwen3/layout_opt/log-neuron-cc.txt --verbose=35
2025-08-09T18:44:43Z INFO 67541 [root]: NeuronX Compiler version 2.20.9961.0+0acef03a
Python version 3.10.12
HWM version 2.20.0.9961+0acef03a
NumPy version 1.26.4
Running on AMI ami-040348201d80b58ad
Running in region usw2-az4
2025-08-09T18:44:43Z INFO 67605 [root]: XLA detected
2025-08-09T18:44:43Z INFO 67605 [root]: Pipeline: HLOToTensorizer Frontend StaticIOTranspose WalrusDriver BIRLinker Kelper NeffWrapper
2025-08-09T18:44:44Z INFO 67605 [root]: Intermediate files stored in /home/ubuntu/neuronxcc-mk9kpjyq, output in /home/ubuntu
2025-08-09T18:44:44Z INFO 67605 [pipeline.Pipeline.0]: Job Pipeline len(in_states) 1
2025-08-09T18:44:44Z INFO 67605 [pipeline.Pipeline.0]: Processing input #0
2025-08-09T18:44:44Z INFO 67605 [pipeline.Pipeline.0]: Running pipeline Pipeline.0
2025-08-09T18:44:44Z INFO 67605 [pipeline.Pipeline.0]: Starting job job.HLOToTensorizer.0
2025-08-09T18:44:44Z INFO 67605 [job.HLOToTensorizer.0]: Job HLOToTensorizer len(in_states) 1
2025-08-09T18:44:44Z INFO 67605 [job.HLOToTensorizer.0]: Processing input #0
2025-08-09T18:44:44Z INFO 67605 [job.HLOToTensorizer.0]: IR signature: 12b45b028e502b2dd8c42c1287fbdbea434454143a30d473806853bc18673d98 for graph.hlo
2025-08-09T18:44:44Z INFO 67605 [job.HLOToTensorizer.0]: Executing: /opt/aws_neuronx_venv_pytorch_2_7_nxd_inference/lib/python3.10/site-packages/neuronxcc/starfish/bin/hlo2penguin --input /home/ubuntu/qwen3/layout_opt/model/graph.hlo --out-dir ./ --output penguin.py --remat --max-costly-ops=2 --max-live-in-size=5 --max-remat-chain-size=10 --max-mem-multiple=1.8 --min-def-use-distance=500 --remat-policy=transformer --allow-same-pass-remat=true --layers-per-module=1 --partition --emit-tensor-level-dropout-ops --experimental-unsafe-fp8e4m3fn-as-fp8e4m3 --verify-hlo=false --native-to-custom-softmax --partitioner-opts='--transformer'
2025-08-09T18:44:44Z INFO 67605 [job.HLOToTensorizer.0]: DEBUG: needsModular? No. macCnt 0 num non-trivial Ops 325
INFO: Switching to single-module compile. PrePartitionPipe skipped.
INFO: Found memory bound graph
INFO: Number of Native SoftmaxDx's detected and replaced: 0
INFO: Number of Native Softmax's detected and replaced: 0
Replaced 0 dropout sequences with OffloadedDropout
INFO: HloMacCount has found 0
INFO: Traffic has found 8191043584
INFO: AIF 0
HLO Ops used in computation: parameter reshape transpose tuple
Warning: Could not open file debug_info_hlo_partitions.json
2025-08-09 18:44:44.153865: W hilo/hlo2penguin/utils/DumpDebugInfo.cc:52] Truncating long HLO operator name %last = tuple(%p76, %transpose.325, %transpose.326, %transpose.327, %p80, %transpose.328, %p82, %transpose.329, %transpose.330, %transpose.331, %transpose.332, %transpose.333, %transpose.334, %transpose.335, %transpose.336, %p91, %transpose.337, %p93, %transpose.338, %transpose.339, %transpose.340, %transpose.341, %transpose.342, %transpose.343, %transpose.344, %transpose.345, %p102, %transpose.346, %p104, %transpose.347, %transpose.348, %transpose.349, %transpose.350, %transpose.351, %transpose.352, %tr... to 512 characters in the compiler's debug metadata
Invoking RemoveOptimizationBarriers pass
2025-08-09T18:44:44Z INFO 67605 [job.HLOToTensorizer.0]: IR signature: 5bb2cda84f89e3e556843403ea05d6d67130299dc9a1fbfc964c0d386a78e543 for sg0000/HLOToTensorizer
2025-08-09T18:44:44Z INFO 67605 [job.HLOToTensorizer.0]: Job #0 finished
2025-08-09T18:44:44Z INFO 67605 [pipeline.Pipeline.0]: Finished job job.HLOToTensorizer.0
2025-08-09T18:44:44Z INFO 67605 [pipeline.Pipeline.0]: Starting job job.Frontend.0
2025-08-09T18:44:44Z INFO 67605 [job.Frontend.0]: Job Frontend len(in_states) 1
2025-08-09T18:44:44Z INFO 67605 [job.Frontend.0]: Processing input #0
2025-08-09T18:44:44Z INFO 67605 [job.Frontend.0]: Start model loading
2025-08-09T18:44:44Z INFO 67605 [job.Frontend.0]: Start tensorization
2025-08-09T18:44:44Z INFO 67605 [job.Frontend.0]: Num jobs: 128
2025-08-09T18:44:44Z USER 67605 [root/Tensorizer/Tensorizer]: Running Tensorizer
2025-08-09T18:44:44Z INFO 67605 [Tensorizer]: Frontend did not find netlist info. Switching to flat flow.
2025-08-09T18:44:44Z INFO 67605 [Tensorizer]: Building model from Penguin script "penguin.py"...
2025-08-09T18:44:44Z INFO 67605 [Tensorizer]: Tensorizer options: --run-pg-layout-and-tiling --enable-dse-after-mask-propagation --disable-concat-delinearizer --num-neuroncores-per-sengine=1 --num-neuroncores-per-sengine=1 --internal_dynamic_dma_scratch_size_per_partition=16384 --disable-bitcasted-transpose --dont-verify-after-all --fp32-cast=matmult-bf16 --mm-transpose-type=fp32 --disable-expensive-checks --disable-max-stride-tiling --hbm-scratchpad-page-size-in-bytes=536870912 --enable-replication --max-local-tensor-tile-size-in-bytes=32768 --tensor-layout-p-order=0 --tensor-layout-b-order=1 --enable-advanced-delinearization --weight-coalescing-threshold=512 --enable-bir-converter=enable --enable-tritium-loopfusion --enable-softmax-kernel --model-type-transformer --enable-isl-in-injective-check --enable-dge-on-io-dma --enable-dge-on-indirect-dma --enable-dge-on-vector-indirect-dma --keep-rng-tensor-op
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/DoNothing]: Running DoNothing
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/DoNothing]: Finished (changed=True)
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/DoNothing]: DoNothing finished after 0.000 seconds
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/LegalizeOpLevelAlias]: Running LegalizeOpLevelAlias
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/LegalizeOpLevelAlias]: Finished (changed=False)
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/LegalizeOpLevelAlias]: LegalizeOpLevelAlias finished after 0.004 seconds
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/OptimizeAliasedCopyChain]: Running OptimizeAliasedCopyChain
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/OptimizeAliasedCopyChain]: Finished (changed=False)
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/OptimizeAliasedCopyChain]: OptimizeAliasedCopyChain finished after 0.006 seconds
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/AliasDependencyInduction]: Running AliasDependencyInduction
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/AliasDependencyInduction]: Finished (changed=True)
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/AliasDependencyInduction]: AliasDependencyInduction finished after 0.037 seconds
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/TransformConvOp]: Running TransformConvOp
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/TransformConvOp]: Finished (changed=False)
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/TransformConvOp]: TransformConvOp finished after 0.014 seconds
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/LowerTensorOp]: Running LowerTensorOp
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/LowerTensorOp]: Finished (changed=False)
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/LowerTensorOp]: LowerTensorOp finished after 0.005 seconds
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/AliasDependencyReset]: Running AliasDependencyReset
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/AliasDependencyElimination]: Running AliasDependencyElimination
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/AliasDependencyElimination]: Finished (changed=True)
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/AliasDependencyElimination]: AliasDependencyElimination finished after 0.003 seconds
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/AliasDependencyInduction]: Running AliasDependencyInduction
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/AliasDependencyInduction]: Finished (changed=True)
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/AliasDependencyInduction]: AliasDependencyInduction finished after 0.037 seconds
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/AliasDependencyReset]: AliasDependencyReset finished after 0.049 seconds
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/TensorOpSimplifier]: Running TensorOpSimplifier
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/TensorOpSimplifier]: Finished (changed=False)
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/TensorOpSimplifier]: TensorOpSimplifier finished after 0.019 seconds
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/CanonicalizeIR]: Running CanonicalizeIR
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/CanonicalizeIR]: Finished (changed=False)
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/CanonicalizeIR]: CanonicalizeIR finished after 0.004 seconds
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/LegalizeCCOpLayout]: Running LegalizeCCOpLayout
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/LegalizeCCOpLayout]: Finished (changed=False)
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/LegalizeCCOpLayout]: LegalizeCCOpLayout finished after 0.005 seconds
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/ResolveComplicatePredicates]: Running ResolveComplicatePredicates
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/ResolveComplicatePredicates]: Finished (changed=False)
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/ResolveComplicatePredicates]: ResolveComplicatePredicates finished after 0.004 seconds
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/AffinePredicateResolution]: Running AffinePredicateResolution
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/AffinePredicateResolution]: Finished (changed=False)
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/AffinePredicateResolution]: AffinePredicateResolution finished after 0.004 seconds
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/EliminateDivs]: Running EliminateDivs
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/EliminateDivs]: Finished (changed=False)
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/EliminateDivs]: EliminateDivs finished after 0.004 seconds
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/PerfectLoopNest]: Running PerfectLoopNest
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/PerfectLoopNest]: Finished (changed=False)
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/PerfectLoopNest]: PerfectLoopNest finished after 0.004 seconds
2025-08-09T18:44:44Z INFO 67605 [sg0000/Tensorizer/Simplifier]: Running Simplifier
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Simplifier]: Finished (changed=True)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Simplifier]: Simplifier finished after 0.070 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/GenericAccessSimplifier]: Running GenericAccessSimplifier
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/GenericAccessSimplifier]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/GenericAccessSimplifier]: GenericAccessSimplifier finished after 0.004 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/TCTransform]: Running TCTransform
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/TCTransform]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/TCTransform]: TCTransform finished after 0.004 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/CommuteConcat]: Running CommuteConcat
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/CommuteConcat]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/CommuteConcat]: CommuteConcat finished after 0.004 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/ExpandBatchNorm]: Running ExpandBatchNorm
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/ExpandBatchNorm]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/ExpandBatchNorm]: ExpandBatchNorm finished after 0.007 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/TCTransform]: Running TCTransform
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/TCTransform]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/TCTransform]: TCTransform finished after 0.004 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/EliminateDivs]: Running EliminateDivs
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/EliminateDivs]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/EliminateDivs]: EliminateDivs finished after 0.004 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/GenericAccessSimplifier]: Running GenericAccessSimplifier
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/GenericAccessSimplifier]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/GenericAccessSimplifier]: GenericAccessSimplifier finished after 0.004 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/TensorOpTransform]: Running TensorOpTransform
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/TensorOpTransform]: Finished (changed=True)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/TensorOpTransform]: TensorOpTransform finished after 0.077 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LateLowerTensorOp]: Running LateLowerTensorOp
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LateLowerTensorOp]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LateLowerTensorOp]: LateLowerTensorOp finished after 0.006 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/AliasDependencyReset]: Running AliasDependencyReset
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/AliasDependencyElimination]: Running AliasDependencyElimination
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/AliasDependencyElimination]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/AliasDependencyElimination]: AliasDependencyElimination finished after 0.000 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/AliasDependencyInduction]: Running AliasDependencyInduction
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/AliasDependencyInduction]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/AliasDependencyInduction]: AliasDependencyInduction finished after 0.007 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/AliasDependencyReset]: AliasDependencyReset finished after 0.014 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/MemcpyElimination]: Running MemcpyElimination
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/MemcpyElimination]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/MemcpyElimination]: MemcpyElimination finished after 0.002 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LoopFusion]: Running LoopFusion
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LoopFusion]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LoopFusion]: LoopFusion finished after 0.003 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Rematerialization]: Running Rematerialization
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Rematerialization]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Rematerialization]: Rematerialization finished after 0.003 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Simplifier]: Running Simplifier
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Simplifier]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Simplifier]: Simplifier finished after 0.001 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Delinearization]: Running Delinearization
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Delinearization]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Delinearization]: Delinearization finished after 0.002 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/DeadStoreElimination]: Running DeadStoreElimination
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/DeadStoreElimination]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/DeadStoreElimination]: DeadStoreElimination finished after 0.002 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Simplifier]: Running Simplifier
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Simplifier]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Simplifier]: Simplifier finished after 0.001 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LICM]: Running LICM
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LICM]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LICM]: LICM finished after 0.002 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Delinearization]: Running Delinearization
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Delinearization]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Delinearization]: Delinearization finished after 0.002 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LoopFusion]: Running LoopFusion
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LoopFusion]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LoopFusion]: LoopFusion finished after 0.003 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/SimplifySlice]: Running SimplifySlice
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/SimplifySlice]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/SimplifySlice]: SimplifySlice finished after 0.002 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LICM]: Running LICM
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LICM]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LICM]: LICM finished after 0.001 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Simplifier]: Running Simplifier
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Simplifier]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Simplifier]: Simplifier finished after 0.002 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/ValueNumbering]: Running ValueNumbering
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/ValueNumbering]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/ValueNumbering]: ValueNumbering finished after 0.001 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LICM]: Running LICM
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LICM]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LICM]: LICM finished after 0.002 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/PadElimination]: Running PadElimination
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/PadElimination]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/PadElimination]: PadElimination finished after 0.001 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Delinearization]: Running Delinearization
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Delinearization]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Delinearization]: Delinearization finished after 0.002 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LoopFusion]: Running LoopFusion
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LoopFusion]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LoopFusion]: LoopFusion finished after 0.003 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/GenericAccessSimplifier]: Running GenericAccessSimplifier
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/GenericAccessSimplifier]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/GenericAccessSimplifier]: GenericAccessSimplifier finished after 0.002 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Simplifier]: Running Simplifier
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Simplifier]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Simplifier]: Simplifier finished after 0.002 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LICM]: Running LICM
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LICM]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LICM]: LICM finished after 0.001 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/ValueNumbering]: Running ValueNumbering
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/ValueNumbering]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/ValueNumbering]: ValueNumbering finished after 0.001 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/TCTransform]: Running TCTransform
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/TCTransform]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/TCTransform]: TCTransform finished after 0.001 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/CommuteConcat]: Running CommuteConcat
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/CommuteConcat]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/CommuteConcat]: CommuteConcat finished after 0.001 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/RecognizeOpIdiom]: Running RecognizeOpIdiom
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/RecognizeOpIdiom]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/RecognizeOpIdiom]: RecognizeOpIdiom finished after 0.002 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/MaskPropagation]: Running MaskPropagation
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/MaskPropagation]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/MaskPropagation]: MaskPropagation finished after 0.003 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/DeadStoreElimination]: Running DeadStoreElimination
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/DeadStoreElimination]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/DeadStoreElimination]: DeadStoreElimination finished after 0.002 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Recompute]: Running Recompute
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Recompute]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Recompute]: Recompute finished after 0.000 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/DeadCodeElimination]: Running DeadCodeElimination
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/DeadCodeElimination]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/DeadCodeElimination]: DeadCodeElimination finished after 0.002 seconds
2025-08-09T18:44:45Z INFO 67605 [Tensorizer]: After optimization: 325 statements
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/DoNothing]: Running DoNothing
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/DoNothing]: Finished (changed=True)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/DoNothing]: DoNothing finished after 0.000 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/MutateDataType]: Running MutateDataType
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/MutateDataType]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/MutateDataType]: MutateDataType finished after 0.003 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/AutoCastTCInputs]: Running AutoCastTCInputs
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/AutoCastTCInputs]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/AutoCastTCInputs]: AutoCastTCInputs finished after 0.004 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/GenericAccessSimplifier]: Running GenericAccessSimplifier
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/GenericAccessSimplifier]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/GenericAccessSimplifier]: GenericAccessSimplifier finished after 0.001 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Simplifier]: Running Simplifier
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Simplifier]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Simplifier]: Simplifier finished after 0.001 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/DelinearIndices]: Running DelinearIndices
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/DelinearIndices]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/DelinearIndices]: DelinearIndices finished after 0.002 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Delinearization]: Running Delinearization
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Delinearization]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Delinearization]: Delinearization finished after 0.002 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/DelinearIndices]: Running DelinearIndices
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/DelinearIndices]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/DelinearIndices]: DelinearIndices finished after 0.003 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/DeadCodeElimination]: Running DeadCodeElimination
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/DeadCodeElimination]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/DeadCodeElimination]: DeadCodeElimination finished after 0.002 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LateLowerReshapeOp]: Running LateLowerReshapeOp
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LateLowerReshapeOp]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LateLowerReshapeOp]: LateLowerReshapeOp finished after 0.004 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/InferIntrinsicOnCC]: Running InferIntrinsicOnCC
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/InferIntrinsicOnCC]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/InferIntrinsicOnCC]: InferIntrinsicOnCC finished after 0.037 seconds
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/ResolveAccessConflict]: Running ResolveAccessConflict
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/ResolveAccessConflict]: Finished (changed=False)
2025-08-09T18:44:45Z INFO 67605 
[sg0000/Tensorizer/ResolveAccessConflict]: ResolveAccessConflict finished after 0.005 seconds 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LICM]: Running LICM 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LICM]: Finished (changed=False) 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LICM]: LICM finished after 0.001 seconds 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LocalLayoutOpt]: Running LocalLayoutOpt 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LocalLayoutOpt]: Finished (changed=False) 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LocalLayoutOpt]: LocalLayoutOpt finished after 0.009 seconds 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/DelinearIndices]: Running DelinearIndices 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/DelinearIndices]: Finished (changed=False) 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/DelinearIndices]: DelinearIndices finished after 0.002 seconds 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/PGLayoutTilingPipeline]: Running PGLayoutTilingPipeline 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LayoutPreprocessingAndAnalysis]: Running LayoutPreprocessingAndAnalysis 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LayoutPreprocessing]: Running LayoutPreprocessing 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Delinearization]: Running Delinearization 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Delinearization]: Finished (changed=False) 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/Delinearization]: Delinearization finished after 0.002 seconds 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LayoutPreprocessing]: Finished (changed=True) 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LayoutPreprocessing]: LayoutPreprocessing finished after 0.022 seconds 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LayoutRequirementAnalysis]: Running LayoutRequirementAnalysis 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LayoutRequirementAnalysis]: 
LayoutRequirementAnalysis finished after 0.006 seconds 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LayoutPreprocessingAndAnalysis]: LayoutPreprocessingAndAnalysis finished after 0.036 seconds 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/InferNonlocalTensors]: Running InferNonlocalTensors 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/InferNonlocalTensors]: prefer_non_broadcast_par: True 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/InferNonlocalTensors]: prefer_non_broadcast_par: True 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/InferNonlocalTensors]: Finished (changed=False) 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/InferNonlocalTensors]: InferNonlocalTensors finished after 0.022 seconds 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/PAGLayoutOpt]: Running PAGLayoutOpt 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/ParAxesAnnotation]: Running ParAxesAnnotation 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LayoutSearchAlgorithm]: prefer_non_broadcast_par: True 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/ParAxesAnnotation]: Finished (changed=True) 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/ParAxesAnnotation]: ParAxesAnnotation finished after 0.044 seconds 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/InsertLocalTransposes]: Running InsertLocalTransposes 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/InsertLocalTransposes]: Finished (changed=True) 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/InsertLocalTransposes]: InsertLocalTransposes finished after 0.005 seconds 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/PAGLayoutOpt]: PAGLayoutOpt finished after 0.059 seconds 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/MaskPropagation]: Running MaskPropagation 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/MaskPropagation]: Finished (changed=False) 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/MaskPropagation]: MaskPropagation finished after 0.003 seconds 2025-08-09T18:44:45Z 
INFO 67605 [sg0000/Tensorizer/CanonicalizeDAGForPGTiling]: Running CanonicalizeDAGForPGTiling 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/CanonicalizeDAGForPGTiling]: Finished (changed=False) 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/CanonicalizeDAGForPGTiling]: CanonicalizeDAGForPGTiling finished after 0.003 seconds 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/PGTiling]: Running PGTiling 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/AGOrderingAnalysisPass]: Running AGOrderingAnalysisPass 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/AGOrderingAnalysisPass]: AGOrderingAnalysisPass finished after 0.029 seconds 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/StaticTransposeLocalTensor]: Running StaticTransposeLocalTensor 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/StaticTransposeLocalTensor]: Finished (changed=False) 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/StaticTransposeLocalTensor]: StaticTransposeLocalTensor finished after 0.003 seconds 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/PComputeCutting]: Running PComputeCutting 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/PComputeCutting]: Finished (changed=False) 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/PComputeCutting]: PComputeCutting finished after 0.008 seconds 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/BFComputeCutting]: Running BFComputeCutting 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/BFComputeCutting]: Finished (changed=False) 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/BFComputeCutting]: BFComputeCutting finished after 0.004 seconds 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LoopSplitting]: Running LoopSplitting 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LoopSplitting]: Finished (changed=False) 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/LoopSplitting]: LoopSplitting finished after 0.001 seconds 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/MacroGeneration]: Running MacroGeneration 
2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/MacroGeneration]: Finished (changed=False) 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/MacroGeneration]: MacroGeneration finished after 0.025 seconds 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/PGTiling]: PGTiling finished after 0.090 seconds 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/InsertIOTransposes]: Running InsertIOTransposes 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/InsertIOTransposes]: Finished (changed=True) 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/InsertIOTransposes]: InsertIOTransposes finished after 0.003 seconds 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/InsertOffloadedTransposes]: Running InsertOffloadedTransposes 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/InsertOffloadedTransposes]: Finished (changed=False) 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/InsertOffloadedTransposes]: InsertOffloadedTransposes finished after 0.001 seconds 2025-08-09T18:44:45Z INFO 67605 [sg0000/Tensorizer/DramToDramTranspose]: Running DramToDramTranspose 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/DramToDramTranspose]: Finished (changed=True) 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/DramToDramTranspose]: DramToDramTranspose finished after 18.575 seconds 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/PGLayoutTilingPipeline]: PGLayoutTilingPipeline finished after 18.825 seconds 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/TilingProfiler]: Running TilingProfiler 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/TilingBottleneck]: 20 MACROS WITH LARGEST INSTRUCTION COUNTS: 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/TilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/TilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/TilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/TilingBottleneck]: 1536: 
transpose_128x128 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/TilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/TilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/TilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/TilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/TilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/TilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/TilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/TilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/TilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/TilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/TilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/TilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/TilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/TilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/TilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/TilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/TilingProfiler]: Finished (changed=False) 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/TilingProfiler]: TilingProfiler finished after 0.208 seconds 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/FlattenMacroLoop]: Running FlattenMacroLoop 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/FlattenMacroLoop]: Finished (changed=True) 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/FlattenMacroLoop]: 
FlattenMacroLoop finished after 0.121 seconds 2025-08-09T18:45:04Z INFO 67605 [sg0000/Tensorizer/InferNeuronTensor]: Running InferNeuronTensor 2025-08-09T18:45:05Z INFO 67605 [sg0000/Tensorizer/InferNeuronTensor]: Finished (changed=True) 2025-08-09T18:45:05Z INFO 67605 [sg0000/Tensorizer/InferNeuronTensor]: InferNeuronTensor finished after 0.824 seconds 2025-08-09T18:45:05Z INFO 67605 [sg0000/Tensorizer/NeuronSimplifier]: Running NeuronSimplifier 2025-08-09T18:45:05Z INFO 67605 [sg0000/Tensorizer/NeuronSimplifier]: Finished (changed=False) 2025-08-09T18:45:05Z INFO 67605 [sg0000/Tensorizer/NeuronSimplifier]: NeuronSimplifier finished after 0.123 seconds 2025-08-09T18:45:05Z INFO 67605 [sg0000/Tensorizer/LICM]: Running LICM 2025-08-09T18:45:05Z INFO 67605 [sg0000/Tensorizer/LICM]: Finished (changed=False) 2025-08-09T18:45:05Z INFO 67605 [sg0000/Tensorizer/LICM]: LICM finished after 0.036 seconds 2025-08-09T18:45:05Z INFO 67605 [sg0000/Tensorizer/RewriteReplicationMatmul]: Running RewriteReplicationMatmul 2025-08-09T18:45:05Z INFO 67605 [sg0000/Tensorizer/RewriteReplicationMatmul]: Finished (changed=False) 2025-08-09T18:45:05Z INFO 67605 [sg0000/Tensorizer/RewriteReplicationMatmul]: RewriteReplicationMatmul finished after 0.029 seconds 2025-08-09T18:45:05Z INFO 67605 [sg0000/Tensorizer/FlattenMacroLoop]: Running FlattenMacroLoop 2025-08-09T18:45:05Z INFO 67605 [sg0000/Tensorizer/FlattenMacroLoop]: Finished (changed=False) 2025-08-09T18:45:05Z INFO 67605 [sg0000/Tensorizer/FlattenMacroLoop]: FlattenMacroLoop finished after 0.088 seconds 2025-08-09T18:45:05Z INFO 67605 [sg0000/Tensorizer/SimplifyMacroPredicates]: Running SimplifyMacroPredicates 2025-08-09T18:45:05Z INFO 67605 [sg0000/Tensorizer/SimplifyMacroPredicates]: Finished (changed=False) 2025-08-09T18:45:05Z INFO 67605 [sg0000/Tensorizer/SimplifyMacroPredicates]: SimplifyMacroPredicates finished after 0.094 seconds 2025-08-09T18:45:05Z INFO 67605 [sg0000/Tensorizer/DataLocalityOpt]: Running DataLocalityOpt 
2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/DataLocalityOpt]: Finished (changed=True) 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/DataLocalityOpt]: DataLocalityOpt finished after 0.189 seconds 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/DMATilingProfiler]: Running DMATilingProfiler 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 20 MACROS WITH LARGEST INSTRUCTION COUNTS: 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:06Z INFO 67605 
[sg0000/Tensorizer/PostDLOTilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1536: transpose_128x128 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/DMATilingProfiler]: Finished (changed=False) 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/DMATilingProfiler]: DMATilingProfiler finished after 0.034 seconds 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/NeuronSimplifier]: Running NeuronSimplifier 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/NeuronSimplifier]: Finished (changed=False) 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/NeuronSimplifier]: NeuronSimplifier finished after 0.130 seconds 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/LegalizeSundaMacro]: Running LegalizeSundaMacro 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/LegalizeSundaMacro]: Finished (changed=False) 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/LegalizeSundaMacro]: LegalizeSundaMacro finished after 0.064 seconds 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/NeuronSimplifier]: Running NeuronSimplifier 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/NeuronSimplifier]: Finished (changed=False) 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/NeuronSimplifier]: NeuronSimplifier finished after 0.130 seconds 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/PerfectLoopNest]: Running PerfectLoopNest 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/PerfectLoopNest]: Finished (changed=False) 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/PerfectLoopNest]: PerfectLoopNest finished after 0.027 seconds 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/FlattenMacroLoop]: Running 
FlattenMacroLoop 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/FlattenMacroLoop]: Finished (changed=True) 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/FlattenMacroLoop]: FlattenMacroLoop finished after 0.096 seconds 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/RewriteWeights]: Running RewriteWeights 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/RewriteWeights]: Finished (changed=False) 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/RewriteWeights]: RewriteWeights finished after 0.023 seconds 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/ReshapeWeights]: Running ReshapeWeights 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/ReshapeWeights]: Finished (changed=False) 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/ReshapeWeights]: ReshapeWeights finished after 0.007 seconds 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/FlattenMacroLoop]: Running FlattenMacroLoop 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/FlattenMacroLoop]: Finished (changed=False) 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/FlattenMacroLoop]: FlattenMacroLoop finished after 0.080 seconds 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/SimplifyMacroPredicates]: Running SimplifyMacroPredicates 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/SimplifyMacroPredicates]: Finished (changed=False) 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/SimplifyMacroPredicates]: SimplifyMacroPredicates finished after 0.098 seconds 2025-08-09T18:45:06Z INFO 67605 [sg0000/Tensorizer/InferInitValue]: Running InferInitValue 2025-08-09T18:45:07Z INFO 67605 [sg0000/Tensorizer/InferInitValue]: Finished (changed=True) 2025-08-09T18:45:07Z INFO 67605 [sg0000/Tensorizer/InferInitValue]: InferInitValue finished after 0.433 seconds 2025-08-09T18:45:07Z INFO 67605 [sg0000/Tensorizer/NeuronSimplifier]: Running NeuronSimplifier 2025-08-09T18:45:07Z INFO 67605 [sg0000/Tensorizer/NeuronSimplifier]: Finished (changed=False) 2025-08-09T18:45:07Z INFO 67605 
[sg0000/Tensorizer/NeuronSimplifier]: NeuronSimplifier finished after 0.130 seconds 2025-08-09T18:45:07Z INFO 67605 [sg0000/Tensorizer/SimplifyTensor]: Running SimplifyTensor 2025-08-09T18:45:07Z INFO 67605 [sg0000/Tensorizer/SimplifyTensor]: Finished (changed=False) 2025-08-09T18:45:07Z INFO 67605 [sg0000/Tensorizer/SimplifyTensor]: SimplifyTensor finished after 0.081 seconds 2025-08-09T18:45:07Z INFO 67605 [sg0000/Tensorizer/LICM]: Running LICM 2025-08-09T18:45:07Z INFO 67605 [sg0000/Tensorizer/LICM]: Finished (changed=False) 2025-08-09T18:45:07Z INFO 67605 [sg0000/Tensorizer/LICM]: LICM finished after 0.037 seconds 2025-08-09T18:45:07Z INFO 67605 [sg0000/Tensorizer/SundaISel]: Running SundaISel 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/SundaISel]: Finished (changed=True) 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/SundaISel]: SundaISel finished after 0.549 seconds 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/NeuronAliasDependencyReset]: Running NeuronAliasDependencyReset 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/AliasDependencyElimination]: Running AliasDependencyElimination 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/AliasDependencyElimination]: Finished (changed=False) 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/AliasDependencyElimination]: AliasDependencyElimination finished after 0.000 seconds 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/NeuronAliasDependencyInduction]: Running NeuronAliasDependencyInduction 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/NeuronAliasDependencyInduction]: Finished (changed=True) 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/NeuronAliasDependencyInduction]: NeuronAliasDependencyInduction finished after 0.041 seconds 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/NeuronAliasDependencyReset]: NeuronAliasDependencyReset finished after 0.049 seconds 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/LowerComplexBroadcast]: Running LowerComplexBroadcast 
2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/LowerComplexBroadcast]: Finished (changed=False) 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/LowerComplexBroadcast]: LowerComplexBroadcast finished after 0.027 seconds 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/NeuronLoopInterchange]: Running NeuronLoopInterchange 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/NeuronLoopInterchange]: Finished (changed=False) 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/NeuronLoopInterchange]: NeuronLoopInterchange finished after 0.024 seconds 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/NeuronSimplifyPredicates]: Running NeuronSimplifyPredicates 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/NeuronSimplifyPredicates]: Finished (changed=False) 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/NeuronSimplifyPredicates]: NeuronSimplifyPredicates finished after 0.017 seconds 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/NeuronLoopFusion]: Running NeuronLoopFusion 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/NeuronLoopFusion]: Finished (changed=True) 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/NeuronLoopFusion]: NeuronLoopFusion finished after 0.090 seconds 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/NeuronLoopInterchange]: Running NeuronLoopInterchange 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/NeuronLoopInterchange]: Finished (changed=False) 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/NeuronLoopInterchange]: NeuronLoopInterchange finished after 0.022 seconds 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/NeuronLICM]: Running NeuronLICM 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/NeuronLICM]: Finished (changed=False) 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/NeuronLICM]: NeuronLICM finished after 0.083 seconds 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/FactorizeBlkDims]: Running FactorizeBlkDims 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/FactorizeBlkDims]: Finished 
(changed=False) 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/FactorizeBlkDims]: FactorizeBlkDims finished after 0.113 seconds 2025-08-09T18:45:08Z INFO 67605 [sg0000/Tensorizer/NeuronInstComb]: Running NeuronInstComb 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/NeuronInstComb]: Finished (changed=True) 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/NeuronInstComb]: NeuronInstComb finished after 1.604 seconds 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/NeuronValueNumbering]: Running NeuronValueNumbering 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/NeuronValueNumbering]: Finished (changed=False) 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/NeuronValueNumbering]: NeuronValueNumbering finished after 0.045 seconds 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/NeuronInstComb]: Running NeuronInstComb 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/NeuronInstComb]: Finished (changed=False) 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/NeuronInstComb]: NeuronInstComb finished after 0.020 seconds 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/VectorizeDMA]: Running VectorizeDMA 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/VectorizeDMA]: Finished (changed=False) 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/VectorizeDMA]: VectorizeDMA finished after 0.030 seconds 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/NeuronSimplifyPredicates]: Running NeuronSimplifyPredicates 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/NeuronSimplifyPredicates]: Finished (changed=False) 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/NeuronSimplifyPredicates]: NeuronSimplifyPredicates finished after 0.011 seconds 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/LegalizePartitionReduce]: Running LegalizePartitionReduce 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/LegalizePartitionReduce]: Finished (changed=False) 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/LegalizePartitionReduce]: LegalizePartitionReduce 
finished after 0.010 seconds 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/DeConcat]: Running DeConcat 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/DeConcat]: Finished (changed=False) 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/DeConcat]: DeConcat finished after 0.002 seconds 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/FactorizeThreadAxesInFreeDims]: Running FactorizeThreadAxesInFreeDims 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/FactorizeThreadAxesInFreeDims]: Finished (changed=False) 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/FactorizeThreadAxesInFreeDims]: FactorizeThreadAxesInFreeDims finished after 0.020 seconds 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/PartialSimdFusion]: Running PartialSimdFusion 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/PartialSimdFusion]: Finished (changed=False) 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/PartialSimdFusion]: PartialSimdFusion finished after 0.009 seconds 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/TritiumFusion]: Running TritiumFusion 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/TritiumFusion]: Finished (changed=False) 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/TritiumFusion]: TritiumFusion finished after 0.010 seconds 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/CCOpFusion]: Running CCOpFusion 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/CCOpFusion]: Finished (changed=False) 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/CCOpFusion]: CCOpFusion finished after 0.081 seconds 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/VectorizeMatMult]: Running VectorizeMatMult 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/VectorizeMatMult]: Finished (changed=False) 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/VectorizeMatMult]: VectorizeMatMult finished after 0.005 seconds 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/PartialLoopFusion]: Running PartialLoopFusion 2025-08-09T18:45:10Z INFO 67605 
[sg0000/Tensorizer/PartialLoopFusion]: Finished (changed=False) 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/PartialLoopFusion]: PartialLoopFusion finished after 0.154 seconds 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/NeuronLICM]: Running NeuronLICM 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/NeuronLICM]: Finished (changed=False) 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/NeuronLICM]: NeuronLICM finished after 0.048 seconds 2025-08-09T18:45:10Z INFO 67605 [sg0000/Tensorizer/LowerTranspose]: Running LowerTranspose 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/LowerTranspose]: Finished (changed=True) 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/LowerTranspose]: LowerTranspose finished after 0.491 seconds 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/LowerBroadcast]: Running LowerBroadcast 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/LowerBroadcast]: Finished (changed=False) 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/LowerBroadcast]: LowerBroadcast finished after 0.019 seconds 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/LateNeuronInstComb]: Running LateNeuronInstComb 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/LateNeuronInstComb]: Finished (changed=True) 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/LateNeuronInstComb]: LateNeuronInstComb finished after 0.128 seconds 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/SplitAccGrp]: Running SplitAccGrp 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/SplitAccGrp]: Finished (changed=False) 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/SplitAccGrp]: SplitAccGrp finished after 0.015 seconds 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/SpillPSum]: Running SpillPSum 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/SpillPSum]: Finished (changed=False) 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/SpillPSum]: SpillPSum finished after 0.150 seconds 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/LowerIntrinsics]: Running 
LowerIntrinsics 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/LowerIntrinsics]: Finished (changed=False) 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/LowerIntrinsics]: LowerIntrinsics finished after 0.018 seconds 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/InlineNativeKernels]: Running InlineNativeKernels 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/InlineNativeKernels]: Finished (changed=False) 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/InlineNativeKernels]: InlineNativeKernels finished after 0.015 seconds 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/LegalizeType]: Running LegalizeType 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/LegalizeType]: Finished (changed=True) 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/LegalizeType]: LegalizeType finished after 0.104 seconds 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/NeuronLICM]: Running NeuronLICM 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/NeuronLICM]: Finished (changed=False) 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/NeuronLICM]: NeuronLICM finished after 0.074 seconds 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/InferPSumTensor]: Running InferPSumTensor 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/InferPSumTensor]: Finished (changed=False) 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/InferPSumTensor]: InferPSumTensor finished after 0.176 seconds 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/WeightCoalescing]: Running WeightCoalescing 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/WeightCoalescing]: Finished (changed=False) 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/WeightCoalescing]: WeightCoalescing finished after 0.015 seconds 2025-08-09T18:45:11Z INFO 67605 [sg0000/Tensorizer/LegalizeSundaAccess]: Running LegalizeSundaAccess 2025-08-09T18:45:12Z INFO 67605 [sg0000/Tensorizer/LegalizeSundaAccess]: Finished (changed=False) 2025-08-09T18:45:12Z INFO 67605 [sg0000/Tensorizer/LegalizeSundaAccess]: 
LegalizeSundaAccess finished after 0.145 seconds 2025-08-09T18:45:12Z INFO 67605 [sg0000/Tensorizer/RelaxPredicates]: Running RelaxPredicates 2025-08-09T18:45:12Z INFO 67605 [sg0000/Tensorizer/RelaxPredicates]: Finished (changed=False) 2025-08-09T18:45:12Z INFO 67605 [sg0000/Tensorizer/RelaxPredicates]: RelaxPredicates finished after 0.039 seconds 2025-08-09T18:45:12Z INFO 67605 [sg0000/Tensorizer/TensorInitialization]: Running TensorInitialization 2025-08-09T18:45:12Z INFO 67605 [sg0000/Tensorizer/TensorInitialization]: Finished (changed=False) 2025-08-09T18:45:12Z INFO 67605 [sg0000/Tensorizer/TensorInitialization]: TensorInitialization finished after 0.017 seconds 2025-08-09T18:45:12Z INFO 67605 [sg0000/Tensorizer/NeuronSimplifyPredicates]: Running NeuronSimplifyPredicates 2025-08-09T18:45:12Z INFO 67605 [sg0000/Tensorizer/NeuronSimplifyPredicates]: Finished (changed=False) 2025-08-09T18:45:12Z INFO 67605 [sg0000/Tensorizer/NeuronSimplifyPredicates]: NeuronSimplifyPredicates finished after 0.017 seconds 2025-08-09T18:45:12Z INFO 67605 [sg0000/Tensorizer/ExpandISAMacro]: Running ExpandISAMacro 2025-08-09T18:45:12Z INFO 67605 [sg0000/Tensorizer/ExpandISAMacro]: Finished (changed=False) 2025-08-09T18:45:12Z INFO 67605 [sg0000/Tensorizer/ExpandISAMacro]: ExpandISAMacro finished after 0.034 seconds 2025-08-09T18:45:12Z INFO 67605 [sg0000/Tensorizer/SimplifyNeuronTensor]: Running SimplifyNeuronTensor 2025-08-09T18:45:12Z INFO 67605 [sg0000/Tensorizer/SimplifyNeuronTensor]: Finished (changed=False) 2025-08-09T18:45:12Z INFO 67605 [sg0000/Tensorizer/SimplifyNeuronTensor]: SimplifyNeuronTensor finished after 0.060 seconds 2025-08-09T18:45:12Z INFO 67605 [sg0000/Tensorizer/DMALocalityOpt]: Running DMALocalityOpt 2025-08-09T18:45:12Z INFO 67605 [sg0000/Tensorizer/DMALocalityOpt]: Finished (changed=False) 2025-08-09T18:45:12Z INFO 67605 [sg0000/Tensorizer/DMALocalityOpt]: DMALocalityOpt finished after 0.012 seconds 2025-08-09T18:45:12Z INFO 67605 
[sg0000/Tensorizer/DataStreaming]: Running DataStreaming 2025-08-09T18:45:12Z INFO 67605 [sg0000/Tensorizer/DataStreaming]: Finished (changed=False) 2025-08-09T18:45:12Z INFO 67605 [sg0000/Tensorizer/DataStreaming]: DataStreaming finished after 0.033 seconds 2025-08-09T18:45:12Z INFO 67605 [sg0000/Tensorizer/SFKVectorizer]: Running SFKVectorizer 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/SFKVectorizer]: Finished (changed=True) 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/SFKVectorizer]: SFKVectorizer finished after 3.184 seconds 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/LateLegalizeInst]: Running LateLegalizeInst 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/LateLegalizeInst]: Finished (changed=False) 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/LateLegalizeInst]: LateLegalizeInst finished after 0.066 seconds 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/CoalesceCCOp]: Running CoalesceCCOp 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/CoalesceCCOp]: Finished (changed=False) 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/CoalesceCCOp]: CoalesceCCOp finished after 0.018 seconds 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/SimpleAllReduceTiling]: Running SimpleAllReduceTiling 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/SimpleAllReduceTiling]: Finished (changed=False) 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/SimpleAllReduceTiling]: SimpleAllReduceTiling finished after 0.018 seconds 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/DMAProfiler]: Running DMAProfiler 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/DMAProfiler]: Top 10 (estimated) latency DMAs: 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 231.410us (48.000MiB, est bw: 217.500GB/s, 0.465% of tot. 
time) for bfloat16<128 x 3072> TongaSB partitions[2] bfloat16 (32, 2, 128, 3072) %'20894.27130'[T_i0,T_i2_29578,i0.128,i1.3072] = load bfloat16<128 x 3072> {'CrossPassTensor': ''}bfloat16 (32, 128, 2, 3072) %'input8'[T_i0,i0.128,T_i2_29578,i1.3072] # id=25058, src_id=None, , instances=64 # dl = tensor_op_name: t2534_pftranspose_20894 | hlo_id: 1787 | [[i0.128];[i1.3072]] -> [[i0.128];[i1.3072]] 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 231.410us (48.000MiB, est bw: 217.500GB/s, 0.465% of tot. time) for bfloat16<128 x 3072> TongaSB partitions[2] bfloat16 (32, 2, 128, 3072) %'20935.27144'[T_i0,T_i2_29586,i0.128,i1.3072] = load bfloat16<128 x 3072> {'CrossPassTensor': ''}bfloat16 (32, 128, 2, 3072) %'input19'[T_i0,i0.128,T_i2_29586,i1.3072] # id=25116, src_id=None, , instances=64 # dl = tensor_op_name: t2597_pftranspose_20935 | hlo_id: 1805 | [[i0.128];[i1.3072]] -> [[i0.128];[i1.3072]] 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 231.410us (48.000MiB, est bw: 217.500GB/s, 0.465% of tot. time) for bfloat16<128 x 3072> TongaSB partitions[2] bfloat16 (32, 2, 128, 3072) %'20976.27158'[T_i0,T_i2_29594,i0.128,i1.3072] = load bfloat16<128 x 3072> {'CrossPassTensor': ''}bfloat16 (32, 128, 2, 3072) %'input30'[T_i0,i0.128,T_i2_29594,i1.3072] # id=25174, src_id=None, , instances=64 # dl = tensor_op_name: t2660_pftranspose_20976 | hlo_id: 1823 | [[i0.128];[i1.3072]] -> [[i0.128];[i1.3072]] 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 231.410us (48.000MiB, est bw: 217.500GB/s, 0.465% of tot. 
time) for bfloat16<128 x 3072> TongaSB partitions[2] bfloat16 (32, 2, 128, 3072) %'21017.27172'[T_i0,T_i2_29602,i0.128,i1.3072] = load bfloat16<128 x 3072> {'CrossPassTensor': ''}bfloat16 (32, 128, 2, 3072) %'input41'[T_i0,i0.128,T_i2_29602,i1.3072] # id=25232, src_id=None, , instances=64 # dl = tensor_op_name: t2723_pftranspose_21017 | hlo_id: 1841 | [[i0.128];[i1.3072]] -> [[i0.128];[i1.3072]] 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 231.410us (48.000MiB, est bw: 217.500GB/s, 0.465% of tot. time) for bfloat16<128 x 3072> TongaSB partitions[2] bfloat16 (32, 2, 128, 3072) %'21058.27186'[T_i0,T_i2_29610,i0.128,i1.3072] = load bfloat16<128 x 3072> {'CrossPassTensor': ''}bfloat16 (32, 128, 2, 3072) %'input52'[T_i0,i0.128,T_i2_29610,i1.3072] # id=25290, src_id=None, , instances=64 # dl = tensor_op_name: t2786_pftranspose_21058 | hlo_id: 1859 | [[i0.128];[i1.3072]] -> [[i0.128];[i1.3072]] 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 231.410us (48.000MiB, est bw: 217.500GB/s, 0.465% of tot. time) for bfloat16<128 x 3072> TongaSB partitions[2] bfloat16 (32, 2, 128, 3072) %'21099.27200'[T_i0,T_i2_29618,i0.128,i1.3072] = load bfloat16<128 x 3072> {'CrossPassTensor': ''}bfloat16 (32, 128, 2, 3072) %'input63'[T_i0,i0.128,T_i2_29618,i1.3072] # id=25348, src_id=None, , instances=64 # dl = tensor_op_name: t2849_pftranspose_21099 | hlo_id: 1877 | [[i0.128];[i1.3072]] -> [[i0.128];[i1.3072]] 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 231.410us (48.000MiB, est bw: 217.500GB/s, 0.465% of tot. 
time) for bfloat16<128 x 3072> TongaSB partitions[2] bfloat16 (32, 2, 128, 3072) %'21140.27214'[T_i0,T_i2_29626,i0.128,i1.3072] = load bfloat16<128 x 3072> {'CrossPassTensor': ''}bfloat16 (32, 128, 2, 3072) %'input74'[T_i0,i0.128,T_i2_29626,i1.3072] # id=25406, src_id=None, , instances=64 # dl = tensor_op_name: t2912_pftranspose_21140 | hlo_id: 1895 | [[i0.128];[i1.3072]] -> [[i0.128];[i1.3072]] 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 231.410us (48.000MiB, est bw: 217.500GB/s, 0.465% of tot. time) for bfloat16<128 x 3072> TongaSB partitions[2] bfloat16 (32, 2, 128, 3072) %'21181.27228'[T_i0,T_i2_29634,i0.128,i1.3072] = load bfloat16<128 x 3072> {'CrossPassTensor': ''}bfloat16 (32, 128, 2, 3072) %'input85'[T_i0,i0.128,T_i2_29634,i1.3072] # id=25464, src_id=None, , instances=64 # dl = tensor_op_name: t2975_pftranspose_21181 | hlo_id: 1913 | [[i0.128];[i1.3072]] -> [[i0.128];[i1.3072]] 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 231.410us (48.000MiB, est bw: 217.500GB/s, 0.465% of tot. time) for bfloat16<128 x 3072> TongaSB partitions[2] bfloat16 (32, 2, 128, 3072) %'21222.27242'[T_i0,T_i2_29642,i0.128,i1.3072] = load bfloat16<128 x 3072> {'CrossPassTensor': ''}bfloat16 (32, 128, 2, 3072) %'input96'[T_i0,i0.128,T_i2_29642,i1.3072] # id=25522, src_id=None, , instances=64 # dl = tensor_op_name: t3038_pftranspose_21222 | hlo_id: 1931 | [[i0.128];[i1.3072]] -> [[i0.128];[i1.3072]] 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 231.410us (48.000MiB, est bw: 217.500GB/s, 0.465% of tot. 
time) for bfloat16<128 x 3072> TongaSB partitions[2] bfloat16 (32, 2, 128, 3072) %'21263.27256'[T_i0,T_i2_29650,i0.128,i1.3072] = load bfloat16<128 x 3072> {'CrossPassTensor': ''}bfloat16 (32, 128, 2, 3072) %'input107'[T_i0,i0.128,T_i2_29650,i1.3072] # id=25580, src_id=None, , instances=64 # dl = tensor_op_name: t3101_pftranspose_21263 | hlo_id: 1949 | [[i0.128];[i1.3072]] -> [[i0.128];[i1.3072]] 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/DMAProfiler]: Finished (changed=False) 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/DMAProfiler]: DMAProfiler finished after 0.033 seconds 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/OptimizeNKIKernels]: Running OptimizeNKIKernels 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/OptimizeNKIKernels]: Finished (changed=False) 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/OptimizeNKIKernels]: OptimizeNKIKernels finished after 0.017 seconds 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/CCOpFusion]: Running CCOpFusion 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/CCOpFusion]: Finished (changed=True) 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/CCOpFusion]: CCOpFusion finished after 0.357 seconds 2025-08-09T18:45:15Z INFO 67605 [sg0000/Tensorizer/StaticProfiler]: Running StaticProfiler 2025-08-09T18:45:16Z WARNING 67605 [sg0000/Tensorizer/StaticProfiler]: matmul-based transposes inserted by penguin takes up 100.00 percent of all matmul computation 2025-08-09T18:45:16Z INFO 67605 [sg0000/Tensorizer/StaticProfiler]: Finished (changed=False) 2025-08-09T18:45:16Z INFO 67605 [sg0000/Tensorizer/StaticProfiler]: StaticProfiler finished after 0.041 seconds 2025-08-09T18:45:16Z INFO 67605 [sg0000/Tensorizer/SplitAPUnionSets]: Running SplitAPUnionSets 2025-08-09T18:45:16Z INFO 67605 [sg0000/Tensorizer/SplitAPUnionSets]: Finished (changed=True) 2025-08-09T18:45:16Z INFO 67605 [sg0000/Tensorizer/SplitAPUnionSets]: SplitAPUnionSets finished after 0.154 seconds 2025-08-09T18:45:16Z INFO 67605 
[sg0000/Tensorizer/LateLegalizePostSplit]: Running LateLegalizePostSplit 2025-08-09T18:45:16Z INFO 67605 [sg0000/Tensorizer/LateLegalizePostSplit]: Finished (changed=False) 2025-08-09T18:45:16Z INFO 67605 [sg0000/Tensorizer/LateLegalizePostSplit]: LateLegalizePostSplit finished after 0.040 seconds 2025-08-09T18:45:16Z INFO 67605 [sg0000/Tensorizer/DumpGraphAndMetadata]: Running DumpGraphAndMetadata 2025-08-09T18:45:16Z INFO 67605 [sg0000/Tensorizer/DumpGraphAndMetadata]: Finished (changed=False) 2025-08-09T18:45:16Z INFO 67605 [sg0000/Tensorizer/DumpGraphAndMetadata]: DumpGraphAndMetadata finished after 0.046 seconds 2025-08-09T18:45:16Z INFO 67605 [sg0000/Tensorizer/ZeroSizeTensorElimination]: Running ZeroSizeTensorElimination 2025-08-09T18:45:16Z INFO 67605 [sg0000/Tensorizer/ZeroSizeTensorElimination]: Finished (changed=False) 2025-08-09T18:45:16Z INFO 67605 [sg0000/Tensorizer/ZeroSizeTensorElimination]: ZeroSizeTensorElimination finished after 0.001 seconds 2025-08-09T18:45:16Z INFO 67605 [sg0000/Tensorizer/BirCodeGenLoop]: Running BirCodeGenLoop 2025-08-09T18:45:16Z INFO 67605 [sg0000/Tensorizer/BirCodeGenLoop]: Finished (changed=False) 2025-08-09T18:45:16Z INFO 67605 [sg0000/Tensorizer/BirCodeGenLoop]: BirCodeGenLoop finished after 0.665 seconds 2025-08-09T18:45:17Z INFO 67605 [Tensorizer]: BirCodeGen estimate #instances=279978 in sg0000 2025-08-09T18:45:17Z INFO 67605 [Tensorizer]: IR signature: 4c500c33f6b410247d09546b05e57cdd552637593e5e9cae706f41ffd3eaadab for nc00/sg0000/TensorizerBIR 2025-08-09T18:45:17Z INFO 67605 [Tensorizer]: Weights total number of bytes: 131072 2025-08-09T18:45:17Z INFO 67605 [Tensorizer]: Successfully built model. 
2025-08-09T18:45:17Z USER 67605 [root/Tensorizer/Tensorizer]: Tensorizer finished after 33.074 seconds 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: End tensorization 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input0 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input1 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input2 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input3 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input4 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input5 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input6 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input7 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input8 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input9 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input10 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input11 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input12 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input13 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input14 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input15 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input16 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input17 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input18 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input19 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input20 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input21 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input22 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input23 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input24 
2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input25 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input26 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input27 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input28 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input29 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input30 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input31 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input32 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input33 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input34 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input35 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input36 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input37 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input38 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input39 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input40 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input41 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input42 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input43 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input44 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input45 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input46 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input47 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input48 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input49 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input50 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input51 2025-08-09T18:45:17Z INFO 
67605 [job.Frontend.0]: Network input: input52 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input53 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input54 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input55 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input56 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input57 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input58 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input59 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input60 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input61 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input62 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input63 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input64 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input65 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input66 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input67 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input68 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input69 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input70 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input71 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input72 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input73 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input74 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input75 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input76 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input77 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input78 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: 
Network input: input79 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input80 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input81 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input82 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input83 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input84 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input85 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input86 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input87 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input88 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input89 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input90 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input91 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input92 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input93 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input94 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input95 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input96 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input97 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input98 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input99 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input100 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input101 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input102 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input103 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input104 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input105 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: 
input106 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input107 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input108 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input109 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input110 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input111 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input112 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input113 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input114 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input115 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input116 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input117 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input118 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input119 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input120 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input121 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input122 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input123 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input124 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input125 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input126 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input127 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input128 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input129 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input130 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input131 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input132 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: 
input133 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input134 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input135 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input136 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input137 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input138 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input139 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input140 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input141 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input142 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input143 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input144 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input145 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input146 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input147 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input148 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input149 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input150 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input151 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input152 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input153 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input154 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input155 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input156 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input157 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input158 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input159 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: 
input160 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input161 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input162 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input163 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input164 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input165 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input166 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input167 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input168 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input169 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input170 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input171 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input172 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input173 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input174 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input175 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input176 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input177 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input178 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input179 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input180 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input181 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input182 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input183 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input184 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input185 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input186 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: 
input187 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input188 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input189 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input190 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input191 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input192 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input193 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input194 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input195 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input196 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input197 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input198 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input199 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input200 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input201 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input202 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input203 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input204 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input205 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input206 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input207 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input208 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input209 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input210 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input211 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input212 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input213 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: 
input214 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input215 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input216 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input217 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input218 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input219 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input220 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input221 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input222 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input223 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input224 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input225 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input226 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input227 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input228 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input229 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input230 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input231 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input232 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input233 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input234 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input235 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input236 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input237 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input238 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input239 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input240 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: 
input241 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input242 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input243 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input244 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input245 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input246 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input247 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input248 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input249 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input250 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input251 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input252 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input253 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input254 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input255 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input256 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input257 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input258 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input259 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input260 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input261 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input262 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input263 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input264 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input265 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input266 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input267 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: 
input268 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input269 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input270 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input271 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input272 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input273 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input274 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input275 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input276 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input277 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input278 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input279 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input280 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input281 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input282 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input283 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input284 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input285 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input286 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input287 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input288 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input289 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input290 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input291 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input292 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input293 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input294 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: 
input295 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input296 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input297 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input298 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input299 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input300 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input301 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input302 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input303 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input304 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input305 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input306 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input307 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input308 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input309 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input310 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input311 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input312 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input313 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input314 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input315 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input316 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input317 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input318 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input319 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input320 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input321 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: 
input322 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input323 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input324 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input325 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input326 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input327 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input328 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input329 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input330 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input331 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input332 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input333 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input334 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input335 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input336 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input337 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input338 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input339 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input340 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input341 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input342 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input343 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input344 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input345 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input346 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input347 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input348 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: 
input349 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input350 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input351 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input352 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input353 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input354 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input355 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input356 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input357 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input358 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input359 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input360 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input361 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input362 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input363 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input364 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input365 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input366 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input367 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input368 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input369 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input370 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input371 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input372 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input373 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input374 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input375 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: 
input376 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input377 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input378 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input379 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input380 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input381 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input382 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input383 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input384 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input385 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input386 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input387 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input388 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input389 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input390 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input391 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input392 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input393 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input394 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input395 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input396 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input397 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Network input: input398 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: wrote bir.json 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: wrote tensor_map.json 2025-08-09T18:45:17Z INFO 67605 [job.Frontend.0]: Job #0 finished 2025-08-09T18:45:17Z INFO 67605 [pipeline.Pipeline.0]: Finished job job.Frontend.0 2025-08-09T18:45:17Z INFO 67605 [pipeline.Pipeline.0]: Starting job 
job.StaticIOTranspose.0 2025-08-09T18:45:17Z INFO 67605 [pipeline.Pipeline.0]: Finished job job.StaticIOTranspose.0 2025-08-09T18:45:17Z INFO 67605 [pipeline.Pipeline.0]: Starting job job.WalrusDriver.0 2025-08-09T18:45:17Z INFO 67605 [job.WalrusDriver.0]: BackendDriver has 1 states with 1 core LNC 2025-08-09T18:45:17Z INFO 67605 [job.WalrusDriver.0]: BackendDriver: no partitions found. Switching to flat flow. 2025-08-09T18:45:17Z INFO 67605 [job.WalrusDriver.0]: Job WalrusDriver len(in_states) 1 2025-08-09T18:45:17Z INFO 67605 [job.WalrusDriver.0]: Processing input #0 2025-08-09T18:45:17Z INFO 67605 [job.WalrusDriver.0]: BackendDriver in_state.num_states 1 with 1 core LNC 2025-08-09T18:45:17Z INFO 67605 [job.WalrusDriver.0]: Executing /opt/aws_neuronx_venv_pytorch_2_7_nxd_inference/lib/python3.10/site-packages/neuronxcc/starfish/bin/walrus_driver --optlevel 2 --allocator coloring --verbose 35 --logfile-verbose 20 --logfile /home/ubuntu/qwen3/layout_opt/log-neuron-cc.txt --execute-repetition 1 -i bir.json --min_split_size 10240 --skip_split_vns '' --no_split_dram --split_huge_dram_tensor 1.0 --preprocessing_only --max_tensorizer_distance 64 --pack_same_shape_only --instruction_fetch_latency 511 --max-partitions 1 --policy 3 --auxflag 0 --interleave none --schedule-delayed-latency 1 --postsched-mm-accum-reorder=false --max-load-color-rotation --max-load-lower-bound 0.14 --mm-reorder-opt --force-prefetch-follow-incoming-order -1 --allreduce-buffer-size 500 --dram-page-size 512 --dram-rotation-size -1 --allreduce-rotation-dis 8 --repeat-load-thres 4 --enable-mm-transpose-remat-optimization=true --save-len-thres 512 --save-dma-cnt-thres 32 --relaxed-order=true --enable-anti-dependence-reduction=false --num-semaphores-per-queue 16 --numcores 1 --act-root-json /opt/aws_neuronx_venv_pytorch_2_7_nxd_inference/lib/python3.10/site-packages/neuronxcc/pwp/pwp_bin_trainium/act_info.json --dve-root-json 
/opt/aws_neuronx_venv_pytorch_2_7_nxd_inference/lib/python3.10/site-packages/neuronxcc/dve/dve_bin_gen2/dve_info.json --unified-backend-and-legacy-codegen --tensor-map tensor_map.json --enable-verifier=true --enable-birsim=false --enable-birsim-sync-only=false --enable-data-race-checker=false --enable-new-backend=true --inject-error=NONE --dge-levels io,vector_dynamic_offsets,scalar_dynamic_offset --dynamic-dma-scratch-size-per-partition=16384 --neff-output-filename /home/ubuntu/qwen3/layout_opt/graph.neff 2025-08-09T18:45:17Z INFO 67605 [job.WalrusDriver.0]: Working directory is /home/ubuntu/neuronxcc-mk9kpjyq/sg00 2025-08-09T18:45:17Z INFO 67605 [job.WalrusDriver.0]: propagate_exit=True 2025-08-09T18:45:17Z INFO 67605 [job.WalrusDriver.0]: use_logger=False 2025-08-09T18:45:17Z INFO 67605 [job.WalrusDriver.0]: expose_stderr=True 2025-08-09T18:45:17Z INFO 67673 [Logging]: Logging to ../../qwen3/layout_opt/log-neuron-cc.txt at level 'INFO' 2025-08-09T18:45:17Z INFO 67673 [BackendDriver]: max_allowed_parallelism=128 2025-08-09T18:45:18Z INFO 67673 [BackendDriver]: Backend driver mtBackend: false numModules: 1 Cwd: "/home/ubuntu/neuronxcc-mk9kpjyq/sg00" 2025-08-09T18:45:18Z INFO 67673 [BackendDriver]: DynamicDMA is enabled 2025-08-09T18:45:18Z INFO 67673 [BackendDriver]: DynamicDMA levels being enabled: io, scalar_dynamic_offset, vector_dynamic_offsets, 2025-08-09T18:45:18Z USER 67673 [BackendPassManager]: Running mod_parallel_pass 2025-08-09T18:45:18Z INFO 67673 [BackendPassManager]: Inputs to mod_parallel_pass: modules=1 functions=1 allocs=1776 blocks=1 instructions=869 Max writers: 1 Max Readers: 325 2025-08-09T18:45:18Z USER 67673 [ModuleForkPass]: Running do_nothing 2025-08-09T18:45:18Z INFO 67673 [ModuleForkPass]: Inputs to do_nothing: modules=1 functions=1 allocs=1776 blocks=1 instructions=869 Max writers: 1 Max Readers: 325 2025-08-09T18:45:18Z USER 67673 [ModuleForkPass]: do_nothing finished after 0.003 seconds 2025-08-09T18:45:18Z INFO 67673 
[ModuleForkPass]: curr_vmrss: 177mb, ru_maxrss: 429mb (delta=0mb) 2025-08-09T18:45:18Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 1776 memory location(s), 1 block(s), and 869 instruction(s). Max writers: 1 Max Readers: 325 2025-08-09T18:45:18Z USER 67673 [ModuleForkPass]: Running birverifier 2025-08-09T18:45:18Z INFO 67673 [ModuleForkPass]: Inputs to birverifier: modules=1 functions=1 allocs=1776 blocks=1 instructions=869 Max writers: 1 Max Readers: 325 2025-08-09T18:45:18Z USER 67673 [ModuleForkPass]: birverifier finished after 0.290 seconds 2025-08-09T18:45:18Z INFO 67673 [ModuleForkPass]: curr_vmrss: 945mb, ru_maxrss: 945mb (delta=516mb) 2025-08-09T18:45:18Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 1776 memory location(s), 1 block(s), and 869 instruction(s). Max writers: 1 Max Readers: 325 2025-08-09T18:45:18Z USER 67673 [BackendPassManager]: mod_parallel_pass finished after 0.301 seconds 2025-08-09T18:45:18Z INFO 67673 [BackendPassManager]: curr_vmrss: 945mb, ru_maxrss: 945mb (delta=516mb) 2025-08-09T18:45:18Z INFO 67673 [BackendPassManager]: Output has 1 module(s), 1 function(s), 1776 memory location(s), 1 block(s), and 869 instruction(s). 
Max writers: 1 Max Readers: 325 2025-08-09T18:45:18Z USER 67673 [BackendPassManager]: Running subgraph_parallel_pass 2025-08-09T18:45:18Z INFO 67673 [BackendPassManager]: Inputs to subgraph_parallel_pass: modules=1 functions=1 allocs=1776 blocks=1 instructions=869 Max writers: 1 Max Readers: 325 2025-08-09T18:45:18Z USER 67673 [SubgraphForkPass]: Running lnc_verifier 2025-08-09T18:45:18Z INFO 67673 [SubgraphForkPass]: Inputs to lnc_verifier: modules=1 functions=1 allocs=1776 blocks=1 instructions=869 Max writers: 1 Max Readers: 325 2025-08-09T18:45:18Z USER 67673 [SubgraphForkPass]: lnc_verifier finished after 0.001 seconds 2025-08-09T18:45:18Z INFO 67673 [SubgraphForkPass]: curr_vmrss: 945mb, ru_maxrss: 945mb (delta=0mb) 2025-08-09T18:45:18Z INFO 67673 [SubgraphForkPass]: Output has 1 module(s), 1 function(s), 1776 memory location(s), 1 block(s), and 869 instruction(s). Max writers: 1 Max Readers: 325 2025-08-09T18:45:18Z USER 67673 [BackendPassManager]: subgraph_parallel_pass finished after 0.004 seconds 2025-08-09T18:45:18Z INFO 67673 [BackendPassManager]: curr_vmrss: 945mb, ru_maxrss: 945mb (delta=0mb) 2025-08-09T18:45:18Z INFO 67673 [BackendPassManager]: Output has 1 module(s), 1 function(s), 1776 memory location(s), 1 block(s), and 869 instruction(s). 
Max writers: 1 Max Readers: 325 2025-08-09T18:45:18Z USER 67673 [BackendPassManager]: Running mod_parallel_pass 2025-08-09T18:45:18Z INFO 67673 [BackendPassManager]: Inputs to mod_parallel_pass: modules=1 functions=1 allocs=1776 blocks=1 instructions=869 Max writers: 1 Max Readers: 325 2025-08-09T18:45:18Z USER 67673 [ModuleForkPass]: Running expand_replication 2025-08-09T18:45:18Z INFO 67673 [ModuleForkPass]: Inputs to expand_replication: modules=1 functions=1 allocs=1776 blocks=1 instructions=869 Max writers: 1 Max Readers: 325 2025-08-09T18:45:18Z INFO 67673 [ExpandReplication]: Found 0 replicated matmults 2025-08-09T18:45:18Z USER 67673 [ModuleForkPass]: expand_replication finished after 0.001 seconds 2025-08-09T18:45:18Z INFO 67673 [ModuleForkPass]: curr_vmrss: 945mb, ru_maxrss: 945mb (delta=0mb) 2025-08-09T18:45:18Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 1776 memory location(s), 1 block(s), and 869 instruction(s). Max writers: 1 Max Readers: 325 2025-08-09T18:45:18Z USER 67673 [ModuleForkPass]: Running unroll 2025-08-09T18:45:18Z INFO 67673 [ModuleForkPass]: Inputs to unroll: modules=1 functions=1 allocs=1776 blocks=1 instructions=869 Max writers: 1 Max Readers: 325 2025-08-09T18:45:18Z INFO 67673 [Unroll]: INFO (Unroll) Start unrolling at Sat Aug 9 18:45:18 2025 2025-08-09T18:45:21Z INFO 67673 [Unroll]: INFO (Unroll) DONE unrolling Sat Aug 9 18:45:18 2025 2025-08-09T18:45:21Z INFO 67673 [Unroll]: sg0000 Instruction count after Unroll: 2025-08-09T18:45:21Z INFO 67673 [Unroll]: Total count: 279653 2025-08-09T18:45:21Z INFO 67673 [Unroll]: Matmult: 212041 2025-08-09T18:45:21Z INFO 67673 [Unroll]: GenericCopy: 53065 2025-08-09T18:45:21Z INFO 67673 [Unroll]: Load: 7274 2025-08-09T18:45:21Z INFO 67673 [Unroll]: Save: 7273 2025-08-09T18:45:21Z INFO 67673 [Unroll]: Unrolled DGE count with Dynamic AP: 0 2025-08-09T18:45:21Z USER 67673 [ModuleForkPass]: unroll finished after 2.731 seconds 2025-08-09T18:45:21Z INFO 67673 [ModuleForkPass]: 
curr_vmrss: 2494mb, ru_maxrss: 2494mb (delta=1549mb) 2025-08-09T18:45:21Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69168 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:21Z USER 67673 [BackendPassManager]: mod_parallel_pass finished after 2.780 seconds 2025-08-09T18:45:21Z INFO 67673 [BackendPassManager]: curr_vmrss: 1647mb, ru_maxrss: 2494mb (delta=1549mb) 2025-08-09T18:45:21Z INFO 67673 [BackendPassManager]: Output has 1 module(s), 1 function(s), 69168 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:21Z USER 67673 [BackendPassManager]: Running subgraph_parallel_pass 2025-08-09T18:45:21Z INFO 67673 [BackendPassManager]: Inputs to subgraph_parallel_pass: modules=1 functions=1 allocs=69168 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:21Z USER 67673 [SubgraphForkPass]: Running dead_code_elim 2025-08-09T18:45:21Z INFO 67673 [SubgraphForkPass]: Inputs to dead_code_elim: modules=1 functions=1 allocs=69168 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:21Z INFO 67673 [DeadCodeElim]: eliminateDeadStore removed 0 instructions 2025-08-09T18:45:21Z INFO 67673 [DeadCodeElim]: remove_must_alias_dmacopy removed 0 DMAcopys 2025-08-09T18:45:21Z INFO 67673 [DeadCodeElim]: remove_redundant_alias_dmacopy removed 0 DMAcopys 2025-08-09T18:45:21Z INFO 67673 [DeadCodeElim]: remove_redundant_internal2internal_dmacopy removed 0 DMAcopys 2025-08-09T18:45:21Z USER 67673 [SubgraphForkPass]: dead_code_elim finished after 0.371 seconds 2025-08-09T18:45:21Z INFO 67673 [SubgraphForkPass]: curr_vmrss: 1669mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:21Z INFO 67673 [SubgraphForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:21Z USER 67673 [BackendPassManager]: subgraph_parallel_pass finished after 0.386 seconds 2025-08-09T18:45:21Z INFO 67673 [BackendPassManager]: curr_vmrss: 1669mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:21Z INFO 67673 [BackendPassManager]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:21Z USER 67673 [BackendPassManager]: Running mod_parallel_pass 2025-08-09T18:45:21Z INFO 67673 [BackendPassManager]: Inputs to mod_parallel_pass: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:21Z USER 67673 [ModuleForkPass]: Running birverifier 2025-08-09T18:45:21Z INFO 67673 [ModuleForkPass]: Inputs to birverifier: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:21Z USER 67673 [ModuleForkPass]: birverifier finished after 0.311 seconds 2025-08-09T18:45:21Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1671mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:21Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:21Z USER 67673 [BackendPassManager]: mod_parallel_pass finished after 0.330 seconds 2025-08-09T18:45:21Z INFO 67673 [BackendPassManager]: curr_vmrss: 1671mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:21Z INFO 67673 [BackendPassManager]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:21Z USER 67673 [BackendPassManager]: Running subgraph_parallel_pass 2025-08-09T18:45:21Z INFO 67673 [BackendPassManager]: Inputs to subgraph_parallel_pass: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:21Z USER 67673 [SubgraphForkPass]: Running lnc_verifier 2025-08-09T18:45:21Z INFO 67673 [SubgraphForkPass]: Inputs to lnc_verifier: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:21Z USER 67673 [SubgraphForkPass]: lnc_verifier finished after 0.009 seconds 2025-08-09T18:45:21Z INFO 67673 [SubgraphForkPass]: curr_vmrss: 1671mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:21Z INFO 67673 [SubgraphForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:21Z USER 67673 [BackendPassManager]: subgraph_parallel_pass finished after 0.027 seconds 2025-08-09T18:45:21Z INFO 67673 [BackendPassManager]: curr_vmrss: 1671mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:21Z INFO 67673 [BackendPassManager]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:21Z USER 67673 [BackendPassManager]: Running mod_parallel_pass 2025-08-09T18:45:21Z INFO 67673 [BackendPassManager]: Inputs to mod_parallel_pass: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:21Z USER 67673 [ModuleForkPass]: Running instruction_reorder 2025-08-09T18:45:21Z INFO 67673 [ModuleForkPass]: Inputs to instruction_reorder: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:22Z USER 67673 [ModuleForkPass]: instruction_reorder finished after 0.077 seconds 2025-08-09T18:45:22Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1671mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:22Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:22Z USER 67673 [ModuleForkPass]: Running psum_legalization 2025-08-09T18:45:22Z INFO 67673 [ModuleForkPass]: Inputs to psum_legalization: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:22Z USER 67673 [ModuleForkPass]: psum_legalization finished after 0.049 seconds 2025-08-09T18:45:22Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1671mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:22Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:22Z USER 67673 [ModuleForkPass]: Running legalize_cce_dma 2025-08-09T18:45:22Z INFO 67673 [ModuleForkPass]: Inputs to legalize_cce_dma: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:22Z USER 67673 [ModuleForkPass]: legalize_cce_dma finished after 0.049 seconds 2025-08-09T18:45:22Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1671mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:22Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:22Z USER 67673 [ModuleForkPass]: Running error_injector 2025-08-09T18:45:22Z INFO 67673 [ModuleForkPass]: Inputs to error_injector: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:22Z WARNING 67673 [ErrorInjector]: Unrecognized injected error value "0" 2025-08-09T18:45:22Z USER 67673 [ModuleForkPass]: error_injector finished after 0.009 seconds 2025-08-09T18:45:22Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1671mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:22Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:22Z USER 67673 [ModuleForkPass]: Running vn_splitter 2025-08-09T18:45:22Z INFO 67673 [ModuleForkPass]: Inputs to vn_splitter: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:22Z INFO 67673 [VNSplitter]: INFO (VNSplitter) Collected all the internal vnodes: size = 0 2025-08-09T18:45:22Z INFO 67673 [VNSplitter]: INFO (VNSplitter) Done with analyze and splitting: total dead nodes = 0 2025-08-09T18:45:22Z INFO 67673 [PerformanceProfiler]: number of tensorizer non-local-tensor caused reload left 0 2025-08-09T18:45:22Z INFO 67673 [PerformanceProfiler]: number of tensorizer non-local-tensor caused spill left 0 2025-08-09T18:45:22Z INFO 67673 [VNSplitterPass]: INFO (VNSplitter) Time: 0.009 seconds 2025-08-09T18:45:22Z INFO 67673 [VNSplitterPass]: INFO (VerticalFusion) Time: 0.099 seconds 2025-08-09T18:45:22Z INFO 67673 [VNSplitterPass]: INFO (ShrinkDN) Time: 0.115 seconds 2025-08-09T18:45:22Z USER 67673 [ModuleForkPass]: vn_splitter finished after 0.314 seconds 2025-08-09T18:45:22Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1681mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:22Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:22Z USER 67673 [ModuleForkPass]: Running constant_propagate 2025-08-09T18:45:22Z INFO 67673 [ModuleForkPass]: Inputs to constant_propagate: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:22Z INFO 67673 [ConstantPropagate]: [Constant_propagate for select] directly remove instruction number: 0 2025-08-09T18:45:22Z INFO 67673 [ConstantPropagate]: eliminateDeadStore removed 0 instructions 2025-08-09T18:45:22Z INFO 67673 [ConstantPropagate]: remove_must_alias_dmacopy removed 0 DMAcopys 2025-08-09T18:45:23Z INFO 67673 [ConstantPropagate]: remove_redundant_alias_dmacopy removed 0 DMAcopys 2025-08-09T18:45:23Z INFO 67673 [ConstantPropagate]: remove_redundant_internal2internal_dmacopy removed 0 DMAcopys 2025-08-09T18:45:23Z INFO 67673 [ConstantPropagate]: [Constant_propagate for Affineselect] directly remove instruction number: 0 2025-08-09T18:45:24Z INFO 67673 [ConstantPropagate]: eliminateDeadStore removed 0 instructions 2025-08-09T18:45:24Z INFO 67673 [ConstantPropagate]: remove_must_alias_dmacopy removed 0 DMAcopys 2025-08-09T18:45:24Z INFO 67673 [ConstantPropagate]: remove_redundant_alias_dmacopy removed 0 DMAcopys 2025-08-09T18:45:24Z INFO 67673 [ConstantPropagate]: remove_redundant_internal2internal_dmacopy removed 0 DMAcopys 2025-08-09T18:45:24Z USER 67673 [ModuleForkPass]: constant_propagate finished after 2.035 seconds 2025-08-09T18:45:24Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1684mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:24Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:24Z USER 67673 [ModuleForkPass]: Running lower_ac 2025-08-09T18:45:24Z INFO 67673 [ModuleForkPass]: Inputs to lower_ac: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:24Z INFO 67673 [LowerAC]: INFO (LowerAC) Lowered 0 loads, 0 saves, 0 copies. 2025-08-09T18:45:24Z USER 67673 [ModuleForkPass]: lower_ac finished after 0.049 seconds 2025-08-09T18:45:24Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1684mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:24Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:24Z USER 67673 [ModuleForkPass]: Running input_dma_coalescing 2025-08-09T18:45:24Z INFO 67673 [ModuleForkPass]: Inputs to input_dma_coalescing: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:24Z INFO 67673 [DMAOptimizationBase]: DMA input Coalescing combined 0 input loads 2025-08-09T18:45:24Z USER 67673 [ModuleForkPass]: input_dma_coalescing finished after 0.121 seconds 2025-08-09T18:45:24Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1684mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:24Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:24Z USER 67673 [ModuleForkPass]: Running remat_optimization 2025-08-09T18:45:24Z INFO 67673 [ModuleForkPass]: Inputs to remat_optimization: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:24Z INFO 67673 [RematOpt]: Removed 0 remat instructions 2025-08-09T18:45:24Z USER 67673 [ModuleForkPass]: remat_optimization finished after 0.200 seconds 2025-08-09T18:45:24Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1686mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:24Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:24Z USER 67673 [ModuleForkPass]: Running early_peephole_opts 2025-08-09T18:45:24Z INFO 67673 [ModuleForkPass]: Inputs to early_peephole_opts: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:24Z INFO 67673 [EarlyPeepholeOpts]: PeepholeOpts enabled? ActivationAccumulate: true 2025-08-09T18:45:24Z INFO 67673 [EarlyPeepholeOpts]: Activation Accumulate: 0 2025-08-09T18:45:25Z USER 67673 [ModuleForkPass]: early_peephole_opts finished after 0.096 seconds 2025-08-09T18:45:25Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1686mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:25Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:25Z USER 67673 [ModuleForkPass]: Running coalesce_multichannel_cc_ops 2025-08-09T18:45:25Z INFO 67673 [ModuleForkPass]: Inputs to coalesce_multichannel_cc_ops: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:25Z USER 67673 [ModuleForkPass]: coalesce_multichannel_cc_ops finished after 0.027 seconds 2025-08-09T18:45:25Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1686mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:25Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:25Z USER 67673 [ModuleForkPass]: Running infer_stream_ids 2025-08-09T18:45:25Z INFO 67673 [ModuleForkPass]: Inputs to infer_stream_ids: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:25Z USER 67673 [ModuleForkPass]: infer_stream_ids finished after 0.027 seconds 2025-08-09T18:45:25Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1686mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:25Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:25Z USER 67673 [ModuleForkPass]: Running pre_sched 2025-08-09T18:45:25Z INFO 67673 [ModuleForkPass]: Inputs to pre_sched: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:25Z INFO 67673 [PreSched]: Start PRE scheduling 2 cores: 1 at: Sat Aug 9 18:45:25 2025 2025-08-09T18:45:25Z INFO 67673 [LayerSpiller]: LayerSpill: Start... 2025-08-09T18:45:25Z INFO 67673 [LayerSpiller]: LayerSpill: Found 0 Splits CCs 2025-08-09T18:45:25Z INFO 67673 [LayerSpiller]: Grouped CCs to 0 clusters. 
2025-08-09T18:45:25Z INFO 67673 [LayerSpiller]: LayerSpill: To Spill 0 multi-layer tensors 2025-08-09T18:45:25Z INFO 67673 [LayerSpiller]: LayerSpill: set uninit flag on 0 insts 2025-08-09T18:45:25Z INFO 67673 [LayerSpiller]: LayerSpill: Done. 2025-08-09T18:45:25Z INFO 67673 [PreSched]: Start split live ranges Sat Aug 9 18:45:25 2025 2025-08-09T18:45:25Z INFO 67673 [PreSched]: Num_Splits: 0 2025-08-09T18:45:25Z INFO 67673 [PreSched]: End split live ranges Sat Aug 9 18:45:25 2025 2025-08-09T18:45:25Z INFO 67673 [PreSched]: Strt remove redundncies Sat Aug 9 18:45:25 2025 2025-08-09T18:45:25Z INFO 67673 [PreSched]: remove_redundant_memsets 2025-08-09T18:45:25Z INFO 67673 [PreSched]: remove_redundant_memsets: 0 2025-08-09T18:45:25Z INFO 67673 [PreSched]: remove_redundant_loads 2025-08-09T18:45:25Z INFO 67673 [PreSched]: remove_redundant_loads: 0 2025-08-09T18:45:25Z INFO 67673 [PreSched]: End remove redundncies Sat Aug 9 18:45:25 2025 2025-08-09T18:45:25Z INFO 67673 [PreSched]: Start DCE Sat Aug 9 18:45:25 2025 2025-08-09T18:45:25Z INFO 67673 [PreSched]: eliminateDeadStore removed 0 instructions 2025-08-09T18:45:25Z INFO 67673 [PreSched]: remove_must_alias_dmacopy removed 0 DMAcopys 2025-08-09T18:45:25Z INFO 67673 [PreSched]: remove_redundant_alias_dmacopy removed 0 DMAcopys 2025-08-09T18:45:25Z INFO 67673 [PreSched]: remove_redundant_internal2internal_dmacopy removed 0 DMAcopys 2025-08-09T18:45:25Z INFO 67673 [PreSched]: End DCE Sat Aug 9 18:45:25 2025 2025-08-09T18:45:25Z INFO 67673 [PreSched]: Start build flow dependencies Sat Aug 9 18:45:25 2025 2025-08-09T18:45:25Z INFO 67673 [build_flow_deps]: Start build fdeps. 
Invocation: 1Sat Aug 9 18:45:25 2025 2025-08-09T18:45:25Z INFO 67673 [build_flow_deps]: Allocs: 68412 instructions: 279653 2025-08-09T18:45:27Z INFO 67673 [build_flow_deps]: Build fdeps inserted 698765 edges 2025-08-09T18:45:27Z INFO 67673 [build_flow_deps]: Done build fdeps 698765 Sat Aug 9 18:45:27 2025 2025-08-09T18:45:27Z INFO 67673 [PreSched]: End build flow dependencies Sat Aug 9 18:45:27 2025 2025-08-09T18:45:27Z INFO 67673 [PreSched]: Start remove useless insts Sat Aug 9 18:45:27 2025 2025-08-09T18:45:27Z INFO 67673 [PreSched]: remove_useless_insts 2025-08-09T18:45:27Z INFO 67673 [PreSched]: remove Useless Instructions: 0 2025-08-09T18:45:27Z INFO 67673 [PreSched]: End remove useless insts Sat Aug 9 18:45:27 2025 2025-08-09T18:45:27Z INFO 67673 [PreSched]: Start scratchpad optimization Sat Aug 9 18:45:27 2025 2025-08-09T18:45:27Z INFO 67673 [PreSched]: End scratchpad optimization Sat Aug 9 18:45:27 2025 2025-08-09T18:45:27Z INFO 67673 [PreSched]: DONE PRE scheduling Sat Aug 9 18:45:27 2025 2025-08-09T18:45:27Z USER 67673 [ModuleForkPass]: pre_sched finished after 2.387 seconds 2025-08-09T18:45:27Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1810mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:27Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:27Z USER 67673 [ModuleForkPass]: Running tensor_copy_elim 2025-08-09T18:45:27Z INFO 67673 [ModuleForkPass]: Inputs to tensor_copy_elim: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:27Z INFO 67673 [TensorCopyElim]: Tensor CP elimination: 0 2025-08-09T18:45:27Z INFO 67673 [TensorCopyElim]: eliminateDeadStore removed 0 instructions 2025-08-09T18:45:27Z INFO 67673 [TensorCopyElim]: remove_must_alias_dmacopy removed 0 DMAcopys 2025-08-09T18:45:27Z INFO 67673 [TensorCopyElim]: remove_redundant_alias_dmacopy removed 0 DMAcopys 2025-08-09T18:45:27Z INFO 67673 [TensorCopyElim]: remove_redundant_internal2internal_dmacopy removed 0 DMAcopys 2025-08-09T18:45:27Z USER 67673 [ModuleForkPass]: tensor_copy_elim finished after 0.474 seconds 2025-08-09T18:45:27Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1812mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:27Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:27Z USER 67673 [ModuleForkPass]: Running dynamic_dma_setup 2025-08-09T18:45:27Z INFO 67673 [ModuleForkPass]: Inputs to dynamic_dma_setup: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:27Z USER 67673 [ModuleForkPass]: dynamic_dma_setup finished after 0.007 seconds 2025-08-09T18:45:27Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1812mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:28Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68413 memory location(s), 1 block(s), and 279653 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:28Z USER 67673 [ModuleForkPass]: Running runtime_memory_reservation 2025-08-09T18:45:28Z INFO 67673 [ModuleForkPass]: Inputs to runtime_memory_reservation: modules=1 functions=1 allocs=68413 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:28Z USER 67673 [ModuleForkPass]: runtime_memory_reservation finished after 0.006 seconds 2025-08-09T18:45:28Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1812mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:28Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68413 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:28Z USER 67673 [ModuleForkPass]: Running coloring_allocator_psum 2025-08-09T18:45:28Z INFO 67673 [ModuleForkPass]: Inputs to coloring_allocator_psum: modules=1 functions=1 allocs=68413 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:28Z INFO 67673 [ColoringAllocator::Rep]: Allocating functions 2025-08-09T18:45:28Z INFO 67673 [ColoringAllocator::Rep]: linearize and check 2025-08-09T18:45:28Z INFO 67673 [PSUM_Allocator]: allocating PSUM 2025-08-09T18:45:28Z INFO 67673 [PSUM_Allocator]: main loop 2025-08-09T18:45:28Z INFO 67673 [PSUM_Allocator]: renumber locations 2025-08-09T18:45:28Z INFO 67673 [PSUM_Allocator]: size = 53065 2025-08-09T18:45:28Z INFO 67673 [PSUM_Allocator]: build_no_bitmap start 2025-08-09T18:45:28Z INFO 67673 [PSUM_Allocator]: 100% PSUM demand before spilling 2025-08-09T18:45:28Z INFO 67673 [PSUM_Allocator]: PSUM high-water mark = 8 tensors 2025-08-09T18:45:28Z INFO 67673 [PSUM_Allocator]: found 171648 edges 2025-08-09T18:45:28Z INFO 67673 [PSUM_Allocator]: mean: 6.46935 2025-08-09T18:45:28Z INFO 67673 [PSUM_Allocator]: median: 6.99995 2025-08-09T18:45:28Z INFO 67673 [PSUM_Allocator]: adjacency vectors require 1373184 bytes 2025-08-09T18:45:28Z INFO 67673 [PSUM_Allocator]: build_no_bitmap done 
2025-08-09T18:45:28Z INFO 67673 [PSUM_Allocator]: find costs 2025-08-09T18:45:28Z INFO 67673 [PSUM_Allocator]: best-of-n loop, heuristic = 0, allow_psum_spill_within_accum_group = false 2025-08-09T18:45:28Z INFO 67673 [PSUM_Allocator]: simplify interference graph 2025-08-09T18:45:28Z INFO 67673 [PSUM_Allocator]: initialize low and high 2025-08-09T18:45:28Z INFO 67673 [PSUM_Allocator]: lo = 53065 2025-08-09T18:45:28Z INFO 67673 [PSUM_Allocator]: hi = 0 2025-08-09T18:45:28Z INFO 67673 [PSUM_Allocator]: inf = 0 2025-08-09T18:45:28Z INFO 67673 [PSUM_Allocator]: total = 53065 2025-08-09T18:45:28Z INFO 67673 [PSUM_Allocator]: simplify 2025-08-09T18:45:28Z INFO 67673 [PSUM_Allocator]: new candidates = 0 2025-08-09T18:45:28Z INFO 67673 [PSUM_Allocator]: select ranges 2025-08-09T18:45:28Z INFO 67673 [PSUM_Allocator]: no more spills 2025-08-09T18:45:28Z INFO 67673 [PSUM_Allocator]: PSUM score = 0 (lower is better) 2025-08-09T18:45:28Z INFO 67673 [PSUM_Allocator]: spilling from PSUM cost about 0 cycles 2025-08-09T18:45:28Z INFO 67673 [PSUM_Allocator]: 100% PSUM utilization after allocation 2025-08-09T18:45:28Z USER 67673 [ModuleForkPass]: coloring_allocator_psum finished after 0.663 seconds 2025-08-09T18:45:28Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1828mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:28Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68413 memory location(s), 1 block(s), and 279653 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:28Z USER 67673 [ModuleForkPass]: Running dma_optimization_psum 2025-08-09T18:45:28Z INFO 67673 [ModuleForkPass]: Inputs to dma_optimization_psum: modules=1 functions=1 allocs=68413 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:28Z INFO 67673 [DMAOptimizationBase]: [psum spill optimization]: removed 0 spill/reload instructions 2025-08-09T18:45:28Z INFO 67673 [DMAOptimizationBase]: [psum spill optimization]: removed 0 spill/reload memory locations 2025-08-09T18:45:28Z USER 67673 [ModuleForkPass]: dma_optimization_psum finished after 0.259 seconds 2025-08-09T18:45:28Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1828mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:28Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68413 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:28Z USER 67673 [ModuleForkPass]: Running address_rotation_psum 2025-08-09T18:45:28Z INFO 67673 [ModuleForkPass]: Inputs to address_rotation_psum: modules=1 functions=1 allocs=68413 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:29Z INFO 67673 [DMAOptimizationBase]: PSUM Rotation rotated 0 PSUM Banks 2025-08-09T18:45:30Z INFO 67673 [DMAOptimizationBase]: PSUM Rotation rotated 0 PSUM Banks 2025-08-09T18:45:31Z INFO 67673 [DMAOptimizationBase]: PSUM Rotation rotated 0 PSUM Banks 2025-08-09T18:45:31Z USER 67673 [ModuleForkPass]: address_rotation_psum finished after 2.215 seconds 2025-08-09T18:45:31Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1830mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:31Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68413 memory location(s), 1 block(s), and 279653 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:31Z USER 67673 [ModuleForkPass]: Running coloring_allocator_sb 2025-08-09T18:45:31Z INFO 67673 [ModuleForkPass]: Inputs to coloring_allocator_sb: modules=1 functions=1 allocs=68413 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:31Z INFO 67673 [ColoringAllocator::Rep]: INFO: Pre GCA DRAM bytes loaded 6946398208 2025-08-09T18:45:31Z INFO 67673 [ColoringAllocator::Rep]: INFO: Pre GCA average loaded DMA size 7517 bytes 2025-08-09T18:45:31Z INFO 67673 [ColoringAllocator::Rep]: INFO: Pre GCA DRAM bytes saved 6946365440 2025-08-09T18:45:31Z INFO 67673 [ColoringAllocator::Rep]: INFO: Pre GCA average saved DMA size 7461 bytes 2025-08-09T18:45:31Z INFO 67673 [ColoringAllocator::Rep]: INFO: Post GCA DRAM bytes DMACopyed 0 2025-08-09T18:45:31Z INFO 67673 [ColoringAllocator::Rep]: INFO: Post GCA average DMACopyed DMA size 0 bytes 2025-08-09T18:45:31Z INFO 67673 [ColoringAllocator::Rep]: Allocating functions 2025-08-09T18:45:31Z INFO 67673 [ColoringAllocator::Rep]: linearize and check 2025-08-09T18:45:31Z INFO 67673 [SB_Allocator]: allocating SB 2025-08-09T18:45:31Z INFO 67673 [SB_Allocator]: main loop 2025-08-09T18:45:31Z INFO 67673 [SB_Allocator]: renumber locations 2025-08-09T18:45:31Z INFO 67673 [SB_Allocator]: size = 14548 2025-08-09T18:45:31Z INFO 67673 [SB_Allocator]: find partners 2025-08-09T18:45:31Z INFO 67673 [SB_Allocator]: found 53065 accumulation groups 2025-08-09T18:45:31Z INFO 67673 [SB_Allocator]: largest = 22342.27111_i383 2025-08-09T18:45:31Z INFO 67673 [SB_Allocator]: tensors = 2 2025-08-09T18:45:31Z INFO 67673 [SB_Allocator]: requires 8448 bytes/partition 2025-08-09T18:45:31Z INFO 67673 [SB_Allocator]: expanding partners 2025-08-09T18:45:31Z INFO 67673 []: find first defs for local 2025-08-09T18:45:31Z INFO 67673 []: find first defs for global 2025-08-09T18:45:31Z INFO 67673 [SB_Allocator]: find loads 2025-08-09T18:45:31Z INFO 67673 [SB_Allocator]: 1 pin count 
2025-08-09T18:45:31Z INFO 67673 [SB_Allocator]: 6121 remat count 2025-08-09T18:45:31Z INFO 67673 [SB_Allocator]: 1 pinned tensors will require about 16384 bytes/partition 2025-08-09T18:45:31Z INFO 67673 [SB_Allocator]: build interference graph 2025-08-09T18:45:31Z INFO 67673 [SB_Allocator]: pass 1 int-tree 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: Num intervals 14548 Num locations 14548 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: IntervalTree Build Done 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: info.neighbors init Done 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: info.neighbors partners Done 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: IntervalTree readback Done 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: edge: 32260 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: mean: 4.43497 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: median: 2.00048 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: find costs 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: best-of-n loop, heuristic = 0 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: simplify interference graph 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: initialize safe & unsafe 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: safe = 14546 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: unsafe = 1 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: inf = 0 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: total = 14547 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: simplify 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: simplify_step3_sorted2 #Unsafe 0 #Pinned 0 #Safe 0 minCost 1.79769e+308 maxCost 2.22507e-308 locations 14548 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: new candidates = 0 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: select ranges 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: Total: 14547 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: Spilled: 0.000 (0) 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: Allocated: 1.000 (14547) 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: 
Rover zone: 0.988 (14367) 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: Pre-rover zone: 0.010 (144) 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: Post-rover zone: 0.002 (36) 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: Slice zone: 0.000 (0) 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: Blocks nothing: 0.000 (0) 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: Blocks medium: 0.000 (0) 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: Blocks tall: 1.000 (14547) 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: Visited until tall blocking (mean): 0.996 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: Visited until tall blocking (median): 1.000 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: Visited until tall blocking (p95): 1.000 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: Success 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: SB spills = 0 tensors 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: size = 0 bytes/partition 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: remats = 0 tensors 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: unpinned = 0 tensors 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: size = 0 bytes/partition 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: SB score = 0 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: spilling from SB cost about 0 cycles 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: 16384 bytes/partition (100%) successfully pinned 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: pinning saved approximately 9010 cycles 2025-08-09T18:45:32Z INFO 67673 [SB_Allocator]: 0% SB utilization after allocation 2025-08-09T18:45:32Z INFO 67673 [ColoringAllocator::Rep]: INFO: Post GCA DRAM bytes loaded 6946398208 2025-08-09T18:45:32Z INFO 67673 [ColoringAllocator::Rep]: INFO: Post GCA average loaded DMA size 7517 bytes 2025-08-09T18:45:32Z INFO 67673 [ColoringAllocator::Rep]: INFO: Post GCA DRAM bytes saved 6946365440 2025-08-09T18:45:32Z INFO 67673 [ColoringAllocator::Rep]: INFO: Post GCA average saved DMA size 7461 bytes 
2025-08-09T18:45:32Z INFO 67673 [ColoringAllocator::Rep]: INFO: Post GCA DRAM bytes DMACopyed 0 2025-08-09T18:45:32Z INFO 67673 [ColoringAllocator::Rep]: INFO: Post GCA average DMACopyed DMA size 0 bytes 2025-08-09T18:45:32Z USER 67673 [ModuleForkPass]: coloring_allocator_sb finished after 1.186 seconds 2025-08-09T18:45:32Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1835mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:32Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68413 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:32Z USER 67673 [ModuleForkPass]: Running address_rotation_sb 2025-08-09T18:45:32Z INFO 67673 [ModuleForkPass]: Inputs to address_rotation_sb: modules=1 functions=1 allocs=68413 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:32Z INFO 67673 [DMAOptimizationBase]: SB Rotation rotated 0 Sb address 2025-08-09T18:45:32Z USER 67673 [ModuleForkPass]: address_rotation_sb finished after 0.356 seconds 2025-08-09T18:45:32Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1838mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:32Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68413 memory location(s), 1 block(s), and 279653 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:32Z USER 67673 [ModuleForkPass]: Running dma_optimization_sb 2025-08-09T18:45:32Z INFO 67673 [ModuleForkPass]: Inputs to dma_optimization_sb: modules=1 functions=1 allocs=68413 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:32Z INFO 67673 [DMAOptimizationBase]: DMA optimization In bytes loaded or saved 13892763648, 50.0001% input load, 49.9999% output write, 0% spill/reload [sg0000] 2025-08-09T18:45:32Z INFO 67673 [DMAOptimizationBase]: [DMA optimization]Reload_just_for_save Optimization removed 0 memlocs 2025-08-09T18:45:33Z INFO 67673 [DMAOptimizationBase]: removed 0 identical load 2025-08-09T18:45:33Z INFO 67673 [DMAOptimizationBase]: adjusted 0 DMACopy remat 2025-08-09T18:45:33Z INFO 67673 [DMAOptimizationBase]: adjusted 0 DMACopy remat 2025-08-09T18:45:33Z INFO 67673 [DMAOptimizationBase]: sub-graph will get execute 1 times 2025-08-09T18:45:33Z INFO 67673 [DMAOptimizationBase]: [Load Merging]: removed 0 remat/cloned instructions 2025-08-09T18:45:33Z INFO 67673 [DMAOptimizationBase]: [Load shrink]: shrinked 0 GCA remat/cloned instructions 2025-08-09T18:45:33Z INFO 67673 [DMAOptimizationBase]: [Load Merging + Load shrink] reduced input/const loading DMA traffic 0, 0% out of total dma traffic(6.9464e+09) 2025-08-09T18:45:33Z INFO 67673 [DMAOptimizationBase]: [spill optimization round 0]: removed 0 spill/reload instructions 2025-08-09T18:45:33Z INFO 67673 [DMAOptimizationBase]: [spill optimization round 0]: removed 0 spill/reload memory locations 2025-08-09T18:45:33Z INFO 67673 [DMAOptimizationBase]: [Spill Optimization] reduced DMA traffic 0, -nan% out of total spill/reload dma traffic 2025-08-09T18:45:33Z INFO 67673 [DMAOptimizationBase]: [Allocation optimization]: removed 0 spill/reload instructions 2025-08-09T18:45:33Z INFO 67673 [DMAOptimizationBase]: [Allocation optimization]: removed 0 spill/reload memory locations 2025-08-09T18:45:33Z INFO 67673 [DMAOptimizationBase]: 
[Re-allocation Optimization] reduced DMA traffic 0, -nan% out of total spill/reload dma traffic 2025-08-09T18:45:33Z INFO 67673 [DMAOptimizationBase]: [spill optimization round 0]: removed 0 spill/reload instructions 2025-08-09T18:45:33Z INFO 67673 [DMAOptimizationBase]: [spill optimization round 0]: removed 0 spill/reload memory locations 2025-08-09T18:45:33Z INFO 67673 [DMAOptimizationBase]: [Spill Optimization] reduced DMA traffic 0, -nan% out of total spill/reload dma traffic 2025-08-09T18:45:34Z INFO 67673 [DMAOptimizationBase]: [remove extra save] removed 0 memlocs and 0 instructions 2025-08-09T18:45:34Z INFO 67673 [DMAOptimizationBase]: [remove_memset_spill]: removed 0 spill/reload instructions 2025-08-09T18:45:34Z INFO 67673 [DMAOptimizationBase]: [remove_memset_spill]: removed 0 spill/reload memory locations 2025-08-09T18:45:34Z INFO 67673 [DMAOptimizationBase]: eliminateDeadStore removed 0 instructions 2025-08-09T18:45:34Z INFO 67673 [DMAOptimizationBase]: DMA SpillSave Coalescing Round 0 combined 0 SpillSaves and Reloads 2025-08-09T18:45:34Z INFO 67673 [DMAOptimizationBase]: average loaded DMA size 7517 bytes 2025-08-09T18:45:34Z INFO 67673 [DMAOptimizationBase]: average saved DMA size 7461 bytes 2025-08-09T18:45:34Z INFO 67673 [DMAOptimizationBase]: INFO: Post DMA coalescing DRAM bytes loaded 6946398208 2025-08-09T18:45:34Z INFO 67673 [DMAOptimizationBase]: INFO: Post DMA coalescing average loaded DMA size 7517 bytes 2025-08-09T18:45:34Z INFO 67673 [DMAOptimizationBase]: INFO: Post DMA coalescing DRAM bytes saved 6946365440 2025-08-09T18:45:34Z INFO 67673 [DMAOptimizationBase]: INFO: Post DMA coalescing average saved DMA size 7461 bytes 2025-08-09T18:45:34Z INFO 67673 [DMAOptimizationBase]: [DMA optimization]Reload_just_for_save Optimization removed 0 memlocs 2025-08-09T18:45:34Z INFO 67673 [DMAOptimizationBase]: [Experiment partial DMA access] reduced DMA traffic 0, -nan% out of total spill/reload dma traffic 2025-08-09T18:45:34Z INFO 67673 
[DMAOptimizationBase]: [DMA optimization] reduced DMA traffic 0, 0% out of total dma traffic 2025-08-09T18:45:34Z INFO 67673 [DMAOptimizationBase]: DMA optimization Out bytes loaded or saved 13892763648, 50.0001% input load, 49.9999% output write, 0% spill/reload [sg0000] 2025-08-09T18:45:34Z INFO 67673 [DMAOptimizationBase]: INFO: Post DMA optimization DRAM bytes loaded 6946398208 2025-08-09T18:45:34Z INFO 67673 [DMAOptimizationBase]: INFO: Post DMA optimization average loaded DMA size 7517 bytes 2025-08-09T18:45:34Z INFO 67673 [DMAOptimizationBase]: INFO: Post DMA optimization DRAM bytes saved 6946365440 2025-08-09T18:45:34Z INFO 67673 [DMAOptimizationBase]: INFO: Post DMA optimization average saved DMA size 7461 bytes 2025-08-09T18:45:34Z INFO 67673 [DMAOptimizationBase]: INFO: Post DMA optimization DRAM bytes DMAcopyed 0 2025-08-09T18:45:34Z INFO 67673 [DMAOptimizationBase]: INFO: Post DMA optimization average DMAcopyed DMA size 0 bytes 2025-08-09T18:45:34Z INFO 67673 [DMAOptimizationBase]: INFO: Post DMA optimization average DMA size 7488 bytes 2025-08-09T18:45:34Z INFO 67673 [DMAOptimizationBase]: INFO: Finished set_spill_canreadUninit(module); 2025-08-09T18:45:34Z INFO 67673 [DMAOptimizationBase]: DMA optimization re-enable optimization 2025-08-09T18:45:34Z USER 67673 [ModuleForkPass]: dma_optimization_sb finished after 2.175 seconds 2025-08-09T18:45:34Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1857mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:34Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:34Z USER 67673 [ModuleForkPass]: Running address_rotation_sb 2025-08-09T18:45:34Z INFO 67673 [ModuleForkPass]: Inputs to address_rotation_sb: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:35Z INFO 67673 [DMAOptimizationBase]: SB Rotation rotated 5962 Sb address 2025-08-09T18:45:35Z INFO 67673 [DMAOptimizationBase]: SB Rotation rotated 4811 Sb address 2025-08-09T18:45:35Z INFO 67673 [DMAOptimizationBase]: SB Rotation rotated 0 Sb address 2025-08-09T18:45:36Z INFO 67673 [DMAOptimizationBase]: SB Rotation rotated 0 Sb address 2025-08-09T18:45:36Z INFO 67673 [DMAOptimizationBase]: SB Rotation rotated 2052 Sb address 2025-08-09T18:45:36Z INFO 67673 [DMAOptimizationBase]: SB Rotation rotated 0 Sb address 2025-08-09T18:45:36Z USER 67673 [ModuleForkPass]: address_rotation_sb finished after 2.022 seconds 2025-08-09T18:45:36Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1857mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:36Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:36Z USER 67673 [ModuleForkPass]: Running coloring_allocator_dram 2025-08-09T18:45:36Z INFO 67673 [ModuleForkPass]: Inputs to coloring_allocator_dram: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:36Z INFO 67673 [ColoringAllocator::Rep]: Allocating functions 2025-08-09T18:45:36Z INFO 67673 [ColoringAllocator::Rep]: linearize and check 2025-08-09T18:45:37Z INFO 67673 [DRAM_Allocator]: allocating spills in DRAM pre_link mode for address space Local 2025-08-09T18:45:37Z INFO 67673 [DRAM_Allocator]: reserved space = 16382119936 bytes 2025-08-09T18:45:37Z INFO 67673 [DRAM_Allocator]: spill space = 0 bytes 2025-08-09T18:45:37Z INFO 67673 [DRAM_Allocator]: aligned spill space = 0 bytes 2025-08-09T18:45:37Z INFO 67673 [DRAM_Allocator]: dram space = 107374182400 bytes 2025-08-09T18:45:37Z INFO 67673 [DRAM_Allocator]: renumber locations 2025-08-09T18:45:37Z INFO 67673 [DRAM_Allocator]: size = 0 2025-08-09T18:45:37Z INFO 67673 []: find first defs for local 2025-08-09T18:45:37Z INFO 67673 []: find first defs for global 2025-08-09T18:45:37Z INFO 67673 [DRAM_Allocator]: Num intervals 0 Num locations 0 2025-08-09T18:45:37Z INFO 67673 [DRAM_Allocator]: IntervalTree Build Done 2025-08-09T18:45:37Z INFO 67673 [DRAM_Allocator]: info.neighbors init Done 2025-08-09T18:45:37Z INFO 67673 [DRAM_Allocator]: IntervalTree readback Done 2025-08-09T18:45:37Z INFO 67673 [DRAM_Allocator]: simplify interference graph 2025-08-09T18:45:37Z INFO 67673 [DRAM_Allocator]: initialize low and high 2025-08-09T18:45:37Z INFO 67673 [DRAM_Allocator]: lo = 0 2025-08-09T18:45:37Z INFO 67673 [DRAM_Allocator]: hi = 0 2025-08-09T18:45:37Z INFO 67673 [DRAM_Allocator]: total = 0 2025-08-09T18:45:37Z INFO 67673 [DRAM_Allocator]: simplify 2025-08-09T18:45:37Z INFO 67673 [DRAM_Allocator]: new candidates = 0 2025-08-09T18:45:37Z INFO 67673 [DRAM_Allocator]: select ranges 2025-08-09T18:45:37Z INFO 
67673 [DRAM_Allocator]: CC buffer size limit 524288000 2025-08-09T18:45:37Z INFO 67673 [DRAM_Allocator]: allreduce_dram_hwm 0 2025-08-09T18:45:37Z INFO 67673 [DRAM_Allocator]: Real CC buffer size 0 2025-08-09T18:45:37Z INFO 67673 [DRAM_Allocator]: DRAM hwm after allocation: 0 2025-08-09T18:45:37Z INFO 67673 [DRAM_Allocator]: DRAM allocation successful 2025-08-09T18:45:37Z USER 67673 [ModuleForkPass]: coloring_allocator_dram finished after 0.466 seconds 2025-08-09T18:45:37Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1862mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:37Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:37Z USER 67673 [ModuleForkPass]: Running address_rotation_dram 2025-08-09T18:45:37Z INFO 67673 [ModuleForkPass]: Inputs to address_rotation_dram: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:37Z INFO 67673 [DMAOptimizationBase]: Runtime page size at 512MB 2025-08-09T18:45:37Z INFO 67673 [DMAOptimizationBase]: DRAM hwm before rotation 0 2025-08-09T18:45:37Z INFO 67673 [DMAOptimizationBase]: allreduce buffer size 524288000 2025-08-09T18:45:37Z INFO 67673 [DMAOptimizationBase]: allreduce hwm 0 2025-08-09T18:45:37Z INFO 67673 [DMAOptimizationBase]: Real CC buffer size 0 2025-08-09T18:45:37Z INFO 67673 [DMAOptimizationBase]: DRAM hwm after rotation 0 2025-08-09T18:45:37Z INFO 67673 [DMAOptimizationBase]: DRAM Rotation rotated 0 Dram address 2025-08-09T18:45:37Z USER 67673 [ModuleForkPass]: address_rotation_dram finished after 0.254 seconds 2025-08-09T18:45:37Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1862mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:37Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:37Z USER 67673 [ModuleForkPass]: Running tensorcopy_accel 2025-08-09T18:45:37Z INFO 67673 [ModuleForkPass]: Inputs to tensorcopy_accel: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:37Z INFO 67673 [TensorCopyAccel::Impl]: Running peephole optimization pass 2025-08-09T18:45:37Z INFO 67673 [TensorCopyAccel::Impl]: Accelerated 0 out of 53065 tensorcopy in Function: sg0000 average acceleration factor: -nan 2025-08-09T18:45:37Z USER 67673 [ModuleForkPass]: tensorcopy_accel finished after 0.037 seconds 2025-08-09T18:45:37Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1862mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:37Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:37Z USER 67673 [ModuleForkPass]: Running peephole_opts 2025-08-09T18:45:37Z INFO 67673 [ModuleForkPass]: Inputs to peephole_opts: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:37Z INFO 67673 [PeepholeOpts]: PeepholeOpts enabled? Recip: true Tsp: true Tc: false SplitSelect: true SimplifyMemset true 2025-08-09T18:45:37Z USER 67673 [ModuleForkPass]: peephole_opts finished after 0.109 seconds 2025-08-09T18:45:37Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1862mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:37Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:37Z USER 67673 [ModuleForkPass]: Running lower_kernel 2025-08-09T18:45:37Z INFO 67673 [ModuleForkPass]: Inputs to lower_kernel: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:37Z INFO 67673 [LowerKernel]: Started running LowerKernel 2025-08-09T18:45:37Z INFO 67673 [LowerKernel]: Start of kernel lowering pass, number of insts: 279653, number of allocs: 68412 2025-08-09T18:45:37Z INFO 67673 [LowerKernel]: Scan BKs time (s): 0.022361 2025-08-09T18:45:37Z INFO 67673 [LowerKernel]: Lower BKs time (s): 1.3e-05 2025-08-09T18:45:37Z USER 67673 [ModuleForkPass]: lower_kernel finished after 0.031 seconds 2025-08-09T18:45:37Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1862mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:37Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:37Z USER 67673 [ModuleForkPass]: Running lower_nki_kernel 2025-08-09T18:45:37Z INFO 67673 [ModuleForkPass]: Inputs to lower_nki_kernel: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:37Z USER 67673 [ModuleForkPass]: lower_nki_kernel finished after 0.028 seconds 2025-08-09T18:45:37Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1862mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:37Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:37Z USER 67673 [ModuleForkPass]: Running dynamic_dma_cleanup 2025-08-09T18:45:37Z INFO 67673 [ModuleForkPass]: Inputs to dynamic_dma_cleanup: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:37Z USER 67673 [ModuleForkPass]: dynamic_dma_cleanup finished after 0.044 seconds 2025-08-09T18:45:37Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1864mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:37Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:37Z USER 67673 [ModuleForkPass]: Running birverifier 2025-08-09T18:45:37Z INFO 67673 [ModuleForkPass]: Inputs to birverifier: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:38Z USER 67673 [ModuleForkPass]: birverifier finished after 0.322 seconds 2025-08-09T18:45:38Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1864mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:38Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:38Z USER 67673 [ModuleForkPass]: Running dynamic_dma_scan 2025-08-09T18:45:38Z INFO 67673 [ModuleForkPass]: Inputs to dynamic_dma_scan: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:38Z USER 67673 [ModuleForkPass]: dynamic_dma_scan finished after 0.043 seconds 2025-08-09T18:45:38Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1864mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:38Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:38Z USER 67673 [ModuleForkPass]: Running build_fdeps 2025-08-09T18:45:38Z INFO 67673 [ModuleForkPass]: Inputs to build_fdeps: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:38Z INFO 67673 [build_flow_deps]: Start build fdeps. Invocation: 2Sat Aug 9 18:45:38 2025 2025-08-09T18:45:38Z INFO 67673 [build_flow_deps]: Allocs: 68412 instructions: 279653 2025-08-09T18:45:39Z INFO 67673 [build_flow_deps]: Build fdeps inserted 698765 edges 2025-08-09T18:45:39Z INFO 67673 [build_flow_deps]: Done build fdeps 698765 Sat Aug 9 18:45:39 2025 2025-08-09T18:45:39Z USER 67673 [ModuleForkPass]: build_fdeps finished after 1.197 seconds 2025-08-09T18:45:39Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1896mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:39Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:39Z USER 67673 [ModuleForkPass]: Running remove_redundancies 2025-08-09T18:45:39Z INFO 67673 [ModuleForkPass]: Inputs to remove_redundancies: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:39Z INFO 67673 [RemoveRedundancies]: remove_clobbered_writes 2025-08-09T18:45:39Z INFO 67673 [RemoveRedundancies]: remove_clobbered_writes: 0 2025-08-09T18:45:39Z INFO 67673 [RemoveRedundancies]: remove_useless_insts 2025-08-09T18:45:39Z INFO 67673 [RemoveRedundancies]: remove Useless Instructions: 0 2025-08-09T18:45:39Z USER 67673 [ModuleForkPass]: remove_redundancies finished after 0.164 seconds 2025-08-09T18:45:39Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1896mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:39Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:39Z USER 67673 [ModuleForkPass]: Running anti_dependency_analyzer 2025-08-09T18:45:39Z INFO 67673 [ModuleForkPass]: Inputs to anti_dependency_analyzer: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:39Z INFO 67673 [AntiDependencyAnalyzer]: Batch size: 1000 2025-08-09T18:45:39Z INFO 67673 [AntiDependencyAnalyzer]: Analysis types: {DRAM,ALIAS,PSUM,SB} 2025-08-09T18:45:39Z INFO 67673 [AntiDependencyAnalyzer]: DRAM size: 17179869184 num-bins: 16 bin-size: 1073741824 2025-08-09T18:45:40Z USER 67673 [ModuleForkPass]: anti_dependency_analyzer finished after 1.041 seconds 2025-08-09T18:45:40Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1985mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:40Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:40Z USER 67673 [ModuleForkPass]: Running tensor_copy_elim 2025-08-09T18:45:40Z INFO 67673 [ModuleForkPass]: Inputs to tensor_copy_elim: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:40Z INFO 67673 [TensorCopyElim]: Tensor CP elimination: 0 2025-08-09T18:45:41Z INFO 67673 [TensorCopyElim]: eliminateDeadStore removed 0 instructions 2025-08-09T18:45:41Z USER 67673 [ModuleForkPass]: tensor_copy_elim finished after 0.377 seconds 2025-08-09T18:45:41Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1994mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:41Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:41Z USER 67673 [ModuleForkPass]: Running prefetch_scheduling_before_sched 2025-08-09T18:45:41Z INFO 67673 [ModuleForkPass]: Inputs to prefetch_scheduling_before_sched: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:41Z USER 67673 [ModuleForkPass]: prefetch_scheduling_before_sched finished after 0.007 seconds 2025-08-09T18:45:41Z INFO 67673 [ModuleForkPass]: curr_vmrss: 1994mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:41Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:41Z USER 67673 [ModuleForkPass]: Running post_sched 2025-08-09T18:45:41Z INFO 67673 [ModuleForkPass]: Inputs to post_sched: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:41Z INFO 67673 [post_scheduler]: Start PosT ScheD 3 sunda Sat Aug 9 18:45:41 2025 2025-08-09T18:45:44Z INFO 67673 [post_scheduler]: Time-aware hwm post-sched 2025-08-09T18:45:46Z INFO 67673 [post_scheduler]: Time-aware simulation time: 58352865 2025-08-09T18:45:46Z INFO 67673 [post_scheduler]: Done PosT ScheD Sat Aug 9 18:45:46 2025 2025-08-09T18:45:46Z USER 67673 [ModuleForkPass]: post_sched finished after 5.460 seconds 2025-08-09T18:45:46Z INFO 67673 [ModuleForkPass]: curr_vmrss: 2386mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:46Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:46Z USER 67673 [ModuleForkPass]: Running expand_scheduling_units 2025-08-09T18:45:46Z INFO 67673 [ModuleForkPass]: Inputs to expand_scheduling_units: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:46Z USER 67673 [ModuleForkPass]: expand_scheduling_units finished after 0.038 seconds 2025-08-09T18:45:46Z INFO 67673 [ModuleForkPass]: curr_vmrss: 2142mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:46Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:46Z USER 67673 [ModuleForkPass]: Running address_rotation_sb 2025-08-09T18:45:46Z INFO 67673 [ModuleForkPass]: Inputs to address_rotation_sb: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:48Z INFO 67673 [DMAOptimizationBase]: PSUM Rotation rotated 10969 PSUM Banks 2025-08-09T18:45:49Z INFO 67673 [DMAOptimizationBase]: PSUM Rotation rotated 8848 PSUM Banks 2025-08-09T18:45:50Z INFO 67673 [DMAOptimizationBase]: PSUM Rotation rotated 0 PSUM Banks 2025-08-09T18:45:50Z INFO 67673 [DMAOptimizationBase]: SB Rotation rotated 2531 Sb address 2025-08-09T18:45:51Z INFO 67673 [DMAOptimizationBase]: SB Rotation rotated 2569 Sb address 2025-08-09T18:45:51Z INFO 67673 [DMAOptimizationBase]: SB Rotation rotated 0 Sb address 2025-08-09T18:45:51Z INFO 67673 [DMAOptimizationBase]: SB Rotation rotated 0 Sb address 2025-08-09T18:45:52Z INFO 67673 [DMAOptimizationBase]: SB Rotation rotated 71 Sb address 2025-08-09T18:45:52Z INFO 67673 [DMAOptimizationBase]: moved 0 MM forward 2025-08-09T18:45:52Z INFO 67673 [DMAOptimizationBase]: SB Rotation rotated 0 Sb address 2025-08-09T18:45:53Z INFO 67673 [DMAOptimizationBase]: SB Rotation rotated 0 Sb address 2025-08-09T18:45:53Z USER 67673 [ModuleForkPass]: address_rotation_sb 
finished after 6.509 seconds 2025-08-09T18:45:53Z INFO 67673 [ModuleForkPass]: curr_vmrss: 2178mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:53Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:53Z USER 67673 [ModuleForkPass]: Running anti_dependency_analyzer 2025-08-09T18:45:53Z INFO 67673 [ModuleForkPass]: Inputs to anti_dependency_analyzer: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:53Z INFO 67673 [AntiDependencyAnalyzer]: Batch size: 1000 2025-08-09T18:45:53Z INFO 67673 [AntiDependencyAnalyzer]: Analysis types: {DRAM,ALIAS,PSUM,SB} 2025-08-09T18:45:53Z INFO 67673 [AntiDependencyAnalyzer]: DRAM size: 17179869184 num-bins: 16 bin-size: 1073741824 2025-08-09T18:45:54Z USER 67673 [ModuleForkPass]: anti_dependency_analyzer finished after 0.807 seconds 2025-08-09T18:45:54Z INFO 67673 [ModuleForkPass]: curr_vmrss: 2209mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:54Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:54Z USER 67673 [ModuleForkPass]: Running anti_dependency_analyzer 2025-08-09T18:45:54Z INFO 67673 [ModuleForkPass]: Inputs to anti_dependency_analyzer: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:54Z INFO 67673 [AntiDependencyAnalyzer]: Batch size: 1000 2025-08-09T18:45:54Z INFO 67673 [AntiDependencyAnalyzer]: Analysis types: {DRAM,ALIAS} 2025-08-09T18:45:54Z INFO 67673 [AntiDependencyAnalyzer]: DRAM size: 17179869184 num-bins: 16 bin-size: 1073741824 2025-08-09T18:45:54Z USER 67673 [ModuleForkPass]: anti_dependency_analyzer finished after 0.213 seconds 2025-08-09T18:45:54Z INFO 67673 [ModuleForkPass]: curr_vmrss: 2213mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:54Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:54Z USER 67673 [ModuleForkPass]: Running dep_opt 2025-08-09T18:45:54Z INFO 67673 [ModuleForkPass]: Inputs to dep_opt: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:54Z INFO 67673 [build_flow_deps]: Start build fdeps. Invocation: 3Sat Aug 9 18:45:54 2025 2025-08-09T18:45:54Z INFO 67673 [build_flow_deps]: Allocs: 68412 instructions: 279653 2025-08-09T18:45:55Z INFO 67673 [build_flow_deps]: Build fdeps inserted 685617 edges 2025-08-09T18:45:55Z INFO 67673 [build_flow_deps]: Done build fdeps 685617 Sat Aug 9 18:45:55 2025 2025-08-09T18:45:55Z USER 67673 [ModuleForkPass]: dep_opt finished after 1.580 seconds 2025-08-09T18:45:55Z INFO 67673 [ModuleForkPass]: curr_vmrss: 2213mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:55Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:55Z USER 67673 [ModuleForkPass]: Running report_stats 2025-08-09T18:45:55Z INFO 67673 [ModuleForkPass]: Inputs to report_stats: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:55Z INFO 67673 [ReportStats]: Data Movement Statistics: sg0000
┌─────────────┬────────────────────────────┬───────┬────────────┐
│ Instruction │ Kind                       │ Count │ Bytes      │
├─────────────┼────────────────────────────┼───────┼────────────┤
│ Load        │ Const -> Internal          │ 1     │ 32768      │
│ Load        │ ExternalInput -> Internal  │ 7273  │ 6946365440 │
│ Save        │ Internal -> ExternalOutput │ 7273  │ 6946365440 │
└─────────────┴────────────────────────────┴───────┴────────────┘
2025-08-09T18:45:55Z INFO 67673 [ReportStats]:
┌─────────────────────┬───────┐
│ Bytes per partition │ Count │
├─────────────────────┼───────┤
│ 64                  │ 73    │
│ 256                 │ 74    │
│ 6144                │ 4608  │
│ 8192                │ 9792  │
└─────────────────────┴───────┘
2025-08-09T18:45:55Z INFO 67673 [ReportStats]: MM Stats: #MatMults 212041 #MatMult-Transposes 212041 2025-08-09T18:45:55Z INFO 67673 [ReportStats]: IO Tensor size combined: 16382087168 2025-08-09T18:45:55Z INFO 67673 [ReportStats]: IO Tensor Statistics:
┌────────────────────┬────────────────┬──────────┬──────────────┐
│ Largest IO Tensors │ Kind           │ Src Type │ Size (Bytes) │
├────────────────────┼────────────────┼──────────┼──────────────┤
│ output0            │ ExternalOutput │ bfloat16 │ 622329856    │
│ input0             │ ExternalInput  │ bfloat16 │ 622329856    │
│ output397          │ ExternalOutput │ bfloat16 │ 622329856    │
│ input397           │ ExternalInput  │ bfloat16 │ 622329856    │
│ input8             │ ExternalInput  │ bfloat16 │ 50331648     │
│ input22            │ ExternalInput  │ bfloat16 │ 50331648     │
│ input30            │ ExternalInput  │ bfloat16 │ 50331648     │
│ input20            │ ExternalInput  │ bfloat16 │ 50331648     │
│ input11            │ ExternalInput  │ bfloat16 │ 50331648     │
│ input33            │ ExternalInput  │ bfloat16 │ 50331648     │
└────────────────────┴────────────────┴──────────┴──────────────┘
2025-08-09T18:45:55Z INFO 67673 [ReportStats]: Large (Internal) Tensor Statistics:
┌────────────────────────────┬──────────┬──────────┬──────────────┐
│ Largest Tensors            │ Kind     │ Src Type │ Size (Bytes) │
├────────────────────────────┼──────────┼──────────┼──────────────┤
│ DynamicDMAScratchLoc       │ Internal │ uint8    │ 2097152      │
│ t2499_pftranspose_20873_i5 │ Internal │ bfloat16 │ 1048576      │
│ t2499_pftranspose_20873_i2 │ Internal │ bfloat16 │ 1048576      │
│ t2499_pftranspose_20873_i1 │ Internal │ bfloat16 │ 1048576      │
│ t2499_pftranspose_20873_i3 │ Internal │ bfloat16 │ 1048576      │
│ t2499_pftranspose_20873_i4 │ Internal │ bfloat16 │ 1048576      │
│ t2499_pftranspose_20873_i6 │ Internal │ bfloat16 │ 1048576      │
│ t2499_pftranspose_20873_i9 │ Internal │ bfloat16 │ 1048576      │
│ t2499_pftranspose_20873_i8 │ Internal │ bfloat16 │ 1048576      │
│ t2499_pftranspose_20873_i7 │ Internal │ bfloat16 │ 1048576      │
└────────────────────────────┴──────────┴──────────┴──────────────┘
2025-08-09T18:45:55Z USER 67673 [ModuleForkPass]: report_stats finished after 0.081 seconds 2025-08-09T18:45:55Z INFO 67673 [ModuleForkPass]: curr_vmrss: 2213mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:55Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:55Z USER 67673 [BackendPassManager]: mod_parallel_pass finished after 33.982 seconds 2025-08-09T18:45:55Z INFO 67673 [BackendPassManager]: curr_vmrss: 2213mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:55Z INFO 67673 [BackendPassManager]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s).
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:55Z USER 67673 [BackendPassManager]: Running assign_trigger_engine 2025-08-09T18:45:55Z INFO 67673 [BackendPassManager]: Inputs to assign_trigger_engine: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:56Z INFO 67673 [AssignTriggerEngine]: Assigned trigger engine for 0 DMA instructions. Moved 0 DMA instructions to CC's engines. 2025-08-09T18:45:56Z USER 67673 [BackendPassManager]: assign_trigger_engine finished after 0.121 seconds 2025-08-09T18:45:56Z INFO 67673 [BackendPassManager]: curr_vmrss: 2213mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:56Z INFO 67673 [BackendPassManager]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:56Z USER 67673 [BackendPassManager]: Running subgraph_parallel_pass 2025-08-09T18:45:56Z INFO 67673 [BackendPassManager]: Inputs to subgraph_parallel_pass: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:56Z USER 67673 [SubgraphForkPass]: Running lower_local_collectives 2025-08-09T18:45:56Z INFO 67673 [SubgraphForkPass]: Inputs to lower_local_collectives: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:56Z USER 67673 [SubgraphForkPass]: lower_local_collectives finished after 0.006 seconds 2025-08-09T18:45:56Z INFO 67673 [SubgraphForkPass]: curr_vmrss: 2213mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:56Z INFO 67673 [SubgraphForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:56Z USER 67673 [SubgraphForkPass]: Running extend_shared_lifetimes 2025-08-09T18:45:56Z INFO 67673 [SubgraphForkPass]: Inputs to extend_shared_lifetimes: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:56Z USER 67673 [SubgraphForkPass]: extend_shared_lifetimes finished after 0.006 seconds 2025-08-09T18:45:56Z INFO 67673 [SubgraphForkPass]: curr_vmrss: 2213mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:56Z INFO 67673 [SubgraphForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:56Z USER 67673 [SubgraphForkPass]: Running dead_code_elim 2025-08-09T18:45:56Z INFO 67673 [SubgraphForkPass]: Inputs to dead_code_elim: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:56Z INFO 67673 [DeadCodeElim]: eliminateDeadStore removed 0 instructions 2025-08-09T18:45:56Z USER 67673 [SubgraphForkPass]: dead_code_elim finished after 0.262 seconds 2025-08-09T18:45:56Z INFO 67673 [SubgraphForkPass]: curr_vmrss: 2213mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:56Z INFO 67673 [SubgraphForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:56Z USER 67673 [BackendPassManager]: subgraph_parallel_pass finished after 0.301 seconds 2025-08-09T18:45:56Z INFO 67673 [BackendPassManager]: curr_vmrss: 2213mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:56Z INFO 67673 [BackendPassManager]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:56Z USER 67673 [BackendPassManager]: Running assign_hwdge_engine 2025-08-09T18:45:56Z INFO 67673 [BackendPassManager]: Inputs to assign_hwdge_engine: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:56Z USER 67673 [BackendPassManager]: assign_hwdge_engine finished after 0.040 seconds 2025-08-09T18:45:56Z INFO 67673 [BackendPassManager]: curr_vmrss: 2213mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:56Z INFO 67673 [BackendPassManager]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:56Z USER 67673 [BackendPassManager]: Running mod_parallel_pass 2025-08-09T18:45:56Z INFO 67673 [BackendPassManager]: Inputs to mod_parallel_pass: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:56Z USER 67673 [ModuleForkPass]: Running alloc_queues 2025-08-09T18:45:56Z INFO 67673 [ModuleForkPass]: Inputs to alloc_queues: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:56Z INFO 67673 [AllocQueues]: DMACopy transpose will be triggered from multiple engines 2025-08-09T18:45:56Z INFO 67673 [AllocQueues]: Alloc Queue info:
┌─────────────────┬────────────────┬────────┬────────────┬──────────────────┐
│ Name            │ DMAQueue::Type │ Engine │ Num Queues │ Num instructions │
├─────────────────┼────────────────┼────────┼────────────┼──────────────────┤
│ qSPSpillReload0 │ data           │ SP     │ 16         │ 1                │
│ qPoolDynamic    │ dynamic        │ Pool   │ 16         │ 14546            │
└─────────────────┴────────────────┴────────┴────────────┴──────────────────┘
2025-08-09T18:45:56Z USER 67673 [ModuleForkPass]: alloc_queues finished after 0.041 seconds 2025-08-09T18:45:56Z INFO 67673 [ModuleForkPass]: curr_vmrss: 2213mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:56Z
INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:56Z USER 67673 [ModuleForkPass]: Running chain_dma_transposes 2025-08-09T18:45:56Z INFO 67673 [ModuleForkPass]: Inputs to chain_dma_transposes: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:56Z USER 67673 [ModuleForkPass]: chain_dma_transposes finished after 0.006 seconds 2025-08-09T18:45:56Z INFO 67673 [ModuleForkPass]: curr_vmrss: 2213mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:56Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:56Z USER 67673 [ModuleForkPass]: Running prefetch_scheduling_after_sched 2025-08-09T18:45:56Z INFO 67673 [ModuleForkPass]: Inputs to prefetch_scheduling_after_sched: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:56Z USER 67673 [ModuleForkPass]: prefetch_scheduling_after_sched finished after 0.006 seconds 2025-08-09T18:45:56Z INFO 67673 [ModuleForkPass]: curr_vmrss: 2213mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:56Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:56Z USER 67673 [ModuleForkPass]: Running lower_control 2025-08-09T18:45:56Z INFO 67673 [ModuleForkPass]: Inputs to lower_control: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:56Z INFO 67673 [LowerControl]: EraseInterBbDeps removed 0 inter-BB deps 2025-08-09T18:45:56Z USER 67673 [ModuleForkPass]: lower_control finished after 0.214 seconds 2025-08-09T18:45:56Z INFO 67673 [ModuleForkPass]: curr_vmrss: 2213mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:56Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:56Z USER 67673 [BackendPassManager]: mod_parallel_pass finished after 0.300 seconds 2025-08-09T18:45:56Z INFO 67673 [BackendPassManager]: curr_vmrss: 2213mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:56Z INFO 67673 [BackendPassManager]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:56Z USER 67673 [BackendPassManager]: Running nc_parallel_pass 2025-08-09T18:45:56Z INFO 67673 [BackendPassManager]: Inputs to nc_parallel_pass: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:56Z USER 67673 [CoreForkPass]: Running dep_reduction 2025-08-09T18:45:56Z INFO 67673 [CoreForkPass]: Inputs to dep_reduction: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:56Z INFO 67673 [DepReduction]: Start Dependency Reduction 2025-08-09T18:45:56Z INFO 67673 [DepReduction]: Processing async instrs... 2025-08-09T18:45:56Z INFO 67673 [DepReduction]: Processing secondary edges per engine... 2025-08-09T18:45:57Z INFO 67673 [DepReduction]: Processing secondary edges per engine, Done. 
Num edges removed 473602 2025-08-09T18:45:57Z INFO 67673 [DepReduction]: Processing redundant descendants, Done. Num edges removed 486433 2025-08-09T18:45:57Z INFO 67673 [DepReduction]: Processing async instrs, Done. Num edges removed 486433 2025-08-09T18:45:58Z INFO 67673 [DepReduction]: Num Async removed: 0 2025-08-09T18:45:58Z INFO 67673 [DepReduction]: Finished dependency reduction: 1150790 removed, new total 112455 2025-08-09T18:45:58Z INFO 67673 [DepReduction]: Finished Dependency Reduction 2025-08-09T18:45:58Z USER 67673 [CoreForkPass]: dep_reduction finished after 1.704 seconds 2025-08-09T18:45:58Z INFO 67673 [CoreForkPass]: curr_vmrss: 2225mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:58Z INFO 67673 [CoreForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:58Z USER 67673 [CoreForkPass]: Running lower_dynamic_dma 2025-08-09T18:45:58Z INFO 67673 [CoreForkPass]: Inputs to lower_dynamic_dma: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:58Z USER 67673 [CoreForkPass]: lower_dynamic_dma finished after 0.072 seconds 2025-08-09T18:45:58Z INFO 67673 [CoreForkPass]: curr_vmrss: 2225mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:58Z INFO 67673 [CoreForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:58Z USER 67673 [CoreForkPass]: Running legalize_dynamic_dma 2025-08-09T18:45:58Z INFO 67673 [CoreForkPass]: Inputs to legalize_dynamic_dma: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:58Z INFO 67673 [LegalizeDynamicDMA]: Legalize Dynamic DMA scanned 0 DGE instructions 2025-08-09T18:45:58Z INFO 67673 [LegalizeDynamicDMA]: After Legalize Dynamic DMA, 0 DGE instructions were scanned 2025-08-09T18:45:58Z INFO 67673 [LegalizeDynamicDMA]:
┌───────────┬───────────────────────────────┬────────────────────────────┐
│ Sub-Pass  │ Illegal Instructions Detected │ New Instructions Generated │
├───────────┼───────────────────────────────┼────────────────────────────┤
│ Peeling   │ 0                             │ 0                          │
│ Unrolling │ 0                             │ 0                          │
│ Splitting │ 0                             │ 0                          │
└───────────┴───────────────────────────────┴────────────────────────────┘
2025-08-09T18:45:58Z USER 67673 [CoreForkPass]: legalize_dynamic_dma finished after 0.133 seconds 2025-08-09T18:45:58Z INFO 67673 [CoreForkPass]: curr_vmrss: 2225mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:58Z INFO 67673 [CoreForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279653 instruction(s).
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:58Z USER 67673 [CoreForkPass]: Running lower_dma 2025-08-09T18:45:58Z INFO 67673 [CoreForkPass]: Inputs to lower_dma: modules=1 functions=1 allocs=68412 blocks=1 instructions=279653 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:58Z INFO 67673 [LowerDMA]: lower_dma metrics start
IO
  Copy (DGE/DMA)
    128 partition : 14473/14473 (100% DGE)
    power-of-2 partition : 14546/14546 (100% DGE)
    > 3 dimensional : 0/0
    non-integer desc size : 0/0
    total : 14546/14546 (100% DGE)
  Cast (DGE/DMA)
    128 partition : 0/0
    power-of-2 partition : 0/0
    > 3 dimensional : 0/0
    non-integer desc size : 0/0
    total : 0/0
Spill/Reload
  Copy (DGE/DMA)
    128 partition : 0/1 (0% DGE)
    power-of-2 partition : 0/1 (0% DGE)
    > 3 dimensional : 0/0
    non-integer desc size : 0/0
    total : 0/1 (0% DGE)
  Cast (DGE/DMA)
    128 partition : 0/0
    power-of-2 partition : 0/0
    > 3 dimensional : 0/0
    non-integer desc size : 0/0
    total : 0/0
CopyMode
  CCE : 0
  Transpose : 0
  Replicate : 0
Dynamic (DGE/DMA)
  scalar : 0/0
  vector : 0/0
Opcode
  ReadVarAddr : 0
  IndirectLoad : 0
  IndirectSave : 0
  IndirectSaveAccumulate : 0
  DstReduceDGE : 0
lower_dma metrics end
2025-08-09T18:45:58Z USER 67673 [CoreForkPass]: lower_dma finished after 0.165 seconds 2025-08-09T18:45:58Z INFO 67673 [CoreForkPass]: curr_vmrss: 2225mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:58Z INFO 67673 [CoreForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279661 instruction(s).
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:58Z USER 67673 [CoreForkPass]: Running coalesce_dma_blocks 2025-08-09T18:45:58Z INFO 67673 [CoreForkPass]: Inputs to coalesce_dma_blocks: modules=1 functions=1 allocs=68412 blocks=1 instructions=279661 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:58Z INFO 67673 [CoalesceDmaBlocks]: Coaleseced 0 DMA triggers 2025-08-09T18:45:58Z USER 67673 [CoreForkPass]: coalesce_dma_blocks finished after 0.138 seconds 2025-08-09T18:45:58Z INFO 67673 [CoreForkPass]: curr_vmrss: 2226mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:58Z INFO 67673 [CoreForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279661 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:58Z USER 67673 [CoreForkPass]: Running expand_all_engine 2025-08-09T18:45:59Z INFO 67673 [CoreForkPass]: Inputs to expand_all_engine: modules=1 functions=1 allocs=68412 blocks=1 instructions=279661 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:59Z USER 67673 [CoreForkPass]: expand_all_engine finished after 0.055 seconds 2025-08-09T18:45:59Z INFO 67673 [CoreForkPass]: curr_vmrss: 2226mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:59Z INFO 67673 [CoreForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279661 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:59Z USER 67673 [CoreForkPass]: Running alloc_semaphores 2025-08-09T18:45:59Z INFO 67673 [CoreForkPass]: Inputs to alloc_semaphores: modules=1 functions=1 allocs=68412 blocks=1 instructions=279661 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:59Z USER 67673 [CoreForkPass]: alloc_semaphores finished after 0.291 seconds 2025-08-09T18:45:59Z INFO 67673 [CoreForkPass]: curr_vmrss: 2226mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:59Z INFO 67673 [CoreForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279661 instruction(s). 
Max writers: 64 Max Readers: 212041 2025-08-09T18:45:59Z USER 67673 [CoreForkPass]: Running expand_inst_late 2025-08-09T18:45:59Z INFO 67673 [CoreForkPass]: Inputs to expand_inst_late: modules=1 functions=1 allocs=68412 blocks=1 instructions=279661 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:59Z USER 67673 [CoreForkPass]: expand_inst_late finished after 0.278 seconds 2025-08-09T18:45:59Z INFO 67673 [CoreForkPass]: curr_vmrss: 2226mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:59Z INFO 67673 [CoreForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279661 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:59Z USER 67673 [CoreForkPass]: Running seq_inst_opt 2025-08-09T18:45:59Z INFO 67673 [CoreForkPass]: Inputs to seq_inst_opt: modules=1 functions=1 allocs=68412 blocks=1 instructions=279661 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:59Z INFO 67673 [SeqInstOpt]: Removing 0 unnecessary InstRegisterMove instruction(s) from Block1 2025-08-09T18:45:59Z USER 67673 [CoreForkPass]: seq_inst_opt finished after 0.041 seconds 2025-08-09T18:45:59Z INFO 67673 [CoreForkPass]: curr_vmrss: 2226mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:59Z INFO 67673 [CoreForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 279661 instruction(s). Max writers: 64 Max Readers: 212041 2025-08-09T18:45:59Z USER 67673 [CoreForkPass]: Running lower_sync 2025-08-09T18:45:59Z INFO 67673 [CoreForkPass]: Inputs to lower_sync: modules=1 functions=1 allocs=68412 blocks=1 instructions=279661 Max writers: 64 Max Readers: 212041 2025-08-09T18:45:59Z USER 67673 [CoreForkPass]: lower_sync finished after 0.138 seconds 2025-08-09T18:45:59Z INFO 67673 [CoreForkPass]: curr_vmrss: 2226mb, ru_maxrss: 2494mb (delta=0mb) 2025-08-09T18:45:59Z INFO 67673 [CoreForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 295353 instruction(s). 
Max writers: 64 Max Readers: 212041
2025-08-09T18:45:59Z USER 67673 [CoreForkPass]: Running lower_act
2025-08-09T18:45:59Z INFO 67673 [CoreForkPass]: Inputs to lower_act: modules=1 functions=1 allocs=68412 blocks=1 instructions=295353
Max writers: 64 Max Readers: 212041
2025-08-09T18:45:59Z USER 67673 [CoreForkPass]: lower_act finished after 0.050 seconds
2025-08-09T18:45:59Z INFO 67673 [CoreForkPass]: curr_vmrss: 2226mb, ru_maxrss: 2494mb (delta=0mb)
2025-08-09T18:45:59Z INFO 67673 [CoreForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 295354 instruction(s).
Max writers: 64 Max Readers: 212041
2025-08-09T18:45:59Z USER 67673 [CoreForkPass]: Running lower_dve
2025-08-09T18:45:59Z INFO 67673 [CoreForkPass]: Inputs to lower_dve: modules=1 functions=1 allocs=68412 blocks=1 instructions=295354
Max writers: 64 Max Readers: 212041
2025-08-09T18:45:59Z INFO 67673 [LowerDVE]: Loading DVE opcodes table dve_info.json from /opt/aws_neuronx_venv_pytorch_2_7_nxd_inference/lib/python3.10/site-packages/neuronxcc/dve/dve_bin_gen2/dve_info.json
2025-08-09T18:46:00Z USER 67673 [CoreForkPass]: lower_dve finished after 0.309 seconds
2025-08-09T18:46:00Z INFO 67673 [CoreForkPass]: curr_vmrss: 2254mb, ru_maxrss: 2494mb (delta=0mb)
2025-08-09T18:46:00Z INFO 67673 [CoreForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 295354 instruction(s).
Max writers: 64 Max Readers: 212041
2025-08-09T18:46:00Z USER 67673 [CoreForkPass]: Running lower_ap
2025-08-09T18:46:00Z INFO 67673 [CoreForkPass]: Inputs to lower_ap: modules=1 functions=1 allocs=68412 blocks=1 instructions=295354
Max writers: 64 Max Readers: 212041
2025-08-09T18:46:00Z USER 67673 [CoreForkPass]: lower_ap finished after 0.069 seconds
2025-08-09T18:46:00Z INFO 67673 [CoreForkPass]: curr_vmrss: 2108mb, ru_maxrss: 2494mb (delta=0mb)
2025-08-09T18:46:00Z INFO 67673 [CoreForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 295354 instruction(s).
Max writers: 64 Max Readers: 212041
2025-08-09T18:46:00Z USER 67673 [CoreForkPass]: Running coloring_allocator_reg
2025-08-09T18:46:00Z INFO 67673 [CoreForkPass]: Inputs to coloring_allocator_reg: modules=1 functions=1 allocs=68412 blocks=1 instructions=295354
Max writers: 64 Max Readers: 212041
2025-08-09T18:46:00Z INFO 67673 [ColoringAllocator::Rep]: Allocating functions
2025-08-09T18:46:00Z INFO 67673 [ColoringAllocator::Rep]: linearize and check
2025-08-09T18:46:00Z INFO 67673 [REG_Allocator]: allocating REG
2025-08-09T18:46:00Z INFO 67673 [REG_Allocator]: main loop iteration 1
2025-08-09T18:46:00Z USER 67673 [CoreForkPass]: coloring_allocator_reg finished after 0.055 seconds
2025-08-09T18:46:00Z INFO 67673 [CoreForkPass]: curr_vmrss: 2119mb, ru_maxrss: 2494mb (delta=0mb)
2025-08-09T18:46:00Z INFO 67673 [CoreForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 295354 instruction(s).
Max writers: 64 Max Readers: 212041
2025-08-09T18:46:00Z USER 67673 [BackendPassManager]: nc_parallel_pass finished after 3.677 seconds
2025-08-09T18:46:00Z INFO 67673 [BackendPassManager]: curr_vmrss: 2119mb, ru_maxrss: 2494mb (delta=0mb)
2025-08-09T18:46:00Z INFO 67673 [BackendPassManager]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 295354 instruction(s).
Max writers: 64 Max Readers: 212041
2025-08-09T18:46:00Z USER 67673 [BackendPassManager]: Running mod_parallel_pass
2025-08-09T18:46:00Z INFO 67673 [BackendPassManager]: Inputs to mod_parallel_pass: modules=1 functions=1 allocs=68412 blocks=1 instructions=295354
Max writers: 64 Max Readers: 212041
2025-08-09T18:46:00Z USER 67673 [ModuleForkPass]: Running birverifier
2025-08-09T18:46:00Z INFO 67673 [ModuleForkPass]: Inputs to birverifier: modules=1 functions=1 allocs=68412 blocks=1 instructions=295354
Max writers: 64 Max Readers: 212041
2025-08-09T18:46:00Z USER 67673 [ModuleForkPass]: birverifier finished after 0.306 seconds
2025-08-09T18:46:00Z INFO 67673 [ModuleForkPass]: curr_vmrss: 2119mb, ru_maxrss: 2494mb (delta=0mb)
2025-08-09T18:46:00Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 295354 instruction(s).
Max writers: 64 Max Readers: 212041
2025-08-09T18:46:00Z USER 67673 [BackendPassManager]: mod_parallel_pass finished after 0.321 seconds
2025-08-09T18:46:00Z INFO 67673 [BackendPassManager]: curr_vmrss: 2119mb, ru_maxrss: 2494mb (delta=0mb)
2025-08-09T18:46:00Z INFO 67673 [BackendPassManager]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 295354 instruction(s).
Max writers: 64 Max Readers: 212041
2025-08-09T18:46:00Z USER 67673 [BackendPassManager]: Running subgraph_parallel_pass
2025-08-09T18:46:00Z INFO 67673 [BackendPassManager]: Inputs to subgraph_parallel_pass: modules=1 functions=1 allocs=68412 blocks=1 instructions=295354
Max writers: 64 Max Readers: 212041
2025-08-09T18:46:00Z USER 67673 [SubgraphForkPass]: Running lnc_verifier
2025-08-09T18:46:00Z INFO 67673 [SubgraphForkPass]: Inputs to lnc_verifier: modules=1 functions=1 allocs=68412 blocks=1 instructions=295354
Max writers: 64 Max Readers: 212041
2025-08-09T18:46:00Z USER 67673 [SubgraphForkPass]: lnc_verifier finished after 0.006 seconds
2025-08-09T18:46:00Z INFO 67673 [SubgraphForkPass]: curr_vmrss: 2119mb, ru_maxrss: 2494mb (delta=0mb)
2025-08-09T18:46:00Z INFO 67673 [SubgraphForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 295354 instruction(s).
Max writers: 64 Max Readers: 212041
2025-08-09T18:46:00Z USER 67673 [BackendPassManager]: subgraph_parallel_pass finished after 0.019 seconds
2025-08-09T18:46:00Z INFO 67673 [BackendPassManager]: curr_vmrss: 2119mb, ru_maxrss: 2494mb (delta=0mb)
2025-08-09T18:46:00Z INFO 67673 [BackendPassManager]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 295354 instruction(s).
Max writers: 64 Max Readers: 212041
2025-08-09T18:46:00Z USER 67673 [BackendPassManager]: Running mod_parallel_pass
2025-08-09T18:46:00Z INFO 67673 [BackendPassManager]: Inputs to mod_parallel_pass: modules=1 functions=1 allocs=68412 blocks=1 instructions=295354
Max writers: 64 Max Readers: 212041
2025-08-09T18:46:00Z USER 67673 [ModuleForkPass]: Running codegen
2025-08-09T18:46:00Z INFO 67673 [ModuleForkPass]: Inputs to codegen: modules=1 functions=1 allocs=68412 blocks=1 instructions=295354
Max writers: 64 Max Readers: 212041
2025-08-09T18:46:00Z INFO 67673 [Codegen]: Total compiler allocated DRAM tensors: 0 GB
2025-08-09T18:46:00Z INFO 67673 [Codegen]: Total un-allocated DRAM tensors by kind:
2025-08-09T18:46:00Z INFO 67673 [Codegen]:
┌───────────────┬─────────────┐
│ TensorKind    │ Size (GB)   │
├───────────────┼─────────────┤
│ ExternalInput │ 7.6285      │
│ Const         │ 3.05176e-05 │
└───────────────┴─────────────┘
2025-08-09T18:46:00Z INFO 67673 [Codegen]: Total runtime managed DRAM tensors: 7.62853 GB
2025-08-09T18:46:01Z INFO 67673 [Codegen]: Instruction Stats:
2025-08-09T18:46:01Z INFO 67673 [Codegen]:
┌─────────────────────┬────────┐
│ Opcode              │ Count  │
├─────────────────────┼────────┤
│ LDWEIGHTS           │ 212041 │
│ MATMUL              │ 212041 │
│ ACTIVATE            │ 53065  │
│ EVENT_SEMAPHORE     │ 15692  │
│ UNKNOWN(0xd4)       │ 14546  │
│ NOP                 │ 7      │
│ PSEUDO_BRANCH_LABEL │ 5      │
│ ACT_TABLE_LOAD      │ 1      │
│ PSEUDO_DMA_TRIGGER  │ 1      │
└─────────────────────┴────────┘
2025-08-09T18:46:01Z INFO 67673 [Codegen]:
┌────────────┬────────┐
│ Engine     │ Count  │
├────────────┼────────┤
│ Unassigned │ 0      │
│ GPSIMD     │ 21233  │
│ Scalar     │ 55905  │
│ Tensor     │ 430261 │
│ SyncDMA    │ 0      │
│ Vector     │ 2      │
│ Sync       │ 3      │
│ All        │ 0      │
└────────────┴────────┘
2025-08-09T18:46:01Z INFO 67673 [Codegen]: Total instructions: 507404 (0.0302436 GB)
2025-08-09T18:46:01Z INFO 67673 [Codegen]: Total DynamicDMA instruction count: 14546
2025-08-09T18:46:01Z USER 67673 [Codegen]: isa_gen finished after 1.123 seconds
2025-08-09T18:46:01Z INFO 67673 [Codegen]: Number of DMA descriptors on each queue instance:
┌─────────────────┬────────────────┐
│ Queue Instance  │ RT Descriptors │
├─────────────────┼────────────────┤
│ qSPSpillReload0 │ 256            │
└─────────────────┴────────────────┘
Total descriptors: 256 (3.8147e-06 GB)
2025-08-09T18:46:01Z INFO 67673 [Codegen]: Number of DMA engines used by each queue:
┌─────────────────┬─────────────────────┐
│ Queue           │ DMA Engines         │
├─────────────────┼─────────────────────┤
│ qSPSpillReload0 │ 16                  │
│ qPoolDynamic    │ 16                  │
├─────────────────┼─────────────────────┤
│ TOTAL           │ 32 (must be <= 176) │
└─────────────────┴─────────────────────┘
2025-08-09T18:46:01Z INFO 67673 [Codegen]: Tensors with largest descriptor count:
┌──────────────────────┬──────────┬──────────┬──────────────────┐
│ Tensor Name          │ Kind     │ Src Type │ Descriptor Count │
├──────────────────────┼──────────┼──────────┼──────────────────┤
│ identity_local_25028 │ Internal │ bfloat16 │ 1                │
│ identity_25026       │ Const    │ bfloat16 │ 1                │
└──────────────────────┴──────────┴──────────┴──────────────────┘
2025-08-09T18:46:01Z USER 67673 [Codegen]: dma_desc_gen finished after 0.000 seconds
2025-08-09T18:46:01Z INFO 67673 [Codegen]: Estimated peak DRAM usage: 7.65878 GB
2025-08-09T18:46:01Z INFO 67673 [Codegen]: Generating debug info
2025-08-09T18:46:02Z USER 67673 [Codegen]: debug_info_gen finished after 0.613 seconds
2025-08-09T18:46:02Z USER 67673 [ModuleForkPass]: codegen finished after 1.797 seconds
2025-08-09T18:46:02Z INFO 67673 [ModuleForkPass]: curr_vmrss: 2311mb, ru_maxrss: 2494mb (delta=0mb)
2025-08-09T18:46:02Z INFO 67673 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 295354 instruction(s).
Max writers: 64 Max Readers: 212041
2025-08-09T18:46:02Z USER 67673 [BackendPassManager]: mod_parallel_pass finished after 1.826 seconds
2025-08-09T18:46:02Z INFO 67673 [BackendPassManager]: curr_vmrss: 2130mb, ru_maxrss: 2494mb (delta=0mb)
2025-08-09T18:46:02Z INFO 67673 [BackendPassManager]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 295354 instruction(s).
Max writers: 64 Max Readers: 212041
2025-08-09T18:46:02Z USER 67673 [BackendPassManager]: Running neff_packager
2025-08-09T18:46:02Z INFO 67673 [BackendPassManager]: Inputs to neff_packager: modules=1 functions=1 allocs=68412 blocks=1 instructions=295354
Max writers: 64 Max Readers: 212041
2025-08-09T18:46:02Z WARNING 67673 [NeffFileWriter]: writeKelp missing file /local/p4clients/pkgbuild-const/workspace/build/KaenaCompiler/KaenaCompiler-2.x.169490.0/AL2_x86_64/DEV.STD.PTHREAD/build/private/_skbuild/linux-x86_64-3.10/cmake-build/neuronxcc/walrus/neff_packager/MetricMetadata.json
2025-08-09T18:46:02Z INFO 67673 [NeffFileWriter]: Neff will be written to: /home/ubuntu/qwen3/layout_opt/graph.neff
2025-08-09T18:46:02Z INFO 67673 [NeffFileWriter]: IR signature: c6cb604c4535169891036e23b5114d01 for neff artifacts
2025-08-09T18:46:02Z USER 67673 [BackendPassManager]: neff_packager finished after 0.313 seconds
2025-08-09T18:46:02Z INFO 67673 [BackendPassManager]: curr_vmrss: 2131mb, ru_maxrss: 2494mb (delta=0mb)
2025-08-09T18:46:02Z INFO 67673 [BackendPassManager]: Output has 1 module(s), 1 function(s), 68412 memory location(s), 1 block(s), and 295354 instruction(s).
Max writers: 64 Max Readers: 212041
2025-08-09T18:46:02Z INFO 67673 [BackendDriver]: HBM scratchpad usage summary (post-allocation):
┌──────┬───────────┬────────────────────────────────────────────────────────────┬─────────────┐
│ Core │ Subgraph  │ Description                                                │ Value       │
├──────┼───────────┼────────────────────────────────────────────────────────────┼─────────────┤
│ nc00 │ module    │ Peak scratchpad usage: local                               │ 0.000000 GB │
│ nc00 │ module    │ Total size of allocated tensors: local                     │ 0.000000 GB │
│ nc00 │ Max       │ Peak scratchpad usage: local                               │ 0.000000 GB │
│ nc00 │ Post-link │ Peak scratchpad usage after intermediate tensor allocation │ 0.000000 GB │
│ nc00 │ Post-link │ Total size of allocated intermediate tensors               │ 0.000000 GB │
├──────┼───────────┼────────────────────────────────────────────────────────────┼─────────────┤
│ Max  │ Max       │ Peak scratchpad usage                                      │ 0.000000 GB │
│ Max  │ Max       │ Peak scratchpad usage (page-aligned)                       │ 0.000000 GB │
└──────┴───────────┴────────────────────────────────────────────────────────────┴─────────────┘
2025-08-09T18:46:02Z INFO 67673 [BackendDriver]: Backend completed successfully, tearing down.
2025-08-09T18:46:03Z INFO 67605 [job.WalrusDriver.0]: Job #0 finished
2025-08-09T18:46:03Z INFO 67605 [pipeline.Pipeline.0]: Finished job job.WalrusDriver.0
2025-08-09T18:46:03Z INFO 67605 [pipeline.Pipeline.0]: Starting job job.BIRLinker.0
2025-08-09T18:46:03Z INFO 67605 [job.BIRLinker.0]: Replay this job by calling: /opt/aws_neuronx_venv_pytorch_2_7_nxd_inference/bin/neuronx-cc compile --framework XLA --state '{"model": ["/home/ubuntu/qwen3/layout_opt/model/graph.hlo"], "tensormap": "tensor_map.json", "bir": "bir.json", "lorean_sg_key": null, "input_name_map": null, "output_name_map": null, "constant_tensors": null, "state_dir": "/home/ubuntu/neuronxcc-mk9kpjyq/sg00", "state_id": "sg00"}' --pipeline BIRLinker
2025-08-09T18:46:03Z INFO 67605 [job.BIRLinker.0]: BIRLinker cwd: /home/ubuntu/neuronxcc-mk9kpjyq
2025-08-09T18:46:03Z INFO 67605 [job.BIRLinker.0]: Linking not needed. Netlist doesnt exist
2025-08-09T18:46:03Z INFO 67605 [pipeline.Pipeline.0]: Finished job job.BIRLinker.0
2025-08-09T18:46:03Z INFO 67605 [pipeline.Pipeline.0]: Starting job job.Kelper.0
2025-08-09T18:46:03Z INFO 67605 [job.Kelper.0]: Skipping neff generation which was already performed by neff_packager
2025-08-09T18:46:03Z INFO 67605 [pipeline.Pipeline.0]: Finished job job.Kelper.0
2025-08-09T18:46:03Z INFO 67605 [pipeline.Pipeline.0]: Starting job job.NeffWrapper.0
2025-08-09T18:46:03Z INFO 67605 [job.NeffWrapper.0]: Job NeffWrapper len(in_states) 1
2025-08-09T18:46:03Z INFO 67605 [job.NeffWrapper.0]: Processing input #0
2025-08-09T18:46:03Z INFO 67605 [job.NeffWrapper.0]: Start NeffWrapper
2025-08-09T18:46:03Z INFO 67605 [job.NeffWrapper.0]: Executing: /opt/aws_neuronx_venv_pytorch_2_7_nxd_inference/lib/python3.10/site-packages/neuronxcc/starfish/bin/hlo-neff-wrapper --hlo /home/ubuntu/qwen3/layout_opt/model/graph.hlo --neff /home/ubuntu/qwen3/layout_opt/graph.neff --io_transposes /home/ubuntu/neuronxcc-mk9kpjyq/io_transposes.json --output /home/ubuntu/qwen3/layout_opt/wrapped_neff.hlo --netlist /home/ubuntu/neuronxcc-mk9kpjyq/hlo_netlist.json
2025-08-09T18:46:04Z INFO 67605 [job.NeffWrapper.0]: Could not open file: /home/ubuntu/neuronxcc-mk9kpjyq/hlo_netlist.json There are no io transposes nor zero-sized parameters. Output will not be produced. Hlo neff wrapper finished successfully. Have a wonderful day :D
2025-08-09T18:46:04Z INFO 67605 [job.NeffWrapper.0]: Job #0 finished
2025-08-09T18:46:04Z INFO 67605 [pipeline.Pipeline.0]: Finished job job.NeffWrapper.0
2025-08-09T18:46:04Z INFO 67605 [pipeline.Pipeline.0]: Finished pipeline Pipeline
2025-08-09T18:46:04Z INFO 67605 [pipeline.Pipeline.0]: Job #0 finished
2025-08-09T18:46:04Z INFO 67541 [root]: Subcommand returned with exitcode=0