After 2 months of refinement, I'm happy to announce that a lot of Transformers' modeling code is now significantly more torch.compile- and torch.export-friendly 🔥
Why it had to be done 👇
PyTorch's Dynamo compiler is increasingly becoming the default interoperability layer for ML systems. Anything that relies on torch.export or torch.compile, from model optimization to cross-framework integrations, benefits directly when models can be captured as a single Dynamo-traced graph!
Transformers models are now easier to:
⚙️ Compile end-to-end with torch.compile backends
📦 Export reliably via torch.export and torch.onnx.export
🚀 Deploy to ONNX / ONNX Runtime, Intel Corporation's OpenVINO, NVIDIA AutoDeploy (TRT-LLM), AMD's Quark, Meta's ExecuTorch, and other hardware-specific runtimes.
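Here's a minimal sketch of what the first two points look like in practice. It assumes a recent PyTorch 2.x and uses "gpt2" purely as an example checkpoint; exact behavior still depends on your transformers / PyTorch versions:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # example checkpoint, swap in your own
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()
model.config.use_cache = False  # assumption: skipping the KV cache keeps the traced graph simpler
inputs = tokenizer("Hello world", return_tensors="pt")

# End-to-end compilation with a torch.compile backend (inductor by default)
compiled = torch.compile(model)
with torch.no_grad():
    logits = compiled(**inputs).logits

# Whole-graph capture with torch.export
exported = torch.export.export(
    model,
    args=(),
    kwargs={"input_ids": inputs["input_ids"], "attention_mask": inputs["attention_mask"]},
)
print(exported)

On recent PyTorch versions, torch.onnx.export(model, args, kwargs=..., dynamo=True) goes through the same torch.export capture path, which is what the downstream runtimes in the last point can build on.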
This work aims to unblock entire TorchDynamo-based toolchains that rely on exporting Transformers models across runtimes and accelerators.
We are doubling down on Transformers' commitment to being a first-class citizen of the PyTorch ecosystem: more exportable, more optimizable, and easier to deploy everywhere.
There are definitely some edge cases we haven't addressed yet, so don't hesitate to try compiling / exporting your favorite Transformers models and to open issues / PRs.
PR in the comments! More updates coming soon!