Torch-TensorRT v1.4.0
PyTorch 2.0, CUDA 11.8, TensorRT 8.6, support for the new `torch.compile` API, and a compatibility mode for the FX frontend
Torch-TensorRT 1.4.0 targets PyTorch 2.0, CUDA 11.8, and TensorRT 8.6. This release introduces a number of beta features that set the stage for working with PyTorch and TensorRT in the 2.0 ecosystem. Chief among them is a new `torch.compile` backend targeting Torch-TensorRT. It also adds a compatibility layer that allows users of the TorchScript frontend for Torch-TensorRT to seamlessly try the FX and Dynamo stacks.
`torch.compile` Backend for Torch-TensorRT
One of the most prominent new features in PyTorch 2.0 is the `torch.compile` workflow, which enables users to accelerate code easily by specifying a backend of their choice. Torch-TensorRT 1.4.0 introduces a new backend for `torch.compile` as a beta feature, including a convenience frontend to perform accelerated inference. This frontend can be accessed in one of two ways:
```python
import torch_tensorrt

torch_tensorrt.dynamo.compile(model, inputs, ...)
##### OR #####
torch_tensorrt.compile(model, ir="dynamo_compile", inputs=inputs, ...)
```
For more examples, see the provided sample scripts in the repository.
This compilation method has a few key considerations:
- It can handle models with data-dependent control flow
- It automatically falls back to Torch if the TRT engine build fails for any reason
- It uses the Torch FX `aten` library of converters to accelerate models
- Recompilation can be caused by changing the batch size of the input, or providing an input which enters a new control flow branch
- Compiled models cannot be saved across Python sessions (yet)
The feature is currently in beta, and we expect updates, changes, and improvements to the above in the future.
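The recompilation and fallback behaviors above can be pictured as a cache of compiled artifacts keyed on input properties: a new batch size (or a new control-flow branch) is a cache miss that triggers recompilation, and a failed engine build falls back to the original function. The sketch below is a plain-Python illustration of that idea only, not Torch-TensorRT's actual implementation; `make_compiled`, `build_engine`, and the dict-based inputs are hypothetical names chosen for the example:

```python
# Illustrative sketch (plain Python, NOT the real Torch-TensorRT internals):
# a compile cache keyed on input shape, with automatic fallback to the
# original function when "engine building" fails.

def make_compiled(fn, build_engine):
    cache = {}  # maps input-shape key -> compiled "engine" (or fallback fn)

    def compiled(x):
        key = tuple(x["shape"])  # a new batch size produces a new key
        if key not in cache:
            try:
                # Cache miss: this is where "recompilation" happens
                cache[key] = build_engine(fn, key)
            except RuntimeError:
                # Engine build failed: fall back to the original (Torch) path
                cache[key] = fn
        return cache[key](x)

    compiled.cache = cache  # exposed for inspection in this sketch
    return compiled
```

Changing only the data while keeping the shape reuses the cached engine; changing the shape recompiles once per new shape.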
`fx_ts_compat` Frontend
As the ecosystem transitions from TorchScript to Dynamo, users of Torch-TensorRT may want to start experimenting with this stack. As such, we have introduced a new frontend for Torch-TensorRT which exposes the same APIs as the TorchScript frontend but uses the FX/Dynamo compiler stack. You can try this frontend via the `ir="fx_ts_compat"` setting:
```python
torch_tensorrt.compile(..., ir="fx_ts_compat")
```
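A unified entry point like this typically just dispatches on the `ir` string to the matching compiler stack. The following is a minimal, self-contained sketch of that dispatch pattern with hypothetical stand-in frontends; it is not the actual `torch_tensorrt.compile` source:

```python
# Hypothetical sketch of ir-string dispatch (stand-in frontends, not the
# real torch_tensorrt implementation).

def ts_frontend(model, **kwargs):
    # Placeholder for the TorchScript compile path
    return f"torchscript({model})"

def fx_ts_compat_frontend(model, **kwargs):
    # Placeholder for the FX/Dynamo compatibility compile path
    return f"fx_ts_compat({model})"

_FRONTENDS = {
    "ts": ts_frontend,
    "fx_ts_compat": fx_ts_compat_frontend,
}

def compile(model, ir="ts", **kwargs):
    try:
        frontend = _FRONTENDS[ir]
    except KeyError:
        raise ValueError(f"Unsupported ir: {ir!r}") from None
    return frontend(model, **kwargs)
```

Because both frontends expose the same signature, switching stacks is a one-argument change for the caller, which is exactly what the compatibility mode is meant to enable.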
What's Changed
- Fix build by @yinghai in #1479
- add circle CI signal in README page by @yinghai in #1481
- fix einsum signature by @yinghai in #1480
- Fix link to CircleCI in README.md by @yinghai in #1483
- Minor changes by @yinghai in #1482
- [FX] Changes done internally at Facebook by @frank-wei in #1456
- chore: upload docs for 1.3.0 by @narendasan in #1504
- fix: Repair Citrinet-1024 compilation issues by @gs-olive in #1488
- refactor: Split elementwise tests by @peri044 in #1507
- [feat] Support 1D topk by @mfeliz-cruise in #1491
- Support aten::sum with bool tensor input by @mfeliz-cruise in #1512
- [fix]Disambiguate cast layer names by @mfeliz-cruise in #1513
- feat: Add functionality for easily benchmarking fx code on key models by @gs-olive in #1506
- [feat]Canonicalize aten::multiply to aten::mul by @mfeliz-cruise in #1517
- broadcast the two input shapes for transposed matmul by @nvpohanh in #1457
- make padding layer converter more efficient by @nvpohanh in #1470
- fix: Change equals-check from reference to value for BERT model not compiling in FX by @gs-olive in #1539
- Update README dependencies section for v1.3.0 by @take-cheeze in #1540
- fix: `aten::where` with differing-shape inputs bugfix by @gs-olive in #1533
- fix: Automatically send truncated long ints to cuda at shape analysis time by @gs-olive in #1541
- feat: Add functionality to FX benchmarking + Improve documentation by @gs-olive in #1529
- [fix] Fix crash when calling unbind on evaluated tensor by @mfeliz-cruise in #1554
- Update test_flatten_aten and test_reshape_aten due to PT2.0 changed tracer behavior for these ops by @frank-wei in #1559
- fix: Bugfix for `align_corners=False` in FX interpolate by @gs-olive in #1561
- fix: Properly cast intermediate Int8 tensors to TensorRT Engines in Fallback by @gs-olive in #1549
- Upgrade stack to Pytorch 2.0 + CUDA 11.7 + TRT 8.5 GA by @peri044 in #1477
- feat: Add option to specify int64 as an Input dtype by @gs-olive in #1551
- feat: Support int inputs to aten::max/min and aten::argmax/argmin by @mfeliz-cruise in #1574
- fix: Add `aten::full_like` evaluator by @gs-olive in #1584
- tools: assign 1 person to a bug instead of all by @narendasan in #1604
- feat: Add support for aten::meshgrid by @mfeliz-cruise in #1601
- [FX] Changes done internally at Facebook by @frank-wei in #1603
- chore: Add FX core test by @peri044 in #1593
- chore: Update dockerfile by @peri044 in #1581
- fix: Replace `RemoveDropout` lowering pass implementation with modified JIT pass by @gs-olive in #1589
- [FX] Changes done internally at Facebook by @frank-wei in #1625
- chore: Update Dockerfile to Ubuntu 20.04 + Crash Resolution by @gs-olive in #1639
- fix: Bugfix in Linear-to-AddMM Fusion Lowering Pass by @gs-olive in #1619
- fix: Resolve compilation bug for empty tensors in `aten::select` by @gs-olive in #1623
- Convolution cast by @apbose in #1609
- fix: Bugfix in TRT Engine deserialization indexing by @gs-olive in #1646
- fix: fix the inappropriate lowering pass of aten::to by @bowang007 in #1649
- Lowering aten::pad to aten::constant_pad_nd/aten::reflection_padXd/aten::replication_padXd by @ruoqianguo in #1588
- [fix] Disambiguate element-wise cast layer names by @mfeliz-cruise in #1630
- feat: Add optional tensor domain argument to Input class by @gs-olive in #1537
- Improve batch_norm fp16 accuracy by @mfeliz-cruise in #1450
- add an example of aten2trt, fix batch norm pass by @frank-wei in #1685
- fix: Issue in non-Tensor Input Resolution by @gs-olive in #1617
- Corrected a typo, which was raising an error by @zshn25 in #1694
- Cherry-pick manylinux compatible builds into main by @narendasan in #1677
- fix: Improve input handling for `input_signature` by @gs-olive in #1698
- Unsqueeze operator with dynamic input by @apbose in #1624
- [feat] Add converter support for index_select by @mfeliz-cruise in #1692
- [feat] Add converter support for aten::logical_not by @mfeliz-cruise in #1705
- fix: Bugfix in convNd_to_convolution lowering pass by @gs-olive in #1693
- [feat] Add converter for aten::any.dim by @mfeliz-cruise in #1707
- [fix] resolve issue for single non-batch index tensor in aten::index by @mfeliz-cruise in #1700
- fix: Handle nonetype pad value for Constant pad by @peri044 in #1712
- infra: Add Torch 1.13.1 testing to nightly CI by @gs-olive in #1731
- fix: Allow full model compilation with collection outputs by @gs-olive in #1599
- fix: fix the prim::Loop fallback issue by @bowang007 in #1691
- feat: Add decorator utility to improve error messaging for legacy support by @gs-olive in #1738
- minor fix: Update default minimum torch version for aten tracer by @gs-olive in #1747
- Get windows build working by @bharrisau in #1711
- Update config.yml by @frank-wei in #1736
- fix: Bugfix in shape analysis for multi-GPU systems by @gs-olive in #1765
- fix: Add schemas to convolution lowering pass by @gs-olive in #1728
- fix: Update Docker build to automatically adapt Torch version by @gs-olive in #1732
- feat: Upgrade Pytorch and TensorRT versions by @peri044 in #1759
- feat: Merge dynamo additions into `release/1.4` by @gs-olive in #1884
- fix: Cherry-pick `acc` convolution fix to `release/1.4` by @gs-olive in #1910
- cherry-pick: Reorganize + Upgrade Dynamo (`release/1.4`) by @gs-olive in #1931
- fix: Upgrade `release/1.4` to Torch 2.0.1 + TensorRT 8.6.1 by @gs-olive in #1896
- cherry-pick: Dynamo upgrades and bugfixes (`release/1.4`) by @gs-olive in #1956
New Contributors
- @nvpohanh made their first contribution in #1457
- @take-cheeze made their first contribution in #1540
- @zshn25 made their first contribution in #1694
- @bharrisau made their first contribution in #1711
Full Changelog: v1.3.0...v1.4.0