From 057232755b52422f4f0c8040154c90ede33ade2d Mon Sep 17 00:00:00 2001
From: Aravinda Kumar <76619616+surprisedPikachu007@users.noreply.github.com>
Date: Thu, 11 Jul 2024 01:29:18 +0530
Subject: [PATCH 1/7] Update distributed_device_mesh.rst (#2965)

fixed a typo in the link
---
 recipes_source/distributed_device_mesh.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/recipes_source/distributed_device_mesh.rst b/recipes_source/distributed_device_mesh.rst
index dbc4a81043..d41d6c1df1 100644
--- a/recipes_source/distributed_device_mesh.rst
+++ b/recipes_source/distributed_device_mesh.rst
@@ -156,4 +156,4 @@ they can be used to describe the layout of devices across the cluster.
 For more information, please see the following:
 
 - `2D parallel combining Tensor/Sequance Parallel with FSDP `__
-- `Composable PyTorch Distributed with PT2 `__
+- `Composable PyTorch Distributed with PT2 `__

From 25ea481f26589f6259e9409b1487581c4bde7e00 Mon Sep 17 00:00:00 2001
From: Lucas Pasqualin
Date: Wed, 10 Jul 2024 18:35:11 -0400
Subject: [PATCH 2/7] Update recipes_source/distributed_async_checkpoint_recipe.rst

Co-authored-by: Svetlana Karslioglu
---
 recipes_source/distributed_async_checkpoint_recipe.rst | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/recipes_source/distributed_async_checkpoint_recipe.rst b/recipes_source/distributed_async_checkpoint_recipe.rst
index 7d81a53c37..a8a7d35de6 100644
--- a/recipes_source/distributed_async_checkpoint_recipe.rst
+++ b/recipes_source/distributed_async_checkpoint_recipe.rst
@@ -156,9 +156,12 @@ If the above optimization is still not performant enough, you can take advantage
 Specifically, this optimization attacks the main overhead of asynchronous checkpointing, which is the in-memory copying to checkpointing buffers.
 By maintaining a pinned memory buffer between checkpoint requests users can take advantage of direct memory access to speed up this copy.
 
-.. note:: The main drawback of this optimization is the persistence of the buffer in between checkpointing steps. Without the pinned memory optimization (as demonstrated above),
-any checkpointing buffers are released as soon as checkpointing is finished. With the pinned memory implementation, this buffer is maintained between steps, leading to the same
-peak memory pressure being sustained through the application life.
+.. note::
+   The main drawback of this optimization is the persistence of the buffer in between checkpointing steps. Without
+   the pinned memory optimization (as demonstrated above), any checkpointing buffers are released as soon as
+   checkpointing is finished. With the pinned memory implementation, this buffer is maintained between steps,
+   leading to the same
+   peak memory pressure being sustained through the application life.
 
 .. code-block:: python

From e6b3ac2e964f76350cf2422f7c12fea393112952 Mon Sep 17 00:00:00 2001
From: Lucas Pasqualin
Date: Wed, 10 Jul 2024 18:35:28 -0400
Subject: [PATCH 3/7] Update recipes_source/distributed_async_checkpoint_recipe.rst

Co-authored-by: Svetlana Karslioglu
---
 recipes_source/distributed_async_checkpoint_recipe.rst | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/recipes_source/distributed_async_checkpoint_recipe.rst b/recipes_source/distributed_async_checkpoint_recipe.rst
index a8a7d35de6..11e7dadeb6 100644
--- a/recipes_source/distributed_async_checkpoint_recipe.rst
+++ b/recipes_source/distributed_async_checkpoint_recipe.rst
@@ -1,6 +1,8 @@
 Asynchronous Saving with Distributed Checkpoint (DCP)
 =====================================================
 
+**Author:** `Lucas Pasqualin `__, `Iris Zhang `__, `Rodrigo Kumpera `__, `Chien-Chin Huang `__
+
 Checkpointing is often a bottle-neck in the critical path for distributed training workloads, incurring larger and larger costs as both model
 and world sizes grow. One excellent strategy for offsetting this cost is to checkpoint in parallel, asynchronously.
 Below, we expand the save example
 from the `Getting Started with Distributed Checkpoint Tutorial `__

From f4ec793acaf0349d9a543beebaf2a6bbde012696 Mon Sep 17 00:00:00 2001
From: Lucas Pasqualin
Date: Wed, 10 Jul 2024 18:35:41 -0400
Subject: [PATCH 4/7] Update recipes_source/distributed_async_checkpoint_recipe.rst

Co-authored-by: Svetlana Karslioglu
---
 recipes_source/distributed_async_checkpoint_recipe.rst | 1 -
 1 file changed, 1 deletion(-)

diff --git a/recipes_source/distributed_async_checkpoint_recipe.rst b/recipes_source/distributed_async_checkpoint_recipe.rst
index 11e7dadeb6..712e7dce42 100644
--- a/recipes_source/distributed_async_checkpoint_recipe.rst
+++ b/recipes_source/distributed_async_checkpoint_recipe.rst
@@ -8,7 +8,6 @@ One excellent strategy for offsetting this cost is to checkpoint in parallel, as
 from the `Getting Started with Distributed Checkpoint Tutorial `__
 to show how this can be integrated quite easily with ``torch.distributed.checkpoint.async_save``.
 
-**Author**: , `Lucas Pasqualin `__, `Iris Zhang `__, `Rodrigo Kumpera `__, `Chien-Chin Huang `__
 
 .. grid:: 2

From c32ce5883b3b9f67f0f345325c69436ece8446bb Mon Sep 17 00:00:00 2001
From: ZincCat <52513999+zinccat@users.noreply.github.com>
Date: Mon, 15 Jul 2024 09:43:50 -0700
Subject: [PATCH 5/7] Update cpp_export.rst (#2970)

Updated specified c++ version from 14 to 17
---
 advanced_source/cpp_export.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/advanced_source/cpp_export.rst b/advanced_source/cpp_export.rst
index 5dedbdaaa6..45556a5320 100644
--- a/advanced_source/cpp_export.rst
+++ b/advanced_source/cpp_export.rst
@@ -203,7 +203,7 @@ minimal ``CMakeLists.txt`` to build it could look as simple as:
   add_executable(example-app example-app.cpp)
   target_link_libraries(example-app "${TORCH_LIBRARIES}")
-  set_property(TARGET example-app PROPERTY CXX_STANDARD 14)
+  set_property(TARGET example-app PROPERTY CXX_STANDARD 17)
 
 The last thing we need to build the example application is the LibTorch
 distribution. You can always grab the latest stable release from the `download

From 5efa2e52aafdd94ef9ae6fbfa8c63fe888a15374 Mon Sep 17 00:00:00 2001
From: Bas Krahmer
Date: Wed, 17 Jul 2024 17:22:21 +0200
Subject: [PATCH 6/7] Typo (#2974)

Typo
---
 advanced_source/super_resolution_with_onnxruntime.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/advanced_source/super_resolution_with_onnxruntime.py b/advanced_source/super_resolution_with_onnxruntime.py
index ecb0ba4fe4..264678ee17 100644
--- a/advanced_source/super_resolution_with_onnxruntime.py
+++ b/advanced_source/super_resolution_with_onnxruntime.py
@@ -9,7 +9,7 @@
 * ``torch.onnx.export`` is based on TorchScript backend and has been available since PyTorch 1.2.0.
 
 In this tutorial, we describe how to convert a model defined
-in PyTorch into the ONNX format using the TorchScript ``torch.onnx.export` ONNX exporter.
+in PyTorch into the ONNX format using the TorchScript ``torch.onnx.export`` ONNX exporter.
 The exported model will be executed with ONNX Runtime.
 ONNX Runtime is a performance-focused engine for ONNX models,

From 2f2db747605e73e168d614c4f3a680ab6a286f78 Mon Sep 17 00:00:00 2001
From: Haechan An <48047392+AnHaechan@users.noreply.github.com>
Date: Thu, 18 Jul 2024 00:24:08 +0900
Subject: [PATCH 7/7] FIX: typo in inductor_debug_cpu.py (#2938)

Co-authored-by: Svetlana Karslioglu
---
 intermediate_source/inductor_debug_cpu.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/intermediate_source/inductor_debug_cpu.py b/intermediate_source/inductor_debug_cpu.py
index 94dee3ba15..370180d968 100644
--- a/intermediate_source/inductor_debug_cpu.py
+++ b/intermediate_source/inductor_debug_cpu.py
@@ -87,9 +87,9 @@ def neg1(x):
 # +-----------------------------+----------------------------------------------------------------+
 # | ``fx_graph_transformed.py`` | Transformed FX graph, after pattern match                      |
 # +-----------------------------+----------------------------------------------------------------+
-# | ``ir_post_fusion.txt``      | Inductor IR before fusion                                      |
+# | ``ir_pre_fusion.txt``       | Inductor IR before fusion                                      |
 # +-----------------------------+----------------------------------------------------------------+
-# | ``ir_pre_fusion.txt``       | Inductor IR after fusion                                       |
+# | ``ir_post_fusion.txt``      | Inductor IR after fusion                                       |
 # +-----------------------------+----------------------------------------------------------------+
 # | ``output_code.py``          | Generated Python code for graph, with C++/Triton kernels       |
 # +-----------------------------+----------------------------------------------------------------+
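The seven commits above are in ``git format-patch`` (mbox) form, one mail per patch. A minimal round-trip sketch of how such a series is produced and re-applied — a throwaway repository with illustrative names, not the actual pytorch/tutorials checkout:

```shell
# Produce and re-apply a format-patch series like the one above,
# in a throwaway repository (repo path, file name, and identity are illustrative).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name demo
git config user.email demo@example.com
git commit -q --allow-empty -m "base"
echo "fixed a typo in the link" > distributed_device_mesh.rst
git add distributed_device_mesh.rst
git commit -q -m "Update distributed_device_mesh.rst"
git format-patch -1 --stdout > series.mbox   # emits the same mbox layout seen above
git reset -q --hard HEAD~1                   # drop the commit again...
git am -q series.mbox                        # ...and re-create it from the mbox
git log --oneline
```

``git am`` preserves the ``From:``/``Date:``/``Subject:`` headers of each mail as author metadata, which is why the series above carries them per patch.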