feature/fused-ops #55

mgehre-amd · 2023-06-16T07:20:59Z

Not for merging; just convenience to look at our changes.

[AutoBump] Merge with d99bb01 (3)

Bump (4)

[AutoBump] Merge with 9997e03 (5)

Bump with conflict resolution (6)

Bump with conflict resolution to 2e271ce (1)

Bump to 2e271ce (needs onnx-mlir update) (2)

Bump to fe2119a with conflict resolution (1)

Bump (needs onnx-mlir update) (2)

Merge with fixes of 647d75d (4)

[AutoBump] Merge with 7340263 (5)

Bump with fixes to fa6e433 (6)

[AutoBump] Merge with 538257b (7)

feat: implement constant folding for tosa.slice

When deciding whether to emit a map like `#map = affine_map<(d0, d1, d2, d3) -> (0, d1, d2, d3)>` or `#map = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>` for an operand of a linalg.generic when lowering elementwise TOSA ops, prefer the latter unless broadcasting of the operand is really needed. This helps later transformations, which often require the affine map to be a projected permutation; only the latter map is one.
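The rule described above can be sketched in a few lines of Python. This is a simplified model, not the actual TOSA-to-Linalg lowering code: it assumes static shapes of equal rank, and the helper names `operand_indexing_map`, `is_projected_permutation`, and `format_map` are hypothetical.

```python
def operand_indexing_map(operand_shape, result_shape):
    """Return the result expressions of an operand's indexing map as strings.

    Hypothetical helper: a dimension is mapped to the constant 0 only when the
    operand really has to be broadcast along it (extent 1 in the operand,
    a different extent in the result). Otherwise the identity dim is kept.
    """
    assert len(operand_shape) == len(result_shape), "ranks must match"
    exprs = []
    for i, (operand_dim, result_dim) in enumerate(zip(operand_shape, result_shape)):
        needs_broadcast = operand_dim == 1 and result_dim != 1
        exprs.append("0" if needs_broadcast else f"d{i}")
    return exprs


def is_projected_permutation(exprs):
    """Simplified check: every result is a distinct dim, with no constants."""
    return all(e.startswith("d") for e in exprs) and len(set(exprs)) == len(exprs)


def format_map(exprs):
    """Print the expressions in the affine_map syntax used in the text above."""
    dims = ", ".join(f"d{i}" for i in range(len(exprs)))
    return f"affine_map<({dims}) -> ({', '.join(exprs)})>"


# Operand and result both have extent 1 in dim 0: no broadcast is needed,
# so the identity map is preferred.
same = operand_indexing_map([1, 8, 8, 3], [1, 8, 8, 3])
# The result has extent 4 in dim 0: the operand really is broadcast.
bcast = operand_indexing_map([1, 8, 8, 3], [4, 8, 8, 3])
print(format_map(same), is_projected_permutation(same))
print(format_map(bcast), is_projected_permutation(bcast))
```

In this model only the first map passes the projected-permutation check, which is why emitting the constant `0` is reserved for the case where the extents actually differ.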

* Fix for aliasing the region args * Add test case * Add empty line

…ered version (Emitc::ForOp) (#390)

OpaqueType: Use format string

@Max191

Refactored @Max191's PR llvm#94637 to move it to `Tensor`.

From the original PR:

> This PR adds fusion by expansion patterns to push a tensor.expand_shape up through a tensor.collapse_shape with non-intersecting reassociations. Sometimes parallel collapse_shape ops like this can block propagation of expand_shape ops, so this allows them to pass through each other.
>
> I'm not sure if I put the code/tests in the right places, so let me know where those go if they aren't.
>
> cc @MaheshRavishankar @hanhanW

---------

Co-authored-by: Max Dawkins <[email protected]>
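As an illustration of what "non-intersecting reassociations" could mean, here is a minimal Python sketch. It encodes one plausible reading, which is an assumption and not taken from the PR's code: both reassociations have one group per dimension of the intermediate tensor, and for each such dimension at most one of the two ops does any real reshaping. The function name is hypothetical.

```python
def reassociations_do_not_intersect(collapse_groups, expand_groups):
    """Check whether an expand_shape could pass through a collapse_shape.

    collapse_groups[i] lists the source dims collapsed into intermediate dim i.
    expand_groups[i] lists the result dims that intermediate dim i expands to.
    Assumed condition: no intermediate dim is both the product of a real
    collapse (group size > 1) and the source of a real expansion.
    """
    if len(collapse_groups) != len(expand_groups):
        return False
    return all(
        len(collapse) == 1 or len(expand) == 1
        for collapse, expand in zip(collapse_groups, expand_groups)
    )


# Dim 0 is only collapsed, dim 1 is only expanded: the ops act "in parallel".
print(reassociations_do_not_intersect([[0, 1], [2]], [[0], [1, 2]]))
# The same intermediate dim is collapsed and then expanded again.
print(reassociations_do_not_intersect([[0, 1]], [[0, 1]]))
```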

Add the missing `getIterationDomainTileFromOperandTile` and `getTiledImplementationFromOperandTile` to `tensor.pack` and enable fusing it as a consumer. Note that this currently only supports the perfect-tiling scenario, without padding semantics.

…#96184) To support conv2d input data of arbitrary size, implement TilingInterface for winograd operations. Before converting winograd operations into nested loops with matrix multiplies, first tile the conv2d input into the supported size. Add a transform operation, structured.decompose_winograd_op, to decompose winograd operations. Before applying the transform op, use tile_using_for to tile the input data into the supported size. The test case shows how to tile and decompose winograd operations.

…to continue tile + fuse. (llvm#107882)

Current implementation of `scf::tileConsumerAndFuseProducerUsingSCF` looks at operands of tiled/tiled+fused operations to see if they are produced by `extract_slice` operations to populate the worklist used to continue fusion. This implicit assumption does not always work. Instead make the implementations of `getTiledImplementation` return the slices to use to continue fusion.

This is a breaking change

- To continue to get the same behavior of `scf::tileConsumerAndFuseProducerUsingSCF`, change all out-of-tree implementation of `TilingInterface::getTiledImplementation` to return the slices to continue fusion on. All in-tree implementations have been adapted to this.
- This change touches parts that required a simplification to the `ControlFn` in `scf::SCFTileAndFuseOptions`. It now returns a `std::optional<scf::SCFTileAndFuseOptions::ControlFnResult>` object that should be `std::nullopt` if fusion is not to be performed.

Signed-off-by: MaheshRavishankar <[email protected]>
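The worklist behavior described in this commit message can be modeled roughly as follows. This is a Python sketch under stated assumptions, not the MLIR C++ API: `ControlFnResult` and its field, the `producers` table, and `fuse_producers` are hypothetical stand-ins, and `None` plays the role of `std::nullopt`.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class ControlFnResult:
    # Hypothetical field standing in for whatever the real result carries.
    yield_producer_replacement: bool = False


# A control function returns None to say "do not fuse this producer".
ControlFn = Callable[[str, str], Optional[ControlFnResult]]


def fuse_producers(
    initial_slices: List[str],
    producers: Dict[str, Tuple[str, List[str]]],
    control_fn: ControlFn,
) -> List[str]:
    """Greedy fusion driven by explicitly returned slices.

    initial_slices: slices returned by the tiled implementation of the root op.
    producers: maps a slice to (producer name, slices returned when that
               producer is tiled and fused).
    """
    worklist = deque(initial_slices)
    fused = []
    while worklist:
        candidate = worklist.popleft()
        if candidate not in producers:
            continue  # nothing fusable feeds this slice
        producer, new_slices = producers[candidate]
        if control_fn(candidate, producer) is None:
            continue  # the control function vetoed this fusion
        fused.append(producer)
        # Continue on the slices the fused op reported, not on its operands.
        worklist.extend(new_slices)
    return fused


producers = {"s0": ("matmul", ["s1"]), "s1": ("fill", [])}
print(fuse_producers(["s0"], producers, lambda s, p: ControlFnResult()))
print(fuse_producers(["s0"], producers,
                     lambda s, p: None if p == "matmul" else ControlFnResult()))
```

In the second call the veto on `matmul` also keeps `fill` from being fused, because the slice leading to it is only reported once `matmul` has been tiled and fused.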

…m#109554) The SCF helper for tiling an operation implementing the TilingInterface and greedily fusing consumers requires an uninterrupted chain of operations implementing the tiling interface to succeed. There can be cases with intermediate ops that don't implement the interface but have producers that could be fused if various canonicalization/simplification patterns could run in between fusion steps. This adds an option to SCFTileAndFuseOptions for a pattern set to run between fusion steps to the ops that result from fusion/tiling. Removed and newly inserted slices are tracked for continued fusion applications. See this RFC for more discussion: https://discourse.llvm.org/t/rfc-split-fusion-portions-of-the-tilinginterface-into-a-new-interface/81155

Add emitc.tu

The auto-generated builder created an emitc.tu that had an empty region. This is a bit cumbersome to work with, as you would always need to manually create a block in it. Do what ModuleOp::build does and always create that block. Also accept a StringRef as the argument for id instead of requiring a StringAttr.

`#include` makes sense everywhere, and in particular we need to allow it inside an `emitc.tu`. But sometimes we might even want to have an `#include` in a function body.

emitc.include: don't require the parent to be a ModuleOp

emitc.tu: Automatically create block for body

…ape_fold fix: fuse locations of double reshapes when folding.

Backport various improvements to fusion from upstream

…ther.

…ations feat: improve CSE by fusing locations when replacing one op for the other.

Make EliminateLibm work on EmitC::FuncOp

emitc: Do not add newlines after ModuleOp, TranslationUnitOp

* Fix conversion for scf.for and scf.if

* Add -mlir-reproducer-before-all

mgehre-amd and others added 30 commits August 14, 2024 15:47

Merge pull request #250 from Xilinx/bump_to_d99bb014

94924fc

[AutoBump] Merge with d99bb01 (3)

Merge pull request #251 from Xilinx/bump_to_818af71b

4437bd2

Bump (4)

Merge pull request #252 from Xilinx/bump_to_9997e039

6b89ba9

[AutoBump] Merge with 9997e03 (5)

Merge pull request #253 from Xilinx/bump_to_19266ca3

2d24671

Bump with conflict resolution (6)

Merge commit '2e271ceff668' into bump_to_2e271ceff668

6710e4d

Merge commit '516ccce7fa26' into bump_to_516ccce7fa26

7847f9e

Merge pull request #256 from Xilinx/bump_to_2e271ceff668

a26d9f2

Bump with conflict resolution to 2e271ce (1)

Merge pull request #257 from Xilinx/bump_to_516ccce7fa26

8b899eb

Bump to 2e271ce (needs onnx-mlir update) (2)

Merge commit 'fe2119a7b08b' into matthias.bump_to_fe2119a7b08b

4990e5c

Merge commit 'd84252e064b3' into matthias.bump_to_d84252e064b3

1b038a9

Merge pull request #258 from Xilinx/matthias.bump_to_fe2119a7b08b

de8cc8f

Bump to fe2119a with conflict resolution (1)

[AutoBump] Merge with 972f65a

4291ba3

Merge with fixes of 647d75d

685a2c0

Merge pull request #259 from Xilinx/matthias.bump_to_d84252e064b3

6883373

Bump (needs onnx-mlir update) (2)

[AutoBump] Merge with de0abc0

0b3015b

[AutoBump] Merge with 7340263

a074d69

Merge pull request #260 from Xilinx/bump_to_972f65a8

3cef20c

Merge with fixes of fa6e433

45d71d8

[AutoBump] Merge with 538257b

c0d1d7e

Merge pull request #261 from Xilinx/bump_to_647d75d3

396ae58

Merge with fixes of 647d75d (4)

Merge with fixes of 0aa6d57

ae6cb41

Merge pull request #262 from Xilinx/bump_to_73402634

f3ecb39

[AutoBump] Merge with 7340263 (5)

[AutoBump] Merge with 8612fa0

64c0a57

Merge with fixes of 71db971

c02cb39

[AutoBump] Merge with 527a624

5ef826d

Do not return null on transpose

f7ab7cd

Merge with fixes of 72c729f

6b4bce9

Merge pull request #263 from Xilinx/bump_to_fa6e4338

46bc47e

Bump with fixes to fa6e433 (6)

Merge pull request #264 from Xilinx/bump_to_538257bf

c807411

[AutoBump] Merge with 538257b (7)

[AutoBump] Merge with 507e59a

7a09708

ttjost and others added 30 commits October 15, 2024 00:49

feat: implement constant folding for tosa.slice

9489ae8

test: add more LIT tests for tosa.slice folding.

83bdfaf

Merge pull request #388 from Xilinx/tiagot.constant_folding_tosa_slice

08bb427

feat: implement constant folding for tosa.slice

Fix attr aliasing on region args (#389)

2015abf

* Fix for aliasing the region args * Add test case * Add empty line

Copy attributes from the original operation (SCF::ForOp) into the low…

4b36487

…ered version (Emitc::ForOp) (#390)

OpaqueType with format strings (#391)

ad4697c

OpaqueType: Use format string

Add emitc.tu

a64ebcc

Merge pull request #393 from Xilinx/matthias.emitc_tu

cab7e24

Add emitc.tu

emitc.include: don't require the parent to be a ModuleOp

831eb66

`#include` makes sense everywhere, and in particular we need to allow it inside an `emitc.tu`. But sometimes we might even want to have an `#include` in a function body.

Merge pull request #396 from Xilinx/matthias.emitc_include_parent

26a9787

emitc.include: don't require the parent to be a ModuleOp

Merge pull request #395 from Xilinx/matthias.emitc_tu_2

0684dc4

emitc.tu: Automatically create block for body

fix: fuse locations of double reshapes when folding.

69cbbb5

Merge pull request #397 from Xilinx/tiagot.merge_location_double_resh…

2f0e627

…ape_fold fix: fuse locations of double reshapes when folding.

Merge pull request #394 from Xilinx/matthias.tiling_backport

1e37f7b

Backport various improvements to fusion from upstream

feat: improve CSE by fusing locations when replacing one op for the o…

39c4494

…ther.

Merge pull request #398 from Xilinx/tiagot.improve_cse_with_debug_loc…

213d2b0

…ations feat: improve CSE by fusing locations when replacing one op for the other.

Make EliminateLibm work on EmitC::FuncOp

99f8f98

Fix test

1fc2b98

Merge pull request #399 from Xilinx/jose.fix_pass

2113e3c

Make EliminateLibm work on EmitC::FuncOp

emitc: Do not add newlines after ModuleOp, TranslationUnitOp

a4a93fb

Merge pull request #400 from Xilinx/matthias.emitc_newline

72cbeca

emitc: Do not add newlines after ModuleOp, TranslationUnitOp

Fix yield conversion of scf.if/scf.for to emitc (#401)

7326995

* Fix conversion for scf.for and scf.if

Add -mlir-reproducer-before-all (#402)

20a6720

* Add -mlir-reproducer-before-all
