[SYCL][Graph] Update doc for UR PR moving reset commands to a dedicated cmd-list #357

Closed
wants to merge 744 commits into from

Conversation

mfrancepillois
Collaborator

Update the design doc.
Update the UR tag.

grypp and others added 30 commits February 13, 2024 09:50
Currently, the `phaseParity` argument of `nvgpu.mbarrier.try_wait.parity` has
type `index`. This can cause a problem if it is passed any value other than
0 or 1, because the PTX instruction only accepts an even or odd phase. This
PR makes the `phaseParity` argument an `i1` to avoid misuse.

Here is the information from PTX doc:

```
The .parity variant of the instructions test for the completion of the phase indicated 
by the operand phaseParity, which is the integer parity of either the current phase or 
the immediately preceding phase of the mbarrier object. An even phase has integer 
parity 0 and an odd phase has integer parity of 1. So the valid values of phaseParity 
operand are 0 and 1.
```
For more information, see:

https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-mbarrier-test-wait-mbarrier-try-wait
…81239)

This function will be useful when we change the behavior of record-type
prvalues so that they directly initialize the associated result object. See
also the comment here for more details:


https://github.com/llvm/llvm-project/blob/9e73656af524a2c592978aec91de67316c5ce69f/clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h#L354

As part of this patch, we document and assert that synthetic fields may not
have reference type.

There is no practical use case for this: A `StorageLocation` may not have
reference type, and a synthetic field of the corresponding non-reference type
can serve the same purpose.
llvm.dbg.assign intrinsics have 2 {value, expression} pairs; fix hwasan to
update the second expression.

Fixes #76545. This is #78606 rebased and with the addition of DPValue handling.
Note the addition of --try-experimental-debuginfo-iterators in the tests and
some shuffling of code in MemoryTaggingSupport.cpp.
The strictfp attribute has the requirement that "LLVM will not introduce
any new floating-point instructions that may trap". The llvm.is.fpclass
intrinsic is documented as "The function never raises floating-point
exceptions", and the fcmp instruction may raise one, so we can't
transform the former into the latter in functions with the strictfp
attribute.
…#81585)

This reverts commit a034e65.

Some protobuf users reported that this patch caused a significant
compile-time regression because `TailDuplicator` works poorly with a
specific pattern.

We will reland it once the codegen issue is fixed.
…ugprone-unused-local-non-trivial-variable (#81563)
This patch adds full support for linking SystemZ (ELF s390x) object
files. Support should be generally complete:
- All relocation types are supported.
- Full shared library support (DYNAMIC, GOT, PLT, ifunc).
- Relaxation of TLS and GOT relocations where appropriate.
- Platform-specific test cases.

In addition to new platform code and the obvious changes, there were a
few additional changes to common code:

- Add three new RelExpr members (R_GOTPLT_OFF, R_GOTPLT_PC, and
R_PLT_GOTREL) needed to support certain s390x relocations. I chose not
to use a platform-specific name since nothing in the definition of these
relocs is actually platform-specific; it is well possible that other
platforms will need the same.

- A couple of tweaks to TLS relocation handling, as the particular
semantics of the s390x versions differ slightly. See comments in the
code.

This was tested by building and testing >1500 Fedora packages, with only
a handful of failures; as these also have issues when building with LLD
on other architectures, they seem unrelated.

Co-authored-by: Tulio Magno Quites Machado Filho <[email protected]>
The motivation here was a suggestion over in Compiler Explorer. You can
use `-mllvm` already to do this but since gfortran supports `-masm`, I
figured I'd try to add it.

This is done by flang expanding `-masm` into `-mllvm x86-asm-syntax=` and
passing that to fc1, which then collects all the `-mllvm` options
and forwards them on.

The code to expand it comes from clang `Clang::AddX86TargetArgs` (there
are some other places doing the same thing too). However I've removed
the `-inline-asm` that clang adds, as fortran doesn't have inline
assembly.

So `-masm` for flang purely changes the style of assembly output.

```
$ ./bin/flang-new /tmp/test.f90 -o - -S -target x86_64-linux-gnu
<...>
        pushq   %rbp
$ ./bin/flang-new /tmp/test.f90 -o - -S -target x86_64-linux-gnu -masm=att
<...>
        pushq   %rbp
$ ./bin/flang-new /tmp/test.f90 -o - -S -target x86_64-linux-gnu -masm=intel
<...>
        push    rbp
```
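
The expansion described above can be sketched as a small mapping function. This is an illustration only, not the driver code from the patch: the helper name `expandMasm` is hypothetical, and it handles just the two values shown in the example output (`att` and `intel`).

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: maps the value given to -masm=<value> to the argument
// that would follow -mllvm. Returns an empty string for any value other than
// the two shown in the example above.
inline std::string expandMasm(const std::string &Value) {
  if (Value == "att" || Value == "intel")
    return "x86-asm-syntax=" + Value;
  return std::string();
}
```

With this sketch, `expandMasm("intel")` yields `x86-asm-syntax=intel`, which is the string the commit message describes being forwarded through `-mllvm`.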

The test is adapted from `clang/test/Driver/masm.c` by removing the
clang-cl related lines and changing the 32 bit triples to 64 bit triples
since flang doesn't support 32 bit targets.
…(#80991)

Although the assumption is reasonable for a normal implementation, it
seems that some esoteric implementations do not return a `T&`. This
case should be handled correctly and the values propagated.

---------

Co-authored-by: martinboehme <[email protected]>
… (#80966)

The 1-D case directly maps to LLVM intrinsics. The n-D case will be
handled by unrolling to 1-D first (in a later patch).

Depends on: #80965
Without this I would hit errors with libstdc++-12 like:

/usr/include/c++/12/bits/stl_iterator_base_funcs.h:230:5: note:
candidate template ignored: substitution failure [with _InputIterator =
llvm::const_set_bits_iterator_impl<llvm::BitVector>]: argument may not
have 'void' type
    next(_InputIterator __x, typename
    ^
…oading directives (#81081)

This patch adds support for the depend clause in a number of OpenMP
directives/constructs related to offloading. Specifically, it adds the
handling of the depend clause when it is used with the following
constructs

- target
- target enter data
- target update data
- target exit data
…1500)

Adds a test to help document Linalg Ops that are currently not supported
by the vectoriser (i.e. the logic to vectorise these is missing). The
list is not exhaustive.
Common backends (LLVM, SPIR-V) only support 1D vectors. The LLVM conversion
handles ND vectors (N >= 2) as `array<array<... vector>>`, and the SPIR-V
conversion doesn't handle them at all at the moment. Sometimes it's
preferable to treat multidim vectors as linearized 1D, so add a pass to do
this. Only constants and simple elementwise ops are supported for now.

@krzysz00 I've extracted your result type conversion code from
LegalizeToF32 and moved it to a common place.

Also, add ConversionPattern class operating on traits.
This fixes a crash when lowering an extract_subvector like:

t0:v1i64 = extract_subvector t1:v2i64, 1

Whilst we never need a vslidedown with M1 on scalable vector types, we might
need to do it for v1i64/v1f64, since the smallest container type for it is
nxv1i64/nxv1f64.

The lowering code is still correct for this case, but the assertion was too
strict. The actual invariant we're relying on is that ContainerSubVecVT's LMUL
<= M1, not < M1. Hence why we handled v2i32 fine, because its container type
was nxv1i32 and MF2.
Allocate storage and initialize it with the given APValue contents.
…#80735)

zOS doesn't support aligned allocation, so mark these testcases as
unsupported.

Continuation of https://reviews.llvm.org/D102798
Introduce `mcdc::DecisionParameters` and `mcdc::BranchParameters` and make
sure they are not initialized as zero.

FIXME: Could we make `CoverageMappingRegion` as a smart tagged union?
… (#81602)

In a few places we test whether sets (i.e. sorted ranges) intersect by
computing the set_intersection and then testing whether it is empty. For
this purpose it should be more efficient to use a std::vector instead of
a std::set to hold the result of the set_intersection, since insertion
is simpler.
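
A minimal sketch of the idea, not the code from the patch: `std::set_intersection` writes through any output iterator, so the result can be collected into a `std::vector` with `std::back_inserter`. The helper name `intersects` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical helper: returns true if two sorted ranges share an element.
// The intersection is collected into a std::vector, so each output element is
// a push_back rather than a tree insertion as it would be with std::set.
template <typename Range>
bool intersects(const Range &A, const Range &B) {
  // std::set_intersection requires both inputs to be sorted.
  assert(std::is_sorted(A.begin(), A.end()));
  assert(std::is_sorted(B.begin(), B.end()));
  std::vector<typename Range::value_type> Result;
  std::set_intersection(A.begin(), A.end(), B.begin(), B.end(),
                        std::back_inserter(Result));
  return !Result.empty();
}
```

The same helper accepts a `std::set` as input, since only the container that holds the result changes.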
Just emit their satisfaction state, which is what the current
interpreter does as well.
'serial', 'parallel', and 'kernel' constructs are all considered
'Compute' constructs. This patch creates the AST type, plus the required
infrastructure for such a type, plus some base types that will be useful
in the future for breaking this up.

The only difference between the three is the 'kind' (plus some minor
clause legalization rules, but those can be differentiated easily
enough), so rather than representing them as separate AST nodes, it
seems to make sense to make them the same.

Additionally, no clause AST functionality is being implemented yet, as
that fits better in a separate patch, and this is enough to get the
'naked' constructs implemented.

This is otherwise an 'NFC' patch, as it doesn't alter execution at all,
so there aren't any tests.  I did this to break up the review workload
and to get feedback on the layout.
MrSidims and others added 15 commits February 15, 2024 13:03
For now just convert BB with convertFromNewDbgValues, will
figure out something smarter a bit later.

I've updated several tests with dbg.declare intrinsic
adding --experimental-debuginfo-iterators=1 to check if it works.

Signed-off-by: Sidorov, Dmitry <[email protected]>

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@0e87aefecf7c500
The SPIR-V Specification allows `OpConstantNull` types to be scalar or
vector booleans, integers, or floats.  Update an assert for this and
add a SPIR-V -> LLVM IR test.

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@9ec969c1c379bde
…supported (intel#12700)

Final PR in the series of intel#12636.
Refer to it for a description.
After a discussion with @AlexeySachkov we've decided it's best not to
rewrite USM and syclcompat tests with buffers/accessors. For USM, the
reason is obvious and for syclcompat you can reach out to Alexey.
Therefore, these tests are handled using if statements or by requiring
the aspect to be supported.
Once this PR is merged, malloc_shared will throw
if usm_shared_allocations is not supported, which conforms to
the spec.
…s.txt (intel#12714)

Bumps [cryptography](https://github.com/pyca/cryptography) from 41.0.6
to 42.0.0 to resolve identified security vulnerability in 3rd party
dependency.

Refer to [cryptography's
changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst).
Despite having a unit test for the default context, I realized there is not
one to affirm the new default configuration.
Some clean-up for SYCL-Graph E2E tests:
* Remove redundant `Event` variables that are initialized over loop
iterations but never used.
* Remove all instances of the no immediate command-list property, and
use environment variable instead to test both paths.
* Always use FileCheck leak checking rather than `CHECK-NOT: Leak`.
* Remove unnecessary threading code from `Inputs/basic_usm.cpp`
Collaborator

@EwanC EwanC left a comment

I've nitpicked the language a bit, but this is a nice improvement to the documentation. The new diagram is clearer too

sycl/doc/design/CommandGraph.md
mfrancepillois and others added 10 commits February 19, 2024 11:39
…2680)

Improves management of inter-partition dependencies, so that only
required dependencies are added.
As removing these dependencies can result in multiple execution paths,
we have added a map to track all events returned from submitted
partitions.
All these events are linked to the main event returned to user. 
Adds tests.
Grad flag was set to 0x3 (meaning Lod + Bias) instead of 0x4.
See
https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#Image_Operands
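
For reference, the low bits of the Image Operands mask in the SPIR-V specification are Bias (0x1), Lod (0x2), and Grad (0x4). The sketch below spells out why 0x3 does not select Grad. The enum and helper names are illustrative and are not taken from the translator sources.

```cpp
#include <cassert>
#include <cstdint>

// Low bits of the SPIR-V Image Operands mask (illustrative names).
enum ImageOperandsMask : std::uint32_t {
  ImageOperandsNone = 0x0,
  ImageOperandsBias = 0x1,
  ImageOperandsLod = 0x2,
  ImageOperandsGrad = 0x4,
};

// Hypothetical helper: tests whether a single operand bit is set in a mask.
inline bool hasOperand(std::uint32_t Mask, ImageOperandsMask Bit) {
  assert(Bit != 0 && (Bit & (Bit - 1)) == 0 && "expected a single bit");
  return (Mask & Bit) != 0;
}

// 0x3 is the combination of Bias and Lod; it does not contain the Grad bit.
static_assert((ImageOperandsBias | ImageOperandsLod) == 0x3,
              "0x3 selects Bias and Lod");
static_assert((0x3 & ImageOperandsGrad) == 0, "0x3 does not select Grad");
```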

Signed-off-by: Victor Lomuller <[email protected]>
Bring the fix for MaxRegsPerBlock check from
oneapi-src/unified-runtime#1299 to `intel/llvm`.
No changes needed other than updating the UR repo hash.

---------

Co-authored-by: Kenneth Benzie (Benie) <[email protected]>
`LoaderConfig` is created and stored in a local pointer and never
released after use, so it is leaked.
This patch releases the `LoaderConfig` once it is no longer needed.
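
The patch adds an explicit release call. As a different technique for the same class of leak, the sketch below wraps a C-style handle in a `std::unique_ptr` with a custom deleter so the release happens on every exit path. All names here (`FakeLoaderConfig`, `fakeLoaderConfigCreate`, `fakeLoaderConfigRelease`) are hypothetical stand-ins and not the Unified Runtime API.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for a C-style create/release handle API.
struct FakeLoaderConfig {};
static int LiveConfigs = 0; // number of handles created but not yet released

inline FakeLoaderConfig *fakeLoaderConfigCreate() {
  ++LiveConfigs;
  return new FakeLoaderConfig;
}

inline void fakeLoaderConfigRelease(FakeLoaderConfig *Config) {
  --LiveConfigs;
  delete Config;
}

// Deleter that calls the release function when the owning pointer is destroyed.
struct LoaderConfigDeleter {
  void operator()(FakeLoaderConfig *Config) const {
    fakeLoaderConfigRelease(Config);
  }
};
using LoaderConfigPtr = std::unique_ptr<FakeLoaderConfig, LoaderConfigDeleter>;

// Creates a config, uses it inside a scope, and returns how many handles are
// still live once that scope has ended.
inline int useLoaderConfig() {
  {
    LoaderConfigPtr Config(fakeLoaderConfigCreate());
    assert(LiveConfigs == 1);
    // ... use Config.get() here ...
  }
  return LiveConfigs;
}
```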
Old builtins implementation is going to be removed in the next ABI
breaking window and that helper is only used there.
Bumps [cryptography](https://github.com/pyca/cryptography) from 42.0.0
to 42.0.2.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst">cryptography's
changelog</a>.</em></p>
<blockquote>
<p>42.0.2 - 2024-01-30</p>
<pre><code>
* Updated Windows, macOS, and Linux wheels to be compiled with OpenSSL
3.2.1.
* Fixed an issue that prevented the use of Python buffer protocol
objects in
  ``sign`` and ``verify`` methods on asymmetric keys.
* Fixed an issue with incorrect keyword-argument naming with
``EllipticCurvePrivateKey``

:meth:`~cryptography.hazmat.primitives.asymmetric.ec.EllipticCurvePrivateKey.exchange`,
  ``X25519PrivateKey``

:meth:`~cryptography.hazmat.primitives.asymmetric.x25519.X25519PrivateKey.exchange`,
  ``X448PrivateKey``

:meth:`~cryptography.hazmat.primitives.asymmetric.x448.X448PrivateKey.exchange`,
  and ``DHPrivateKey``

:meth:`~cryptography.hazmat.primitives.asymmetric.dh.DHPrivateKey.exchange`.
<p>.. _v42-0-1:</p>
<p>42.0.1 - 2024-01-24
</code></pre></p>
<ul>
<li>Fixed an issue with incorrect keyword-argument naming with
<code>EllipticCurvePrivateKey</code>

:meth:<code>~cryptography.hazmat.primitives.asymmetric.ec.EllipticCurvePrivateKey.sign</code>.</li>
<li>Resolved compatibility issue with loading certain RSA public keys in

:func:<code>~cryptography.hazmat.primitives.serialization.load_pem_public_key</code>.</li>
</ul>
<p>.. _v42-0-0:</p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/pyca/cryptography/commit/2202123b50de1b8788f909a3e5afe350c56ad81e"><code>2202123</code></a>
changelog and version bump 42.0.2 (<a
href="https://redirect.github.com/pyca/cryptography/issues/10268">#10268</a>)</li>
<li><a
href="https://github.com/pyca/cryptography/commit/f7032bdd409838f67fc2b93343f897fb5f397d80"><code>f7032bd</code></a>
bump openssl in CI (<a
href="https://redirect.github.com/pyca/cryptography/issues/10298">#10298</a>)
(<a
href="https://redirect.github.com/pyca/cryptography/issues/10299">#10299</a>)</li>
<li><a
href="https://github.com/pyca/cryptography/commit/002e886f16d8857151c09b11dc86b35f2ac9aec3"><code>002e886</code></a>
Fixes <a
href="https://redirect.github.com/pyca/cryptography/issues/10294">#10294</a>
-- correct accidental change to exchange kwarg (<a
href="https://redirect.github.com/pyca/cryptography/issues/10295">#10295</a>)
(<a
href="https://redirect.github.com/pyca/cryptography/issues/10296">#10296</a>)</li>
<li><a
href="https://github.com/pyca/cryptography/commit/92fa9f2f606caea5d499c825e832be5bac6f0c23"><code>92fa9f2</code></a>
support bytes-like consistently across our asym sign/verify APIs (<a
href="https://redirect.github.com/pyca/cryptography/issues/10260">#10260</a>)
(<a
href="https://redirect.github.com/pyca/cryptography/issues/1">#1</a>...</li>
<li><a
href="https://github.com/pyca/cryptography/commit/6478f7e28be54b51931277235de01b249ceabd96"><code>6478f7e</code></a>
explicitly support bytes-like for signature/data in RSA sign/verify (<a
href="https://redirect.github.com/pyca/cryptography/issues/10259">#10259</a>)
...</li>
<li><a
href="https://github.com/pyca/cryptography/commit/4bb8596ae02d95bb054dbcf55e8771379dbe0c19"><code>4bb8596</code></a>
fix the release script (<a
href="https://redirect.github.com/pyca/cryptography/issues/10233">#10233</a>)
(<a
href="https://redirect.github.com/pyca/cryptography/issues/10254">#10254</a>)</li>
<li><a
href="https://github.com/pyca/cryptography/commit/337437dc2e62772bde4ad5544f4b1db9ee7572d9"><code>337437d</code></a>
42.0.1 bump (<a
href="https://redirect.github.com/pyca/cryptography/issues/10252">#10252</a>)</li>
<li><a
href="https://github.com/pyca/cryptography/commit/56255de6b2d1a2d2e502b0275231ca81907f33f1"><code>56255de</code></a>
allow SPKI RSA keys to be parsed even if they have an incorrect
delimiter (<a
href="https://redirect.github.com/pyca/cryptography/issues/1">#1</a>...</li>
<li><a
href="https://github.com/pyca/cryptography/commit/12f038b38af76e36efe8cef09597010c97647e8f"><code>12f038b</code></a>
fixes <a
href="https://redirect.github.com/pyca/cryptography/issues/10237">#10237</a>
-- correct EC sign parameter name (<a
href="https://redirect.github.com/pyca/cryptography/issues/10239">#10239</a>)
(<a
href="https://redirect.github.com/pyca/cryptography/issues/10240">#10240</a>)</li>
<li>See full diff in <a
href="https://github.com/pyca/cryptography/compare/42.0.0...42.0.2">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=cryptography&package-manager=pip&previous-version=42.0.0&new-version=42.0.2)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts page](https://github.com/intel/llvm/network/alerts).

</details>

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Alexey Bader <[email protected]>
…tel#12748)

Warnings fixed:
- deprecated scatter_rgba
- deprecated get_cl_code
- deprecated lsc_fence
- deprecated uchar type usage
- deprecated get_access on HOST
- deprecated get_pointer
- usage of isfinite with -ffast-math
- deprecated dpas_argument_type::s1
- deprecated gpu_selector()

Also, the memory alloc/free in the histogram*.cpp tests was updated to
make memory leaks easier to avoid.

Signed-off-by: Klochkov, Vyacheslav N <[email protected]>
Scheduled drivers uplift

Co-authored-by: GitHub Actions <[email protected]>
…ed cmd-list

Update the design doc.
Update the UR tag.
@mfrancepillois mfrancepillois force-pushed the maxime/UR-improve-ZE-enqueue-delay branch from 2840382 to 8e21a1d Compare February 20, 2024 17:06
@mfrancepillois
Collaborator Author

Upstream PR intel#12770

Labels
Graph Implementation Related to DPC++ implementation and testing