[AutoBump] Merge with e55d6f5e (Sep 11) (22) #376

) Migrate CodeGenHWModes to use const RecordKeeper and const Record pointers. This is part of an effort to have better const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089

The `@llvm.dx.typedBufferLoad` intrinsic is lowered to `@dx.op.bufferLoad`. There's some complexity here in translating to scalarized IR, which I've abstracted out into a function that should be useful for samples, gathers, and CBuffer loads. I've also updated the DXILResources.rst docs to match what I'm doing here and the proposal in llvm/wg-hlsl#59. I've removed the content about stores and raw buffers for now with the expectation that it will be added along with the work. Note that this change includes a bit of a hack in how it deals with `getOverloadKind` for the `dx.ResRet` types - we need to adjust how we deal with operation overloads to generate a table directly rather than proxy through the OverloadKind enum, but that's left for a later change here. Part of llvm#91367 Pull Request: llvm#104252

The inconsistency surfaced in llvm#95305. Split off to reduce the diff.

…lvm#107899) This reapplies 8fa66c6 ([asan][windows] Eliminate the static asan runtime on windows) for a second time. That PR bounced off the tests because it caused failures in the other sanitizer runtimes; these have been fixed by only building interception, sanitizer_common, and asan with /MD, and continuing to build the rest of the runtimes with /MT. This does mean that any usage of the static ubsan/fuzzer/etc runtimes will mix different runtime library linkages in the same app. The interception, sanitizer_common, and asan runtimes are designed for this; however, it does result in some linker warnings. Additionally, it turns out that when building in release mode with LLVM_ENABLE_PDBs the build system forced /OPT:ICF. This totally breaks asan's "new" method of doing "weak" functions on Windows, and so /OPT:NOICF was explicitly added to asan's link flags. --------- Co-authored-by: Amy Wishnousky <[email protected]>

Fills in many missing functions from VectorType

This information helps with tuning the heuristic of selecting memory groups to release the unused pages.

Fix a bug where `lto_runtime_lib_symbols_list` returns the address of a local variable that is freed when it goes out of scope. This is a regression from llvm#98512, which rewrote the runtime libcall function lists into a SmallVector. rdar://135559037

Partially fixes llvm#70078
### Changes
- Added `int_spv_sign` intrinsic in `IntrinsicsSPIRV.td`
- Added lowering and map to `int_spv_sign` in `SPIRVInstructionSelector.cpp`
- Added SPIR-V backend test case in `llvm/test/CodeGen/SPIRV/hlsl-intrinsics/sign.ll`
### Related PRs
- llvm#101988
- llvm#101989

…lvm#107858) Change CGIOperandList::OperandInfo::Rec and CGIOperandList::TheDef to const pointers. This is part of an effort to have better const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089

This patch implements the Pass base class and the FunctionPass sub-class that operate on Sandbox IR.

…07919) Fixes generation of invalid loads leading to misaligned access errors. The bug got exposed by SLP vectorizer change ec360d6 which allowed SLP to produce `v16i8` vectors. Also updated the tests to use automatic check generator.

Shifts are the same as sub, where rhs == 0 is the identity. `and` is the inverted case, with -1 as the identity: `SELECT (AND(X,1) == 0), (AND Y, Z), Y` -> `(AND Y, (OR NEG(AND(X, 1)), Z))`. Closes llvm#107910
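The `and` fold can be sanity-checked outside the compiler. The following is a minimal Python sketch, not part of the patch: it models 4-bit integers and brute-forces both forms of the expression. The function names and the 4-bit width are assumptions made for illustration.

```python
MASK = 0xF  # model 4-bit integers so the search space stays tiny


def select_form(x, y, z):
    # SELECT (AND(X,1) == 0), (AND Y, Z), Y
    return (y & z) if (x & 1) == 0 else y


def folded_form(x, y, z):
    # (AND Y, (OR NEG(AND(X, 1)), Z))
    # NEG(AND(X, 1)) is 0 when the low bit is clear and all-ones (-1) when set.
    neg = (-(x & 1)) & MASK
    return y & (neg | z)


def forms_agree():
    # Exhaustively compare both forms over every 4-bit x, y, z.
    return all(
        select_form(x, y, z) == folded_form(x, y, z)
        for x in range(16)
        for y in range(16)
        for z in range(16)
    )
```

When the low bit of X is set, the OR collapses to all-ones, which is the identity for AND, so the result is just Y.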

…m#107498) Apparently, there are two almost identical implementations: one for MachO and another one for ELF. The ELF bits somehow slipped through while llvm#84573 was reviewed. The implementation is identical to the MachO case.

This patch implements sandboxir::UndefValue mirroring llvm::UndefValue.

Lower `fcopysign` SDNodes into `copysign` PTX instructions where possible. See [PTX ISA: 9.7.3.2. Floating Point Instructions: copysign](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#floating-point-instructions-copysign).

…vm#107499) This patch enables experimenting with the contextual profile. ICP is currently disabled in this case - will reenable it subsequently. Also subsequently the inline cost model / decision making would be updated to be context-aware. Right now, this just achieves "complete use" of the profile, in that it's ingested, maintained, and sunk to a flat profile when not needed anymore. Issue [llvm#89287](llvm#89287)

This patch drops the redundant rankReductionStrategy in `populateFoldUnitExtentDimsViaSlicesPatterns` and fixes comment typos.

…lvm#107432) After llvm#92205, LoongArch ISel selects `div.w` for `trunc i64 (sdiv i64 3202030857, (sext i32 X to i64)) to i32`. This is incorrect since `3202030857` is not a signed 32-bit constant. It produces a wrong result when `X == 2`: https://alive2.llvm.org/ce/z/pzfGZZ This patch adds additional `sexti32` checks to the operands of `PatGprGpr_32`. Alive2 proof: https://alive2.llvm.org/ce/z/AkH5Mp Fix llvm#107414.
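The `X == 2` counterexample can be reproduced with plain integer arithmetic. The sketch below is a simplified Python model, not LoongArch semantics verbatim: it assumes the 32-bit divide effectively sees only the low 32 bits of the dividend, reinterpreted as signed. All helper names are hypothetical.

```python
def to_signed(value, bits):
    # Reinterpret the low `bits` bits of value as a two's-complement integer.
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def sdiv(a, b):
    # Signed division rounding toward zero, like LLVM's sdiv (b must be nonzero).
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def reference(x):
    # trunc i64 (sdiv i64 3202030857, (sext i32 X to i64)) to i32
    return to_signed(sdiv(3202030857, to_signed(x, 32)), 32)


def narrowed(x):
    # Assumed model of the bad selection: the dividend is treated as a
    # signed 32-bit value, which turns 3202030857 into a negative number.
    return to_signed(sdiv(to_signed(3202030857, 32), to_signed(x, 32)), 32)
```

Under this model `reference(2)` is 1601015428 while `narrowed(2)` is -546468219, because 3202030857 does not fit in a signed 32-bit integer.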

…#107909) Make constructors that take const Record * implicit, allowing us to simplify some range based loops to use that class instance as the loop variable. Change remaining constructor calls to use () instead of {} to construct objects.

Fixes: llvm#107355 Reviewed By: SixWeining Pull Request: llvm#107523

After 2773719 e.g. ``` external/llvm-project/libc/test/src/math/smoke/NextTowardTest.h:12:10: error: module llvm-project//libc/test/src/math/smoke:nexttowardf_test does not depend on a module exporting 'src/__support/CPP/bit.h' ```

…lvm#107897) After landing llvm#99285 we found that the call graph update was causing the following crash when expensive checks are turned on ``` llvm-project/llvm/lib/Analysis/CGSCCPassManager.cpp:982: LazyCallGraph::SCC &updateCGAndAnalysisManagerForPass(LazyCallGraph &, LazyCallGraph::SCC &, LazyCallGraph::Node &, CGSCCAnalysisManager &, CGSCCUpdateResult &, FunctionAnalysisManager &, bool): Assertion `(RC == &TargetRC || RC->isAncestorOf(TargetRC)) && "New call edge is not trivial!"' failed. ``` I have to admit I believe that the call graph update process I did for that patch could be wrong. After reading the code in `CGSCCToFunctionPassAdaptor`, I am convinced that `CoroAnnotationElidePass` can be a FunctionPass and rely on the adaptor to update the call graph for us, so long as we properly invalidate the caller's analyses. After this patch, `llvm/test/Transforms/Coroutines/coro-transform-must-elide.ll` no longer fails under expensive checks.

We shouldn't assume that we're using the system zlib installation.

This patch tries to infer is-power-of-2 from assumptions. I don't see that this kind of assumption exists in my dataset. Related issue: rust-lang/rust#129795 Close llvm#58996.

Data type conversion between fp16 and bf16 will generate fptrunc and fpextend nodes, but they are actually bitcast nodes.

Partially fixes llvm#70078
### Changes
- Implemented `sign` clang builtin
- Linked `sign` clang builtin with `hlsl_intrinsics.h`
- Added sema checks for `sign` to `CheckHLSLBuiltinFunctionCall` in `SemaChecking.cpp`
- Add codegen for `sign` to `EmitHLSLBuiltinExpr` in `CGBuiltin.cpp`
- Add codegen tests to `clang/test/CodeGenHLSL/builtins/sign.hlsl`
- Add sema tests to `clang/test/SemaHLSL/BuiltIns/sign-errors.hlsl`
### Related PRs
- llvm#101987
- llvm#101988
### Discussion
- Should there be a `usign` intrinsic that handles the unsigned cases?

One more after 2773719

Dependant lists hold raw pointers back to EDUs that depend on them. We need to remove these entries before destroying the EDU or we'll be left with a dangling reference that can result in use-after-free bugs. No testcase: This has only been observed in multi-threaded setups that reproduce the issue inconsistently. rdar://135403614

… streams (llvm#97470) Currently, LLDB assumes all minidumps will have unique sections. This is intuitive because almost all of the minidump sections are themselves lists. Exceptions including Signals are unique in that they are all individual sections with their own directory. This means LLDB fails to load minidumps with multiple exceptions due to them not being unique. This behavior is erroneous and this PR introduces support for an arbitrary number of exception streams. Additionally, stop info was calculated only for a single thread before, and now we properly support mapping exceptions to threads. ~~This PR is starting in DRAFT because implementing testing is still required.~~

Add appropriate scopes and use reverse-order iteration in LocalScope::emitDestructors().

…atch arg. This decouples function argument serialization / deserialization from the function call dispatch mechanism. This will eventually allow us to replace the existing __orc_rt_jit_dispatch function with a system that supports pre-linking parts of the ORC runtime into the executor.

In llvm#107827 we now set true's passthru to the false operand if it was undef. We need to remember to also constrain the regclass in case true is a masked pseudo which needs its passthrus to be in VR[M*]NoV0

)" This fixes llvm#107950 and adds a test case for it. The issue was due to us incorrectly assuming that we stored a V0Defs entry for every single instruction. We actually only store them for instructions that use V0, so when we updated the V0Def after moving we sometimes ended up copying nullptr over from an instruction that doesn't use V0 and clearing the V0Def entry inadvertently. Because we don't have V0Defs on instructions that don't use V0, the FIXME was never actually needed in the first place since the bookkeeping wasn't out of sync to begin with. That commit also mentioned that a future unmasked to masked pseudo peephole might need unmasked pseudos to have V0Defs entries, but after working on this locally it turns out we don't. This reverts commit ce36480.

…05865) It is a violation of the standard to use 0 length arrays, especially when not at the end of a structure (not a FAM GNU extension). Compilers generally accept it, but it's probably better to have a conforming implementation. --------- Co-authored-by: Louis Dionne <[email protected]>

add more patterns clarify wip_match_opcode usage

Haiku has pthread_setname_np() / pthread_getname_np().

…107618)" This reverts commit 7543d09. This change caused failed asserts when building the openmp assembly sources, reproducible with: $ llvm-ml -m64 -D_M_AMD64 -c -Fo out.obj openmp/runtime/src/z_Windows_NT-586_asm.asm llvm-ml: ../lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp:624: void {anonymous}::X86MCCodeEmitter::emitMemModRMByte(const llvm::MCInst&, unsigned int, unsigned int, uint64_t, {anonymous}::PrefixKind, uint64_t, llvm::SmallVectorImpl<char>&, llvm::SmallVectorImpl<llvm::MCFixup>&, const llvm::MCSubtargetInfo&, bool) const: Assertion `IndexReg.getReg() == 0 && !ForceSIB && "Invalid rip-relative address"' failed. The assert can also be triggered with one lone instruction: lea rdx, QWORD PTR [rax*8+16]

…#100361) Allow customization of the `resolveCallable` method in the `CallOpInterface`. This change allows for operations implementing this interface to provide their own logic for resolving callables. - Introduce the `resolveCallable` method, which does not include the optional symbol table parameter. This method replaces the previously existing extra class declaration `resolveCallable`. - Introduce the `resolveCallableInTable` method, which incorporates the symbol table parameter. This method replaces the previous extra class declaration `resolveCallable` that used the optional symbol table parameter.

This commit adds support for `nvvm.breakpoint` Op which lowers to the PTX brkpt instruction. Also, added the respective tests in `nvvmir.mlir`

…ric dispatch arg." This reverts commit 462251b. This reverts commit 9b67c99. Build fails for compiler-rt/lib/orc/tests/unit/wrapper_function_utils_test.cpp https://buildkite.com/llvm-project/upstream-bazel/builds/109731#0191da59-6710-4420-92ef-aa6e0355cb2c

…e`" (llvm#107984) Reverts llvm#100361 This commit caused some linker errors. (Missing `MLIRCallInterfaces` dependency.)

…lvm#107585) Introduces loop hoisting to ARM SME E2E tests to allow hoisting of the tile load, which offers a very important speedup. Discussed here: https://discourse.llvm.org/t/mlir-for-arm-sme-reducing-tile-data-transfers/80065/2

…t shift amounts. NFC

This PR adds `f6E3M2FN` type to mlir. `f6E3M2FN` type is proposed in [OpenCompute MX Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf). It defines a 6-bit floating point number with bit layout S1E3M2. Unlike IEEE-754 types, there are no infinity or NaN values.
```c
f6E3M2FN
- Exponent bias: 3
- Maximum stored exponent value: 7 (binary 111)
- Maximum unbiased exponent value: 7 - 3 = 4
- Minimum stored exponent value: 1 (binary 001)
- Minimum unbiased exponent value: 1 - 3 = -2
- Has Positive and Negative zero
- Doesn't have infinity
- Doesn't have NaNs

Additional details:
- Zeros (+/-): S.000.00
- Max normal number: S.111.11 = ±2^(4) x (1 + 0.75) = ±28
- Min normal number: S.001.00 = ±2^(-2) = ±0.25
- Max subnormal number: S.000.11 = ±2^(-2) x 0.75 = ±0.1875
- Min subnormal number: S.000.01 = ±2^(-2) x 0.25 = ±0.0625
```
Related PRs:
- [PR-94735](llvm#94735) [APFloat] Add APFloat support for FP6 data types
- [PR-97118](llvm#97118) [MLIR] Add f8E4M3 type - was used as a template for this PR
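The listed extremes follow directly from the S1E3M2 layout and the bias of 3. A minimal Python decoder, written for illustration only (it is not the APFloat implementation, and the function name is hypothetical), reproduces them:

```python
def decode_f6e3m2fn(bits):
    # Layout S1E3M2: bit 5 is the sign, bits 4..2 the exponent, bits 1..0 the mantissa.
    sign = -1.0 if (bits >> 5) & 1 else 1.0
    exponent = (bits >> 2) & 0b111
    mantissa = bits & 0b11
    bias = 3
    if exponent == 0:
        # Subnormal: no implicit leading one, exponent fixed at 1 - bias = -2.
        return sign * 2.0 ** (1 - bias) * (mantissa / 4)
    # Normal: implicit leading one. There are no infinity or NaN encodings,
    # so the all-ones exponent is an ordinary finite value.
    return sign * 2.0 ** (exponent - bias) * (1 + mantissa / 4)
```

For example, `decode_f6e3m2fn(0b011111)` gives 28.0 (max normal) and `decode_f6e3m2fn(0b000001)` gives 0.0625 (min subnormal).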

…07372) As suggested in the comments of llvm#105573

…107971) If all incoming values of `div.d` are sign-extended and all users only use the lower 32 bits, then convert them to W versions. Fixes: llvm#107946

…th vector types (llvm#104606) Check that `binop(zext(value), other)` is possible and profitable to transform into `zext(binop(value, trunc(other)))`. When a CPU architecture has an illegal scalar type iX, but the vector type <N * iX> is legal, scalar expressions before vectorisation may be extended to a legal type iY. This extension could result in underutilization of vector lanes, as more lanes could be used in one instruction with the narrower type. Vectorisers may not always recognize opportunities for type shrinking, and this patch aims to address that limitation.
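To see why the transform needs a "possible" check, here is a small Python model. It is a sketch under assumptions and not the patch's actual legality logic: i8 stands in for the narrow type, i32 for the wide one, and all names are hypothetical. For bitwise AND the two forms always agree, whereas for OR they differ as soon as `other` has bits set above the narrow width.

```python
def trunc(value, bits):
    # Keep only the low `bits` bits, like LLVM's trunc.
    return value & ((1 << bits) - 1)


def zext(value, bits):
    # Zero-extend a `bits`-wide value; on Python ints this is just masking.
    return value & ((1 << bits) - 1)


def wide_and(value8, other32):
    # binop(zext(value), other) with binop = and
    return zext(value8, 8) & trunc(other32, 32)


def narrow_and(value8, other32):
    # zext(binop(value, trunc(other))) with binop = and
    return zext(trunc(value8, 8) & trunc(other32, 8), 8)


def wide_or(value8, other32):
    return zext(value8, 8) | trunc(other32, 32)


def narrow_or(value8, other32):
    return zext(trunc(value8, 8) | trunc(other32, 8), 8)


def and_always_matches():
    # Compare both AND forms over every i8 value and a few wide operands.
    others = (0, 0xFF, 0x100, 0xDEADBEEF, 0xFFFFFFFF)
    return all(wide_and(v, o) == narrow_and(v, o) for v in range(256) for o in others)
```

With `other = 0x100`, `wide_or(1, 0x100)` is 0x101 but `narrow_or(1, 0x100)` is 0x01, so shrinking an OR is only valid when the high bits of `other` are known to be zero.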

Also updates and clarifies which version would be installed. As per https://discourse.llvm.org/t/information-on-lit-is-outdated/76498.

…llvm#95305) Similar to VFxUF, also add a VF VPValue to VPlan and use it to get the runtime VF in VPWidenIntOrFpInductionRecipe. Code for VF is only generated if there are users of VF, to avoid unnecessary test changes. PR: llvm#95305

This PR makes the tosa.negate op for integer types use the simplified calculation branch if the input_zp and output_zp values are also zero. Signed-off-by: Dmitriy Smirnov <[email protected]>

…h fixes. This reapplies commits 462251b and 9b67c99, which were reverted in 53d35c4 due to bot failures for the wrapper_function_utils_test.cpp unit test.

…xtend (llvm#105375) GCC compiles the built-in function `__builtin_bswap16` to the ARM instruction rev16, which reverses the byte order of 16-bit data. On the other hand, Clang compiles the same built-in function to e.g. ``` rev w8, w0 lsr w0, w8, #16 ``` i.e. it performs a byte reversal of a 32-bit register (which moves the lower half, which contains the 16-bit data, to the upper half) and then right shifts the reversed 16-bit data back to the lower half of the register. We can improve Clang codegen by generating `rev16` instead of `rev` and `lsr`, like GCC.
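The equivalence of the two sequences on the low halfword can be checked with a bit-level model. This Python sketch is illustrative only: the helper names are hypothetical and it models a 32-bit W register as a plain integer.

```python
def rev32(w):
    # rev on a W register: reverse all four bytes.
    return int.from_bytes((w & 0xFFFFFFFF).to_bytes(4, "little"), "big")


def rev_then_lsr(w0):
    # rev w8, w0 ; lsr w0, w8, #16
    return rev32(w0) >> 16


def rev16(w):
    # rev16 on a W register: swap the two bytes inside each 16-bit halfword.
    w &= 0xFFFFFFFF
    return ((w & 0x00FF00FF) << 8) | ((w & 0xFF00FF00) >> 8)
```

For `w0 = 0xABCD1234`, `rev_then_lsr` yields 0x3412 and `rev16` yields 0xCDAB3412. The low 16 bits match, which is all that matters when the upper bits of the result are not observed.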

…7265) Update the predicate protecting bfloat instructions to only reference FEAT_SVE_B16B16, which matches the specification. Rename and move instruction classes to match the names of the encoding groups the bfloat arithmetic instructions belong to.

This patch creates a simple RAII wrapper class for `SymMap` to make it easier to use and prevent a missing matching `popScope()` for a `pushScope()` call on simple use cases. Some push-pop pairs are replaced with instances of the new class by this patch.

…lvm#107241) This allows e.g. DWARFDIE::GetName() to return the name of the type when looking at its declaration (which contains only DW_AT_declaration+DW_AT_signature). This is similar to how we recurse through DW_AT_specification when looking for a function name. The LLVM DWARF parser has obtained the same functionality through llvm#99495. This fixes a bug where we would confuse a type like NS::Outer::Struct with NS::Struct (because NS::Outer (and its name) was in a type unit).

This intrinsic is meant to be used in functions that have a "tail" that needs to be run with all the lanes enabled. The "tail" may contain complex control flow that makes it unsuitable for the use of the existing WWM intrinsics. Instead, we will pretend that the function starts with all the lanes enabled, then branches into the actual body of the function for the lanes that were meant to run it, and then finally all the lanes will rejoin and run the tail. As such, the intrinsic will return the EXEC mask for the body of the function, and is meant to be used only as part of a very limited pattern (for now only in amdgpu_cs_chain functions):
```
entry:
  %func_exec = call i1 @llvm.amdgcn.init.whole.wave()
  br i1 %func_exec, label %func, label %tail

func:
  ; ... stuff that should run with the actual EXEC mask
  br label %tail

tail:
  ; ... stuff that runs with all the lanes enabled;
  ; can contain more than one basic block
```
It's an error to use the result of this intrinsic for anything other than a branch (but unfortunately checking that in the verifier is non-trivial because SIAnnotateControlFlow will introduce an amdgcn.if between the intrinsic and the branch). The intrinsic is lowered to a SI_INIT_WHOLE_WAVE pseudo, which for now is expanded in si-wqm (which is where SI_INIT_EXEC is handled too); however the information that the function was conceptually started in whole wave mode is stored in the machine function info (hasInitWholeWave). This will be useful in prolog epilog insertion, where we can skip saving the inactive lanes for CSRs (since if the function started with all the lanes active, then there are no inactive lanes to preserve).

This doesn't modify the PC, so pass OpPC as a copy.

In AArch64, the endianness of instruction encodings is always little, whereas the endianness of data swaps between LE and BE modes. So getImplicitAddend must use the right one of read32() and read32le(), for data and code respectively. It was using read32() throughout, causing instructions to be read as big-endian in BE mode, getting the wrong addend. Fixed, and updated the existing test to check both endiannesses. The expected results for data must be byte-swapped, but the ones for code need no adjustment.
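The difference between the two reads can be illustrated with Python's `struct` module. This is a sketch, not the linker's code: the functions below only borrow the names `read32` and `read32le` from the description, and the explicit `big_endian` flag is an assumption standing in for the configured target endianness. The example instruction is the AArch64 NOP, encoding 0xD503201F.

```python
import struct


def read32le(buf):
    # Always little-endian: the right choice for AArch64 instruction words.
    return struct.unpack("<I", buf[:4])[0]


def read32(buf, big_endian):
    # Follows the target's data endianness: the right choice for data words.
    return struct.unpack(">I" if big_endian else "<I", buf[:4])[0]


# An AArch64 NOP as it sits in memory, in both LE and BE modes.
NOP_BYTES = bytes.fromhex("1f2003d5")
# The 32-bit data value 0x10 as stored by a big-endian target.
DATA_BE = bytes.fromhex("00000010")
```

Reading `NOP_BYTES` with `read32(..., big_endian=True)` yields 0x1F2003D5 instead of 0xD503201F, which is the kind of mis-decoded instruction that leads to a wrong addend. The data word still needs the endian-aware read.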

…nstructions (llvm#101317) When the AArch64LoadStoreOptimizer pass merges an SP update with a load/store instruction and needs to adjust unwind information, either:
* create the merged instruction at the location of the SP update (so no CFI instructions are moved), or
* only move a CFI instruction if the move would not reorder it across other CFI instructions

If neither of the above is possible, don't perform the optimisation.

…llvm#107879) Mostly NFC, I was bothered by the declarations that were always made even if unused, and I think using LLVM Ops is nicer anyway with regards to side effects here. ``` func.func private @llvm.stacksave.p0() -> !fir.ref<i8> func.func private @llvm.stackrestore.p0(!fir.ref<i8>) ``` There are other places in lowering that are using the calls instead of the LLVM intrinsics, but I will deal with them another time (the issue there is mostly to get the proper address space for the llvm.ptr type).

…V files (llvm#107911) When we introduced the machinery for transitive includes validation, at some point we stopped including the full set of transitive includes in the CSV files and instead only tracked the set of public headers included *directly* by a top-level header. The reason for doing that was so that the CSV files containing "transitive" includes could be used to draw the dependency graph of libc++ headers. However, the downside was that it made the contents of the CSV files much harder to interpret. In particular, many changes that modify the CSV files do not in fact modify the effective set of transitive includes, which is confusing. This patch goes back to storing the full set of transitive includes in the CSV files and removes the ability to graph the libc++ includes directly from those CSV files, which we never actually used.

Based on the output of llvm/utils/gn/build/sync_source_lists_from_cmake.py and reading the diff, but not actually tested on Windows.

…lvm#107989) Relands llvm#100361 with fixed dependencies.

…PU. (llvm#107997) HIPAMDToolChain and AMDGPUOpenMPToolChain both depend on the "shouldSkipSanitizeOption" API to sanitize/not sanitize device code.

…libm calls (llvm#99517) This patch invokes a pass when compiling for an AMDGPU target to lower math operations to AMD GPU library calls instead of libm calls.

…input LLVM IR module to SPIR-V (llvm#107216) The goal of this PR is to facilitate integration of the SPIRV Backend into misc 3rd party tools and libraries by exposing an API call that translates an LLVM module to SPIR-V and writes the result into a string as binary SPIR-V output, providing diagnostics on failure and a means of configuring translation in the style of command line options. An example use case may be the Khronos Translator, which provides bidirectional translation LLVM IR <=> SPIR-V, where the LLVM IR => SPIR-V step may be substituted by a call to the SPIR-V Backend API implemented by this PR.

The resolution of LWG2593 didn't require the standard library implementation to change. It merely strengthened requirements on user-defined allocator types and allowed the implementation to make stronger assumptions. The status is tentatively set to Nothing To Do. However, `test_allocator` in libc++'s test suite needs to be fixed to conform to the strengthened requirements. Closes llvm#100220.

This commit introduces support for outlining functions across modules using codegen data generated from previous codegen. The codegen data currently manages the outlined hash tree, which records outlining instances that occurred locally in the past. The machine outliner now operates in one of three modes:
1. CGDataMode::None: This is the default outliner mode that uses the suffix tree to identify (local) outlining candidates within a module. This mode is also used by (full)LTO to maintain optimal behavior with the combined module.
2. CGDataMode::Write (`-codegen-data-generate`): This mode is identical to the default mode, but it also publishes the stable hash sequences of instructions in the outlined functions into a local outlined hash tree. It then encodes this into the `__llvm_outline` section, which will be dead-stripped at link time.
3. CGDataMode::Read (`-codegen-data-use-path={.cgdata}`): This mode reads a codegen data file (.cgdata) and initializes a global outlined hash tree. This tree is used to generate global outlining candidates. Note that the codegen data file has been post-processed with the raw `__llvm_outline` sections from all native objects using the `llvm-cgdata` tool (or a linker, `LLD`, or a new ThinLTO pipeline later).

This depends on llvm#105398. After this PR, LLD (llvm#90166) and Clang (llvm#90304) will follow for each client side support. This is a patch for https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753.

…#107425) The copyin clause currently forbids pointer and allocatable variables, which are allowed by the OpenMP 1.1 and 3.0 specifications respectively.

…te (llvm#107895) This patch refactors the procedure of getting the register number from a register name to LLVMState rather than having individual users get the values themselves by getting a reference to the map from LLVMState. This is primarily intended to make some downstream usage in Gematria simpler, but also cleans up a little bit upstream by pulling the actual map searching out and just leaving error handling to the clients. The original getter is left to enable downstream migration in Gematria, particularly before it gets imported into google internal.

for 2773719

Co-authored-by: Owen Pan <[email protected]>

MacroAnnotations has three std::optional fields. Functions makeDeprecation, makeRestrictExpansion, and makeFinal construct an instance of MacroAnnotations with one field initialized with a non-default value (that is, some value other than std::nullopt). Functions addMacroDeprecationMsg, addRestrictExpansionMsg, and addFinalLoc either create a new map entry with one field initialized with a non-default value or replace one field of an existing map entry. We can do all this with a simple statement of the form: AnnotationInfos[II].FieldName = NonDefaultValue; which takes care of default initialization of the fields with std::nullopt when a requested map entry does not exist.
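The "index, then assign one field" pattern can be modelled in Python with a `defaultdict`, which creates a default entry on first access much like `operator[]` on a C++ map. This is only an analogy to the C++ change, not the Clang code: the field names and snake_case function names below are hypothetical, and `None` stands in for `std::nullopt`.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class MacroAnnotations:
    # Three optional fields, all defaulting to "unset".
    deprecation: Optional[str] = None
    restrict_expansion: Optional[str] = None
    final_loc: Optional[int] = None


# Missing keys get a default-constructed MacroAnnotations on first access.
annotation_infos = defaultdict(MacroAnnotations)


def add_macro_deprecation_msg(name, msg):
    # Creates the entry if needed, then overwrites exactly one field.
    annotation_infos[name].deprecation = msg


def add_final_loc(name, loc):
    annotation_infos[name].final_loc = loc
```

Calling `add_macro_deprecation_msg("FOO", "use BAR")` on an empty map leaves the other two fields unset, and a later `add_final_loc("FOO", 42)` updates the same entry without touching the deprecation message.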

In the `lowerPack` transform, there is a special case for lowering into a simple `tensor.pad` + `tensor.insert_slice`, but the destination becomes a newly created `tensor.empty`. This PR fixes the transform to reuse the original destination of the `tensor.pack`.

…arwin (llvm#108003) This started failing on the macOS CI after llvm#106885: ``` lldb-api :: commands/expression/import-std-module/vector-dbg-info-content/TestDbgInfoContentVectorFromStdModule.py "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang" -std=c++11 -g -O0 -isysroot "/Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk" -arch arm64 -I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/make/../../../../..//include -I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/tools/lldb/include -I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/commands/expression/import-std-module/vector-dbg-info-content -I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/make -include /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/make/test_common.h -fno-limit-debug-info -nostdlib++ -nostdinc++ -cxx-isystem /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/include/c++/v1 --driver-mode=g++ -MT main.o -MD -MP -MF main.d -c -o main.o /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/commands/expression/import-std-module/vector-dbg-info-content/main.cpp "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang" main.o -g -O0 -isysroot "/Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk" -arch arm64 -I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/make/../../../../..//include -I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/tools/lldb/include -I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/commands/expression/import-std-module/vector-dbg-info-content 
-I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/make -include /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/make/test_common.h -fno-limit-debug-info -L/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/lib -Wl,-rpath,/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/lib -lc++ --driver-mode=g++ -o "a.out" ld: warning: ignoring duplicate libraries: '-lc++' codesign --entitlements /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/make/entitlements-macos.plist -s - "a.out" "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/./bin/dsymutil" -o "a.out.dSYM" "a.out" runCmd: settings set target.import-std-module true output: runCmd: expr std::reverse(a.begin(), a.end()) Assertion failed: (isa<InjectedClassNameType>(Decl->TypeForDecl)), function getInjectedClassNameType, file ASTContext.cpp, line 5057. PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace. Stack dump: 0. HandleCommand(command = "expr std::reverse(a.begin(), a.end())") 1. <eof> parser at end of file 2. /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/include/c++/v1/__algorithm/reverse.h:54:1: instantiating function definition 'std::reverse<std::__wrap_iter<Foo *>>' 3. /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/include/c++/v1/__algorithm/reverse.h:47:58: instantiating function definition 'std::__reverse<std::_ClassicAlgPolicy, std::__wrap_iter<Foo *>, std::__wrap_iter<Foo *>>' 4. /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/include/c++/v1/__algorithm/reverse.h:40:1: instantiating function definition 'std::__reverse_impl<std::_ClassicAlgPolicy, std::__wrap_iter<Foo *>>' ```

) Instead of visiting call sites in Attribute::checkForAllUses, we now keep track of returns in AAPointerInfo and use the call site return information as required. This way, the user of AAPointerInfo(CallSite)Argument can determine if the call return should be visited. We do not collect them as "may accesses" in the AAPointerInfo(CallSite)Argument itself in case a return user is found.

…d from different modules (llvm#104512) Summary: Because AST loading code is lazy and happens in unpredictable order it could happen that function and lambda inside function can be loaded from different modules. In this case, captured DeclRefExpr won’t match the corresponding VarDecl inside function. In AST it looks like this: ``` FunctionDecl 0x555564f4aff0 <Conv.h:33:1, line:41:1> line:33:35 imported in ./thrift_cpp2_base.h hidden tryTo 'Expected<Tgt, const char *> ()' inline |-also in ./folly-conv.h `-CompoundStmt 0x555564f7cfc8 <col:43, line:41:1> |-DeclStmt 0x555564f7ced8 <line:34:3, col:17> | `-VarDecl 0x555564f7cef8 <col:3, col:16> col:7 imported in ./thrift_cpp2_base.h hidden referenced result 'Tgt' cinit | `-IntegerLiteral 0x555564f7d080 <col:16> 'int' 0 |-CallExpr 0x555564f7cea8 <line:39:3, col:76> '<dependent type>' | |-UnresolvedLookupExpr 0x555564f7bea0 <col:3, col:19> '<overloaded function type>' lvalue (no ADL) = 'then_' 0x555564f7bef0 | |-CXXTemporaryObjectExpr 0x555564f7bcb0 <col:25, col:45> 'Expected<bool, int>':'folly::Expected<bool, int>' 'void () noexcept' zeroing | `-LambdaExpr 0x555564f7bc88 <col:48, col:75> '(lambda at Conv.h:39:48)' | |-CXXRecordDecl 0x555564f76b88 <col:48> col:48 imported in ./folly-conv.h hidden implicit <undeserialized declarations> class definition | | |-also in ./thrift_cpp2_base.h | | `-DefinitionData lambda empty standard_layout trivially_copyable literal can_const_default_init | | |-DefaultConstructor defaulted_is_constexpr | | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param | | |-MoveConstructor exists simple trivial needs_implicit | | |-CopyAssignment trivial has_const_param needs_implicit implicit_has_const_param | | |-MoveAssignment | | `-Destructor simple irrelevant trivial constexpr needs_implicit | `-CompoundStmt 0x555564f7d1a8 <col:58, col:75> | `-ReturnStmt 0x555564f7d198 <col:60, col:67> | `-DeclRefExpr 0x555564f7d0a0 <col:67> 'Tgt' lvalue Var 0x555564f7d0c8 
'result' 'Tgt' refers_to_enclosing_variable_or_capture `-ReturnStmt 0x555564f7bc78 <line:40:3, col:11> `-InitListExpr 0x555564f7bc38 <col:10, col:11> 'void' ``` This diff changes AST deserialization to load lambdas inside canonical function declaration earlier right after the function to make sure that their canonical decl is loaded from the same module. Test Plan: check-clang

llvm#108037)

…vm#90074) llvm#108037 (llvm#108047) The previous `attempt to fix [CGData][MachineOutliner] Global Outlining (llvm#90074) llvm#108037` was incomplete because the `ImmutableModuleSummaryIndexWrapperPass` is now optional for the MachineOutliner pass. With this fix, the test file `CodeGen/AArch64/O3-pipeline.ll` shows no changes compared to its state before `[CGData][MachineOutliner] Global Outlining (llvm#90074)`. Co-authored-by: Kyungwoo Lee <[email protected]>

The previous `Fix for Attempt to fix [CGData][MachineOutliner] Global Outlining (llvm#90074) llvm#108037 (llvm#108047)` somehow dropped this file.

Scalar FP calling convention has gotten more complicated with recent changes to Zfinx/Zdinx, proposed addition of a GPRF16 register class, and using customReg for f16/bf16 and other FP types smaller than XLen. The previous code tried to share a single getReg and getMem call for many different cases. This patch separates all the FP register handling to the top of the function with their own getReg calls. The only exception is f64 with XLen==32, when we are out of FPRs or not able to use FPRs due to ABI. The way I've structured this, we no longer need to correct the LocVT for FP back to ValVT before the call to getMem.

) Breaks bots, see llvm#105822. Reverts llvm#105822

…providers within lldb (llvm#102708) This PR adds a statistics provider cache, which allows an individual target to keep a rolling tally of its total time and number of invocations for a given summary provider. This information is then available in statistics dump to help identify slow summary providers and glean more insight into LLDB's time use.
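The rolling tally described above can be sketched as follows. This is a minimal illustration, not LLDB's implementation: `SummaryStats` and `SummaryStatsCache` are hypothetical names, and keying the cache by provider name is an assumption.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical per-provider tally: total time spent and number of invocations.
struct SummaryStats {
  std::chrono::nanoseconds total_time{0};
  std::uint64_t invocations = 0;
};

// Hypothetical cache keyed by summary provider name (an assumption).
class SummaryStatsCache {
public:
  // Record one invocation of `provider` that took `elapsed`.
  void Record(const std::string &provider, std::chrono::nanoseconds elapsed) {
    SummaryStats &stats = cache_[provider];
    stats.total_time += elapsed;
    ++stats.invocations;
  }

  // Return the tally for a provider; all zeroes if it was never recorded.
  SummaryStats Get(const std::string &provider) const {
    auto it = cache_.find(provider);
    return it == cache_.end() ? SummaryStats{} : it->second;
  }

private:
  std::map<std::string, SummaryStats> cache_;
};
```

A statistics dump could then walk the cache and report the providers with the largest `total_time`.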

Address the following issue: ``` ❯ ninja libc.test.src.__support.OSUtil.linux.vdso_test.__unit__ [91/127] Building CXX object libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o FAILED: libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o sccache /usr/bin/clang++ -DLIBC_NAMESPACE=__llvm_libc_20_0_0_git -D_DEBUG -I/home/schrodingerzy/Documents/llvm-project/libc -isystem /home/schrodingerzy/Documents/llvm-project/build/libc/include -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -g -std=gnu++17 -fpie -DLIBC_FULL_BUILD -ffreestanding -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -fno-rtti -MD -MT libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o -MF libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o.d -o libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o -c /home/schrodingerzy/Documents/llvm-project/libc/test/src/__support/OSUtil/linux/vdso_test.cpp In file included from /home/schrodingerzy/Documents/llvm-project/libc/test/src/__support/OSUtil/linux/vdso_test.cpp:21: In file included from /home/schrodingerzy/Documents/llvm-project/libc/test/UnitTest/ErrnoSetterMatcher.h:13: In file included from /home/schrodingerzy/Documents/llvm-project/libc/src/__support/FPUtil/fpbits_str.h:12: In file included from 
/home/schrodingerzy/Documents/llvm-project/libc/src/__support/CPP/string.h:20: /home/schrodingerzy/Documents/llvm-project/build/libc/include/stdlib.h:13:10: fatal error: 'llvm-libc-types/locale_t.h' file not found 13 | #include "llvm-libc-types/locale_t.h" | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1 error generated. [123/127] Building CXX object libc/test/UnitTest/CMakeFiles/LibcTest.unit.dir/LibcTestMain.cpp.o ninja: build stopped: subcommand failed. ```

…vm#107918) Sort the list of calls such that those with the same stack ids are also sorted by function. This allows processing of all matching calls (that can share a context node) in bulk as they are all adjacent. This has 2 benefits: 1. It reduces unnecessary work, specifically the handling to intersect the context ids with those along the graph edges for the stack ids, for calls that we know can share a node. 2. It simplifies detecting when we have matching stack ids but don't need to duplicate context ids. Specifically, we were previously still duplicating context ids whenever we saw another call with the same stack ids, but that isn't necessary if they will share a context node. With this change we now only duplicate context ids if we see some that not only have the same ids but also are in different functions. This change reduced the amount of context id duplication and provided reductions in both peak memory (~8%) and time (~5%) for a large target.
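The ordering idea can be sketched as below. This is a simplified illustration under stated assumptions, not the code from the patch: `CallInfo`, `SortCalls`, and `CountDuplicationsNeeded` are hypothetical names, and the real sort may use additional keys.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical record for a call: its stack ids and the enclosing function.
struct CallInfo {
  std::vector<std::uint64_t> stack_ids;
  std::string func;
};

// Sort so that calls with the same stack ids are adjacent and, within such a
// run, ordered by function.
void SortCalls(std::vector<CallInfo> &calls) {
  std::stable_sort(calls.begin(), calls.end(),
                   [](const CallInfo &a, const CallInfo &b) {
                     return std::tie(a.stack_ids, a.func) <
                            std::tie(b.stack_ids, b.func);
                   });
}

// Count the calls that repeat the previous call's stack ids but live in a
// different function; in this sketch only those need duplicated context ids.
std::size_t CountDuplicationsNeeded(const std::vector<CallInfo> &sorted) {
  std::size_t n = 0;
  for (std::size_t i = 1; i < sorted.size(); ++i)
    if (sorted[i].stack_ids == sorted[i - 1].stack_ids &&
        sorted[i].func != sorted[i - 1].func)
      ++n;
  return n;
}
```

Because matching calls are adjacent after sorting, a single linear pass is enough to decide where duplication is required.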

llvm#107856) Similarly to operator<(), equality-comparing iterators from different ranges must really be forbidden. The preconditions for being able to do `it1 < it2` and `it1 != it2` (or `it1 == it2` for that matter) ought to be the same. Thus, there's little sense in keeping explicit base object comparison in operator==() whilst having this as a precondition in operator<() and operator-() (e.g. used for std::distance() and such).
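A minimal sketch of the shared precondition, assuming a simple index-based iterator. `IndexedIter` and `CheckSameRange` are hypothetical names for illustration and are not the class touched by the patch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical index-based iterator: a base object plus an index into it.
struct IndexedIter {
  const int *base;
  std::ptrdiff_t index;
};

// Shared precondition: both iterators must refer to the same base object.
inline void CheckSameRange(const IndexedIter &a, const IndexedIter &b) {
  assert(a.base == b.base && "incompatible iterators");
  (void)a;
  (void)b;
}

// Equality asserts the precondition instead of folding the base comparison
// into the result, mirroring operator< and operator-.
inline bool operator==(const IndexedIter &a, const IndexedIter &b) {
  CheckSameRange(a, b);
  return a.index == b.index;
}

inline bool operator<(const IndexedIter &a, const IndexedIter &b) {
  CheckSameRange(a, b);
  return a.index < b.index;
}

inline std::ptrdiff_t operator-(const IndexedIter &a, const IndexedIter &b) {
  CheckSameRange(a, b);
  return a.index - b.index;
}
```

With this shape, comparing iterators from different ranges trips the assertion in debug builds rather than silently returning `false`.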

The `@llvm.dx.typedBufferStore` intrinsic is lowered to `@dx.op.bufferStore`. Pull Request: llvm#104253

Hit Assertion failed: Num < NumOperands && "Invalid child # of SDNode!" Fix by checking opcode and value type before calling getOperand.

…#108066) This reverts commit b0d2411. Reverting because the original commit misses case of copysign from a constant.

The CoroSplit pass has its own salvageDebugInfo implementation and its DIExpressions do not get folded. Add a call to DIExpression::foldConstantMath in the CoroSplit pass to reduce the size of those DIExpressions. [The compile time tracker shows no significant increase in compile time either.](https://llvm-compile-time-tracker.com/compare.php?from=bdf02249e7f8f95177ff58c881caf219699acb98&to=e1c1c1759c06bc4c42f79eebdb0e3cd45219cef4&stat=instructions:u) rdar://134675402

This is newly used as of 0f52545. The bulk of the targets were added earlier in 9bb5556.

…v.s/d aliases. We were missing test coverage for fneg.d/fabs.d for Zdinx. Adding it revealed that they only worked on RV64. The assembler was not creating a GPRPair register class on RV32 so the alias couldn't match. The disassembler was also not using GPRPair registers, preventing the aliases from printing in disassembly too. I've fixed the assembler by adding new parsing methods in an attempt to get decent diagnostics. This is hard since the mnemonics are ambiguous between D and Zdinx. Tests have been adjusted for some differences in what errors are reported first.

…ers (llvm#105905) Refactors `stackTrace` to perform frame lookups in a more on-demand fashion to improve overall performance. Also adds more information to the `exceptionInfo` request to report exception stacks there instead of merging the exception stack into the stack trace. The `exceptionInfo` request is only called if a stop event occurs with `reason='exception'`, which should mitigate the performance cost of `SBThread::GetCurrentException` calls. Adds unit tests for exception handling and stack trace support.

Adds target codegen info class for DirectX. For now it always translates `__hlsl_resource_t` handle to `target("dx.TypedBuffer", i32, 1, 0, 1)` (`RWBuffer<int>`). More work is needed to determine the actual target ext type and parameters based on the resource handle attributes. Part 1/2 of llvm#95952

* Move code related to spilling into SpillUtils to help cleanup CoroFrame See RFC for more info: https://discourse.llvm.org/t/rfc-abi-objects-for-coroutines/81057

…eses (llvm#107973) We should issue a warning whenever a duplicate resource type attribute is found. Currently we do that only for `resource_class`. This PR fixes that by checking for duplicate `is_rov` attributes as well. Also removes unnecessary parenthesis on `is_rov`.

The OpenACC standard makes depending on side effects effectively UB, so this patch ensures we handle them reasonably by making it a potentially evaluated context, and ignoring cleanups.

llvm#107972)

This patch implements a simple pass manager for Sandbox IR.

Update planContainsAdditionalSimplifications to also check phis not in the loop header. This ensures we don't miss cases where VPBlendRecipes (which correspond to such phis) have been simplified. Fixes llvm#107473.

…lvm#106776) Test output that carried color across newlines previously resulted in the formatting around the output also being colored. Detect the current ANSI color and reset it when printing formatting, and then reapply it. As an added bonus an unterminated color code is also detected, preventing it from leaking out into the rest of the terminal. Fixes llvm#106633

…sic (llvm#108081) cuf.data_transfer was wrongly generated when calling the `size` intrinsic on a device allocatable variable. Since the descriptor is available on the host, there is no transfer needed. Add `DescriptorInquiry` in the `CollectCudaSymbolsHelper` to filter out symbols that are not needed for the transfer decision to be made.

llvm#108091) Reverts llvm#105865 This breaks a pair of LLDB tests in CI.

…07444) When a module file has been compiled with CUDA enabled, don't emit spurious errors about non-interoperable types when that module is read by a USE statement in a later non-CUDA compilation.

The standard requires that a generic interface with the same name as a derived type contain only functions. We generally allow a generic interface to contain both functions and subroutines, since there's never any ambiguity at the point of call; this is helpful when the specific procedures of two generics are combined during USE association. Emit a warning instead of a hard error when a generic interface with the same name as a derived type contains a subroutine to improve portability of code from compilers that don't check for this condition.

This was a subtle problem. When the shape of a function result is explicit but not constant, it is characterized with bounds expressions that use Extremum<SubscriptInteger> operations to force extents to 0 rather than be negative. These Extremum operations are formatted as "max()" intrinsic functions in the module file. Upon being read from the module file, they are not folded back into Extremum operations, but remain as function references; and this then leads to expressions not comparing equal when the procedure characteristics are compared to those of a local procedure declared identically. The real fix here would be for folding to just always change max and min function references into Extremum<> operations, constant operands or not, and I tried that, but it led to test failures and crashes in lowering that I couldn't resolve. So, until those can be fixed, here's a change that will read max/min operations in module file declarations back into Extremum operations to solve the compatibility checking problem, but leave other non-constant max/min operations as function calls.

Don't require the "VALUES=" argument to the extension intrinsic procedure ETIME to have exactly two elements. Other compilers that support ETIME do not, and it's easy to adapt the behavior to whatever the dynamic size turns out to be.

…vm#107656) Specification expressions may contain references to dummy arguments, host objects, module variables, and variables in COMMON blocks, since they will have values on entry to the scope. A local variable with an initializer and the SAVE attribute (which will always be implied by an explicit initialization) will also always work, and is accepted by at least one other compiler, so accept it with a warning.

Commas are optional between edit descriptors in a format, so treat "AA" as if it were "A,A".

…llvm#107716) When scanning ahead for the first character in the next input item in list-directed internal input, allow a newline character to appear and treat it as a space, matching the behavior of nearly all other Fortran compilers.

) A defined assignment generic interface for a given LHS/RHS type & rank combination may have a specific procedure with LHS dummy argument that is neither allocatable nor pointer, or specific procedure(s) whose LHS dummy arguments are allocatable or pointer. It is possible to have two specific procedures if one's LHS dummy argument is allocatable and the other's is pointer. However, the runtime doesn't work with LHS dummy arguments that are allocatable, and will crash with a mysterious "invalid descriptor" error message. Extend the list of special bindings to include ScalarAllocatableAssignment and ScalarPointerAssignment, use them when appropriate in the runtime type information tables, and handle them in Assign() in the runtime support library.

Don't emit a bogus error about being unable to forward an assumed-rank dummy argument as an actual argument in the case of the KIND intrinsic function. Fixes llvm#107782.

…lvm#107928) Use associated procedure pointers were eliciting bogus errors from semantics if their modules also contained generic procedure interfaces of the same name. (The compiler handles this case correctly when the specific procedure of the same name is not a pointer.) With this fix, the test case in llvm#107784 no longer experiences semantic errors; however, it now crashes unexpectedly in lowering.

llvm#107620)

Existing methods in AsmTypeCheck assume the symbol operand is the 0th operand; they take a `MCInst` and call `getOperand(0)` on it. I think passing a `MCOperand` removes this assumption and also is more intuitive. This was motivated by a new `try_table` instruction, whose support is going to be added to AsmTypeCheck soon, which has tag symbol operands in any position, depending on the number and the kinds of catch clauses. This PR changes the signature of all methods that assume the 0th operand is the relevant one, even if it's not the symbol operand. This also adds a `getSignature` method, which factors out the common task of getting a `WasmSignature` from a `MCOperand`.

…lding a BB in debug mode. (llvm#108078) @vporpo suggested in an offline conversation that verifying all instructions during `BasicBlock::buildBasicBlockFromLLVMIR` would be a good way to get coverage for errors like this during testing. He also suggested not gating it on `SBVEC_EXPENSIVE_CHECKS` for now as the checks are pretty basic at the moment and they only affect Debug builds.

This patch implements a simple Pass Registry class, which takes ownership of the passes registered with it and provides an interface to get the pass pointer by its name.
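The ownership-plus-lookup shape described above can be sketched as follows. This is an illustrative sketch only: the `Pass` and `PassRegistry` classes and their member names here are hypothetical and are not the Sandbox IR API.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical pass base class that only carries a name.
class Pass {
public:
  explicit Pass(std::string name) : name_(std::move(name)) {}
  virtual ~Pass() = default;
  const std::string &getName() const { return name_; }

private:
  std::string name_;
};

// Hypothetical registry that owns its passes and finds them by name.
class PassRegistry {
public:
  // Takes ownership of `pass` and returns a reference to the stored object.
  Pass &registerPass(std::unique_ptr<Pass> pass) {
    passes_.push_back(std::move(pass));
    return *passes_.back();
  }

  // Returns the pass with the given name, or nullptr if none is registered.
  Pass *getPassByName(const std::string &name) const {
    for (const auto &p : passes_)
      if (p->getName() == name)
        return p.get();
    return nullptr;
  }

private:
  std::vector<std::unique_ptr<Pass>> passes_;
};
```

Storing `std::unique_ptr` keeps the returned pointers stable even when the vector grows, since only the owning pointers are moved.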

I broke the shared library builds a few minutes ago by introducing a cyclic dependency between two parts of the compiler. Fix.

Follow-up to llvm#102708: the tests are failing on Windows. There is a large variance in these tests between summary strings and built-in types. I'm disabling these tests on Windows and will add Windows-specific tests as a follow-up.

…tting" (llvm#108104) Reverts llvm#106776 because of a test failure on Windows.

This patch implements sandboxir::BlockAddress mirroring llvm::BlockAddress.

…m#108112) Reverts llvm#107941 Broke PPC bot

…lvm#91014) Governments around the world are starting to require labelling for AI-generated content, and some LLVM stakeholders have asked if LLVM contains AI-generated content. Defining a policy on the use of AI tools allows us to answer that question affirmatively, one way or the other. The policy proposed here allows the use of AI tools in LLVM contributions, flowing from the idea that any contribution is fine regardless of how it is made, as long as the contributor has the right to license it under the project license. I gathered input from the community in this RFC and incorporated it into the policy: https://discourse.llvm.org/t/rfc-define-policy-on-ai-tool-usage-in-contributions/78758

As suggested in llvm#107918, improve readability by converting this tuple to a struct.

… NFC The Hi result is sometimes calculated a different way and this node goes unused. Defer creation until we know for sure it is needed. The test changes are because the node creation order changed the names in the debug output.

After llvm#106155, Android arm32 asan builds stopped working with missing definition linker errors. This is due to inconsistent definitions of `uptr` of either `unsigned long` or `unsigned int` even between TUs in compiler-rt. This is caused by Linux arm32 headers redefining `__UINTPTR_TYPE__` (see `arch/arm/include/uapi/asm/types.h` in the Linux kernel repo), meaning include order/whether or not the Linux header is included changes compiler-rt symbol mangling. As a workaround, this hardcodes `uptr`/`sptr` in compiler-rt to `unsigned int`/`int` on Linux arm32, matching clang/gcc.

MTE doesn't support MaxReleasedCachePages which may break the assumption that only the first 4 pages will have memory tagged.

With the help of @lhames, this pull request introduces the `dlupdate` function in the ORC runtime. `dlupdate` enables incremental execution of new initializers introduced in the REPL environment. Unlike traditional `dlopen`, which manages initializers, code mapping, and library reference counts, `dlupdate` focuses exclusively on running new initializers.

Even though vmv.v.x has a non constant scalar operand, we can still rematerialize it because we have split register allocation between vectors and scalars. InlineSpiller will check to make sure that the scalar operand is live at the point where the rematerialization occurs, so this won't extend any scalar live ranges. However this also means we may not be able to rematerialize in some cases, as shown in @vmv.v.x_needs_extended. It might be worthwhile teaching InlineSpiller to extend scalar live ranges in a future patch. I experimented with this locally and it reduced spills on 531.deepsjeng_r by a further 3%.

This is a pre-commit test for llvm#107817

MIPSr6 has class.s/class.d instructions. Let's use them for llvm.is.fpclass intrinsic.

This is the same principle as vmv.v.x in llvm#107993, but for floats.

Continuing with llvm#107993 and llvm#108007, this handles the last of the main rematerializable vector instructions. There's an extra spill in one of the test cases, but it's likely noise from the spill weights and isn't an issue in practice.

llvm#108129) … failure Any flang module with a derived type definition implicitly depends on flang/module/__fortran_type_info.f90. Make this dependency explicit so that an unlucky build order doesn't cause a crash.

This allows us to reduce VLs feeding reduction instructions. In particular, this means that <3 x Ty> reduce(load) like sequences no longer require a VL toggle. This was waiting on 3d72957; now that the latent correctness issue is fixed, we can expand this transform.

…UMNUM (llvm#107416) ISD::FCANONICALIZE is enough, which can process NaN or non-NaN correctly, thus getSelectCC is not needed here.

) Reverts llvm#107927 We are supposed to check the MaxAllowedFragmentedPages instead.

While trying to fix one build problem, I made things worse. This should clear things up.

As in the heading.

…lvm#108090) Remove unnecessary load of the `cptr` component when getting the `__address`. `fir.coordinate_of` operation can be chained so the load is not needed.

clangd reports these as unused headers. My manual inspection agrees with the findings.

This adds the basic assembly generation support for the final EH proposal, which was newly adopted in Sep 2023 and advanced into Phase 4 in Jul 2024: https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md This adds support for the generation of new `try_table` and `throw_ref` instructions in .s assembly format. This does NOT yet include - Block annotation comment generation for .s format - .o object file generation - .s assembly parsing - Type checking (AsmTypeCheck) - Disassembler - Fixing unwind mismatches in CFGStackify These will be added as follow-up PRs. --- The format for `TRY_TABLE`, both for `MachineInstr` and `MCInst`, is as follows: ``` TRY_TABLE type number_of_catches catch_clauses* ``` where `catch_clause` is ``` catch_opcode tag+ destination ``` `catch_opcode` should be one of 0/1/2/3, which denotes `CATCH`/`CATCH_REF`/`CATCH_ALL`/`CATCH_ALL_REF` respectively. (See `BinaryFormat/Wasm.h`) `tag` exists when the catch is one of `CATCH` or `CATCH_REF`. The MIR format is printed as just the list of raw operands. The (stack-based) assembly instruction supports pretty-printing, including printing `catch` clauses by name, in InstPrinter. In addition to the new instructions `TRY_TABLE` and `THROW_REF`, this adds four pseudo instructions: `CATCH`, `CATCH_REF`, `CATCH_ALL`, and `CATCH_ALL_REF`. These are pseudo instructions to simulate block return values of `catch`, `catch_ref`, `catch_all`, `catch_all_ref` clauses in `try_table` respectively, given that we don't support block return values except for one case (`fixEndsAtEndOfFunction` in CFGStackify). These will be omitted when we lower the instructions to `MCInst` at the end. LateEHPrepare now will have one more stage to convert `CATCH`/`CATCH_ALL`s to `CATCH_REF`/`CATCH_ALL_REF`s when there is a `RETHROW` to rethrow its exception. The pass also converts `RETHROW`s into `THROW_REF`. 
Note that we still use `RETHROW` as an interim pseudo instruction until we convert them to `THROW_REF` in LateEHPrepare. CFGStackify has a new `placeTryTableMarker` function, which places `try_table`/`end_try_table` markers with a necessary `catch` clause and also `block`/`end_block` markers for the destination of the `catch` clause. In MCInstLower, now we need to support one more case for the multivalue block signature (`catch_ref`'s destination's `(i32, exnref)` return type). InstPrinter has a new routine to print the `catch_list` type, which is used to print `try_table` instructions. The new test, `exception.ll`'s source is the same as `exception-legacy.ll`, with the FileCheck expectations changed. One difference is the commands in this file have `-wasm-enable-exnref` to test the new format, and don't have `-wasm-disable-explicit-locals -wasm-keep-registers`, because the new custom InstPrinter routine to print `catch_list` only works for the stack-based instructions (`_S`), and we can't use `-wasm-keep-registers` for them. As in `exception-legacy.ll`, the FileCheck lines for the new tests do not contain the whole program; they mostly contain only the control flow instructions for readability.
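To make the `TRY_TABLE` operand layout above concrete, here is a minimal decoding sketch. It assumes the operands have been flattened to plain integers, which is a simplification: real operands are symbols and block references. `CatchClause` and `DecodeTryTable` are hypothetical names, not part of the WebAssembly backend.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Catch opcodes as listed above: 0/1/2/3.
enum CatchOpcode : std::int64_t {
  CATCH = 0,
  CATCH_REF = 1,
  CATCH_ALL = 2,
  CATCH_ALL_REF = 3
};

// Hypothetical decoded clause; `tag` is only meaningful when `has_tag` is set.
struct CatchClause {
  CatchOpcode opcode;
  bool has_tag;
  std::int64_t tag;
  std::int64_t destination;
};

// Decode operands laid out as: type, number_of_catches, catch_clauses*.
// Returns false if the operand list is malformed.
bool DecodeTryTable(const std::vector<std::int64_t> &ops,
                    std::vector<CatchClause> &out) {
  if (ops.size() < 2)
    return false;
  std::size_t pos = 1; // index 0 holds the block type
  std::int64_t count = ops[pos++];
  if (count < 0)
    return false;
  for (std::int64_t i = 0; i < count; ++i) {
    if (pos >= ops.size())
      return false;
    std::int64_t raw = ops[pos++];
    if (raw < CATCH || raw > CATCH_ALL_REF)
      return false;
    CatchClause clause{static_cast<CatchOpcode>(raw), false, 0, 0};
    // A tag operand is present only for CATCH and CATCH_REF.
    if (clause.opcode == CATCH || clause.opcode == CATCH_REF) {
      if (pos >= ops.size())
        return false;
      clause.has_tag = true;
      clause.tag = ops[pos++];
    }
    if (pos >= ops.size())
      return false;
    clause.destination = ops[pos++];
    out.push_back(clause);
  }
  return pos == ops.size();
}
```

The variable clause length is the reason a tag symbol operand can sit at any operand position, depending on the number and kinds of preceding catch clauses.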

…lvm#107992) Implicit functions may still have a body. The !hasBody() check is enough.

…8130) (llvm#108134) This reverts commit 76151c4. Also changed to check MaxAllowedFragmentedPages.

…te warnings (llvm#107676) This PR makes WebKit's RefCntblBaseVirtualDtor checker not generate a warning for ThreadSafeRefCounted when the destruction thread is a specific thread. Prior to this PR, we only allowed CRTP classes without a virtual destructor if its deref function had an explicit cast to the derived type, skipping any lambda declarations which aren't invoked. This ends up generating a warning for ThreadSafeRefCounted when a specific thread is used to destruct the object because there is no inline body / definition for ensureOnMainThread and ensureOnMainRunLoop and DerefFuncDeleteExprVisitor concludes that there is no explicit delete of the derived type. This PR relaxes the condition DerefFuncDeleteExprVisitor checks by allowing a delete expression to appear within a lambda declaration if it's an argument to an "opaque" function; i.e. a function without definition / body.

The bug was introduced by llvm#68473 Fixes: llvm#102351

Migrate llvm-debuginfod-find tool to use GenericOptTable. Enable multicall driver.

…uality_comparable (llvm#107815) Fixes llvm#107777

…urn]] (llvm#80455) `[[__noreturn__]]` is now always available, so we can simply use the attribute directly instead of through a macro.

The InitUndef pass currently uses the getLargestSuperClass() hook (which is only used by that pass) to choose the register to initialize. This was done to reduce the number of undef init pseudos needed, e.g. so that the vrnov0 regclass would use the same pseudo as v0. After llvm#106744 we use a single generic pseudo, so this is no longer necessary.

…luator This would've caught the failures in llvm#105865 in the libc++ data-formatter CI.

We currently elide memcpys for readonly nocapture noalias arguments. noalias is checked to make sure that there are no other ways to write the memory, e.g. through a different argument or an escaped pointer. In addition to the current noalias check, also query alias analysis, in case it can prove that modification is not possible through other means. This fixes the problem reported in https://discourse.llvm.org/t/problem-about-memcpy-elimination/81121.

This patch adds speculation behavior for linalg structured ops, allowing them to be hoisted out of loops using LICM.

As described in llvm#98883, we have to qualify a module variable name in the debugger to get its value. This PR tries to remove this limitation. LLVM provides `DIImportedEntity` to handle such cases but the PR is made more complicated due to the following 2 issues. 1. The MLIR attributes are readonly and we have a circular dependency here. This has to be handled using the recursive interface provided by the MLIR. This requires us to first create a placeholder `DISubprogramAttr` which is used in creating `DIImportedEntityAttr`. Later another `DISubprogramAttr` is created which replaces the placeholder. 2. The flang IR does not provide any information about the 'used' module so this has to be extracted by doing a pass over the `DeclareOp` in the function. This has certain limitations, as 'only' and module variable renaming may not be handled properly. Due to the change in `DISubprogramAttr`, some tests also needed to be adjusted. Fixes llvm#98883.

…ero tests. NFC

….scope.decl` (llvm#108144) Since `llvm.experimental.noalias.scope.decl` is marked as `memory(inaccessiblemem: readwrite)`, we cannot treat this annotation intrinsic as having no side effects. It will block loop deletion when this intrinsic exists inside a dead loop: https://github.com/llvm/llvm-project/blob/3dad29b677e427bf69c035605a16efd065576829/llvm/lib/Transforms/Scalar/LoopDeletion.cpp#L103-L110 This patch marks `llvm.experimental.noalias.scope.decl` as droppable to address the issue. Fixes llvm#108052.

…m#103879) I've tried to avoid giving too much detailed explanation as the psABI docs are the better source for this.

…108045) This patch registers the tensor dialect as a dependent dialect of ConvertVectorToLLVM. This fixes a crash when `vector.transfer_write` is used with a dynamic tensor type. The MaterializeTransferMask pattern would call `vector::createOrFoldDimOp`, which creates a `tensor.dim` operation. Fixes llvm#107805.

…actOpConversion (llvm#107549) This patch adds support for converting `vector.extract` that extract 1-element vectors into LLVM, fixing a crash in such cases. E.g., `vector.extract %1[0]: vector<1xf32> from vector<2xf32>`. Fix llvm#61372.

…vm#107847) A static analyzer identified that this operator was unsafe in the case of self-assignment. In the placement new statement, StringValue's copy constructor was being implicitly called, which received a reference to "itself". In fact, it was being passed an old StringValue at the same address - one whose lifetime had already ended. The copy constructor was thus copying fields from a dead object. We need to be careful when switching active union members, and calling the destructor on the old StringValue will avoid memory leaks which I believe the old code exhibited.
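The hazard and one way to avoid it can be sketched as follows. This is an illustration under assumptions, not the patched code: `Value` is a hypothetical tagged union, and the self-assignment guard plus explicit destructor call is one possible safe shape.

```cpp
#include <cassert>
#include <new>
#include <string>
#include <utility>

// Hypothetical tagged union holding either an int or a std::string.
class Value {
public:
  explicit Value(int i) : kind_(Kind::Int), int_(i) {}
  explicit Value(std::string s) : kind_(Kind::String) {
    new (&str_) String(std::move(s));
  }
  Value(const Value &other) : kind_(other.kind_) { copyFrom(other); }

  Value &operator=(const Value &other) {
    // Without this guard, destroying the old member and then copy-constructing
    // from `other` would read from an object whose lifetime has already ended.
    if (this == &other)
      return *this;
    destroy(); // end the old member's lifetime so its storage is not leaked
    kind_ = other.kind_;
    copyFrom(other);
    return *this;
  }

  ~Value() { destroy(); }

  bool isString() const { return kind_ == Kind::String; }
  const std::string &str() const { return str_; }
  int asInt() const { return int_; }

private:
  using String = std::string;
  enum class Kind { Int, String };

  // Construct the active member from `other`; kind_ must already be set.
  void copyFrom(const Value &other) {
    if (kind_ == Kind::String)
      new (&str_) String(other.str_);
    else
      int_ = other.int_;
  }

  // Run the destructor of the active member if it is non-trivial.
  void destroy() {
    if (kind_ == Kind::String)
      str_.~String();
  }

  Kind kind_;
  union {
    int int_;
    String str_;
  };
};
```

The key ordering is: check for self-assignment, destroy the currently active member, then placement-new the new one.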

…s. NFC.

)

…llvm#108155) This is the root-cause for the LLDB failures that started occurring after llvm#105865. The DWARFASTParserClang has logic to try derive unnamed bitfields from DWARF offsets. In this case we treat `padding` as a 1-byte size field that would overlap with `flag`, and decide we need to introduce an unnamed bitfield into the AST, which is incorrect.

This patch adds the "gen-openmp-clause-ops" `mlir-tblgen` generator to produce the structure definitions previously in OpenMPClauseOperands.h automatically from the information contained in OpenMPOps.td and OpenMPClauses.td. The original header is maintained to enable the definition of similar structures that are not directly related to any single `OpenMP_Clause` or `OpenMP_Op` tablegen definition.

llvm#107213) This pull request enhances the GSL lifetime analysis to detect situations where a dangling `Container<GSLPointer>` object is constructed: ```cpp std::vector<std::string_view> bad = {std::string()}; // dangling ``` The assignment case is not yet supported, but it will be addressed in a follow-up. Fixes llvm#100526 (excluding the `push_back` case).

…107640) Support IntegerSet attribute python binding.

Ensures that offsets for instance variables are marked with `dllimport` if the interface to which they belong has this attribute.

…ent deduction (NFC) /llvm-project/mlir/tools/mlir-tblgen/OmpOpGen.cpp:202:3: error: 'StringSet' may not intend to support class template argument deduction [-Werror,-Wctad-maybe-unsupported] llvm::StringSet superClasses; ^ /llvm-project/llvm/include/llvm/ADT/StringSet.h:23:7: note: add a deduction guide to suppress this warning class StringSet : public StringMap<std::nullopt_t, AllocatorTy> { ^

/llvm-project/mlir/tools/mlir-tblgen/OmpOpGen.cpp:239:8: error: unused variable 'isAttr' [-Werror,-Wunused-variable] bool isAttr = superClasses.contains("Attr"); ^

…Code creates G_PTR_ADD to convey the semantics (llvm#107880) When running SPIR-V Backend with optimization levels higher than 0, we observe GEP operators as a new factor, massively used to convey the semantics of the original LLVM IR. Previously, an issue related to GEP Operator was mentioned and fixed on the consumer side of toolchains (see, for example, Khronos Translator Issue KhronosGroup/SPIRV-LLVM-Translator#2486 and PR KhronosGroup/SPIRV-LLVM-Translator#2487). However, there is a case when GenCode creates G_PTR_ADD to convey the original semantics under optimization levels higher than 0 where it's SPIR-V Backend that fails to translate source LLVM IR correctly. Consider the following reproducer: ``` %struct = type { i32, [257 x i8], [257 x i8], [129 x i8], i32, i64, i64, i64, i64, i64, i64 } @mem = linkonce_odr dso_local addrspace(1) global %struct zeroinitializer, align 8 define weak dso_local spir_func void @__devicelib_assert_fail(ptr addrspace(4) noundef %expr, i32 noundef %line, i1 %fl) { entry: %cmp = icmp eq i32 %line, 0 br i1 %cmp, label %lbl, label %exit lbl: store i32 %line, ptr addrspace(1) getelementptr inbounds (i8, ptr addrspace(1) @mem, i64 648), align 8 br i1 %fl, label %lbl, label %exit exit: ret void } ``` converted to the following machine instructions by SPIR-V Backend: ``` %4:type(s64) = OpTypeInt 32, 0 %22:type(s64) = OpTypePointer 5, %4:type(s64) %2:type(s64) = OpTypeInt 8, 0 %28:type(s64) = OpTypePointer 5, %2:type(s64) %10:pid(p1) = G_GLOBAL_VALUE @mem %36:type(s64) = OpTypeStruct %4:type(s64), %32:type(s64), %32:type(s64), %34:type(s64), %4:type(s64), %35:type(s64), %35:type(s64), %35:type(s64), %35:type(s64), %35:type(s64), %35:type(s64) %37:iid(s32) = G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.spv.const.composite) %8:iid(s32) = ASSIGN_TYPE %37:iid(s32), %36:type(s64) G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.spv.init.global), %10:pid(p1), %8:iid(s32) %29:pid(p1) = nuw G_PTR_ADD %10:pid, %16:iid(s64) %15:pid(p1) = nuw 
ASSIGN_TYPE %29:pid(p1), %28:type(s64) %27:pid(p2) = G_BITCAST %15:pid(p1) %17:pid(p2) = ASSIGN_TYPE %27:pid(p2), %22:type(s64) G_STORE %1:iid(s32), %17:pid(p2) :: (store (s32) into %ir.3, align 8, addrspace 1) ``` On the next stage of instruction selection this `G_PTR_ADD`-related pattern would be interpreted as an initialization of a global variable and converted to an invalid constant GEP pattern that, in its turn, would fail to be verified by LLVM during back translation from SPIR-V to LLVM IR. This PR introduces a fix for the problem by adding one more case of `G_PTR_ADD` translation, when we use a non-const GEP to convey the meaning. The reproducer is attached as a new test case.

combineSubABS already handles the "(sub Y, cmovns X, -X) -> (add Y, cmovns -X, X)" fold by flipping the cmov operands. We can do something similar for the negation of ABDS/U patterns which have been expanded to a CMOVL/CMOVB with a pair of commuted subtractions: "NEG(ABD(X,Y)) -> NEG(CMOV(SUB(X,Y),SUB(Y,X))) -> CMOV(SUB(Y,X),SUB(X,Y))"
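The operand flip can be checked on a small model. The sketch below is not the X86 DAG combine itself; it models 32-bit wraparound subtraction and a select in Python, and the helper names (`sub32`, `cmov`, `neg_abds_naive`, `neg_abds_folded`) as well as the choice of a signed `x < y` condition are assumptions made for illustration.

```python
MASK = 0xFFFFFFFF  # model 32-bit registers


def to_signed(v):
    """Interpret a 32-bit pattern as a signed integer."""
    v &= MASK
    return v - (1 << 32) if v & 0x80000000 else v


def sub32(a, b):
    """Wrapping 32-bit subtraction."""
    return (a - b) & MASK


def cmov(cond, if_true, if_false):
    """Stand-in for a conditional move: pick one of two computed values."""
    return if_true if cond else if_false


def neg_abds_naive(x, y):
    """NEG(ABDS(X, Y)) with ABDS expanded to a cmov over commuted subs."""
    lt = to_signed(x) < to_signed(y)
    abd = cmov(lt, sub32(y, x), sub32(x, y))
    return (-abd) & MASK


def neg_abds_folded(x, y):
    """Same value with the cmov operands swapped and the negation dropped."""
    lt = to_signed(x) < to_signed(y)
    return cmov(lt, sub32(x, y), sub32(y, x))
```

Both functions agree because negating `y - x` modulo 2^32 gives `x - y`, so swapping the two subtractions makes the explicit negation unnecessary.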

…uto-complete (llvm#107956) As per llvm#106672 and llvm#107377, the documentation should be updated to note the current bug on Windows in which ``LineEditor`` causes Tab-key-related features not to work. Fixes llvm#107377

When emitting references to functions as part of `ValueAsMetadata`, we currently emit the incorrect (typed) pointer, resulting in crashes during deserialization. Avoid this by correctly mapping the type during serialization.

These thunks can be accessed using `__impchk_*` symbols, though they are typically not called directly. Instead, they are used to populate the auxiliary IAT. When the imported function is x86_64 (or an ARM64EC function with a patched export thunk), the thunk is used to call it. Otherwise, the OS may replace the thunk at runtime with a direct pointer to the ARM64EC function to avoid the overhead.

Try to detect whether the git remote URL contains a password or a GitHub token, and return an error that teaches the user how to avoid leaking their password or token.
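A minimal sketch of such a check follows, assuming the remote is an HTTP(S) URL with credentials embedded in the userinfo part. The function name `remote_url_leaks_secret` is hypothetical and this is not the script from the commit; the prefix list reflects GitHub's documented token prefixes.

```python
from urllib.parse import urlsplit

# Token prefixes used by GitHub (classic and fine-grained tokens).
TOKEN_PREFIXES = ("ghp_", "gho_", "ghu_", "ghs_", "ghr_", "github_pat_")


def remote_url_leaks_secret(url):
    """Return True if an http(s) remote URL embeds a password or token."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        # SSH-style remotes (git@host:path, ssh://...) carry no password.
        return False
    if parts.password:
        # https://user:secret@host/... has an explicit password field.
        return True
    # A token may also be supplied alone in the username position.
    user = parts.username or ""
    return user.startswith(TOKEN_PREFIXES)
```

A caller would run this on the output of `git remote get-url <name>` and, on a positive result, stop with a message pointing the user to a credential helper or an SSH remote.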

Migrate Opt/OptRST Emitters to const RecordKeeper. This is part of an effort to have better const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089

…re types (llvm#105505) Refactors the tblgen-to-irdl script slightly and adds support for:

- Various integer types
- Various float types
- Confined types
- Complex types (with fixed element type)

Also doesn't add the operand and result ops if they are empty. I could potentially split this into smaller PRs if that'd be helpful (refactor + integer/float/complex, confined type, optional operand/result). @math-fehr

…es. (llvm#107599) Some switch statements require all SVE builtin types to be manually specified. This patch refactors the SVE_*_TYPE macros so that such code can be generated during preprocessing. I've tried to establish a minimal interface that covers all types where no special information is required and then created a set of macros that are dedicated to specific datatypes (i.e. int, float). This patch is groundwork to simplify the changing of SVE tuple types to become struct based as well as work to support the FP8 ACLE.

This fix addresses a problem with the cxx_compiler and cxx_linker macros on Windows. Compiler detection misfired for paths containing "icc": in such a case, Makefile.rules assumed it had been given the icc compiler. To solve that, utilities detection has been rewritten in Python. The last element of the compiler's path is separated, taking into account the platform path delimiter, and the compiler type is extracted, allowing for a possible cross-toolchain prefix. --------- Co-authored-by: Pavel Labath <[email protected]>
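The detection described above can be sketched roughly as follows. This is not the code from the commit: the function name `compiler_type`, the list of recognised compilers, and the handling of a trailing version suffix are assumptions for illustration.

```python
import ntpath
import posixpath
import re

# Optional cross-toolchain prefix ("arm-linux-gnueabihf-"), a known compiler
# name, and an optional version suffix ("-15"). The name list is hypothetical.
_COMPILER_RE = re.compile(r"(?:.*-)?(clang|gcc|icc)(?:-[\d.]+)?")


def compiler_type(path, windows=False):
    """Classify a compiler by the last path element only."""
    pathmod = ntpath if windows else posixpath
    name = pathmod.basename(path)  # honours the platform path delimiter
    if windows and name.lower().endswith(".exe"):
        name = name[:-4]
    match = _COMPILER_RE.fullmatch(name)
    return match.group(1) if match else None
```

Because only the basename is inspected, a directory such as `icc_build` earlier in the path no longer influences the result.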

trunc (binop X, C) --> binop (trunc X, trunc C) --> binop (trunc X, C`) Try to narrow the width of math or bitwise logic instructions by pulling a truncate ahead of binary operators. Vx and Nx cores consider 32-bit and 64-bit basic arithmetic equal in cost.
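The rewrite is sound for operators whose low result bits depend only on the low bits of the operands. A small check under that assumption, with hypothetical helper names (`trunc`, `wide_then_trunc`, `trunc_then_narrow`) and a 64-bit to 32-bit narrowing chosen for illustration:

```python
import operator


def trunc(value, bits):
    """Keep only the low `bits` bits, as an integer truncate would."""
    return value & ((1 << bits) - 1)


# Operators for which truncation commutes with the operation.
NARROWABLE = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "and": operator.and_,
    "or": operator.or_,
    "xor": operator.xor,
}


def wide_then_trunc(op, x, c):
    """trunc (binop X, C): compute at 64 bits, then truncate to 32."""
    return trunc(trunc(NARROWABLE[op](x, c), 64), 32)


def trunc_then_narrow(op, x, c):
    """binop (trunc X, trunc C): truncate first, compute at 32 bits."""
    return trunc(NARROWABLE[op](trunc(x, 32), trunc(c, 32)), 32)
```

Operators such as right shifts or division are deliberately left out of the table, since their low result bits can depend on the high input bits.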

…08043) Handle procedure pointer with the same name as generics in lowering to avoid crashes after llvm#107928.

…m#108171)

Handle the case of same pointer used as both inputs to the `CompareOptionRecords`, to avoid emitting errors for equivalent options. Follow-up to llvm#107696.

…lvm#108207) We will deref<>() it later, so this is the right check.

…lvm#53045 Add tests for "sub(select(icmp(a,b),a,b),select(icmp(a,b),b,a)) -> abd(a,b)" patterns that still fail to match to abd nodes. This will hopefully be helped by llvm#108218.

Vector cases are broken, so leave those for later.

Check cost of all instructions in an interleave group, to prepare for follow-up changes.

…uilding (llvm#108076)

* Split buildCoroutineFrame into code related to normalization and code related to actually building the coroutine frame.
* This will enable future specialization of buildCoroutineFrame for different ABIs while the normalization can be done by splitCoroutine prior to calling buildCoroutineFrame.

See RFC for more info: https://discourse.llvm.org/t/rfc-abi-objects-for-coroutines/81057

This is a three deep expression which is deeper than we've otherwise gone for multiple expansions, but I think it's reasonable to do so. This covers mul by 50, 100, and 200 which are reasonably common naturally arising numbers.
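One way to read "three deep" is two shift-and-add steps followed by a shift, since 50, 100 and 200 are 25 times 2, 4 and 8. The sketch below assumes that decomposition; the commit text does not spell out the instruction sequence, and `shift_add` and `mul_25_shl` are hypothetical names.

```python
def shift_add(n, a, b):
    """Model of a shift-and-add step: (a << n) + b."""
    return (a << n) + b


def mul_25_shl(x, shift):
    """Multiply by 25 * 2**shift using three dependent operations."""
    times5 = shift_add(2, x, x)             # 4x + x = 5x
    times25 = shift_add(2, times5, times5)  # 20x + 5x = 25x
    return times25 << shift                 # scale by the power of two


def mul50(x):
    return mul_25_shl(x, 1)


def mul100(x):
    return mul_25_shl(x, 2)


def mul200(x):
    return mul_25_shl(x, 3)
```

Each result depends on the previous one, which is why the chain is three operations deep rather than three independent operations.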

…lvm#107785)" This reverts commit 15106c2. Commit does not pass check-flang on x86 host.

This is a large patch that includes MC-level support for V_CVT_F16_F32, V_CVT_F32_F16 and V_LDEXP_F16 in true16 format. It includes the asm/disasm changes to encode/decode the 16-bit vsrc, vdst and src modifiers for the vop and dpp formats. This patch is a dependency for many 16-bit instructions, but only three instructions are updated to make it easier to review. There will be another patch to support these three instructions at the codegen level; this patch just replaces these two instructions with their fake16 format.

This return is dead code as the return just above will always be taken.

There was a mistake in a comment regarding the deprecation of dyn_cast_or_null. It suggested using cast_if_present instead of dyn_cast_or_null, but that was probably a copy-paste mistake: dyn_cast_if_present is the function that should be used instead of dyn_cast_or_null. Authored-by: Ofri Frishman <[email protected]>

…#108215) Right now `describe()`ing a `FunctionDecl` dumps the whole code of the function. Dump only its name.

…108020) The check for IV increments in collectUsersInEntryBlock currently triggers for exit-block PHIs which use the IV start value, resulting in us failing to add the input value for the middle block to these PHIs. Fix this by amending the check for IV increments to only include incoming values that are instructions inside the loop. Fixes llvm#108004

…7921) Change CodeGenInstruction::{TheDef, InferredFrom} to const pointers. This is part of an effort to have better const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089

Print a warning when the debugger detects a mismatch between the MD5 checksum in the DWARF 5 line table and the file on disk. The warning is printed only once per file.
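The warn-once behaviour can be sketched as below. This is not the debugger's implementation: the function name `check_source_checksum`, the hex-string form of the expected checksum, and the module-level set used to remember warned files are all assumptions for illustration.

```python
import hashlib

# Paths that have already produced a warning (hypothetical bookkeeping).
_warned_paths = set()


def check_source_checksum(path, expected_md5_hex, warn=print):
    """Compare a file's MD5 against the expected one; warn once per file."""
    with open(path, "rb") as source:
        actual = hashlib.md5(source.read()).hexdigest()
    if actual == expected_md5_hex.lower():
        return True
    if path not in _warned_paths:
        # Remember the path so later lookups of the same file stay quiet.
        _warned_paths.add(path)
        warn(f"warning: {path}: source file checksum mismatch "
             f"(line table {expected_md5_hex}, on disk {actual})")
    return False
```

Keying the set on the file path keeps repeated lookups of the same file (for example on every stop in that file) from flooding the output.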

…lvm#108013) Change SubtargetFeatureInfo to use const Record pointers. This is part of an effort to have better const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089

…08027) Change CodeGenRegister to use const Record pointer. This is part of an effort to have better const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089

…vm#108193) Change ASTTableGen to use const Record pointers. This is part of an effort to have better const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089

…lvm#108195) Change Builtins emitter to use const RecordKeeper. This is part of an effort to have better const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089

This is a preparation for upcoming changes to Dense[Map|Set] regarding hardening against OOM scenarios (see [this RFC](https://discourse.llvm.org/t/rfc-malfunction-safe-densemap-denseset/81036/7)). We have changed a lot of code inside Dense[Map|Set] and this preparation change helps to isolate the relevant parts from pure formatting stuff.

This makes it slightly easier to see what's different between the two.

…lvm#107889) Always generate v_cndmask_b32 instead of modifying exec around v_mov_b32. This is expected to be faster because modifying exec generally causes pipeline stalls.


Commits on Sep 9, 2024

Commits on Sep 10, 2024

Commits on Sep 11, 2024

Commits on Sep 26, 2024
