[AutoBump] Merge with e55d6f5e (Sep 11) (22) #376
base: bump_to_37263b6c
Commits on Sep 9, 2024
Commit b3d2d50
[TableGen] Migrate CodeGenHWModes to use const RecordKeeper (llvm#107851)
Migrate CodeGenHWModes to use const RecordKeeper and const Record pointers. This is part of an effort to improve const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
Commit 985600d
[DirectX] Lower `@llvm.dx.typedBufferLoad` to DXIL ops
The `@llvm.dx.typedBufferLoad` intrinsic is lowered to `@dx.op.bufferLoad`. There's some complexity here in translating to scalarized IR, which I've abstracted out into a function that should be useful for samples, gathers, and CBuffer loads.
I've also updated the DXILResources.rst docs to match what I'm doing here and the proposal in llvm/wg-hlsl#59. I've removed the content about stores and raw buffers for now with the expectation that it will be added along with the work.
Note that this change includes a bit of a hack in how it deals with `getOverloadKind` for the `dx.ResRet` types - we need to adjust how we deal with operation overloads to generate a table directly rather than proxy through the OverloadKind enum, but that's left for a later change here.
Part of llvm#91367
Pull Request: llvm#104252
Commit 3f22756
[VPlan] Consistently use VTC for vector trip count in vplan-printing.ll.
The inconsistency surfaced in llvm#95305. Split off to reduce the diff.
Commit 3403438
Reland [asan][windows] Eliminate the static asan runtime on windows (llvm#107899)
This reapplies 8fa66c6 ([asan][windows] Eliminate the static asan runtime on windows) for a second time. That PR bounced off the tests because it caused failures in the other sanitizer runtimes. These have been fixed by only building interception, sanitizer_common, and asan with /MD, and continuing to build the rest of the runtimes with /MT. This does mean that any usage of the static ubsan/fuzzer/etc runtimes will mix different runtime library linkages in the same app. The interception, sanitizer_common, and asan runtimes are designed for this; however, it does result in some linker warnings.
Additionally, it turns out that when building in release mode with LLVM_ENABLE_PDBs the build system forced /OPT:ICF. This totally breaks asan's "new" method of doing "weak" functions on windows, and so /OPT:NOICF was explicitly added to asan's link flags.
---------
Co-authored-by: Amy Wishnousky
Commit 53a81d4
[SandboxIR] Add missing VectorType functions (llvm#107650)
Fills in many missing functions from VectorType
Commit 6f8d278
[scudo] Add fragmentation info for each memory group (llvm#107475)
This information helps with tuning the heuristic of selecting memory groups to release the unused pages.
Commit d9a9960
[LTO] Fix a use-after-free in legacy LTO C APIs (llvm#107896)
Fix a bug that `lto_runtime_lib_symbols_list` is returning the address of a local variable that will be freed when getting out of scope. This is a regression from llvm#98512 that rewrites the runtime libcall function lists into a SmallVector. rdar://135559037
Commit 66e9078
[SPIRV] Add sign intrinsic part 1 (llvm#101987)
Partially fixes llvm#70078
### Changes
- Added `int_spv_sign` intrinsic in `IntrinsicsSPIRV.td`
- Added lowering and map to `int_spv_sign` in `SPIRVInstructionSelector.cpp`
- Added SPIR-V backend test case in `llvm/test/CodeGen/SPIRV/hlsl-intrinsics/sign.ll`
### Related PRs
- llvm#101988
- llvm#101989
Commit a9a5a18
[TableGen] Change CGIOperandList::OperandInfo::Rec to const pointer (llvm#107858)
Change CGIOperandList::OperandInfo::Rec and CGIOperandList::TheDef to const pointers. This is part of an effort to improve const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
Commit bdf0224
[SandboxVec] Implement Pass class (llvm#107617)
This patch implements the Pass base class and the FunctionPass sub-class that operate on Sandbox IR.
Commit f12e10b
[NVPTX] Restrict combining to properly aligned v16i8 vectors. (llvm#107919)
Fixes generation of invalid loads leading to misaligned access errors. The bug got exposed by SLP vectorizer change ec360d6 which allowed SLP to produce `v16i8` vectors. Also updated the tests to use automatic check generator.
Commit 26b786a
Commit d148a1a
[X86] Handle shifts + and in `LowerSELECTWithCmpZero`
Shifts are the same as sub, where rhs == 0 is the identity. `and` is the inverted case, with -1 as the identity:
`SELECT (AND(X,1) == 0), (AND Y, Z), Y` -> `(AND Y, (OR NEG(AND(X, 1)), Z))`
Closes llvm#107910
Commit 88bd507
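The `and` fold quoted in the X86 commit above can be checked numerically. The following is a minimal sketch that models 8-bit values (the width and the function names are assumptions for illustration, not taken from the patch) and compares the select form against the folded form.

```python
MASK = 0xFF  # model 8-bit registers; the width is an assumption for illustration

def select_form(x, y, z):
    # SELECT (AND(X,1) == 0), (AND Y, Z), Y
    return (y & z) if (x & 1) == 0 else y

def folded_form(x, y, z):
    # AND Y, (OR NEG(AND(X,1)), Z)
    # NEG(0) is 0, so the OR yields Z; NEG(1) is all ones (-1), the AND identity.
    neg = (-(x & 1)) & MASK
    return y & (neg | z) & MASK

# Compare both forms over a sample of 8-bit inputs.
all_equal = all(
    select_form(x, y, z) == folded_form(x, y, z)
    for x in range(4)
    for y in range(0, 256, 3)
    for z in range(0, 256, 3)
)
```

When the low bit of `X` is set, the negation produces the all-ones mask, so the `AND` leaves `Y` unchanged; that is why -1 acts as the identity here.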
[PAC] Make __is_function_overridden pauth-aware on ELF platforms (llvm#107498)
Apparently, there are two almost identical implementations: one for MachO and another one for ELF. The ELF bits somehow slipped while llvm#84573 was reviewed. The particular implementation is identical to MachO case.
Commit 33c1325
[SandboxIR] Implement UndefValue (llvm#107628)
This patch implements sandboxir::UndefValue mirroring llvm::UndefValue.
Commit ae02211
Commits on Sep 10, 2024
Commit 81ef8e2
[NVPTX] Support copysign PTX instruction (llvm#107800)
Lower `fcopysign` SDNodes into `copysign` PTX instructions where possible. See [PTX ISA: 9.7.3.2. Floating Point Instructions: copysign](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#floating-point-instructions-copysign).
Commit b0d2411
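For readers unfamiliar with the operation being lowered, here is a small sketch of copysign on 32-bit floats at the bit level. It only illustrates the general semantics (magnitude from one operand, sign bit from the other); the helper name is hypothetical, and the operand order of the PTX instruction itself should be taken from the PTX ISA document, not from this sketch.

```python
import struct

def copysign_f32(magnitude, sign_source):
    """Return `magnitude` with the sign bit of `sign_source` (both as binary32)."""
    m = struct.unpack("<I", struct.pack("<f", magnitude))[0]
    s = struct.unpack("<I", struct.pack("<f", sign_source))[0]
    # Keep the exponent and mantissa of the magnitude, take only the sign bit.
    bits = (m & 0x7FFFFFFF) | (s & 0x80000000)
    return struct.unpack("<f", struct.pack("<I", bits))[0]

neg_three = copysign_f32(3.0, -0.0)   # negative zero still carries a sign bit
pos_value = copysign_f32(-2.5, 1.0)
```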
[ctx_prof] Insert the ctx prof flattener after the module inliner (llvm#107499)
This patch enables experimenting with the contextual profile. ICP is currently disabled in this case - will reenable it subsequently. Also subsequently the inline cost model / decision making would be updated to be context-aware. Right now, this just achieves "complete use" of the profile, in that it's ingested, maintained, and sunk to a flat profile when not needed anymore.
Issue llvm#89287
Commit 3b22618
[mlir][linalg][NFC] Drop redundant rankReductionStrategy (llvm#107875)
This patch drops the redundant rankReductionStrategy in `populateFoldUnitExtentDimsViaSlicesPatterns` and fixes comment typos.
Commit f3b4e47
[LoongArch][ISel] Check the number of sign bits in `PatGprGpr_32` (llvm#107432)
After llvm#92205, LoongArch ISel selects `div.w` for `trunc i64 (sdiv i64 3202030857, (sext i32 X to i64)) to i32`. It is incorrect since `3202030857` is not a signed 32-bit constant. It produces a wrong result when `X == 2`: https://alive2.llvm.org/ce/z/pzfGZZ
This patch adds additional `sexti32` checks to operands of `PatGprGpr_32`. Alive2 proof: https://alive2.llvm.org/ce/z/AkH5Mp
Fix llvm#107414.
Commit a111f91
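The miscompile described in the LoongArch commit above can be reproduced with plain integer arithmetic. This sketch assumes that `div.w` can be modelled as a signed divide of the low 32 bits of each operand; that model and the helper names are illustrative assumptions, not the ISA definition.

```python
def sdiv(a, b):
    """Signed division truncating toward zero, as LLVM's sdiv does."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q

def to_i32(v):
    """Interpret the low 32 bits of v as a signed 32-bit integer."""
    v &= 0xFFFFFFFF
    return v - (1 << 32) if v & 0x80000000 else v

C = 3202030857  # constant from the commit message
X = 2

fits_in_i32 = -(1 << 31) <= C < (1 << 31)   # the property `sexti32` guards
expected = to_i32(sdiv(C, X))               # trunc i64 (sdiv i64 C, sext X) to i32
wrong = sdiv(to_i32(C), to_i32(X))          # 32-bit divide on the low halves
```

Because the constant does not fit in a signed 32-bit integer, its low 32 bits read as a negative number, and the 32-bit divide yields a different quotient from the 64-bit one.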
[NFC][TableGen] Simplify DirectiveEmitter using range for loops (llvm#107909)
Make constructors that take const Record * implicit, allowing us to simplify some range based loops to use that class instance as the loop variable. Change remaining constructor calls to use () instead of {} to construct objects.
Commit f7479b5
Commit e64a1c0
[LoongArch] Codegen for concat_vectors with LASX
Fixes: llvm#107355 Reviewed By: SixWeining Pull Request: llvm#107523
Commit 1ca411c
[bazel][libc][NFC] Add missing layering deps (llvm#107947)
After 2773719 e.g.
```
external/llvm-project/libc/test/src/math/smoke/NextTowardTest.h:12:10: error: module llvm-project//libc/test/src/math/smoke:nexttowardf_test does not depend on a module exporting 'src/__support/CPP/bit.h'
```
Commit 7a8e9df
[LLVM][Coroutines] Switch CoroAnnotationElidePass to a FunctionPass (llvm#107897)
After landing llvm#99285 we found that the call graph update was causing the following crash when expensive checks are turned on
```
llvm-project/llvm/lib/Analysis/CGSCCPassManager.cpp:982: LazyCallGraph::SCC &updateCGAndAnalysisManagerForPass(LazyCallGraph &, LazyCallGraph::SCC &, LazyCallGraph::Node &, CGSCCAnalysisManager &, CGSCCUpdateResult &, FunctionAnalysisManager &, bool): Assertion `(RC == &TargetRC || RC->isAncestorOf(TargetRC)) && "New call edge is not trivial!"' failed.
```
I have to admit I believe that the call graph update process I did for that patch could be wrong.
After reading the code in `CGSCCToFunctionPassAdaptor`, I am convinced that `CoroAnnotationElidePass` can be a FunctionPass and rely on the adaptor to update the call graph for us, so long as we properly invalidate the caller's analyses.
After this patch, `llvm/test/Transforms/Coroutines/coro-transform-must-elide.ll` no longer fails under expensive checks.
Commit 761bf33
[Fuzzer] Passthrough zlib CMake paths into the test (llvm#107926)
We shouldn't assume that we're using system zlib installation.
Commit eb0e4b1
[ValueTracking] Infer is-power-of-2 from assumptions. (llvm#107745)
This patch tries to infer is-power-of-2 from assumptions. I don't see that this kind of assumption exists in my dataset. Related issue: rust-lang/rust#129795 Close llvm#58996.
Commit ffcff4a
[clang] fix half && bfloat16 convert node expr codegen (llvm#89051)
Data type conversion between fp16 and bf16 will generate fptrunc and fpextend nodes, but they are actually bitcast nodes.
Commit 56905da
[clang][HLSL] Add sign intrinsic part 3 (llvm#101989)
Partially fixes llvm#70078
### Changes
- Implemented `sign` clang builtin
- Linked `sign` clang builtin with `hlsl_intrinsics.h`
- Added sema checks for `sign` to `CheckHLSLBuiltinFunctionCall` in `SemaChecking.cpp`
- Add codegen for `sign` to `EmitHLSLBuiltinExpr` in `CGBuiltin.cpp`
- Add codegen tests to `clang/test/CodeGenHLSL/builtins/sign.hlsl`
- Add sema tests to `clang/test/SemaHLSL/BuiltIns/sign-errors.hlsl`
### Related PRs
- llvm#101987
- llvm#101988
### Discussion
- Should there be a `usign` intrinsic that handles the unsigned cases?
Commit dce5039
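As a reference for what the `sign` builtin in the commit above is expected to compute, here is a minimal sketch of the usual sign semantics (-1, 0, or 1). The function name and the unsigned remark are illustrative assumptions; the patch itself defines the actual behaviour per type.

```python
def sign(x):
    """Return -1 for negative input, 0 for zero, and 1 for positive input."""
    return (x > 0) - (x < 0)

results = [sign(v) for v in (-3.5, 0, 7)]

# For an unsigned input the negative branch can never fire, which is the
# background for the `usign` question raised in the discussion item.
unsigned_results = {sign(v) for v in range(0, 4)}
```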
Commit 02ab435
[ORC] Remove EDU from dependants list of dependencies before destroying.
Dependant lists hold raw pointers back to EDUs that depend on them. We need to remove these entries before destroying the EDU or we'll be left with a dangling reference that can result in use-after-free bugs. No testcase: This has only been observed in multi-threaded setups that reproduce the issue inconsistently. rdar://135403614
Commit 7034ec4
Commit 094e6b8
[LLDB][Minidump] Support minidumps where there are multiple exception streams (llvm#97470)
Currently, LLDB assumes all minidumps will have unique sections. This is intuitive because almost all of the minidump sections are themselves lists. Exceptions including Signals are unique in that they are all individual sections with their own directory.
This means LLDB fails to load minidumps with multiple exceptions due to them not being unique. This behavior is erroneous and this PR introduces support for an arbitrary number of exception streams. Additionally, stop info was calculated only for a single thread before, and now we properly support mapping exceptions to threads.
~~This PR is starting in DRAFT because implementing testing is still required.~~
Commit 4926835
[clang][bytecode] Fix local destructor order (llvm#107951)
Add appropriate scopes and use reverse-order iteration in LocalScope::emitDestructors().
Commit 3928ede
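The destructor-order fix above relies on the C++ rule that locals in a scope are destroyed in reverse order of construction. The sketch below is only an analogy in Python, using `contextlib.ExitStack` (which also unwinds last-in, first-out); it is not the interpreter's `LocalScope::emitDestructors()` implementation.

```python
from contextlib import ExitStack

destroyed = []

with ExitStack() as scope:
    # "Construct" three locals in order a, b, c and register their cleanup.
    for name in ["a", "b", "c"]:
        scope.callback(destroyed.append, name)

# On scope exit the callbacks run in reverse registration order.
```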
[ORC-RT] Replace FnTag arg of WrapperFunction::call with generic dispatch arg.
This decouples function argument serialization / deserialization from the function call dispatch mechanism. This will eventually allow us to replace the existing __orc_rt_jit_dispatch function with a system that supports pre-linking parts of the ORC runtime into the executor.
Commit 462251b
Commit 9b67c99
[RISCV] Constrain passthru regclass in vmerge -> vmv peephole
In llvm#107827 we now set true's passthru to the false operand if it was undef. We need to remember to also constrain the regclass in case true is a masked pseudo which needs its passthrus to be in VR[M*]NoV0
Commit b71d88c
Revert "[RISCV] Update V0Defs after moving Src in peepholes (llvm#107359)"
This fixes llvm#107950 and adds a test case for it. The issue was due to us incorrectly assuming that we stored a V0Defs entry for every single instruction.
We actually only store them for instructions that use V0, so when we updated the V0Def after moving we sometimes ended up copying nullptr over from an instruction that doesn't use V0 and clearing the V0Def entry inadvertently.
Because we don't have V0Defs on instructions that don't use V0, the FIXME was never actually needed in the first place since the bookkeeping wasn't out of sync to begin with.
That commit also mentioned that a future unmasked to masked pseudo peephole might need unmasked pseudos to have V0Defs entries, but after working on this locally it turns out we don't.
This reverts commit ce36480.
Commit 7ba6768
[libc++][string] Remove potential non-trailing 0-length array (llvm#105865)
It is a violation of the standard to use 0 length arrays, especially when not at the end of a structure (not a FAM GNU extension). Compilers generally accept it, but it's probably better to have a conforming implementation.
---------
Co-authored-by: Louis Dionne
Commit ed0da00
Commit 06c3311
[GlobalIsel] Update MIR gallery (llvm#107903)
- add more patterns
- clarify wip_match_opcode usage
Commit bece0d7
[llvm][Support] Determine the max thread length on Haiku (llvm#107801)
Haiku has pthread_setname_np() / pthread_getname_np().
Commit 1c334de
Revert "[llvm-ml] Fix RIP-relative addressing for ptr operands (llvm#107618)"
This reverts commit 7543d09.
This change caused failed asserts when building the openmp assembly sources, reproducible with:
    $ llvm-ml -m64 -D_M_AMD64 -c -Fo out.obj openmp/runtime/src/z_Windows_NT-586_asm.asm
    llvm-ml: ../lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp:624: void {anonymous}::X86MCCodeEmitter::emitMemModRMByte(const llvm::MCInst&, unsigned int, unsigned int, uint64_t, {anonymous}::PrefixKind, uint64_t, llvm::SmallVectorImpl<char>&, llvm::SmallVectorImpl<llvm::MCFixup>&, const llvm::MCSubtargetInfo&, bool) const: Assertion `IndexReg.getReg() == 0 && !ForceSIB && "Invalid rip-relative address"' failed.
The assert can also be triggered with one lone instruction:
    lea rdx, QWORD PTR [rax*8+16]
Commit 1581183
[MLIR] Make `resolveCallable` customizable in `CallOpInterface` (llvm#100361)
Allow customization of the `resolveCallable` method in the `CallOpInterface`. This change allows for operations implementing this interface to provide their own logic for resolving callables.
- Introduce the `resolveCallable` method, which does not include the optional symbol table parameter. This method replaces the previously existing extra class declaration `resolveCallable`.
- Introduce the `resolveCallableInTable` method, which incorporates the symbol table parameter. This method replaces the previous extra class declaration `resolveCallable` that used the optional symbol table parameter.
Commit 958f59d
[MLIR][NVVM] Add support for nvvm.breakpoint Op (llvm#107193)
This commit adds support for `nvvm.breakpoint` Op which lowers to the PTX brkpt instruction. Also, added the respective tests in `nvvmir.mlir`
Commit 831236e
Revert "[ORC-RT] Replace FnTag arg of WrapperFunction::call with generic dispatch arg."
This reverts commit 462251b.
This reverts commit 9b67c99.
Build fails for compiler-rt/lib/orc/tests/unit/wrapper_function_utils_test.cpp
https://buildkite.com/llvm-project/upstream-bazel/builds/109731#0191da59-6710-4420-92ef-aa6e0355cb2c
Commit 53d35c4
Revert "[MLIR] Make `resolveCallable` customizable in `CallOpInterface`" (llvm#107984)
Reverts llvm#100361
This commit caused some linker errors. (Missing `MLIRCallInterfaces` dependency.)
Commit 7574042
[mlir][SME] Update E2E test to show optional loop optimisation (NFC) (llvm#107585)
Introduces loop hoisting to the ARM SME E2E tests so that the tile load can be hoisted, which offers a significant speedup.
Discussed here: https://discourse.llvm.org/t/mlir-for-arm-sme-reducing-tile-data-transfers/80065/2
Commit 8aeb104
[DAG] expandAVG - consistently use getShiftAmountConstant for constant shift amounts. NFC
Commit 7e07c1d
[MLIR] Add f6E3M2FN type (llvm#105573)
This PR adds `f6E3M2FN` type to mlir.
`f6E3M2FN` type is proposed in [OpenCompute MX Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf). It defines a 6-bit floating point number with bit layout S1E3M2. Unlike IEEE-754 types, there are no infinity or NaN values.
```c
f6E3M2FN
- Exponent bias: 3
- Maximum stored exponent value: 7 (binary 111)
- Maximum unbiased exponent value: 7 - 3 = 4
- Minimum stored exponent value: 1 (binary 001)
- Minimum unbiased exponent value: 1 − 3 = −2
- Has Positive and Negative zero
- Doesn't have infinity
- Doesn't have NaNs

Additional details:
- Zeros (+/-): S.000.00
- Max normal number: S.111.11 = ±2^(4) x (1 + 0.75) = ±28
- Min normal number: S.001.00 = ±2^(-2) = ±0.25
- Max subnormal number: S.000.11 = ±2^(-2) x 0.75 = ±0.1875
- Min subnormal number: S.000.01 = ±2^(-2) x 0.25 = ±0.0625
```
Related PRs:
- [PR-94735](llvm#94735) [APFloat] Add APFloat support for FP6 data types
- [PR-97118](llvm#97118) [MLIR] Add f8E4M3 type - was used as a template for this PR
Commit 918222b
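The `f6E3M2FN` value table above can be reproduced with a few lines of arithmetic. The decoder below is a minimal sketch built only from the layout stated in the commit message (S1E3M2, bias 3, no infinity or NaN); the function name is hypothetical and this is not the APFloat or MLIR implementation.

```python
def decode_f6e3m2fn(bits):
    """Decode a 6-bit S1E3M2 pattern (bias 3, no Inf/NaN) to a Python float."""
    sign = -1.0 if (bits >> 5) & 1 else 1.0
    exponent = (bits >> 2) & 0b111
    mantissa = bits & 0b11
    bias = 3
    if exponent == 0:
        # Subnormal (or zero): no implicit leading 1, exponent fixed at 1 - bias.
        return sign * 2.0 ** (1 - bias) * (mantissa / 4)
    # Normal: implicit leading 1. Every exponent pattern encodes a finite value.
    return sign * 2.0 ** (exponent - bias) * (1 + mantissa / 4)

max_normal = decode_f6e3m2fn(0b011111)
min_normal = decode_f6e3m2fn(0b000100)
max_subnormal = decode_f6e3m2fn(0b000011)
min_subnormal = decode_f6e3m2fn(0b000001)
```

Since the all-ones exponent is an ordinary normal exponent in this format, the largest magnitude is reached at `S.111.11`.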
[MLIR] [NFC] Use APFloat semantics to get floating type width (llvm#107372)
As suggested in the comments of llvm#105573
Commit 083e25c
[LoongArch] Eliminate the redundant sign extension of division (llvm#107971)
If all incoming values of `div.d` are sign-extended and all users only use the lower 32 bits, then convert them to W versions.
Fixes: llvm#107946
Commit 0f47e3a
[VectorCombine] Add type shrinking and zext propagation for fixed-width vector types (llvm#104606)
Check that `binop(zext(value), other)` is possible and profitable to transform into `zext(binop(value, trunc(other)))`.
When the CPU architecture has an illegal scalar type iX, but the vector type <N * iX> is legal, scalar expressions before vectorisation may be extended to a legal type iY. This extension could result in underutilization of vector lanes, as more lanes could be used in one instruction with the narrower type. Vectorisers may not always recognize opportunities for type shrinking, and this patch aims to address that limitation.
Commit bf69484
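To see why the transform in the VectorCombine commit above needs a legality check, consider a scalar model of one lane. The sketch below assumes an i8 value zero-extended to i32 (the widths and helper names are illustrative assumptions): for `and` the narrow form always matches, while for `or` it only matches when the high bits of `other` are zero.

```python
def zext8to32(v):
    return v & 0xFF          # an i8 value placed in an i32, high bits zero

def trunc32to8(v):
    return v & 0xFF          # keep only the low 8 bits

def wide_and(value8, other32):
    return zext8to32(value8) & other32

def narrow_and(value8, other32):
    return zext8to32(value8 & trunc32to8(other32))

def wide_or(value8, other32):
    return zext8to32(value8) | other32

def narrow_or(value8, other32):
    return zext8to32(value8 | trunc32to8(other32))

samples = [0x00, 0x01, 0x7F, 0x80, 0xFF]
others = [0x0, 0xFF, 0x100, 0xDEADBEEF]
and_always_matches = all(
    wide_and(v, o) == narrow_and(v, o) for v in samples for o in others
)
or_counterexample = (wide_or(0x01, 0x100), narrow_or(0x01, 0x100))
```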
[llvm][Docs] Update guide to include `pip install lit` (llvm#106526)
Also updates and clarifies which version would be installed.
As per https://discourse.llvm.org/t/information-on-lit-is-outdated/76498.
Commit edbe8fa
Commit a99d666
[VPlan] Add VPValue for VF, use it for VPWidenIntOrFpInductionRecipe. (llvm#95305)
Similar to VFxUF, also add a VF VPValue to VPlan and use it to get the runtime VF in VPWidenIntOrFpInductionRecipe. Code for VF is only generated if there are users of VF, to avoid unnecessary test changes.
PR: llvm#95305
Commit a794ee4
[TOSA] tosa.negate operator lowering update (llvm#107924)
This PR makes the tosa.negate op for integer types use the simplified calculation branch if the input_zp and output_zp values are also zero.
Signed-off-by: Dmitriy Smirnov
Commit 2778d9d
Re-apply "[ORC-RT] Replace FnTag arg of WrapperFunction::call..." wit…
Commit 69f8923
[AArch64] Lower __builtin_bswap16 to rev16 if bswap followed by any_extend (llvm#105375)
GCC compiles the built-in function `__builtin_bswap16` to the ARM instruction rev16, which reverses the byte order of 16-bit data. On the other hand, Clang compiles the same built-in function to e.g.
```
rev w8, w0
lsr w0, w8, #16
```
i.e. it performs a byte reversal of a 32-bit register (which moves the lower half, which contains the 16-bit data, to the upper half) and then right shifts the reversed 16-bit data back to the lower half of the register.
We can improve Clang codegen by generating `rev16` instead of `rev` and `lsr`, like GCC.
Commit 23595d1
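The equivalence between the two instruction sequences in the AArch64 commit above can be checked on plain integers. This sketch models `rev` as a 32-bit byte reversal and `rev16` as a byte swap inside each 16-bit halfword; the helper names are assumptions for illustration, and only the low 16 bits of the result are compared because the commit targets the any_extend case.

```python
def rev32(w):
    """Reverse the four bytes of a 32-bit word (models `rev w8, w0`)."""
    return int.from_bytes((w & 0xFFFFFFFF).to_bytes(4, "little"), "big")

def rev16(w):
    """Swap the two bytes inside each 16-bit halfword (models `rev16`)."""
    w &= 0xFFFFFFFF
    return ((w & 0x00FF00FF) << 8) | ((w & 0xFF00FF00) >> 8)

def bswap16_via_rev_lsr(w):
    # rev followed by lsr #16, as in Clang's previous output
    return rev32(w) >> 16

def bswap16_via_rev16(w):
    # rev16, keeping only the low halfword (upper bits are don't-care)
    return rev16(w) & 0xFFFF

samples = [0x0000, 0x00FF, 0x1234, 0xABCD, 0x1234ABCD, 0xFFFF0001]
same_low_half = all(
    bswap16_via_rev_lsr(w) == bswap16_via_rev16(w) for w in samples
)
```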
[LLVM][AArch64] Refactor sve-b16b16 instruction definitions. (llvm#107265)
Update the predicate protecting bfloat instructions to only reference FEAT_SVE_B16B16, which matches the specification. Rename and move instruction classes to match the names of the encoding groups the bfloat arithmetic instructions belong.
Commit 516f08b
[Flang][Lower] Introduce SymMapScope helper class (NFC) (llvm#107866)
This patch creates a simple RAII wrapper class for `SymMap` to make it easier to use and prevent a missing matching `popScope()` for a `pushScope()` call on simple use cases. Some push-pop pairs are replaced with instances of the new class by this patch.
Commit 433ca3e
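The RAII idea behind the `SymMapScope` helper above (every `pushScope()` is paired with a `popScope()`, even on early exit) can be sketched with a Python context manager. The `SymMap` class here is a hypothetical stand-in with invented method names; it is not Flang's class, and the real helper is a C++ RAII wrapper.

```python
from contextlib import contextmanager

class SymMap:
    """Hypothetical stand-in for a scoped symbol map."""
    def __init__(self):
        self.scopes = [{}]

    def push_scope(self):
        self.scopes.append({})

    def pop_scope(self):
        self.scopes.pop()

@contextmanager
def sym_map_scope(sym_map):
    # Push on entry and always pop on exit, mirroring constructor/destructor.
    sym_map.push_scope()
    try:
        yield sym_map
    finally:
        sym_map.pop_scope()

sym_map = SymMap()
with sym_map_scope(sym_map):
    depth_inside = len(sym_map.scopes)
depth_after = len(sym_map.scopes)

try:
    with sym_map_scope(sym_map):
        raise RuntimeError("lowering failed")
except RuntimeError:
    pass
depth_after_error = len(sym_map.scopes)
```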
Commit fffdd9e
[lldb] Recurse through DW_AT_signature when looking for attributes (llvm#107241)
This allows e.g. DWARFDIE::GetName() to return the name of the type when looking at its declaration (which contains only DW_AT_declaration+DW_AT_signature). This is similar to how we recurse through DW_AT_specification when looking for a function name. Llvm dwarf parser has obtained the same functionality through llvm#99495.
This fixes a bug where we would confuse a type like NS::Outer::Struct with NS::Struct (because NS::Outer (and its name) was in a type unit).
Commit 925b220
[amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic (llvm#105822)
This intrinsic is meant to be used in functions that have a "tail" that needs to be run with all the lanes enabled. The "tail" may contain complex control flow that makes it unsuitable for the use of the existing WWM intrinsics. Instead, we will pretend that the function starts with all the lanes enabled, then branches into the actual body of the function for the lanes that were meant to run it, and then finally all the lanes will rejoin and run the tail.
As such, the intrinsic will return the EXEC mask for the body of the function, and is meant to be used only as part of a very limited pattern (for now only in amdgpu_cs_chain functions):
```
entry:
  %func_exec = call i1 @llvm.amdgcn.init.whole.wave()
  br i1 %func_exec, label %func, label %tail

func:
  ; ... stuff that should run with the actual EXEC mask
  br label %tail

tail:
  ; ... stuff that runs with all the lanes enabled;
  ; can contain more than one basic block
```
It's an error to use the result of this intrinsic for anything other than a branch (but unfortunately checking that in the verifier is non-trivial because SIAnnotateControlFlow will introduce an amdgcn.if between the intrinsic and the branch).
The intrinsic is lowered to a SI_INIT_WHOLE_WAVE pseudo, which for now is expanded in si-wqm (which is where SI_INIT_EXEC is handled too); however the information that the function was conceptually started in whole wave mode is stored in the machine function info (hasInitWholeWave). This will be useful in prolog epilog insertion, where we can skip saving the inactive lanes for CSRs (since if the function started with all the lanes active, then there are no inactive lanes to preserve).
Commit 44556e6
[clang][bytecode][NFC] Fix CallBI function signature
This doesn't modify the PC, so pass OpPC as a copy.
Commit 4687017
[lld][AArch64] Fix getImplicitAddend in big-endian mode. (llvm#107845)
In AArch64, the endianness of instruction encodings is always little, whereas the endianness of data swaps between LE and BE modes. So getImplicitAddend must use the right one of read32() and read32le(), for data and code respectively. It was using read32() throughout, causing instructions to be read as big-endian in BE mode, getting the wrong addend. Fixed, and updated the existing test to check both endiannesses. The expected results for data must be byte-swapped, but the ones for code need no adjustment.
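The distinction described above can be sketched as follows. This is a minimal illustration, not lld's code: `readLE32`, `readBE32`, `readData32`, `readInsn32`, and the explicit `isBigEndian` flag are hypothetical stand-ins for lld's `read32le()` and its endianness-aware `read32()`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for this sketch; lld's real read32()/read32le()
// have different signatures and take the endianness from the link config.
static uint32_t readLE32(const uint8_t *p) {
  return uint32_t(p[0]) | uint32_t(p[1]) << 8 | uint32_t(p[2]) << 16 |
         uint32_t(p[3]) << 24;
}

static uint32_t readBE32(const uint8_t *p) {
  return uint32_t(p[3]) | uint32_t(p[2]) << 8 | uint32_t(p[1]) << 16 |
         uint32_t(p[0]) << 24;
}

// Data words follow the endianness of the output (LE or BE mode).
uint32_t readData32(const uint8_t *p, bool isBigEndian) {
  return isBigEndian ? readBE32(p) : readLE32(p);
}

// AArch64 instruction words are always encoded little-endian,
// so the mode flag must not influence how they are read.
uint32_t readInsn32(const uint8_t *p) { return readLE32(p); }
```

Under these assumptions, a relocation that patches code would read its implicit addend with `readInsn32`, while one that patches data would use `readData32` with the current mode.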
Commit daf2085
Commit 6a56f15
[AArch64] Prevent the AArch64LoadStoreOptimizer from reordering CFI instructions (llvm#101317)
When the AArch64LoadStoreOptimizer pass merges an SP update with a load/store instruction and needs to adjust unwind information, either:
* create the merged instruction at the location of the SP update (so no CFI instructions are moved), or
* only move a CFI instruction if the move would not reorder it across other CFI instructions.

If neither of the above is possible, don't perform the optimisation.
Commit b0ffaa7
Commit 306b08c
[flang] Use LLVM dialect ops for stack save/restore in target-rewrite (llvm#107879)
Mostly NFC. I was bothered by the declarations that were always made even if unused, and I think using LLVM ops is nicer anyway with regard to side effects here.
```
func.func private @llvm.stacksave.p0() -> !fir.ref<i8>
func.func private @llvm.stackrestore.p0(!fir.ref<i8>)
```
There are other places in lowering that use the calls instead of the LLVM intrinsics, but I will deal with them another time (the issue there is mostly to get the proper address space for the llvm.ptr type).
Commit cb30169
[libc++] Include the full set of libc++ transitive includes in the CSV files (llvm#107911)
When we introduced the machinery for transitive includes validation, at some point we stopped including the full set of transitive includes in the CSV files and instead only tracked the set of public headers included *directly* by a top-level header. The reason for doing that was so that the CSV files containing "transitive" includes could be used to draw the dependency graph of libc++ headers. However, the downside was that it made the contents of the CSV files much harder to interpret. In particular, many changes that modify the CSV files do not in fact modify the effective set of transitive includes, which is confusing.

This patch goes back to storing the full set of transitive includes in the CSV files and removes the ability to graph the libc++ includes directly from those CSV files, which we never actually used.
Commit 930915a
Commit bda9474
Commit 0ccc609
[gn] attempt to port 53a81d4 (win/asan dynamic runtime)
Based on the output of llvm/utils/gn/build/sync_source_lists_from_cmake.py and reading the diff, but not actually tested on Windows.
Commit 4a63d62
Commit 4d55f0b
Reland [MLIR] Make resolveCallable customizable in CallOpInterface (llvm#107989)
Relands llvm#100361 with fixed dependencies.
Commit d1cad22
Commit e610a0e
[NFC][AMDGPU][Driver] Move 'shouldSkipSanitizeOption' utility to AMDGPU. (llvm#107997)
HIPAMDToolChain and AMDGPUOpenMPToolChain both depend on the "shouldSkipSanitizeOption" API to decide whether to sanitize device code.
Commit 5dd1c82
Commit f58312e
[flang][AMDGPU] Convert math ops to AMD GPU library calls instead of libm calls (llvm#99517)
This patch invokes a pass when compiling for an AMDGPU target to lower math operations to AMD GPU library calls instead of libm calls.
Commit 4290e34
Commit 69828c4
[SPIR-V] Expose an API call to initialize SPIRV target and translate input LLVM IR module to SPIR-V (llvm#107216)
The goal of this PR is to facilitate integration of the SPIRV Backend into various third-party tools and libraries by exposing an API call that translates an LLVM module to SPIR-V and writes the result into a string as binary SPIR-V output, providing diagnostics on failure and a means of configuring translation in the style of command-line options.

An example use case is the Khronos Translator, which provides bidirectional translation LLVM IR <=> SPIR-V, where the LLVM IR => SPIR-V step may be substituted by a call to the SPIR-V Backend API implemented by this PR.
Commit bca2b6d
[libc++][test] LWG2593: Moved-from state of Allocators (llvm#107344)
The resolution of LWG2593 didn't require the standard library implementation to change. It merely strengthened requirements on user-defined allocator types and allowed the implementation to make stronger assumptions. The status is tentatively set to Nothing To Do. However, `test_allocator` in libc++'s test suite needs to be fixed to conform to the strengthened requirements. Closes llvm#100220.
Commit 46a76c3
[CGData][MachineOutliner] Global Outlining (llvm#90074)
This commit introduces support for outlining functions across modules using codegen data generated from previous codegen. The codegen data currently manages the outlined hash tree, which records outlining instances that occurred locally in the past.

The machine outliner now operates in one of three modes:
1. CGDataMode::None: This is the default outliner mode that uses the suffix tree to identify (local) outlining candidates within a module. This mode is also used by (full)LTO to maintain optimal behavior with the combined module.
2. CGDataMode::Write (`-codegen-data-generate`): This mode is identical to the default mode, but it also publishes the stable hash sequences of instructions in the outlined functions into a local outlined hash tree. It then encodes this into the `__llvm_outline` section, which will be dead-stripped at link time.
3. CGDataMode::Read (`-codegen-data-use-path={.cgdata}`): This mode reads a codegen data file (.cgdata) and initializes a global outlined hash tree. This tree is used to generate global outlining candidates. Note that the codegen data file has been post-processed with the raw `__llvm_outline` sections from all native objects using the `llvm-cgdata` tool (or a linker, `LLD`, or a new ThinLTO pipeline later).

This depends on llvm#105398. After this PR, LLD (llvm#90166) and Clang (llvm#90304) will follow, each adding its client-side support. This is a patch for https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753.
Commit 0f52545
Commit 7190368
Commit 2459679
[flang][OpenMP] Implement copyin for pointers and allocatables. (llvm#107425)
The copyin clause currently forbids pointer and allocatable variables, which are allowed by the OpenMP 1.1 and 3.0 specifications respectively.
Commit 53b5902
[llvm-exegesis] Refactor getting register number from name to LLVMState (llvm#107895)
This patch refactors the procedure of getting the register number from a register name to LLVMState rather than having individual users get the values themselves by getting a reference to the map from LLVMState. This is primarily intended to make some downstream usage in Gematria simpler, but also cleans up a little bit upstream by pulling the actual map searching out and just leaving error handling to the clients.

The original getter is left to enable downstream migration in Gematria, particularly before it gets imported into google internal.
Commit 5823ac0
Commit dfd7284
Commit 33f1235
Commit 13c14c6
Commit 8530329
[Format] Avoid repeated hash lookups (NFC) (llvm#107962)
Co-authored-by: Owen Pan <[email protected]>
Commit 19a2f17
[Lex] Avoid repeated hash lookups (NFC) (llvm#107963)
MacroAnnotations has three std::optional fields. Functions makeDeprecation, makeRestrictExpansion, and makeFinal construct an instance of MacroAnnotations with one field initialized with a non-default value (that is, some value other than std::nullopt). Functions addMacroDeprecationMsg, addRestrictExpansionMsg, and addFinalLoc either create a new map entry with one field initialized with a non-default value or replace one field of an existing map entry. We can do all this with a simple statement of the form `AnnotationInfos[II].FieldName = NonDefaultValue;`, which takes care of default initialization of the fields with std::nullopt when a requested map entry does not exist.
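The single-statement form mentioned above can be illustrated with a small standalone sketch. The struct below is a simplified stand-in, not clang's actual `MacroAnnotations`: the field types, the `std::string` key, and the use of `std::map` are assumptions made so the example compiles on its own.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Simplified stand-in for clang's MacroAnnotations (field types are assumed).
struct MacroAnnotations {
  std::optional<std::string> DeprecationMsg;
  std::optional<std::string> RestrictExpansionMsg;
  std::optional<int> FinalLoc; // the real field holds a source location
};

// Hypothetical map type; the real map is keyed by identifier, not by string.
using AnnotationMap = std::map<std::string, MacroAnnotations>;

// operator[] value-initializes a missing entry, so every optional field
// starts out as std::nullopt; the assignment then sets just one field.
void addMacroDeprecationMsg(AnnotationMap &Infos, const std::string &Name,
                            std::string Msg) {
  Infos[Name].DeprecationMsg = std::move(Msg);
}

void addFinalLoc(AnnotationMap &Infos, const std::string &Name, int Loc) {
  Infos[Name].FinalLoc = Loc;
}
```

Each helper performs one map lookup, whether the entry already exists or has to be created, and leaves the other fields of an existing entry untouched.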
Commit 9710085
[mlir] Reuse pack dest in tensor.pack decomposition (llvm#108025)
In the `lowerPack` transform, there is a special case for lowering into a simple `tensor.pad` + `tensor.insert_slice`, but the destination becomes a newly created `tensor.empty`. This PR fixes the transform to reuse the original destination of the `tensor.pack`.
Commit e982d7f
[lldb][test] TestDbgInfoContentVectorFromStdModule.py: skip test on D…
…arwin (llvm#108003) This started failing on the macOS CI after llvm#106885: ``` lldb-api :: commands/expression/import-std-module/vector-dbg-info-content/TestDbgInfoContentVectorFromStdModule.py "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang" -std=c++11 -g -O0 -isysroot "/Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk" -arch arm64 -I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/make/../../../../..//include -I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/tools/lldb/include -I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/commands/expression/import-std-module/vector-dbg-info-content -I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/make -include /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/make/test_common.h -fno-limit-debug-info -nostdlib++ -nostdinc++ -cxx-isystem /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/include/c++/v1 --driver-mode=g++ -MT main.o -MD -MP -MF main.d -c -o main.o /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/commands/expression/import-std-module/vector-dbg-info-content/main.cpp "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang" main.o -g -O0 -isysroot "/Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk" -arch arm64 -I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/make/../../../../..//include -I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/tools/lldb/include -I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/commands/expression/import-std-module/vector-dbg-info-content 
-I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/make -include /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/make/test_common.h -fno-limit-debug-info -L/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/lib -Wl,-rpath,/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/lib -lc++ --driver-mode=g++ -o "a.out" ld: warning: ignoring duplicate libraries: '-lc++' codesign --entitlements /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/make/entitlements-macos.plist -s - "a.out" "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/./bin/dsymutil" -o "a.out.dSYM" "a.out" runCmd: settings set target.import-std-module true output: runCmd: expr std::reverse(a.begin(), a.end()) Assertion failed: (isa<InjectedClassNameType>(Decl->TypeForDecl)), function getInjectedClassNameType, file ASTContext.cpp, line 5057. PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace. Stack dump: 0. HandleCommand(command = "expr std::reverse(a.begin(), a.end())") 1. <eof> parser at end of file 2. /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/include/c++/v1/__algorithm/reverse.h:54:1: instantiating function definition 'std::reverse<std::__wrap_iter<Foo *>>' 3. /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/include/c++/v1/__algorithm/reverse.h:47:58: instantiating function definition 'std::__reverse<std::_ClassicAlgPolicy, std::__wrap_iter<Foo *>, std::__wrap_iter<Foo *>>' 4. /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/include/c++/v1/__algorithm/reverse.h:40:1: instantiating function definition 'std::__reverse_impl<std::_ClassicAlgPolicy, std::__wrap_iter<Foo *>>' ```
Commit 2bcab9b
[Attributor] Keep track of reached returns in AAPointerInfo (llvm#107479)
Instead of visiting call sites in Attribute::checkForAllUses, we now keep track of returns in AAPointerInfo and use the call site return information as required. This way, the user of AAPointerInfo(CallSite)Argument can determine if the call return should be visited. We do not collect them as "may accesses" in the AAPointerInfo(CallSite)Argument itself in case a return user is found.
Commit 56a0334
[RFC][C++20][Modules] Fix crash when function and lambda inside loade…
…d from different modules (llvm#104512) Summary: Because AST loading code is lazy and happens in unpredictable order it could happen that function and lambda inside function can be loaded from different modules. In this case, captured DeclRefExpr won’t match the corresponding VarDecl inside function. In AST it looks like this: ``` FunctionDecl 0x555564f4aff0 <Conv.h:33:1, line:41:1> line:33:35 imported in ./thrift_cpp2_base.h hidden tryTo 'Expected<Tgt, const char *> ()' inline |-also in ./folly-conv.h `-CompoundStmt 0x555564f7cfc8 <col:43, line:41:1> |-DeclStmt 0x555564f7ced8 <line:34:3, col:17> | `-VarDecl 0x555564f7cef8 <col:3, col:16> col:7 imported in ./thrift_cpp2_base.h hidden referenced result 'Tgt' cinit | `-IntegerLiteral 0x555564f7d080 <col:16> 'int' 0 |-CallExpr 0x555564f7cea8 <line:39:3, col:76> '<dependent type>' | |-UnresolvedLookupExpr 0x555564f7bea0 <col:3, col:19> '<overloaded function type>' lvalue (no ADL) = 'then_' 0x555564f7bef0 | |-CXXTemporaryObjectExpr 0x555564f7bcb0 <col:25, col:45> 'Expected<bool, int>':'folly::Expected<bool, int>' 'void () noexcept' zeroing | `-LambdaExpr 0x555564f7bc88 <col:48, col:75> '(lambda at Conv.h:39:48)' | |-CXXRecordDecl 0x555564f76b88 <col:48> col:48 imported in ./folly-conv.h hidden implicit <undeserialized declarations> class definition | | |-also in ./thrift_cpp2_base.h | | `-DefinitionData lambda empty standard_layout trivially_copyable literal can_const_default_init | | |-DefaultConstructor defaulted_is_constexpr | | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param | | |-MoveConstructor exists simple trivial needs_implicit | | |-CopyAssignment trivial has_const_param needs_implicit implicit_has_const_param | | |-MoveAssignment | | `-Destructor simple irrelevant trivial constexpr needs_implicit | `-CompoundStmt 0x555564f7d1a8 <col:58, col:75> | `-ReturnStmt 0x555564f7d198 <col:60, col:67> | `-DeclRefExpr 0x555564f7d0a0 <col:67> 'Tgt' lvalue Var 0x555564f7d0c8 
'result' 'Tgt' refers_to_enclosing_variable_or_capture `-ReturnStmt 0x555564f7bc78 <line:40:3, col:11> `-InitListExpr 0x555564f7bc38 <col:10, col:11> 'void' ``` This diff changes AST deserialization to load lambdas inside canonical function declaration earlier right after the function to make sure that their canonical decl is loaded from the same module. Test Plan: check-clang
Commit d778689
Commit bf68403
Fix for Attempt to fix [CGData][MachineOutliner] Global Outlining (llvm#90074) llvm#108037 (llvm#108047)
The previous `attempt to fix [CGData][MachineOutliner] Global Outlining (llvm#90074) llvm#108037` was incomplete because the `ImmutableModuleSummaryIndexWrapperPass` is now optional for the MachineOutliner pass. With this fix, the test file `CodeGen/AArch64/O3-pipeline.ll` shows no changes compared to its state before `[CGData][MachineOutliner] Global Outlining (llvm#90074)`.
Co-authored-by: Kyungwoo Lee <[email protected]>
Commit ba2aa1d
Fix for llvm/test/CodeGen/RISCV/O3-pipeline.ll (llvm#108050)
The previous `Fix for Attempt to fix [CGData][MachineOutliner] Global Outlining (llvm#90074) llvm#108037 (llvm#108047)` somehow dropped this file.
Commit 2cfdcfb
[RISCV] Separate more of scalar FP in CC_RISCV. NFC (llvm#107908)
The scalar FP calling convention has gotten more complicated with recent changes to Zfinx/Zdinx, the proposed addition of a GPRF16 register class, and the use of customReg for f16/bf16 and other FP types smaller than XLen. The previous code tried to share a single getReg and getMem call for many different cases. This patch separates all the FP register handling to the top of the function with their own getReg calls. The only exception is f64 with XLen==32, when we are out of FPRs or not able to use FPRs due to ABI. The way I've structured this, we no longer need to correct the LocVT for FP back to ValVT before the call to getMem.
Commit 14b4356
Revert "[amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic" (llvm#108054)
Breaks bots, see llvm#105822. Reverts llvm#105822
Commit c7a7767
[LLDB][Data Formatters] Calculate average and total time for summary providers within lldb (llvm#102708)
This PR adds a statistics provider cache, which allows an individual target to keep a rolling tally of its total time and number of invocations for a given summary provider. This information is then available in statistics dump to help identify slow summary providers and glean more insight into LLDB's time use.
Commit 22144e2
[libc] fix locale dependency for stdlib (llvm#108042)
Address the following issue: ``` ❯ ninja libc.test.src.__support.OSUtil.linux.vdso_test.__unit__ [91/127] Building CXX object libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o FAILED: libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o sccache /usr/bin/clang++ -DLIBC_NAMESPACE=__llvm_libc_20_0_0_git -D_DEBUG -I/home/schrodingerzy/Documents/llvm-project/libc -isystem /home/schrodingerzy/Documents/llvm-project/build/libc/include -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -g -std=gnu++17 -fpie -DLIBC_FULL_BUILD -ffreestanding -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -fno-rtti -MD -MT libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o -MF libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o.d -o libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o -c /home/schrodingerzy/Documents/llvm-project/libc/test/src/__support/OSUtil/linux/vdso_test.cpp In file included from /home/schrodingerzy/Documents/llvm-project/libc/test/src/__support/OSUtil/linux/vdso_test.cpp:21: In file included from /home/schrodingerzy/Documents/llvm-project/libc/test/UnitTest/ErrnoSetterMatcher.h:13: In file included from /home/schrodingerzy/Documents/llvm-project/libc/src/__support/FPUtil/fpbits_str.h:12: In file included from 
/home/schrodingerzy/Documents/llvm-project/libc/src/__support/CPP/string.h:20: /home/schrodingerzy/Documents/llvm-project/build/libc/include/stdlib.h:13:10: fatal error: 'llvm-libc-types/locale_t.h' file not found 13 | #include "llvm-libc-types/locale_t.h" | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1 error generated. [123/127] Building CXX object libc/test/UnitTest/CMakeFiles/LibcTest.unit.dir/LibcTestMain.cpp.o ninja: build stopped: subcommand failed. ```
Commit ce9f987
[MemProf] Streamline and avoid unnecessary context id duplication (llvm#107918)
Sort the list of calls such that those with the same stack ids are also sorted by function. This allows processing of all matching calls (that can share a context node) in bulk as they are all adjacent.

This has 2 benefits:
1. It reduces unnecessary work, specifically the handling to intersect the context ids with those along the graph edges for the stack ids, for calls that we know can share a node.
2. It simplifies detecting when we have matching stack ids but don't need to duplicate context ids. Specifically, we were previously still duplicating context ids whenever we saw another call with the same stack ids, but that isn't necessary if they will share a context node. With this change we now only duplicate context ids if we see some that not only have the same ids but also are in different functions.

This change reduced the amount of context id duplication and provided reductions in both peak memory (~8%) and time (~5%) for a large target.
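The sorting idea can be sketched in isolation. This is a simplified illustration rather than the MemProf implementation: `CallRecord`, `sortCalls`, and `countSharedNodeGroups` are hypothetical names, and the plain lexicographic ordering on stack ids is an assumption chosen only to make matching calls adjacent.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical record: the stack ids of a call plus its enclosing function.
struct CallRecord {
  std::vector<uint64_t> StackIds;
  std::string Func;
};

// Order calls by stack ids first and by function second, so calls that
// agree on both end up next to each other.
void sortCalls(std::vector<CallRecord> &Calls) {
  std::stable_sort(Calls.begin(), Calls.end(),
                   [](const CallRecord &A, const CallRecord &B) {
                     return std::tie(A.StackIds, A.Func) <
                            std::tie(B.StackIds, B.Func);
                   });
}

// Count runs of adjacent calls with identical stack ids and function.
// In this sketch, each run models a set of calls that could be handled
// in bulk because they can share one context node.
std::size_t countSharedNodeGroups(const std::vector<CallRecord> &Sorted) {
  std::size_t Groups = 0;
  for (std::size_t I = 0; I < Sorted.size(); ++I) {
    bool StartsGroup = I == 0 ||
                       Sorted[I].StackIds != Sorted[I - 1].StackIds ||
                       Sorted[I].Func != Sorted[I - 1].Func;
    if (StartsGroup)
      ++Groups;
  }
  return Groups;
}
```

With the calls sorted this way, a single linear pass is enough to tell "same stack ids, same function" apart from "same stack ids, different function", which is the case the commit message says still needs context id duplication.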
Commit 524a028
[ADT] Require base equality in indexed_accessor_iterator::operator==() (llvm#107856)
Similarly to operator<(), equality-comparing iterators from different ranges must really be forbidden. The preconditions for being able to do `it1 < it2` and `it1 != it2` (or `it1 == it2` for that matter) ought to be the same. Thus, there's little sense in keeping explicit base object comparison in operator==() whilst having this as a precondition in operator<() and operator-() (e.g. used for std::distance() and such).
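A minimal sketch of the resulting contract is shown below. `IndexedIter` is a hypothetical, stripped-down class written for this illustration; it is not `llvm::indexed_accessor_iterator`, and the assertion message is invented.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimal iterator: a base object pointer plus an index.
template <typename BaseT> class IndexedIter {
public:
  IndexedIter(const BaseT *Base, std::ptrdiff_t Index)
      : Base(Base), Index(Index) {}

  // Equality treats "same base" as a precondition instead of folding it
  // into the result, matching operator< and operator- below.
  bool operator==(const IndexedIter &RHS) const {
    assert(Base == RHS.Base && "incompatible iterators");
    return Index == RHS.Index;
  }
  bool operator!=(const IndexedIter &RHS) const { return !(*this == RHS); }

  bool operator<(const IndexedIter &RHS) const {
    assert(Base == RHS.Base && "incompatible iterators");
    return Index < RHS.Index;
  }

  std::ptrdiff_t operator-(const IndexedIter &RHS) const {
    assert(Base == RHS.Base && "incompatible iterators");
    return Index - RHS.Index;
  }

private:
  const BaseT *Base;
  std::ptrdiff_t Index;
};
```

In this sketch, comparing iterators over different base objects trips the assertion in builds with assertions enabled, rather than quietly returning `false`.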
Commit 7fb19cb
[DirectX] Lower `@llvm.dx.typedBufferStore` to DXIL ops
The `@llvm.dx.typedBufferStore` intrinsic is lowered to `@dx.op.bufferStore`. Pull Request: llvm#104253
Commit 90e8411
Commit c8ed2b8
[PowerPC] Fix assert exposed by PR 95931 in LowerBITCAST (llvm#108062)
Hit Assertion failed: Num < NumOperands && "Invalid child # of SDNode!" Fix by checking opcode and value type before calling getOperand.
Commit 22067a8
Revert "[NVPTX] Support copysign PTX instruction (llvm#107800)" (llvm…
Commit 02c943a
Add DIExpression::foldConstantMath to CoroSplit (llvm#107933)
The CoroSplit pass has its own salvageDebugInfo implementation and its DIExpressions do not get folded. Add a call to DIExpression::foldConstantMath in the CoroSplit pass to reduce the size of those DIExpressions. [The compile time tracker shows no significant increase in compile time either.](https://llvm-compile-time-tracker.com/compare.php?from=bdf02249e7f8f95177ff58c881caf219699acb98&to=e1c1c1759c06bc4c42f79eebdb0e3cd45219cef4&stat=instructions:u) rdar://134675402
Commit 7a91af4
Commit feeb6aa
[RISCV] Fix fneg.d/fabs.d aliasing handling for Zdinx. Add missing fmv.s/d aliases.
We were missing test coverage for fneg.d/fabs.d for Zdinx. When I added it, it revealed that it only worked on RV64. The assembler was not creating a GPRPair register class on RV32, so the alias couldn't match. The disassembler was also not using GPRPair registers, preventing the aliases from printing in disassembly too.

I've fixed the assembler by adding new parsing methods in an attempt to get decent diagnostics. This is hard since the mnemonics are ambiguous between D and Zdinx. Tests have been adjusted for some differences in what errors are reported first.
Commit 5537ae8
[lldb-dap] Improve `stackTrace` and `exceptionInfo` DAP request handlers (llvm#105905)
Refactoring `stackTrace` to perform frame look ups in a more on-demand fashion to improve overall performance. Additionally adding additional information to the `exceptionInfo` request to report exception stacks there instead of merging the exception stack into the stack trace. The `exceptionInfo` request is only called if a stop event occurs with `reason='exception'`, which should mitigate the performance of `SBThread::GetCurrentException` calls. Adding unit tests for exception handling and stack trace supporting.
Commit 5b4100c
[DirectX] Add DirectXTargetCodeGenInfo (llvm#104856)
Adds target codegen info class for DirectX. For now it always translates `__hlsl_resource_t` handle to `target("dx.TypedBuffer", i32, 1, 0, 1)` (`RWBuffer<int>`). More work is needed to determine the actual target exp type and parameters based on the resource handle attributes. Part 1/2 of llvm#95952
Commit becb03f
[Coroutines] Move spill related methods to a Spill utils (llvm#107884)
* Move code related to spilling into SpillUtils to help clean up CoroFrame.

See RFC for more info: https://discourse.llvm.org/t/rfc-abi-objects-for-coroutines/81057
Commit f4e2d7b
[HLSL] Warn on duplicate is_rov attribute; remove unnecessary parentheses (llvm#107973)
We should issue a warning whenever a duplicate resource type attribute is found. Currently we do that only for `resource_class`. This PR fixes that by checking for duplicate `is_rov` attributes as well. Also removes unnecessary parentheses on `is_rov`.
Commit 61372fc
Commit 0b12cd2
Commit b9703cb
Commit cb3eb06
[OpenACC] Properly ignore side-effects in clause arguments
The OpenACC standard makes depending on side effects effectively UB, so this patch ensures we handle them reasonably by making it a potentially evaluated context and ignoring cleanups.
Commit 6dacc38
Commit 27a01f6
[SandboxIR] PassManager (llvm#107932)
This patch implements a simple pass manager for Sandbox IR.
Commit 3363760
Commit 99fb150
[VPlan] Consider non-header phis in planContainsAdditionalSimp.
Update planContainsAdditionalSimplifications to also check phis not in the loop header. This ensures we don't miss cases where VPBlendRecipes (which correspond to such phis) have been simplified. Fixes llvm#107473.
Commit e3c537f
[llvm-lit] Process ANSI color codes in test output when formatting (llvm#106776)
Test output that carried color across newlines previously resulted in the formatting around the output also being colored. Detect the current ANSI color and reset it when printing formatting, and then reapply it. As an added bonus an unterminated color code is also detected, preventing it from leaking out into the rest of the terminal. Fixes llvm#106633.
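The detect, reset, and reapply idea described above can be sketched roughly as follows. This is not the llvm-lit code (llvm-lit is written in Python); `activeColor` and `frameMarker` are hypothetical names, and the sketch assumes that only `ESC[m` and `ESC[0m` act as resets.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the SGR escape sequence still in effect at the
// end of `text`, or "" if no colour is active.
// Simplification: only "ESC[m" and "ESC[0m" are treated as resets.
std::string activeColor(const std::string &text) {
  const std::string intro = "\x1b[";
  std::string current;
  std::size_t pos = 0;
  while ((pos = text.find(intro, pos)) != std::string::npos) {
    std::size_t end = pos + intro.size();
    // Skip the parameter bytes (digits and ';').
    while (end < text.size() &&
           (std::isdigit(static_cast<unsigned char>(text[end])) ||
            text[end] == ';'))
      ++end;
    if (end >= text.size())
      break; // unterminated escape sequence: do not treat it as a colour
    if (text[end] == 'm') {
      const std::string params =
          text.substr(pos + intro.size(), end - pos - intro.size());
      if (params.empty() || params == "0")
        current.clear();
      else
        current = text.substr(pos, end - pos + 1);
    }
    pos = end + 1;
  }
  return current;
}

// Hypothetical helper: emits a formatting marker without letting the carried
// colour leak into it, then re-applies that colour for the following output.
std::string frameMarker(const std::string &marker, const std::string &carried) {
  assert((carried.empty() || carried.back() == 'm') &&
         "carried colour must be a complete SGR sequence");
  if (carried.empty())
    return marker;
  return "\x1b[0m" + marker + carried;
}
```

A caller would compute `activeColor` over the test output seen so far and pass the result to `frameMarker` whenever it prints its own framing text.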
Commit 0f56ba1
[flang][cuda] Avoid generating data transfer when calling size intrinsic (llvm#108081)
`cuf.data_transfer` was wrongly generated when calling the `size` intrinsic on a device allocatable variable. Since the descriptor is available on the host, there is no transfer needed. Add `DescriptorInquiry` in the `CollectCudaSymbolsHelper` to filter out symbols that are not needed for the transfer decision to be made.
Commit b5ce7a9
Revert "[libc++][string] Remove potential non-trailing 0-length array" (llvm#108091)
Reverts llvm#105865. This breaks a pair of LLDB tests in CI.
Commit d8a8eae
[flang] Silence spurious error on non-CUDA use of CUDA module (llvm#107444)
When a module file has been compiled with CUDA enabled, don't emit spurious errors about non-interoperable types when that module is read by a USE statement in a later non-CUDA compilation.
Commit ce39247
[flang] Relax error into a warning (llvm#107489)
The standard requires that a generic interface with the same name as a derived type contain only functions. We generally allow a generic interface to contain both functions and subroutines, since there's never any ambiguity at the point of call; this is helpful when the specific procedures of two generics are combined during USE association. Emit a warning instead of a hard error when a generic interface with the same name as a derived type contains a subroutine, to improve portability of code from compilers that don't check for this condition.
Commit 5a229db
[flang] Fix bogus error about procedure incompatibility (llvm#107645)
This was a subtle problem. When the shape of a function result is explicit but not constant, it is characterized with bounds expressions that use Extremum<SubscriptInteger> operations to force extents to 0 rather than be negative. These Extremum operations are formatted as "max()" intrinsic functions in the module file. Upon being read from the module file, they are not folded back into Extremum operations, but remain as function references; and this then leads to expressions not comparing equal when the procedure characteristics are compared to those of a local procedure declared identically. The real fix here would be for folding to just always change max and min function references into Extremum<> operations, constant operands or not, and I tried that, but it led to test failures and crashes in lowering that I couldn't resolve. So, until those can be fixed, here's a change that will read max/min operations in module file declarations back into Extremum operations to solve the compatibility checking problem, but leave other non-constant max/min operations as function calls.
Commit d2126ec
[flang] Relax ETIME(VALUES=) runtime checking (llvm#107647)
Don't require the "VALUES=" argument to the extension intrinsic procedure ETIME to have exactly two elements. Other compilers that support ETIME do not, and it's easy to adapt the behavior to whatever the dynamic size turns out to be.
Commit fe58527
[flang] Accept initialized SAVE local in specification expression (llvm#107656)
Specification expressions may contain references to dummy arguments, host objects, module variables, and variables in COMMON blocks, since they will have values on entry to the scope. A local variable with an initializer and the SAVE attribute (which will always be implied by an explicit initialization) will also always work, and is accepted by at least one other compiler, so accept it with a warning.
Commit 26ac30b
[flang][runtime] Don't emit runtime error for "AA" editing (llvm#107714)
Commas are optional between edit descriptors in a format, so treat "AA" as if it were "A,A".
Commit cd92c42
[flang][runtime] Accept '\n' as space in internal list-directed input (llvm#107716)
When scanning ahead for the first character in the next input item in list-directed internal input, allow a newline character to appear and treat it as a space, matching the behavior of nearly all other Fortran compilers.
Commit ea858e3
[flang][runtime] Fix odd "invalid descriptor" runtime crash (llvm#107785)
A defined assignment generic interface for a given LHS/RHS type & rank combination may have a specific procedure with LHS dummy argument that is neither allocatable nor pointer, or specific procedure(s) whose LHS dummy arguments are allocatable or pointer. It is possible to have two specific procedures if one's LHS dummy argument is allocatable and the other's is pointer. However, the runtime doesn't work with LHS dummy arguments that are allocatable, and will crash with a mysterious "invalid descriptor" error message. Extend the list of special bindings to include ScalarAllocatableAssignment and ScalarPointerAssignment, use them when appropriate in the runtime type information tables, and handle them in Assign() in the runtime support library.
Commit 15106c2
[flang] Accept KIND(x) when x is assumed-rank (llvm#107787)
Don't emit a bogus error about being unable to forward an assumed-rank dummy argument as an actual argument in the case of the KIND intrinsic function. Fixes llvm#107782.
Commit 37f94cd
[flang] Fix error from semantics on use associated procedure pointer (llvm#107928)
Use associated procedure pointers were eliciting bogus errors from semantics if their modules also contained generic procedure interfaces of the same name. (The compiler handles this case correctly when the specific procedure of the same name is not a pointer.) With this fix, the test case in llvm#107784 no longer experiences semantic errors; however, it now crashes unexpectedly in lowering.
Commit d418a03
Commit 5a2071b
[WebAssembly] Misc. refactoring in AsmTypeCheck (NFC) (llvm#107978)
Existing methods in AsmTypeCheck assume the symbol operand is the 0th operand; they take a `MCInst` and call `getOperand(0)` on it. I think passing a `MCOperand` removes this assumption and is also more intuitive. This was motivated by a new `try_table` instruction, whose support is going to be added to AsmTypeCheck soon, which has tag symbol operands in any position, depending on the number and the kinds of catch clauses. This PR changes the signatures of all methods that assume the 0th operand is the relevant one, even if it's not the symbol operand. This also adds a `getSignature` method, which factors out the common task of getting a `WasmSignature` from a `MCOperand`.
Commit 5495c36
[SandboxIR] Fix base class of FenceInst. Verify instructions when building a BB in debug mode. (llvm#108078)
@vporpo suggested in an offline conversation that verifying all instructions during `BasicBlock::buildBasicBlockFromLLVMIR` would be a good way to get coverage for errors like this during testing. He also suggested not gating it on `SBVEC_EXPENSIVE_CHECKS` for now as the checks are pretty basic at the moment and they only affect Debug builds.
Commit ace6d5f
Commit 0fc4147
[SandboxIR] Pass registry (llvm#108084)
This patch implements a simple Pass Registry class, which takes ownership of the passes registered with it and provides an interface to get the pass pointer by its name.
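As a rough illustration of the ownership and lookup-by-name behaviour described above, a registry of this kind might look like the sketch below. The `Pass` and `PassRegistry` classes here are simplified stand-ins with hypothetical member names, not the actual Sandbox IR classes.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a pass; the real Sandbox IR class differs.
class Pass {
  std::string Name;

public:
  explicit Pass(std::string Name) : Name(std::move(Name)) {}
  virtual ~Pass() = default;
  const std::string &getName() const { return Name; }
};

// Hypothetical registry: owns every registered pass and hands out non-owning
// pointers on lookup.
class PassRegistry {
  std::vector<std::unique_ptr<Pass>> Passes;

public:
  // Returns the pass registered under Name, or nullptr if there is none.
  Pass *getPassByName(const std::string &Name) const {
    for (const auto &P : Passes)
      if (P->getName() == Name)
        return P.get();
    return nullptr;
  }

  // Takes ownership of P and returns a reference that stays valid for the
  // lifetime of the registry.
  Pass &registerPass(std::unique_ptr<Pass> P) {
    assert(P && "cannot register a null pass");
    assert(!getPassByName(P->getName()) && "duplicate pass name");
    Passes.push_back(std::move(P));
    return *Passes.back();
  }
};
```

Storing `std::unique_ptr` elements keeps the pointers returned by `getPassByName` stable even when the vector reallocates.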
Commit 2ddf21b
[flang] Fix shared library flang build (llvm#108101)
I broke the shared library builds a few minutes ago by introducing a cyclic dependency between two parts of the compiler. Fix.
Commit d452429
Commit 957af73
[LLDB] Skip Summary Statistics Tests for Windows (llvm#108079)
Follow-up to llvm#102708: the tests are failing on Windows. There is a large variance in these tests between summary strings and built-in types. I'm disabling these tests on Windows, and will add Windows-specific tests as a follow-up to this.
Commit 10c04d9
Revert "[llvm-lit] Process ANSI color codes in test output when formatting" (llvm#108104)
Reverts llvm#106776 because of a test failure on Windows.
Commit 6007ad7
[SandboxIR] Implement BlockAddress (llvm#107940)
This patch implements sandboxir::BlockAddress, mirroring llvm::BlockAddress.
Commit d14a600
Commit bb72865
Revert "[sanitizer] Add CHECKs to validate calculated TLS range" (llvm#108112)
Reverts llvm#107941. Broke PPC bot.
Commit 5804193
[docs] Add a section on AI-generated content to the developer policy (llvm#91014)
Governments around the world are starting to require labelling for AI-generated content, and some LLVM stakeholders have asked if LLVM contains AI-generated content. Defining a policy on the use of AI tools allows us to answer that question affirmatively, one way or the other. The policy proposed here allows the use of AI tools in LLVM contributions, flowing from the idea that any contribution is fine regardless of how it is made, as long as the contributor has the right to license it under the project license. I gathered input from the community in this RFC and incorporated it into the policy: https://discourse.llvm.org/t/rfc-define-policy-on-ai-tool-usage-in-contributions/78758
Commit 829ea59
[MemProf] Convert CallContextInfo to a struct (NFC) (llvm#108086)
As suggested in llvm#107918, improve readability by converting this tuple to a struct.
Commit ae5f1a7
[LegalizeTypes] Avoid creating an unused node in ExpandIntRes_ADDSUB. NFC
The Hi result is sometimes calculated a different way and this node goes unused. Defer creation until we know for sure it is needed. The test changes are because the node creation order changed the names in the debug output.
Commit d2f25e5
Commits on Sep 11, 2024
-
[compiler-rt] Hardcode uptr/sptr typedefs on Linux Arm (llvm#108105)
After llvm#106155, Android arm32 asan builds stopped working with missing definition linker errors. This is due to inconsistent definitions of `uptr` of either `unsigned long` or `unsigned int` even between TUs in compiler-rt. This is caused by Linux arm32 headers redefining `__UINTPTR_TYPE__` (see `arch/arm/include/uapi/asm/types.h` in the Linux kernel repo), meaning include order/whether or not the Linux header is included changes compiler-rt symbol mangling. As a workaround, this hardcodes `uptr`/`sptr` in compiler-rt to `unsigned int`/`int` on Linux arm32, matching clang/gcc.
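To see why the inconsistent definition leads to link errors, note that `unsigned int` and `unsigned long` are distinct types even when they have the same width, so a function taking a `uptr` gets a different mangled name in each translation unit. The sketch below illustrates this and the shape of the workaround; `uptr_sketch`, `sptr_sketch`, and `overloadId` are hypothetical names, and the preprocessor condition is only an approximation of "Linux arm32", not the condition used in compiler-rt.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Distinct types, even on targets where both are 32 bits wide.
static_assert(!std::is_same<unsigned int, unsigned long>::value,
              "unsigned int and unsigned long are different types");

// Because the types differ, both overloads can coexist. Under the Itanium C++
// ABI, f(unsigned int) mangles as _Z1fj and f(unsigned long) as _Z1fm, so two
// translation units that disagree on the typedef reference different symbols.
inline int overloadId(unsigned int) { return 1; }
inline int overloadId(unsigned long) { return 2; }

// Sketch of the workaround: pin the typedefs on Linux arm32 instead of
// deriving them from a macro that a kernel header may redefine.
#if defined(__linux__) && defined(__arm__)
typedef unsigned int uptr_sketch;
typedef int sptr_sketch;
#else
typedef std::uintptr_t uptr_sketch;
typedef std::intptr_t sptr_sketch;
#endif

static_assert(sizeof(uptr_sketch) == sizeof(void *),
              "uptr must be pointer-sized");
static_assert(sizeof(sptr_sketch) == sizeof(void *),
              "sptr must be pointer-sized");
```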
Commit db7e8f2
[scudo] Fix the logic of MaxAllowedFragmentedPages (llvm#107927)
MTE doesn't support MaxReleasedCachePages which may break the assumption that only the first 4 pages will have memory tagged.
Commit 6e854a6
[ORC][Runtime] Add `dlupdate` for MachO (llvm#97441)
With the help of @lhames, this pull request introduces the `dlupdate` function in the ORC runtime. `dlupdate` enables incremental execution of new initializers introduced in the REPL environment. Unlike traditional `dlopen`, which manages initializers, code mapping, and library reference counts, `dlupdate` focuses exclusively on running new initializers.
Commit 68f31aa
[RISCV] Rematerialize vmv.v.x (llvm#107993)
Even though vmv.v.x has a non constant scalar operand, we can still rematerialize it because we have split register allocation between vectors and scalars. InlineSpiller will check to make sure that the scalar operand is live at the point where the rematerialization occurs, so this won't extend any scalar live ranges. However this also means we may not be able to rematerialize in some cases, as shown in @vmv.v.x_needs_extended. It might be worthwhile teaching InlineSpiller to extend scalar live ranges in a future patch. I experimented with this locally and it reduced spills on 531.deepsjeng_r by a further 3%.
Commit 77fc8da
[RISCV] Add testcase for -mcmodel= (llvm#107816)
This is a pre-commit test for llvm#107817
Commit 69ed733
MIPSr6: Add llvm.is.fpclass intrinsic support (llvm#107857)
MIPSr6 has class.s/class.d instructions. Let's use them for llvm.is.fpclass intrinsic.
Commit c641b61
[RISCV] Rematerialize vfmv.v.f (llvm#108007)
This is the same principle as vmv.v.x in llvm#107993, but for floats.
Commit 21a0176
[RISCV] Rematerialize vmv.s.x and vfmv.s.f (llvm#108012)
Continuing with llvm#107993 and llvm#108007, this handles the last of the main rematerializable vector instructions. There's an extra spill in one of the test cases, but it's likely noise from the spill weights and isn't an issue in practice.
Commit 933fc63
[flang] Make flang module hidden dependency explicit to correct build failure (llvm#108129)
Any flang module with a derived type definition implicitly depends on flang/module/__fortran_type_info.f90. Make this dependency explicit so that an unlucky build order doesn't cause a crash.
Commit 901006f
[RISCV] Add reductions to list of roots in tryToReduceVL (llvm#107595)
This allows us to reduce VLs feeding reduction instructions. In particular, this means that <3 x Ty> reduce(load) like sequences no longer require a VL toggle. This was waiting on 3d72957; now that the latent correctness issue is fixed, we can expand this transform.
Commit 1253001
SelectionDAG: Remove unneeded getSelectCC in expandFMINIMUMNUM_FMAXIMUMNUM (llvm#107416)
ISD::FCANONICALIZE is enough, which can process NaN or non-NaN correctly, thus getSelectCC is not needed here.
Commit 5773adb
Revert "[scudo] Fix the logic of MaxAllowedFragmentedPages" (llvm#108130)
Reverts llvm#107927. We are supposed to check the MaxAllowedFragmentedPages instead.
Commit 76151c4
[flang] Fix cycle of build dependencies (llvm#108132)
While trying to fix one build problem, I made things worse. This should clear things up.
Commit c571113
[SandboxIR] Implement ScalableVectorType (llvm#108124)
As in the heading.
Commit 3b4e7c9
[flang][cuda] Avoid extra load in c_f_pointer lowering with c_devptr (llvm#108090)
Remove unnecessary load of the `cptr` component when getting the `__address`. `fir.coordinate_of` operation can be chained so the load is not needed.
Commit e67a666
[LTO] Remove unused includes (NFC) (llvm#108110)
clangd reports these as unused headers. My manual inspection agrees with the findings.
Commit 3dad29b
[WebAssembly] Add assembly support for final EH proposal (llvm#107917)
This adds the basic assembly generation support for the final EH proposal, which was newly adopted in Sep 2023 and advanced into Phase 4 in Jul 2024: https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md

This adds support for the generation of the new `try_table` and `throw_ref` instructions in the .s assembly format. This does NOT yet include:
- Block annotation comment generation for .s format
- .o object file generation
- .s assembly parsing
- Type checking (AsmTypeCheck)
- Disassembler
- Fixing unwind mismatches in CFGStackify

These will be added as follow-up PRs.

---

The format for `TRY_TABLE`, both for `MachineInstr` and `MCInst`, is as follows:
```
TRY_TABLE type number_of_catches catch_clauses*
```
where `catch_clause` is
```
catch_opcode tag+ destination
```
`catch_opcode` should be one of 0/1/2/3, which denotes `CATCH`/`CATCH_REF`/`CATCH_ALL`/`CATCH_ALL_REF` respectively. (See `BinaryFormat/Wasm.h`) `tag` exists when the catch is one of `CATCH` or `CATCH_REF`. The MIR format is printed as just the list of raw operands. The (stack-based) assembly instruction supports pretty-printing, including printing `catch` clauses by name, in InstPrinter.

In addition to the new instructions `TRY_TABLE` and `THROW_REF`, this adds four pseudo instructions: `CATCH`, `CATCH_REF`, `CATCH_ALL`, and `CATCH_ALL_REF`. These are pseudo instructions to simulate block return values of `catch`, `catch_ref`, `catch_all`, `catch_all_ref` clauses in `try_table` respectively, given that we don't support block return values except for one case (`fixEndsAtEndOfFunction` in CFGStackify). These will be omitted when we lower the instructions to `MCInst` at the end.

LateEHPrepare now will have one more stage to convert `CATCH`/`CATCH_ALL`s to `CATCH_REF`/`CATCH_ALL_REF`s when there is a `RETHROW` to rethrow its exception. The pass also converts `RETHROW`s into `THROW_REF`. Note that we still use `RETHROW` as an interim pseudo instruction until we convert them to `THROW_REF` in LateEHPrepare.

CFGStackify has a new `placeTryTableMarker` function, which places `try_table`/`end_try_table` markers with a necessary `catch` clause and also `block`/`end_block` markers for the destination of the `catch` clause.

In MCInstLower, now we need to support one more case for the multivalue block signature (`catch_ref`'s destination's `(i32, exnref)` return type).

InstPrinter has a new routine to print the `catch_list` type, which is used to print `try_table` instructions.

The new test, `exception.ll`'s source is the same as `exception-legacy.ll`, with the FileCheck expectations changed. One difference is the commands in this file have `-wasm-enable-exnref` to test the new format, and don't have `-wasm-disable-explicit-locals -wasm-keep-registers`, because the new custom InstPrinter routine to print `catch_list` only works for the stack-based instructions (`_S`), and we can't use `-wasm-keep-registers` for them. As in `exception-legacy.ll`, the FileCheck lines for the new tests do not contain the whole program; they mostly contain only the control flow instructions for readability.
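The operand layout described above (`type`, `number_of_catches`, then one clause per catch, with a tag only for `CATCH` and `CATCH_REF`) can be walked as in the following sketch. It is only an illustration under simplifying assumptions: operands are modelled as plain integers rather than `MCOperand`s, and `CatchClause`, `TryTable`, and `decodeTryTable` are hypothetical names that do not appear in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Opcode values 0/1/2/3 as listed in the commit message.
enum CatchOpcode { Catch = 0, CatchRef = 1, CatchAll = 2, CatchAllRef = 3 };

// Hypothetical decoded form of one catch clause.
struct CatchClause {
  CatchOpcode Opcode;
  bool HasTag;
  std::int64_t Tag; // meaningful only when HasTag is true
  std::int64_t Destination;
};

// Hypothetical decoded form of a TRY_TABLE operand list.
struct TryTable {
  std::int64_t Type;
  std::vector<CatchClause> Catches;
};

// Decodes `type number_of_catches catch_clauses*` from a flat operand list.
// Returns false if the list is truncated, has trailing operands, or contains
// an unknown catch opcode.
bool decodeTryTable(const std::vector<std::int64_t> &Ops, TryTable &Out) {
  if (Ops.size() < 2 || Ops[1] < 0)
    return false;
  Out.Type = Ops[0];
  Out.Catches.clear();
  std::size_t I = 2;
  for (std::int64_t K = 0; K < Ops[1]; ++K) {
    if (I >= Ops.size() || Ops[I] < Catch || Ops[I] > CatchAllRef)
      return false;
    CatchClause C;
    C.Opcode = static_cast<CatchOpcode>(Ops[I++]);
    // Only CATCH and CATCH_REF carry a tag operand.
    C.HasTag = C.Opcode == Catch || C.Opcode == CatchRef;
    C.Tag = 0;
    if (C.HasTag) {
      if (I >= Ops.size())
        return false;
      C.Tag = Ops[I++];
    }
    if (I >= Ops.size())
      return false;
    C.Destination = Ops[I++];
    Out.Catches.push_back(C);
  }
  return I == Ops.size();
}
```

For example, the operand list `{64, 2, 0, 7, 1, 3, 0}` would decode as type 64 with two clauses: a `CATCH` of tag 7 branching to destination 1, and a `CATCH_ALL_REF` branching to destination 0.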
Commit 6bbf7f0
[clang][bytecode] Fix lookup of source locations in implicit ctors (llvm#107992)
Implicit functions may still have a body. The !hasBody() check is enough.
Commit d03822d
Reapply "[scudo] Fix the logic of MaxAllowedFragmentedPages" (llvm#108130) (llvm#108134)
This reverts commit 76151c4. Also changed to check MaxAllowedFragmentedPages.
Commit 323911d
[webkit.RefCntblBaseVirtualDtor] Make ThreadSafeRefCounted not generate warnings (llvm#107676)
This PR makes WebKit's RefCntblBaseVirtualDtor checker not generate a warning for ThreadSafeRefCounted when the destruction thread is a specific thread. Prior to this PR, we only allowed CRTP classes without a virtual destructor if its deref function had an explicit cast to the derived type, skipping any lambda declarations which aren't invoked. This ends up generating a warning for ThreadSafeRefCounted when a specific thread is used to destruct the object because there is no inline body / definition for ensureOnMainThread and ensureOnMainRunLoop and DerefFuncDeleteExprVisitor concludes that there is no explicit delete of the derived type. This PR relaxes the condition DerefFuncDeleteExprVisitor checks by allowing a delete expression to appear within a lambda declaration if it's an argument to an "opaque" function; i.e. a function without definition / body.
Commit 203a2ca
Bail out jump threading on indirect branches (llvm#103688)
The bug was introduced by llvm#68473. Fixes: llvm#102351
Commit 3c9022c
[llvm-debuginfod-find] Enable multicall driver (llvm#108082)
Migrate llvm-debuginfod-find tool to use GenericOptTable. Enable multicall driver.
Commit bc152fb
[Clang] Fix crash due to invalid source location in __is_trivially_equality_comparable (llvm#107815)
Fixes llvm#107777
Commit 6dbdb84
Commit cd0e867
[libc++][NFC] Replace _LIBCPP_NORETURN and TEST_NORETURN with [[noreturn]] (llvm#80455)
`[[__noreturn__]]` is now always available, so we can simply use the attribute directly instead of through a macro.
Commit 748023d
[InitUndef] Don't use largest super class (llvm#107885)
The InitUndef pass currently uses the getLargestSuperClass() hook (which is only used by that pass) to choose the register to initialize. This was done to reduce the number of undef init pseudos needed, e.g. so that the vrnov0 regclass would use the same pseudo as v0. After llvm#106744 we use a single generic pseudo, so this is no longer necessary.
Commit 1e3a24d
[lldb][test] Add test for printing std::string through expression evaluator
This would've caught the failures in llvm#105865 in the libc++ data-formatter CI.
Commit 19f604e
[MemCpyOpt] Allow memcpy elision for non-noalias arguments (llvm#107860)
We currently elide memcpys for readonly nocapture noalias arguments. noalias is checked to make sure that there are no other ways to write the memory, e.g. through a different argument or an escaped pointer. In addition to the current noalias check, also query alias analysis, in case it can prove that modification is not possible through other means. This fixes the problem reported in https://discourse.llvm.org/t/problem-about-memcpy-elimination/81121.
Commit 2afe678
Commit 34cab2e
[mlir][Linalg] Add speculation for LinalgStructuredOps (llvm#108032)
This patch adds speculation behavior for linalg structured ops, allowing them to be hoisted out of loops using LICM.
Commit c9aa55d
[flang][debug] Handle 'used' module. (llvm#107626)
As described in llvm#98883, we have to qualify a module variable name in the debugger to get its value. This PR tries to remove this limitation. LLVM provides `DIImportedEntity` to handle such cases, but the PR is made more complicated by the following 2 issues.
1. The MLIR attributes are read-only and we have a circular dependency here. This has to be handled using the recursive interface provided by MLIR. This requires us to first create a placeholder `DISubprogramAttr`, which is used in creating `DIImportedEntityAttr`. Later another `DISubprogramAttr` is created which replaces the placeholder.
2. The flang IR does not provide any information about the 'used' module, so this has to be extracted by doing a pass over the `DeclareOp` in the function. This presents certain limitations, as 'only' and module variable renaming may not be handled properly.

Due to the change in `DISubprogramAttr`, some tests also needed to be adjusted. Fixes llvm#98883.
Commit db64e69
Commit 3001617
[LoopDeletion] Unblock loop deletion with `llvm.experimental.noalias.scope.decl` (llvm#108144)
Since `llvm.experimental.noalias.scope.decl` is marked as `memory(inaccessiblemem: readwrite)`, we cannot treat this annotation intrinsic as having no side effects. It will block loop deletion when this intrinsic exists inside a dead loop: https://github.com/llvm/llvm-project/blob/3dad29b677e427bf69c035605a16efd065576829/llvm/lib/Transforms/Scalar/LoopDeletion.cpp#L103-L110 This patch marks `llvm.experimental.noalias.scope.decl` as droppable to address the issue. Fixes llvm#108052.
Commit b4bb2f8
[RISCV][doc] Add note to RISCVUsage about supported atomics ABIs (llv…
…m#103879) I've tried to avoid giving too much detailed explanation as the psABI docs are the better source for this.
Commit 596e7cc
[mlir] Add dependent TensorDialect to ConvertVectorToLLVM pass (llvm#…
…108045) This patch registers the tensor dialect as a dependent dialect of the ConvertVectorToLLVM pass. This fixes a crash when `vector.transfer_write` is used with a dynamic tensor type. The MaterializeTransferMask pattern would call `vector::createOrFoldDimOp`, which creates a `tensor.dim` operation. Fixes llvm#107805.
Commit a8f3d30
[mlir][vector] Support for extracting 1-element vectors in VectorExtr…
…actOpConversion (llvm#107549) This patch adds support for converting `vector.extract` ops that extract 1-element vectors into LLVM, fixing a crash in such cases. E.g., `vector.extract %1[0]: vector<1xf32> from vector<2xf32>`. Fix llvm#61372.
Commit a4b0153
[AMDGPU] Fix leak and self-assignment in copy assignment operator (ll…
…vm#107847) A static analyzer identified that this operator was unsafe in the case of self-assignment. In the placement new statement, StringValue's copy constructor was being implicitly called, which received a reference to "itself". In fact, it was being passed an old StringValue at the same address - one whose lifetime had already ended. The copy constructor was thus copying fields from a dead object. We need to be careful when switching active union members, and calling the destructor on the old StringValue will avoid memory leaks which I believe the old code exhibited.
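The hazard described in this commit message can be illustrated with a minimal sketch. The `Value` class below is hypothetical and is not the AMDGPU type touched by the patch; it only assumes a tagged union with one non-trivial member (`std::string`). The copy assignment operator returns early on self-assignment and ends the old string's lifetime before switching the active union member, so nothing is copied from a dead object and nothing leaks.

```cpp
#include <cassert>
#include <new>
#include <string>
#include <utility>

// Hypothetical tagged union (not the type from the commit).
class Value {
public:
  Value() : IsString(false), Int(0) {}
  explicit Value(int I) : IsString(false), Int(I) {}
  explicit Value(std::string S) : IsString(true), Str(std::move(S)) {}

  Value(const Value &Other) : IsString(Other.IsString) {
    if (IsString)
      new (&Str) std::string(Other.Str); // Start the string's lifetime.
    else
      Int = Other.Int;
  }

  Value &operator=(const Value &Other) {
    if (this == &Other) // Self-assignment: nothing to do.
      return *this;
    if (IsString && Other.IsString) {
      Str = Other.Str; // Same active member: ordinary assignment.
      return *this;
    }
    if (IsString)
      Str.~basic_string(); // End the old member's lifetime first (no leak).
    IsString = Other.IsString;
    if (IsString)
      new (&Str) std::string(Other.Str); // Copies from a live object only.
    else
      Int = Other.Int;
    return *this;
  }

  ~Value() {
    if (IsString)
      Str.~basic_string();
  }

  bool isString() const { return IsString; }
  const std::string &str() const { return Str; }
  int intValue() const { return Int; }

private:
  bool IsString;
  union {
    int Int;
    std::string Str;
  };
};
```

Without the early return, a placement new into `Str` would run after the destructor call and copy from the object it had just destroyed, which is the situation the static analyzer flagged.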
Commit f4dd1bc
Commit 2e4e918
Commit c4a00be
Commit 935b9f6
Commit 7041163
Commit 7e0008d
Commit e1ee07d
[lldb][test] Add test for no_unique_address when mixed with bitfields (…
…llvm#108155) This is the root cause of the LLDB failures that started occurring after llvm#105865. The DWARFASTParserClang has logic to try to derive unnamed bitfields from DWARF offsets. In this case we treat `padding` as a 1-byte size field that would overlap with `flag`, and decide we need to introduce an unnamed bitfield into the AST, which is incorrect.
Commit da69449
[MLIR][OpenMP] Automate operand structure definition (llvm#99508)
This patch adds the "gen-openmp-clause-ops" `mlir-tblgen` generator to produce the structure definitions previously in OpenMPClauseOperands.h automatically from the information contained in OpenMPOps.td and OpenMPClauses.td. The original header is maintained to enable the definition of similar structures that are not directly related to any single `OpenMP_Clause` or `OpenMP_Op` tablegen definition.
Commit 2f3d061
[clang] Diagnose dangling issues for the "Container<GSLPointer>" case. (
llvm#107213) This pull request enhances the GSL lifetime analysis to detect situations where a dangling `Container<GSLPointer>` object is constructed:

```cpp
std::vector<std::string_view> bad = {std::string()}; // dangling
```

The assignment case is not yet supported, but it will be addressed in a follow-up. Fixes llvm#100526 (excluding the `push_back` case).
Commit e50131a
[MLIR][Python] Python binding support for IntegerSet attribute (llvm#…
…107640) Support IntegerSet attribute python binding.
Commit 334873f
Set dllimport on Objective C ivar offsets (llvm#107604)
Ensures that offsets for instance variables are marked with `dllimport` if the interface to which they belong has this attribute.
Commit 7c25ae8
[mlir] Fix 'StringSet' may not intend to support class template argum…
…ent deduction (NFC) /llvm-project/mlir/tools/mlir-tblgen/OmpOpGen.cpp:202:3: error: 'StringSet' may not intend to support class template argument deduction [-Werror,-Wctad-maybe-unsupported] llvm::StringSet superClasses; ^ /llvm-project/llvm/include/llvm/ADT/StringSet.h:23:7: note: add a deduction guide to suppress this warning class StringSet : public StringMap<std::nullopt_t, AllocatorTy> { ^
Commit b35bb7b
[mlir] Fix -Wunused-variable in OmpOpGen.cpp (NFC)
/llvm-project/mlir/tools/mlir-tblgen/OmpOpGen.cpp:239:8: error: unused variable 'isAttr' [-Werror,-Wunused-variable] bool isAttr = superClasses.contains("Attr"); ^
Commit 0856f12
[SPIR-V] Address the case when optimization uses GEP operator and Gen…
…Code creates G_PTR_ADD to convey the semantics (llvm#107880) When running SPIR-V Backend with optimization levels higher than 0, we observe GEP operators as a new factor, massively used to convey the semantics of the original LLVM IR. Previously, an issue related to the GEP operator was mentioned and fixed on the consumer side of toolchains (see, for example, Khronos Translator Issue KhronosGroup/SPIRV-LLVM-Translator#2486 and PR KhronosGroup/SPIRV-LLVM-Translator#2487). However, there is a case when GenCode creates G_PTR_ADD to convey the original semantics under optimization levels higher than 0 where it's SPIR-V Backend that fails to translate source LLVM IR correctly. Consider the following reproducer:

```
%struct = type { i32, [257 x i8], [257 x i8], [129 x i8], i32, i64, i64, i64, i64, i64, i64 }
@mem = linkonce_odr dso_local addrspace(1) global %struct zeroinitializer, align 8
define weak dso_local spir_func void @__devicelib_assert_fail(ptr addrspace(4) noundef %expr, i32 noundef %line, i1 %fl) {
entry:
  %cmp = icmp eq i32 %line, 0
  br i1 %cmp, label %lbl, label %exit
lbl:
  store i32 %line, ptr addrspace(1) getelementptr inbounds (i8, ptr addrspace(1) @mem, i64 648), align 8
  br i1 %fl, label %lbl, label %exit
exit:
  ret void
}
```

converted to the following machine instructions by SPIR-V Backend:

```
%4:type(s64) = OpTypeInt 32, 0
%22:type(s64) = OpTypePointer 5, %4:type(s64)
%2:type(s64) = OpTypeInt 8, 0
%28:type(s64) = OpTypePointer 5, %2:type(s64)
%10:pid(p1) = G_GLOBAL_VALUE @mem
%36:type(s64) = OpTypeStruct %4:type(s64), %32:type(s64), %32:type(s64), %34:type(s64), %4:type(s64), %35:type(s64), %35:type(s64), %35:type(s64), %35:type(s64), %35:type(s64), %35:type(s64)
%37:iid(s32) = G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.spv.const.composite)
%8:iid(s32) = ASSIGN_TYPE %37:iid(s32), %36:type(s64)
G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.spv.init.global), %10:pid(p1), %8:iid(s32)
%29:pid(p1) = nuw G_PTR_ADD %10:pid, %16:iid(s64)
%15:pid(p1) = nuw ASSIGN_TYPE %29:pid(p1), %28:type(s64)
%27:pid(p2) = G_BITCAST %15:pid(p1)
%17:pid(p2) = ASSIGN_TYPE %27:pid(p2), %22:type(s64)
G_STORE %1:iid(s32), %17:pid(p2) :: (store (s32) into %ir.3, align 8, addrspace 1)
```

On the next stage of instruction selection this `G_PTR_ADD`-related pattern would be interpreted as an initialization of a global variable and converted to an invalid constant GEP pattern that, in its turn, would fail to be verified by LLVM during back translation from SPIR-V to LLVM IR. This PR introduces a fix for the problem by adding one more case of `G_PTR_ADD` translation, when we use a non-const GEP to convey the meaning. The reproducer is attached as a new test case.
Commit ed22029
[X86] combineSubABS - handle NEG(ABD()) expanded patterns
combineSubABS already handles the "(sub Y, cmovns X, -X) -> (add Y, cmovns -X, X)" fold by flipping the cmov operands. We can do something similar for the negation of ABDS/U patterns which have been expanded to a CMOVL/CMOVB with a pair of commuted subtractions: "NEG(ABD(X,Y)) -> NEG(CMOV(SUB(X,Y),SUB(Y,X))) -> CMOV(SUB(Y,X),SUB(X,Y))"
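The fold quoted above rests on a simple identity: the two subtractions are negations of each other, so negating the selected value is the same as swapping the select operands. The sketch below checks that identity in plain C++ for the signed case; the function names are hypothetical and the subtractions are done in `uint32_t` so wraparound is well defined. It models the arithmetic only, not the DAG combine itself.

```cpp
#include <cassert>
#include <cstdint>

// ABDS expanded as a select between two commuted subtractions.
uint32_t abdsExpanded(int32_t X, int32_t Y) {
  uint32_t XY = static_cast<uint32_t>(X) - static_cast<uint32_t>(Y);
  uint32_t YX = static_cast<uint32_t>(Y) - static_cast<uint32_t>(X);
  return X < Y ? YX : XY;
}

// NEG(ABD(X, Y)) computed the obvious way: select, then negate.
uint32_t negAbdsNaive(int32_t X, int32_t Y) {
  return 0u - abdsExpanded(X, Y);
}

// The folded form: same compare, operands of the select swapped, no negate.
uint32_t negAbdsFolded(int32_t X, int32_t Y) {
  uint32_t XY = static_cast<uint32_t>(X) - static_cast<uint32_t>(Y);
  uint32_t YX = static_cast<uint32_t>(Y) - static_cast<uint32_t>(X);
  return X < Y ? XY : YX;
}
```

Because `YX` equals `0 - XY` modulo 2^32, both functions agree for every input, including the extremes of `int32_t`.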
Commit 1b0400e
Commit b9c2e2e
[Docs][clang-query] disclose Windows linetab bug on clang-query tab a…
…uto-complete (llvm#107956) As per llvm#106672 and llvm#107377, the documentation should be updated to note the current bug on Windows involving ``LineEditor`` that causes Tab-key-related features to not work. Fixes llvm#107377
Commit 80fcab8
DXIL: Use correct type ID when writing ValueAsMetadata. (llvm#94337)
When emitting references to functions as part of `ValueAsMetadata`, we currently emit the incorrect (typed) pointer, resulting in crashes during deserialization. Avoid this by correctly mapping the type during serialization.
Commit 49b57df
[LLD][COFF] Add support for ARM64EC import call thunks. (llvm#107931)
These thunks can be accessed using `__impchk_*` symbols, though they are typically not called directly. Instead, they are used to populate the auxiliary IAT. When the imported function is x86_64 (or an ARM64EC function with a patched export thunk), the thunk is used to call it. Otherwise, the OS may replace the thunk at runtime with a direct pointer to the ARM64EC function to avoid the overhead.
Commit 99a2354
Avoid exposing password and token from git repositories (llvm#105220)
Try to detect if the git remote URL has a password or a GitHub token and return an error teaching the user how to avoid leaking their password or token.
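As a rough illustration of what such a check could look like, the sketch below inspects the userinfo part of a remote URL. The helper name `remoteUrlHasSecret` is hypothetical, and the rules it applies (a `user:password` pair, or a user name starting with the GitHub token prefixes `ghp_` or `github_pat_`) are assumptions made for this example; the actual detection in the commit may differ.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true when the URL's userinfo looks like it carries a
// password or a GitHub token, e.g. "https://user:secret@host/repo.git".
bool remoteUrlHasSecret(const std::string &Url) {
  const std::size_t Npos = std::string::npos;
  std::size_t SchemeEnd = Url.find("://");
  if (SchemeEnd == Npos)
    return false; // scp-like "git@host:path" syntax has no password field.

  // The authority is everything between "://" and the next '/'.
  std::size_t Begin = SchemeEnd + 3;
  std::size_t End = Url.find('/', Begin);
  std::string Authority =
      Url.substr(Begin, End == Npos ? Npos : End - Begin);

  std::size_t At = Authority.rfind('@');
  if (At == Npos)
    return false; // No userinfo at all.

  std::string UserInfo = Authority.substr(0, At);
  // Assumed rules: "user:password", or a token used as the user name.
  return UserInfo.find(':') != Npos ||
         UserInfo.rfind("ghp_", 0) == 0 ||
         UserInfo.rfind("github_pat_", 0) == 0;
}
```

A caller could report an error and suggest a credential helper or an SSH remote whenever this returns true, rather than recording the URL verbatim.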
Commit 5904448
[TableGen] Migrate Option Emitters to const RecordKeeper (llvm#107696)
Migrate Opt/OptRST Emitters to const RecordKeeper. This is part of an effort to have better const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
Commit 5f25b89
Commit 6043321
[mlir] [tblgen-to-irdl] Refactor tblgen-to-irdl script and support mo…
…re types (llvm#105505) Refactors the tblgen-to-irdl script slightly and adds support for
- Various integer types
- Various Float types
- Confined types
- Complex types (with fixed element type)

Also doesn't add the operand and result ops if they are empty. I could potentially split this into smaller PRs if that'd be helpful (refactor + integer/float/complex, confined type, optional operand/result). @math-fehr
Commit 135bd31
[NFC][Clang][SVE] Refactor AArch64SVEACLETypes.def to enabled more us…
…es. (llvm#107599) Some switch statements require all SVE builtin types to be manually specified. This patch refactors the SVE_*_TYPE macros so that such code can be generated during preprocessing. I've tried to establish a minimal interface that covers all types where no special information is required and then created a set of macros that are dedicated to specific datatypes (i.e. int, float). This patch is groundwork to simplify the changing of SVE tuple types to become struct based as well as work to support the FP8 ACLE.
Commit 2a130f1
[lldb][test] Toolchain detection rewrite in Python (llvm#102185)
This fix is based on a problem with the cxx_compiler and cxx_linker macros on Windows. There was an issue with compiler detection in paths containing "icc": in such cases, Makefile.rules assumed it had been given the icc compiler. To solve that, utility detection has been rewritten in Python. The last element of the compiler's path is separated, taking into account the platform path delimiter, and the compiler type is extracted, allowing for a possible cross-toolchain prefix. --------- Co-authored-by: Pavel Labath
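The idea of looking only at the last path element can be sketched as follows. The commit implements this in Python; the version below is written in C++ purely for illustration, and the function name `compilerKind` and the list of recognized names are assumptions for this example, not the test suite's actual logic.

```cpp
#include <cassert>
#include <initializer_list>
#include <string>

// Hypothetical sketch: classify a compiler by the last element of its path,
// so a directory named "icc" cannot be mistaken for the icc compiler.
std::string compilerKind(const std::string &CompilerPath, char Separator) {
  std::size_t LastSep = CompilerPath.rfind(Separator);
  std::string Name = LastSep == std::string::npos
                         ? CompilerPath
                         : CompilerPath.substr(LastSep + 1);
  // Substring search tolerates a cross-toolchain prefix such as
  // "arm-linux-gnueabi-gcc" and suffixes such as ".exe" or "++".
  for (const char *Kind : {"clang", "gcc", "icc"})
    if (Name.find(Kind) != std::string::npos)
      return Kind;
  return "unknown";
}
```

Passing the separator explicitly stands in for "taking into account the platform path delimiter"; a real implementation would pick it from the host platform.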
Commit 44fc987
[GlobalIsel] Combine trunc of binop (llvm#107721)
trunc (binop X, C) --> binop (trunc X, trunc C) --> binop (trunc X, C`) Try to narrow the width of math or bitwise logic instructions by pulling a truncate ahead of binary operators. Vx and Nx cores consider 32-bit and 64-bit basic arithmetic equal in costs.
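The rewrite is sound for operators whose low result bits depend only on the low bits of the inputs (add, sub, mul, and, or, xor). The sketch below checks the identity for add and xor with a 64-bit to 32-bit truncation; the function names are hypothetical, and which opcodes the combine actually handles is not stated in the message.

```cpp
#include <cassert>
#include <cstdint>

// trunc (add X, C): add at 64 bits, then keep the low 32 bits.
uint32_t truncOfAdd(uint64_t X, uint64_t C) {
  return static_cast<uint32_t>(X + C);
}

// add (trunc X), C': truncate first, with C' = trunc C.
uint32_t addOfTrunc(uint64_t X, uint64_t C) {
  return static_cast<uint32_t>(X) + static_cast<uint32_t>(C);
}

// The same pair for a bitwise operator.
uint32_t truncOfXor(uint64_t X, uint64_t C) {
  return static_cast<uint32_t>(X ^ C);
}

uint32_t xorOfTrunc(uint64_t X, uint64_t C) {
  return static_cast<uint32_t>(X) ^ static_cast<uint32_t>(C);
}
```

The identity does not extend to operations such as right shifts or division, where high input bits influence the low result bits.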
Commit ba4bcce
[flang][lowering] handle procedure pointers with generic name (llvm#1…
…08043) Handle procedure pointer with the same name as generics in lowering to avoid crashes after llvm#107928.
Commit b88aced
Commit 7be6ea1
Commit 7dfaedf
Commit 4b1b450
Commit 6ffa7cd
Commit 01967e2
Commit 7a30b9c
[TableGen] Fix MacOS failure in Option Emitter. (llvm#108225)
Handle the case of same pointer used as both inputs to the `CompareOptionRecords`, to avoid emitting errors for equivalent options. Follow-up to llvm#107696.
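A comparator that reports duplicates needs to tolerate being called with the same element on both sides, which some sort implementations do. The sketch below shows that guard on a hypothetical `Option` struct; the struct, its fields, and the `SawDuplicate` flag are illustrative assumptions and do not mirror the real `CompareOptionRecords` signature.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an option record.
struct Option {
  std::string Name;
  int Kind;
};

// Three-way comparison that flags two *distinct* records comparing equal.
int compareOptions(const Option *A, const Option *B, bool &SawDuplicate) {
  if (A == B)
    return 0; // An element compared with itself is not a duplicate.
  if (int Cmp = A->Name.compare(B->Name))
    return Cmp < 0 ? -1 : 1;
  if (A->Kind != B->Kind)
    return A->Kind < B->Kind ? -1 : 1;
  SawDuplicate = true; // Different records, identical contents.
  return 0;
}
```

Without the pointer check, a self-comparison would fall through to the duplicate branch and produce a spurious error.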
Commit ccc4fa1
[clang][bytecode] Check for Pointer dereference in EvaluationResult (l…
…lvm#108207) We will deref<>() it later, so this is the right check.
Commit 35f7cfb
[DAG] Add test coverage for ABD "sub of selects" patterns based off l…
…lvm#53045 Add tests for "sub(select(icmp(a,b),a,b),select(icmp(a,b),b,a)) -> abd(a,b)" patterns that still fail to match to abd nodes. This will hopefully be helped by llvm#108218
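For reference, the pattern being tested has this shape in scalar code. The sketch assumes an unsigned greater-than compare, which is one of several predicates the pattern could use, and the function names are hypothetical; it shows only that the "sub of selects" form computes an absolute difference.

```cpp
#include <cassert>
#include <cstdint>

// sub(select(icmp(a,b),a,b), select(icmp(a,b),b,a)) with an unsigned '>'.
uint32_t subOfSelects(uint32_t A, uint32_t B) {
  bool Cmp = A > B;
  uint32_t Hi = Cmp ? A : B; // select(cmp, a, b)
  uint32_t Lo = Cmp ? B : A; // select(cmp, b, a)
  return Hi - Lo;
}

// Reference unsigned absolute difference.
uint32_t abdu(uint32_t A, uint32_t B) {
  return A > B ? A - B : B - A;
}
```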
Commit 43da8a7
AMDGPU: Add tests for minimumnum/maximumnum intrinsics
Vector cases are broken, so leave those for later.
Commit ee61a4d
[LV] Generalize check lines for interleave group costs.
Check cost of all instructions in an interleave group, to prepare for follow-up changes.
Commit 1741b9c
[Coroutines] Split buildCoroutineFrame into normalization and frame b…
…uilding (llvm#108076)
* Split buildCoroutineFrame into code related to normalization and code related to actually building the coroutine frame.
* This will enable future specialization of buildCoroutineFrame for different ABIs while the normalization can be done by splitCoroutine prior to calling buildCoroutineFrame.

See RFC for more info: https://discourse.llvm.org/t/rfc-abi-objects-for-coroutines/81057
Commit 9a9f155
[RISCV] Expand mul X, C where C=2^N*(3,5,9)*(3,5,9) (llvm#108100)
This is a three deep expression which is deeper than we've otherwise gone for multiple expansions, but I think it's reasonable to do so. This covers mul by 50, 100, and 200 which are reasonably common naturally arising numbers.
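To make the "three deep" shape concrete, the sketch below decomposes the constants named in the message: 50, 100, and 200 are all 25 times a power of two, and 25 is 5 times 5. It assumes the factors 3, 5, and 9 come from Zba-style shift-and-add instructions (`sh1add`, `sh2add`, `sh3add`), modelled here by a hypothetical `shNadd` helper; the actual instruction selection in the patch may differ.

```cpp
#include <cassert>
#include <cstdint>

// Models shNadd rd, rs1, rs2: (rs1 << N) + rs2, for N in {1, 2, 3}.
uint64_t shNadd(uint64_t Rs1, uint64_t Rs2, unsigned N) {
  return (Rs1 << N) + Rs2;
}

// X * 25 as two chained sh2add steps: 5 * (5 * X).
uint64_t mulBy25(uint64_t X) {
  uint64_t T = shNadd(X, X, 2); // 5 * X
  return shNadd(T, T, 2);       // 25 * X
}

// C = 2^N * 5 * 5: one extra shift makes the expression three deep.
uint64_t mulBy50(uint64_t X) { return mulBy25(X) << 1; }
uint64_t mulBy100(uint64_t X) { return mulBy25(X) << 2; }
uint64_t mulBy200(uint64_t X) { return mulBy25(X) << 3; }
```

Each result takes three dependent operations instead of a multiply, which is the trade-off the message describes as reasonable.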
Commit 65e0574
Revert "[flang][runtime] Fix odd "invalid descriptor" runtime crash (l…
…lvm#107785)" This reverts commit 15106c2. Commit does not pass check-flang on x86 host.
Commit 050f785
[AMDGPU][True16][MC] 16bit vsrc and vdst support in MC (llvm#104510)
This is a large patch that includes the MC-level support for V_CVT_F16_F32, V_CVT_F32_F16 and V_LDEXP_F16 in true16 format. It includes the asm/disasm changes to encode/decode the 16-bit vsrc, vdst and src modifiers for the vop and dpp formats. This patch is a dependency for many 16-bit instructions, but only three instructions are updated to make it easier to review. There will be another patch to support these three instructions at the codegen level; this patch just replaces these two instructions with their fake16 format.
Commit 35e27c0
[AMDGPU] Remove dead code in SIISelLowering (NFC) (llvm#108198)
This return is dead code as the return just above will always be taken.
Commit ccc52a8
Fix mistake in comment regarding dyn_cast_or_null (llvm#108026)
There was a mistake in a comment regarding the deprecation of dyn_cast_or_null. It suggested using cast_if_present instead of dyn_cast_or_null, but that was probably a copy-paste mistake; dyn_cast_if_present is the function that should be used instead of dyn_cast_or_null. Authored-by: Ofri Frishman
Commit 2a4992e
[clang][transformer] Make `describe()` terser for `NamedDecl`s. (llvm#108215)
Right now `describe()`ing a `FunctionDecl` dups the whole code of the function. Dump only its name.
Commit 512ceca
[LV] Amend check for IV increments in collectUsersInEntryBlock (llvm#…
…108020) The check for IV increments in collectUsersInEntryBlock currently triggers for exit-block PHIs which use the IV start value, resulting in us failing to add the input value for the middle block to these PHIs. Fix this by amending the check for IV increments to only include incoming values that are instructions inside the loop. Fixes llvm#108004
Commit 7858e14
[TableGen] Change CodeGenInstruction record members to const (llvm#10…
…7921) Change CodeGenInstruction::{TheDef, InferredFrom} to const pointers. This is part of an effort to have better const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
Commit 3786568
[lldb] Print a warning on checksum mismatch (llvm#107968)
Print a warning when the debugger detects a mismatch between the MD5 checksum in the DWARF 5 line table and the file on disk. The warning is printed only once per file.
Commit ffa2f53
[TableGen] Change SubtargetFeatureInfo to use const Record pointers (l…
…lvm#108013) Change SubtargetFeatureInfo to use const Record pointers. This is part of an effort to have better const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
Commit 2b452b4
[TableGen] Change CodeGenRegister to use const Record pointer (llvm#1…
…08027) Change CodeGenRegister to use const Record pointer. This is part of an effort to have better const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
Commit 7c6592f
[clang][TableGen] Change ASTTableGen to use const Record pointers (ll…
…vm#108193) Change ASTTableGen to use const Record pointers. This is part of an effort to have better const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
Commit 463c9d2
[clang][TableGen] Change Builtins emitter to use const RecordKeeper (l…
…lvm#108195) Change Builtins emitter to use const RecordKeeper. This is part of an effort to have better const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
Commit 970e2c1
Commit ff7eb1d
[ADT][NFC] Clang-format DenseMap and DenseSet (llvm#108162)
This is a preparation for upcoming changes to Dense[Map|Set] regarding hardening against OOM scenarios (see [this RFC](https://discourse.llvm.org/t/rfc-malfunction-safe-densemap-denseset/81036/7)). We have changed a lot of code inside Dense[Map|Set] and this preparation change helps to isolate the relevant parts from pure formatting stuff.
Commit 3cfc733
Commit d5bc1f4
[RISCV] Reorder zvfbfmin operation actions to match zvfhmin. NFC
This makes it slightly easier to see what's different between the two.
Commit 30fbfe5
[AMDGPU] Simplify and improve codegen for llvm.amdgcn.set.inactive (l…
…lvm#107889) Always generate v_cndmask_b32 instead of modifying exec around v_mov_b32. This is expected to be faster because modifying exec generally causes pipeline stalls.
Commit e55d6f5
Commits on Sep 26, 2024
Commit 6118705