Kernel-supporting jll #389

Open
wants to merge 18 commits into
base: main
Conversation

@wsmoses (Member) commented Dec 17, 2024

No description provided.

@@ -131,6 +131,7 @@ function __init__()
end
end

@ccall MLIR.API.mlir_c.RegisterCustomCallTarget("enzymexla_gpu"::Cstring, cglobal((:EnzymeGPUCustomCall, MLIR.API.mlir_c))::Ptr{Cvoid}, "CUDA"::Cstring)::Cvoid

[JuliaFormatter] reported by reviewdog 🐶

Suggested change
- @ccall MLIR.API.mlir_c.RegisterCustomCallTarget("enzymexla_gpu"::Cstring, cglobal((:EnzymeGPUCustomCall, MLIR.API.mlir_c))::Ptr{Cvoid}, "CUDA"::Cstring)::Cvoid
+ @ccall MLIR.API.mlir_c.RegisterCustomCallTarget(
+     "enzymexla_gpu"::Cstring,
+     cglobal((:EnzymeGPUCustomCall, MLIR.API.mlir_c))::Ptr{Cvoid},
+     "CUDA"::Cstring,
+ )::Cvoid

Comment on lines +320 to 321
Reactant.@reactant_overlay @noinline function (func::LLVMFunc{F,tt})(args...; convert=Val(false), blocks::CuDim=1, threads::CuDim=1,
cooperative::Bool=false, shmem::Integer=0, call_kwargs...) where{F, tt}

[JuliaFormatter] reported by reviewdog 🐶

Suggested change
- Reactant.@reactant_overlay @noinline function (func::LLVMFunc{F,tt})(args...; convert=Val(false), blocks::CuDim=1, threads::CuDim=1,
-     cooperative::Bool=false, shmem::Integer=0, call_kwargs...) where{F, tt}
+ Reactant.@reactant_overlay @noinline function (func::LLVMFunc{F,tt})(
+     args...;
+     convert=Val(false),
+     blocks::CuDim=1,
+     threads::CuDim=1,
+     cooperative::Bool=false,
+     shmem::Integer=0,
+     call_kwargs...,
+ ) where {F,tt}

Comment on lines 21 to +22
@show @code_hlo optimize=false square!(A)
@show @code_hlo optimize=:before_kernel square!(A)

[JuliaFormatter] reported by reviewdog 🐶

Suggested change
- @show @code_hlo optimize=false square!(A)
- @show @code_hlo optimize=:before_kernel square!(A)
+ @show @code_hlo optimize = false square!(A)
+ @show @code_hlo optimize = :before_kernel square!(A)

Comment on lines +356 to +357
for idx in (blockdim.x, blockdim.y, blockdim.z, threaddim.x, threaddim.y, threaddim.z, shmem)
push!(operands, Reactant.TracedUtils.promote_to(Reactant.TracedRNumber{Int}, idx).mlir_data)

[JuliaFormatter] reported by reviewdog 🐶

Suggested change
- for idx in (blockdim.x, blockdim.y, blockdim.z, threaddim.x, threaddim.y, threaddim.z, shmem)
-     push!(operands, Reactant.TracedUtils.promote_to(Reactant.TracedRNumber{Int}, idx).mlir_data)
+ for idx in
+     (blockdim.x, blockdim.y, blockdim.z, threaddim.x, threaddim.y, threaddim.z, shmem)
+     push!(
+         operands,
+         Reactant.TracedUtils.promote_to(Reactant.TracedRNumber{Int}, idx).mlir_data,
+     )

push!(operands, Reactant.TracedUtils.promote_to(Reactant.TracedRNumber{Int}, idx).mlir_data)
end
for arg in mlir_args
push!(operands, arg)

[JuliaFormatter] reported by reviewdog 🐶

Suggested change (whitespace only)
- push!(operands, arg)
+ push!(operands, arg)

Comment on lines +365 to +366
MLIR.IR.NamedAttribute("fn", MLIR.IR.FlatSymbolRefAttribute(Base.String(fname))),
MLIR.IR.NamedAttribute("output_operand_aliases", MLIR.IR.Attribute(output_operand_aliases))

[JuliaFormatter] reported by reviewdog 🐶

Suggested change
- MLIR.IR.NamedAttribute("fn", MLIR.IR.FlatSymbolRefAttribute(Base.String(fname))),
- MLIR.IR.NamedAttribute("output_operand_aliases", MLIR.IR.Attribute(output_operand_aliases))
+ MLIR.IR.NamedAttribute("fn", MLIR.IR.FlatSymbolRefAttribute(Base.String(fname))),
+ MLIR.IR.NamedAttribute(
+     "output_operand_aliases", MLIR.IR.Attribute(output_operand_aliases)
+ ),


call = MLIR.Dialects.stablehlo.custom_call(mlir_args; result_0=restys, call_target_name="reactant_gpu_call", output_operand_aliases, backend_config=MLIR.IR.Attribute(fname))
# call = MLIR.Dialects.stablehlo.custom_call(mlir_args; result_0=restys, call_target_name="reactant_gpu_call", output_operand_aliases, backend_config=MLIR.IR.Attribute(func.mod))
for (i, res) in enumerate(rarrays)
res.mlir_data = transpose_val(MLIR.IR.result(call, i))

[JuliaFormatter] reported by reviewdog 🐶

Suggested change (whitespace only)
- res.mlir_data = transpose_val(MLIR.IR.result(call, i))
+ res.mlir_data = transpose_val(MLIR.IR.result(call, i))

@@ -379,7 +394,7 @@ function compiler_cache(ctx::MLIR.IR.Context)
return cache
end

- Reactant.@reactant_override @noinline function CUDA.cufunction(f::F, tt::TT=Tuple{}; kwargs...) where {F,TT}
+ Reactant.@reactant_overlay @noinline function CUDA.cufunction(f::F, tt::TT=Tuple{}; kwargs...) where {F,TT}

[JuliaFormatter] reported by reviewdog 🐶

Suggested change
- Reactant.@reactant_overlay @noinline function CUDA.cufunction(f::F, tt::TT=Tuple{}; kwargs...) where {F,TT}
+ Reactant.@reactant_overlay @noinline function CUDA.cufunction(
+     f::F, tt::TT=Tuple{}; kwargs...
+ ) where {F,TT}

@@ -379,7 +394,7 @@ function compiler_cache(ctx::MLIR.IR.Context)
return cache
end

- Reactant.@reactant_override @noinline function CUDA.cufunction(f::F, tt::TT=Tuple{}; kwargs...) where {F,TT}
+ Reactant.@reactant_overlay @noinline function CUDA.cufunction(f::F, tt::TT=Tuple{}; kwargs...) where {F,TT}
res = Base.@lock CUDA.cufunction_lock begin
# compile the function
cache = compiler_cache(MLIR.IR.context())

[JuliaFormatter] reported by reviewdog 🐶

Suggested change (whitespace only)
- cache = compiler_cache(MLIR.IR.context())
+ cache = compiler_cache(MLIR.IR.context())

@@ -304,7 +306,28 @@ function compile_mlir!(mod, f, args; optimize::Union{Bool,Symbol}=true)

optimize isa Bool && (optimize = ifelse(optimize, :all, :none))

toolkit = ""
if isdefined(Reactant_jll, :ptxas_path)
toolkit = Reactant_jll.ptxas_path[1:end-length("/bin/ptxas")]

[JuliaFormatter] reported by reviewdog 🐶

Suggested change
- toolkit = Reactant_jll.ptxas_path[1:end-length("/bin/ptxas")]
+ toolkit = Reactant_jll.ptxas_path[1:(end - length("/bin/ptxas"))]
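
As a hedged aside on this line: the slice recovers the toolkit root by dropping a fixed `/bin/ptxas` suffix from the path. A minimal sketch of the same derivation using `dirname`, assuming the path always ends in `bin/ptxas`. The path literal and the `toolkit_root` helper are hypothetical and not part of Reactant or Reactant_jll:

```julia
# Hypothetical ptxas location, used only for illustration.
ptxas_path = "/opt/cuda/bin/ptxas"

# String-slicing form, mirroring the line under review.
sliced = ptxas_path[1:(end - length("/bin/ptxas"))]

# Path-based form: strip the file name, then the enclosing `bin` directory.
toolkit_root(path::AbstractString) = dirname(dirname(path))

@show sliced
@show toolkit_root(ptxas_path)
```

Both forms agree for this input. The `dirname` form avoids hard-coding the suffix length, which may be easier to read, though whether it is preferable here is a judgment call for the maintainers.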

"remove-unnecessary-enzyme-ops",
"enzyme-simplify-math",
opt_passes,
kern

[JuliaFormatter] reported by reviewdog 🐶

Suggested change
- kern
+ kern,

@@ -340,6 +363,7 @@ function compile_mlir!(mod, f, args; optimize::Union{Bool,Symbol}=true)
"remove-unnecessary-enzyme-ops",
"enzyme-simplify-math",
opt_passes,
kern

[JuliaFormatter] reported by reviewdog 🐶

Suggested change
- kern
+ kern,

@@ -348,7 +372,7 @@ function compile_mlir!(mod, f, args; optimize::Union{Bool,Symbol}=true)
run_pass_pipeline!(mod, join([opt_passes, "enzyme-batch", opt_passes], ","))
run_pass_pipeline!(mod, "enzyme,arith-raise{stablehlo=true}"; enable_verifier=false)
run_pass_pipeline!(
- mod, "canonicalize,remove-unnecessary-enzyme-ops,enzyme-simplify-math"
+ mod, "canonicalize,remove-unnecessary-enzyme-ops,enzyme-simplify-math,"*kern

[JuliaFormatter] reported by reviewdog 🐶

Suggested change
- mod, "canonicalize,remove-unnecessary-enzyme-ops,enzyme-simplify-math,"*kern
+ mod, "canonicalize,remove-unnecessary-enzyme-ops,enzyme-simplify-math," * kern
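
The pipeline here is a comma-separated string, so appending `kern` with `*` would leave a trailing comma if `kern` were ever empty. Whether `kern` can be empty is not visible in this hunk, so the following is only a sketch under that assumption. `build_pipeline` is a hypothetical helper and `"some-kernel-pass"` is a placeholder, not a real pass name:

```julia
# Hypothetical helper: join pass names with commas, skipping empty entries
# so the result has no leading, trailing, or doubled commas.
build_pipeline(passes::AbstractString...) = join(filter(!isempty, collect(passes)), ",")

base = "canonicalize,remove-unnecessary-enzyme-ops,enzyme-simplify-math"

with_kern = build_pipeline(base, "some-kernel-pass")  # placeholder pass name
without_kern = build_pipeline(base, "")               # empty entry is dropped

@show with_kern
@show without_kern
```

This mirrors the `join([...], ",")` pattern already used for `opt_passes` a few lines earlier in the same hunk.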
