-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not expose Celerity internals in installed headers #308
base: master
Are you sure you want to change the base?
Conversation
Check-perf-impact results: (120b801fc25bb311186a33753cc215e6) ❓ No new benchmark data submitted. ❓ |
Pull Request Test Coverage Report for Build 12033415348Details
💛 - Coveralls |
3351b9b
to
4d2ef26
Compare
Check-perf-impact results: (32e33781a20e2a3976ece8f7dc81ae2b) ❓ No new benchmark data submitted. ❓ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clang-Tidy
found issue(s) with the introduced code (1/1)
687d035
to
24f4606
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neat, LGTM!
24f4606
to
6703821
Compare
efbcd69
to
f385675
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving again for new testspy-pattern, looks very clean now!
362ea09
to
03c2e34
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great overall!
Since this makes some (minor, but still) implementation changes to the DAG benchmarks, did you run those to make sure that there is no performance change?
Edit: one final thing -- this is mostly an internal change, but it does change the (exposed) API surface, so maybe we should have a line for it in the changelog?
I ran the benchmarks on the branch, everything looks ok. ✔️ No significant performance change in the microbenchmark set. You are good to go! |
Invoking a CGF now generates a raw_command_group, which is later ingested by the runtime / task_manager to create a TDAG node.
detail::runtime is now an abstract base class, which allows moving all internal includes to runtime.cc without PIMPL boilerplate and while still allowing access to internals for runtime_testspy.
Eventually internal headers should be moved to src/, but this might cause difficult-to-resolve merge conflicts at the moment.
test/test_utils.h
Outdated
detail::cgf_diagnostics::teardown(); | ||
return tm.submit_command_group([&, er = execution_range](handler& cgh) { | ||
auto cg = detail::invoke_command_group_function([&, er = execution_range](handler& cgh) { | ||
cgf(cgh); | ||
cgh.parallel_for<KernelName>(er, [](nd_item<KernelDims>) {}); | ||
}); | ||
return tm.generate_command_group_task(std::move(cg)); | ||
detail::cgf_diagnostics::make_available(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixing an unreachable code warning now, how did that ever work?!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops! Well this line not being executed just means that the CGF diagnostics are disabled for the remainder of the test. So it just wouldn't have caught any accessor-misuse later on. But I doubt that there's any test that uses both add_compute_task
et al., as well as a real command group where the diagnostics would matter. In fact, I just removed the call to teardown
and it looks like we have exactly 1 test that uses add_compute_task
with a real buffer, producing that warning (all others use mock buffers).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ironicallly, the test in question get_access can be called on const buffer
does not call get_access
on a const buffer. I'm refactoring it and splitting it into a RT test and a RT deprecation test to give it some meaning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've pushed another commit that also moves non-RT sibling tests out to task_graph_tests
. We should get rid of add_*_task
, but this will have to be alongside a TDAG generator / TDAG test context refactoring.
This eliminates dead code from an earlier incomplete refactoring. The CGF teardown / reinit was only required by a single test, which was coincidentally also broken and didn't test the feature advertised. This commit splits up the test between runtime_ and runtime_deprecation_tests and also moves runtime-independent sibling tests to task_graph_tests.
03c2e34
to
603a2b9
Compare
It doesn't touch the public API, unless I missed exporting something. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
We have a considerable number of header files now, most of which are concerned with implementation details only that should not be exposed to the user, if only in order to cut compile times somewhat. This is a refactoring PR following up on #300 and #299 that removes direct and transitive includes of all DAG related logic from
celerity.h
.handler
now stores the results from CGF invocation instruct raw_command_group
instead of creating a task instance directly. This allows it to be oblivious of the future task id and the number of collective nodes (both are task-manager parameters). This also simplifies some test code as "standalone" CGF invocation is possible for the first time.runtime
andtask_manager
now take araw_command_group
instead of a generic lambda to invoke. Task creation, previously spread throughouthandler
, now happens increate_command_group_task
inside task.cc. This helps to separate TDAG generation from SYCL semantics more.struct buffer_access
are moved to the new "cgf.h" alongsidestruct raw_command_group
.runtime
is now an abstract base class which still exposes all static members, but is implemented byruntime_impl
which performs all internal includes.print_utils.h
is split into two files, one for public-facing types, one for internals.task_manager::generate_fence_task
does not take BAMs and SEMs anymore but has a overload for each variant. Astruct host_object_effect
is added as a companion tostruct buffer_access
.All in all, this shrinks the set of headers reachable from
celerity.h
considerably. We now assert that this set does not grow unnoticed by explicitly enumerating the set of public headers which will be installed by CMake, while internal headers are only available in-tree (and thus for unit tests). Eventually we will want to move internal headers to src/ or a subdirectory, but I'm skipping this step for now to avoid merge conflicts.