Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not expose Celerity internals in installed headers #308

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

fknorr
Copy link
Contributor

@fknorr fknorr commented Nov 15, 2024

We have a considerable number of header files now, most of which are concerned with implementation details only that should not be exposed to the user, if only in order to cut compile times somewhat. This is a refactoring PR following up on #300 and #299 that removes direct and transitive includes of all DAG related logic from celerity.h.

  • handler now stores the results from CGF invocation in struct raw_command_group instead of creating a task instance directly. This allows it to be oblivious of the future task id and the number of collective nodes (both are task-manager parameters). This also simplifies some test code as "standalone" CGF invocation is possible for the first time.
  • The global range is now passed into host task launchers instead of being captured in them, which is necessary to type-erase host task lambdas without knowledge of the number of collective nodes.
  • runtime and task_manager now take a raw_command_group instead of a generic lambda to invoke. Task creation, previously spread throughout handler, now happens in create_command_group_task inside task.cc. This helps to separate TDAG generation from SYCL semantics more.
  • command-group "parameter types" such as struct buffer_access are moved to the new "cgf.h" alongside struct raw_command_group.
  • runtime is now an abstract base class which still exposes all static members, but is implemented by runtime_impl which performs all internal includes.
  • print_utils.h is split into two files, one for public-facing types, one for internals.
  • As a drive-by, task_manager::generate_fence_task does not take BAMs and SEMs anymore but has a overload for each variant. A struct host_object_effect is added as a companion to struct buffer_access.

All in all, this shrinks the set of headers reachable from celerity.h considerably. We now assert that this set does not grow unnoticed by explicitly enumerating the set of public headers which will be installed by CMake, while internal headers are only available in-tree (and thus for unit tests). Eventually we will want to move internal headers to src/ or a subdirectory, but I'm skipping this step for now to avoid merge conflicts.

@fknorr fknorr requested review from psalz and PeterTh November 15, 2024 14:36
@fknorr fknorr self-assigned this Nov 15, 2024
Copy link

Check-perf-impact results: (120b801fc25bb311186a33753cc215e6)

❓ No new benchmark data submitted. ❓
Please re-run the microbenchmarks and include the results if your commit could potentially affect performance.

github-actions[bot]

This comment was marked as outdated.

github-actions[bot]

This comment was marked as outdated.

github-actions[bot]

This comment was marked as outdated.

github-actions[bot]

This comment was marked as outdated.

github-actions[bot]

This comment was marked as outdated.

github-actions[bot]

This comment was marked as outdated.

github-actions[bot]

This comment was marked as outdated.

@coveralls
Copy link

coveralls commented Nov 15, 2024

Pull Request Test Coverage Report for Build 12033415348

Details

  • 289 of 303 (95.38%) changed or added relevant lines in 20 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.007%) to 94.965%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/task.cc 51 55 92.73%
src/testspy/runtime_testspy.inl 37 41 90.24%
src/testspy/scheduler_testspy.inl 1 7 14.29%
Totals Coverage Status
Change from base Build 12012926758: -0.007%
Covered Lines: 7151
Relevant Lines: 7270

💛 - Coveralls

@fknorr fknorr added this to the 0.7.0 milestone Nov 15, 2024
github-actions[bot]

This comment was marked as outdated.

github-actions[bot]

This comment was marked as outdated.

github-actions[bot]

This comment was marked as outdated.

github-actions[bot]

This comment was marked as outdated.

@fknorr fknorr force-pushed the pimpl-runtime branch 2 times, most recently from 3351b9b to 4d2ef26 Compare November 25, 2024 09:22
Copy link

Check-perf-impact results: (32e33781a20e2a3976ece8f7dc81ae2b)

❓ No new benchmark data submitted. ❓
Please re-run the microbenchmarks and include the results if your commit could potentially affect performance.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Clang-Tidy found issue(s) with the introduced code (1/1)

src/platform_specific/named_threads.win.cc Outdated Show resolved Hide resolved
src/backend/sycl_cuda_backend.cc Outdated Show resolved Hide resolved
@fknorr fknorr force-pushed the pimpl-runtime branch 3 times, most recently from 687d035 to 24f4606 Compare November 25, 2024 12:21
include/runtime.h Outdated Show resolved Hide resolved
Copy link
Member

@psalz psalz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat, LGTM!

include/queue.h Outdated Show resolved Hide resolved
include/device.h Outdated Show resolved Hide resolved
test/command_graph_generator_test_utils.h Show resolved Hide resolved
test/dag_benchmarks.cc Outdated Show resolved Hide resolved
github-actions[bot]

This comment was marked as resolved.

github-actions[bot]

This comment was marked as resolved.

github-actions[bot]

This comment was marked as resolved.

github-actions[bot]

This comment was marked as resolved.

@fknorr fknorr force-pushed the pimpl-runtime branch 4 times, most recently from efbcd69 to f385675 Compare November 26, 2024 07:05
Copy link
Member

@psalz psalz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving again for new testspy-pattern, looks very clean now!

src/named_threads.cc Outdated Show resolved Hide resolved
Copy link
Contributor

@PeterTh PeterTh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great overall!

Since this makes some (minor, but still) implementation changes to the DAG benchmarks, did you run those to make sure that there is no performance change?

Edit: one final thing -- this is mostly an internal change, but it does change the (exposed) API surface, so maybe we should have a line for it in the changelog?

@PeterTh
Copy link
Contributor

PeterTh commented Nov 26, 2024

I ran the benchmarks on the branch, everything looks ok.

✔️ No significant performance change in the microbenchmark set. You are good to go!

Invoking a CGF now generates a raw_command_group, which is later
ingested by the runtime / task_manager to create a TDAG node.
detail::runtime is now an abstract base class, which allows moving all
internal includes to runtime.cc without PIMPL boilerplate and while
still allowing access to internals for runtime_testspy.
Eventually internal headers should be moved to src/, but this might
cause difficult-to-resolve merge conflicts at the moment.
Comment on lines 331 to 337
detail::cgf_diagnostics::teardown();
return tm.submit_command_group([&, er = execution_range](handler& cgh) {
auto cg = detail::invoke_command_group_function([&, er = execution_range](handler& cgh) {
cgf(cgh);
cgh.parallel_for<KernelName>(er, [](nd_item<KernelDims>) {});
});
return tm.generate_command_group_task(std::move(cg));
detail::cgf_diagnostics::make_available();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixing an unreachable code warning now, how did that ever work?!

Copy link
Member

@psalz psalz Nov 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops! Well this line not being executed just means that the CGF diagnostics are disabled for the remainder of the test. So it just wouldn't have caught any accessor-misuse later on. But I doubt that there's any test that uses both add_compute_task et al., as well as a real command group where the diagnostics would matter. In fact, I just removed the call to teardown and it looks like we have exactly 1 test that uses add_compute_task with a real buffer, producing that warning (all others use mock buffers).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ironicallly, the test in question get_access can be called on const buffer does not call get_access on a const buffer. I'm refactoring it and splitting it into a RT test and a RT deprecation test to give it some meaning.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've pushed another commit that also moves non-RT sibling tests out to task_graph_tests. We should get rid of add_*_task, but this will have to be alongside a TDAG generator / TDAG test context refactoring.

This eliminates dead code from an earlier incomplete refactoring.

The CGF teardown / reinit was only required by a single test, which was
coincidentally also broken and didn't test the feature advertised. This
commit splits up the test between runtime_ and runtime_deprecation_tests
and also moves runtime-independent sibling tests to task_graph_tests.
@fknorr
Copy link
Contributor Author

fknorr commented Nov 26, 2024

Edit: one final thing -- this is mostly an internal change, but it does change the (exposed) API surface, so maybe we should have a line for it in the changelog?

It doesn't touch the public API, unless I missed exporting something.

Copy link
Contributor

@PeterTh PeterTh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants