-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SCCHoist: hoist inline call temporaries and don't hoist statically declared arrays #268
Conversation
Documentation for this branch can be viewed at https://sites.ecmwf.int/docs/loki/268/index.html |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, the "inline call part" seems to have been missing indeed.
Further, the solution/new default via is_dimension_constant
seems reasonable.
else: | ||
new_args = tuple(v.clone(dimensions=None) for v in variables) | ||
vmap = {call: call.clone(parameters=call.parameters + new_args)} | ||
routine.body = SubstituteExpressions(vmap).visit(routine.body) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could move routine.body = SubstituteExpressions(vmap).visit(routine.body)
outside the if
/else
statement ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, good spot, thanks!
c1cb682
to
8e286d0
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #268 +/- ##
=======================================
Coverage 92.88% 92.88%
=======================================
Files 102 102
Lines 18253 18270 +17
=======================================
+ Hits 16954 16971 +17
Misses 1299 1299
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, nice find and great fix. There seems to be a typo in the implementation, though, which suggests a shortfall in testing. I think it would be good to expand testing to cover that.
new_args = dict((a.name, v.clone(dimensions=None)) for (a, v) in variables) | ||
_call_clone = call.clone(kwparameters=call.kw_parameters) | ||
_call_clone.kw_parameters.update(new_args) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this pattern of in-place updating the call attribute a little weird, particularly since we expect to only add new kwargs to the end. Wouldn't something more similar to the other argument_remapping method be a little more intuitive?
Also, the clone call has a typo (kwparameters
vs. kw_parameters
), which suggests we don't have a test with existing kw_parameters
:) Could you add one for that?
new_args = dict((a.name, v.clone(dimensions=None)) for (a, v) in variables) | |
_call_clone = call.clone(kwparameters=call.kw_parameters) | |
_call_clone.kw_parameters.update(new_args) | |
kw_params = call.kw_parameters | |
kw_params += tuple((a.name, v.clone(dimensions=None)) for (a, v) in variables) | |
_call_clone = call.clone(kwparameters=dict(kw_params)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many thanks, this looks great now!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only had a brief looks, but LGTM.
@@ -240,7 +243,7 @@ def transform_subroutine(self, routine, **kwargs): | |||
routine.arguments += hoisted_temporaries | |||
|
|||
call_map = {} | |||
for call in FindNodes(CallStatement).visit(routine.body): | |||
for call in FindNodes(CallStatement).visit(routine.body) + list(FindInlineCalls().visit(routine.body)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both of these should return tuples, I think. Better to fix the return of the visitor than to cast here, no? Or am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a bit of a mismatch there, something we should clean up separately:
FindNodes
has indeed always returned lists. Since these are built incrementally and can be quite long, there could indeed a performance impact from using tuples- Expression visitors return by default sets, unless
unique=False
is used.
This PR adds the capability to hoist temporaries declared in inline function calls. It also changes the default behaviour so that statically declared arrays are no longer hoisted.