Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make it safe for klippy:ready handlers to execute UPDATE_DELAYED_GCODE #6631

Closed
wants to merge 1 commit into from

Conversation

rschaeuble
Copy link

When another klippy:ready handler executes UPDATE_DELAYED_GCODE before DelayedGcode's ready handler runs, then self.timer_handler is None, causing the update_timer to fail.

@rschaeuble
Copy link
Author

This actually happened to me with moonraker-timelapse, whose Moonraker plugin executes GCode once Klippy is ready.

@rschaeuble rschaeuble force-pushed the patch-1 branch 2 times, most recently from 5afde75 to c8ec9d2 Compare June 29, 2024 14:56
@flowerysong
Copy link
Contributor

flowerysong commented Jul 2, 2024

This looks plausible, but it makes this branch do unnecessary work to calculate a value that is discarded. You can instead make the entire branch conditional:

        elif self.timer_handler:
            waketime = self.reactor.NEVER
            if self.duration:
                waketime = self.reactor.monotonic() + self.duration
            self.reactor.update_timer(self.timer_handler, waketime)

When another klippy:ready handler executes UPDATE_DELAYED_GCODE before
DelayedGcode's ready handler runs, then self.timer_handler is None,
causing update_timer() to fail.

Signed-off-by: Rolf Schäuble <[email protected]>
@rschaeuble
Copy link
Author

@flowerysong: you are right; making the whole branch conditional is better. I've changed the code accordingly.

Copy link

Thank you for your contribution to Klipper. Unfortunately, a reviewer has not assigned themselves to this GitHub Pull Request. All Pull Requests are reviewed before merging, and a reviewer will need to volunteer. Further information is available at: https://www.klipper3d.org/CONTRIBUTING.html

There are some steps that you can take now:

  1. Perform a self-review of your Pull Request by following the steps at: https://www.klipper3d.org/CONTRIBUTING.html#what-to-expect-in-a-review
    If you have completed a self-review, be sure to state the results of that self-review explicitly in the Pull Request comments. A reviewer is more likely to participate if the bulk of a review has already been completed.
  2. Consider opening a topic on the Klipper Discourse server to discuss this work. The Discourse server is a good place to discuss development ideas and to engage users interested in testing. Reviewers are more likely to prioritize Pull Requests with an active community of users.
  3. Consider helping out reviewers by reviewing other Klipper Pull Requests. Taking the time to perform a careful and detailed review of others work is appreciated. Regular contributors are more likely to prioritize the contributions of other regular contributors.

Unfortunately, if a reviewer does not assign themselves to this GitHub Pull Request then it will be automatically closed. If this happens, then it is a good idea to move further discussion to the Klipper Discourse server. Reviewers can reach out on that forum to let you know if they are interested and when they are available.

Best regards,
~ Your friendly GitIssueBot

PS: I'm just an automated script, not a human being.

@rschaeuble
Copy link
Author

Alright, so here's a self-review based on the steps from https://www.klipper3d.org/CONTRIBUTING.html#what-to-expect-in-a-review:

  1. I have tested the change on my printer; it works as expected and solves the problem. Tests run successfully. I believe the code is ready for deployment.
  2. The change solves a real-world problem. I also found at least one other case in Discord where someone had the same issue (but didn't find the solution then).
  3. I provided a signed-off-by line in my commit. The change is entirely my own.
  4. The change follows the coding style of the modified code.
  5. No documentation update required.
  6. It's a single commit with a single, small change.

If there's more that I can do to speed this up, please let me know.

@KevinOConnor
Copy link
Collaborator

Thanks. Could you describe the problem being solved? In what situations would a 'klippy:ready' event run a g-code command? (If there are any cases - those cases should be fixed as it is not valid to run g-code from the "klippy:ready" callback.)

-Kevin

@rschaeuble
Copy link
Author

rschaeuble commented Sep 30, 2024

@KevinOConnor: I had this problem when using moonraker-timelapse. It's Moonraker plugin executes GCode once Klippy is ready. The executed GCode uses UPDATED_DELAYED_GCODE.
When that happens before delayed_gcode has fully initialized, an exception is raised.

I don't know why, but it seems there is some amount of randomness involved. It seems to work for most people (and initially it worked for me), but for some reason sometimes the Moonraker plugin's ready-handler comes first, causing the issue.

Is there another event moonraker-timelapse's plugin could listen for? One that is emitted after all klippy_ready handlers have run?

@KevinOConnor
Copy link
Collaborator

The "klippy:ready" callback is internal to the Klipper code (that is, only relevant to code in this github repo). If you've found some way to create an error in the code, then I'd start by providing a Klipper log of the event. Make sure you are running the pristine klipper code from this repo, reproduce the event, issue an M112 command, and then attach the full unmodified log here - or on discourse as described at https://www.klipper3d.org/Contact.html#i-found-a-bug-in-the-klipper-software .

Cheers,
-Kevin

@rschaeuble
Copy link
Author

rschaeuble commented Sep 30, 2024

@KevinOConnor: your comments and questions made me look deeper into how Moonraker and Klipper interact.

klippy.py contains the following code:

self._set_state(message_ready)
for cb in self.event_handlers.get("klippy:ready", []):
  if self.state_message is not message_ready:
    return
  cb()

The first line sets the state_message. It is this message that Moonraker uses internally to detect when Klippy is ready. Once it reports ready, Moonraker executes its internal server:klippy_ready event handlers. One of those (the one from moonraker-timelapse) then runs UPDATE_DELAYED_GCODE.

Now with a bit of bad luck, apparently Moonraker catches the ready state before Klipper's internal event handlers have all run.

Wouldn't it make sense to first run the klippy_ready handlers and only then execute self._set_state(message_ready)? This way, once Moonraker sees Klippy as ready, it is guaranteed to be ready with no potential for this race condition.

@KevinOConnor
Copy link
Collaborator

My vague recollection is that some of the ready callbacks may inspect the current state and thus we need to transition prior to invoking the callbacks.

That said, I don't think it should have been valid for a command to run between the set state and the completion of that loop. The main loop is single threaded and none of the ready callbacks should have "yielded" to allow a command handler to run. So, something seems wrong. It'll take some work to track down which internal "klippy:ready" callback is misbehaving.

-Kevin

@rschaeuble
Copy link
Author

I have to admit that I don't really understand how Klipper's reactor works, and what causes a "yield' and what doesn't.

However, I have a suspicion: I'm using the klipper_tmc_autotune plugin. This plugin has a klippy:ready handler in which it communicates with the tmc2209 chips that I have in my printer.

If I comment out this line, then the error goes away: https://github.com/andrewmcgr/klipper_tmc_autotune/blob/489a1789d5ddb5510dcdd953aaa7780892d63f0e/autotune_tmc.py#L241

Would using tmc_uart to send queries result in a "yield"?

@KevinOConnor
Copy link
Collaborator

Klipper does not have a plugin system. Any changes to the repo (including adding of files) is a modification of the code. From what you describe, it does sound like those modifications are defective as it is not valid to issue commands like that from the "klippy:ready" handler.

-Kevin

@KevinOConnor KevinOConnor added the not mainline Wont merge into master branch label Sep 30, 2024
@rschaeuble
Copy link
Author

I’ll try sending a PR to autotune_tmc that fixes this. It’s a nasty issue, quite hard to diagnose, and could trigger all kinds of issues in the future.

Please allow me a few questions to make sure I got everything right:

  1. klippy:ready callbacks are not allowed to execute GCode or do anything that switches to another greenlet. The latter means all code that ends up in Reactor.pause() must not be used. Is this correct?

  2. Is the same true for klippy:connect and all the other events? I assume yes, as that sounds like a reasonable restriction.

  3. This issue should be fixable by moving the offending code from the ready-handler to a timer callback. The result would look like this:

def handle_ready(self): # the klippy:ready handler
  reactor.register_timer(self._callback, reactor.NOW)

def _callback(self):
  # do stuff that may switch to another greenlet

This will allow the loop that calls all event handlers to finish before the timer callback is called.

  1. Would a PR be welcome that extends the documentation to state these restrictions?

@KevinOConnor
Copy link
Collaborator

Thanks.

Some answers to your questions:

  1. Yes. It shouldn't call pause() and shouldn't call anything that can raise an error ( see the short blurb at https://www.klipper3d.org/Code_Overview.html#adding-a-host-module ).
  2. No. Every event has its own execution context (as mentioned at the blurb above) and there are no general rules for them. The "klippy:connect" callbacks definitely can pause and definitely can raise config errors.
  3. Yes. Might be simpler to use a reactor.register_callback() though.
  4. Yes.

More generally, the "klippy:connect" event is a general "do what needs to be done to configure and establish communications with remote micro-controllers" and thus has a lot of flexibility. The "klippy:ready" callback is an instantaneous notification of a state transition that can't do anything complex nor raise errors, because that could leave the system in a "weird half way ready/not-ready" state.

Cheers,
-Kevin

@rschaeuble
Copy link
Author

Thank you for the answers, Kevin.

Here's the PR for the documentation: #6702

And here's the PR for autotune_tmc, for those interested in that: andrewmcgr/klipper_tmc_autotune#203.

@rschaeuble rschaeuble closed this Oct 3, 2024
@rschaeuble rschaeuble deleted the patch-1 branch October 3, 2024 20:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
not mainline Wont merge into master branch reviewer needed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants