Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sync RTU sequential reading failure #180

Open
gustavowd opened this issue Apr 3, 2023 · 10 comments
Open

Sync RTU sequential reading failure #180

gustavowd opened this issue Apr 3, 2023 · 10 comments
Labels

Comments

@gustavowd
Copy link
Contributor

If i read sequentially two the sets of registers, the second read never return.

For example:
let rsp1 = ctx.read_holding_registers(32000, 20).unwrap();
let rsp2 = ctx.read_holding_registers(32300, 40).unwrap();

A simple thread sleep of 10 miliseconds between readings solve the problem. However, I beleive this issue should be solved.

@uklotzde
Copy link
Member

uklotzde commented Apr 3, 2023

What's the outcome if you use libmodbus instead of tokio-modbus for this particular device and invocation sequence?

@gustavowd
Copy link
Contributor Author

It works with libmodbus. Maybe its the modbus RTU interframe space of 3,5 char that is not implemented in tokio-modbus.

@uklotzde
Copy link
Member

uklotzde commented Apr 5, 2023

For reference: 2.5.1.1 MODBUS Message RTU Framing

@uklotzde
Copy link
Member

uklotzde commented Apr 5, 2023

Adding a timeout unconditionally before and/or after each request would be a waste of time. Only the client knows if it will send a single request or a sequence of requests that require the insertion of silence intervals. The duration may also vary from device to device.

Ideas how to make this more convenient for clients through the API are welcome.

@flosse flosse added the rtu label May 21, 2023
@domenicquirl
Copy link

We just hit this issue with a different setup: successive requests to a single device were working fine, but due to the missing inter-frame silence the library did not allow communicating with multiple devices on the same RTU bus. With just the single device, either side would know when its done sending by controlling the send buffer and receiving by looking at the function code and, where applicable, length field of the frame. A second request immediately after the response is therefore easily processed by the server because it knows that it is done responding.

However, if you add a second server device which listens on the same bus, then that device has no additional information / state that tells it when a response ends. It could of course decode the full frame and also inspect function code and length, but that requires processing and also the frame might be invalid, the length might not actually be correct, etc. and so it has to safeguard against that anyways.

So instead, devices will often listen for new data on the bus and only look at the slave ID at the beginning of the new frame. If that doesn't match, they know the frame is not addressing them, so they just wait for the bus to clear before they start listening again for the next frame which might be directed at them.

The problem is now that, because the tokio-modbus decoder also uses function code and length to immediately decode when sufficient bytes have been read, without the inter-frame pause the bus doesn't actually clear. So if I request data from one device and then another, the second device will see the response from the first device, ignore it, and think that my second request to it is still part of that response (which may then be malformed) because at no point did the data stream stop to indicate a delimiter between the two frames. The second request will thus be ignored as well and get no response, which in our case means running into a timeout.

The problem is exacerbated if you both regularly read from the devices and sometimes write to them, because presumably you might spend at least some processing time on the result of the read, while even if you synchronize access to the connection Context a write may happen immediately after a read releases the connection and vice versa.

Adding a timeout unconditionally before and/or after each request would be a waste of time. Only the client knows if it will send a single request or a sequence of requests that require the insertion of silence intervals. The duration may also vary from device to device.

Ideas how to make this more convenient for clients through the API are welcome.

So after spending a significant amount of time to track this down, I definitely think that this behaviour should be prominently displayed in the documentation. As evidenced by this thread, this is not an issue with other Rust Modbus crates and we've also cross-tested our hardware with libraries in other languages, such as Python's PyModbus, all of which handle this inside the library and thus "just work" no matter the setup. Since connecting multiple devices is also kind of the point of a bus, I think it's fair to say that the way tokio-modbus currently works here is not what one might expect having used any other Modbus RTU implementation and the documentation currently just links to the Modbus RTU spec, suggesting that the crate follows the spec, which includes the delay as part of the section which you already linked above.

In my personal opinion, the current behaviour is quite a big footgun: it subverts expectations (in other words: the behaviour is surprising) and at least in our case this issue was also very painful to debug - it only occurred when connecting multiple devices to the same bus, manifested differently for different devices (some would respond sometimes, some not at all, some almost always but would then fail), was dependent on the connection settings in use (like the baudrate), and then further it also didn't occur or was less prominent when using different devices as the client, because the different client devices have different processing speed and the request only gets lost when the second request is actually sent fast enough.

So regarding any API choices that are made around this, I think it would be much better if the default were to include the inter-frame silence interval (for RTU connections). This would make the crate "just work" like all the others and tbh I doubt that most applications even notice the delay (it's at most ~4ms at 9600 baud).

If you're really worried about performance, you could still expose a configuration on the client to explicitly disable automatic pausing after requests (and / or adjust the pause time, though I think the spec is quite clear here on how to calculate the default). This would also provide an additional place to document this behaviour and what to look out for when switching to "manual" / "single-shot" mode in the function on the client. But again, I think it's much better to not have the crate's defaults potentially cause long debugging sessions after which people end up re-doing the same calculation in user code and manually insert breaks after their requests. That's just frustrating and not very ergonomic as opposed to maybe losing out on the last bit of perf, which they could still get by using the library at a lower level.

@flosse
Copy link
Member

flosse commented Nov 19, 2024

@domenicquirl could you summarize yourself again in one sentence?

@domenicquirl
Copy link

@domenicquirl could you summarize yourself again in one sentence?

@flosse one might be difficult, but I'll try to give a TLDR:

  • This issue is also a problem when trying to communicate with multiple devices on one RTU bus, because devices not actively communicating may not actually process a frame but try to ignore it / wait for it to end.
  • Not respecting the inter-frame silence (by default) inside the library is surprising because it doesn't "just work" and other libraries (in Rust and other languages) do not do this.
  • It is also hard to debug, because it's a timing problem that depends a lot on the specific setup.
  • Imo tokio-modus should therefore by default handle frame delimiters so it "just works" with any setup and maybe allow to turn that off. If it doesn't, it should be documented somewhere loud that the user is supposed to be responsible for handling this, because currently the docs don't mention it (afaict).

That all with a lot of gratitude for making and maintaining the crate, which works very well for TCP and also for RTU if you know to insert frame breaks yourself ^^

@uklotzde
Copy link
Member

My take:

An RTU frame only ends after the inter-frame silence interval expired and does not depend on the actual number of bytes received.

Hopefully someone is able to pick this up and provide a fix.

@domenicquirl
Copy link

An RTU frame only ends after the inter-frame silence interval expired and does not depend on the actual number of bytes received.

I very much agree with that view. At the same time, this might be difficult to integrate with the current implementation approach of using Decoders, which only provides you with the bytes that have been read so far and asynchronously fills that buffer - you would have to work with the underlying serial stream directly to be able to additionally take the time on the reads (and I can tell you from experience that even on an embedded device the timings don't always match up exactly: there is latency from the bus to the hardware, from that to the driver, etc.). I've also not thought a lot about how this would work with your current retry / drop bytes mechanism.

As a first step, would you be comfortable with assuming that the decoder returns "as soon as possible" - i.e., immediately after all expected bytes have been received, modulo some minimal time required to actually decode the frame - and always adding a short sleep afterwards @uklotzde? This could probably be made RTU-only by sticking it into service::rtu::Client.

@uklotzde
Copy link
Member

uklotzde commented Nov 24, 2024

As a first step, would you be comfortable with assuming that the decoder returns "as soon as possible" - i.e., immediately after all expected bytes have been received, modulo some minimal time required to actually decode the frame - and always adding a short sleep afterwards @uklotzde? This could probably be made RTU-only by sticking it into service::rtu::Client.

I didn't think about how a fix could be implemented and do not plan to work on this in my spare time.

It might be insightful to investigate how others are handling this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants