Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add USBHost #3307

Open
wants to merge 103 commits into
base: main
Choose a base branch
from
Open

Add USBHost #3307

wants to merge 103 commits into from

Conversation

ijager
Copy link

@ijager ijager commented Sep 4, 2024

This PR adds USBHost in several layers:

  • USBHostDriver HAL for STM32 usb_v4 devices in embassy-stm32 crate
  • USBHostDriverTrait in embassy-usb-driver crate
  • USBHost implementation in embassy-usb crate
  • HIDKeyboard example for stm32g0

This is the implementation for issue #3295.

@i404788
Copy link

i404788 commented Oct 4, 2024

Hey @ijager,

I'm working on extending this PR to support 'universal' drivers (see wip @ https://github.com/i404788/embassy/tree/usb-asynch). I was wondering if there is a reason to have Channels be static (i.e. non-reconfigurable, and only one of In/Out). On the platform I'm looking to target (USBOTG), the current strategy will quickly consume all channels.

In the Hub handler example I'm working on, it's significantly more efficient to allocate 2 pipes: 1 for interrupt, and one for control IN/OUT, but that requires re-configuring the pipe.

Does the stm32 not support changing attributes? If it does, I think a shared Channel trait with a reconfigure(endpoint: &EndpointDescriptor) would improve the resource efficiency and convenience for one-shot packets. Only downside is that write and read may need to check if the direction is correct (could ChannelError::InvalidDirection if it's not supported; or just attempt the transaction on the specified side, I have at least one device which has bulk IN/OUT on the same endpoint_addr but with | 0x80 for in).

@nikvoid
Copy link

nikvoid commented Oct 4, 2024

I'm also working in this direction. I completely redesigned channel and host API to make it usable from multiple concurrent tasks. On RP2040 you need to reconfigure hardware for each control/bulk/isochronous transfer and only interrupt channels are special. So in my variant channels other from interrupts are basically free, because the only resource they use is stack space. Interrupt channels are limited, but automatically freed on channel drop. Other type of channels should reconfigure hardware and block other tasks that want to issue transfers.

https://github.com/nikvoid/embassy/tree/rp2040-usb-host

https://github.com/nikvoid/embassy/blob/rp2040-usb-host/embassy-usb-driver/src/host.rs#L282
UsbChannel trait is parametrised by type and direction, where direction may be either In, Out, or InOut (I'm thinking of improving this part because InOut probably only suitable for Control pipes.

Updated host example: https://github.com/nikvoid/embassy/blob/rp2040-usb-host/examples/rp/src/bin/usb_host_keyboard.rs

@i404788
Copy link

i404788 commented Oct 5, 2024

@nikvoid Thanks for the notif.

That does look a lot closer to what I can work with. There is one difference that might cause an issue for USBOTG, there channels are limited (even non-interrupt) but each are independent (async-safe) and reconfigurable. Which is why I was looking at a reconfigure API for all channels (can still be type-state by consuming the channels); it seems like your RP2040 would support this (just changing values on the stack), and it would allow for the USBOTG to run without mutexes.

I will rebase to your branch, and if you're up for it we can collaborate on the final traits/interfaces.

@ijager
Copy link
Author

ijager commented Oct 6, 2024

Good that there's more people working on this. I am happy to change things up. If we have three targets and use cases together that's great for testing.

I am still on vacation for a week without computer so cannot really checks things right now.

@nikvoid
Copy link

nikvoid commented Oct 6, 2024

OK. For now I'm going to write a hub driver and test it on few keyboards. By the way, what are good devices to test bulk and isochronous transfers on? I think mainly of mass storage and web camera, but I don't have one.

@i404788
Copy link

i404788 commented Oct 6, 2024

If you have a second RP2040, you could try the CDC-ACM, or MIDI drivers from embassy-usb in device mode (or just put it in usb flash mode).
If you happen to have another Raspberry Pi Zero or similar SBC you can use the linux gadget for mass storage.

Isochronous isn't used much, mainly audio & video equipment, if you have an android device you might be able to configure it as a webcam (or potentially in audio accessory mode: https://source.android.com/docs/core/audio/usb#accessoryAudio).

@nikvoid
Copy link

nikvoid commented Oct 6, 2024

Now I have USB hub working for full speed devices, but not yet for low speed.
If I simply set preamble_en register before transfer to LS device, RxTimeout interrupt will be fired.
If that interrupt is disabled, it seems to work somehow. However INTERRUPT IN channel for keyboard is in dead silence.
Maybe I forgot something important to set on hub, or maybe another keyboard is really slow, or both.

@i404788
Copy link

i404788 commented Oct 6, 2024

Could be that you are missing the LS preamble on the interrupt channel, at least in USBOTG you need to set it per channel.

@nikvoid
Copy link

nikvoid commented Oct 7, 2024

Yeah, indeed. I overlooked a register for that. Now it works, except for timeouts. There may be another register to configure it, I'll research on that later.

@nikvoid
Copy link

nikvoid commented Oct 8, 2024

Timeouts appear to be a problem with specfic hub. Another hub is working fine.

@i404788
Copy link

i404788 commented Oct 9, 2024

@nikvoid I've converted most of your hub driver to a usb driver trait: https://github.com/i404788/embassy/blob/usb-asynch/embassy-usb/src/handlers/hub.rs).

If you could review the general structure, especially wait_for_event since that's where I'm least sure about the interface.
I've also removed a lot of the dependencies from the driver itself, so the overarching layer can more easily be changed.

@nikvoid
Copy link

nikvoid commented Oct 9, 2024

Well, I don't see a reason to create device driver trait. I have updated rp/usb_host_hid_keyboard example with hub usage, as you can see, user can just try all drivers with some priorities, then create task for device. I think this is the very flexible way.

About handler trait: what purpose it serves? I see that it uses raw UsbHostDriver and channels. Could you please provide details about its usage? I also spotted several suspicious lines:

https://github.com/i404788/embassy/blob/fb70372f52f3615e92863ef01afcb5d176e5d4c5/embassy-usb/src/handlers/hub.rs#L115
On RP2040 there is a special register for interval, which is already used by driver. Even if it is not available on other platform, its driver could just use Timer::after as polyfill for periodic poll.

https://github.com/i404788/embassy/blob/fb70372f52f3615e92863ef01afcb5d176e5d4c5/embassy-usb/src/handlers/hub.rs#L119
Value returned by interrupt endpoint is a bitmap, where each bit corresponds to port, starting from bit 1

https://github.com/i404788/embassy/blob/fb70372f52f3615e92863ef01afcb5d176e5d4c5/embassy-usb/src/handlers/hub.rs#L53
1st argument is not a device address, it's rather an endpoint address (and yes, this is redundant because control channel are always 0, FIXME)

@i404788
Copy link

i404788 commented Oct 9, 2024

Thank you for the review.

About handler trait: what purpose it serves?

The main purpose is to create a common interface for drivers such that a user can easily re-use them & create a common initialization sequence. This allows (for example) for automatic enumeration & initialization of drivers from a registry without mapping out what each driver needs to get started; this is effectively similar to how larger OSes do it.

As for why UsbHostDriver instead of UsbHost, it's because UsbHost doesn't seem to provide anything you'd need in a driver while increasing the restrictions on usage. I'm aiming to avoid having to pigeonhole the drivers into embassy-usb for future development.

https://github.com/i404788/embassy/blob/fb70372f52f3615e92863ef01afcb5d176e5d4c5/embassy-usb/src/handlers/hub.rs#L115

I didn't know you had integrated the HW interrupt into the request_in, since it seems like it would do a manual request; I had suspected you got timeouts due to sending interrupt requests too frequently. I think we might want to make a separate interface for the interrupt request to avoid confusing with on-demand requests, but let me know your thoughts on that.

https://github.com/i404788/embassy/blob/fb70372f52f3615e92863ef01afcb5d176e5d4c5/embassy-usb/src/handlers/hub.rs#L119

Ah yes I see what you mean, I misinterpreted your impl and tried to optimize; I'll fix that.

https://github.com/i404788/embassy/blob/fb70372f52f3615e92863ef01afcb5d176e5d4c5/embassy-usb/src/handlers/hub.rs#L53

Ditto, my mistake.

@nikvoid
Copy link

nikvoid commented Oct 9, 2024

The main purpose is to create a common interface for drivers such that a user can easily re-use them & create a common initialization sequence. This allows (for example) for automatic enumeration & initialization of drivers from a registry without mapping out what each driver needs to get started; this is effectively similar to how larger OSes do it.

I think the main problem with this is that we are on embedded device without Box and alloc, and with statically sized... let's call them "device driver structs", we need some way to return these different structs from automatic initialization method.
Correct me if I misunderstood what you mean.

I didn't know you had integrated the HW interrupt into the request_in, since it seems like it would do a manual request; I had suspected you got timeouts due to sending interrupt requests too frequently. I think we might want to make a separate interface for the interrupt request to avoid confusing with on-demand requests, but let me know your thoughts on that.

Usb is host-centric, and even interrupts are not interrupts, but rather poll-based things, so I though it would be appropriate name. We could just state about this behaviour in docs and/or rename method, e.g. to transfer_in/out

@i404788
Copy link

i404788 commented Oct 9, 2024

While I agree we should assume no-alloc for any infra code, I do regularly use alloc in project development, with the traits you now have the options. Functionally, I don't think your struct pattern is much worse; just that it seems to me most drivers don't need to assume it will run on embassy, rather any async rust would have the equivalent code; so a trait would allow for more portability and for 3rd party drivers that neatly integrate (similar to usbh with their Driver trait, or embedded-hal).

A few things I was planning to build based on the trait was a faster enumeration filter using the static spec, also with the new structure of the hub you don't need a mutexed device registry (or no registry at all just unique address assignment).

Btw the only reason I'm calling them handlers is because driver was already used to describe the UsbHostDriver and gadget-side Driver traits; but it seems like Handler is also already a trait so maybe it should just be UsbDriver or Driver and accept the potential confusion.

even interrupts are not interrupts, but rather poll-based things, so I though it would be appropriate name

If you configure HW interrupt polling you may still manually poll, at least on USBOTG (maybe that's a unnecessary use-case to consider). If we only allow automatic polling I think just documenting the behavior around this would be good enough. (I do agree that the HostDriver should ensure software polling works the same as hw polling with polyfills)

@nikvoid
Copy link

nikvoid commented Oct 10, 2024

So, if I finally understand correctly, handler trait will be something like a middleware between device driver and usb host driver (or just device driver that stands directly over bus driver).

And I have more questions:
https://github.com/i404788/embassy/blob/fb70372f52f3615e92863ef01afcb5d176e5d4c5/embassy-usb/src/handlers/mod.rs#L67
This also not the case for rp2040, so handler probably would need to impl Drop.

https://github.com/i404788/embassy/blob/fb70372f52f3615e92863ef01afcb5d176e5d4c5/embassy-usb/src/handlers/mod.rs#L69
Currently all UsbHostDriver methods take &self, so mut is not necessary. As mentioned in previous question, dropping channels is a handler's responsibility, so it needs to store reference to UsbHostDriver anyway.

https://github.com/i404788/embassy/blob/fb70372f52f3615e92863ef01afcb5d176e5d4c5/embassy-usb/src/handlers/mod.rs#L87
UsbResumableHandler is needed to talk with more devices when channels are limited, right?

@i404788
Copy link

i404788 commented Oct 11, 2024

This also not the case for rp2040, so handler probably would need to impl Drop.

Hmm I was hoping this could be done in the Channel itself to avoid complicating the trait & implementation; e.g. give a Ref<AtomicU32>, and a it's own index, which it can use to xor to mark it is cleared & clear the registers needed if it's an hw interrupt pipe.

Currently all UsbHostDriver methods take &self, so mut is not necessary. As mentioned in previous question, dropping channels is a handler's responsibility, so it needs to store reference to UsbHostDriver anyway.

True I suppose it could be &self, then the assumption is that we have internal mutability (in our case usually registers, but if potentially in other drivers separate data needs to be managed) that the driver has to manage.

UsbResumableHandler is needed to talk with more devices when channels are limited, right?

Correct, so at a higher layer it can be decided to suspend a device in order to start/resume a different device.

@ijager
Copy link
Author

ijager commented Nov 7, 2024

Yay I managed to get a device to enumerate with my refactored stm32 HAL! I like how it works with the extended traits that give a lot of the high level logic for free.

I do have an interesting edge case here. The device is an RFID badge reader that implements HIDKeyboard class, but rather poorly, because this device does not respond to SET_PROTOCOL and SET_IDLE commands. It just hangs.

This is why I implemented TransferOptions with a configurable timeout in the host layer (after you forked it i guess). I think we should reintroduce something like that to be able to deal with unresponsive devices. Or devices that just don't implement the full spec of a device class.

1.003082 DEBUG TCPP03-M20 Value 1: 1C
└─ usb_host_hid_keyboard2::____embassy_main_task::{async_fn#0} @ src/bin/usb_host_hid_keyboard2.rs:115 
1.103668 DEBUG Detecting device
└─ usb_host_hid_keyboard2::____embassy_main_task::{async_fn#0} @ src/bin/usb_host_hid_keyboard2.rs:124 
10.576690 TRACE USB IRQ: device connect/disconnect
└─ embassy_stm32::usb::_version::usb_host::{impl#0}::on_interrupt @ /Users/ijager/dev/telecom/embassy/embassy-stm32/src/fmt.rs:124 
10.577148 <lvl> Found device with speed = Full
└─ usb_host_hid_keyboard2::____embassy_main_task::{async_fn#0} @ src/bin/usb_host_hid_keyboard2.rs:133 
10.577636 TRACE Bus reset
└─ embassy_stm32::usb::_version::usb_host::{impl#5}::bus_reset::{async_fn#0} @ /Users/ijager/dev/telecom/embassy/embassy-stm32/src/fmt.rs:124 
10.628204 TRACE retarget_channel: addr: 0 ep_type: Control index: 0
└─ embassy_stm32::usb::_version::usb_host::{impl#4}::retarget_channel @ /Users/ijager/dev/telecom/embassy/embassy-stm32/src/fmt.rs:124 
10.628875 TRACE retarget_channel: addr: 0 ep_type: Control index: 0
└─ embassy_stm32::usb::_version::usb_host::{impl#4}::retarget_channel @ /Users/ijager/dev/telecom/embassy/embassy-stm32/src/fmt.rs:124 
10.629516 TRACE [enum] Attempting to get max_packet_size for new device
└─ embassy_usb::host::ControlChannelExt::enumerate_device::{async_fn#0} @ /Users/ijager/dev/telecom/embassy/embassy-usb/src/fmt.rs:137 
10.630340 TRACE READ DONE, rx_len = 8
└─ embassy_stm32::usb::_version::usb_host::{impl#3}::read_data @ /Users/ijager/dev/telecom/embassy/embassy-stm32/src/fmt.rs:124 
10.631042 TRACE Descriptor embassy_usb::host::descriptor::DeviceDescriptorPartial: [18, 1, 16, 1, 0, 0, 0, 64]
└─ embassy_usb::host::ControlChannelExt::request_descriptor::{async_fn#0} @ /Users/ijager/dev/telecom/embassy/embassy-usb/src/fmt.rs:137 
10.633789 TRACE [enum] got max packet size for new device 64, attempting to set address
└─ embassy_usb::host::ControlChannelExt::enumerate_device::{async_fn#0} @ /Users/ijager/dev/telecom/embassy/embassy-usb/src/fmt.rs:137 
10.634552 TRACE READ DONE, rx_len = 0
└─ embassy_stm32::usb::_version::usb_host::{impl#3}::read_data @ /Users/ijager/dev/telecom/embassy/embassy-stm32/src/fmt.rs:124 
10.637207 TRACE [enum] Finished setting address
└─ embassy_usb::host::ControlChannelExt::enumerate_device::{async_fn#0} @ /Users/ijager/dev/telecom/embassy/embassy-usb/src/fmt.rs:137 
10.637542 TRACE retarget_channel: addr: 1 ep_type: Control index: 0
└─ embassy_stm32::usb::_version::usb_host::{impl#4}::retarget_channel @ /Users/ijager/dev/telecom/embassy/embassy-stm32/src/fmt.rs:124 
10.638549 TRACE READ DONE, rx_len = 18
└─ embassy_stm32::usb::_version::usb_host::{impl#3}::read_data @ /Users/ijager/dev/telecom/embassy/embassy-stm32/src/fmt.rs:124 
10.639343 TRACE Descriptor embassy_usb::host::descriptor::DeviceDescriptor: [18, 1, 16, 1, 0, 0, 0, 64, 217, 4, 3, 21, 16, 3, 1, 2, 0, 1]
└─ embassy_usb::host::ControlChannelExt::request_descriptor::{async_fn#0} @ /Users/ijager/dev/telecom/embassy/embassy-usb/src/fmt.rs:137 
10.642242 TRACE Device Descriptor: DeviceDescriptor { len: 18, descriptor_type: 1, bcd_usb: 272, device_class: 0, device_subclass: 0, device_protocol: 0, max_packet_size0: 64, vendor_id: 1241, product_id: 5379, bcd_device: 784, manufacturer: 1, product: 2, serial_number: 0, num_configurations: 1 }
└─ embassy_usb::host::ControlChannelExt::enumerate_device::{async_fn#0} @ /Users/ijager/dev/telecom/embassy/embassy-usb/src/fmt.rs:137 
10.643829 TRACE READ DONE, rx_len = 9
└─ embassy_stm32::usb::_version::usb_host::{impl#3}::read_data @ /Users/ijager/dev/telecom/embassy/embassy-stm32/src/fmt.rs:124 
10.644531 TRACE Descriptor embassy_usb::host::descriptor::ConfigurationDescriptor: [9, 2, 41, 0, 1, 1, 0, 128, 80]
└─ embassy_usb::host::ControlChannelExt::request_descriptor::{async_fn#0} @ /Users/ijager/dev/telecom/embassy/embassy-usb/src/fmt.rs:137 
10.648376 TRACE READ DONE, rx_len = 41
└─ embassy_stm32::usb::_version::usb_host::{impl#3}::read_data @ /Users/ijager/dev/telecom/embassy/embassy-stm32/src/fmt.rs:124 
10.649414 TRACE Full Configuration Descriptor [41]: [9, 2, 41, 0, 1, 1, 0, 128, 80, 9, 4, 0, 0, 1, 3, 1, 1, 10, 9, 33, 1, 1, 0, 1, 34, 63, 0, 7, 5, 129, 3, 64, 0, 10, 7, 5, 1, 3, 64, 0, 10]
└─ embassy_usb::host::ControlChannelExt::enumerate_device::{async_fn#0} @ /Users/ijager/dev/telecom/embassy/embassy-usb/src/fmt.rs:137 
10.652130 TRACE READ DONE, rx_len = 0
└─ embassy_stm32::usb::_version::usb_host::{impl#3}::read_data @ /Users/ijager/dev/telecom/embassy/embassy-stm32/src/fmt.rs:124 
10.653137 TRACE retarget_channel: addr: 1 ep_type: Interrupt index: 1
└─ embassy_stm32::usb::_version::usb_host::{impl#4}::retarget_channel @ /Users/ijager/dev/telecom/embassy/embassy-stm32/src/fmt.rs:124 
10.653900 TRACE retarget_channel: addr: 1 ep_type: Control index: 0
└─ embassy_stm32::usb::_version::usb_host::{impl#4}::retarget_channel @ /Users/ijager/dev/telecom/embassy/embassy-stm32/src/fmt.rs:124 
10.654541 DEBUG [kbd]: Setting PROTOCOL & idle
└─ embassy_usb::handlers::kbd::{impl#1}::try_register::{async_fn#0} @ /Users/ijager/dev/telecom/embassy/embassy-usb/src/fmt.rs:151 

... hangs here

@i404788
Copy link

i404788 commented Nov 7, 2024

Yeah timeouts are definitely needed, in USBOTG it's in the hardware and not configurable (unless I missed a register).

The spec says:

The device bus turn-around time is defined by the worst case round trip delay plus the maximum device
response delay (refer to Sections 7.1.18 and 7.1.19 for specific bus turn-around times). If a response is not
received within this worst case timeout, then the transmitter considers that the packet transmission has
failed.

So you should be able to define a constant amount of time (per device speed setting) which is the worst-case timeout.

Also does your device require SET_PROTOCOL/SET_IDLE and ignore it, or does it not support it? A device is supposed to give a STALL response to indicate a request is not supported.

@nikvoid
Copy link

nikvoid commented Nov 8, 2024

RP2040 also has hardware timeouts, but they are apparently too strict for slow devices (like that usb hub I first encountered this issue with). What about adding method to bus trait and maybe individual channels, like set_timeout or so? Implementer then should ignore hardware timeouts until specified timeout reached.

@i404788
Copy link

i404788 commented Nov 8, 2024

Hmm I don't think USBOTG can implement a custom timeout at all; any transaction error that gives any feedback (among which timeout) will also halt the channel. I think that's because the timeout defined by the spec is already quite generous even for older devices, and the protocol supports notifying of not-ready (NYET/NAK). I've also encountered a few devices that require handling NAKs when going fast but not yet one that actually times out.

For the RP2040 I checked the datasheet, RX_TIMEOUT seems like it would be triggered in case of NAK response as well so it's possible the device responded and requested to try again later (busy). @nikvoid could you check if this is the case for your slow hub?

EDIT: it seems like RP2040 unlike USBOTG does automatic retry given NAK_POLL register, but I'm still curious to know if it's getting a NAK or if it's simply not replying since the spec timeouts for NAKs are really long: https://beyondlogic.org/usbnutshell/usb6.shtml#SetupPacket.

I'm okay adding a fail-safe timeout, but I don't think it would be good to ignore the specs' timeout if it can be avoided.

@ijager
Copy link
Author

ijager commented Nov 10, 2024

Also does your device require SET_PROTOCOL/SET_IDLE and ignore it, or does it not support it? A device is supposed to give a STALL response to indicate a request is not supported.

It just doesn't respond at all to the request setup packet. No stall. Continuing with a new, different request will make it respond again. So I guess it is a device with a bad implementation.

Reading the USB 2.0 spec I see the following about timeouts:

  • total maximum timeout: 5sec. That is also what i have seen MacOS USB Host doing for this particular device when it does not respond, looking at the logic analyzer.
  • SetAddress: maximum 50ms
  • Standard Requests: 50ms without Data stage, 500ms with data stage.

I guess we can consider SetAddress a "Standard request without Data Stage".

I agree with having a default timeout (e.g. 50ms/500ms for low and full speed), and a UsbChannel method set_timeout to override it.

Let me know if I am missing anything.

@ijager
Copy link
Author

ijager commented Nov 11, 2024

@i404788, I have included all latest changes of your branch and have now a working example for STM32G0 based on the new traits.

Regarding the async fn wait_for_device_event(&self) -> embassy_usb_driver::host::DeviceEvent API, It might be better to actually pass in the event you're expecting. Because currently it is not fully defined what to wait for.

If a device is already connected and then I call wait_for_device_event(&self), expecting to get a Connected Event, I will never get it. If we pass in the Connected event then we can immediately return if the device was already connected.

Perhaps as an optional, so

// either 

async fn wait_for_device_event(&self, expected_event: host::DeviceEvent) ->host::DeviceEvent

// or

async fn wait_for_device_event(&self, expected_event: Option<host::DeviceEvent>) ->host::DeviceEvent

If passing in expected_event = None we could just wait for any future event. But currently only Connected and Disconnected exist.

@ijager
Copy link
Author

ijager commented Nov 11, 2024

Hmm also not ideal as we have to pass in the Device Speed, which is dummy:

    debug!("Detecting device");
    // Wait for root-port to detect device
    let speed = loop {
        match usbhost.wait_for_device_event(Connected(USBSpeed::Full)).await {
            Connected(speed) => break speed,
            _ => {}
        }
    };

@i404788
Copy link

i404788 commented Nov 14, 2024

It just doesn't respond at all to the request setup packet. No stall. Continuing with a new, different request will make it respond again.

Hmm that does sound strange, it could also be a bad request I recently got this with USBOTG as well. For example if the CRC is incorrect the device will just not respond.

default timeout (e.g. 50ms/500ms for low and full speed), and a UsbChannel method set_timeout to override it.

Sounds good, I'll see if I can get a decent interface around it for the configuration.

Regarding the async fn wait_for_device_event(&self) -> embassy_usb_driver::host::DeviceEvent API, It might be better to actually pass in the event you're expecting. Because currently it is not fully defined what to wait for.

I can see your reasoning here, but wait_for_device_event needs to serialize the events rather (i.e. Disconnected can only be returned if Connected was returned before, vice-versa), as I think ignoring certain events may cause additional unsoundness. Disconnects always need to be handled since channels and drivers will be invalidated (or at least made non-functional).
I'm open to other suggestions on how to improve it, because I agree it's not as intuitive as it could be.

@ijager
Copy link
Author

ijager commented Nov 14, 2024

You are right, the driver can be a state machine. Then we could make sure we always get connect before disconnect.

For unexpected disconnects, those are usually noticed because of failed channel operations/transfers. Is the idea to call wait_for_device_event() after detecting a disconnect due to a ChannelError?

Currently my USB task looks like this:

 loop {
        // Wait for device connected
        usbhost.wait_for_device_event().await;
        // do enumeration etc ...
        let Ok(mut kbd) = KbdHandler::try_register(&usbhost, enum_info).await else {
            ...
            break;
        };

        // Read device endpoints until disconnect
        loop {
            match kbd.wait_for_event().await {
                Ok(HandlerEvent::HandlerEvent(KbdEvent::KeyStatusUpdate(ev))) => {
                ...
                }
                Err(HostError::ChannelError(ChannelError::Disconnected)) | Ok(HandlerEvent::HandlerDisconnected) => {
                    info!("Device disconnected");
                    break;
                }
                other => warn!("Unhandled event: {:?}", other),
            }
        }
    // kbd is dropped here
    }

So I don't really need to wait_for_disconnect_event, as the break will make sure everything is dropped. What is your usecase for disconnect, do you have a different task that handles DeviceEvents?

@nikvoid
Copy link

nikvoid commented Nov 16, 2024

For the RP2040 I checked the datasheet, RX_TIMEOUT seems like it would be triggered in case of NAK response as well so it's possible the device responded and requested to try again later (busy). @nikvoid could you check if this is the case for your slow hub?

EDIT: it seems like RP2040 unlike USBOTG does automatic retry given NAK_POLL register, but I'm still curious to know if it's getting a NAK or if it's simply not replying since the spec timeouts for NAKs are really long: https://beyondlogic.org/usbnutshell/usb6.shtml#SetupPacket.

I tried to take a closer look into this issue.
I captured communication through faulty hub with logic analyzer, and it seems like keyboard behind the hub doesn't respond with ACKs, or hub drops them. No NAKs, nothing. After 1ms hardware repeats the request and sometimes gets ACK (this was a bug - transactions weren't stopped after error), then software passes the partial descriptor request, but fails in next ones, as they only have single try attempt.
The average retry count to get partial device descriptor is 5.

fails in random places:
SETUP | No ACK     | SETUP | ACK | IN | DATA | OUT | No ACK       | ... wait and retry ... | SETUP | ...
'--setup -- timeout  '-- setup --- data in --- status -- timeout

EDIT: RX timeout seems to be no serious error, but rather an indication that hardware is about to repeat request. So +1 to adding a failsafe timeout.

let mut desc_len = slice[offset] as usize;
let mut desc_type = slice[offset + 1];

while desc_type != T::DESC_TYPE || desc_len != T::SIZE {
Copy link

@jakerr jakerr Nov 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only accepting descriptors that are the exact size expected as is done in this line would make the descriptor fail to parse if there were existance of any extentions to the descriptors such as those that exist in USB 2.0+.

For example MIDI 2.0 DeviceEndpoints can potentially have two extra bytes in positions 7 and 8 and so the DeviceEndpointDescriptor bLength may be 9 rather than 7.

Offset Field Size Value Description
0-6 (same as in USB 1.1)
7 bRefresh 1 0x00 Unused
8 bSynchAddress 1 0x00 Unused

What this means is in practice if you assume exact sizes, newer devices will be ignored by this driver if they use any extensions even though they should be backwards compatible.

The assertion that bytes beyond the expected size must be ignored is mentioned briefly in the USB spec in Chapter 9.5 "Descriptors":

If a descriptor returns with a value in its length field that is less than defined by this specification, the
descriptor is invalid and should be rejected by the host. If the descriptor returns with a value in its length field that is greater than defined by this specification, the extra bytes are ignored by the host, but the next
descriptor is located using the length returned rather than the length expected.

This would also be relevant to parse_endpoints (and other descriptor parsing logic above) which assumes the record length when advancing the working_buffer slice working_buffer = &working_buffer[EndpointDescriptor::SIZE..]; as that could break if some bytes in the extended area happened to collide with a valid start to a descriptor.

I think it would be best to refactor the code to always advance the slice being considered by the amount specified by bLength of the current slice, and never assume that it's safe to advance by a static amount.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A sent an attempted fix as another PR targeting this PR branch. Please take a look :)

@jakerr
Copy link

jakerr commented Dec 4, 2024

This branch currently has quite a few commits mixed in that are duplicates of things that have already been merged into main that I think make it a bit unwieldy to review.

I tried my hand at rebasing it in my fork's usb-host that did require a couple conflict resolutions but hopefully I got them right.
Just leaving it here for the authors in case it's helpul.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.