[Bug] zenoh_bridge_ros2dds fails during runtime with 100% CPU #152

Open
Timple opened this issue May 23, 2024 · 3 comments
Labels: bug (Something isn't working)

Comments

Timple commented May 23, 2024

Describe the bug

zenoh_bridge_ros2dds fails during runtime. It climbs to 100% CPU and no longer accepts connections on port 7447.

To reproduce

  1. Start with a really busy network:
$ ros2 topic list | wc -l
229
$ ros2 node list | wc -l
WARNING: Be aware that there are nodes in the graph that share an exact name, which can have unintended side effects.
111
  2. Run the bridge in a separate Docker container (a plain docker run equivalent is sketched after these steps):
  zenoh-bridge:
    image: eclipse/zenoh-bridge-ros2dds
    command: --listen tcp/0.0.0.0:7447
    environment:
      - ROS_DISTRO=iron
    network_mode: host
  3. Connect to the bridge from the other side:
$ RUST_BACKTRACE=1 zenoh_bridge_ros2dds -m client --connect tcp/10.8.129.75:7447
2024-05-23T09:31:11.123646Z  INFO main ThreadId(01) zenoh_bridge_ros2dds: zenoh-bridge-ros2dds v0.11.0-dev-110-gc0e7130
2024-05-23T09:31:11.123856Z  INFO main ThreadId(01) zenoh_bridge_ros2dds: Zenoh Config { id: 53102a8c91987dc6cacb4174a006d22f, metadata: Null, mode: Some(Client), connect: ConnectConfig { timeout_ms: None, endpoints: [tcp/10.8.129.75:7447], exit_on_failure: None, retry: None }, listen: ListenConfig { timeout_ms: None, endpoints: [], exit_on_failure: None, retry: None }, scouting: ScoutingConf { timeout: None, delay: None, multicast: ScoutingMulticastConf { enabled: None, address: None, interface: None, autoconnect: None, listen: None }, gossip: GossipConf { enabled: None, multihop: None, autoconnect: None } }, timestamping: TimestampingConf { enabled: Some(Unique(true)), drop_future_timestamp: None }, queries_default_timeout: None, routing: RoutingConf { router: RouterRoutingConf { peers_failover_brokering: None }, peer: PeerRoutingConf { mode: None } }, aggregation: AggregationConf { subscribers: [], publishers: [] }, transport: TransportConf { unicast: TransportUnicastConf { accept_timeout: 10000, accept_pending: 100, max_sessions: 1000, max_links: 1, lowlatency: false, qos: QoSUnicastConf { enabled: true }, compression: CompressionUnicastConf { enabled: false } }, multicast: TransportMulticastConf { join_interval: Some(2500), max_sessions: Some(1000), qos: QoSMulticastConf { enabled: false }, compression: CompressionMulticastConf { enabled: false } }, link: TransportLinkConf { protocols: None, tx: LinkTxConf { sequence_number_resolution: U32, lease: 10000, keep_alive: 4, batch_size: 65535, queue: QueueConf { size: QueueSizeConf { control: 1, real_time: 1, interactive_high: 1, interactive_low: 1, data_high: 2, data: 4, data_low: 2, background: 1 }, congestion_control: CongestionControlConf { wait_before_drop: 1000 }, backoff: 100 }, threads: 5 }, rx: LinkRxConf { buffer_size: 65535, max_message_size: 1073741824 }, tls: TLSConf { root_ca_certificate: None, server_private_key: None, server_certificate: None, client_auth: None, client_private_key: None, client_certificate: None, server_name_verification: None, root_ca_certificate_base64: None, server_private_key_base64: None, server_certificate_base64: None, client_private_key_base64: None, client_certificate_base64: None }, unixpipe: UnixPipeConf { file_access_mask: None } }, shared_memory: SharedMemoryConf { enabled: false }, auth: AuthConf { usrpwd: UsrPwdConf { user: None, password: None, dictionary_file: None }, pubkey: PubKeyConf { public_key_pem: None, private_key_pem: None, public_key_file: None, private_key_file: None, key_size: None, known_keys_file: None } } }, adminspace: AdminSpaceConf { enabled: true, permissions: PermissionsConf { read: true, write: false } }, downsampling: [], access_control: AclConfig { enabled: false, default_permission: Deny, rules: None }, plugins_loading: PluginsLoading { enabled: true, search_dirs: None }, plugins: Object {"ros2dds": Object {"ros_localhost_only": Bool(false)}} }
2024-05-23T09:31:11.123936Z  INFO main ThreadId(01) zenoh::net::runtime: Using ZID: 53102a8c91987dc6cacb4174a006d22f
2024-05-23T09:31:11.128394Z  INFO main ThreadId(01) zenoh::plugins::loader: Starting required plugin "ros2dds"
2024-05-23T09:31:11.128453Z  INFO main ThreadId(01) zenoh::plugins::loader: Successfully started plugin ros2dds from "__static_lib__"
2024-05-23T09:31:11.128456Z  INFO main ThreadId(01) zenoh::plugins::loader: Finished loading plugins
2024-05-23T09:31:11.128480Z  INFO async-std/runtime ThreadId(43) zenoh_plugin_ros2dds: ROS2 plugin Config { id: None, namespace: "/", nodename: "zenoh_bridge_ros2dds", domain: 0, ros_localhost_only: false, allowance: None, pub_max_frequencies: [], transient_local_cache_multiplier: 10, queries_timeout: None, reliable_routes_blocking: true, pub_priorities: [], __required__: None, __path__: None }
2024-05-23T09:31:12.130079Z  WARN              main ThreadId(01) zenoh::net::runtime::orchestrator: Unable to connect to tcp/10.8.129.75:7447! deadline has elapsed
2024-05-23T09:31:12.130160Z ERROR              main ThreadId(01) zenoh::net::runtime::orchestrator: [] Unable to connect to any of [tcp/10.8.129.75:7447]!  at /home/tim/.cargo/git/checkouts/zenoh-cc237f2570fab813/75aa273/zenoh/src/net/runtime/orchestrator.rs:278.
Failed to start Zenoh runtime: [] Unable to connect to any of [tcp/10.8.129.75:7447]!  at /home/tim/.cargo/git/checkouts/zenoh-cc237f2570fab813/75aa273/zenoh/src/net/runtime/orchestrator.rs:278.. Exiting...
  4. Validate the port with telnet:
$ telnet 10.8.251.17 7447
Trying 10.8.251.17...
Connected to 10.8.251.17.
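
For reference, a plain docker run sketch of the compose service from step 2 (a sketch only, assuming the image entrypoint forwards its arguments to the bridge binary, as the compose command above suggests):

$ docker run --rm --network host -e ROS_DISTRO=iron \
    eclipse/zenoh-bridge-ros2dds --listen tcp/0.0.0.0:7447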

System info

Ubuntu: 22.04
ROS: Iron
Docker image: latest (today)
Local bridge: main (today), built from source; the latest Docker image fails in the same way

Other

Restarting the Docker container on the robot does resolve the CPU usage (back to a few percent), but a connection is still not possible.
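
For reference, restarting and checking the container looks roughly like this (a sketch, not the exact commands used; the service name zenoh-bridge is taken from the compose file above):

$ docker compose restart zenoh-bridge   # CPU drops back to a few percent after this
$ docker stats --no-stream              # confirm the container's CPU usage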

@Timple Timple added the bug Something isn't working label May 23, 2024
@YannickdeHoop

I have the same issue

@jeaninevbrr

Same here!

@JEnoch
Member

JEnoch commented Oct 18, 2024

Are you still experiencing this issue with image eclipse/zenoh-bridge-ros2dds:1.0.0-rc.2?
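
For a quick check without editing the compose file, a minimal sketch of running that tag directly (same host-network assumptions as in the reproduce steps above):

$ docker pull eclipse/zenoh-bridge-ros2dds:1.0.0-rc.2
$ docker run --rm --network host -e ROS_DISTRO=iron \
    eclipse/zenoh-bridge-ros2dds:1.0.0-rc.2 --listen tcp/0.0.0.0:7447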
