Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Subscriber is being held when another subscriber callback is being called #2678

Open
TonyWelte opened this issue Nov 20, 2024 · 1 comment
Open

Comments

@TonyWelte
Copy link

Bug report

Required Info:

  • Operating System:
    • Ubuntu 24.04 (docker and host)
  • Installation type:
    • binaries
  • Version or commit hash:
    • jazzy
  • DDS implementation:
    • Fast-RTPS
  • Client library (if applicable):
    • rclcpp

Steps to reproduce issue

#include <rclcpp/rclcpp.hpp>

#include <std_msgs/msg/empty.hpp>

class MyClass {
  public:
    MyClass(rclcpp::Node& node) {
        std::cout << "Constructor: " << this << std::endl;

        _sub = node.create_subscription<std_msgs::msg::Empty>("test", 10, std::bind(&MyClass::callback, this, std::placeholders::_1));
    }

    ~MyClass() {
        std::cout << "Destructor:  " << this << ", Count: " << _sub.use_count() << std::endl;
    }

    void callback(const std_msgs::msg::Empty&) {
        std::cout << "Callback:    " << this << std::endl;
        my_data += 1.0;
    } 

  private:
  rclcpp::Subscription<std_msgs::msg::Empty>::SharedPtr _sub;

  double my_data;
};

int main(int argc, char* argv[]) {
    rclcpp::init(argc, argv);
    
    auto node = std::make_shared<rclcpp::Node>("test_node");

    std::unique_ptr<MyClass> object;
    
    auto sub = node->create_subscription<std_msgs::msg::Empty>("sub", 10, [&object, &node](const std_msgs::msg::Empty&){
        if(object){
            object.reset();
        }
        else {
            object = std::make_unique<MyClass>(*node);
        }
    });

    rclcpp::spin(node);

    rclcpp::shutdown();
    
    return 0;
}

You should be able to build the executable with just rclcpp and std_msgs linked.

I recommend compiling with the address sanitizer to make sure it crashes once the issue occurs. Without it, it might not crash but it's definitely UB.

colcon build --cmake-args -DCMAKE_BUILD_TYPE=Debug -DCMAKE_CXX_FLAGS=-fsanitize=address

In three terminals

# Term1
ros2 topic pub --rate 10.0 /sub std_msgs/msg/Empty '{}'
# Term2
ros2 topic pub --rate 100.1 /test std_msgs/msg/Empty '{}'
# Term3
ASAN_OPTIONS=new_delete_type_mismatch=0 ros2 run YOUR_TEST_PACKAGE YOU_TEST_EXECUTABLE

Expected behavior

My understanding was that the executor had weak ownership of the subscriber and would only get a shared_ptr when executing the callback. Hence, in a single threaded context when a callback from one subscriber is called it should be possible to destroy another subscriber and clean related objects.

Even in a multi threaded context I would expect the same if the two subscribers interacting with each other are part a mutually exclusive callback group.

Actual behavior

Here when the callback for /sub is called MyClass is destroyed but the subscriber to /test it contains is held by someone else (the count of the share_ptr is 2).

This worked fine with ROS2 humble, the subscriber count was 1 when it's callback wasn't being called.

Assuming this is the intended behavior it becomes difficult cleanup objects used by subscribers (or containing subscribers as in the example). I'm sure it still feasible by keeping a weak_ptr before resting our shared_ptr, then the executor will hopefully release its shared_ptr and only after that destroy the object. But what was as simple as destroying the object becomes more convoluted.

Additional information

I found these issues that are somewhat related but I think it's a different issue since they had callbacks called in parallel which is not the case here:

@Barry-Xu-2018
Copy link
Collaborator

The implementation of Executor has been refactored.

When there are some "wait" ready to handle,

[this](WaitResultKind wait_result_kind) -> WaitResult<WaitSetTemplate> {
// convert the result into a WaitResult
switch (wait_result_kind) {
case WaitResultKind::Ready:
return WaitResult<WaitSetTemplate>::from_ready_wait_result_kind(*this);

A WaitResult will be created. storage_acquire_ownerships() will be called in wait_result_acquire(). It will get ownership for all waitable entities.

WaitResult(WaitResultKind wait_result_kind, WaitSetT & wait_set)
: wait_result_kind_(wait_result_kind),
wait_set_pointer_(&wait_set)
{
// Should be enforced by the static factory methods on this class.
assert(WaitResultKind::Ready == wait_result_kind);
// Secure thread-safety (if provided) and shared ownership (if needed).
this->get_wait_set().wait_result_acquire();
}

When handling each ready wait (call callback function), the ownerships for all waitable entities are hold (I understand that this is probably done for efficiency reasons).

Is this something that needs to be changed? Let's hear what others think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants