Why does get_metadata() fail when running in a debugger, but succeeds otherwise? #628

Open
themightyoarfish opened this issue Nov 26, 2024 · 6 comments
Labels
question Further information is requested

Comments

@themightyoarfish

Describe your question

This is a very interesting problem I have never had, and I don't even know how to begin debugging it.

Our client code calls get_metadata() several times, since it often returns an empty string for the first n attempts.

    // For unknown reasons, the first attempt at `get_metadata()` (an HTTP API
    // request) often aborts in the SDK and we get an empty string. Trying again
    // right away often helps, so we do just that.
    std::string meta{};
    const size_t num_retries = config.get<int>("fetch_metadata_num_retries", 5);
    const size_t fetch_timeout = config.get<int>("fetch_metadata_timeout_s", 2);
    for (size_t connect_attempt = 0;
         connect_attempt < num_retries && meta.empty();
         connect_attempt++) {
      try {
        LOG_INFO(Logger::SENSOR,
                 __func__ << ": get_metadata attempt number "
                          << connect_attempt + 1);
        meta = sensor::get_metadata(*cli_tmp, fetch_timeout);
      } catch (const std::runtime_error& e) {
        // Log the exception text so the failure reason is not lost.
        LOG_INFO(
            Logger::SENSOR,
            __func__ << ": Failed fetching sensor metadata: " << e.what());
      } catch (const std::invalid_argument& e) {
        LOG_INFO(
            Logger::SENSOR,
            __func__ << ": Failed parsing sensor metadata: " << e.what());
      }
    }

    if (meta.empty()) {
      throw std::runtime_error(
          "Fetching or parsing sensor info json failed. Often, this indicates "
          "a network "
          "communication problem or that the sensor is currently starting");
    }
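The loop above retries immediately, so a sensor that is still starting gets no time to recover between attempts. A minimal sketch of the same retry idea with a growing pause between attempts follows. `retry_until_nonempty` is a hypothetical helper (not part of the Ouster SDK), and the doubling delay is an assumption, not something the SDK prescribes.

```cpp
#include <chrono>
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>
#include <thread>

// Hypothetical helper, not part of the Ouster SDK.
// Calls `fetch` up to `max_attempts` times and returns the first non-empty
// result. An exception from `fetch` counts as a failed attempt. The pause
// between attempts starts at `initial_delay` and doubles each time.
std::string retry_until_nonempty(const std::function<std::string()>& fetch,
                                 int max_attempts,
                                 std::chrono::milliseconds initial_delay) {
  std::string result;
  auto delay = initial_delay;
  for (int attempt = 0; attempt < max_attempts; ++attempt) {
    try {
      result = fetch();
    } catch (const std::exception&) {
      result.clear();  // treat an exception like an empty response
    }
    if (!result.empty()) {
      break;
    }
    if (attempt + 1 < max_attempts) {
      std::this_thread::sleep_for(delay);  // no sleep after the last attempt
      delay *= 2;
    }
  }
  return result;
}
```

With the code from the question, the call might look like `retry_until_nonempty([&] { return sensor::get_metadata(*cli_tmp, fetch_timeout); }, 5, std::chrono::milliseconds(200))`, where the 200 ms starting delay is an arbitrary choice.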

This mostly works, but when I run the same program under lldb, every attempt fails, no matter how many times I retry.

I recently upgraded the OS, but I don't know whether Apple's new LLVM version has anything to do with this. How could I begin troubleshooting?

Platform (please complete the following information):

  • Ouster Sensor? OS1 rev 7
  • Ouster Firmware Version? v3.1.0
  • Programming Language? C++
  • Operating System? macOS 15.1.1
  • Machine Architecture? arm
  • git commit hash (if not the latest) 3ecf147
@themightyoarfish themightyoarfish added the question Further information is requested label Nov 26, 2024
@themightyoarfish
Author

I'm suspecting this is some fuckup of Apple's that haunts me now that I've upgraded the OS.

@themightyoarfish
Author

On a related note, why does get_metadata() so often just return an empty string?

@themightyoarfish
Author

When I use this code to do the HTTP request directly

    auto sensor_http =
        ouster::sensor::util::SensorHttp::create("os-122307000738.local", 10);
    auto info = sensor_http->sensor_info(10);

I receive

libc++abi: terminating due to uncaught exception of type std::runtime_error: CurlClient::execute_request failed for the url: [http://os-122307000738.local/api/v1/system/firmware] with the error message: Couldn't connect to server

This happens only when running under the debugger; without it, the program runs fine.

Meanwhile, curl on the command line works too:

curl --request GET --url http://os-122307000738.local/api/v1/sensor/metadata/sensor_info

It seems that under lldb, I cannot make network connections?

@themightyoarfish
Author

This also does not seem to be related to debugging entitlements. Running the following has no effect:

codesign -s - -v -f --entitlements =(echo -n '<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "https://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
    <dict>
        <key>com.apple.security.get-task-allow</key>
        <true/>
    </dict>
</plist>') <program>

@themightyoarfish
Author

Update: This seems to be a problem with Apple's lldb. Homebrew's lldb (/opt/homebrew/Cellar/llvm/18.1.8/bin/lldb in my case) works 🤡

@themightyoarfish
Author

What's really strange is that opening TCP connections from inside a C++ program works normally in lldb:

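#include <cstdlib>
#include <iostream>

int main() {
  // std::system() runs nc in a child shell; -z only checks that port 80
  // accepts a connection, without sending any data.
  const int status =
      std::system("nc -z os-122307000738.local 80 > /dev/null 2>&1");
  if (status == 0) {
    std::cout << "success\n";
  } else {
    std::cout << "failed\n";
  }
}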
