-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
nomad logs
no longer works without NOMAD_REGION
set
#24635
Comments
Hi @HerbCSO! It looks like you're running into the same issue described here: #24609 (comment). The For what it's worth, I haven't been able to reproduce this behavior yet.
This is an interesting clue! The CLI is 1.8.0+ and the agents are older? |
I don't have full access to the cluster (specifically no API access), however
Correct. And using CLI versions 1.6.2 or 1.7.7 (basically any version < 1.8.0) from the same machine I was able to pull logs without specifying the region. |
Thanks @HerbCSO. I was able to reproduce and the reproducer was that the requests were going to a non-leader. The HTTP API request arrives at the server with the region set to (Edit: weirdly I think this may have revealed an existing bug, rather than being the bug itself. https://github.com/hashicorp/nomad/blob/v1.9.3/api/api.go#L612 doesn't match the behavior we have anywhere else.) (Edit2: yeah this bug dates back to 4d3b75d. When making direct-to-client requests as in |
In #16872 we added support for unix domain sockets, but this required mutating the `Config` when parsing the address. In #23785 we fixed a bug where if the configuration was used across multiple clients that mutation would happen multiple times and the address would be incorrectly parsed. When making `alloc log` or `alloc exec` calls to a region where the region is not "global", we create a new client from the same configuration and then set the address. But in this case we copy the private `url` field and that causes the URL parsing to be skipped for the new client. This results in the region always being set to the string literal `global` (because of mTLS handling code introduced all the way back in 4d3b75d), which fails with an error "no path to region" when the cluster isn't non-global and requests are sent to a non-leader. The "right" way of fixing this would be for `ClientConfig` not to change the region to global in the first place, but as this is a public API and extremely longstanding behavior, it could potentially be a breaking change for some downstream consumers. Instead, we'll avoid copying the private `url` field so that the new address is re-parsed. Fixes: #24635 Fixes: #24609 Ref: #16872 Ref: #23785 Ref: 4d3b75d
In #16872 we added support for unix domain sockets, but this required mutating the `Config` when parsing the address so as to remove the port number. In #23785 we fixed a bug where if the configuration was used across multiple clients that mutation would happen multiple times and the address would be incorrectly parsed. When making `alloc log`, `alloc fs`, or `alloc exec` calls where we have line-of-sight to the client, we attempt to make a HTTP API call directly to the client node. So we create a new API client from the same configuration and then set the address. But in this case we copy the private `url` field and that causes the URL parsing to be skipped for the new client. This results in the region always being set to the string literal `"global"` (because of mTLS handling code introduced all the way back in 4d3b75d), unless the user has set the region specifically. This fails with an error "no path to region" when the cluster isn't non-global and requests are sent to a non-leader. Arguably the "right" way of fixing this would be for `ClientConfig` not to change the API client's region to `"global"` in the first place, but as this is a public API and extremely longstanding behavior, it could potentially be a breaking change for some downstream consumers. Instead, we'll avoid copying the private `url` field so that the new address is re-parsed. Fixes: #24635 Fixes: #24609 Ref: #16872 Ref: #23785 Ref: 4d3b75d
In #16872 we added support for unix domain sockets, but this required mutating the `Config` when parsing the address so as to remove the port number. In #23785 we fixed a bug where if the configuration was used across multiple clients that mutation would happen multiple times and the address would be incorrectly parsed. When making `alloc log`, `alloc fs`, or `alloc exec` calls where we have line-of-sight to the client, we attempt to make a HTTP API call directly to the client node. So we create a new API client from the same configuration and then set the address. But in this case we copy the private `url` field and that causes the URL parsing to be skipped for the new client. This results in the region always being set to the string literal `"global"` (because of mTLS handling code introduced all the way back in 4d3b75d), unless the user has set the region specifically. This fails with an error "no path to region" when the cluster isn't non-global and requests are sent to a non-leader. Arguably the "right" way of fixing this would be for `ClientConfig` not to change the API client's region to `"global"` in the first place, but as this is a public API and extremely longstanding behavior, it could potentially be a breaking change for some downstream consumers. Instead, we'll avoid copying the private `url` field so that the new address is re-parsed. Fixes: #24635 Fixes: #24609 Ref: #16872 Ref: #23785 Ref: 4d3b75d
Fix up here for review: #24644 |
In #16872 we added support for unix domain sockets, but this required mutating the `Config` when parsing the address so as to remove the port number. In #23785 we fixed a bug where if the configuration was used across multiple clients that mutation would happen multiple times and the address would be incorrectly parsed. When making `alloc log`, `alloc fs`, or `alloc exec` calls where we have line-of-sight to the client, we attempt to make a HTTP API call directly to the client node. So we create a new API client from the same configuration and then set the address. But in this case we copy the private `url` field and that causes the URL parsing to be skipped for the new client. This results in the region always being set to the string literal `"global"` (because of mTLS handling code introduced all the way back in 4d3b75d), unless the user has set the region specifically. This fails with an error "no path to region" when the cluster isn't non-global and requests are sent to a non-leader. Arguably the "right" way of fixing this would be for `ClientConfig` not to change the API client's region to `"global"` in the first place, but as this is a public API and extremely longstanding behavior, it could potentially be a breaking change for some downstream consumers. Instead, we'll avoid copying the private `url` field so that the new address is re-parsed. Fixes: #24635 Fixes: #24609 Ref: #16872 Ref: #23785 Ref: 4d3b75d
In #16872 we added support for unix domain sockets, but this required mutating the `Config` when parsing the address so as to remove the port number. In #23785 we fixed a bug where if the configuration was used across multiple clients that mutation would happen multiple times and the address would be incorrectly parsed. When making `alloc log`, `alloc fs`, or `alloc exec` calls where we have line-of-sight to the client, we attempt to make a HTTP API call directly to the client node. So we create a new API client from the same configuration and then set the address. But in this case we copy the private `url` field and that causes the URL parsing to be skipped for the new client. This results in the region always being set to the string literal `"global"` (because of mTLS handling code introduced all the way back in 4d3b75d), unless the user has set the region specifically. This fails with an error "no path to region" when the cluster isn't non-global and requests are sent to a non-leader. Arguably the "right" way of fixing this would be for `ClientConfig` not to change the API client's region to `"global"` in the first place, but as this is a public API and extremely longstanding behavior, it could potentially be a breaking change for some downstream consumers. Instead, we'll avoid copying the private `url` field so that the new address is re-parsed. Fixes: #24635 Fixes: #24609 Ref: #16872 Ref: #23785 Ref: 4d3b75d
In #16872 we added support for unix domain sockets, but this required mutating the `Config` when parsing the address so as to remove the port number. In #23785 we fixed a bug where if the configuration was used across multiple clients that mutation would happen multiple times and the address would be incorrectly parsed. When making `alloc log`, `alloc fs`, or `alloc exec` calls where we have line-of-sight to the client, we attempt to make a HTTP API call directly to the client node. So we create a new API client from the same configuration and then set the address. But in this case we copy the private `url` field and that causes the URL parsing to be skipped for the new client. This results in the region always being set to the string literal `"global"` (because of mTLS handling code introduced all the way back in 4d3b75d), unless the user has set the region specifically. This fails with an error "no path to region" when the cluster isn't non-global and requests are sent to a non-leader. Arguably the "right" way of fixing this would be for `ClientConfig` not to change the API client's region to `"global"` in the first place, but as this is a public API and extremely longstanding behavior, it could potentially be a breaking change for some downstream consumers. Instead, we'll avoid copying the private `url` field so that the new address is re-parsed. Fixes: #24635 Fixes: #24609 Ref: #16872 Ref: #23785 Ref: 4d3b75d
In #16872 we added support for unix domain sockets, but this required mutating the `Config` when parsing the address so as to remove the port number. In #23785 we fixed a bug where if the configuration was used across multiple clients that mutation would happen multiple times and the address would be incorrectly parsed. When making `alloc log`, `alloc fs`, or `alloc exec` calls where we have line-of-sight to the client, we attempt to make a HTTP API call directly to the client node. So we create a new API client from the same configuration and then set the address. But in this case we copy the private `url` field and that causes the URL parsing to be skipped for the new client. This results in the region always being set to the string literal `"global"` (because of mTLS handling code introduced all the way back in 4d3b75d), unless the user has set the region specifically. This fails with an error "no path to region" when the cluster isn't non-global and requests are sent to a non-leader. Arguably the "right" way of fixing this would be for `ClientConfig` not to change the API client's region to `"global"` in the first place, but as this is a public API and extremely longstanding behavior, it could potentially be a breaking change for some downstream consumers. Instead, we'll avoid copying the private `url` field so that the new address is re-parsed. Fixes: #24635 Fixes: #24609 Ref: #16872 Ref: #23785 Ref: 4d3b75d
…ddress (#24644) (#24682) In #16872 we added support for unix domain sockets, but this required mutating the `Config` when parsing the address so as to remove the port number. In #23785 we fixed a bug where if the configuration was used across multiple clients that mutation would happen multiple times and the address would be incorrectly parsed. When making `alloc log`, `alloc fs`, or `alloc exec` calls where we have line-of-sight to the client, we attempt to make a HTTP API call directly to the client node. So we create a new API client from the same configuration and then set the address. But in this case we copy the private `url` field and that causes the URL parsing to be skipped for the new client. This results in the region always being set to the string literal `"global"` (because of mTLS handling code introduced all the way back in 4d3b75d), unless the user has set the region specifically. This fails with an error "no path to region" when the cluster isn't non-global and requests are sent to a non-leader. Arguably the "right" way of fixing this would be for `ClientConfig` not to change the API client's region to `"global"` in the first place, but as this is a public API and extremely longstanding behavior, it could potentially be a breaking change for some downstream consumers. Instead, we'll avoid copying the private `url` field so that the new address is re-parsed. Fixes: #24635 Fixes: #24609 Ref: #16872 Ref: #23785 Ref: 4d3b75d Co-authored-by: Tim Gross <[email protected]>
Nomad version
Nomad v1.9.3
BuildDate 2024-11-11T16:35:41Z
Revision d92bf10
Operating system and Environment details
MacOS CLI, Linux server
Issue
Up to and including Nomad CLI 1.7.7 I was able to run
nomad alloc logs deadbeef
withoutNOMAD_REGION
being set. Since 1.8.0NOMAD_REGION
is required, else I just get aUnexpected response code: 500 (No path to region)
. So I have to either setNOMAD_REGION
or runnomad alloc logs -region myregion deadbeef
.From the docs it reads to me like this is unintentional behavior. It says
Defaults to the Agent's local region.
so I would assume I wouldn't need to specify this parameter. I didn't find anything in the release notes that specifies a necessary config change for this, so please let me know if I missed something there.Also, this sounds a bit like the issue described in hashicorp/nomad-pack#374, but I'm not sure if this is really the same thing.
FWIW our Nomad servers and clients are a bit older, running 1.6.3 - maybe that's part of the issue?
Reproduction steps
Run
nomad alloc logs
using a version prior to 1.8.0 - worksRun
nomad alloc logs
using a version prior >= 1.8.0 - produces 500 errorExpected Result
No 500 error
Actual Result
500 error
Job file (if appropriate)
N/A
Nomad Server logs (if appropriate)
N/A
Nomad Client logs (if appropriate)
N/A
The text was updated successfully, but these errors were encountered: