-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
deps: python3-kafka sync our package with upstream [SRE-7606] #24
base: main
Are you sure you want to change the base?
Conversation
After stop/start kafka service, kafka-python may use 100% CPU caused by busy-retry while the socket was closed. This fix the issue by sleep 0.1 second if the fd is negative.
Implement support for SOCKS5 proxies. Implement a new proxy wrapper that handles SOCKS5 connection, authentication and requesting connections to the actual Kafka broker endpoint. The proxy can be configured via a new keyword argument `socks5_proxy` to consumers, producers or admin client. The value is URL with optional username and password. E.g. `socks5://user:[email protected]:10800` The implementation is done in state machine that emulates repeated calls to connect_ex. The rationale with this design is minimal changes to the actual BrokerConnection object.
If the value `_controller_id` is out-of-date and the node was removed from the cluster, `_send_request_to_node` would enter an infinite loop.
A call to `maybe_connect` can be performed while the cluster metadata is being updated. If that happens, the assumption that every entry in `_connecting` has metadata won't hold. The existing assert will then raise on every subsequent call to `poll` driving the client instance unusable. This fixes the issue by ignoring connetion request to nodes that do not have the metadata available anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just the one note, not sure if it needs to change or is OK. Most of the sync seems fine.
kafka/admin/client.py
Outdated
Returns a future that may be polled for status and results. | ||
|
||
:param node_id: The broker id to which to send the message. | ||
:param request: The message to send. | ||
:param wakeup: Optional flag to disable thread-wakeup. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We seem to be missing the wakeup
parameter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Merging upstream into our fork is not the right action to take, rather we should make sure our commits are applied on top of upstream, i.e. essentially a rebase.
@rushidave To expand on my comment, I'd expect this to be carried out something like this: # Setup access to the two repositories and fetch.
$ git remote add fork [email protected]:aiven/kafka-python.git
$ git remote add upstream [email protected]:dpkp/kafka-python.git
$ git fetch -p fork
$ git fetch -p upstream
# Checkout main and reset it to the fork's main (make sure to not have anything uncommitted before doing this)
$ git checkout main
$ git reset fork/main --hard
# Rebase main on upstream/main. This means any commit made on the fork will be reapplied on top of
# the new upstream/main commits.
$ git rebase upstream/main
# Force push our rebased main to the fork.
# --force-with-lease makes sure we don't accidentally overwrite something that was committed while
# we were working.
$ git push fork main --force-with-lease |
Would love to get another set of eyes before pushing. Also we have a backup branch
|
@rushidave I've verified:
I would consider omitting |
@rushidave From your comment here.
Yes, I think you can ask in access management if you can get admin permissions for this repository. |
Need to sync upstream as python3-kafka package is failing due to this import error. The fix we need has been merged upstream (https://github.com/dpkp/kafka-python/pull/2398/files) and we need to sync with our package.