deps: python3-kafka sync our package with upstream [SRE-7606] #24

Open · wants to merge 5 commits into main

Conversation

rushidave (Collaborator)

We need to sync with upstream because the python3-kafka package is failing due to an import error. The fix we need has already been merged upstream (https://github.com/dpkp/kafka-python/pull/2398/files), so we need to pull it into our package.

orange-kao and others added 5 commits January 3, 2024 10:55
After a stop/start of the Kafka service, kafka-python may use 100% CPU, caused by busy-retrying while the socket is closed. This fixes the issue by sleeping 0.1 seconds if the fd is negative.
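
For illustration, a minimal sketch of the guard this commit describes, assuming a plain Python socket-like object; the function and names are illustrative, not the actual kafka-python code:

import time

def poll_socket_once(sock, do_poll):
    # A closed socket reports a negative file descriptor; back off briefly
    # instead of busy-retrying and burning 100% CPU.
    if sock.fileno() < 0:
        time.sleep(0.1)
        return None
    return do_poll(sock)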
Implement support for SOCKS5 proxies. Implement a new proxy wrapper that handles the SOCKS5 connection, authentication, and requesting connections to the actual Kafka broker endpoint.

The proxy can be configured via a new keyword argument `socks5_proxy` on consumers, producers, or the admin client. The value is a URL with an optional username and password, e.g.
`socks5://user:[email protected]:10800`

The implementation is done as a state machine that emulates repeated calls to connect_ex. The rationale for this design is to keep changes to the actual BrokerConnection object minimal.
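
For reference, a minimal usage sketch of the `socks5_proxy` keyword argument described above; the topic name, broker address, and proxy URL are placeholders:

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "example-topic",
    bootstrap_servers="broker.internal.example:9092",
    # The URL may carry an optional username and password,
    # e.g. socks5://user:[email protected]:10800
    socks5_proxy="socks5://proxy.example.com:10800",
)
for message in consumer:
    print(message.topic, message.offset, message.value)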
If the value `_controller_id` is out-of-date and the node was removed from the cluster, `_send_request_to_node` would enter an infinite loop.

A call to `maybe_connect` can be performed while the cluster metadata is being updated. If that happens, the assumption that every entry in `_connecting` has metadata won't hold. The existing assert will then raise on every subsequent call to `poll`, rendering the client instance unusable.

This fixes the issue by ignoring connection requests to nodes that no longer have metadata available.
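
A rough sketch of that guard, assuming a client object that tracks pending node ids in a `_connecting` set and exposes cluster metadata via `cluster.broker_metadata`; this is illustrative, not the actual patch:

def finish_pending_connects(client):
    # Iterate over a copy so stale entries can be dropped while looping.
    for node_id in list(client._connecting):
        if client.cluster.broker_metadata(node_id) is None:
            # The node was removed from the cluster while metadata was being
            # updated; drop the stale connection request instead of asserting.
            client._connecting.discard(node_id)
            continue
        client._maybe_connect(node_id)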
rushidave self-assigned this on Jan 3, 2024
rushidave changed the title from "Rushidave merge upstream" to "Sync our package with upstream" on Jan 3, 2024
rushidave requested review from ivanyu and tvainika on January 3, 2024 at 17:24
rushidave changed the title from "Sync our package with upstream" to "deps: python3-kafka sync our package with upstream [SRE-7606]" on Jan 18, 2024
@edward-evans-aiven left a comment


Just the one note; I'm not sure whether it needs to change or is OK as-is. Most of the sync looks fine.

Returns a future that may be polled for status and results.

:param node_id: The broker id to which to send the message.
:param request: The message to send.
:param wakeup: Optional flag to disable thread-wakeup.


We seem to be missing the `wakeup` parameter.
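
For context, the parameter set implied by the docstring above would look roughly like this (a stub sketch, not the real client code):

class KafkaClientStub:
    def send(self, node_id, request, wakeup=True):
        """Send a request to a specific broker node.

        :param node_id: The broker id to which to send the message.
        :param request: The message to send.
        :param wakeup: Optional flag to disable thread-wakeup.
        """
        raise NotImplementedError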

@aiven-anton left a comment


Merging upstream into our fork is not the right action to take; rather, we should make sure our commits are applied on top of upstream, i.e. essentially a rebase.

@aiven-anton

@rushidave To expand on my comment, I'd expect this to be carried out roughly like this:

# Set up access to the two repositories and fetch them.
$ git remote add fork [email protected]:aiven/kafka-python.git
$ git remote add upstream [email protected]:dpkp/kafka-python.git
$ git fetch -p fork
$ git fetch -p upstream
# Check out main and reset it to the fork's main (make sure you have nothing uncommitted before doing this).
$ git checkout main
$ git reset fork/main --hard
# Rebase main on upstream/main. This means any commit made on the fork will be reapplied on top of
# the new upstream/main commits.
$ git rebase upstream/main
# Force push our rebased main to the fork.
# --force-with-lease makes sure we don't accidentally overwrite something that was committed while
# we were working.
$ git push fork main --force-with-lease

@rushidave (Collaborator, Author)

rushidave commented Feb 28, 2024

I would love to get another set of eyes on this before pushing. Also, we have a backup branch: rushidave-main-backup-feb-2024.

git log (main is 8 commits ahead of upstream):
commit c662d941b13f11620f60fc48580f2f6643051ca6 (HEAD -> main)
Author: Jonas Keeling <[email protected]>
Date:   Mon Feb 26 14:36:42 2024 +0100

    bugfix: catch KeyError in check_version and retry

commit 1a6bf016fe5c47c597afa081c6e103ca0d283d14
Author: Jonas Keeling <[email protected]>
Date:   Mon Feb 26 10:29:17 2024 +0100

    Revert "bugfix: raise error in check_version if broker is unavailable"

commit 919f61da9efdfe1cf760b94585d72b16d414cdda
Author: Jonas Keeling <[email protected]>
Date:   Thu Feb 22 23:00:34 2024 +0100

    bugfix: raise error in check_version if broker is unavailable
    
    This fixes an issue in check_version where KeyError is raised if the broker is unavailable or an invalid node_id is used. Instead it will return BrokerNotAvailableError.

commit 4ca2f45994ff71da76211ca28b8978fbc2278b17
Author: Augusto F. Hack <[email protected]>
Date:   Tue Jan 5 13:20:38 2021 +0100

    bugfix: race among _connecting and cluster metadata (#2189)
    
    A call to `maybe_connect` can be performed while the cluster metadata is
    being updated. If that happens, the assumption that every entry in
    `_connecting` has metadata won't hold. The existing assert will then
    raise on every subsequent call to `poll`, rendering the client instance
    unusable.

    This fixes the issue by ignoring connection requests to nodes that no
    longer have the metadata available.

commit 6985761d2f981ace1943e728a33781afc454552c
Author: Augusto F. Hack <[email protected]>
Date:   Thu Jan 14 17:50:28 2021 +0100

    bugfix: infinite loop when send msgs to controller (#2194)
    
    If the value `_controller_id` is out-of-date and the node was removed
    from the cluster, `_send_request_to_node` would enter an infinite loop.

commit a927ff28a2d78fdf45f8892ad7b67c66ce6276fc
Author: Augusto F. Hack <[email protected]>
Date:   Thu Jan 14 15:18:05 2021 +0100

    bugfix: fix infinite loop on KafkaAdminClient (#2194)
    
    An infinite loop may happen with the following pattern:
    
        self._send_request_to_node(self._client.least_loaded_node(), request)
    
    The problem happens when `self._client`'s cluster metadata is out-of-date, and the
    result of `least_loaded_node()` is a node that has been removed from the cluster but
    the client is unaware of it. When this happens, `_send_request_to_node` will enter an
    infinite loop waiting for the chosen node to become available, which won't happen,
    resulting in an infinite loop.
    
    This commit introduces a new method named `_send_request_to_least_loaded_node` which
    handles the case above. This is done by regularly checking if the target node is
    available in the cluster metadata, and if not, a new node is chosen.
    
    Notes:
    
    - This does not yet cover every call site to `_send_request_to_node`; there are some
      other places where similar race conditions may happen.
    - The code above does not guarantee that the request itself will be successful, since
      it is still possible for the target node to exit, however, it does remove the
      infinite loop which can render client code unusable.

commit 6efff5251b887306f46c1aeff8984d6292390378
Author: Heikki Nousiainen <[email protected]>
Date:   Tue Sep 29 21:45:27 2020 +0300

    Support connecting through SOCKS5 proxies (#2169)
    
    Implement support for SOCKS5 proxies. Implement a new proxy wrapper
    that handles SOCKS5 connection, authentication and requesting
    connections to the actual Kafka broker endpoint.
    
    The proxy can be configured via a new keyword argument `socks5_proxy`
    to consumers, producers, or the admin client. The value is a URL with an
    optional username and password, e.g.
    `socks5://user:[email protected]:10800`

    The implementation is done as a state machine that emulates repeated
    calls to connect_ex. The rationale for this design is to keep changes
    to the actual BrokerConnection object minimal.

commit ce7d853198b4207fa52d6bbf4ef0f5426ae102bf
Author: Orange Kao <[email protected]>
Date:   Tue Aug 4 01:42:47 2020 +0000

    Avoid 100% CPU usage while socket is closed (sleep)
    
    After a stop/start of the Kafka service, kafka-python may use 100% CPU,
    caused by busy-retrying while the socket is closed. This fixes the issue
    by sleeping 0.1 seconds if the fd is negative.

commit b68f61d49556377bf111bebb82f8f2bd360cc6f7 (upstream/master)
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Wed Dec 13 21:11:04 2023 -0500

    Bump actions/setup-java from 3 to 4 (#2417)
    
    Bumps [actions/setup-java](https://github.com/actions/setup-java) from 3 to 4.
    - [Release notes](https://github.com/actions/setup-java/releases)
    - [Commits](https://github.com/actions/setup-java/compare/v3...v4)
    
    ---
    updated-dependencies:
    - dependency-name: actions/setup-java
      dependency-type: direct:production
      update-type: version-update:semver-major
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
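
For context, a rough sketch of the behaviour the top commit in this log describes (catching the KeyError raised when the chosen node is missing from metadata and retrying); the wrapper function and retry count are illustrative, not the actual change:

from kafka.errors import BrokerNotAvailableError

def check_version_with_retry(client, node_id=None, timeout=2.0, retries=3):
    # Retry when the chosen node is missing from cluster metadata (which
    # surfaces as a KeyError), instead of letting the KeyError escape.
    for _ in range(retries):
        try:
            return client.check_version(node_id=node_id, timeout=timeout)
        except KeyError:
            continue
    raise BrokerNotAvailableError(
        "broker not available after {} retries".format(retries)
    )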

@aiven-anton

aiven-anton commented Feb 29, 2024

@rushidave I've verified:

I would consider omitting 1a6bf016fe5c47c597afa081c6e103ca0d283d14 and 919f61da9efdfe1cf760b94585d72b16d414cdda as one reverts the other, but no strong opinion either way. This LGTM.

@aiven-anton

@rushidave From your comment here.

Discussion here to remove branch protection

Yes, I think you can ask in access management if you can get admin permissions for this repository.
