Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix deadlock in oath2 authentication #59557

Open
wants to merge 18 commits into
base: master
Choose a base branch
from

Conversation

nyalldawson
Copy link
Collaborator

We have to take care that we don't directly create QgsO2 objects on the thread where the authentication request is occurring,
as QgsO2 objects are long running but the thread requesting authentication may be short-lived. If we create QgsO2 objects
on a short-running thread, then when we later try to authenticate using the same oauth2 authentication method we
will get a deadlock, as the O2 reply server will no longer have an active thread or an event loop running.

This was especially evident in browser items which use oauth2 authentication. Since these are usually populated using very
short-life threads, we'd often hit this situation:

  1. The connection would be expanded. Browser would create a thread to populate the connection. The oauth2 objects would then be created on this same thread, and everything would initially work OK. The user could complete the authentication since the browser population thread was blocked until this was done.
  2. The browser item got populated, and then the thread populating it was destroyed
  3. Later, something else would request authentication using the same oauth2 config. This may be eg the user expanding out a
    different folder on the browser connection.
  4. If the oauth2 token had expired, then we'd try to refresh it. But this relied on event-based logic, and the event loop for the
    QgsO2 object was no longer around to handle this. The authentication request would dead-lock.

Fix this by ensuring that all QgsO2 objects are created and run on an instance of a new QgsOAuth2Factory QThread. This thread is created on demand, and will then exist for the life of the QGIS session.

This ensures that all QgsO2 objects will run on a thread with an application-long lifetime, so there's no issue if they later
require event-loop based logic (such as token refresh)

Funded by RGD Savoie Mont Blanc

The internal copy of the o2 library has customisations and API
changes which are required for QGIS... as a result it hasn't
been possible to build with the non-internal library version
for years.

Remove all this dead/unused code.
We only ever create one instance of a QgsAuthMethod, so there's no need
for this member to be static
This is being accessed across multiple threads, so we need
to protect access accordingly
We do not need to use deleteLater here for QgsAuthOAuth2Config,
so we can considerably simplify the logic
And make ownership clear by setting parent to the QgsO2 parent
We have to take care that we don't directly create QgsO2 objects
on the thread where the authentication request is occurring,
as QgsO2 objects are long running but the thread requesting
authentication may be short-lived. If we create QgsO2 objects
on a short-running thread, then when we later try to
authenticate using the same oauth2 authentication method we
will get a deadlock, as the O2 reply server will no longer
have an active thread or an event loop running.

This was especially evident in browser items which use oauth2
authentication. Since these are usually populated using very
short-life threads, we'd often hit this situation:

1. The connection would be expanded. Browser would create a thread
to populate the connection. The oauth2 objects would then be created
on this same thread, and everything would initially work OK. The
user could complete the authentication since the browser population
thread was blocked until this was done.
2. The browser item got populated, and then the thread populating
it was destroyed
3. Later, something else would request authentication using the
same oauth2 config. This may be eg the user expanding out a
different folder on the browser connection.
4. If the oauth2 token had expired, then we'd try to refresh it.
But this relied on event-based logic, and the event loop for the
QgsO2 object was no longer around to handle this. The authentication
request would dead-lock.

Fix this by ensuring that all QgsO2 objects are created and run on
an instance of a new QgsOAuth2Factory QThread. This thread is created
on demand, and will then exist for the life of the QGIS session.

This ensures that all QgsO2 objects will run on a thread with an
application-long lifetime, so there's no issue if they later
require event-loop based logic (such as token refresh)
We need to ensure that the link method is ONLY ever called from
the QgsO2 object's thread, because it involves creation of
child items and that is explicitly prohibited from external
threads by Qt.
@github-actions github-actions bot added this to the 3.42.0 milestone Nov 22, 2024

target_include_directories(authmethod_oauth2_a PRIVATE
${CMAKE_SOURCE_DIR}/external/qjsonwrapper
${CMAKE_SOURCE_DIR}/external/o2/src
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this change we accept the fact that we are adopting a forked version of o2.
This makes it different from other external "convenience" libraries to overcome the lack of these in some distributions.

I would suggest we move this out of the external folder into src/auth/oauth2, as all the cmake logic to build this library also sits in here.
Ideally I would also suggest to namespace the code to avoid symbol collision when linking to other libraries that depend on an upstream version of o2. (I've gone through some pain for poly2tri and similar libs in the past).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@m-kuhn

With this change we accept the fact that we are adopting a forked version of o2.

I don't see any change in the situation here. This was all dead code, unused by anyway since the introduction of the method.

Instead of abandoning any hope of a maintained upstream, I'd prefer to leave things as they are BUT fork the library repo to QGIS/o2, and then maintain our patchset on there. At least then we'd be making a semi-maintained version of the library available for others to use...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, that would be a good move indeed.

With vcpkg builds, I would like to build this as an external port and disable WITH_INTERNAL_O2 (this will improve the caching of the dependency and speed up the build).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@m-kuhn thanks for creating https://github.com/qgis/o2 -- I've just submitted a bunch of pull requests adding downstream changes onto that repo.

When those are merged I'll do a bit of modernization on the library (eg remove gross old style connects), and then resync our external copy.

I'm thinking I'll drop 81d221c just in case there's anything useful in there for you. I suspect you'll need to rewrite a lot of that anyway since it was never used, but maybe there's something there 😆

Copy link

github-actions bot commented Nov 22, 2024

🪟 Windows builds

Download Windows builds of this PR for testing.
Debug symbols for this build are available here.
(Built from commit 8c46b3e)

🪟 Windows Qt6 builds

Download Windows Qt6 builds of this PR for testing.
(Built from commit 8c46b3e)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants