Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vcos_dummy_thread_cleanup() SEGV when attempting to call function in unloaded lib #716

Open
patwil opened this issue Nov 30, 2021 · 2 comments

Comments

@patwil
Copy link

patwil commented Nov 30, 2021

Describe the bug
When a thread containing SDL_CreateWindow() finishes its at_exit functions (i.e. thread->at_exit[i].pfn) may include pointers to functions which are no longer valid.
For example, client_thread_detach() is added to at_exit[].pfn by platform_tls_get(); however the calling thread exists after the EGL lib has been unloaded.

To reproduce
Compile and run the following demo code:

#include <iostream>
#include <vector>
#include <thread>
#include <chrono>
#include <memory>
#include <functional>
#include <signal.h>
#include <string.h>
#include <SDL.h>
#include <SDL_thread.h>

class Foo
{
    private:
        SDL_Window* window = nullptr;
    public:
        Foo() {}

        void run()
        {
            std::this_thread::sleep_for(std::chrono::milliseconds(100));
            std::cout << "Testing video drivers..." << std::endl;
            std::vector<bool> drivers( SDL_GetNumVideoDrivers() );

            for( int i = 0; i < static_cast<int>(drivers.size()); ++i ) {
                drivers[i] = (0 == SDL_VideoInit( SDL_GetVideoDriver(i) ) );
                SDL_VideoQuit();
            }

            std::cout << "SDL_VIDEODRIVER available:";

            for ( int i = 0; i < static_cast<int>(drivers.size()); ++i ) {
                std::cout << " " << SDL_GetVideoDriver(i);
            }

            std::cout << std::endl;

            std::cout << "SDL_VIDEODRIVER usable   :";

            for ( int i = 0; i < static_cast<int>(drivers.size()); ++i ) {
                if( !drivers[i] ) {
                    continue;
                }

                std::cout << " " << SDL_GetVideoDriver(i);
            }

            std::cout << std::endl;

            if ( SDL_Init( SDL_INIT_VIDEO ) < 0 ) {
                std::cerr << "SDL_Init returned error: \"" << SDL_GetError() << "\"" << std::endl;
                return;
            }

            std::cout << "SDL_VIDEODRIVER selected : " << SDL_GetCurrentVideoDriver() << std::endl;

            window = SDL_CreateWindow("RPI-Test",
                                      SDL_WINDOWPOS_UNDEFINED,
                                      SDL_WINDOWPOS_UNDEFINED,
                                      0,
                                      0,
                                      SDL_WINDOW_FULLSCREEN | SDL_WINDOW_BORDERLESS);

            if (!window) {
                std::cerr << "SDL_CreateWindow returned error: \"" << SDL_GetError() << "\"" << std::endl;
                return;
            }

            std::this_thread::sleep_for(std::chrono::seconds(1));

            std::cout << "foo - SDL_DestroyWindow()" << std::endl;
            SDL_DestroyWindow(window);
            std::cout << "foo - SDL_Quit()" << std::endl;
            SDL_Quit();

            std::cout << "foo - end" << std::endl;
        }
};

int main()
{
    Foo* foo = new Foo();
    std::thread* fooThread;

    fooThread  = new std::thread(std::bind(&Foo::run, foo));
    std::cout << "main, foo now execute concurrently..." << std::endl;

    std::this_thread::sleep_for(std::chrono::milliseconds(500));
    std::cout << "main finished sleep..." << std::endl;

    fooThread->join();  // pauses until fooThread finishes
    std::cout << "fooThread completed." << std::endl;

    return 0;
}

Build as follows:

g++ -o sdl_RPI_threads sdl_RPI_threads.cpp $(sdl2-config --cflags --libs) -lpthread

Expected behaviour

pi@rpi4b-02:~/dev/sdl/pitest $ export SDL_VIDEODRIVER=x11
pi@rpi4b-02:~/dev/sdl/pitest $ ./sdl_RPI_threads 
main, foo now execute concurrently...
Testing video drivers...
main finished sleep...
SDL_VIDEODRIVER available: x11 wayland RPI dummy
SDL_VIDEODRIVER usable   : x11 RPI
SDL_VIDEODRIVER selected : x11
foo - SDL_DestroyWindow()
foo - SDL_Quit()
foo - end
fooThread completed.

Actual behaviour

pi@rpi3b-02:~/dev/sdl/pitest $ export SDL_VIDEODRIVER=RPI
pi@rpi3b-02:~/dev/sdl/pitest $ ./sdl_RPI_threads
main, foo now execute concurrently...
Testing video drivers...
SDL_VIDEODRIVER available: RPI dummy
SDL_VIDEODRIVER usable   : RPI
SDL_VIDEODRIVER selected : RPI
main finished sleep...
foo - SDL_DestroyWindow()
foo - SDL_Quit()
foo - end
Segmentation fault (core dumped)

System

raspinfo

Also happens on Pi3A, PiZW and on Raspberry OS and Arch Linux.

Logs
Backtrace shows:

(gdb) bt
#0  0x758eb574 in ?? ()
#1  0x76a5a990 in vcos_dummy_thread_cleanup (cxt=0x76103480)
    at /home/dom/projects/staging/userland/interface/vcos/pthreads/vcos_pthreads.c:104
#2  0x76e212fc in __nptl_deallocate_tsd () at pthread_create.c:301
#3  0x76e214a4 in start_thread (arg=0x76a56440) at pthread_create.c:497
Backtrace stopped: Cannot access memory at address 0x1fb509f8

Additional context
A limitation of SDL2 is that it must be initialised in the same thread as it waits for events.

Thus, we have a thread which has:

SDL_init();
...
SDL_CreateWindow();  // platform_tls_get() adds client_thread_detach as an at_exit function for thread
...
SDL_DestroyWindow(); // calls SDL_EGL_UnloadLibrary() which unloads client_thread_detach which is still in at_exit pfn list
...
SDL_Quit();
...
// thread ends, causing call to vcos_dummy_thread_cleanup() which calls at_exit pfn (i.e. client_thread_detach).
// This causes segmentation violation

I have a workaround in a forked repo which seems to fix issue. However I doubt that it is the right way to go. I can create a pull request if it helps.

@pelwell
Copy link
Contributor

pelwell commented Dec 1, 2021

Don't you just call SDL_EGL_LoadLibrary(NULL) from the main application thread? SDL_CreateWindow only loads the library because you haven't already done so.

@patwil
Copy link
Author

patwil commented Dec 1, 2021

The only SDL_EGL_LoadLibrary() I see is
extern int SDL_EGL_LoadLibrary(_THIS, const char *path, NativeDisplayType native_display, EGLenum platform);
(where _THIS is defined as SDL_VideoDevice *_this ).
It assumes SDL has been initialised such that it dereferences _this without checking that it's been set.

Initialising SDL in main application is not feasible as the thread needs to call SDL_WaitEvent();
References: [1] , [2] , and others.

I think that this is because SDL video initialisation also initialises the Events subsystem.

Maybe I could duplicate the SDL code to load the EGL lib, but this is also far from ideal. Given the vast multitude of deployments of userland and SDL, it's highly unlikely that the issue is anything other than my code structure. I just haven't seen this problem mentioned anywhere else.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants