Crashed in a multi-threaded environment #225

Closed
wciq1208 opened this issue Jul 9, 2024 · 8 comments


wciq1208 commented Jul 9, 2024

pypdfium2==4.30.0
marker-pdf==0.2.13

I call the code as follows:

    self.model = load_all_models()
    # other code
    with self._chat_lock:
        full_text, images, out_meta = convert_single_pdf(
            pdf_file.name, self.model, max_pages=max_pages, langs=langs,
            batch_multiplier=batch_multiplier, start_page=start_page,
        )

An error occurred when triggering Python's GC:

#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=46945239307840) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=46945239307840) at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=46945239307840, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3 0x00002aaaaac31476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4 0x00002aaaaac177f3 in __GI_abort () at ./stdlib/abort.c:79
#5 0x00002aaaaac78676 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x2aaaaadcab77 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#6 0x00002aaaaac8fcfc in malloc_printerr (str=str@entry=0x2aaaaadc870e "corrupted double-linked list") at ./malloc/malloc.c:5664
#7 0x00002aaaaac907cc in unlink_chunk (p=, av=0x2ab754000030) at ./malloc/malloc.c:1635
#8 0x00002aaaaac90969 in malloc_consolidate (av=av@entry=0x2ab754000030) at ./malloc/malloc.c:4780
#9 0x00002aaaaac91ea0 in _int_free (av=0x2ab754000030, p=0x2ab7546bce60, have_lock=) at ./malloc/malloc.c:4674
#10 0x00002aaaaac94453 in __GI___libc_free (mem=) at ./malloc/malloc.c:3391
#11 0x00002aab5f946488 in std::__Cr::deque<std::__Cr::unique_ptr<CPDF_ObjectWalker::SubobjectIterator, std::__Cr::default_delete<CPDF_ObjectWalker::SubobjectIterator> >, std::__Cr::allocator<std::__Cr::unique_ptr<CPDF_ObjectWalker::SubobjectIterator, std::__Cr::default_delete<CPDF_ObjectWalker::SubobjectIterator> > > >::~deque() () from /opt/conda/lib/python3.10/site-packages/pypdfium2_raw/libpdfium.so
#12 0x00002aab5f946236 in CPDF_PageObjectHolder::~CPDF_PageObjectHolder() () from /opt/conda/lib/python3.10/site-packages/pypdfium2_raw/libpdfium.so
#13 0x00002aab5f942d1e in CPDF_Page::~CPDF_Page() () from /opt/conda/lib/python3.10/site-packages/pypdfium2_raw/libpdfium.so
#14 0x00002aaaab31d052 in ffi_call_unix64 () from /opt/conda/lib/python3.10/lib-dynload/../../libffi.so.8
#15 0x00002aaaab31b925 in ffi_call_int () from /opt/conda/lib/python3.10/lib-dynload/../../libffi.so.8
#16 0x00002aaaab31c06e in ffi_call () from /opt/conda/lib/python3.10/lib-dynload/../../libffi.so.8
#17 0x00002aaaab2fc1e7 in _call_function_pointer (argtypecount=, argcount=1, resmem=0x2ab24a4feb90, restype=, atypes=, avalues=, pProc=0x2aab5fa2c3b0 <FPDF_ClosePage>, flags=4353)
at /usr/local/src/conda/python-3.10.14/Modules/_ctypes/callproc.c:916
#18 _ctypes_callproc (pProc=0x2aab5fa2c3b0 <FPDF_ClosePage>, argtuple=0x2ab24b34f340, flags=4353, argtypes=, restype=0x747720 <_Py_NoneStruct>, checker=0x0) at /usr/local/src/conda/python-3.10.14/Modules/_ctypes/callproc.c:1262
#19 0x00002aaaab30523e in PyCFuncPtr_call (self=, inargs=, kwds=0x0) at /usr/local/src/conda/python-3.10.14/Modules/_ctypes/_ctypes.c:4221
#20 0x00000000004f705b in _PyObject_MakeTpCall (tstate=0x2ab73c0608c0, callable=0x2aab5f6213c0, args=, nargs=, keywords=0x0) at /usr/local/src/conda/python-3.10.14/Objects/call.c:215
#21 0x00000000004f3106 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x2aaf5e17fb20, callable=0x2aab5f6213c0, tstate=) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:112
#22 _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x2aaf5e17fb20, callable=0x2aab5f6213c0, tstate=) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:99
#23 PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x2aaf5e17fb20, callable=0x2aab5f6213c0) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:123
#24 call_function (kwnames=0x0, oparg=, pp_stack=, trace_info=0x2ab24a4fee80, tstate=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5893
#25 _PyEval_EvalFrameDefault (tstate=, f=0x2aaf5e17f9a0, throwflag=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:4181
#26 0x00000000004fdd4f in _PyEval_EvalFrame (throwflag=0, f=0x2aaf5e17f9a0, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/internal/pycore_ceval.h:46
#27 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x2aab5fbb2720, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5067
#28 _PyFunction_Vectorcall (func=0x2aab5fbb2710, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.14/Objects/call.c:342
#29 0x00000000004f08a9 in do_call_core (kwdict=0x2ab50b9f0640, callargs=0x2ab24b6895c0, func=0x2aab5fbb2710, trace_info=0x2ab24a4ff040, tstate=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5945
#30 _PyEval_EvalFrameDefault (tstate=, f=0x2aaf52726440, throwflag=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:4277
#31 0x00000000004fdd4f in _PyEval_EvalFrame (throwflag=0, f=0x2aaf52726440, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/internal/pycore_ceval.h:46
#32 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x2aab5f630dd0, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5067
#33 _PyFunction_Vectorcall (func=0x2aab5f630dc0, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.14/Objects/call.c:342
#34 0x00000000004f08a9 in do_call_core (kwdict=0x2ab50b9f0a00, callargs=0x2ab24a9475e0, func=0x2aab5f630dc0, trace_info=0x2ab24a4ff200, tstate=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5945
#35 _PyEval_EvalFrameDefault (tstate=, f=0x2aabc3a928c0, throwflag=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:4277
#36 0x00000000004f63ad in _PyEval_EvalFrame (throwflag=0, f=0x2aabc3a928c0, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/internal/pycore_ceval.h:46
#37 _PyEval_Vector (kwnames=0x0, argcount=, args=, locals=0x0, con=0x2aaaab149d90, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5067
#38 _PyFunction_Vectorcall (kwnames=0x0, nargsf=, stack=, func=0x2aaaab149d80) at /usr/local/src/conda/python-3.10.14/Objects/call.c:342
#39 _PyObject_FastCallDictTstate (tstate=0x2ab73c0608c0, callable=0x2aaaab149d80, args=, nargsf=, kwargs=) at /usr/local/src/conda/python-3.10.14/Objects/call.c:142
#40 0x0000000000507b36 in _PyObject_Call_Prepend (tstate=0x2ab73c0608c0, callable=0x2aaaab149d80, obj=0x2ab48fe2ca80, args=, kwargs=0x0) at /usr/local/src/conda/python-3.10.14/Objects/call.c:431
#41 0x00000000005cf913 in slot_tp_call (self=0x2ab48fe2ca80, args=0x2ab24b44d6c0, kwds=0x0) at /usr/local/src/conda/python-3.10.14/Objects/typeobject.c:7494
#42 0x00000000004f705b in _PyObject_MakeTpCall (tstate=0x2ab73c0608c0, callable=0x2ab48fe2ca80, args=, nargs=, keywords=0x0) at /usr/local/src/conda/python-3.10.14/Objects/call.c:215
#43 0x000000000059860a in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=9223372036854775809, args=0x2ab24a4ff458, callable=0x2ab48fe2ca80, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:112
#44 PyObject_CallOneArg (func=0x2ab48fe2ca80, arg=) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:184
#45 0x00000000004e19c9 in handle_weakrefs (old=0x75b6d0, unreachable=0x2ab24a4ff520) at /usr/local/src/conda/python-3.10.14/Modules/gcmodule.c:887
#46 gc_collect_main (tstate=0x2ab73c0608c0, generation=2, n_collected=0x2ab24a4ff600, n_uncollectable=0x2ab24a4ff5f8, nofail=0) at /usr/local/src/conda/python-3.10.14/Modules/gcmodule.c:1281
#47 0x000000000059168c in gc_collect_with_callback (tstate=tstate@entry=0x2ab73c0608c0, generation=2) at /usr/local/src/conda/python-3.10.14/Modules/gcmodule.c:1413
#48 0x00000000004d789a in gc_collect_generations (tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Modules/gcmodule.c:1468
#49 _PyObject_GC_Alloc (basicsize=, use_calloc=0) at /usr/local/src/conda/python-3.10.14/Modules/gcmodule.c:2297
#50 _PyObject_GC_Malloc (basicsize=) at /usr/local/src/conda/python-3.10.14/Modules/gcmodule.c:2307
#51 _PyObject_GC_New (tp=0x749da0 <PyDict_Type>) at /usr/local/src/conda/python-3.10.14/Modules/gcmodule.c:2319
#52 0x00000000004d8d5a in new_dict (values=0x759048 <empty_values>, keys=0x749d60 <empty_keys_struct>) at /usr/local/src/conda/python-3.10.14/Objects/dictobject.c:663
#53 PyDict_New () at /usr/local/src/conda/python-3.10.14/Objects/dictobject.c:745
#54 0x00002aab129d56e6 in _parse_object_unicode (next_idx_ptr=, idx=50500, pystr=0x2ab740437170, s=0x2aab5a4088e0) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:704
#55 scan_once_unicode (s=0x2aab5a4088e0, pystr=0x2ab740437170, idx=50499, next_idx_ptr=) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:1064
#56 0x00002aab129d55b9 in _parse_array_unicode (next_idx_ptr=, idx=, pystr=0x2ab740437170, s=0x2aab5a4088e0) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:841
#57 scan_once_unicode (s=0x2aab5a4088e0, pystr=0x2ab740437170, idx=, next_idx_ptr=) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:1072
#58 0x00002aab129d583f in _parse_object_unicode (next_idx_ptr=, idx=, pystr=0x2ab740437170, s=0x2aab5a4088e0) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:743
#59 scan_once_unicode (s=0x2aab5a4088e0, pystr=0x2ab740437170, idx=, next_idx_ptr=) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:1064
#60 0x00002aab129d55b9 in _parse_array_unicode (next_idx_ptr=, idx=, pystr=0x2ab740437170, s=0x2aab5a4088e0) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:841
#61 scan_once_unicode (s=0x2aab5a4088e0, pystr=0x2ab740437170, idx=, next_idx_ptr=) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:1072
#62 0x00002aab129d583f in _parse_object_unicode (next_idx_ptr=, idx=, pystr=0x2ab740437170, s=0x2aab5a4088e0) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:743
#63 scan_once_unicode (s=0x2aab5a4088e0, pystr=0x2ab740437170, idx=, next_idx_ptr=) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:1064
#64 0x00002aab129d55b9 in _parse_array_unicode (next_idx_ptr=, idx=, pystr=0x2ab740437170, s=0x2aab5a4088e0) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:841
#65 scan_once_unicode (s=0x2aab5a4088e0, pystr=0x2ab740437170, idx=, next_idx_ptr=) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:1072
#66 0x00002aab129d583f in _parse_object_unicode (next_idx_ptr=, idx=, pystr=0x2ab740437170, s=0x2aab5a4088e0) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:743
#67 scan_once_unicode (s=0x2aab5a4088e0, pystr=0x2ab740437170, idx=, next_idx_ptr=) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:1064
#68 0x00002aab129d55b9 in _parse_array_unicode (next_idx_ptr=, idx=, pystr=0x2ab740437170, s=0x2aab5a4088e0) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:841
#69 scan_once_unicode (s=0x2aab5a4088e0, pystr=0x2ab740437170, idx=, next_idx_ptr=) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:1072
#70 0x00002aab129d583f in _parse_object_unicode (next_idx_ptr=, idx=, pystr=0x2ab740437170, s=0x2aab5a4088e0) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:743
#71 scan_once_unicode (s=0x2aab5a4088e0, pystr=0x2ab740437170, idx=, next_idx_ptr=) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:1064
#72 0x00002aab129d4c48 in scanner_call (self=0x2aab5a4088e0, args=, kwds=) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:1149
#73 0x00000000004f705b in _PyObject_MakeTpCall (tstate=0x2ab73c0608c0, callable=0x2aab5a4088e0, args=, nargs=, keywords=0x0) at /usr/local/src/conda/python-3.10.14/Objects/call.c:215
#74 0x00000000004f3106 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x2ab6e5f706d0, callable=0x2aab5a4088e0, tstate=) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:112
#75 _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x2ab6e5f706d0, callable=0x2aab5a4088e0, tstate=) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:99
#76 PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x2ab6e5f706d0, callable=0x2aab5a4088e0) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:123
#77 call_function (kwnames=0x0, oparg=, pp_stack=, trace_info=0x2ab24a4ffc60, tstate=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5893
#78 _PyEval_EvalFrameDefault (tstate=, f=0x2ab6e5f70530, throwflag=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:4181
#79 0x00000000005095ce in _PyEval_EvalFrame (throwflag=0, f=0x2ab6e5f70530, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/internal/pycore_ceval.h:46
#80 _PyEval_Vector (kwnames=, argcount=, args=0x2ab7507cd8b8, locals=0x0, con=0x2aab5a3c35c0, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5067
#81 _PyFunction_Vectorcall (kwnames=, nargsf=, stack=0x2ab7507cd8b8, func=0x2aab5a3c35b0) at /usr/local/src/conda/python-3.10.14/Objects/call.c:342
#82 _PyObject_VectorcallTstate (kwnames=, nargsf=, args=0x2ab7507cd8b8, callable=0x2aab5a3c35b0, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:114
#83 method_vectorcall (method=, args=0x2ab7507cd8c0, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.14/Objects/classobject.c:53
#84 0x00000000004ef0e3 in _PyObject_VectorcallTstate (kwnames=0x2aab5a3db430, nargsf=, args=, callable=0x2aafa1a2f640, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:114
#85 PyObject_Vectorcall (kwnames=0x2aab5a3db430, nargsf=, args=, callable=0x2aafa1a2f640) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:123
#86 call_function (kwnames=0x2aab5a3db430, oparg=, pp_stack=, trace_info=0x2ab24a4ffe70, tstate=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5893
#87 _PyEval_EvalFrameDefault (tstate=, f=0x2ab7507cd730, throwflag=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:4231
#88 0x00000000004fdd4f in _PyEval_EvalFrame (throwflag=0, f=0x2ab7507cd730, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/internal/pycore_ceval.h:46
#89 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x2aab5a3c3530, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5067
#90 _PyFunction_Vectorcall (func=0x2aab5a3c3520, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.14/Objects/call.c:342
#91 0x00000000004ee461 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x2ab760c7b2e8, callable=0x2aab5a3c3520, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:114
#92 PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x2ab760c7b2e8, callable=0x2aab5a3c3520) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:123
#93 call_function (kwnames=0x0, oparg=, pp_stack=, trace_info=0x2ab24a500030, tstate=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5893
#94 _PyEval_EvalFrameDefault (tstate=, f=0x2ab760c7b140, throwflag=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:4198
#95 0x00000000004fdd4f in _PyEval_EvalFrame (throwflag=0, f=0x2ab760c7b140, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/internal/pycore_ceval.h:46
#96 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x2aab5a3c3c80, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5067
#97 _PyFunction_Vectorcall (func=0x2aab5a3c3c70, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.14/Objects/call.c:342
#98 0x00000000004f08a9 in do_call_core (kwdict=0x2aafa1a23f00, callargs=0x2ab24b44c400, func=0x2aab5a3c3c70, trace_info=0x2ab24a5001f0, tstate=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5945
#99 _PyEval_EvalFrameDefault (tstate=, f=0x2aabc3a76510, throwflag=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:4277
#100 0x00000000004fdd4f in _PyEval_EvalFrame (throwflag=0, f=0x2aabc3a76510, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/internal/pycore_ceval.h:46
#101 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x2aab5cd65490, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5067
#102 _PyFunction_Vectorcall (func=0x2aab5cd65480, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.14/Objects/call.c:342
#103 0x00000000004ee461 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x2ab73c4f0310, callable=0x2aab5cd65480, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:114
#104 PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x2ab73c4f0310, callable=0x2aab5cd65480) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:123
#105 call_function (kwnames=0x0, oparg=, pp_stack=, trace_info=0x2ab24a5003b0, tstate=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5893
#106 _PyEval_EvalFrameDefault (tstate=, f=0x2ab73c4f0180, throwflag=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:4198
#107 0x00000000004fdd4f in _PyEval_EvalFrame (throwflag=0, f=0x2ab73c4f0180, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/internal/pycore_ceval.h:46
#108 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x2aab9595fad0, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5067
#109 _PyFunction_Vectorcall (func=0x2aab9595fac0, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.14/Objects/call.c:342
#110 0x00000000004ee461 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x2ab73c008968, callable=0x2aab9595fac0, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:114
#111 PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x2ab73c008968, callable=0x2aab9595fac0) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:123
#112 call_function (kwnames=0x0, oparg=, pp_stack=, trace_info=0x2ab24a500570, tstate=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5893
#113 _PyEval_EvalFrameDefault (tstate=, f=0x2ab73c0087c0, throwflag=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:4198
#114 0x00000000004fdd4f in _PyEval_EvalFrame (throwflag=0, f=0x2ab73c0087c0, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/internal/pycore_ceval.h:46
#115 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x2aabc3a49760, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5067
#116 _PyFunction_Vectorcall (func=0x2aabc3a49750, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.14/Objects/call.c:342
#117 0x00000000004ee461 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x2aabc3ab6550, callable=0x2aabc3a49750, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:114
#118 PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x2aabc3ab6550, callable=0x2aabc3a49750) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:123
#119 call_function (kwnames=0x0, oparg=, pp_stack=, trace_info=0x2ab24a500730, tstate=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5893
#120 _PyEval_EvalFrameDefault (tstate=, f=0x2aabc3ab63e0, throwflag=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:4198
#121 0x0000000000509857 in _PyEval_EvalFrame (throwflag=0, f=0x2aabc3ab63e0, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/internal/pycore_ceval.h:46
#122 _PyEval_Vector (kwnames=0x0, argcount=1, args=0x2ab24a500818, locals=0x0, con=0x2aabc3a495b0, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5067
#123 _PyFunction_Vectorcall (kwnames=0x0, nargsf=1, stack=0x2ab24a500818, func=0x2aabc3a495a0) at /usr/local/src/conda/python-3.10.14/Objects/call.c:342
#124 _PyObject_VectorcallTstate (kwnames=0x0, nargsf=1, args=0x2ab24a500818, callable=0x2aabc3a495a0, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:114
#125 method_vectorcall (method=, args=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.14/Objects/classobject.c:61
#126 0x00000000004f08a9 in do_call_core (kwdict=0x2ab24a2a7040, callargs=0x2aaaaae78070, func=0x2aaf53f6b080, trace_info=0x2ab24a500940, tstate=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5945
#127 _PyEval_EvalFrameDefault (tstate=, f=0x2aabc3ab67a0, throwflag=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:4277
#128 0x00000000004fdd4f in _PyEval_EvalFrame (throwflag=0, f=0x2aabc3ab67a0, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/internal/pycore_ceval.h:46
#129 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x2aaaab161370, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5067
#130 _PyFunction_Vectorcall (func=0x2aaaab161360, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.14/Objects/call.c:342
#131 0x00000000004ee461 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x2aaf52c618f0, callable=0x2aaaab161360, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:114
#132 PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x2aaf52c618f0, callable=0x2aaaab161360) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:123
#133 call_function (kwnames=0x0, oparg=, pp_stack=, trace_info=0x2ab24a500b00, tstate=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5893
#134 _PyEval_EvalFrameDefault (tstate=, f=0x2aaf52c61780, throwflag=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:4198
#135 0x00000000004fdd4f in _PyEval_EvalFrame (throwflag=0, f=0x2aaf52c61780, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/internal/pycore_ceval.h:46
#136 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x2aaaab161640, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5067
#137 _PyFunction_Vectorcall (func=0x2aaaab161630, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.14/Objects/call.c:342
#138 0x00000000004ee461 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x2aaf5284d470, callable=0x2aaaab161630, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:114
#139 PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x2aaf5284d470, callable=0x2aaaab161630) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:123
#140 call_function (kwnames=0x0, oparg=, pp_stack=, trace_info=0x2ab24a500cc0, tstate=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5893
#141 _PyEval_EvalFrameDefault (tstate=, f=0x2aaf5284d300, throwflag=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:4198
#142 0x0000000000509857 in _PyEval_EvalFrame (throwflag=0, f=0x2aaf5284d300, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/internal/pycore_ceval.h:46
#143 _PyEval_Vector (kwnames=0x0, argcount=1, args=0x2ab24a500da8, locals=0x0, con=0x2aaaab161400, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5067
#144 _PyFunction_Vectorcall (kwnames=0x0, nargsf=1, stack=0x2ab24a500da8, func=0x2aaaab1613f0) at /usr/local/src/conda/python-3.10.14/Objects/call.c:342
#145 _PyObject_VectorcallTstate (kwnames=0x0, nargsf=1, args=0x2ab24a500da8, callable=0x2aaaab1613f0, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:114
#146 method_vectorcall (method=, args=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.14/Objects/classobject.c:61
#147 0x00000000005e5dd5 in thread_run (boot_raw=0x2aabc3aba940) at /usr/local/src/conda/python-3.10.14/Modules/_threadmodule.c:1100
#148 0x00000000005e5d34 in pythread_wrapper (arg=) at /usr/local/src/conda/python-3.10.14/Python/thread_pthread.h:248
#149 0x00002aaaaac83ac3 in start_thread (arg=) at ./nptl/pthread_create.c:442
#150 0x00002aaaaad14a04 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100


rmast commented Aug 25, 2024

Those hyperlinks cause all kinds of mentioning-pollution. Can you put the log in between a 'code' block?

image

HaoRenkk123 commented

I ran into the same problem.
My server code:
image
Error:
image


mara004 commented Oct 8, 2024

That's expected, as one of marker's dependencies (pypdfium2/pdfium) is not thread-compatible:
It is the caller's responsibility to use locks or similar to prevent threaded access of pdfium's APIs.
https://pypdfium2.readthedocs.io/en/stable/python_api.html#thread-incompatibility
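
As a minimal sketch of what "use locks" can look like in practice: the `UnsafeParser` class below is a hypothetical stand-in for any native library that must not be entered from two threads at once (it makes no real pypdfium2 calls), and `PDFIUM_LOCK`, `parse_serialized` and `run_workers` are illustrative names. Every call into the library goes through one process-wide lock.

```python
import threading

# One process-wide lock guarding every call into the non-thread-safe library.
PDFIUM_LOCK = threading.Lock()


class UnsafeParser:
    """Hypothetical stand-in for a library that must not be entered concurrently."""

    def __init__(self):
        self.active = 0       # number of threads currently inside parse()
        self.max_active = 0   # highest concurrency ever observed
        self.calls = 0

    def parse(self, name):
        self.active += 1
        self.max_active = max(self.max_active, self.active)
        self.calls += 1
        self.active -= 1
        return f"text of {name}"


def parse_serialized(parser, name):
    # All threads funnel through the same lock, so parse() never overlaps.
    with PDFIUM_LOCK:
        return parser.parse(name)


def run_workers(parser, n_threads=8, per_thread=50):
    def work(i):
        for j in range(per_thread):
            parse_serialized(parser, f"doc-{i}-{j}.pdf")

    threads = [threading.Thread(target=work, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return parser
```

The lock only protects calls that actually go through the wrapper.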

wciq1208 commented

> That's expected, as one of marker's dependencies (pypdfium2/pdfium) is not thread-compatible: It is the caller's responsibility to use locks or similar to prevent threaded access of pdfium's APIs. https://pypdfium2.readthedocs.io/en/stable/python_api.html#thread-incompatibility

I have already used a thread lock during parsing, but the core dump occurred during GC


mara004 commented Oct 10, 2024

> I have already used a thread lock during parsing, but the core dump occurred during GC

Hmm, pypdfium2 auto-closes pdfium objects on garbage collection using weakref.finalize() if the caller did not close them explicitly. In the backtrace above, there is an FPDF_ClosePage() call at frame #17.
The question is, who causes simultaneous pdfium calls during the GC phase, and how can we prevent that? Or is there some prior corruption that causes the close call to fail?

A possible workaround/test might be to add explicit close calls to all pdfium root objects throughout the dependencies and see if that fixes the issue.

Unfortunately, threading/GC-related issues are hard to debug.


aj8907 commented Nov 26, 2024

I had a similar error with the corrupted double-linked list (Colab A100). Running this code fixed the problem for me:

from threading import RLock
from contextlib import contextmanager
import logging
import time
import traceback

class SafeLock:
    """Reentrant lock with a per-attempt timeout and a bounded number of retries."""

    def __init__(self, name="SafeLock", timeout=60, max_retries=3, retry_delay=1):
        self._lock = RLock()  # Reentrant: the owning thread may acquire it again
        self.name = name
        self.timeout = timeout
        self.max_retries = max_retries
        self.retry_delay = retry_delay
        self.logger = logging.getLogger(__name__)

    @contextmanager
    def acquire_safely(self):
        # Retry only the acquisition. The body of the with-block runs once:
        # a @contextmanager generator must not yield again after an exception.
        for attempt in range(1, self.max_retries + 1):
            if self._lock.acquire(timeout=self.timeout):
                break
            self.logger.warning(
                f"Failed to acquire lock {self.name} (attempt {attempt}/{self.max_retries})"
            )
            time.sleep(self.retry_delay)
        else:
            raise TimeoutError(f"Failed to acquire {self.name} after {self.max_retries} attempts")

        try:
            yield
        except Exception as e:
            self.logger.error(f"Error while holding lock: {e}\n{traceback.format_exc()}")
            raise
        finally:
            self._lock.release()

# convert_single_pdf and model_lst come from marker; model_lst is assumed to be
# the result of load_all_models(), as in the original post.
safe_lock = SafeLock(name="PDFConverter", timeout=120, max_retries=5, retry_delay=2)
fpath = "/path/to/pdf/file.pdf"
with safe_lock.acquire_safely():
    try:
        full_text, images, out_meta = convert_single_pdf(fpath, model_lst)
    except Exception as e:
        logging.error(f"PDF conversion error: {e}\n{traceback.format_exc()}")
        raise


mara004 commented Nov 26, 2024

@aj8907 Sorry, I'm not much into threading, but I don't logically see how this is supposed to fix the above issue? And why is a bare RLock not sufficient?

> The question is, who causes simultaneous pdfium calls during the GC phase, and how can we prevent that? Or is there some prior corruption that causes the close call to fail?

If the cause is indeed simultaneous calls due to GC, and not other caller-caused corruption, I figured we may be able to add an API to plug in a caller-provided lock into our auto-close machinery.
@wciq1208, or anyone else affected: The pre-requisite for me to work on this would be a minimal reproducible example (the snippet in the initial post is incomplete).
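
A rough sketch of what such a hook could look like, using only the standard library. Everything here is hypothetical: `set_autoclose_lock`, `AutoClosed` and `_locked_close` are made-up names for illustration and are not part of pypdfium2. The idea is that the finalizer takes the caller's lock before issuing the close call, so automatic closes are serialized with the caller's own pdfium calls.

```python
import contextlib
import threading
import weakref

# Hypothetical configuration slot; by default no locking is applied.
_config = {"lock": contextlib.nullcontext()}


def set_autoclose_lock(lock):
    """Register the caller's lock so automatic close calls are serialized with it."""
    _config["lock"] = lock


def _locked_close(close_func, raw):
    # Runs at finalization time; takes the caller's lock before touching the library.
    with _config["lock"]:
        close_func(raw)


class AutoClosed:
    """Hypothetical wrapper whose finalizer closes the raw handle under the lock."""

    def __init__(self, raw, close_func):
        self.raw = raw
        # The callback receives the raw handle, never self, so collection is possible.
        self._finalizer = weakref.finalize(self, _locked_close, close_func, raw)

    def close(self):
        self._finalizer()


# An RLock (rather than a Lock) avoids a self-deadlock if finalization is
# triggered in the thread that already holds the lock.
pdfium_lock = threading.RLock()
set_autoclose_lock(pdfium_lock)
```

The caller would then use the same `pdfium_lock` around its own parsing calls. Whether this addresses the crash above depends on the cause really being concurrent close calls, which has not been confirmed.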


wciq1208 commented Nov 27, 2024

> @aj8907 Sorry, I'm not much into threading, but I don't logically see how this is supposed to fix the above issue? And why is a bare RLock not sufficient?
>
> > The question is, who causes simultaneous pdfium calls during the GC phase, and how can we prevent that? Or is there some prior corruption that causes the close call to fail?
>
> If the cause is indeed simultaneous calls due to GC, and not other caller-caused corruption, I figured we may be able to add an API to plug in a caller-provided lock into our auto-close machinery. @wciq1208, or anyone else affected: The pre-requisite for me to work on this would be a minimal reproducible example (the snippet in the initial post is incomplete).

I have switched from multithreading to multiprocessing for inference, and my project version is currently at 0.2.17. Since the project seems to be undergoing a restructuring for version 2, I don't need a solution to this issue for now.
