Experiences with gdb elf #8

Open
mikrosk opened this issue Oct 19, 2023 · 16 comments

@mikrosk
Member

mikrosk commented Oct 19, 2023

This is not a bug report, just a "thread" to collect experiences (ok and bugs ;)) with our gdb. I wrote to both @vinriviere and @th-otto (I wasn't sure whom to ask ;-)) for an executable: Thorsten had a newer one (built in October) while Vincent had one from August.

Vincent's worked fine but Thorsten's, whoa, that was yet another level! Without installing anything, just having TosWin2 + gdb executable I immediately got colors. I've tried so far:

  • stepping over instructions
  • stepping over source code (with manual source code path setting, cool stuff!)
  • breakpoints

Everything worked all right, except:

  • the stairs effect on quitting gdb (when having a program running)

So I'd say, pretty good, pretty good. If you have other things worth mentioning, feel free to do so (for more complex bugs, or if there are too many of them, we can create separate issues).

@vinriviere
Member

Haha, I initially disabled the colors to avoid trouble. With that old gdb binary, colors can be enabled manually with set style enabled on. That is no longer required with current sources, as I reverted the color settings to the default (colors on).

And indeed, stepping through sources is really pleasant. You can also try disas. Be sure to test the TUI mode with gdb --tui my.tos, or enable it from inside a running gdb session with Ctrl+X followed by A.

NB: I didn't encounter any stairs effect. When quitting a program, I see a line Inferior 1 .. will be killed. As I understand, that's expected indentation, not a stairs bug.

@mikrosk
Member Author

mikrosk commented Oct 19, 2023

Ha! You are correct. Your executable doesn't seem to exhibit the stairs effect:
[Screenshot: vincent-color]
While Thorsten's does:
[Screenshot: thorsten]

@th-otto

th-otto commented Oct 20, 2023

Interesting. I can only assume that this is due to different ncurses/readline libraries being used. The only patch I remember there is https://github.com/th-otto/rpmint/blob/master/patches/readline/readline-mint.patch, but this should be active in the version I used (and is also applied to the readline library of binutils https://github.com/th-otto/binutils/blob/8b0530c0176e461d94644ddbd26a1148c51d7ba2/readline/readline/terminal.c#L703-L706)

Which version did you use @vinriviere ?

@vinriviere
Member

vinriviere commented Oct 20, 2023

I built gdb from this branch:
https://github.com/freemint/m68k-atari-mint-binutils-gdb/commits/gdb-mintelf-vri

That's work in progress. I should have forked the repository to my private space, instead of creating a branch there. Anyway, when it's mature, I will squash/rework the commits and push them to the official mintelf branch.

I remember there was a stairs effect long ago. I fixed it this way: e98a051
I believe this is the right way, because MiNT doesn't support different CRNL/NLCR translation for input and output.

BTW, I use an unpatched ncurses 6.4, and IIRC the readline shipped with binutils/gdb.
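
For readers unfamiliar with the input/output distinction mentioned above: in POSIX termios, newline translation is configured separately for input (c_iflag: ICRNL, INLCR) and for output (c_oflag: OPOST, ONLCR). A stairs effect typically shows up when output newlines are no longer expanded to CR+LF. The following is only a minimal sketch of that separation on a POSIX system. It is not the content of commit e98a051, which is not reproduced in this thread, and the helper name make_input_raw_keep_output is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <termios.h>

/* Hypothetical helper (not taken from readline or gdb): switch to
 * character-at-a-time input the way a line editor might, while leaving
 * output newline translation switched on. */
static void make_input_raw_keep_output(struct termios *t)
{
    assert(t != NULL);

    /* Input side: do not translate CR <-> NL on incoming characters. */
    t->c_iflag &= ~(tcflag_t)(ICRNL | INLCR);

    /* Local modes: no canonical line buffering, no echo. */
    t->c_lflag &= ~(tcflag_t)(ICANON | ECHO);
    t->c_cc[VMIN] = 1;   /* read() returns after one byte */
    t->c_cc[VTIME] = 0;  /* no inter-byte timeout */

    /* Output side: keep NL -> CR NL, so every line starts at column 0. */
    t->c_oflag |= (tcflag_t)(OPOST | ONLCR);
}
```

On a system where the input flags also influence output, as described above for MiNT, clearing the input translation bits alone would be enough to produce the staircase, which is presumably why a platform-specific workaround was needed.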

@th-otto

th-otto commented Oct 20, 2023

> I build gdb from that branch :

That probably means it was using the bundled version of readline.

> I fixed it that way: e98a051

Uh, but that's definitely a bug in MiNT. Those settings are only meant for input, and should not affect output.

But indeed, after applying that patch, the stairs effects are gone in my version. I've rebuilt the readline library, and also applied the official patches from https://ftp.gnu.org/pub/gnu/readline/readline-7.0-patches/

@th-otto

th-otto commented May 25, 2024

There is one issue which I just found when trying to debug GEM programs: when the application calls wind_update(BEG_UPDATE), the console window hangs as soon as toswin calls wind_update() to update the screen. Any idea how that can be solved?

Edit:
One solution could be to simply ignore wind_update() calls of traced programs. That can result in those programs writing to foreign windows or the desktop, but would at least prevent such lockups. What do you think?

--- a/xaaes/src.km/xa_wind.c
+++ b/xaaes/src.km/xa_wind.c
@@ -1895,6 +1895,12 @@ XA_wind_update(int lock, struct xa_client *client, AESPB *pb)
 	else
 		p = get_curproc();
 
+	if (p->ptracer)
+	{
+		DIAG((D_sema, NULL, "'%s:wind_update(0x%x) ignored for traced program", p->name, op));
+		return XAC_DONE;
+	}
+
 	switch (op & 0xff)
 	{
 

@mikrosk
Member Author

mikrosk commented May 26, 2024

A similar (and maybe more compatible?) approach could be to force the NO_BLOCK flag for traced programs?

@th-otto

th-otto commented May 27, 2024

No, that won't help. That flag would only prevent the application from being blocked. But it is the terminal window that is blocked later when you reenter gdb.
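
The lockup described above can be modelled with a small simulation. This is a sketch only, not XaAES code: the types, the SIM_ constants and sim_wind_update are hypothetical, and a real AES would put a blocked caller to sleep instead of returning a status. It shows why the holder of the lock (the stopped debuggee) is not the party that hangs, and how ignoring calls from traced clients avoids the hang.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative operation codes for this sketch only. */
enum { SIM_END_UPDATE = 0, SIM_BEG_UPDATE = 1 };

enum sim_result { SIM_GRANTED, SIM_WOULD_BLOCK, SIM_IGNORED };

struct sim_client { const char *name; bool traced; };

/* One global screen-update lock, as wind_update() provides. */
struct sim_lock { const struct sim_client *owner; int depth; };

/* Hypothetical model of wind_update(). With ignore_traced set, calls
 * from traced clients are dropped, mirroring the idea of the patch. */
static enum sim_result sim_wind_update(struct sim_lock *l,
                                       const struct sim_client *c,
                                       int op, bool ignore_traced)
{
    assert(l != NULL && c != NULL);

    if (ignore_traced && c->traced)
        return SIM_IGNORED;

    if (op == SIM_BEG_UPDATE) {
        if (l->owner != NULL && l->owner != c)
            return SIM_WOULD_BLOCK; /* a real AES would sleep here */
        l->owner = c;
        l->depth++;
        return SIM_GRANTED;
    }

    /* SIM_END_UPDATE: release one nesting level if we hold the lock. */
    if (l->owner == c && l->depth > 0 && --l->depth == 0)
        l->owner = NULL;
    return SIM_GRANTED;
}
```

In this model, a traced debuggee that took the lock and was then stopped by gdb leaves the terminal with SIM_WOULD_BLOCK on its next screen update. Making the debuggee's own request non-blocking would not change that, because the debuggee obtains the lock without waiting in the first place.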

@th-otto

th-otto commented May 27, 2024

Another issue: if some library function crashes, gdb does not seem to be able to display a correct backtrace:

[Screenshot: Screenshot_20240527_075304]

On x86, that seems to work more reliably, even if functions were compiled with -fomit-frame-pointer. I haven't checked yet how gdb tries to get those backtraces.

One solution could be to link against libc_g.a instead of libc.a. But I think that cannot be done automatically (the -g flag is almost always passed to the linker, and linking against libc_g.a in that case would be sub-optimal). So that requires some hacking of Makefiles.

@mikrosk
Member Author

mikrosk commented May 27, 2024

> No, that won't help. That flag would only prevent the application from being blocked. But it is the terminal window that is blocked later when you reenter gdb.

Oh, right. In that case I don't have a better proposal than what you proposed.

> the -g flag is almost always passed to the linker, and linking against libc_g.a in that case would be sub-optimal

Yes, there was even an explicit commit in our gcc to revert this behaviour, as this was exactly the case back then.

@th-otto

th-otto commented May 29, 2024

Fun info: I'm currently trying to debug a crash in the gcc-14 cc1 native compiler. 512MB of fast RAM was not enough, so I had to increase that to 1GB for aranym. It took more than one hour to load the ~287MB of debug info...
And now I have the problem I mentioned above: the crash happens in free(), and I don't get a usable backtrace :(

@mikrosk
Member Author

mikrosk commented May 29, 2024

I feel you. (Un)fortunately, the memory hunger isn't limited to Atari: just recently I wanted to use the trick from your patch (make -j$(getconf _NPROCESSORS_ONLN)) so in my case it would build on 16 cores and ... I ran out of 32 GB RAM (!) because I have "only" 8 GB swap enabled.

Insane.

@th-otto

th-otto commented May 29, 2024

That is actually dangerous. A few days ago I accidentally ran some build process with just -j, thus no limit on jobs. That also got me to the point where the system started to swap (16GB RAM & 32GB swap), until it became totally unusable and took about 1 minute to recognize mouse clicks, so that I could switch to a different terminal and kill it. All in all, it took almost 1 hour to get back to normal operation. I already thought about pressing the reset button...

@mikrosk
Member Author

mikrosk commented May 29, 2024

You are more patient than I am, so yes, I did press the reset button. ;-)

@mikrosk
Member Author

mikrosk commented Jul 6, 2024

@th-otto as for your #8 (comment), I have witnessed it, too. And it's a shame because I wanted to debug exactly such a crash, so gdb should be helping me there and not vice versa.

> One solution could be to link against libc_g.a instead of libc.a

Did you verify that this indeed fixes the issue?

@th-otto

th-otto commented Jul 6, 2024

Yes, at least when you also compile your application with -g and -fno-omit-frame-pointer
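
One common reason why frame pointers matter for backtraces can be shown with a toy model. The thread does not establish how gdb actually unwinds on m68k, so this is only a plausible illustration under stated assumptions: the stack is a word-indexed array rather than real memory, each frame stores the caller's frame pointer at stack[fp] and the return address at stack[fp + 1], and walk_fp_chain is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Follow a chain of frame pointers and collect the return addresses.
 * Layout assumed by this sketch: stack[fp] is the caller's frame
 * pointer, stack[fp + 1] is the return address, and fp == 0 ends the
 * chain. Returns the number of addresses written to out. */
static size_t walk_fp_chain(const uint32_t *stack, size_t stack_len,
                            uint32_t fp, uint32_t *out, size_t max_out)
{
    size_t n = 0;

    assert(stack != NULL && out != NULL);

    while (fp != 0 && (size_t)fp + 1 < stack_len && n < max_out) {
        uint32_t next = stack[fp];

        out[n++] = stack[fp + 1];

        /* Callers live at higher indices; anything else is corrupt. */
        if (next != 0 && next <= fp)
            break;
        fp = next;
    }
    return n;
}
```

If a library function such as free() sets up no frame of its own, the frame pointer register still refers to its caller's frame at the time of the crash. A walk that starts there yields the return address into the caller's caller first, so the function that actually called free() is missing from the result.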

th-otto pushed a commit that referenced this issue Aug 8, 2024
When running test-case gdb.server/connect-with-no-symbol-file.exp on
aarch64-linux (specifically, an opensuse leap 15.5 container on a
fedora asahi 39 system), I run into:
...
(gdb) detach^M
Detaching from program: target:connect-with-no-symbol-file, process 185104^M
Ending remote debugging.^M
terminate called after throwing an instance of 'gdb_exception_error'^M
...

The detailed backtrace of the corefile is:
...
 (gdb) bt
 #0  0x0000ffff75504f54 in raise () from /lib64/libpthread.so.0
 #1  0x00000000007a86b4 in handle_fatal_signal (sig=6)
     at gdb/event-top.c:926
 #2  <signal handler called>
 #3  0x0000ffff74b977b4 in raise () from /lib64/libc.so.6
 #4  0x0000ffff74b98c18 in abort () from /lib64/libc.so.6
 #5  0x0000ffff74ea26f4 in __gnu_cxx::__verbose_terminate_handler() ()
    from /usr/lib64/libstdc++.so.6
 #6  0x0000ffff74ea011c in ?? () from /usr/lib64/libstdc++.so.6
 #7  0x0000ffff74ea0180 in std::terminate() () from /usr/lib64/libstdc++.so.6
 #8  0x0000ffff74ea0464 in __cxa_throw () from /usr/lib64/libstdc++.so.6
 #9  0x0000000001548870 in throw_it (reason=RETURN_ERROR,
     error=TARGET_CLOSE_ERROR, fmt=0x16c7810 "Remote connection closed", ap=...)
     at gdbsupport/common-exceptions.cc:203
 #10 0x0000000001548920 in throw_verror (error=TARGET_CLOSE_ERROR,
     fmt=0x16c7810 "Remote connection closed", ap=...)
     at gdbsupport/common-exceptions.cc:211
 #11 0x0000000001548a00 in throw_error (error=TARGET_CLOSE_ERROR,
     fmt=0x16c7810 "Remote connection closed")
     at gdbsupport/common-exceptions.cc:226
 #12 0x0000000000ac8f2c in remote_target::readchar (this=0x233d3d90, timeout=2)
     at gdb/remote.c:9856
 #13 0x0000000000ac9f04 in remote_target::getpkt (this=0x233d3d90,
     buf=0x233d40a8, forever=false, is_notif=0x0) at gdb/remote.c:10326
 #14 0x0000000000acf3d0 in remote_target::remote_hostio_send_command
     (this=0x233d3d90, command_bytes=13, which_packet=17,
     remote_errno=0xfffff1a3cf38, attachment=0xfffff1a3ce88,
     attachment_len=0xfffff1a3ce90) at gdb/remote.c:12567
 #15 0x0000000000ad03bc in remote_target::fileio_fstat (this=0x233d3d90, fd=3,
     st=0xfffff1a3d020, remote_errno=0xfffff1a3cf38)
     at gdb/remote.c:12979
 #16 0x0000000000c39878 in target_fileio_fstat (fd=0, sb=0xfffff1a3d020,
     target_errno=0xfffff1a3cf38) at gdb/target.c:3315
 #17 0x00000000007eee5c in target_fileio_stream::stat (this=0x233d4400,
     abfd=0x2323fc40, sb=0xfffff1a3d020) at gdb/gdb_bfd.c:467
 #18 0x00000000007f012c in <lambda(bfd*, void*, stat*)>::operator()(bfd *,
     void *, stat *) const (__closure=0x0, abfd=0x2323fc40, stream=0x233d4400,
     sb=0xfffff1a3d020) at gdb/gdb_bfd.c:955
 #19 0x00000000007f015c in <lambda(bfd*, void*, stat*)>::_FUN(bfd *, void *,
     stat *) () at gdb/gdb_bfd.c:956
 #20 0x0000000000f9b838 in opncls_bstat (abfd=0x2323fc40, sb=0xfffff1a3d020)
     at bfd/opncls.c:665
 #21 0x0000000000f90adc in bfd_stat (abfd=0x2323fc40, statbuf=0xfffff1a3d020)
     at bfd/bfdio.c:431
 #22 0x000000000065fe20 in reopen_exec_file () at gdb/corefile.c:52
 #23 0x0000000000c3a3e8 in generic_mourn_inferior ()
     at gdb/target.c:3642
 #24 0x0000000000abf3f0 in remote_unpush_target (target=0x233d3d90)
     at gdb/remote.c:6067
 #25 0x0000000000aca8b0 in remote_target::mourn_inferior (this=0x233d3d90)
     at gdb/remote.c:10587
 #26 0x0000000000c387cc in target_mourn_inferior (
     ptid=<error reading variable: Cannot access memory at address 0x2d310>)
     at gdb/target.c:2738
 #27 0x0000000000abfff0 in remote_target::remote_detach_1 (this=0x233d3d90,
     inf=0x22fce540, from_tty=1) at gdb/remote.c:6421
 #28 0x0000000000ac0094 in remote_target::detach (this=0x233d3d90,
     inf=0x22fce540, from_tty=1) at gdb/remote.c:6436
 #29 0x0000000000c37c3c in target_detach (inf=0x22fce540, from_tty=1)
     at gdb/target.c:2526
 #30 0x0000000000860424 in detach_command (args=0x0, from_tty=1)
    at gdb/infcmd.c:2817
 #31 0x000000000060b594 in do_simple_func (args=0x0, from_tty=1, c=0x231431a0)
     at gdb/cli/cli-decode.c:94
 #32 0x00000000006108c8 in cmd_func (cmd=0x231431a0, args=0x0, from_tty=1)
     at gdb/cli/cli-decode.c:2741
 #33 0x0000000000c65a94 in execute_command (p=0x232e52f6 "", from_tty=1)
     at gdb/top.c:570
 #34 0x00000000007a7d2c in command_handler (command=0x232e52f0 "")
     at gdb/event-top.c:566
 #35 0x00000000007a8290 in command_line_handler (rl=...)
     at gdb/event-top.c:802
 #36 0x0000000000c9092c in tui_command_line_handler (rl=...)
     at gdb/tui/tui-interp.c:103
 #37 0x00000000007a750c in gdb_rl_callback_handler (rl=0x23385330 "detach")
     at gdb/event-top.c:258
 #38 0x0000000000d910f4 in rl_callback_read_char ()
     at readline/readline/callback.c:290
 #39 0x00000000007a7338 in gdb_rl_callback_read_char_wrapper_noexcept ()
     at gdb/event-top.c:194
 #40 0x00000000007a73f0 in gdb_rl_callback_read_char_wrapper
     (client_data=0x22fbf640) at gdb/event-top.c:233
 #41 0x0000000000cbee1c in stdin_event_handler (error=0, client_data=0x22fbf640)
     at gdb/ui.c:154
 #42 0x000000000154ed60 in handle_file_event (file_ptr=0x232be730, ready_mask=1)
     at gdbsupport/event-loop.cc:572
 #43 0x000000000154f21c in gdb_wait_for_event (block=1)
     at gdbsupport/event-loop.cc:693
 #44 0x000000000154dec4 in gdb_do_one_event (mstimeout=-1)
    at gdbsupport/event-loop.cc:263
 #45 0x0000000000910f98 in start_event_loop () at gdb/main.c:400
 #46 0x0000000000911130 in captured_command_loop () at gdb/main.c:464
 #47 0x0000000000912b5c in captured_main (data=0xfffff1a3db58)
     at gdb/main.c:1338
 #48 0x0000000000912bf4 in gdb_main (args=0xfffff1a3db58)
     at gdb/main.c:1357
 #49 0x00000000004170f4 in main (argc=10, argv=0xfffff1a3dcc8)
     at gdb/gdb.c:38
 (gdb)
...

The abort happens because a c++ exception escapes to c code, specifically
opncls_bstat in bfd/opncls.c.  Compiling with -fexceptions works around this.

Fix this by catching the exception just before it escapes, in stat_trampoline
and likewise in a few similar spots.

Add a new template catch_exceptions to do so in a consistent way.

Tested on aarch64-linux.

Approved-by: Pedro Alves <[email protected]>

PR remote/31577
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31577
th-otto pushed a commit that referenced this issue Aug 8, 2024
Since commit b1da98a ("gdb: remove use of alloca in
new_macro_definition"), if cached_argv is empty, we call macro_bcache
with a nullptr data.  This ends up caught by UBSan deep down in the
bcache code:

    $ ./gdb -nx -q --data-directory=data-directory  /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.base/macscp/macscp -readnow
    Reading symbols from /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.base/macscp/macscp...
    Expanding full symbols from /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.base/macscp/macscp...
    /home/smarchi/src/binutils-gdb/gdb/bcache.c:195:12: runtime error: null pointer passed as argument 2, which is declared to never be null

The backtrace:

    #1  0x00007ffff619a05d in __ubsan::__ubsan_handle_nonnull_arg_abort (Data=<optimized out>) at ../../../../src/libsanitizer/ubsan/ubsan_handlers.cpp:750
    #2  0x000055556337fba2 in gdb::bcache::insert (this=0x62d0000c8458, addr=0x0, length=0, added=0x0) at /home/smarchi/src/binutils-gdb/gdb/bcache.c:195
    #3  0x0000555564b49222 in gdb::bcache::insert<char const*, void> (this=0x62d0000c8458, addr=0x0, length=0, added=0x0) at /home/smarchi/src/binutils-gdb/gdb/bcache.h:158
    #4  0x0000555564b481fa in macro_bcache<char const*> (t=0x62100007ae70, addr=0x0, len=0) at /home/smarchi/src/binutils-gdb/gdb/macrotab.c:117
    #5  0x0000555564b42b4a in new_macro_definition (t=0x62100007ae70, kind=macro_function_like, special_kind=macro_ordinary, argv=std::__debug::vector of length 0, capacity 0, replacement=0x62a00003af3a "__builtin_va_arg_pack ()") at /home/smarchi/src/binutils-gdb/gdb/macrotab.c:573
    #6  0x0000555564b44674 in macro_define_internal (source=0x6210000ab9e0, line=469, name=0x7fffffffa710 "__va_arg_pack", kind=macro_function_like, special_kind=macro_ordinary, argv=std::__debug::vector of length 0, capacity 0, replacement=0x62a00003af3a "__builtin_va_arg_pack ()") at /home/smarchi/src/binutils-gdb/gdb/macrotab.c:777
    #7  0x0000555564b44ae2 in macro_define_function (source=0x6210000ab9e0, line=469, name=0x7fffffffa710 "__va_arg_pack", argv=std::__debug::vector of length 0, capacity 0, replacement=0x62a00003af3a "__builtin_va_arg_pack ()") at /home/smarchi/src/binutils-gdb/gdb/macrotab.c:816
    #8  0x0000555563f62fc8 in parse_macro_definition (file=0x6210000ab9e0, line=469, body=0x62a00003af2a "__va_arg_pack() __builtin_va_arg_pack ()") at /home/smarchi/src/binutils-gdb/gdb/dwarf2/macro.c:203

This can be reproduced by running gdb.base/macscp.exp.  Avoid calling
macro_bcache if the macro doesn't have any arguments.

Change-Id: I33b5a7c3b3a93d5adba98983fcaae9c8522c383d
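
The UBSan report in the commit message above concerns a null pointer passed to a function whose pointer argument is declared non-null. The line at bcache.c:195 is not shown here, so the following is only a generic sketch of the pattern, assuming a memcpy-like call. In ISO C through C23, passing a null pointer to memcpy is undefined behaviour even when the length is 0. The helper copy_bytes is hypothetical and is not gdb code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy len bytes, but never hand a null source
 * pointer to memcpy, not even for an empty range. */
static void *copy_bytes(void *dst, const void *src, size_t len)
{
    assert(dst != NULL);

    if (len == 0)
        return dst; /* nothing to copy; src may legitimately be NULL */

    assert(src != NULL);
    return memcpy(dst, src, len);
}
```

The commit itself takes the equivalent route on the caller's side, by not calling macro_bcache at all when the macro has no arguments.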