Report thread exit event for leader if reporting thread exit events
authorPedro Alves <pedro@palves.net>
Wed, 15 Jun 2022 12:19:47 +0000 (13:19 +0100)
committerPedro Alves <pedro@palves.net>
Mon, 13 Nov 2023 14:16:11 +0000 (14:16 +0000)
commite8a625d126e4be34d23b5df535bed134b2bb3156
tree4f6facd5e1358c20b4ac8de9df20eb7957dd7f75
parent7ac958f267898ba87ae67ea04f94a3fda24567dc
Report thread exit event for leader if reporting thread exit events

If GDB sets the GDB_THREAD_OPTION_EXIT option on a thread, then if the
thread disappears from the thread list, GDB expects to shortly see a
thread exit event for it.  See e.g., here, in
remote_target::update_thread_list():

    /* Do not remove the thread if we've requested to be
       notified of its exit.  For example, the thread may be
       displaced stepping, infrun will need to handle the
       exit event, and displaced stepping info is recorded
       in the thread object.  If we deleted the thread now,
       we'd lose that info.  */
    if ((tp->thread_options () & GDB_THREAD_OPTION_EXIT) != 0)
      continue;

There's one scenario that is deleting a thread from the
remote/gdbserver thread list without ever reporting a corresponding
thread exit event though -- check_zombie_leaders.  This can lead to
GDB getting confused.  For example, with a following patch that
enables GDB_THREAD_OPTION_EXIT whenever schedlock is enabled, we'd see
this regression:

 $ make check RUNTESTFLAGS="--target_board=native-extended-gdbserver" TESTS="gdb.threads/no-unwaited-for-left.exp"
 ...
 Running src/gdb/testsuite/gdb.threads/no-unwaited-for-left.exp ...
 FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits (timeout)
 ... some more cascading FAILs ...

gdb.log shows:

 (gdb) continue
 Continuing.
 FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits (timeout)

A passing run would have resulted in:

 (gdb) continue
 Continuing.
 No unwaited-for children left.
 (gdb) PASS: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits

note how the leader thread is not listed in the remote-reported XML
thread list below:

 (gdb) set debug remote 1
 (gdb) set debug infrun 1
 (gdb) info threads
   Id   Target Id                                Frame
 * 1    Thread 1163850.1163850 "no-unwaited-for" main () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.threads/no-unwaited-for-left.c:65
   3    Thread 1163850.1164130 "no-unwaited-for" [remote] Sending packet: $Hgp11c24a.11c362#39
 (gdb) c
 Continuing.
 [infrun] clear_proceed_status_thread: 1163850.1163850.0
 ...
     [infrun] resume_1: step=1, signal=GDB_SIGNAL_0, trap_expected=1, current thread [1163850.1163850.0] at 0x55555555534f
     [remote] Sending packet: $QPassSignals:#f3
     [remote] Packet received: OK
     [remote] Sending packet: $QThreadOptions;3:p11c24a.11c24a#f3
     [remote] Packet received: OK
 ...
     [infrun] target_set_thread_options: [options for Thread 1163850.1163850 are now 0x3]
 ...
   [infrun] do_target_resume: resume_ptid=1163850.1163850.0, step=0, sig=GDB_SIGNAL_0
   [remote] Sending packet: $vCont;c:p11c24a.11c24a#98
   [infrun] prepare_to_wait: prepare_to_wait
   [infrun] reset: reason=handling event
   [infrun] maybe_set_commit_resumed_all_targets: enabling commit-resumed for target extended-remote
   [infrun] maybe_call_commit_resumed_all_targets: calling commit_resumed for target extended-remote
 [infrun] fetch_inferior_event: exit
 [infrun] fetch_inferior_event: enter
   [infrun] scoped_disable_commit_resumed: reason=handling event
   [infrun] random_pending_event_thread: None found.
   [remote] wait: enter
     [remote] Packet received: N
   [remote] wait: exit
   [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) =
   [infrun] print_target_wait_results:   -1.0.0 [process -1],
   [infrun] print_target_wait_results:   status->kind = NO_RESUMED
   [infrun] handle_inferior_event: status->kind = NO_RESUMED
   [remote] Sending packet: $Hgp0.0#ad
   [remote] Packet received: OK
   [remote] Sending packet: $qXfer:threads:read::0,1000#92
   [remote] Packet received: l<threads>\n<thread id="p11c24a.11c362" core="0" name="no-unwaited-for" handle="0097d8f7ff7f0000"/>\n</threads>\n
   [infrun] handle_no_resumed: TARGET_WAITKIND_NO_RESUMED (ignoring: found resumed)
 ...

... however, infrun decided there was a resumed thread still, so
ignored the TARGET_WAITKIND_NO_RESUMED event.  Debugging GDB, we see
that the "found resumed" thread that GDB finds, is the leader thread.
Even though that thread is not on the remote-reported thread list, it
is still on the GDB thread list, due to the special case in remote.c
mentioned above.

This commit addresses the issue by fixing GDBserver to report a thread
exit event for the zombie leader too, i.e., making GDBserver respect
the "if thread has GDB_THREAD_OPTION_EXIT set, report a thread exit"
invariant.  To do that, it takes a bit more code than one would
imagine off hand, due to the fact that we currently always report LWP
exit pending events as TARGET_WAITKIND_EXITED, and then decide whether
to convert it to TARGET_WAITKIND_THREAD_EXITED just before reporting
the event to GDBserver core.  For the zombie leader scenario
described, we need to record early on that we want to report a
THREAD_EXITED event, and then make sure that decision isn't lost along
the way to reporting the event to GDBserver core.

Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I1e68fccdbc9534434dee07163d3fd19744c8403b
gdbserver/linux-low.cc
gdbserver/linux-low.h