Gary stumbled on this:
(gdb) PASS: gdb.threads/thread-specific-bp.exp: all-stop: continue to end
info threads
Id Target Id Frame
* 1 Thread 0x7ffff7fdb700 (LWP 13717) "thread-specific" end () at /home/gary/work/archer/startswith/src/gdb/testsuite/gdb.threads/thread-specific-bp.c:29
(gdb) FAIL: gdb.threads/thread-specific-bp.exp: all-stop: thread start is gone
info breakpoint
The problem is that "...archer/startswith/src..." has a "start" in it,
which matches the too-lax regex in the test.
Rather than tweaking the regex, we can just remove the whole "info
threads", like we removed similar ones in other files -- GDB nowadays
does this implicitly already, so things should work without it. Thus
removing this even improves testing here a bit.
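To illustrate the kind of false match described above, here is a minimal sketch using POSIX regcomp/regexec. The testsuite itself uses Tcl/expect patterns, and the exact regex in the test is not shown here, so both patterns in the usage note and the helper name regex_matches are assumptions made for illustration only.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if the POSIX extended regex PATTERN matches anywhere in
   TEXT, 0 if it does not, and -1 if PATTERN fails to compile.
   Hypothetical helper, not part of the GDB testsuite.  */
int
regex_matches (const char *pattern, const char *text)
{
  regex_t re;
  int rc;

  assert (pattern != NULL && text != NULL);
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

An unanchored pattern such as "start" matches the "end () at .../startswith/src/..." frame line, because the substring appears in the directory name. A pattern that also requires the surrounding frame syntax, e.g. " start \(\) at ", does not.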
gdb/testsuite/ChangeLog:
2015-03-04 Pedro Alves <palves@redhat.com>
* gdb.threads/thread-specific-bp.exp: Delete "info threads" test.
This fixes:
> gdb compile failed, /gdb/testsuite/gdb.threads/clone-thread_db.c: In function 'main':
> /gdb/testsuite/gdb.threads/clone-thread_db.c:67:3: warning: implicit declaration of function 'alarm' [-Wimplicit-function-declaration]
> alarm (300);
> ^
> /gdb/testsuite/gdb.threads/clone-thread_db.c:69:3: warning: implicit declaration of function 'pthread_create' [-Wimplicit-function-declaration]
> pthread_create (&child, NULL, thread_fn, NULL);
> ^
> /gdb/testsuite/gdb.threads/clone-thread_db.c:70:3: warning: implicit declaration of function 'pthread_join' [-Wimplicit-function-declaration]
> pthread_join (child);
> ^
And then adding the missing headers revealed the pthread_join call was
incorrect. This probably fixes the crash we see on ppc64be, e.g., at
https://sourceware.org/ml/gdb-testers/2015-q1/msg04415.html
the logs there show:
...
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x3fffb7ff54a0 (LWP 9275)]
0x00003fffb7f3ce74 in .pthread_join () from /lib64/libpthread.so.0
(gdb) FAIL: gdb.threads/clone-thread_db.exp: continue to end
...
Tested on x86_64 Fedora 20.
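For reference, a minimal sketch of the corrected call pattern, assuming a trivial thread function. This is not the test's actual source; spawn_and_join and the token variable are hypothetical names. With unistd.h and pthread.h included, the compiler sees the real prototypes and rejects a pthread_join call that omits the second argument.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

/* Trivial thread body: hand the argument back as the exit value.  */
static void *
thread_fn (void *arg)
{
  return arg;
}

/* Create one thread and join it.  Return 0 on success, -1 if the
   thread could not be created or joined.  */
int
spawn_and_join (void)
{
  static int token;
  pthread_t child;
  void *retval = NULL;
  int ok;

  alarm (300);			/* Watchdog, as in the test program.  */
  ok = (pthread_create (&child, NULL, thread_fn, &token) == 0
	&& pthread_join (child, &retval) == 0);
  alarm (0);			/* Cancel the watchdog again.  */
  if (!ok)
    return -1;

  /* pthread_join takes two arguments; the second receives the
     thread's exit value (NULL may be passed to ignore it).  */
  assert (retval == &token);
  return 0;
}
```

Depending on the C library version, linking may need -pthread.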
gdb/testsuite/
2015-03-04 Pedro Alves <palves@redhat.com>
* gdb.threads/clone-thread_db.c: Include unistd.h and pthread.h.
(main): Pass missing retval argument to pthread_join call.
This fixes invalid reads Valgrind first caught when debugging against
a GDBserver patched with a series that adds exec events to the remote
protocol. Like these, using the gdb.threads/thread-execl.exp test:
$ valgrind ./gdb -data-directory=data-directory ./testsuite/gdb.threads/thread-execl -ex "tar extended-remote :9999" -ex "b thread_execler" -ex "c" -ex "set scheduler-locking on"
...
Breakpoint 1, thread_execler (arg=0x0) at src/gdb/testsuite/gdb.threads/thread-execl.c:29
29 if (execl (image, image, NULL) == -1)
(gdb) n
Thread 32509.32509 is executing new program: build/gdb/testsuite/gdb.threads/thread-execl
[New Thread 32509.32532]
==32510== Invalid read of size 4
==32510== at 0x5AA7D8: delete_breakpoint (breakpoint.c:13989)
==32510== by 0x6285D3: delete_thread_breakpoint (thread.c:100)
==32510== by 0x628603: delete_step_resume_breakpoint (thread.c:109)
==32510== by 0x61622B: delete_thread_infrun_breakpoints (infrun.c:2928)
==32510== by 0x6162EF: for_each_just_stopped_thread (infrun.c:2958)
==32510== by 0x616311: delete_just_stopped_threads_infrun_breakpoints (infrun.c:2969)
==32510== by 0x616C96: fetch_inferior_event (infrun.c:3267)
==32510== by 0x63A2DE: inferior_event_handler (inf-loop.c:57)
==32510== by 0x4E0E56: remote_async_serial_handler (remote.c:11877)
==32510== by 0x4AF620: run_async_handler_and_reschedule (ser-base.c:137)
==32510== by 0x4AF6F0: fd_event (ser-base.c:182)
==32510== by 0x63806D: handle_file_event (event-loop.c:762)
==32510== Address 0xcf333e0 is 16 bytes inside a block of size 200 free'd
==32510== at 0x4A07577: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==32510== by 0x77CB74: xfree (common-utils.c:98)
==32510== by 0x5AA954: delete_breakpoint (breakpoint.c:14056)
==32510== by 0x5988BD: update_breakpoints_after_exec (breakpoint.c:3765)
==32510== by 0x61360F: follow_exec (infrun.c:1091)
==32510== by 0x6186FA: handle_inferior_event (infrun.c:4061)
==32510== by 0x616C55: fetch_inferior_event (infrun.c:3261)
==32510== by 0x63A2DE: inferior_event_handler (inf-loop.c:57)
==32510== by 0x4E0E56: remote_async_serial_handler (remote.c:11877)
==32510== by 0x4AF620: run_async_handler_and_reschedule (ser-base.c:137)
==32510== by 0x4AF6F0: fd_event (ser-base.c:182)
==32510== by 0x63806D: handle_file_event (event-loop.c:762)
==32510==
[Switching to Thread 32509.32532]
Breakpoint 1, thread_execler (arg=0x0) at src/gdb/testsuite/gdb.threads/thread-execl.c:29
29 if (execl (image, image, NULL) == -1)
(gdb)
The breakpoint in question is the step-resume breakpoint of the
non-main thread, the one that was "next"ed.
The exact same issue can be seen on mainline with native debugging, by
running the thread-execl.exp test in non-stop mode, because the kernel
doesn't report a thread exit event for the execing thread.
Tested on x86_64 Fedora 20.
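The idea of the fix can be sketched roughly as follows. This is a simplified model, not GDB's follow_exec; struct thread_entry and prune_threads_after_exec are hypothetical names, and real thread deletion also has to release per-thread resources such as the step-resume breakpoint.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, minimal stand-in for a debugger thread list entry.  */
struct thread_entry
{
  int pid;
  int lwp;
};

/* After PID execs, keep only the event thread of that process (plus
   all threads of other processes).  Compacts THREADS in place and
   returns the new element count.  */
size_t
prune_threads_after_exec (struct thread_entry *threads, size_t count,
			  int pid, int event_lwp)
{
  size_t kept = 0;
  size_t i;

  assert (threads != NULL || count == 0);
  for (i = 0; i < count; i++)
    {
      int same_process = threads[i].pid == pid;
      int is_event_thread = same_process && threads[i].lwp == event_lwp;

      if (!same_process || is_event_thread)
	threads[kept++] = threads[i];
    }
  return kept;
}
```

Once the stale threads are gone from the list, nothing later tries to delete breakpoints that belonged to them.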
gdb/ChangeLog:
2015-03-02 Pedro Alves <palves@redhat.com>
* infrun.c (follow_exec): Delete all threads of the process except
the event thread. Extended comments.
gdb/testsuite/ChangeLog:
2015-03-02 Pedro Alves <palves@redhat.com>
* gdb.threads/thread-execl.exp (do_test): Handle non-stop.
(top level): Call do_test with non-stop as well.
The buildbot shows that the new
gdb.threads/multi-create-ns-info-thr.exp test is timing out when
tested with --target_board=native-extended-remote.  The reason is:
No breakpoints or watchpoints.
(gdb) break main
Breakpoint 1 at 0x10000b00: file ../../../binutils-gdb/gdb/testsuite/gdb.threads/multi-create.c, line 72.
(gdb) run
Starting program: /home/gdb-buildbot/fedora-21-ppc64be-1/fedora-ppc64be-native-extended-gdbserver/build/gdb/testsuite/outputs/gdb.threads/multi-create-ns-info-thr/multi-create-ns-info-thr
Process /home/gdb-buildbot/fedora-21-ppc64be-1/fedora-ppc64be-native-extended-gdbserver/build/gdb/testsuite/outputs/gdb.threads/multi-create-ns-info-thr/multi-create-ns-info-thr created; pid = 16266
Unexpected vCont reply in non-stop mode: T0501:00003fffffffd190;40:00000080560fe290;thread:p3f8a.3f8a;core:0;
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(gdb) break multi-create.c:45
Breakpoint 2 at 0x10000994: file ../../../binutils-gdb/gdb/testsuite/gdb.threads/multi-create.c, line 45.
(gdb) commands
Type commands for breakpoint(s) 2, one per line.
Non-stop tests don't really work with the
--target_board=native-extended-remote board, because tests toggle
non-stop on after GDB is already connected to gdbserver, while
currently, non-stop must be enabled before connecting.
This adjusts the test to bail if running to main fails, like all other
non-stop tests.
Note non-stop tests do work with --target_board=native-gdbserver.
gdb/testsuite/ChangeLog:
2015-02-21 Pedro Alves <palves@redhat.com>
* gdb.threads/multi-create-ns-info-thr.exp: Return early if
runto_main fails.
TL;DR - GDB can hang if something refreshes the thread list out of the
target while the target is running. GDB hangs inside td_ta_thr_iter.
The fix is to not use that libthread_db function anymore.
Long version:
Running the testsuite against my all-stop-on-top-of-non-stop series is
still exposing latent non-stop bugs.
I was originally seeing this with the multi-create.exp test, back when
we were still using libthread_db thread event breakpoints. The
all-stop-on-top-of-non-stop series forces a thread list refresh each
time GDB needs to start stepping over a breakpoint (to pause all
threads). That test hits the thread event breakpoint often, resulting
in a bunch of step-over operations, thus a bunch of thread list
refreshes while some threads in the target are running.
The commit adds a real non-stop mode test that triggers the issue,
based on multi-create.exp, that does an explicit "info threads" when a
breakpoint is hit. IOW, it does the same things the as-ns series was
doing when testing multi-create.exp.
The bug is a race, so it unfortunately takes several runs for the test
to trigger it. In fact, even when setting the test running in a loop,
it sometimes takes several minutes for it to trigger for me.
The race is related to libthread_db's td_ta_thr_iter. This is
libthread_db's entry point for walking the thread list of the
inferior.
Sometimes, when GDB refreshes the thread list from the target,
libthread_db's td_ta_thr_iter can somehow see glibc's thread list as a
cycle, and get stuck in an infinite loop.
The issue is that when a thread exits, its thread control structure in
glibc is moved from a "used" list to a "cache" list. These lists are
simply circular linked lists where the "next/prev" pointers are
embedded in the thread control structure itself. The "next" pointer
of the last element of the list points back to the list's sentinel
"head". There's only one set of "next/prev" pointers for both lists;
thus a thread can only be in one of the lists at a time, not in both
simultaneously.
So when thread C exits, simplifying, the following happens. A-C are
threads. stack_used and stack_cache are the list's heads.
Before:
stack_used -> A -> B -> C -> (&stack_used)
stack_cache -> (&stack_cache)
After:
stack_used -> A -> B -> (&stack_used)
stack_cache -> C -> (&stack_cache)
td_ta_thr_iter starts by iterating at the list's head's next, and
iterates until it sees a thread whose next pointer points to the
list's head again. Thus in the before case above, C's next points to
stack_used, indicating end of list. In the same case, the stack_cache
list is empty.
For each thread being iterated, td_ta_thr_iter reads the whole thread
object out of the inferior. This includes the thread's "next"
pointer.
In the scenario above, it may happen that td_ta_thr_iter is iterating
thread B and has already read B's thread structure just before thread
C exits and its control structure moves to the cached list.
Now, recall that td_ta_thr_iter is running in the context of GDB, and
there's no locking between GDB and the inferior.  From its local copy
of B, td_ta_thr_iter believes that the next thread after B is thread
C, so it happily continues iterating to C, a thread that has already
exited, and is now in the stack cache list.
After iterating C, td_ta_thr_iter finds the stack_cache head, which
because it is not stack_used, td_ta_thr_iter assumes it's just another
thread. After this, unless the reverse race triggers, GDB gets stuck
in td_ta_thr_iter forever walking the stack_cache list, as no thread
in that list has a next pointer that points back to stack_used (the
terminating condition).
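The race can be modelled in a few lines of C.  This is only a sketch under simplifying assumptions: the list helpers and struct list_node are hypothetical stand-ins for glibc's embedded list linkage, the "read out of the inferior" is a plain struct copy, and the walk is bounded by a step limit so that the non-terminating case can be observed instead of hanging.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the next/prev pointers embedded in the
   thread control structure.  */
struct list_node
{
  struct list_node *next;
  struct list_node *prev;
};

static void
list_init (struct list_node *head)
{
  head->next = head->prev = head;
}

static void
list_add_tail (struct list_node *head, struct list_node *node)
{
  node->prev = head->prev;
  node->next = head;
  head->prev->next = node;
  head->prev = node;
}

static void
list_del (struct list_node *node)
{
  node->prev->next = node->next;
  node->next->prev = node->prev;
}

/* Follow "next" pointers from START until HEAD is reached.  Return
   the number of nodes visited, or -1 if HEAD is not reached within
   LIMIT steps.  */
static int
walk_until_head (const struct list_node *start,
		 const struct list_node *head, int limit)
{
  int visited = 0;

  while (start != head)
    {
      if (visited == limit)
	return -1;
      visited++;
      start = start->next;
    }
  return visited;
}

/* Build stack_used -> A -> B -> C, snapshot B, let C "exit", then
   continue the walk after B, either from the stale snapshot or from
   B's current contents.  */
int
walk_after_b (int use_stale_copy, int limit)
{
  struct list_node stack_used, stack_cache, a, b, c, b_copy;

  assert (limit > 0);
  list_init (&stack_used);
  list_init (&stack_cache);
  list_add_tail (&stack_used, &a);
  list_add_tail (&stack_used, &b);
  list_add_tail (&stack_used, &c);

  /* The debugger reads B, "next" pointer included.  */
  b_copy = b;

  /* Thread C exits: its node moves from stack_used to stack_cache.  */
  list_del (&c);
  list_add_tail (&stack_cache, &c);

  return walk_until_head (use_stale_copy ? b_copy.next : b.next,
			  &stack_used, limit);
}
```

With the stale copy, the walk enters the stack_cache list and never sees &stack_used again, so walk_after_b (1, limit) gives up with -1 for any limit.  Re-reading B after the move, walk_after_b (0, limit) terminates immediately.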
Before fully understanding the issue, I tried adding cycle detection
to GDB's td_ta_thr_iter callback. However, td_ta_thr_iter skips
calling the callback in some cases, which means that it's possible
that the callback isn't called at all, making it impossible for GDB to
break the loop. I did manage to get GDB stuck in that state more than
once.
Fortunately, we can avoid the issue altogether. We don't really need
td_ta_thr_iter for live debugging nowadays, given PTRACE_EVENT_CLONE.
We already know how to map an LWP id to a thread id without iterating
(thread_from_lwp), so use that more.
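As a rough illustration of enumerating LWPs from the kernel's view rather than from glibc's lists, the sketch below checks for /proc/PID/task and counts its entries.  The helper names task_dir_exists and count_lwps are hypothetical; they are not the functions added by this commit, and real code would go on to map each LWP id to a thread.

```c
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Return 1 if /proc/PID/task exists and is a directory, else 0.  */
int
task_dir_exists (pid_t pid)
{
  char path[64];
  struct stat st;

  assert (pid > 0);
  snprintf (path, sizeof path, "/proc/%ld/task", (long) pid);
  return stat (path, &st) == 0 && S_ISDIR (st.st_mode);
}

/* Count the LWPs listed under /proc/PID/task.  Return -1 if the
   directory cannot be opened.  */
int
count_lwps (pid_t pid)
{
  char path[64];
  DIR *dir;
  struct dirent *ent;
  int count = 0;

  assert (pid > 0);
  snprintf (path, sizeof path, "/proc/%ld/task", (long) pid);
  dir = opendir (path);
  if (dir == NULL)
    return -1;

  /* Each LWP shows up as a directory whose name is its numeric id;
     "." and ".." are skipped by the digit check.  */
  while ((ent = readdir (dir)) != NULL)
    if (isdigit ((unsigned char) ent->d_name[0]))
      count++;

  closedir (dir);
  return count;
}
```

For example, count_lwps (getpid ()) reports at least one LWP for the calling process when /proc is mounted.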
gdb/ChangeLog:
2015-02-20 Pedro Alves <palves@redhat.com>
* linux-nat.c (linux_handle_extended_wait): Call
thread_db_notice_clone whenever a new clone LWP is detected.
(linux_stop_and_wait_all_lwps, linux_unstop_all_lwps): New
functions.
* linux-nat.h (thread_db_attach_lwp): Delete declaration.
(thread_db_notice_clone, linux_stop_and_wait_all_lwps)
(linux_unstop_all_lwps): Declare.
* linux-thread-db.c (struct thread_get_info_inout): Delete.
(thread_get_info_callback): Delete.
(thread_from_lwp): Use td_thr_get_info and record_thread.
(thread_db_attach_lwp): Delete.
(thread_db_notice_clone): New function.
(try_thread_db_load_1): If /proc is mounted and shows the
	process's task list, walk over all LWPs and call thread_from_lwp
instead of relying on td_ta_thr_iter.
(attach_thread): Don't call check_thread_signals here. Split the
tail part of the function (which adds the thread to the core GDB
thread list) to ...
(record_thread): ... this function. Call check_thread_signals
here.
(thread_db_wait): Don't call thread_db_find_new_threads_1. Always
call thread_from_lwp.
(thread_db_update_thread_list): Rename to ...
(thread_db_update_thread_list_org): ... this.
(thread_db_update_thread_list): New function.
(thread_db_find_thread_from_tid): Delete.
(thread_db_get_ada_task_ptid): Simplify.
* nat/linux-procfs.c: Include <sys/stat.h>.
(linux_proc_task_list_dir_exists): New function.
* nat/linux-procfs.h (linux_proc_task_list_dir_exists): Declare.
gdb/gdbserver/ChangeLog:
2015-02-20 Pedro Alves <palves@redhat.com>
* thread-db.c: Include "nat/linux-procfs.h".
(thread_db_init): Skip listing new threads if the kernel supports
PTRACE_EVENT_CLONE and /proc/PID/task/ is accessible.
gdb/testsuite/ChangeLog:
2015-02-20 Pedro Alves <palves@redhat.com>
* gdb.threads/multi-create-ns-info-thr.exp: New file.
On GNU/Linux, if a pthreaded program has a thread call clone(CLONE_VM)
directly, and then that clone LWP hits a debug event (breakpoint,
etc.), GDB hits an internal error.  Threaded programs shouldn't really
be calling clone directly, but GDB shouldn't crash either.
The crash looks like this:
(gdb) break clone_fn
Breakpoint 2 at 0x4007d8: file clone-thread_db.c, line 35.
(gdb) r
...
[Thread debugging using libthread_db enabled]
...
src/gdb/linux-nat.c:1030: internal-error: lin_lwp_attach_lwp: Assertion `lwpid > 0' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
The problem is that 'clone' ends up clearing the parent thread's tid
field in glibc's thread data structure. For x86_64, the glibc code in
question is here:
sysdeps/unix/sysv/linux/x86_64/clone.S:
...
testq $CLONE_THREAD, %rdi
jne 1f
testq $CLONE_VM, %rdi
movl $-1, %eax <----
jne 2f
movl $SYS_ify(getpid), %eax
syscall
2: movl %eax, %fs:PID
movl %eax, %fs:TID <----
1:
When GDB refreshes the thread list out of libthread_db, it finds a
thread whose LWP id is -1 (the clone's parent), which naturally
isn't yet on the thread list.  GDB then tries to attach to that bogus
LWP id, which is caught by that assertion.
The fix is to detect the bad PID early.
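The assembly excerpt above can be paraphrased in C as follows.  This is a sketch for illustration only: the SKETCH_CLONE_* macros mirror the Linux flag values but are defined locally, and tid_after_clone and should_skip_thread are hypothetical names, not glibc or GDB functions.

```c
#include <assert.h>

/* Local copies of the Linux clone flag values, defined here so the
   sketch does not depend on <sched.h> feature macros.  */
#define SKETCH_CLONE_VM     0x00000100UL
#define SKETCH_CLONE_THREAD 0x00010000UL

/* Value left in the TID field after the excerpted code runs.  OLD_TID
   is the field's prior value; REAL_PID is what getpid would return.  */
long
tid_after_clone (unsigned long flags, long old_tid, long real_pid)
{
  assert (real_pid > 0);
  if (flags & SKETCH_CLONE_THREAD)
    return old_tid;		/* jne 1f: the fields are left alone.  */
  if (flags & SKETCH_CLONE_VM)
    return -1;			/* movl $-1, %eax; jne 2f.  */
  return real_pid;		/* Result of the getpid syscall.  */
}

/* The early check: an LWP id of -1 is not a real LWP, so there is
   nothing to attach to.  */
int
should_skip_thread (long lwp_id)
{
  return lwp_id == -1;
}
```

So a bare clone(CLONE_VM) without CLONE_THREAD yields -1, which is the value the thread list refresh must filter out before trying to attach.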
Tested on x86-64 Fedora 20. GDBserver doesn't need any fix.
gdb/ChangeLog:
2015-02-20 Pedro Alves <palves@redhat.com>
PR threads/18006
* linux-thread-db.c (thread_get_info_callback): Return early if
the thread's lwp id is -1.
gdb/testsuite/ChangeLog:
2015-02-20 Pedro Alves <palves@redhat.com>
PR threads/18006
* gdb.threads/clone-thread_db.c: New file.
* gdb.threads/clone-thread_db.exp: New file.
On decr_pc_after_break targets, GDB adjusts the PC incorrectly if a
background single-step stops somewhere where PC-$decr_pc has a
breakpoint, and the thread that finishes the step is not the current
thread, like:
ADDR1 nop <-- breakpoint here
ADDR2 jmp PC
IOW, say thread A is stepping ADDR2's line in the background (an
infinite loop), and the user switches focus to thread B. GDB's
adjust_pc_after_break logic confuses the single-step stop of thread A
for a hit of the breakpoint at ADDR1, and thus adjusts thread A's PC
to point at ADDR1 when it should not, and reports a breakpoint hit,
when thread A did not execute the instruction at ADDR1 at all.
The test added by this patch exercises exactly that.
I can't find any reason we'd need the "thread to be examined is still
the current thread" condition in adjust_pc_after_break, at least
nowadays; it might have made sense in the past. Best just remove it,
and rely on currently_stepping().
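A much-simplified model of the decision, assuming the only inputs are the stop PC, the target's decr_pc_after_break value, one breakpoint address, and whether the event thread was single-stepping.  The real adjust_pc_after_break considers more conditions than this; pc_to_report is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return the PC to report for a SIGTRAP stop at STOP_PC.  The PC is
   backed up by DECR_PC only when the thread was not single-stepping
   and a breakpoint sits at STOP_PC - DECR_PC.  Which thread is
   currently selected by the user plays no role.  */
uint64_t
pc_to_report (uint64_t stop_pc, uint64_t decr_pc,
	      uint64_t breakpoint_addr, bool thread_was_stepping)
{
  assert (stop_pc >= decr_pc);
  if (!thread_was_stepping && stop_pc - decr_pc == breakpoint_addr)
    return stop_pc - decr_pc;
  return stop_pc;
}
```

Taking ADDR1 = 0x1000, ADDR2 = 0x1001 and a decrement of 1, a thread that single-steps the jump at ADDR2 stops with PC = 0x1001 and must keep that PC, even though 0x1001 - 1 has a breakpoint.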
Here's the test's log of a run with an unpatched GDB:
35 while (1);
(gdb) PASS: gdb.threads/step-bg-decr-pc-switch-thread.exp: next over nop
next&
(gdb) PASS: gdb.threads/step-bg-decr-pc-switch-thread.exp: next& over inf loop
thread 1
[Switching to thread 1 (Thread 0x7ffff7fc2740 (LWP 29027))](running)
(gdb)
PASS: gdb.threads/step-bg-decr-pc-switch-thread.exp: switch to main thread
Breakpoint 2, thread_function (arg=0x0) at ...src/gdb/testsuite/gdb.threads/step-bg-decr-pc-switch-thread.c:34
34 NOP; /* set breakpoint here */
FAIL: gdb.threads/step-bg-decr-pc-switch-thread.exp: no output while stepping
gdb/ChangeLog:
2015-02-11 Pedro Alves <pedro@codesourcery.com>
* infrun.c (adjust_pc_after_break): Don't adjust the PC just
because the event thread is not the current thread.
gdb/testsuite/ChangeLog:
2015-02-11 Pedro Alves <pedro@codesourcery.com>
* gdb.threads/step-bg-decr-pc-switch-thread.c: New file.
* gdb.threads/step-bg-decr-pc-switch-thread.exp: New file.
Some local changes I was working on related to SIGTRAP handling
resulted in "signal SIGTRAP" no longer passing the SIGTRAP to the
inferior.
Surprisingly, only annota1.exp catches this. This commit adds a test
that doesn't rely on annotations, so that at the point annotations are
finally dropped, we still have this use case covered ...
This is a multi-threaded test to also exercise the case of first
needing to do a step-over before delivering the signal.
Tested on x86_64 Fedora 20, native, remote/extended-remote gdbserver.
gdb/testsuite/
2015-02-10 Pedro Alves <palves@redhat.com>
* gdb.threads/signal-sigtrap.c: New file.
* gdb.threads/signal-sigtrap.exp: New file.
The buildbot shows that this test is still racy, and occasionally
fails with time outs on some machines. I'd like to get major issues
with load out of the way.
The test currently exits after 180s, which is just a random number,
that has no relation to what the .exp file considers a time out. This
commit makes the program wait a bit longer than what the .exp file
considers a time out, and resets the timer for each iteration.
Tested on x86_64 Fedora 20, native and extended-remote gdbserver.
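The countdown can be sketched like this.  The names SECONDS, seconds_left and again come from the ChangeLog entry, but the 20-second margin, the default TIMEOUT, and the injectable sleep function are assumptions made so the loop can be exercised without actually sleeping; the real test program may differ in all of these.

```c
#include <assert.h>

/* The .exp file is described as building with -DTIMEOUT=$timeout;
   the fallback value and the margin below are assumptions.  */
#ifndef TIMEOUT
#define TIMEOUT 60
#endif
#define SECONDS (TIMEOUT + 20)

volatile unsigned int seconds_left;
volatile int again;		/* Set by the debugger to restart the count.  */

/* Count down in one-second steps, restarting whenever 'again' is set.
   Returns the total number of seconds slept.  */
unsigned int
wait_for_test_end (void (*sleep_one_second) (void))
{
  unsigned int slept = 0;

  assert (sleep_one_second != 0);
  seconds_left = SECONDS;
  while (seconds_left > 0)
    {
      sleep_one_second ();
      slept++;
      seconds_left--;
      if (again)
	{
	  again = 0;
	  seconds_left = SECONDS;
	}
    }
  return slept;
}

/* Stand-in for sleep (1) that simulates the debugger setting 'again'
   once, during the fifth second.  */
static unsigned int fake_seconds;

void
fake_sleep (void)
{
  if (++fake_seconds == 5)
    again = 1;
}
```

In the actual program the callback would simply call sleep (1).  With fake_sleep, the loop runs for 5 + SECONDS iterations, since the counter restarts once.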
gdb/testsuite/
2015-02-06 Pedro Alves <palves@redhat.com>
* gdb.threads/attach-many-short-lived-threads.c (SECONDS): New
macro.
(seconds_left, again): New globals.
(main): Wait seconds_left in a 1-second sleep loop instead of
sleeping 180 seconds. If 'again' is set, reset the seconds
counter.
* gdb.threads/attach-many-short-lived-threads.exp (test): Set
'again' in the inferior before detaching. Print the seconds left.
(options): New global.
(top level): Build program with -DTIMEOUT=$timeout.
GCC5 defaults to the GNU11 standard for C and warns by default for
implicit function declarations and implicit return types.
https://gcc.gnu.org/gcc-5/porting_to.html
Fixing these issues in the testsuite turns 9 untested and 17 unsupported
testcases into 417 new passes when compiling with GCC5.
gdb/testsuite/ChangeLog:
* gdb.arch/i386-bp_permanent.c (standard): New declaration.
* gdb.base/disp-step-fork.c: Include unistd.h.
* gdb.base/siginfo-obj.c: Include stdio.h.
* gdb.base/siginfo-thread.c: Likewise.
* gdb.mi/non-stop.c: Include unistd.h.
* gdb.mi/nsthrexec.c: Include stdio.h.
* gdb.mi/pthreads.c: Include unistd.h.
* gdb.modula2/unbounded1.c (main): Declare returns int.
* gdb.reverse/consecutive-reverse.c: Likewise.
* gdb.threads/create-fail.c: Include unistd.h.
* gdb.threads/killed.c: Likewise.
* gdb.threads/linux-dp.c: Likewise.
* gdb.threads/non-ldr-exc-1.c: Include stdio.h and string.h.
* gdb.threads/non-ldr-exc-2.c: Likewise.
* gdb.threads/non-ldr-exc-3.c: Likewise.
* gdb.threads/non-ldr-exc-4.c: Likewise.
* gdb.threads/pthreads.c: Include unistd.h.
(main): Declare returns int.
* gdb.threads/tls-main.c (foo): New declaration.
* gdb.threads/watchpoint-fork-mt.c: Define _GNU_SOURCE.
linux_nat_is_async_p currently always returns true, even when the
target is _not_ async. That confuses
gdb_readline_wrapper/gdb_readline_wrapper_cleanup, which
force-disables target-async while the secondary prompt is active. As
a result, when gdb_readline_wrapper returns, the target is left async,
even though it was sync to begin with.
That can result in weird bugs, like the one the test added by this
commit exposes.
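The save/restore problem can be shown with a toy model.  None of this is GDB code: struct target_sketch and the function names are hypothetical, and the secondary prompt is reduced to "disable async, do something, restore".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a target with an async on/off state.  */
struct target_sketch
{
  bool async_enabled;
};

typedef bool (*is_async_fn) (const struct target_sketch *);

/* Old behavior being described: claims async unconditionally.  */
bool
is_async_always_true (const struct target_sketch *t)
{
  (void) t;
  return true;
}

/* Accurate query: reports the actual state.  */
bool
is_async_accurate (const struct target_sketch *t)
{
  return t->async_enabled;
}

/* Model of a wrapper that force-disables async around a secondary
   prompt and restores it afterwards, based on what IS_ASYNC_P said
   on entry.  Returns the async state left behind.  */
bool
async_after_prompt (bool initially_async, is_async_fn is_async_p)
{
  struct target_sketch t;
  bool was_async;

  assert (is_async_p != NULL);
  t.async_enabled = initially_async;

  was_async = is_async_p (&t);
  if (was_async)
    t.async_enabled = false;

  /* ... the secondary prompt would run here ... */

  if (was_async)
    t.async_enabled = true;	/* The "restore" step.  */
  return t.async_enabled;
}
```

With the always-true query, a target that started out sync comes back async; with the accurate query it stays sync.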
Ref: https://sourceware.org/ml/gdb-patches/2015-01/msg00592.html
gdb/ChangeLog:
2015-01-23 Pedro Alves <palves@redhat.com>
* linux-nat.c (linux_is_async_p): New macro.
(linux_nat_is_async_p):
(linux_nat_terminal_inferior): Check whether the target can async
instead of whether it is already async.
(linux_nat_terminal_ours): Don't check whether the target is
async.
(linux_async_pipe): Use linux_is_async_p.
gdb/testsuite/ChangeLog:
2015-01-23 Pedro Alves <palves@redhat.com>
* gdb.threads/continue-pending-after-query.c: New file.
* gdb.threads/continue-pending-after-query.exp: New file.
This commit adds a non-stop mode test originally inspired by
signal-while-stepping-over-bp-other-thread.exp, that exposes the
thread starvation issues fixed by the previous patches. It sets a set
of threads stepping in parallel, and has one of them get a signal.
Without the previous fixes, this would fail with timeouts.
gdb/testsuite/
2015-01-09 Pedro Alves <palves@redhat.com>
* gdb.threads/non-stop-fair-events.c: New file.
* gdb.threads/non-stop-fair-events.exp: New file.
These three tests all spawn a few threads and then send a SIGSTOP to
their parent GDB in order to pause it while the new threads set things
up for the test. With a GDB patch that changes the inferior thread's
scheduling a bit, I sometimes see:
FAIL: gdb.threads/siginfo-threads.exp: catch signal 0 (timeout)
...
FAIL: gdb.threads/watchthreads-reorder.exp: reorder1: continue a (timeout)
...
FAIL: gdb.threads/ia64-sigill.exp: continue (timeout)
...
The issue is that the test program stops GDB before it had a chance of
processing the new thread's clone event:
(gdb) PASS: gdb.threads/siginfo-threads.exp: get pid
continue
Continuing.
Stopping GDB PID 21541.
Waiting till the threads initialize their TIDs.
FAIL: gdb.threads/siginfo-threads.exp: catch signal 0 (timeout)
On Linux (at least), new threads start stopped, and the debugger must
resume them. The fix is to make the test program wait for the new
threads to be running before stopping GDB.
gdb/testsuite/
2015-01-09 Pedro Alves <palves@redhat.com>
* gdb.threads/ia64-sigill.c (threads_started_barrier): New global.
(thread_func): Wait on barrier.
(main): Wait for all threads to start before stopping GDB.
* gdb.threads/siginfo-threads.c (threads_started_barrier): New
global.
(thread1_func, thread2_func): Wait on barrier.
(main): Wait for all threads to start before stopping GDB.
* gdb.threads/watchthreads-reorder.c (threads_started_barrier):
New global.
(thread1_func, thread2_func): Wait on barrier.
(main): Wait for all threads to start before stopping GDB.
Before the previous fixes, on Linux, this would trigger several
different problems, like:
[New LWP 27106]
[New LWP 27047]
warning: unable to open /proc file '/proc/-1/status'
[New LWP 27813]
[New LWP 27869]
warning: Can't attach LWP 11962: No child processes
Warning: couldn't activate thread debugging using libthread_db: Cannot find new threads: debugger service failed
warning: Unable to find libthread_db matching inferior's thread library, thread debugging will not be available.
gdb/testsuite/
2015-01-09 Pedro Alves <palves@redhat.com>
* gdb.threads/attach-many-short-lived-threads.c: New file.
* gdb.threads/attach-many-short-lived-threads.exp: New file.
[A test I wrote stumbled on a libthread_db issue related to thread
event breakpoints. See glibc PR17705:
[nptl_db: stale thread create/death events if debugger detaches]
https://sourceware.org/bugzilla/show_bug.cgi?id=17705
This patch avoids that whole issue by making GDB stop using thread
event breakpoints in the first place, which is good for other reasons
as well, anyway.]
Before PTRACE_EVENT_CLONE (Linux 2.6), the only way to learn about new
threads in the inferior (to attach to them) or to learn about thread
exit was to coordinate with the inferior's glibc/runtime, using
libthread_db. That works by putting a breakpoint at a magic address
which is called when a new thread is spawned, or when a thread is
about to exit. When that breakpoint is hit, all threads are stopped,
and then GDB coordinates with libthread_db to read data structures out
of the inferior to learn about what happened. Then the breakpoint is
single-stepped, and then all threads are re-resumed. This isn't very
efficient (stops all threads) and is more fragile (inferior's thread
list in memory may be corrupt; libthread_db bugs, etc.) than ideal.
When the kernel supports PTRACE_EVENT_CLONE (which we already make use
of), there's really no need to use libthread_db's event reporting
mechanism to learn about new LWPs. And if the kernel supports that,
then we learn about LWP exits through regular WIFEXITED wait statuses,
so no need for the death event breakpoint either.
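For readers unfamiliar with how these events surface, the sketch below decodes a waitpid status the way the ptrace documentation describes for PTRACE_EVENT_* stops.  The function names are hypothetical, and stop_status_for_event hard-codes the Linux wait status layout (0x7f in the low byte for a stop), which is an assumption used only to construct example values.

```c
#include <assert.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

/* Build a Linux-style "stopped" wait status carrying SIGTRAP plus a
   ptrace event code.  For illustration and testing only.  */
int
stop_status_for_event (int event)
{
  assert (event >= 0 && event <= 0xff);
  return ((SIGTRAP | (event << 8)) << 8) | 0x7f;
}

/* Nonzero if STATUS reports that the tracee created a new LWP via
   clone.  */
int
is_clone_event (int status)
{
  return (WIFSTOPPED (status)
	  && (status >> 8) == (SIGTRAP | (PTRACE_EVENT_CLONE << 8)));
}

/* Nonzero if STATUS is a regular exit status for an LWP.  */
int
is_lwp_exit (int status)
{
  return WIFEXITED (status);
}
```

A debugger that sees is_clone_event can add the new LWP right away, and one that sees is_lwp_exit can drop it, without any breakpoint inside the inferior's thread library.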
GDBserver has been likewise skipping the thread_db events for a long
while:
https://sourceware.org/ml/gdb-patches/2007-10/msg00547.html
There's one user-visible difference: we'll no longer print about
threads being created and exiting while the program is running, like:
[Thread 0x7ffff7dbb700 (LWP 30670) exited]
[New Thread 0x7ffff7db3700 (LWP 30671)]
[Thread 0x7ffff7dd3700 (LWP 30667) exited]
[New Thread 0x7ffff7dab700 (LWP 30672)]
[Thread 0x7ffff7db3700 (LWP 30671) exited]
[Thread 0x7ffff7dcb700 (LWP 30668) exited]
This is exactly the same behavior as when debugging against remote
targets / gdbserver. I actually think that's a good thing (and as
such have listed this in the local/remote parity wiki page a while
ago), as the printing slows down the inferior. It's also a
distraction to keep bothering the user about short-lived threads that
she won't be able to interact with anyway. Instead, the user (and
frontend) will be informed about new threads that currently exist in
the program when the program next stops:
(gdb) c
...
* ctrl-c *
[New Thread 0x7ffff7963700 (LWP 7797)]
[New Thread 0x7ffff796b700 (LWP 7796)]
Program received signal SIGINT, Interrupt.
[Switching to Thread 0x7ffff796b700 (LWP 7796)]
clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:81
81 testq %rax,%rax
(gdb) info threads
A couple of tests had assumptions on GDB thread numbers that no longer
hold.
Tested on x86_64 Fedora 20.
gdb/
2014-01-09 Pedro Alves <palves@redhat.com>
Skip enabling event reporting if the kernel supports
PTRACE_EVENT_CLONE.
* linux-thread-db.c: Include "nat/linux-ptrace.h".
(thread_db_use_events): New function.
(try_thread_db_load_1): Check thread_db_use_events before enabling
event reporting.
(update_thread_state): New function.
(attach_thread): Use it. Check thread_db_use_events before
enabling event reporting.
(thread_db_detach): Check thread_db_use_events before disabling
event reporting.
(find_new_threads_callback): Check thread_db_use_events before
enabling event reporting. Update the thread's state if not using
libthread_db events.
gdb/testsuite/
2014-01-09 Pedro Alves <palves@redhat.com>
* gdb.threads/fork-thread-pending.exp: Switch to the main thread
instead of to thread 2.
* gdb.threads/signal-command-multiple-signals-pending.c (main):
Add barrier around each pthread_create call instead of around all
calls.
* gdb.threads/signal-command-multiple-signals-pending.exp (test):
Set a break on thread_function and have the child threads hit it
	one at a time.
The target->request_interrupt callback implements the handling for
ctrl-c. User types ctrl-c in GDB, GDB sends a \003 to the remote
target, and the remote target stops the program with a SIGINT, just
like if the user typed ctrl-c in GDBserver's terminal.
The trouble is that using kill_lwp(signal_pid, SIGINT) sends the
SIGINT directly to the program's main thread. If that thread has
exited already, then that kill won't do anything.
Instead, send the SIGINT to the process group, just like GDB
does (see inf-ptrace.c:inf_ptrace_stop).
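The difference comes down to the sign of the first argument to kill.  A minimal sketch, with signal_process_group as a hypothetical helper name; this is not the gdbserver code itself.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Send SIG to every process in process group PGID.  A negative first
   argument to kill addresses the group rather than one process, so
   delivery does not depend on any particular thread still existing.
   Returns what kill returns: 0 on success, -1 on error.  */
int
signal_process_group (pid_t pgid, int sig)
{
  /* kill (-1, ...) would mean "every process we may signal", so
     refuse group ids that would produce that.  */
  assert (pgid > 1);
  return kill (-pgid, sig);
}
```

For an interrupt request this would be called with SIGINT and the inferior's process group id.  Passing signal 0 instead only performs the permission and existence checks, which is a harmless way to try the call.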
gdb.threads/leader-exit.exp is extended to cover the scenario. It
fails against GDBserver before the patch.
Tested on x86_64 Fedora 20, native and GDBserver.
gdb/gdbserver/
2014-11-12 Pedro Alves <palves@redhat.com>
* linux-low.c (linux_request_interrupt): Always send a SIGINT to
the process group instead of to a specific LWP.
gdb/testsuite/
2014-11-12 Pedro Alves <palves@redhat.com>
* gdb.threads/leader-exit.exp: Test sending ctrl-c works after the
leader has exited.
This PR shows that GDB can easily trigger an assertion here, in
infrun.c:
5392 /* Did we find the stepping thread? */
5393 if (tp->control.step_range_end)
5394 {
5395 /* Yep. There should only one though. */
5396 gdb_assert (stepping_thread == NULL);
5397
5398 /* The event thread is handled at the top, before we
5399 enter this loop. */
5400 gdb_assert (tp != ecs->event_thread);
5401
5402 /* If some thread other than the event thread is
5403 stepping, then scheduler locking can't be in effect,
5404 otherwise we wouldn't have resumed the current event
5405 thread in the first place. */
5406 gdb_assert (!schedlock_applies (currently_stepping (tp)));
5407
5408 stepping_thread = tp;
5409 }
Like:
gdb/infrun.c:5406: internal-error: switch_back_to_stepped_thread: Assertion `!schedlock_applies (1)' failed.
The way the assertion is written is assuming that with schedlock=step
we'll always leave threads other than the one with the stepping range
locked, while that's not true with the "next" command. With schedlock
"step", other threads still run unlocked when "next" detects a
function call and steps over it. Whether that makes sense or not,
still, it's documented that way in the manual. If another thread hits
an event that doesn't cause a stop while the nexting thread steps over
a function call, we'll get here and fail the assertion.
The fix is just to adjust the assertion. Even though we found the
stepping thread, we'll still step-over the breakpoint that just
triggered correctly.
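A simplified model of why a thread with a stepping range is not necessarily "stepping" in the scheduler-locking sense.  The types and the _sketch functions below are assumptions written for this note; in particular, the real currently_stepping checks more conditions than the single one modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum schedlock_mode_sketch
{
  SCHEDLOCK_OFF,
  SCHEDLOCK_ON,
  SCHEDLOCK_STEP
};

/* Hypothetical, reduced view of a thread's stepping state.  */
struct thread_sketch
{
  unsigned long step_range_end;		/* Nonzero during "step"/"next".  */
  const void *step_resume_breakpoint;	/* Non-NULL while stepping over
					   a function call.  */
};

/* Whether other threads stay locked, given the mode and whether the
   thread in question is single-stepping.  */
bool
schedlock_applies_sketch (enum schedlock_mode_sketch mode, bool step)
{
  return mode == SCHEDLOCK_ON || (mode == SCHEDLOCK_STEP && step);
}

/* A thread that is running freely to its step-resume breakpoint has a
   stepping range but is not single-stepping right now.  */
bool
currently_stepping_sketch (const struct thread_sketch *tp)
{
  assert (tp != NULL);
  return tp->step_range_end != 0 && tp->step_resume_breakpoint == NULL;
}
```

With schedlock=step and a thread that is "next"ing over a call, assuming "stepping" unconditionally makes schedlock_applies_sketch return true, while asking currently_stepping_sketch first makes it return false, matching the documented behavior that other threads run unlocked in that window.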
Surprisingly, gdb.threads/schedlock.exp doesn't have any test that
steps over a function call.  This commit fixes that.  This ensures
that "next" doesn't switch focus to another thread, and checks whether
other threads run locked or not, depending on scheduler locking mode
and command.  There's a lot of duplication in that file that this ends
up cleaning up.  There's more that could be cleaned up, but that would
end up an unrelated change, best done separately.
This new coverage in schedlock.exp happens to trigger the internal
error in question, like so:
FAIL: gdb.threads/schedlock.exp: schedlock=step: cmd=next: call_function=1: next to increment (1) (GDB internal error)
FAIL: gdb.threads/schedlock.exp: schedlock=step: cmd=next: call_function=1: next to increment (3) (GDB internal error)
FAIL: gdb.threads/schedlock.exp: schedlock=step: cmd=next: call_function=1: next to increment (5) (GDB internal error)
FAIL: gdb.threads/schedlock.exp: schedlock=step: cmd=next: call_function=1: next to increment (7) (GDB internal error)
FAIL: gdb.threads/schedlock.exp: schedlock=step: cmd=next: call_function=1: next to increment (9) (GDB internal error)
FAIL: gdb.threads/schedlock.exp: schedlock=step: cmd=next: call_function=1: next does not change thread (switched to thread 0)
FAIL: gdb.threads/schedlock.exp: schedlock=step: cmd=next: call_function=1: current thread advanced - unlocked (wrong amount)
That's because we have more than one thread running the same loop, and
while one thread is stepping over a function call, the other thread
hits the step-resume breakpoint of the first, which needs to be
stepped over, and we end up in switch_back_to_stepped_thread exactly
in the problem case.
I think a simpler and more directed test is also useful, to not rely
on internal breakpoint magic.  So this commit also adds a test that
has a thread trip on a conditional breakpoint that doesn't cause a
user-visible stop while another thread is stepping over a call. That
currently fails like this:
FAIL: gdb.threads/next-bp-other-thread.exp: schedlock=step: next over function call (GDB internal error)
Tested on x86_64 Fedora 20.
gdb/
2014-10-29 Pedro Alves <palves@redhat.com>
PR gdb/17408
* infrun.c (switch_back_to_stepped_thread): Use currently_stepping
instead of assuming a thread with a stepping range is always
stepping.
gdb/testsuite/
2014-10-29 Pedro Alves <palves@redhat.com>
PR gdb/17408
* gdb.threads/schedlock.c (some_function): New function.
(call_function): New global.
(MAYBE_CALL_SOME_FUNCTION): New macro.
(thread_function): Call it.
* gdb.threads/schedlock.exp (get_args): Add description parameter,
and use it instead of a global counter. Adjust all callers.
(get_current_thread): Use "find current thread" for test message
here rather than having all callers pass down the same string.
(goto_loop): New procedure, factored out from ...
(my_continue): ... this.
(step_ten_loops): Change parameter from test message to command to
use. Adjust.
(list_count): Delete global.
(check_result): New procedure, factored out from duplicate top
level code.
(continue tests): Wrap in with_test_prefix.
(test_step): New procedure, factored out from duplicate top level
code.
(top level): Test "step" in combination with all scheduler-locking
modes. Test "next" in combination with all scheduler-locking
modes, and in combination with stepping over a function call or
not.
* gdb.threads/next-bp-other-thread.c: New file.
* gdb.threads/next-bp-other-thread.exp: New file.
This commit does most of the mechanical removal. IOW, the easy part.
procfs.c isn't touched beyond removing a couple obvious bits that are
guarded by a couple macros defined in config/alpha/nm-osf3.h. Going
beyond that for procfs.c & co would be a harder excision that
potentially affects Solaris.
Some comments in the generic alpha code describe ABIs that may still be
relevant, and I wouldn't know what to do with them. That can always be
done on a separate pass, preferably by someone who can test on alpha.
A couple other spots have references to OSF/Tru64 and related files
being removed, but it felt like removing them would make things worse,
not better. We can revisit those when we next need to touch that
code.
I didn't remove a reference to osf in testsuite/lib/future.exp, as I
believe that code is imported from DejaGNU.
Built and tested on x86_64 Fedora 20, with --enable-targets=all.
Tested that building for --target=alpha-osf3 on x86_64 Fedora 20
fails with:
checking for default auto-load directory... $debugdir:$datadir/auto-load
checking for default auto-load safe-path... $debugdir:$datadir/auto-load
*** Configuration alpha-unknown-osf3 is obsolete.
*** Support has been REMOVED.
make[1]: *** [configure-gdb] Error 1
make[1]: Leaving directory `build-osf'
make: *** [all] Error 2
gdb/
2014-10-17 Pedro Alves <palves@redhat.com>
* Makefile.in (ALL_64_TARGET_OBS): Remove alpha-osf1-tdep.o.
(HFILES_NO_SRCDIR): Remove config/alpha/nm-osf3.h.
(ALLDEPFILES): Remove alpha-nat.c, alpha-osf1-tdep.c and
solib-osf.c.
* NEWS: Mention that support for alpha*-*-osf* has been removed.
* ada-lang.h [__alpha__ && __osf__]
(ADA_KNOWN_RUNTIME_FILE_NAME_PATTERNS): Delete.
* alpha-nat.c, alpha-osf1-tdep.c: Delete files.
* alpha-tdep.c (alpha_gdbarch_init): Remove reference to
GDB_OSABI_OSF1.
* config/alpha/alpha-osf3.mh, config/alpha/nm-osf3.h: Delete
files.
* config/djgpp/fnchange.lst (config/alpha/alpha-osf1.mh)
(config/alpha/alpha-osf2.mh, config/alpha/alpha-osf3.mh): Delete.
* configure: Regenerate.
* configure.ac: Remove references to osf.
* configure.host: Handle alpha*-*-osf* in the obsolete hosts
section. Remove all other references to osf.
* configure.tgt: Add alpha*-*-osf* to the obsolete targets section.
Remove all other references to osf.
* dec-thread.c: Delete file.
* defs.h (GDB_OSABI_OSF1): Delete.
* inferior.h (START_INFERIOR_TRAPS_EXPECTED): Now unconditionally
defined.
* osabi.c (gdb_osabi_names): Delete "OSF/1".
* procfs.c (procfs_debug_inferior) [PROCFS_DONT_TRACE_FAULTS]:
Delete code.
(unconditionally_kill_inferior)
[PROCFS_NEED_CLEAR_CURSIG_FOR_KILL]: Delete code.
* solib-osf.c: Delete file.
gdb/testsuite/
2014-10-17 Pedro Alves <palves@redhat.com>
* gdb.base/callfuncs.exp: Remove references to osf.
* gdb.base/sigall.exp: Likewise.
* gdb.gdb/selftest.exp: Likewise.
* gdb.hp/gdb.base-hp/callfwmall.exp: Likewise.
* gdb.mi/non-stop.c: Likewise.
* gdb.mi/pthreads.c: Likewise.
* gdb.reverse/sigall-precsave.exp: Likewise.
* gdb.reverse/sigall-reverse.exp: Likewise.
* gdb.threads/pthreads.c: Likewise.
* gdb.threads/pthreads.exp: Likewise.
gdb/doc/
2014-10-17 Pedro Alves <palves@redhat.com>
* gdb.texinfo (Ada Tasks and Core Files): Delete mention of Tru64.
(SVR4 Process Information): Delete mention of OSF/1.
As a result of the patch below, GDB updates the thread list when a stop
is presented to the user, so the tests don't have to fetch the thread
list explicitly.
[PATCH 3/3] Fix non-stop regressions caused by "breakpoints always-inserted off" changes
https://sourceware.org/ml/gdb-patches/2014-09/msg00734.html
This patch removes the test code that updates the thread list.
Ran these three tests many times on arm-linux-gnueabi and x86-linux.
No regressions.
gdb/testsuite:
2014-10-11 Yao Qi <yao@codesourcery.com>
* gdb.threads/thread-find.exp: Don't execute command
"info threads".
* gdb.threads/attach-into-signal.exp (corefunc): Likewise.
* gdb.threads/linux-dp.exp: Don't check whether
$threads_created equals zero.
In git b57bacec, I said:
> With that in place, the need to delay "Program received signal FOO"
> was actually caught by the manythreads.exp test. Without that bit, I
> was getting:
>
> [Thread 0x7ffff7f13700 (LWP 4499) exited]
> [New Thread 0x7ffff7f0b700 (LWP 4500)]
> ^C
> Program received signal SIGINT, Interrupt.
> [New Thread 0x7ffff7f03700 (LWP 4501)] <<< new output
> [Switching to Thread 0x7ffff7f0b700 (LWP 4500)]
> __GI___nptl_death_event () at events.c:31
> 31 {
> (gdb) FAIL: gdb.threads/manythreads.exp: stop threads 1
>
> That is, I was now getting "New Thread" lines after the "Program
> received signal" line, and the test doesn't expect them. As the
> number of new threads discovered before and after the "Program
> received signal" output is unbounded, it's much nicer to defer
> "Program received signal" until after synching the thread list, thus
> close to the "switching to thread" output and "current frame/source"
> info:
>
> [Thread 0x7ffff7863700 (LWP 7647) exited]
> ^C[New Thread 0x7ffff786b700 (LWP 7648)]
>
> Program received signal SIGINT, Interrupt.
> [Switching to Thread 0x7ffff7fc4740 (LWP 6243)]
> __GI___nptl_create_event () at events.c:25
> 25 {
> (gdb) PASS: gdb.threads/manythreads.exp: stop threads 1
This commit factors out the two places in the test that are affected
by this, and adds there a distilled version of the comment above.
gdb/testsuite/
2014-10-02 Pedro Alves <palves@redhat.com>
* gdb.threads/manythreads.exp (interrupt_and_wait): New procedure.
(top level) <stop threads 1, stop threads 2>: Use it.
Commit a25a5a45 (Fix "breakpoint always-inserted off"; remove
"breakpoint always-inserted auto") regressed non-stop remote
debugging.
This was exposed by mi-nsintrall.exp intermittently failing with a
spurious SIGTRAP.
The problem is that when debugging with "target remote", new threads
that the target has spawned but that have never reported a stop aren't
visible to GDB until it explicitly resyncs its thread list with the
target's.
For example, in a program like this:
int
main (void)
{
pthread_t child_thread;
pthread_create (&child_thread, NULL, child_function, NULL);
return 0; <<<< set breakpoint here
}
If the user sets a breakpoint at the "return" statement, and runs the
program, when that breakpoint hit is reported, GDB is only aware of
the main thread. So if we base the decision to remove or insert
breakpoints from the target based on whether all the threads we know
about are stopped, we'll miss that child_thread is running, and thus
we'll remove breakpoints from the target, even though they should
still remain inserted, otherwise child_thread will miss them.
The break-while-running.exp test actually should also be exposing this
thread-list-out-of-synch problem. That test sets a breakpoint while
the main thread is stopped, but other threads are running. Because
other threads are running, the breakpoint is supposed to be inserted
immediately. But, unless something forces a refetch of the thread
list, like, e.g., "info threads", GDB won't be aware of the other
threads that had been spawned by the main thread, and so won't insert
new or old breakpoints in the target. And it turns out that the test
is exactly doing an explicit "info threads", masking out the
problem... This commit adjusts the test to exercise the case of not
issuing "info threads". The test then fails without the GDB fix.
In the mi-nsintrall.exp case, what happens is that several threads hit
the same breakpoint, and when the first thread reports the stop,
because GDB wasn't aware other threads exist, all threads known to GDB
are found stopped, so GDB removes the breakpoints from the target.
The other threads follow up with SIGTRAPs too for that same
breakpoint, which has already been removed. For the first few
threads, the moribund breakpoints machinery suppresses the SIGTRAPs,
but after a few events (precisely '3 * thread_count () + 1' at the
time the breakpoint was removed, see update_global_location_list), the
moribund breakpoint machinery is no longer aware of the removed
breakpoint, and the SIGTRAP is reported as a spurious stop.
The fix is naturally then to stop assuming that if no thread in the
list is executing, then the target is fully stopped. We can't know
that until we fully sync the thread list. Because updating the thread
list on every stop would be too much RSP traffic, I chose instead to
update it whenever we're about to present a stop to the user.
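To illustrate why the stale list matters, here is a minimal model,
assuming a reduced thread record. All tl_* names are hypothetical;
tl_sync merely stands in for refetching the thread list from the target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TL_MAX_THREADS 8

/* Hypothetical, reduced thread record.  */
struct tl_thread
{
  int id;
  bool executing;
};

/* A thread list: either GDB's view, or the target's own.  */
struct tl_list
{
  struct tl_thread threads[TL_MAX_THREADS];
  size_t count;
};

/* Return true if any thread in LIST is marked as executing.  */
bool
tl_any_executing (const struct tl_list *list)
{
  assert (list != NULL && list->count <= TL_MAX_THREADS);
  for (size_t i = 0; i < list->count; i++)
    if (list->threads[i].executing)
      return true;
  return false;
}

/* Stand-in for resyncing GDB's list with the target's.  */
void
tl_sync (struct tl_list *known, const struct tl_list *target)
{
  assert (known != NULL && target != NULL);
  *known = *target;
}
```

If the known list holds only a stopped main thread while the target
also has a running child, tl_any_executing returns false on the known
list and true once tl_sync has been called.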
Actually updating the thread list at that point happens to be an item
I had added to the local/remote parity wiki page a while ago:
Native GNU/Linux debugging adds new threads to the thread list as
the program creates them "The [New Thread foo] messages". Remote
debugging can't do that, and it's arguable whether we shouldn't even
stop native debugging from doing that, as it hinders inferior
performance. However, a related issue is that with remote targets
(and gdbserver), even after the program stops, the user still needs
to do "info threads" to pull an updated thread list. This should
most likely be addressed, so that GDB pulls the list itself, perhaps
just before presenting a stop to the user.
With that in place, the need to delay "Program received signal FOO"
was actually caught by the manythreads.exp test. Without that bit, I
was getting:
[Thread 0x7ffff7f13700 (LWP 4499) exited]
[New Thread 0x7ffff7f0b700 (LWP 4500)]
^C
Program received signal SIGINT, Interrupt.
[New Thread 0x7ffff7f03700 (LWP 4501)] <<< new output
[Switching to Thread 0x7ffff7f0b700 (LWP 4500)]
__GI___nptl_death_event () at events.c:31
31 {
(gdb) FAIL: gdb.threads/manythreads.exp: stop threads 1
That is, I was now getting "New Thread" lines after the "Program
received signal" line, and the test doesn't expect them. As the
number of new threads discovered before and after the "Program
received signal" output is unbounded, it's much nicer to defer
"Program received signal" until after synching the thread list, thus
close to the "switching to thread" output and "current frame/source"
info:
[Thread 0x7ffff7863700 (LWP 7647) exited]
^C[New Thread 0x7ffff786b700 (LWP 7648)]
Program received signal SIGINT, Interrupt.
[Switching to Thread 0x7ffff7fc4740 (LWP 6243)]
__GI___nptl_create_event () at events.c:25
25 {
(gdb) PASS: gdb.threads/manythreads.exp: stop threads 1
Tested on x86_64 Fedora 20, native and gdbserver.
gdb/
2014-10-02 Pedro Alves <palves@redhat.com>
* breakpoint.c (breakpoints_should_be_inserted_now): Use
threads_are_executing.
* breakpoint.h (breakpoints_should_be_inserted_now): Add
describing comment.
* gdbthread.h (threads_are_executing): Declare.
(handle_signal_stop) <random signals>: Don't print about the
signal here if stopping.
(end_stepping_range): Don't notify observers here.
(normal_stop): Update the thread list. If stopped by a random
signal or a stepping range ended, notify observers.
* thread.c (threads_executing): New global.
(init_thread_list): Clear 'threads_executing'.
(set_executing): Set or clear 'threads_executing'.
(threads_are_executing): New function.
(update_threads_executing): New function.
(update_thread_list): Use it.
gdb/testsuite/
2014-10-02 Pedro Alves <palves@redhat.com>
* gdb.threads/break-while-running.exp (test): Add new
'update_thread_list' argument. Skip "info threads" if false.
(top level): Add new 'update_thread_list' axis.
I see the following failures on arm-linux-gnueabi:
result of ldd build-git/arm/gdb/testsuite/gdb.threads/dlopen-libpthread.so is 1
output of ldd build-git/arm/gdb/testsuite/gdb.threads/dlopen-libpthread.so is not a dynamic executable
child process exited abnormally
FAIL: gdb.threads/dlopen-libpthread.exp: ldd dlopen-libpthread.so
FAIL: gdb.threads/dlopen-libpthread.exp: ldd dlopen-libpthread.so output contains libs
The test script invokes ldd (on the host) for the target libraries,
which is wrong. ldd can't be used for cross testing because it invokes
the dynamic linker with LD_TRACE_LOADED_OBJECTS to get the dependent
libraries. My first reaction to this problem was to execute ld.so on
the target (e.g., with remote_exec target). When I started to hack proc
build_executable_own_libs, I found it has assumptions here and there
that native testing is being performed. Then I checked the callers of
build_executable_own_libs, and they are all skipped if isnative is
false. It is reasonable to do the same in dlopen-libpthread.exp too.
gdb/testsuite:
2014-09-30 Yao Qi <yao@codesourcery.com>
* gdb.threads/dlopen-libpthread.exp: Skip it if isnative is
false.
By default, GDB removes all breakpoints from the target when the
target stops and the prompt is given back to the user. This is useful
in case GDB crashes while the user is interacting, as otherwise,
there's a higher chance breakpoints would be left planted on the
target.
But, as long as any thread is running free, we need to make sure to
keep breakpoints inserted, lest a thread misses a breakpoint. With
that in mind, in preparation for non-stop mode, we added a "breakpoint
always-inserted on" mode. This traded off the extra crash protection
for never having threads miss breakpoints, and in addition is more
efficient if there's a ton of breakpoints to remove/insert at each
user command (e.g., at each "step").
When we added non-stop mode, and for a period, we required users to
manually set "always-inserted on" when they enabled non-stop mode, as
otherwise GDB removes all breakpoints from the target as soon as any
thread stops, which means the other threads still running will miss
breakpoints. The test added by this patch exercises this.
That soon revealed a nuisance, and so later we added an extra
"breakpoint always-inserted auto" mode, that made GDB behave like
"always-inserted on" when non-stop was enabled, and "always-inserted
off" when non-stop was disabled. "auto" was made the default at the
same time.
In hindsight, this "auto" setting was unnecessary, and not the ideal
solution. Non-stop mode does depend on breakpoints always-inserted
mode, but only as long as any thread is running. If no thread is
running, no breakpoint can be missed. The same is true for all-stop
too. E.g., if, in all-stop mode, the user does:
(gdb) c&
(gdb) b foo
That breakpoint at "foo" should be inserted immediately, but it
currently isn't -- currently it'll end up inserted only if the target
happens to trip on some event, and is re-resumed, e.g., an internal
breakpoint triggers that doesn't cause a user-visible stop, and so we
end up in keep_going calling insert_breakpoints. The test added by
this patch also covers this.
IOW, no matter whether in non-stop or all-stop, if the target fully
stops, we can remove breakpoints. And no matter whether in all-stop
or non-stop, if any thread is running in the target, then we need
breakpoints to be immediately inserted. And then, if the target has
global breakpoints, we need to keep breakpoints even when the target
is stopped.
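The rule in the paragraph above boils down to a small predicate. The
following is a sketch under stated assumptions, not the code added to
breakpoint.c: the bsi_* names and the flattened state struct are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced view of a thread.  */
struct bsi_thread
{
  bool executing;
};

/* Hypothetical snapshot of the state the decision depends on.  */
struct bsi_state
{
  bool always_inserted;      /* "set breakpoint always-inserted on".  */
  bool global_breakpoints;   /* The target has global breakpoints.  */
  const struct bsi_thread *threads;
  size_t n_threads;
};

/* Return true if breakpoints must be kept inserted in the target
   right now, false if they may be removed.  */
bool
bsi_should_be_inserted_now (const struct bsi_state *s)
{
  assert (s != NULL);

  /* Global breakpoints stay in even when the target is stopped.  */
  if (s->global_breakpoints)
    return true;

  /* The user asked for breakpoints to stay in at all times.  */
  if (s->always_inserted)
    return true;

  /* Otherwise, keep them only while some thread is running, in
     all-stop and non-stop alike.  */
  for (size_t i = 0; i < s->n_threads; i++)
    if (s->threads[i].executing)
      return true;

  return false;
}
```

Note that the predicate never consults the non-stop setting, which is
the point of dropping the "auto" mode.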
So with that in mind, and aiming at reducing all-stop vs non-stop
differences for all-stop-on-stop-of-non-stop, this patch fixes
"breakpoint always-inserted off" to not remove breakpoints from the
target until it fully stops, and then removes the "auto" setting as
unnecessary. I propose removing it straight away rather than keeping
it as an alias, unless someone complains they have scripts that need
it and that can't adjust.
Tested on x86_64 Fedora 20.
gdb/
2014-09-22 Pedro Alves <palves@redhat.com>
* NEWS: Mention merge of "breakpoint always-inserted" modes "off"
and "auto".
* breakpoint.c (enum ugll_insert_mode): New enum.
(always_inserted_mode): Now a plain boolean.
(show_always_inserted_mode): No longer handle AUTO_BOOLEAN_AUTO.
(breakpoints_always_inserted_mode): Delete.
(breakpoints_should_be_inserted_now): New function.
(insert_breakpoints): Pass UGLL_INSERT to
update_global_location_list instead of calling
insert_breakpoint_locations manually.
(create_solib_event_breakpoint_1): New, factored out from ...
(create_solib_event_breakpoint): ... this.
(create_and_insert_solib_event_breakpoint): Use
create_solib_event_breakpoint_1 instead of calling
insert_breakpoint_locations manually.
(update_global_location_list): Change parameter type from boolean
to enum ugll_insert_mode. All callers adjusted. Adjust to use
breakpoints_should_be_inserted_now and handle UGLL_INSERT.
(update_global_location_list_nothrow): Change parameter type from
boolean to enum ugll_insert_mode.
(_initialize_breakpoint): "breakpoint always-inserted" option is
now a boolean command. Update help text.
* breakpoint.h (breakpoints_always_inserted_mode): Delete declaration.
(breakpoints_should_be_inserted_now): New declaration.
* infrun.c (handle_inferior_event) <TARGET_WAITKIND_LOADED>:
Remove breakpoints_always_inserted_mode check.
(normal_stop): Adjust to use breakpoints_should_be_inserted_now.
* remote.c (remote_start_remote): Likewise.
gdb/doc/
2014-09-22 Pedro Alves <palves@redhat.com>
* gdb.texinfo (Set Breaks): Document that "set breakpoint
always-inserted off" is the default mode now. Delete
documentation of "set breakpoint always-inserted auto".
gdb/testsuite/
2014-09-22 Pedro Alves <palves@redhat.com>
* gdb.threads/break-while-running.exp: New file.
* gdb.threads/break-while-running.c: New file.
The test does a backtrace to see which thread (#2 or #3) is assigned
to which SIGUSR (1 or 2). If the main thread gets to all_threads_running
before the sigusr threads get to their entry point, then the function
name isn't in the backtrace and the test fails.
Alas this version of the code is within epsilon of what I started with,
and then over-simplified things.
If I want to change the signalled state of multiple threads
it's a bit cumbersome to do with the "signal" command.
What you really want is a way to set the signal state of the
desired threads and then just do "continue".
This patch adds a new command, queue-signal, to accomplish this.
Basically "signal N" == "queue-signal N" + "continue".
That's not precisely true in that "signal" can be used to inject
any signal, including signals set to "nopass"; whereas "queue-signal"
just queues the signal as if the thread stopped because of it.
"nopass" handling is done when the thread is resumed which
"queue-signal" doesn't do.
One could add extra complexity to allow queue-signal to be used to
deliver "nopass" signals like the "signal" command. I have no current
need for it so in the interests of incremental complexity, I have
left such support out and just have the code flag an error if one
tries to queue a nopass signal.
gdb/ChangeLog:
* NEWS: Mention new "queue-signal" command.
* infcmd.c (queue_signal_command): New function.
(_initialize_infcmd): Add new queue-signal command.
gdb/doc/ChangeLog:
* gdb.texinfo (Signaling): Document new queue-signal command.
gdb/testsuite/ChangeLog:
* gdb.threads/queue-signal.c: New file.
* gdb.threads/queue-signal.exp: New file.
See:
https://sourceware.org/ml/gdb-patches/2014-09/msg00404.html
We have a number of places that do gdb_run_cmd followed by gdb_expect,
when it would be better to use gdb_test_multiple or gdb_test.
This converts all that "grep gdb_run_cmd -A 2 | grep gdb_expect"
found.
Tested on x86_64 Fedora 20, native and gdbserver.
gdb/testsuite/
2014-09-12 Pedro Alves <palves@redhat.com>
* gdb.arch/gdb1558.exp: Replace uses of gdb_expect after
gdb_run_cmd with gdb_test_multiple or gdb_test throughout.
* gdb.arch/i386-size-overlap.exp: Likewise.
* gdb.arch/i386-size.exp: Likewise.
* gdb.arch/i386-unwind.exp: Likewise.
* gdb.base/a2-run.exp: Likewise.
* gdb.base/break.exp: Likewise.
* gdb.base/charset.exp: Likewise.
* gdb.base/chng-syms.exp: Likewise.
* gdb.base/commands.exp: Likewise.
* gdb.base/dbx.exp: Likewise.
* gdb.base/find.exp: Likewise.
* gdb.base/funcargs.exp: Likewise.
* gdb.base/jit-simple.exp: Likewise.
* gdb.base/reread.exp: Likewise.
* gdb.base/sepdebug.exp: Likewise.
* gdb.base/step-bt.exp: Likewise.
* gdb.cp/mb-inline.exp: Likewise.
* gdb.cp/mb-templates.exp: Likewise.
* gdb.objc/basicclass.exp: Likewise.
* gdb.threads/killed.exp: Likewise.
Program received signal SIGABRT, Aborted.
[...]
(gdb) gcore foobar
Couldn't get registers: No such process.
(gdb) info threads
[...]
(gdb) gcore foobar
Saved corefile foobar
(gdb)
gcore tries to access the exited thread:
[Thread 0x7ffff7fce700 (LWP 6895) exited]
ptrace(PTRACE_GETREGS, 6895, 0, 0x7fff18167dd0) = -1 ESRCH (No such process)
Without the TRY_CATCH protection the testsuite FAILs for:
gcore .../gdb/testsuite/gdb.threads/gcore-thread0.test
Cannot find new threads: debugger service failed
(gdb) FAIL: gdb.threads/gcore-thread.exp: save a zeroed-threads corefile
+
core .../gdb/testsuite/gdb.threads/gcore-thread0.test
".../gdb/testsuite/gdb.threads/gcore-thread0.test" is not a core dump: File format not recognized
(gdb) FAIL: gdb.threads/gcore-thread.exp: core0file: re-load generated corefile (bad file format)
Maybe the TRY_CATCH could be moved further inside update_thread_list().
A similar update_thread_list() call is IMO missing in procfs_make_note_section()
but I have nowhere to verify that change.
gdb/ChangeLog
2014-08-21 Jan Kratochvil <jan.kratochvil@redhat.com>
* linux-tdep.c (linux_corefile_thread_callback): Ignore THREAD_EXITED.
(linux_make_corefile_notes): Call update_thread_list, protected against
exceptions.
gdb/testsuite/ChangeLog
2014-08-21 Jan Kratochvil <jan.kratochvil@redhat.com>
* gdb.threads/gcore-stale-thread.c: New file.
* gdb.threads/gcore-stale-thread.exp: New file.
Checking whether the gcore command is included in the GDB build as
proxy for checking whether core dumping is supported by the target is
useless, as gcore.o has been in COMMON_OBS since git 9b4eba8e:
2009-10-26 Michael Snyder <msnyder@vmware.com>
Hui Zhu <teawater@gmail.com>
* Makefile.in (SFILES): Add gcore.c.
(COMMON_OBS): Add gcore.o.
* config/alpha/alpha-linux.mh (NATDEPFILES): Delete gcore.o.
* config/alpha/fbsd.mh (NATDEPFILES): Ditto.
...
IOW, the command is always included in the build.
Instead, nowadays, tests bail out if actually trying to generate a
core fails with an indication the target doesn't support it. See
gdb_gcore_cmd and callers.
Tested on x86_64 Fedora 20.
gdb/testsuite/ChangeLog:
* gdb.base/gcore-buffer-overflow.exp: Remove "help gcore" test.
* gdb.base/gcore-relro-pie.exp: Likewise.
* gdb.base/gcore-relro.exp: Likewise.
* gdb.base/gcore.exp: Likewise.
* gdb.base/print-symbol-loading.exp: Likewise.
* gdb.threads/gcore-thread.exp: Likewise.
* lib/gdb.exp (gdb_gcore_cmd): Don't expect "Undefined command".
Currently, GDB can pass a signal to the wrong thread in several
different but related scenarios.
E.g., if thread 1 stops for signal SIGFOO, the user switches to thread
2, and then issues "continue", SIGFOO is actually delivered to thread
2, not thread 1. This obviously messes up programs that use
pthread_kill to send signals to specific threads.
This has been a known issue for a long while. Back in 2008 when I
made stop_signal be per-thread (2020b7ab), I kept the behavior -- see
code in 'proceed' being removed -- wanting to come back to it later.
The time has finally come now.
The patch fixes this -- on resumption, intercepted signals are always
delivered to the thread that had intercepted them.
Another example: if thread 1 stops for a breakpoint, the user switches
to thread 2, and then issues "signal SIGFOO", SIGFOO is actually
delivered to thread 1, not thread 2, because 'proceed' first switches
to thread 1 to step over its breakpoint... If the user deletes the
breakpoint before issuing "signal SIGFOO", then the signal is delivered
to thread 2 (the current thread).
"signal SIGFOO" can be used for two things: inject a signal in the
program while the program/thread had stopped for none, bypassing
"handle nopass"; or changing/suppressing a signal the program had
stopped for. These scenarios are really two faces of the same coin,
and GDB can't really guess what the user is trying to do. GDB might
have intercepted signals in more than one thread even (see the new
signal-command-multiple-signals-pending.exp test). At least in the
inject case, it's clear to me that the user means to deliver
the signal to the currently selected thread, so best is to make the
command's behavior consistent and easy to explain.
Then, if the user is trying to suppress/change a signal the program
had stopped for instead of injecting a new signal, but, the user had
changed threads meanwhile, then she will be surprised that with:
(gdb) continue
Thread 1 stopped for signal SIGFOO.
(gdb) thread 2
(gdb) signal SIGBAR
... GDB actually delivers SIGFOO to thread 1, and SIGBAR to thread 2
(with scheduler-locking off, which is the default, because then
"signal" or any other resumption command resumes all threads).
So the patch makes GDB detect that, and ask for confirmation:
(gdb) thread 1
[Switching to thread 1 (Thread 10979)]
(gdb) signal SIGUSR2
Note:
Thread 3 previously stopped with signal SIGUSR2, User defined signal 2.
Thread 2 previously stopped with signal SIGUSR1, User defined signal 1.
Continuing thread 1 (the current thread) with specified signal will
still deliver the signals noted above to their respective threads.
Continue anyway? (y or n)
All these scenarios are covered by the new tests.
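As a rough model of the behavior described above, assuming
scheduler-locking is off so that every thread is resumed: the sd_*
names are hypothetical, and signals are plain ints chosen by the
caller.

```c
#include <assert.h>
#include <stddef.h>

#define SD_SIG_NONE 0

/* Hypothetical per-thread state for the model.  */
struct sd_thread
{
  int stop_signal;   /* Signal intercepted when the thread stopped.  */
  int delivered;     /* Signal handed to the thread on resumption.  */
};

/* Count threads other than SELECTED that still have an intercepted
   signal.  A non-zero result is the case in which the "signal"
   command asks for confirmation.  */
size_t
sd_others_with_pending_signal (const struct sd_thread *threads, size_t n,
                               size_t selected)
{
  size_t count = 0;

  assert (threads != NULL && selected < n);
  for (size_t i = 0; i < n; i++)
    if (i != selected && threads[i].stop_signal != SD_SIG_NONE)
      count++;
  return count;
}

/* Resume all threads.  If EXPLICIT_SIG is >= 0, model "signal SIG":
   it overrides the stop signal of the selected thread only.  Pass -1
   to model a plain "continue".  Every thread then receives its own
   stop signal, whichever thread happens to be selected.  */
void
sd_resume_all (struct sd_thread *threads, size_t n, size_t selected,
               int explicit_sig)
{
  assert (threads != NULL && selected < n);
  if (explicit_sig >= 0)
    threads[selected].stop_signal = explicit_sig;
  for (size_t i = 0; i < n; i++)
    {
      threads[i].delivered = threads[i].stop_signal;
      threads[i].stop_signal = SD_SIG_NONE;
    }
}
```

In the model, a thread that stopped for a signal gets that same signal
back on resumption even if the user selected a different thread in the
meantime, and an explicit signal only ever lands on the selected thread.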
Tested on x86_64 Fedora 20, native and gdbserver.
gdb/
2014-07-25 Pedro Alves <palves@redhat.com>
* NEWS: Mention signal passing and "signal" command changes.
* gdbthread.h (struct thread_suspend_state) <stop_signal>: Extend
comment.
* breakpoint.c (until_break_command): Adjust clear_proceed_status
call.
* infcall.c (run_inferior_call): Adjust clear_proceed_status call.
* infcmd.c (proceed_thread_callback, continue_1, step_once)
(jump_command): Adjust clear_proceed_status call.
(signal_command): Warn if other threads that are resumed have
signals that will be delivered. Adjust clear_proceed_status call.
(until_next_command, finish_command)
(proceed_after_attach_callback, attach_command_post_wait)
(attach_command): Adjust clear_proceed_status call.
* infrun.c (proceed_after_vfork_done): Likewise.
(proceed_after_attach_callback): Adjust comment.
(clear_proceed_status_thread): Clear stop_signal if not in pass
state.
(clear_proceed_status_callback): Delete.
(clear_proceed_status): New 'step' parameter. Only clear the
proceed status of threads the command being prepared is about to
resume.
(proceed): If passed in an explicit signal, override stop_signal
with it. Don't pass the last stop signal to the thread we're
resuming.
(init_wait_for_inferior): Adjust clear_proceed_status call.
(switch_back_to_stepped_thread): Clear the signal if it should not
be passed.
* infrun.h (clear_proceed_status): New 'step' parameter.
(user_visible_resume_ptid): Add comment.
* linux-nat.c (linux_nat_resume_callback): Don't check whether the
signal is in pass state.
* remote.c (append_pending_thread_resumptions): Likewise.
* mi/mi-main.c (proceed_thread): Adjust clear_proceed_status call.
gdb/doc/
2014-07-25 Pedro Alves <palves@redhat.com>
Eli Zaretskii <eliz@gnu.org>
* gdb.texinfo (Signaling) <signal command>: Explain what happens
with multi-threaded programs.
gdb/testsuite/
2014-07-25 Pedro Alves <palves@redhat.com>
* gdb.threads/signal-command-handle-nopass.c: New file.
* gdb.threads/signal-command-handle-nopass.exp: New file.
* gdb.threads/signal-command-multiple-signals-pending.c: New file.
* gdb.threads/signal-command-multiple-signals-pending.exp: New file.
* gdb.threads/signal-delivered-right-thread.c: New file.
* gdb.threads/signal-delivered-right-thread.exp: New file.
Here's an example, with the new test:
gdbserver :9999 gdb.threads/kill
gdb gdb.threads/kill
(gdb) b 52
Breakpoint 1 at 0x4007f4: file kill.c, line 52.
Continuing.
Breakpoint 1, main () at kill.c:52
52 return 0; /* set break here */
(gdb) k
Kill the program being debugged? (y or n) y
gdbserver :9999 gdb.threads/kill
Process gdb.base/watch_thread_num created; pid = 9719
Listening on port 1234
Remote debugging from host 127.0.0.1
Killing all inferiors
Segmentation fault (core dumped)
Backtrace:
(gdb) bt
#0 0x00000000004068a0 in find_inferior (list=0x66b060 <all_threads>, func=0x427637 <kill_one_lwp_callback>, arg=0x7fffffffd3fc) at src/gdb/gdbserver/inferiors.c:199
#1 0x00000000004277b6 in linux_kill (pid=15708) at src/gdb/gdbserver/linux-low.c:966
#2 0x000000000041354d in kill_inferior (pid=15708) at src/gdb/gdbserver/target.c:163
#3 0x00000000004107e9 in kill_inferior_callback (entry=0x6704f0) at src/gdb/gdbserver/server.c:2934
#4 0x0000000000406522 in for_each_inferior (list=0x66b050 <all_processes>, action=0x4107a6 <kill_inferior_callback>) at src/gdb/gdbserver/inferiors.c:57
#5 0x0000000000412377 in process_serial_event () at src/gdb/gdbserver/server.c:3767
#6 0x000000000041267c in handle_serial_event (err=0, client_data=0x0) at src/gdb/gdbserver/server.c:3880
#7 0x00000000004189ff in handle_file_event (event_file_desc=4) at src/gdb/gdbserver/event-loop.c:434
#8 0x00000000004181c6 in process_event () at src/gdb/gdbserver/event-loop.c:189
#9 0x0000000000418f45 in start_event_loop () at src/gdb/gdbserver/event-loop.c:552
#10 0x0000000000411272 in main (argc=3, argv=0x7fffffffd8d8) at src/gdb/gdbserver/server.c:3283
The problem is that linux_wait_for_event deletes lwps that have exited
(even those not passed in as lwps of interest), while the lwp/thread
list is being walked on with find_inferior. find_inferior can handle
the current iterated inferior being deleted, but not others.
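The iteration hazard can be illustrated with a generic singly-linked
list walk. This is a sketch of the usual "fetch next before calling
back" idiom, assumed here for illustration; it is not a copy of
find_inferior, and all names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct node
{
  struct node *next;
  int id;
};

typedef int (*node_callback) (struct node **head, struct node *n, void *arg);

/* Push a new node with ID at the front of the list.  */
struct node *
push_node (struct node **head, int id)
{
  struct node *n = malloc (sizeof *n);

  if (n != NULL)
    {
      n->id = id;
      n->next = *head;
      *head = n;
    }
  return n;
}

/* Unlink N from the list and free it, if it is on the list.  */
void
remove_node (struct node **head, struct node *n)
{
  struct node **link = head;

  while (*link != NULL && *link != n)
    link = &(*link)->next;
  if (*link == n)
    {
      *link = n->next;
      free (n);
    }
}

/* Walk the list, returning the first node for which FUNC returns
   non-zero.  NEXT is fetched before the callback runs, so FUNC may
   delete the node it was handed.  If FUNC deletes some other node,
   in particular the one saved in NEXT, the walk would go on to
   dereference freed memory.  */
struct node *
walk_list (struct node **head, node_callback func, void *arg)
{
  assert (head != NULL && func != NULL);
  struct node *n = *head;

  while (n != NULL)
    {
      struct node *next = n->next;

      if (func (head, n, arg))
        return n;
      n = next;
    }
  return NULL;
}

/* Example callback: count the visit, then delete the current node.  */
int
delete_current (struct node **head, struct node *n, void *arg)
{
  ++*(int *) arg;
  remove_node (head, n);
  return 0;
}
```

Walking with delete_current is safe and empties the list. A callback
that removed a node other than the one it was handed would not be,
which is the situation the crash above corresponds to.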
When killing lwps, we don't really care about any of the pending
status handling of linux_wait_for_event. We can just waitpid the lwps
directly, which is also what GDB does (see
linux-nat.c:kill_wait_callback). This way the lwps are not deleted
while we're walking the list. They'll be deleted by linux_mourn
afterwards.
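For illustration only, killing and then reaping directly with waitpid
looks roughly like this for an ordinary child process. The sketch uses
plain kill and waitpid on a forked child; it does not use ptrace and
does not handle the clone-thread wait flags that real lwps need, and
both function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Kill PID and reap it with waitpid directly, without going through
   any event-handling layer.  Return true if the child was reaped and
   had been terminated by SIGKILL.  */
bool
kill_and_reap (pid_t pid)
{
  int status = 0;
  pid_t res;

  assert (pid > 0);
  if (kill (pid, SIGKILL) != 0)
    return false;

  /* Retry if the wait is interrupted by a signal.  */
  do
    res = waitpid (pid, &status, 0);
  while (res == -1 && errno == EINTR);

  return res == pid && WIFSIGNALED (status) && WTERMSIG (status) == SIGKILL;
}

/* Spawn a child that just sleeps until it is killed.  Return its pid,
   or -1 if fork failed.  */
pid_t
spawn_sleeper (void)
{
  pid_t pid = fork ();

  if (pid == 0)
    {
      for (;;)
        pause ();
    }
  return pid;
}
```

Since nothing but the wait itself runs between the kill and the reap,
no list entries are deleted behind the caller's back; cleanup can be
left to a later step, as the text above describes for linux_mourn.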
This crash triggers several times when running the testsuite against
GDBserver with the native-gdbserver board (target remote), but as GDB
can't distinguish between GDBserver crashing and "kill" being
successful, as in both cases the connection is closed (the 'k' packet
doesn't require a reply), and the inferior is gone, that results in no
FAIL.
The patch adds a generic test that catches the issue with
extended-remote mode (and works fine with native testing too). Here's
how it fails with the native-extended-gdbserver board without the fix:
(gdb) info threads
Id Target Id Frame
6 Thread 15367.15374 0x000000373bcbc98d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
5 Thread 15367.15373 0x000000373bcbc98d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
4 Thread 15367.15372 0x000000373bcbc98d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
3 Thread 15367.15371 0x000000373bcbc98d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
2 Thread 15367.15370 0x000000373bcbc98d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
* 1 Thread 15367.15367 main () at .../gdb.threads/kill.c:52
(gdb) kill
Kill the program being debugged? (y or n) y
Remote connection closed
^^^^^^^^^^^^^^^^^^^^^^^^
(gdb) FAIL: gdb.threads/kill.exp: kill
Extended remote should remain connected after the kill.
gdb/gdbserver/
2014-07-11 Pedro Alves <palves@redhat.com>
* linux-low.c (kill_wait_lwp): New function, based on
kill_one_lwp_callback, but use my_waitpid directly.
(kill_one_lwp_callback, linux_kill): Use it.
gdb/testsuite/
2014-07-11 Pedro Alves <palves@redhat.com>
* gdb.threads/kill.c: New file.
* gdb.threads/kill.exp: New file.
Running gdb.threads/thread-execl.exp with scheduler-locking set to
"step" reveals a problem:
(gdb) next
[Thread 0x7ffff7fda700 (LWP 27168) exited]
[New LWP 27168]
[Thread 0x7ffff74ee700 (LWP 27174) exited]
process 27168 is executing new program: /home/jkratoch/redhat/gdb-clean/gdb/testsuite/gdb.threads/thread-execl
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
infrun.c:5225: internal-error: switch_back_to_stepped_thread: Assertion `!schedlock_applies (1)' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) FAIL: gdb.threads/thread-execl.exp: schedlock step: get to main in new image (GDB internal error)
The assertion is correct. The issue is that GDB is mistakenly trying
to switch back to an exited thread, that was previously stepping when
it exited. This is exactly the sort of thing the test wants to make
sure doesn't happen:
# Now set a breakpoint at `main', and step over the execl call. The
# breakpoint at main should be reached. GDB should not try to revert
# back to the old thread from the old image and resume stepping it
We don't see this bug with schedlock off only because a different
sequence of events makes GDB manage to delete the thread instead of
marking it exited.
This particular internal error can be fixed by making the loop over
all threads in switch_back_to_stepped_thread skip exited threads.
But, looking over other ALL_THREADS users, all either can or should be
skipping exited threads too. So for simplicity, this patch replaces
ALL_THREADS with a new macro that skips exited threads itself, and
updates everything to use it.
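A minimal sketch of such an iteration macro, assuming a singly-linked
thread list. The struct layout, the explicit LIST argument and the
count_live_threads helper are simplifications made up for this
example; they are not the definitions in gdbthread.h.

```c
#include <assert.h>
#include <stddef.h>

enum thread_state { THREAD_STOPPED, THREAD_RUNNING, THREAD_EXITED };

/* Simplified, hypothetical thread record.  */
struct thread_info
{
  int num;
  enum thread_state state;
  struct thread_info *next;
};

/* Iterate over LIST, skipping threads already marked exited.  The
   trailing "if" lets callers use the macro like a plain "for"
   statement followed by a body.  */
#define ALL_NON_EXITED_THREADS(T, LIST)			\
  for ((T) = (LIST); (T) != NULL; (T) = (T)->next)	\
    if ((T)->state != THREAD_EXITED)

/* Example user: count the threads that have not exited.  */
static int
count_live_threads (struct thread_info *list)
{
  struct thread_info *tp;
  int count = 0;

  ALL_NON_EXITED_THREADS (tp, list)
    count++;
  return count;
}
```

Because the filter lives in the macro, no caller can forget the
exited-thread check, which is the simplification the patch is after.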
Tested on x86_64 Fedora 20.
gdb/
2014-06-19 Pedro Alves <palves@redhat.com>
* gdbthread.h (ALL_THREADS): Delete.
(ALL_NON_EXITED_THREADS): New macro.
* btrace.c (btrace_free_objfile): Use ALL_NON_EXITED_THREADS
instead of ALL_THREADS.
* infrun.c (find_thread_needs_step_over)
(switch_back_to_stepped_thread): Use ALL_NON_EXITED_THREADS
instead of ALL_THREADS.
* record-btrace.c (record_btrace_open)
(record_btrace_stop_recording, record_btrace_close)
(record_btrace_is_replaying, record_btrace_resume)
(record_btrace_find_thread_to_move, record_btrace_wait): Likewise.
* remote.c (append_pending_thread_resumptions): Likewise.
* thread.c (thread_apply_all_command): Likewise.
gdb/testsuite/
2014-06-19 Pedro Alves <palves@redhat.com>
* gdb.threads/thread-execl.exp (do_test): New procedure, factored
out from ...
(top level): ... here. Iterate running tests under different
scheduler-locking settings.
The code in gdb.threads/staticthreads.exp that checks the value of
tlsvar in the main thread is racy: when the child thread hits the
breakpoint, the main thread may not have entered pthread_join yet, and
it may not be possible to unwind it to main.
This patch moves the line the breakpoint is set on to after sem_wait,
so that the child thread hits the breakpoint only after the main thread
calls sem_post. IOW, when the child thread hits the breakpoint, the
main thread is in either sem_post or pthread_join, and "up 10" can
unwind the main thread to main.
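The ordering guarantee can be sketched as follows. This is not the
testcase source; main_posted, seen_at_breakpoint and run_once are
names invented for the illustration, and the "breakpoint line" is just
an assignment that records what the child could observe at that point.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

static sem_t semaphore;
static int main_posted;		/* Set by the main thread before sem_post.  */
static int seen_at_breakpoint;	/* Value of main_posted seen by the child.  */

static void *
thread_function (void *arg)
{
  (void) arg;

  /* Block until the main thread has called sem_post.  Retry if the
     wait is interrupted.  */
  while (sem_wait (&semaphore) != 0)
    ;

  /* A breakpoint on this line is only reached after the main thread
     has posted the semaphore.  */
  seen_at_breakpoint = main_posted;
  return NULL;
}

/* Run the create/post/join sequence once and return what the child
   observed at its "breakpoint" line.  */
static int
run_once (void)
{
  pthread_t thread;
  int res;

  main_posted = 0;
  seen_at_breakpoint = -1;

  res = sem_init (&semaphore, 0, 0);
  assert (res == 0);
  res = pthread_create (&thread, NULL, thread_function, NULL);
  assert (res == 0);

  main_posted = 1;
  sem_post (&semaphore);
  pthread_join (thread, NULL);
  sem_destroy (&semaphore);
  return seen_at_breakpoint;
}
```

Since the child cannot get past sem_wait before the main thread posts,
the value recorded by the child is always the one the main thread
stored just before sem_post.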
gdb/testsuite:
2014-06-06 Yao Qi <yao@codesourcery.com>
* gdb.threads/staticthreads.c (thread_function): Move the line
setting breakpoint on forward.
* gdb.threads/staticthreads.exp: Update comments.
This finally makes background execution commands possible by default.
However, in order to do that, there's one last thing we need to do --
we need to separate the MI and target notions of "async". Unlike the
CLI, where the user explicitly requests foreground vs background
execution in the execution command itself (c vs c&), MI chose to treat
"set target-async" specially -- setting it changes the default
behavior of execution commands.
So, we can't simply "set target-async" default to on, as that would
affect MI frontends. Instead we have to make the setting MI-specific,
and teach MI about sync commands on top of an async target.
Because the "target" word in "set target-async" ends up as a potential
source of confusion, the patch adds a "set mi-async" option, and makes
"set target-async" a deprecated alias.
Rather than make the targets always async, this patch introduces a new
"maint set target-async" option so that the GDB developer can control
whether the target is async. This makes it simpler to debug issues
arising only in the synchronous mode; important because sync mode
seems unlikely to go away.
Unlike in previous revisions, "set target-async" does not affect this
new maint parameter. The rationale for this is that then one can
easily run the test suite in the "maint set target-async off" mode and
have tests that enable mi-async fail just like they fail on
non-async-capable targets. This emulation is exactly the point of the
maint option.
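A toy model of the decoupled settings, assuming that MI runs execution
commands in the background only when the frontend enabled mi-async and
the target is allowed to be async. The struct and both helpers are
hypothetical names for this sketch; they are not the functions the
patch adds.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the two independent settings.  */
struct async_settings
{
  bool target_async_permitted;	/* "maint set target-async"; on by default.  */
  bool mi_async;		/* "set mi-async"; off by default.  */
};

/* MI treats execution commands as background commands only if the
   frontend asked for that and the target can do async at all.  */
static bool
model_mi_async_p (const struct async_settings *s)
{
  return s->mi_async && s->target_async_permitted;
}

/* The CLI asks for background execution per command ("c&"), so all
   that matters there is whether the target is async.  */
static bool
model_cli_background_allowed (const struct async_settings *s)
{
  return s->target_async_permitted;
}
```

With the defaults (target async on, mi-async off) the CLI can use
"c&" while MI stays synchronous; with "maint set target-async off",
enabling mi-async has no effect in this model, which mirrors how such
tests fail on non-async-capable targets.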
I had asked Tom in a previous iteration to split the actual change of
the target async default to a separate patch, but it turns out that
that is quite awkward in this version of the patch, because with MI
async and target async decoupled (unlike in previous versions), if we
don't flip the default at the same time, then just "set target-async
on" alone never actually manages to do anything. It's best to not
have that transitory state in the tree.
Given "set target-async on" now only has effect for MI, the patch goes
through the testsuite removing it from non-MI tests. MI tests are
adjusted to use the new and less confusing "mi-async" spelling.
2014-05-29 Pedro Alves <palves@redhat.com>
Tom Tromey <tromey@redhat.com>
* NEWS: Mention "maint set target-async", "set mi-async", and that
background execution commands are now always available.
* target.h (target_async_permitted): Update comment.
* target.c (target_async_permitted, target_async_permitted_1):
Default to 1.
(set_target_async_command): Rename to ...
(maint_set_target_async_command): ... this.
(show_target_async_command): Rename to ...
(maint_show_target_async_command): ... this.
(_initialize_target): Adjust.
* infcmd.c (prepare_execution_command): Make extern.
* inferior.h (prepare_execution_command): Declare.
* infrun.c (set_observer_mode): Leave target async alone.
* mi/mi-interp.c (mi_interpreter_init): Install
mi_on_sync_execution_done as sync_execution_done observer.
(mi_on_sync_execution_done): New function.
(mi_execute_command_input_handler): Don't print the prompt if we
just started a synchronous command with an async target.
(mi_on_resume): Check sync_execution before printing prompt.
* mi/mi-main.h (mi_async_p): Declare.
* mi/mi-main.c: Include gdbcmd.h.
(mi_async_p): New function.
(mi_async, mi_async_1): New globals.
(set_mi_async_command, show_mi_async_command, mi_async): New
functions.
(exec_continue): Call prepare_execution_command.
(run_one_inferior, mi_cmd_exec_run, mi_cmd_list_target_features)
(mi_execute_async_cli_command): Use mi_async_p.
(_initialize_mi_main): Install "set mi-async". Make
"target-async" a deprecated alias.
2014-05-29 Pedro Alves <palves@redhat.com>
Tom Tromey <tromey@redhat.com>
* gdb.texinfo (Non-Stop Mode): Remove "set target-async 1"
from example.
(Asynchronous and non-stop modes): Document '-gdb-set mi-async'.
Mention that target-async is now deprecated.
(Maintenance Commands): Document maint set/show target-async.
2014-05-29 Pedro Alves <palves@redhat.com>
Tom Tromey <tromey@redhat.com>
* gdb.base/async-shell.exp: Don't enable target-async.
* gdb.base/async.exp
* gdb.base/corefile.exp (corefile_test_attach): Remove 'async'
parameter. Adjust.
(top level): Don't test with "target-async".
* gdb.base/dprintf-non-stop.exp: Don't enable target-async.
* gdb.base/gdb-sigterm.exp: Don't test with "target-async".
* gdb.base/inferior-died.exp: Don't enable target-async.
* gdb.base/interrupt-noterm.exp: Likewise.
* gdb.mi/mi-async.exp: Use "mi-async" instead of "target-async".
* gdb.mi/mi-nonstop-exit.exp: Likewise.
* gdb.mi/mi-nonstop.exp: Likewise.
* gdb.mi/mi-ns-stale-regcache.exp: Likewise.
* gdb.mi/mi-nsintrall.exp: Likewise.
* gdb.mi/mi-nsmoribund.exp: Likewise.
* gdb.mi/mi-nsthrexec.exp: Likewise.
* gdb.mi/mi-watch-nonstop.exp: Likewise.
* gdb.multi/watchpoint-multi.exp: Adjust comment.
* gdb.python/py-evsignal.exp: Don't enable target-async.
* gdb.python/py-evthreads.exp: Likewise.
* gdb.python/py-prompt.exp: Likewise.
* gdb.reverse/break-precsave.exp: Don't test with "target-async".
* gdb.server/solib-list.exp: Don't enable target-async.
* gdb.threads/thread-specific-bp.exp: Likewise.
* lib/mi-support.exp: Adjust to use mi-async.
I have posted:
TLS variables access for -static -lpthread executables
https://sourceware.org/ml/libc-help/2014-03/msg00024.html
and the GDB patch below has been confirmed as OK for current glibcs.
Further work should be done for newer glibcs:
Improve TLS variables glibc compatibility
https://sourceware.org/bugzilla/show_bug.cgi?id=16954
Still, the patch below implements the feature in a fully functional way,
backward compatible with current glibcs. It depends on the following glibc
source line:
csu/libc-tls.c
main_map->l_tls_modid = 1;
gdb/
2014-05-21 Jan Kratochvil <jan.kratochvil@redhat.com>
Fix TLS access for -static -pthread.
* linux-thread-db.c (struct thread_db_info): Add td_thr_tlsbase_p.
(try_thread_db_load_1): Initialize it.
(thread_db_get_thread_local_address): Call it if LM is zero.
* target.c (target_translate_tls_address): Remove LM_ADDR zero check.
* target.h (struct target_ops) (to_get_thread_local_address): Add
load_module_addr comment.
gdb/gdbserver/
2014-05-21 Jan Kratochvil <jan.kratochvil@redhat.com>
Fix TLS access for -static -pthread.
* gdbserver/thread-db.c (struct thread_db): Add td_thr_tlsbase_p.
(thread_db_get_tls_address): Call it if LOAD_MODULE is zero.
(thread_db_load_search, try_thread_db_load_1): Initialize it.
gdb/testsuite/
2014-05-21 Jan Kratochvil <jan.kratochvil@redhat.com>
Fix TLS access for -static -pthread.
* gdb.threads/staticthreads.c <HAVE_TLS> (tlsvar): New.
<HAVE_TLS> (thread_function, main): Initialize it.
* gdb.threads/staticthreads.exp: Try gdb_compile_pthreads for $have_tls.
Add clean_restart.
<$have_tls != "">: Check TLSVAR.
Message-ID: <20140410115204.GB16411@host2.jankratochvil.net>
Clang defaults this warning to an error, breaking the build & causing
these tests not to run.
gdb/testsuite/
* gdb.mi/non-stop.c: Add return value for non-void function return
statement.
* gdb.threads/staticthreads.c: Ditto.
This fixes:
FAIL: gdb.threads/thread-specific.exp: continue to thread-specific breakpoint (timeout)
ERROR: tcl error sourcing .../gdb/testsuite/gdb.threads/thread-specific.exp.
ERROR: can't read "this_breakpoint": no such variable
while executing
"gdb_test_multiple "info breakpoint $this_breakpoint" "info on bp" {
-re ".*stop only in thread (\[0-9\]*).*$gdb_prompt $" {
set this_thread $expe..."
(file ".../gdb/testsuite/gdb.threads/thread-specific.exp" line 108)
invoked from within
"source .../gdb/testsuite/gdb.threads/thread-specific.exp"
("uplevel" body line 1)
invoked from within
"uplevel #0 source .../gdb/testsuite/gdb.threads/thread-specific.exp"
invoked from within
"catch "uplevel #0 source $test_file_name""
and then:
FAIL: gdb.threads/thread-specific.exp: continue to thread-specific breakpoint (timeout)
UNTESTED: gdb.threads/thread-specific.exp: info on bp
ERROR: tcl error sourcing .../gdb/testsuite/gdb.threads/thread-specific.exp.
ERROR: can't read "this_thread": no such variable
while executing
"gdb_test {print $_thread} ".* = $this_thread" "thread var at break""
(file ".../gdb/testsuite/gdb.threads/thread-specific.exp" line 119)
invoked from within
"source .../gdb/testsuite/gdb.threads/thread-specific.exp"
("uplevel" body line 1)
invoked from within
"uplevel #0 source .../gdb/testsuite/gdb.threads/thread-specific.exp"
invoked from within
"catch "uplevel #0 source $test_file_name""
Final results:
FAIL: gdb.threads/thread-specific.exp: continue to thread-specific breakpoint (timeout)
UNTESTED: gdb.threads/thread-specific.exp: info on bp
UNTESTED: gdb.threads/thread-specific.exp: thread var at break
Of course it would be best if the first failure weren't there, but failing
that the script shouldn't crash.
* gdb.threads/thread-specific.exp: Handle the lack of usable
$this_breakpoint and $this_thread.
This test now uses pthread_kill instead of the host's kill command, so
it no longer needs to block signals or store the inferior's PID.
gdb/testsuite/
2014-03-20 Pedro Alves <palves@redhat.com>
* gdb.threads/signal-while-stepping-over-bp-other-thread.c (pid):
Delete.
(block_signals, unblock_signals): Delete.
(child_function_2, main): Remove references to deleted variable
and functions.
Use pthread_kill instead of the host's "kill". The reason the test
wasn't written that way to begin with is that, before the previous
fixes that make GDB step over all other threads before the stepping
thread, the test would fail when done this way...
Tested on x86_64 Fedora 17, native and gdbserver.
gdb/testsuite/
2014-03-20 Pedro Alves <palves@redhat.com>
* gdb.threads/signal-while-stepping-over-bp-other-thread.c (main):
Use pthread_kill to signal thread 2.
* gdb.threads/signal-while-stepping-over-bp-other-thread.exp:
Adjust to make the test send itself a signal rather than using the
host's "kill" command.
This test fails with current mainline.
If the program stopped for a breakpoint in thread 1, and then the user
switches to thread 2, and resumes the program, GDB first switches back
to thread 1 to step it over the breakpoint, in order to make progress.
However, that logic only considers the last reported event, assuming
only one thread needs that stepping over dance.
That's actually not true when we play with scheduler-locking. The
patch adds an example to the testsuite of multiple threads needing a
step-over before the stepping thread can be resumed. With current
mainline, the program re-traps the same breakpoint it had already
trapped before.
E.g.:
Breakpoint 2, main () at ../../../src/gdb/testsuite/gdb.threads/multiple-step-overs.c:99
99 wait_threads (); /* set wait-threads breakpoint here */
(gdb) PASS: gdb.threads/multiple-step-overs.exp: step: continue to breakpoint: run to breakpoint
info threads
Id Target Id Frame
3 Thread 0x7ffff77c9700 (LWP 4310) "multiple-step-o" 0x00000000004007ca in child_function_3 (arg=0x1) at ../../../src/gdb/testsuite/gdb.threads/multiple-step-overs.c:43
2 Thread 0x7ffff7fca700 (LWP 4309) "multiple-step-o" 0x0000000000400827 in child_function_2 (arg=0x0) at ../../../src/gdb/testsuite/gdb.threads/multiple-step-overs.c:60
* 1 Thread 0x7ffff7fcb740 (LWP 4305) "multiple-step-o" main () at ../../../src/gdb/testsuite/gdb.threads/multiple-step-overs.c:99
(gdb) PASS: gdb.threads/multiple-step-overs.exp: step: info threads shows all threads
set scheduler-locking on
(gdb) PASS: gdb.threads/multiple-step-overs.exp: step: set scheduler-locking on
break 44
Breakpoint 3 at 0x4007d3: file ../../../src/gdb/testsuite/gdb.threads/multiple-step-overs.c, line 44.
(gdb) break 61
Breakpoint 4 at 0x40082d: file ../../../src/gdb/testsuite/gdb.threads/multiple-step-overs.c, line 61.
(gdb) thread 3
[Switching to thread 3 (Thread 0x7ffff77c9700 (LWP 4310))]
#0 0x00000000004007ca in child_function_3 (arg=0x1) at ../../../src/gdb/testsuite/gdb.threads/multiple-step-overs.c:43
43 (*myp) ++;
(gdb) PASS: gdb.threads/multiple-step-overs.exp: step: thread 3
continue
Continuing.
Breakpoint 3, child_function_3 (arg=0x1) at ../../../src/gdb/testsuite/gdb.threads/multiple-step-overs.c:44
44 callme (); /* set breakpoint thread 3 here */
(gdb) PASS: gdb.threads/multiple-step-overs.exp: step: continue to breakpoint: run to breakpoint in thread 3
p *myp = 0
$1 = 0
(gdb) PASS: gdb.threads/multiple-step-overs.exp: step: unbreak loop in thread 3
thread 2
[Switching to thread 2 (Thread 0x7ffff7fca700 (LWP 4309))]
#0 0x0000000000400827 in child_function_2 (arg=0x0) at ../../../src/gdb/testsuite/gdb.threads/multiple-step-overs.c:60
60 (*myp) ++;
(gdb) PASS: gdb.threads/multiple-step-overs.exp: step: thread 2
continue
Continuing.
Breakpoint 4, child_function_2 (arg=0x0) at ../../../src/gdb/testsuite/gdb.threads/multiple-step-overs.c:61
61 callme (); /* set breakpoint thread 2 here */
(gdb) PASS: gdb.threads/multiple-step-overs.exp: step: continue to breakpoint: run to breakpoint in thread 2
p *myp = 0
$2 = 0
(gdb) PASS: gdb.threads/multiple-step-overs.exp: step: unbreak loop in thread 2
thread 1
[Switching to thread 1 (Thread 0x7ffff7fcb740 (LWP 4305))]
#0 main () at ../../../src/gdb/testsuite/gdb.threads/multiple-step-overs.c:99
99 wait_threads (); /* set wait-threads breakpoint here */
(gdb) PASS: gdb.threads/multiple-step-overs.exp: step: thread 1
set scheduler-locking off
(gdb) PASS: gdb.threads/multiple-step-overs.exp: step: set scheduler-locking off
At this point all threads are stopped for a breakpoint that needs stepping over.
(gdb) step
Breakpoint 2, main () at ../../../src/gdb/testsuite/gdb.threads/multiple-step-overs.c:99
99 wait_threads (); /* set wait-threads breakpoint here */
(gdb) FAIL: gdb.threads/multiple-step-overs.exp: step
But that "step" retriggers the same breakpoint instead of making
progress.
The patch teaches GDB to step over all breakpoints of all threads
before resuming the stepping thread.
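The idea can be sketched with a simplified model. Everything here is
hypothetical (struct thread, breakpoint_here_p,
find_thread_needing_step_over, step_over_others), and a "step-over" is
modeled as merely advancing the PC by one; the real infrun code is
event driven and steps one thread per resume.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified thread record.  */
struct thread
{
  int num;
  unsigned long pc;
  struct thread *next;
};

/* Return true if one of the N breakpoint addresses in BPS is PC.  */
static bool
breakpoint_here_p (const unsigned long *bps, size_t n, unsigned long pc)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (bps[i] == pc)
      return true;
  return false;
}

/* Return the first thread other than EXCEPT that is stopped at a
   breakpoint address, or NULL if there is none.  */
static struct thread *
find_thread_needing_step_over (struct thread *list,
			       const struct thread *except,
			       const unsigned long *bps, size_t n)
{
  struct thread *tp;

  for (tp = list; tp != NULL; tp = tp->next)
    if (tp != except && breakpoint_here_p (bps, n, tp->pc))
      return tp;
  return NULL;
}

/* Move every thread other than STEPPING past its breakpoint, one at a
   time, before STEPPING itself is resumed.  Returns the number of
   step-overs performed.  */
static int
step_over_others (struct thread *list, struct thread *stepping,
		  const unsigned long *bps, size_t n)
{
  struct thread *tp;
  int count = 0;

  while ((tp = find_thread_needing_step_over (list, stepping, bps, n))
	 != NULL)
    {
      tp->pc++;		/* Stand-in for a real single-step.  */
      count++;
    }
  return count;
}
```

Looking only at the last reported event would find at most one such
thread; looping until no thread is left at a breakpoint is what lets
the final "step" make progress.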
Tested on x86_64 Fedora 17, against pristine mainline, and also my
branch that implements software single-stepping on x86.
gdb/
2014-03-20 Pedro Alves <palves@redhat.com>
* infrun.c (prepare_to_proceed): Delete.
(thread_still_needs_step_over): New function.
(find_thread_needs_step_over): New function.
(proceed): If the current thread needs a step-over, set its
stepping_over_breakpoint flag. Adjust to use
find_thread_needs_step_over instead of prepare_to_proceed.
(process_event_stop_test): For BPSTAT_WHAT_STOP_NOISY and
BPSTAT_WHAT_STOP_SILENT, assume the thread stopped for a
breakpoint.
(switch_back_to_stepped_thread): Step over breakpoints of all
threads not the stepping thread, before switching back to the
stepping thread.
gdb/testsuite/
2014-03-20 Pedro Alves <palves@redhat.com>
* gdb.threads/multiple-step-overs.c: New file.
* gdb.threads/multiple-step-overs.exp: New file.
* gdb.threads/signal-while-stepping-over-bp-other-thread.exp:
Adjust expected infrun debug output.
Even with deferred_step_ptid out of the way, GDB can still lose
watchpoints.
If a watchpoint triggers and the PC points to an address where a
thread-specific breakpoint for another thread is set, the thread-hop
code triggers, and we lose the watchpoint:
if (ecs->event_thread->suspend.stop_signal == GDB_SIGNAL_TRAP)
{
int thread_hop_needed = 0;
struct address_space *aspace =
get_regcache_aspace (get_thread_regcache (ecs->ptid));
/* Check if a regular breakpoint has been hit before checking
for a potential single step breakpoint. Otherwise, GDB will
not see this breakpoint hit when stepping onto breakpoints. */
if (regular_breakpoint_inserted_here_p (aspace, stop_pc))
{
if (!breakpoint_thread_match (aspace, stop_pc, ecs->ptid))
thread_hop_needed = 1;
^^^^^^^^^^^^^^^^^^^^^
}
And on software single-step targets, even without a thread-specific
breakpoint in the way, here in the thread-hop code:
else if (singlestep_breakpoints_inserted_p)
{
...
if (!ptid_equal (singlestep_ptid, ecs->ptid)
&& in_thread_list (singlestep_ptid))
{
/* If the PC of the thread we were trying to single-step
has changed, discard this event (which we were going
to ignore anyway), and pretend we saw that thread
trap. This prevents us continuously moving the
single-step breakpoint forward, one instruction at a
time. If the PC has changed, then the thread we were
trying to single-step has trapped or been signalled,
but the event has not been reported to GDB yet.
There might be some cases where this loses signal
information, if a signal has arrived at exactly the
same time that the PC changed, but this is the best
we can do with the information available. Perhaps we
should arrange to report all events for all threads
when they stop, or to re-poll the remote looking for
this particular thread (i.e. temporarily enable
schedlock). */
CORE_ADDR new_singlestep_pc
= regcache_read_pc (get_thread_regcache (singlestep_ptid));
if (new_singlestep_pc != singlestep_pc)
{
enum gdb_signal stop_signal;
if (debug_infrun)
fprintf_unfiltered (gdb_stdlog, "infrun: unexpected thread,"
" but expected thread advanced also\n");
/* The current context still belongs to
singlestep_ptid. Don't swap here, since that's
the context we want to use. Just fudge our
state and continue. */
stop_signal = ecs->event_thread->suspend.stop_signal;
ecs->event_thread->suspend.stop_signal = GDB_SIGNAL_0;
ecs->ptid = singlestep_ptid;
ecs->event_thread = find_thread_ptid (ecs->ptid);
ecs->event_thread->suspend.stop_signal = stop_signal;
stop_pc = new_singlestep_pc;
}
else
{
if (debug_infrun)
fprintf_unfiltered (gdb_stdlog,
"infrun: unexpected thread\n");
thread_hop_needed = 1;
stepping_past_singlestep_breakpoint = 1;
saved_singlestep_ptid = singlestep_ptid;
}
}
}
we either end up with thread_hop_needed, ignoring the watchpoint
SIGTRAP, or switch to the stepping thread, again ignoring that the
SIGTRAP could be for some other event.
The new test added by this patch exercises both paths.
So the fix is similar to the deferred_step_ptid fix -- defer the
thread hop to _after_ the SIGTRAP had a chance of passing through the
regular bpstat handling. If the wrong thread hits a breakpoint, we'll
just end up with BPSTAT_WHAT_SINGLE, and if nothing causes a stop,
keep_going starts a step-over.
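The effect of the reordering can be shown with a toy decision
function. It is only a model under simplifying assumptions: struct
trap and handle_trap are invented for this sketch, and the real bpstat
machinery distinguishes many more cases. What it illustrates is that
the regular checks run first, so a watchpoint hit is reported even if
the PC happens to sit on another thread's breakpoint.

```c
#include <assert.h>
#include <stdbool.h>

enum action { ACTION_STOP, ACTION_STEP_OVER, ACTION_KEEP_GOING };

/* Hypothetical summary of what is known about a SIGTRAP.  */
struct trap
{
  int thread;			/* Thread that reported the SIGTRAP.  */
  bool watchpoint_triggered;	/* A watchpoint's value changed.  */
  bool breakpoint_at_pc;	/* A breakpoint is inserted at the stop PC.  */
  int breakpoint_thread;	/* Thread the breakpoint is for; 0 = any.  */
};

static enum action
handle_trap (const struct trap *t)
{
  /* Regular event checks come first, so a watchpoint trigger is
     never thrown away by the thread-hop logic.  */
  if (t->watchpoint_triggered)
    return ACTION_STOP;

  if (t->breakpoint_at_pc)
    {
      if (t->breakpoint_thread == 0 || t->breakpoint_thread == t->thread)
	return ACTION_STOP;

      /* The wrong thread hit a thread-specific breakpoint: nothing to
	 report, just step this thread past it.  */
      return ACTION_STEP_OVER;
    }

  return ACTION_KEEP_GOING;
}
```

In the old ordering the second case was tested before the first, which
is how the watchpoint got lost.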
Most of the stepping_past_singlestep_breakpoint mechanism is really
not necessary -- setting the thread to step over a breakpoint with
thread->trap_expected is sufficient to keep all other threads locked.
It's best to still keep the flag in some form though, because when we
get to keep_going, the software single-step breakpoint we need to step
over is already gone -- an optimization done by a follow-up patch will
check whether a step-over is still necessary by looking to see
whether the breakpoint is still there, and would find the thread no
longer needs a step-over, while we still want it.
Special care is still needed to handle the case of PC of the thread we
were trying to single-step having changed, like in the old code. We
can't just keep_going and re-step it, as in that case we can over-step
the thread (if it was already done with the step, but hasn't reported
it yet, we'd ask it to step even further). That's now handled in
switch_back_to_stepped_thread. As a bonus, we're now using a technique
that doesn't lose signals, unlike the old code -- we now insert a
breakpoint at PC, and resume, which either reports the breakpoint
immediately, or any pending signal.
Tested on x86_64 Fedora 17, against pristine mainline, and against a
branch that implements software single-step on x86.
gdb/
2014-03-20 Pedro Alves <palves@redhat.com>
* breakpoint.c (single_step_breakpoint_inserted_here_p): Make
extern.
* breakpoint.h (single_step_breakpoint_inserted_here_p): Declare.
* infrun.c (saved_singlestep_ptid)
(stepping_past_singlestep_breakpoint): Delete.
(resume): Remove stepping_past_singlestep_breakpoint handling.
(proceed): Store the prev_pc of the stepping thread too.
(init_wait_for_inferior): Adjust. Clear singlestep_ptid and
singlestep_pc.
(enum infwait_states): Delete infwait_thread_hop_state.
(struct execution_control_state) <hit_singlestep_breakpoint>: New
field.
(handle_inferior_event): Adjust.
(handle_signal_stop): Delete stepping_past_singlestep_breakpoint
handling and the thread-hop code. Before removing single-step
breakpoints, check whether the thread hit a single-step breakpoint
of another thread. If it did, the trap is not a random signal.
(switch_back_to_stepped_thread): If the event thread hit a
single-step breakpoint, unblock it before switching to the
stepping thread. Handle the case of the stepped thread having
advanced already.
(keep_going): Handle the case of the current thread moving past a
single-step breakpoint.
gdb/testsuite/
2014-03-20 Pedro Alves <palves@redhat.com>
* gdb.threads/step-over-trips-on-watchpoint.c: New file.
* gdb.threads/step-over-trips-on-watchpoint.exp: New file.
Consider the case of the user doing "step" in thread 2, while thread 1
had previously stopped for a breakpoint. In order to make progress,
GDB makes thread 1 step over its breakpoint first (with all other
threads stopped), and once that is over, thread 2 then starts stepping
(with thread 1 and all others running free, by default). If GDB
didn't do that, thread 1 would just trip on the same breakpoint
immediately again. This is what the prepare_to_proceed /
deferred_step_ptid code is all about.
However, deferred_step_ptid code resumes the target with:
resume (1, GDB_SIGNAL_0);
prepare_to_wait (ecs);
return;
Recall we were just stepping over a breakpoint when we get here. That
means that _nothing_ had installed breakpoints yet! If there's
another breakpoint just after the breakpoint that was just stepped,
we'll miss it. The fix for that would be to use keep_going instead.
However, there are more problems. What if the instruction that was
just single-stepped triggers a watchpoint? Currently, GDB just
happily resumes the thread, losing that too...
Missed watchpoints will need yet further fixes, but we should keep
those in mind.
So the fix must be to let the trap fall through the regular bpstat
handling, and only if no breakpoint, watchpoint, etc. claims the trap,
shall we switch back to the stepped thread.
Now, nowadays, we have code at the tail end of trap handling that does
exactly that -- switch back to the stepped thread
(switch_back_to_stepped_thread).
So the deferred_step_ptid code is just standing in the way, and can
simply be eliminated, fixing bugs in the process. Sweet.
The comment about spurious "Switching to ..." made me pause, but is
actually stale nowadays. That isn't needed anymore.
previous_inferior_ptid used to be re-set at each (internal) event, but
now it's only touched in proceed and normal stop.
The two tests added by this patch fail without the fix.
Tested on x86_64 Fedora 17 (also against my software single-stepping
on x86 branch).
gdb/
2014-03-20 Pedro Alves <palves@redhat.com>
* infrun.c (previous_inferior_ptid): Adjust comment.
(deferred_step_ptid): Delete.
(infrun_thread_ptid_changed, prepare_to_proceed)
(init_wait_for_inferior): Adjust.
(handle_signal_stop): Delete deferred_step_ptid handling.
gdb/testsuite/
2014-03-20 Pedro Alves <palves@redhat.com>
* gdb.threads/step-over-lands-on-breakpoint.c: New file.
* gdb.threads/step-over-lands-on-breakpoint.exp: New file.
I realized that the name of this test only made sense when considering
the old (never committed) implementation of the fix that came along
with the test originally, that forced a schedlock while a step-resume
(to get over the signal handler) was inserted. The final solution
that went into the tree does not force that locking.
So this renames it to something more descriptive.
gdb/testsuite/
2014-02-21 Pedro Alves <palves@redhat.com>
* gdb.threads/step-after-sr-lock.c: Rename to ...
* gdb.threads/signal-while-stepping-over-bp-other-thread.c: ... this.
* gdb.threads/step-after-sr-lock.exp: Rename to ...
* gdb.threads/signal-while-stepping-over-bp-other-thread.exp:
... this.
Say:
<stopped at a breakpoint in thread 2>
(gdb) thread 3
(gdb) step
The above triggers the prepare_to_proceed/deferred_step_ptid process,
which switches back to thread 2, to step over its breakpoint before
getting back to thread 3 and "step" it.
If while stepping over the breakpoint in thread 2, a signal arrives,
and it is set to pass/nostop, we'll set a step-resume breakpoint at
the supposed signal-handler resume address, and call keep_going. The
problem is that we were supposedly stepping thread 3, and that
keep_going delivers a signal to thread 2, and due to scheduler-locking
off, resumes everything else, _including_ thread 3, the thread we want
stepping. This means that we lose control of thread 3 until the next
event, when we stop everything. The end result for the user is that
GDB lost control of the "step".
Here's the current infrun debug output of the above, with the testcase
in the patch below:
infrun: clear_proceed_status_thread (Thread 0x2aaaab8f5700 (LWP 11663))
infrun: clear_proceed_status_thread (Thread 0x2aaaab6f4700 (LWP 11662))
infrun: clear_proceed_status_thread (Thread 0x2aaaab4f2b20 (LWP 11659))
infrun: proceed (addr=0xffffffffffffffff, signal=144, step=1)
infrun: prepare_to_proceed (step=1), switched to [Thread 0x2aaaab6f4700 (LWP 11662)]
infrun: resume (step=1, signal=0), trap_expected=1, current thread [Thread 0x2aaaab6f4700 (LWP 11662)] at 0x40098f
infrun: wait_for_inferior ()
infrun: target_wait (-1, status) =
infrun: 11659 [Thread 0x2aaaab6f4700 (LWP 11662)],
infrun: status->kind = stopped, signal = SIGUSR1
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x40098f
infrun: random signal 30
Program received signal SIGUSR1, User defined signal 1.
infrun: signal arrived while stepping over breakpoint
infrun: inserting step-resume breakpoint at 0x40098f
infrun: resume (step=0, signal=30), trap_expected=0, current thread [Thread 0x2aaaab6f4700 (LWP 11662)] at 0x40098f
^^^ this is a wildcard resume.
infrun: prepare_to_wait
infrun: target_wait (-1, status) =
infrun: 11659 [Thread 0x2aaaab6f4700 (LWP 11662)],
infrun: status->kind = stopped, signal = SIGTRAP
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x40098f
infrun: BPSTAT_WHAT_STEP_RESUME
infrun: resume (step=1, signal=0), trap_expected=1, current thread [Thread 0x2aaaab6f4700 (LWP 11662)] at 0x40098f
^^^ step-resume hit, meaning the handler returned, so we go back to stepping thread 3.
infrun: prepare_to_wait
infrun: target_wait (-1, status) =
infrun: 11659 [Thread 0x2aaaab6f4700 (LWP 11662)],
infrun: status->kind = stopped, signal = SIGTRAP
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x40088b
infrun: switching back to stepped thread
infrun: Switching context from Thread 0x2aaaab6f4700 (LWP 11662) to Thread 0x2aaaab8f5700 (LWP 11663)
infrun: resume (step=1, signal=0), trap_expected=0, current thread [Thread 0x2aaaab8f5700 (LWP 11663)] at 0x400938
infrun: prepare_to_wait
infrun: target_wait (-1, status) =
infrun: 11659 [Thread 0x2aaaab8f5700 (LWP 11663)],
infrun: status->kind = stopped, signal = SIGTRAP
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x40093a
infrun: keep going
infrun: resume (step=1, signal=0), trap_expected=0, current thread [Thread 0x2aaaab8f5700 (LWP 11663)] at 0x40093a
infrun: prepare_to_wait
infrun: target_wait (-1, status) =
infrun: 11659 [Thread 0x2aaaab8f5700 (LWP 11663)],
infrun: status->kind = stopped, signal = SIGTRAP
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x40091e
infrun: stepped to a different line
infrun: stop_stepping
[Switching to Thread 0x2aaaab8f5700 (LWP 11663)]
69 (*myp) ++; /* set breakpoint child_two here */
^^^ we stopped at the wrong line. We still stepped a bit because the
test is running in a loop, and when we got back to stepping thread 3,
it happened to be in the stepping range. (The loop increments a
counter, and the test makes sure it increments exactly once. Without
the fix, the counter increments a bunch, since the user-stepped thread
runs free without GDB noticing.)
The fix is to switch to the stepping thread before continuing for the
step-resume breakpoint.
gdb/
2014-02-07 Pedro Alves <palves@redhat.com>
* infrun.c (handle_signal_stop) <signal arrives while stepping
over a breakpoint>: Switch back to the stepping thread.
gdb/testsuite/
2014-02-07 Pedro Alves <pedro@codesourcery.com>
Pedro Alves <palves@redhat.com>
* gdb.threads/step-after-sr-lock.c: New file.
* gdb.threads/step-after-sr-lock.exp: New file.
Currently on software single-step Linux targets we get:
(gdb) PASS: gdb.threads/stepi-random-signal.exp: before stepi: get hexadecimal valueof "$pc"
stepi
infrun: clear_proceed_status_thread (Thread 0x7ffff7fca700 (LWP 7073))
infrun: clear_proceed_status_thread (Thread 0x7ffff7fcb740 (LWP 7069))
infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT, step=1)
infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=0, current thread [Thread 0x7ffff7fcb740 (LWP 7069)] at 0x400700
infrun: wait_for_inferior ()
infrun: target_wait (-1, status) =
infrun: 7069 [Thread 0x7ffff7fcb740 (LWP 7069)],
infrun: status->kind = stopped, signal = GDB_SIGNAL_TRAP
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x400704
infrun: software single step trap for Thread 0x7ffff7fcb740 (LWP 7069)
infrun: stepi/nexti
infrun: stop_stepping
44 while (counter != 0)
(gdb) FAIL: gdb.threads/stepi-random-signal.exp: stepi (no random signal)
Vs hardware-step targets:
(gdb) PASS: gdb.threads/stepi-random-signal.exp: before stepi: get hexadecimal valueof "$pc"
stepi
infrun: clear_proceed_status_thread (Thread 0x7ffff7fca700 (LWP 9565))
infrun: clear_proceed_status_thread (Thread 0x7ffff7fcb740 (LWP 9561))
infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT, step=1)
infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=0, current thread [Thread 0x7ffff7fcb740 (LWP 9561)] at 0x400700
infrun: wait_for_inferior ()
infrun: target_wait (-1, status) =
infrun: 9561 [Thread 0x7ffff7fcb740 (LWP 9561)],
infrun: status->kind = stopped, signal = GDB_SIGNAL_CHLD
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x400700
infrun: random signal (GDB_SIGNAL_CHLD)
infrun: random signal, keep going
infrun: resume (step=1, signal=GDB_SIGNAL_CHLD), trap_expected=0, current thread [Thread 0x7ffff7fcb740 (LWP 9561)] at 0x400700
infrun: prepare_to_wait
infrun: target_wait (-1, status) =
infrun: 9561 [Thread 0x7ffff7fcb740 (LWP 9561)],
infrun: status->kind = stopped, signal = GDB_SIGNAL_TRAP
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x400704
infrun: stepi/nexti
infrun: stop_stepping
44 while (counter != 0)
(gdb) PASS: gdb.threads/stepi-random-signal.exp: stepi
The test turns on infrun debug, does a stepi while a SIGCHLD is
pending, and checks whether the "random signal" paths in infrun.c are
taken.
On the software single-step variant above, those paths were not taken.
This is a test bug.
The Linux backend short-circuits reporting signals that are set to
pass/nostop/noprint. But _only_ if the thread is _not_
single-stepping. So on hardware-step targets, even though the signal
is set to pass/nostop/noprint by default, the thread is indeed told to
single-step, and so the core sees the signal. On the other hand, on
software single-step architectures, the backend never actually gets a
single-step request (steps are emulated by setting a breakpoint at the
next pc, and then the target told to continue, not step). So the
short-circuiting code triggers and the core doesn't see the signal.
The fix is to make the test ensure that the target doesn't bypass
reporting the signal to the core.
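As a rough illustration of the short-circuit described above, here is a
minimal sketch in C.  The struct and function names are hypothetical
and are not the Linux backend's actual code; the sketch only models the
decision of whether the core gets to see the signal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a signal's "handle" settings.  These names
   are illustrative only, not GDB's real data structures.  */
struct signal_handling
{
  bool pass;   /* deliver the signal to the program */
  bool stop;   /* stop the program when the signal arrives */
  bool print;  /* announce the signal to the user */
};

/* Return true if the backend reports the signal to the core.  A
   signal set to pass/nostop/noprint is short-circuited (handed
   straight back to the thread), but only if the thread was not
   explicitly told to single-step.  */
bool
backend_reports_signal (const struct signal_handling *h,
                        bool thread_is_stepping)
{
  assert (h != NULL);

  bool uninteresting = h->pass && !h->stop && !h->print;

  return thread_is_stepping || !uninteresting;
}
```

In this model, a hardware-step target passes thread_is_stepping = true
and so the core sees SIGCHLD even with the default settings.  A
software single-step target passes false, and the core only sees the
signal once the test sets SIGCHLD to print.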
Tested on x86_64 Fedora 17, both with and without a series that
implements software single-step for x86_64.
gdb/testsuite/
2014-02-07 Pedro Alves <palves@redhat.com>
* gdb.threads/stepi-random-signal.exp: Set SIGCHLD to print.
Currently, when GDB connects in all-stop mode, GDBserver always
responds to the status packet with a GDB_SIGNAL_TRAP, even if the
program is actually stopped for some other signal.
(gdb) tar rem ...
...
(gdb) c
Program received signal SIGUSR1, User defined signal 1.
(gdb) disconnect
(gdb) tar rem ...
(gdb) c
(Or a GDB crash instead of an explicit disconnect.)
This results in the program losing that signal on that last continue,
because gdb will tell the target to resume with no signal (to suppress
the GDB_SIGNAL_TRAP, due to 'handle SIGTRAP nopass'), and that will
actually suppress the real signal the program had stopped for
(SIGUSR1). To fix that, I think we should make GDBserver report the
real signal the thread had stopped for in response to the status
packet:
@item ?
@cindex @samp{?} packet
Indicate the reason the target halted. The reply is the same as for
step and continue.
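The lost-signal scenario above can be modelled with a small sketch.
Everything here is hypothetical (the enum, the function names, and the
simplified pass table, which assumes SIGTRAP is nopass and every other
signal is pass); it is not GDB or GDBserver code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced signal set for the model.  */
enum model_signal
{
  MODEL_SIGNAL_0 = 0,
  MODEL_SIGTRAP,
  MODEL_SIGUSR1
};

/* Simplified "handle" table: SIGTRAP is nopass, the rest are pass.  */
static bool
passes (enum model_signal sig)
{
  return sig != MODEL_SIGNAL_0 && sig != MODEL_SIGTRAP;
}

/* Return the signal the program receives on the first continue after
   a reconnect.  If the server always answers the status packet with
   SIGTRAP, GDB resumes with no signal and the real one is dropped.  */
enum model_signal
signal_delivered_after_reconnect (enum model_signal real_stop_signal,
                                  bool server_reports_real_signal)
{
  assert (real_stop_signal != MODEL_SIGNAL_0);

  enum model_signal reported
    = server_reports_real_signal ? real_stop_signal : MODEL_SIGTRAP;

  return passes (reported) ? reported : MODEL_SIGNAL_0;
}
```

With server_reports_real_signal false, a program stopped for SIGUSR1
is resumed with no signal; with it true, SIGUSR1 reaches the program.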
But, that raises the question -- which thread are we reporting the
status for? Due to how the RSP in all-stop works, we can only report
one status. The status packet's response is a stop reply packet, so
it includes the thread identifier, so it's not a problem packet-wise.
However, GDBserver is currently always reporting the status for the first
thread in the thread list, even though that may well not be the thread
that got the signal that caused the program to stop. So the next
logical step would be to report the status for the
last_ptid/last_status thread (the last event reported to gdb), if it's
still around; and if not, fall back to some other thread.
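A minimal sketch of that selection rule, under stated assumptions: the
thread record and function below are hypothetical, and "some other
thread" is taken to be the first thread in the list, which the text
above does not specify.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified thread record; this is not gdbserver's
   struct thread_info.  */
struct model_thread
{
  int id;
  int last_signal;   /* signal the thread last stopped with */
};

/* Pick the thread whose status answers the '?' packet: prefer the
   thread of the last event reported to GDB (LAST_EVENT_ID) if it
   still exists, otherwise fall back to the first thread in the list
   (an assumption of this sketch).  Returns NULL only if the list is
   empty.  */
const struct model_thread *
pick_status_thread (const struct model_thread *threads, size_t count,
                    int last_event_id)
{
  assert (threads != NULL || count == 0);

  for (size_t i = 0; i < count; i++)
    if (threads[i].id == last_event_id)
      return &threads[i];

  return count > 0 ? &threads[0] : NULL;
}
```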
There's an issue on the GDB side with that, though...
GDB currently always adds the thread reported in response to the
status query as the first thread in its list. That means that if we
start with e.g.,
(gdb) info threads
3 Thread 1003 ...
* 2 Thread 1002 ...
1 Thread 1001 ...
And reconnect:
(gdb) disconnect
(gdb) tar rem ...
We end up with:
(gdb) info threads
3 Thread 1003 ...
2 Thread 1001 ...
* 1 Thread 1002 ...
Not a real big issue, but it's reasonably fixable, by having GDB
fetch/sync the thread list before fetching the status/'?', and then
using the status to select the right thread as current on the GDB
side. Holes in the thread numbers are squashed before/after
reconnection (e.g., 2,3,5 becomes 1,2,3), but the order is preserved,
which I think is both good, and good enough.
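The ordering effect can be sketched as follows.  This is only a model
with hypothetical names; it assumes the remote lists threads in the
same order GDB had them before the disconnect, and it is not the code
in remote.c.

```c
#include <assert.h>
#include <stddef.h>

/* Number threads 1..COUNT in the order the remote lists them, then
   return the GDB number of the thread named in the status reply
   (STATUS_TID), or 0 if that thread is not in the list.  NUMBERS_OUT
   receives the number assigned to each entry of REMOTE_TIDS.  */
int
sync_then_select (const long *remote_tids, size_t count, long status_tid,
                  int *numbers_out)
{
  int current = 0;

  assert (count == 0 || (remote_tids != NULL && numbers_out != NULL));

  for (size_t i = 0; i < count; i++)
    {
      numbers_out[i] = (int) i + 1;
      if (remote_tids[i] == status_tid)
        current = numbers_out[i];
    }

  return current;
}
```

For the example above (threads 1001, 1002, 1003 with 1002 reported in
the status reply), the model keeps 1002 as thread 2 and selects it as
current, instead of moving it to the front of the list as thread 1.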
However (yes, there's more...), the previous GDB that was connected
might have had gdbserver running in non-stop mode, or could have left
gdbserver doing disconnected tracing (which also forces non-stop), and
if the new gdb/connection is in all-stop mode, we can end up with more
than one thread with a signal to report back to gdb. As we can only
report one thread/status (in the all-stop RSP variant; the non-stop
variant doesn't have this issue), we get to do what we do at every
other place we have this situation -- leave events we can't report
right now as pending, so that the next resume picks them up.
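A small sketch of the pending-status idea, with hypothetical names and
a simplified thread record (gdbserver's real structures and functions
differ): one status is reported, the others are left pending, and the
next resume reports a pending one instead of actually resuming.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified thread record for this model.  */
struct model_thread
{
  int id;
  int signal;            /* signal the thread stopped with, 0 if none */
  bool status_pending;   /* event not yet reported to GDB */
};

/* Answer the status query by reporting thread REPORTED, and leave
   every other thread that stopped with a signal marked as pending.  */
void
defer_unreported (struct model_thread *threads, size_t count,
                  size_t reported)
{
  assert (threads != NULL && reported < count);

  for (size_t i = 0; i < count; i++)
    threads[i].status_pending = (i != reported && threads[i].signal != 0);
}

/* Handle a resume request in all-stop mode.  If some thread has a
   pending status, clear it and return that thread without resuming
   anything; return NULL when the threads can really be resumed.  */
struct model_thread *
resume_all_stop (struct model_thread *threads, size_t count)
{
  for (size_t i = 0; i < count; i++)
    if (threads[i].status_pending)
      {
        threads[i].status_pending = false;
        return &threads[i];
      }

  return NULL;
}
```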
Note all this amounts to a QoI change, within the existing framework.
There's really no RSP change here.
The only user-visible change (other than that the signal the program is
stopped at isn't lost / is passed to the program) is in "info
program", which can now show the signal the program stopped for. Of
course, the next resume will respect the pass/nopass setting for the
signal in question. It'd be reasonable to have the initial connection
tell the user the program was stopped with a signal, similar to when
we load a core to debug, but I'm leaving that out for a future change.
I think we'll need to either change how handle_inferior_event & co
handle stop_soon, or maybe bypass them completely (like
fork-child.c:startup_inferior) for that.
Tested on x86_64 Fedora 17.
gdb/gdbserver/
2014-01-08 Pedro Alves <palves@redhat.com>
* gdbthread.h (struct thread_info) <status_pending_p>: New field.
* server.c (visit_actioned_threads, handle_pending_status): New
functions.
(handle_v_cont): Factor out parts to ...
(resume): ... this new function. If in all-stop, and a thread
being resumed has a pending status, report it without actually
resuming.
(myresume): Adjust to use the new 'resume' function.
(clear_pending_status_callback, set_pending_status_callback)
(find_status_pending_thread_callback): New functions.
(handle_status): Handle the case of multiple threads having
interesting statuses to report. Report threads' real last signal
instead of always reporting GDB_SIGNAL_TRAP. Look for a thread
with an interesting status to report, instead of
always reporting the status of the first thread.
gdb/
2014-01-08 Pedro Alves <palves@redhat.com>
* remote.c (remote_add_thread): Add threads silently if starting
up.
(remote_notice_new_inferior): If in all-stop, and starting up,
don't call notice_new_inferior.
(get_current_thread): New function, factored out from ...
(add_current_inferior_and_thread): ... this. Adjust.
(remote_start_remote) <all-stop>: Fetch the thread list. If we
found any thread, then select the remote's current thread as GDB's
current thread too.
gdb/testsuite/
2014-01-08 Pedro Alves <palves@redhat.com>
* gdb.threads/reconnect-signal.c: New file.
* gdb.threads/reconnect-signal.exp: New file.