gdb: avoid premature dummy frame garbage collection

Consider the following chain of events:

  * GDB is performing an inferior call, and

  * the inferior calls longjmp, and

  * GDB detects that the longjmp has completed, stops, and enters
    check_longjmp_breakpoint_for_call_dummy (in breakpoint.c), and

  * GDB tries to unwind the stack in order to check that the dummy
    frame (set up for the inferior call) is still on the stack, but

  * the unwind fails, possibly due to missing debug information, so

  * GDB incorrectly concludes that the inferior has longjmp'd past the
    dummy frame, and so deletes the dummy frame, including the dummy
    frame breakpoint, but then

  * the inferior continues and eventually returns to the dummy frame,
    which is usually (always?) on the stack.  With the dummy frame
    breakpoint gone, the inferior starts executing the random
    contents of the stack, which results in undefined behaviour.
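
As an illustration, here is a minimal sketch of the kind of inferior
code that can trigger this chain of events.  It is not the test
program added by this commit; the names func_with_longjmp, worker, env
and longjmps_seen are hypothetical.  The longjmp lands in a frame that
is newer than the dummy frame, so the function still returns through
the dummy frame afterwards.

```cpp
#include <cassert>
#include <csetjmp>

// Hypothetical jump buffer shared by the two functions below.
static std::jmp_buf env;

// Counts how many times control came back through the longjmp path.
static int longjmps_seen = 0;

// Jump back to the setjmp in func_with_longjmp.  Never returns normally.
[[noreturn]] static void worker()
{
    std::longjmp(env, 1);
}

// A function a user might run as an inferior call, for example with
// "call func_with_longjmp ()".  The longjmp target is inside this
// function, so the caller's frame (the dummy frame, during an inferior
// call) is still live when the function returns.
int func_with_longjmp()
{
    if (setjmp(env) == 0)
        worker();

    // Reached only after the longjmp has completed.
    ++longjmps_seen;
    return 42;
}
```

When such a function is run as an inferior call, the return to the
caller is the point at which GDB needs the dummy frame breakpoint to
still be in place.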

This situation is already warned about in the comment on the function
check_longjmp_breakpoint_for_call_dummy where we say:

   You should call this function only at places where it is safe to currently
   unwind the whole stack.  Failed stack unwind would discard live dummy
   frames.

The warning here is fine.  The problem is that, even though we call the
function from a location within GDB where we hope to be able to
unwind, sometimes the state of the inferior means that the unwind will
not succeed.

This commit tries to improve the situation by adding the following
check: when GDB fails to find the dummy frame on the stack, instead
of just assuming that the dummy frame can be garbage collected, it
first finds the stop_reason for the last frame on the stack.  If this
stop_reason indicates that the stack unwinding may have failed
(anything other than UNWIND_NO_REASON or UNWIND_OUTERMOST) then we
assume that the dummy frame is still in use.  However, if the last
frame's stop_reason indicates that the stack unwind completed
successfully then we can be confident that the dummy frame is no
longer in use, and we garbage collect it.
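
The decision described above can be sketched as a small standalone
model.  This is not GDB code: StopReason, Frame and
can_discard_dummy_frame are hypothetical names, and MemoryError is a
stand-in for any stop reason other than the two "unwind completed"
values.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for GDB's unwind stop reasons.  NoReason and Outermost
// model UNWIND_NO_REASON and UNWIND_OUTERMOST; MemoryError represents
// any other (unexpected) reason.
enum class StopReason { NoReason, Outermost, MemoryError };

// Toy frame: an id plus the reason unwinding stopped at this frame.
// Only the last frame's stop_reason is consulted.
struct Frame
{
    int id;
    StopReason stop_reason;
};

// STACK is ordered innermost first, so the last element is the frame
// at which the unwind stopped.  Returns true only when the dummy frame
// is absent and the unwind appears to have completed normally.
bool can_discard_dummy_frame(const std::vector<Frame> &stack, int dummy_id)
{
    // A dummy frame that is still on the stack must be kept.
    for (const Frame &f : stack)
        if (f.id == dummy_id)
            return false;

    // Assumption made for this sketch only: with no frames at all we
    // know nothing, so keep the dummy frame.
    if (stack.empty())
        return false;

    // Why did the last frame not unwind any further?
    StopReason reason = stack.back().stop_reason;
    return reason == StopReason::NoReason
           || reason == StopReason::Outermost;
}
```

In this model a stack that ends with MemoryError never allows the
dummy frame to be discarded, which corresponds to the "assume the
dummy frame is still in use" case.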

Tested on x86-64 GNU/Linux.

gdb/ChangeLog:

	* breakpoint.c (check_longjmp_breakpoint_for_call_dummy): Add
	check for why the backtrace stopped.

gdb/testsuite/ChangeLog:

	* gdb.base/premature-dummy-frame-removal.c: New file.
	* gdb.base/premature-dummy-frame-removal.exp: New file.
	* gdb.base/premature-dummy-frame-removal.py: New file.

Change-Id: I8f330cfe0f3f33beb3a52a36994094c4abada07e
Andrew Burgess 2019-08-29 12:37:00 +01:00
parent a2cf3633b3
commit b4b3e2dee2
6 changed files with 238 additions and 4 deletions

@@ -7357,9 +7357,10 @@ set_longjmp_breakpoint_for_call_dummy (void)
TP. Remove those which can no longer be found in the current frame
stack.
-You should call this function only at places where it is safe to currently
-unwind the whole stack. Failed stack unwind would discard live dummy
-frames. */
+If the unwind fails then there is not sufficient information to discard
+dummy frames. In this case, elide the clean up and the dummy frames will
+be cleaned up next time this function is called from a location where
+unwinding is possible. */
void
check_longjmp_breakpoint_for_call_dummy (struct thread_info *tp)
@@ -7371,12 +7372,55 @@ check_longjmp_breakpoint_for_call_dummy (struct thread_info *tp)
{
struct breakpoint *dummy_b = b->related_breakpoint;
/* Find the bp_call_dummy breakpoint in the list of breakpoints
chained off b->related_breakpoint. */
while (dummy_b != b && dummy_b->type != bp_call_dummy)
dummy_b = dummy_b->related_breakpoint;
/* If there was no bp_call_dummy breakpoint then there's nothing
more to do. Or, if the dummy frame associated with the
bp_call_dummy is still on the stack then we need to leave this
bp_call_dummy in place. */
if (dummy_b->type != bp_call_dummy
|| frame_find_by_id (dummy_b->frame_id) != NULL)
continue;
/* We didn't find the dummy frame on the stack, this could be
because we have longjmp'd to a stack frame that is previous to
the dummy frame, or it could be because the stack unwind is
broken at some point between the longjmp frame and the dummy
frame.
Next we figure out why the stack unwind stopped. If it looks
like the unwind is complete then we assume the dummy frame has
been jumped over, however, if the unwind stopped for an
unexpected reason then we assume the stack unwind is currently
broken, and that we will (eventually) return to the dummy
frame.
It might be tempting to consider using frame_id_inner here, but
that is not safe. There is no guarantee that the stack frames
we are looking at here are even on the same stack as the
original dummy frame, hence frame_id_inner can't be used. See
the comments on frame_id_inner for more details. */
bool unwind_finished_unexpectedly = false;
for (struct frame_info *fi = get_current_frame (); fi != nullptr; )
{
struct frame_info *prev = get_prev_frame (fi);
if (prev == nullptr)
{
/* FI is the last stack frame. Why did this frame not
unwind further? */
auto stop_reason = get_frame_unwind_stop_reason (fi);
if (stop_reason != UNWIND_NO_REASON
&& stop_reason != UNWIND_OUTERMOST)
unwind_finished_unexpectedly = true;
}
fi = prev;
}
if (unwind_finished_unexpectedly)
continue;
dummy_frame_discard (dummy_b->frame_id, tp);
while (b->related_breakpoint != b)