gdb/python: add gdb.format_address function

Add a new function, gdb.format_address, which is a wrapper around
GDB's print_address function.

This method takes an address, and returns a string with the format:

  ADDRESS <SYMBOL+OFFSET>

Where ADDRESS is the original address, formatted as hexadecimal,
SYMBOL is the nearest symbol at or below ADDRESS, and OFFSET is the
offset from SYMBOL to ADDRESS in decimal.

If there's no SYMBOL suitably close to ADDRESS then the
<SYMBOL+OFFSET> part is not included.

This is useful if a user wants to write a Python script that
pretty-prints addresses: the user no longer needs to do manual symbol
lookup, or worry about correctly formatting addresses.

Additionally, there are some settings that affect how GDB picks
SYMBOL, and whether the file name and line number should be included
with the SYMBOL name.  The gdb.format_address function ensures that
the user's Python script also benefits from these settings.
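
A script that later needs to take such a string apart again could use something like the sketch below. The helper name parse_formatted_address and the regular expression are illustrative assumptions, not part of GDB. It assumes simple C-style symbol names (no '+', '>' or spaces in the symbol) and also accepts the " at FILE:LINE" suffix that appears when file names are enabled.

```python
import re

# Hypothetical helper, not part of GDB: split a string shaped like
# 'ADDRESS <SYMBOL+OFFSET>' (optionally with ' at FILE:LINE' inside the
# angle brackets) into its parts.  Assumes simple C-style symbol names.
_FORMATTED = re.compile(
    r"^(?P<addr>0x[0-9a-fA-F]+)"
    r"(?: <(?P<symbol>[^+>\s]+)"
    r"(?:\+(?P<offset>\d+))?"
    r"(?: at (?P<file>.+):(?P<line>\d+))?>)?$"
)


def parse_formatted_address(text):
    m = _FORMATTED.match(text)
    if m is None:
        raise ValueError("unrecognised address string: %r" % text)
    has_symbol = m.group("symbol") is not None
    return {
        "address": int(m.group("addr"), 16),
        "symbol": m.group("symbol"),
        # A symbol printed without '+OFFSET' is treated as offset 0.
        "offset": int(m.group("offset") or 0) if has_symbol else None,
        "file": m.group("file"),
        "line": int(m.group("line")) if m.group("line") else None,
    }
```

For symbols that may contain '+' or '>' (for example C++ operators or templates), a stricter parser would be needed.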

By default, gdb.format_address selects SYMBOL from the current
inferior's program space, and the address is formatted using the
architecture for the current inferior.  However, in order to format an
address for a different inferior, a user can explicitly pass a program
space and architecture like this:

  gdb.format_address(ADDRESS, PROGRAM_SPACE, ARCHITECTURE)
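
As a rough illustration of the three-argument form, the sketch below formats one address once per inferior. The function name format_in_all_inferiors is hypothetical, and the gdb module is passed in as a parameter only so that the sketch can be exercised outside GDB. It relies on gdb.inferiors(), Inferior.num, Inferior.progspace and Inferior.architecture() from GDB's Python API.

```python
# Hypothetical helper, not part of GDB.  GDB_MODULE is expected to be
# the 'gdb' module when this runs inside GDB.
def format_in_all_inferiors(gdb_module, address):
    """Return a dict mapping inferior number to ADDRESS formatted in
    that inferior's program space and architecture."""
    results = {}
    for inf in gdb_module.inferiors():
        # Pass both optional arguments so that the lookup does not
        # depend on which inferior is currently selected.
        results[inf.num] = gdb_module.format_address(
            address, inf.progspace, inf.architecture())
    return results


# Inside GDB this might be used as:
#   import gdb
#   print(format_in_all_inferiors(gdb, 0x401136))
```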

Notes on the implementation:

In py-arch.c I extended arch_object_to_gdbarch to add an assertion for
the type of the PyObject being worked on.  Prior to this commit all
uses of arch_object_to_gdbarch were guaranteed to pass this function a
gdb.Architecture object, but, with this commit, this might not be the
case.

So, with this commit I've made it a requirement that the PyObject be a
gdb.Architecture, and this is checked with the assert.  And in order
that callers from other files can check if they have a
gdb.Architecture object, I've added the new function
gdbpy_is_architecture.

In py-progspace.c I've added two new functions: the first,
progspace_object_to_program_space, converts a PyObject of type
gdb.Progspace to the associated program_space pointer, and the second,
gdbpy_is_progspace, checks if a PyObject is a gdb.Progspace or not.
Andrew Burgess 2021-10-23 09:59:25 +01:00 committed by Andrew Burgess
parent 6c111a4ec2
commit 25209e2c69
8 changed files with 423 additions and 2 deletions

@ -3,6 +3,14 @@
*** Changes since GDB 12
* Python API
** New function gdb.format_address(ADDRESS, PROGSPACE, ARCHITECTURE),
that formats ADDRESS as 'address <symbol+offset>', where symbol is
looked up in PROGSPACE, and ARCHITECTURE is used to format address.
This is the same format that GDB uses when printing address, symbol,
and offset information from the disassembler.
*** Changes in GDB 12
* DBX mode is deprecated, and will be removed in GDB 13

@ -615,6 +615,60 @@ currently active connection (@pxref{Connections In Python}). The
connection objects are in no particular order in the returned list.
@end defun
@defun gdb.format_address (@var{address} @r{[}, @var{progspace}, @var{architecture}@r{]})
Return a string in the format @samp{@var{addr}
<@var{symbol}+@var{offset}>}, where @var{addr} is @var{address}
formatted in hexadecimal, @var{symbol} is the symbol whose address is
the nearest to @var{address} and below it in memory, and @var{offset}
is the offset from @var{symbol} to @var{address} in decimal.
If no suitable @var{symbol} was found, then the
<@var{symbol}+@var{offset}> part is not included in the returned
string, instead the returned string will just contain the
@var{address} formatted as hexadecimal. How far @value{GDBN} looks
back for a suitable symbol can be controlled with @kbd{set print
max-symbolic-offset} (@pxref{Print Settings}).
Additionally, the returned string can include file name and line
number information when @kbd{set print symbol-filename on} is in
effect (@pxref{Print Settings}); in this case the format of the
returned string is @samp{@var{addr} <@var{symbol}+@var{offset} at
@var{filename}:@var{line-number}>}.
The @var{progspace} is the gdb.Progspace in which @var{symbol} is
looked up, and @var{architecture} is used when formatting @var{addr},
e.g.@: in order to determine the size of an address in bytes.
If neither @var{progspace} nor @var{architecture} is passed, then by
default @value{GDBN} will use the program space and architecture of
the currently selected inferior, thus, the following two calls are
equivalent:
@smallexample
gdb.format_address(address)
gdb.format_address(address,
gdb.selected_inferior().progspace,
gdb.selected_inferior().architecture())
@end smallexample
It is not valid to only pass one of @var{progspace} or
@var{architecture}, either they must both be provided, or neither must
be provided (and the defaults will be used).
This method uses the same mechanism for formatting address, symbol,
and offset information as core @value{GDBN} does in commands such as
@kbd{disassemble}.
Here are some examples of the possible string formats:
@smallexample
0x00001042
0x00001042 <symbol+16>
0x00001042 <symbol+16 at file.c:123>
@end smallexample
@end defun
@node Exception Handling
@subsubsection Exception Handling
@cindex python exceptions

@ -62,16 +62,25 @@ arch_object_data_init (struct gdbarch *gdbarch)
}
/* Returns the struct gdbarch value corresponding to the given Python
architecture object OBJ, which must be a gdb.Architecture object. */
struct gdbarch *
arch_object_to_gdbarch (PyObject *obj)
{
gdb_assert (gdbpy_is_architecture (obj));
arch_object *py_arch = (arch_object *) obj;
return py_arch->gdbarch;
}
/* See python-internal.h. */
bool
gdbpy_is_architecture (PyObject *obj)
{
return PyObject_TypeCheck (obj, &arch_object_type);
}
/* Returns the Python architecture object corresponding to GDBARCH.
Returns a new reference to the arch_object associated as data with
GDBARCH. */

@ -504,6 +504,23 @@ pspace_to_pspace_object (struct program_space *pspace)
return gdbpy_ref<>::new_reference (result);
}
/* See python-internal.h. */
struct program_space *
progspace_object_to_program_space (PyObject *obj)
{
gdb_assert (gdbpy_is_progspace (obj));
return ((pspace_object *) obj)->pspace;
}
/* See python-internal.h. */
bool
gdbpy_is_progspace (PyObject *obj)
{
return PyObject_TypeCheck (obj, &pspace_object_type);
}
void _initialize_py_progspace ();
void
_initialize_py_progspace ()

@ -497,6 +497,13 @@ struct symtab_and_line *sal_object_to_symtab_and_line (PyObject *obj);
struct frame_info *frame_object_to_frame_info (PyObject *frame_obj);
struct gdbarch *arch_object_to_gdbarch (PyObject *obj);
/* Convert Python object OBJ to a program_space pointer. OBJ must be a
gdb.Progspace reference. Return nullptr if the gdb.Progspace is not
valid (see gdb.Progspace.is_valid), otherwise return the program_space
pointer. */
extern struct program_space *progspace_object_to_program_space (PyObject *obj);
void gdbpy_initialize_gdb_readline (void);
int gdbpy_initialize_auto_load (void)
CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION;
@ -838,4 +845,13 @@ typedef std::unique_ptr<Py_buffer, Py_buffer_deleter> Py_buffer_up;
extern bool gdbpy_parse_register_id (struct gdbarch *gdbarch,
PyObject *pyo_reg_id, int *reg_num);
/* Return true if OBJ is a gdb.Architecture object, otherwise, return
false. */
extern bool gdbpy_is_architecture (PyObject *obj);
/* Return true if OBJ is a gdb.Progspace object, otherwise, return false. */
extern bool gdbpy_is_progspace (PyObject *obj);
#endif /* PYTHON_PYTHON_INTERNAL_H */

@ -1294,6 +1294,107 @@ gdbpy_colorize_disasm (const std::string &content, gdbarch *gdbarch)
/* Implement gdb.format_address(ADDR,P_SPACE,ARCH). Provide access to
GDB's print_address function from Python. The returned address will
have the format '0x..... <symbol+offset>'. */
static PyObject *
gdbpy_format_address (PyObject *self, PyObject *args, PyObject *kw)
{
static const char *keywords[] =
{
"address", "progspace", "architecture", nullptr
};
PyObject *addr_obj = nullptr, *pspace_obj = nullptr, *arch_obj = nullptr;
CORE_ADDR addr;
struct gdbarch *gdbarch = nullptr;
struct program_space *pspace = nullptr;
if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "O|OO", keywords,
&addr_obj, &pspace_obj, &arch_obj))
return nullptr;
if (get_addr_from_python (addr_obj, &addr) < 0)
return nullptr;
/* If the user passed None for progspace or architecture, then we
consider this to mean "the default". Here we replace references to
None with nullptr, this means that in the following code we only have
to handle the nullptr case. These are only borrowed references, so
no decref is required here. */
if (pspace_obj == Py_None)
pspace_obj = nullptr;
if (arch_obj == Py_None)
arch_obj = nullptr;
if (pspace_obj == nullptr && arch_obj == nullptr)
{
/* Grab both of these from the current inferior, and its associated
default architecture. */
pspace = current_inferior ()->pspace;
gdbarch = current_inferior ()->gdbarch;
}
else if (arch_obj == nullptr || pspace_obj == nullptr)
{
/* If the user has only given one of program space or architecture,
then don't use the default for the other. Sure we could use the
default, but it feels like there's too much scope of mistakes in
this case, so better to require the user to provide both
arguments. */
PyErr_SetString (PyExc_ValueError,
_("The architecture and progspace arguments must both be supplied"));
return nullptr;
}
else
{
/* The user provided an address, program space, and architecture.
Just check that these objects are valid. */
if (!gdbpy_is_progspace (pspace_obj))
{
PyErr_SetString (PyExc_TypeError,
_("The progspace argument is not a gdb.Progspace object"));
return nullptr;
}
pspace = progspace_object_to_program_space (pspace_obj);
if (pspace == nullptr)
{
PyErr_SetString (PyExc_ValueError,
_("The progspace argument is not valid"));
return nullptr;
}
if (!gdbpy_is_architecture (arch_obj))
{
PyErr_SetString (PyExc_TypeError,
_("The architecture argument is not a gdb.Architecture object"));
return nullptr;
}
/* Architectures are never deleted once created, so gdbarch should
never come back as nullptr. */
gdbarch = arch_object_to_gdbarch (arch_obj);
gdb_assert (gdbarch != nullptr);
}
/* By this point we should know the program space and architecture we are
going to use. */
gdb_assert (pspace != nullptr);
gdb_assert (gdbarch != nullptr);
/* Unfortunately print_address relies on the current program space for
its symbol lookup. Temporarily switch now. */
scoped_restore_current_program_space restore_progspace;
set_current_program_space (pspace);
/* Format the address, and return it as a string. */
string_file buf;
print_address (gdbarch, addr, &buf);
return PyString_FromString (buf.c_str ());
}
/* Printing. */
/* A python function to write a single string using gdb's filtered
@ -2445,6 +2546,13 @@ Return a list of all the architecture names GDB understands." },
"connections () -> List.\n\
Return a list of gdb.TargetConnection objects." },
{ "format_address", (PyCFunction) gdbpy_format_address,
METH_VARARGS | METH_KEYWORDS,
"format_address (ADDRESS, PROG_SPACE, ARCH) -> String.\n\
Format ADDRESS, an address within PROG_SPACE, a gdb.Progspace, using\n\
ARCH, a gdb.Architecture to determine the address size. The format of\n\
the returned string is 'ADDRESS <SYMBOL+OFFSET>' without the quotes." },
{NULL, NULL, 0, NULL}
};
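
The "both or neither" rule that gdbpy_format_address applies to its optional arguments can be modelled in plain Python roughly as below. This is only an illustrative sketch: the function resolve_format_address_args and its default_progspace and default_architecture parameters are hypothetical stand-ins for the current inferior's values, and the type and validity checks performed by the C code are left out.

```python
# Hypothetical model of the argument handling, not GDB code.  The
# default_* parameters stand in for the current inferior's program
# space and architecture.
def resolve_format_address_args(progspace=None, architecture=None, *,
                                default_progspace, default_architecture):
    # None (or an omitted argument) means "use the default".
    if progspace is None and architecture is None:
        return default_progspace, default_architecture
    # Supplying only one of the two is rejected rather than silently
    # mixed with a default for the other.
    if progspace is None or architecture is None:
        raise ValueError(
            "The architecture and progspace arguments must both be supplied")
    return progspace, architecture
```

Rejecting the half-specified case mirrors the reasoning in the C comment: combining one explicit value with a default for the other leaves too much room for mistakes.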

@ -0,0 +1,32 @@
/* This testcase is part of GDB, the GNU debugger.
Copyright 2022 Free Software Foundation, Inc.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>. */
/* This test is compiled multiple times with FUNCTION_NAME defined to
different strings, this means we should (hopefully) get the same code
layout in memory, but with different strings for the function name. */
int
FUNCTION_NAME (void)
{
return 0;
}
int
main (void)
{
return FUNCTION_NAME ();
}

@ -0,0 +1,177 @@
# Copyright 2022 Free Software Foundation, Inc.
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
load_lib gdb-python.exp
standard_testfile
foreach func_name { foo bar } {
if {[build_executable "build binary with ${func_name} function" \
"$testfile-${func_name}" $srcfile \
[list debug \
additional_flags=-DFUNCTION_NAME=${func_name}]] == -1} {
return -1
}
}
set binary_foo [standard_output_file "${testfile}-foo"]
set binary_bar [standard_output_file "${testfile}-bar"]
clean_restart $binary_foo
# Skip all tests if Python scripting is not enabled.
if { [skip_python_tests] } { continue }
if ![runto_main] {
return -1
}
# Check the gdb.format_address method when using the default values
# for the program space and architecture (these will be selected based
# on the current inferior).
set main_addr [get_hexadecimal_valueof "&main" "UNKNOWN"]
set next_addr [format 0x%x [expr $main_addr + 1]]
foreach_with_prefix symbol_filename { on off } {
gdb_test_no_output "set print symbol-filename ${symbol_filename}"
if { $symbol_filename == "on" } {
set filename_pattern " at \[^\r\n\]+/${srcfile}:$decimal"
} else {
set filename_pattern ""
}
gdb_test "python print(\"Got: \" + gdb.format_address($main_addr))" \
"Got: $main_addr <main${filename_pattern}>" \
"gdb.format_address, result should have no offset"
gdb_test "python print(\"Got: \" + gdb.format_address($next_addr))" \
"Got: $next_addr <main\\+1${filename_pattern}>" \
"gdb.format_address, result should have an offset"
}
if {![is_address_zero_readable]} {
gdb_test "python print(\"Got: \" + gdb.format_address(0))" \
"Got: 0x0" \
"gdb.format_address for address 0"
}
# Now check that gdb.format_address will accept the program space and
# architecture arguments correctly.
gdb_test_no_output "python inf = gdb.selected_inferior()"
# First, pass both arguments, this should be fine.
gdb_test "python print(\"Got: \" + gdb.format_address($main_addr, inf.progspace, inf.architecture()))" \
"Got: $main_addr <main>" \
"gdb.format_address passing program space and architecture"
# Now pass the program space and architecture as None, this selects
# the defaults, so it should also be fine.
gdb_test "python print(\"Got: \" + gdb.format_address($main_addr, None, None))" \
"Got: $main_addr <main>" \
"gdb.format_address passing program space and architecture as None"
# Now forget the architecture, this should fail.
gdb_test "python print(\"Got: \" + gdb.format_address($main_addr, inf.progspace))" \
[multi_line \
"ValueError: The architecture and progspace arguments must both be supplied" \
"Error while executing Python code\\."] \
"gdb.format_address passing program space only"
gdb_test "python print(\"Got: \" + gdb.format_address($main_addr, inf.progspace, None))" \
[multi_line \
"ValueError: The architecture and progspace arguments must both be supplied" \
"Error while executing Python code\\."] \
"gdb.format_address passing real program space, but architecture is None"
# Now skip the program space argument.
gdb_test "python print(\"Got: \" + gdb.format_address($main_addr, architecture=inf.architecture()))" \
[multi_line \
"ValueError: The architecture and progspace arguments must both be supplied" \
"Error while executing Python code\\."] \
"gdb.format_address passing architecture only"
gdb_test "python print(\"Got: \" + gdb.format_address($main_addr, None, inf.architecture()))" \
[multi_line \
"ValueError: The architecture and progspace arguments must both be supplied" \
"Error while executing Python code\\."] \
"gdb.format_address passing real architecture, but progspace is None"
# Now, before we add a second inferior, let's just check we can format
# the address of 'foo' correctly.
set foo_addr [get_hexadecimal_valueof "&foo" "UNKNOWN"]
gdb_test "python print(\"Got: \" + gdb.format_address($foo_addr, inf.progspace, inf.architecture()))" \
"Got: $foo_addr <foo>" \
"gdb.format_address for foo, with just one inferior"
# Now let's add a second inferior, using a slightly different
# executable, select that inferior, and capture a reference to the
# inferior in a Python object.
gdb_test "add-inferior -exec ${binary_bar}" ".*" \
"add a second inferior running the bar executable"
gdb_test "inferior 2" ".*"
gdb_test_no_output "python inf2 = gdb.selected_inferior()"
# Now we can test formatting an address from inferior 1.
gdb_test "python print(\"Got: \" + gdb.format_address($foo_addr, inf.progspace, inf.architecture()))" \
"Got: $foo_addr <foo>" \
"gdb.format_address for foo, while inferior 2 is selected"
# Grab the address of 'bar'. Hopefully this will be the same address
# as 'foo', but if not, that's not the end of the world, the test just
# won't be quite as tough.
set bar_addr [get_hexadecimal_valueof "&bar" "UNKNOWN"]
# Now format the address of bar using the default inferior and
# architecture, this should display the 'bar' symbol rather than
# 'foo'.
gdb_test "python print(\"Got: \" + gdb.format_address($bar_addr))" \
"Got: $bar_addr <bar>" \
"gdb.format_address for bar, while inferior 2 is selected"
# And again, but this time, specify the program space and
# architecture.
gdb_test "python print(\"Got: \" + gdb.format_address($bar_addr, inf2.progspace, inf2.architecture()))" \
"Got: $bar_addr <bar>" \
"gdb.format_address for bar, while inferior 2 is selected, pass progspace and architecture"
# Reselect inferior 1, and then format an address from inferior 2.
gdb_test "inferior 1" ".*"
gdb_test "python print(\"Got: \" + gdb.format_address($bar_addr, inf2.progspace, inf2.architecture()))" \
"Got: $bar_addr <bar>" \
"gdb.format_address for bar, while inferior 1 is selected, pass progspace and architecture"
# Try passing incorrect object types for program space and architecture.
gdb_test "python print(\"Got: \" + gdb.format_address($bar_addr, inf2.progspace, inf2.progspace))" \
[multi_line \
"TypeError: The architecture argument is not a gdb.Architecture object" \
"Error while executing Python code\\."] \
"gdb.format_address pass wrong object type for architecture"
gdb_test "python print(\"Got: \" + gdb.format_address($bar_addr, inf2.architecture(), inf2.architecture()))" \
[multi_line \
"TypeError: The progspace argument is not a gdb.Progspace object" \
"Error while executing Python code\\."] \
"gdb.format_address pass wrong object type for progspace"
# Now invalidate inferior 2's program space, and try using that.
gdb_test "python pspace = inf2.progspace"
gdb_test "python arch = inf2.architecture()"
gdb_test "remove-inferior 2"
gdb_test "python print(\"Got: \" + gdb.format_address($bar_addr, pspace, arch))" \
[multi_line \
"ValueError: The progspace argument is not valid" \
"Error while executing Python code\\."] \
"gdb.format_address called with an invalid program space"