GDB uses the environment variable PYTHONDONTWRITEBYTECODE to
determine whether or not to write the result of byte-compiling
python modules when the "python dont-write-bytecode" setting
is "auto". Simon noticed that GDB's implementation doesn't
follow the Python documentation.
At present, GDB only checks for the existence of this environment
variable. That is not sufficient though. Regarding
PYTHONDONTWRITEBYTECODE, this document...
https://docs.python.org/3/using/cmdline.html
...says:
If this is set to a non-empty string, Python won't try to write
.pyc files on the import of source modules.
This commit fixes GDB's handling of PYTHONDONTWRITEBYTECODE by adding
an empty string check.
This commit also corrects the set/show command documentation for
"python dont-write-bytecode". The current doc was just a copy
of that for set/show python ignore-environment.
During his review of an earlier version of this patch, Eli Zaretskii
asked that the help text that I proposed for "set/show python
dont-write-bytecode" be expanded. I've done that in addition to
clarifying the documentation of this option in the GDB manual.
After this commit:
commit 81384924cd
Date: Tue Apr 5 11:06:16 2022 +0100
gdb: have gdb_disassemble_info carry 'this' in its stream pointer
The disassemble_info::stream field will no longer be a ui_file*. That
commit failed to update one location in py-disasm.c though.
While running some tests using the Python disassembler API, I
triggered a call to gdbpy_disassembler::print_address_func, and, as I
had compiled GDB with the undefined behaviour sanitizer, GDB crashed
as the code currently (incorrectly) casts the stream field to be a
ui_file*.
In this commit I fix this error.
In order to test this case I had to tweak the existing test case a
little. I also spotted some debug printf statements in py-disasm.py,
which I have removed.
With python 3.11 I noticed:
...
$ gdb -q -batch -ex "maint selftest python"
Running selftest python.
Self test failed: self-test failed at gdb/python/python.c:2246
Ran 1 unit tests, 1 failed
...
In more detail:
...
(gdb) p output
$5 = "Traceback (most recent call last):\n File \"<string>\", line 0, \
in <module>\nKeyboardInterrupt\n"
(gdb) p ref_output
$6 = "Traceback (most recent call last):\n File \"<string>\", line 1, \
in <module>\nKeyboardInterrupt\n"
...
Fix this by also allowing line number 0.
Tested on x86_64-linux.
This should hopefully fix buildbot builder gdb-rawhide-x86_64.
Python 3.11 deprecates PySys_SetPath and Py_SetProgramName. The
PyConfig API replaces these and other functions. This commit uses the
PyConfig API to provide equivalent functionality while also preserving
support for older versions of Python, i.e. those before Python 3.8.
A beta version of Python 3.11 is available in Fedora Rawhide. Both
Fedora 35 and Fedora 36 use Python 3.10, while Fedora 34 still used
Python 3.9. I've tested these changes on Fedora 34, Fedora 36, and
rawhide, though complete testing was not possible on rawhide due to
a kernel bug. That being the case, I decided to enable the newer
PyConfig API by testing PY_VERSION_HEX against 0x030a0000. This
corresponds to Python 3.10.
We could try to use the PyConfig API for Python versions as early as 3.8,
but I'm reluctant to do this as there may have been PyConfig related
bugs in earlier versions which have since been fixed. Recent linux
distributions should have support for Python 3.10. This should be
more than adequate for testing the new Python initialization code in
GDB.
Information about the PyConfig API as well as the motivation behind
deprecating the old interface can be found at these links:
https://github.com/python/cpython/issues/88279https://peps.python.org/pep-0587/https://docs.python.org/3.11/c-api/init_config.html
The v2 commit also addresses several problems that Simon found in
the v1 version.
In v1, I had used Py_DontWriteBytecodeFlag in the new initialization
code, but Simon pointed out that this global configuration variable
will be deprecated in Python 3.12. This version of the patch no longer
uses Py_DontWriteBytecodeFlag in the new initialization code.
Additionally, both Py_DontWriteBytecodeFlag and Py_IgnoreEnvironmentFlag
will no longer be used when building GDB against Python 3.10 or higher.
While it's true that both of these global configuration variables are
deprecated in Python 3.12, it makes sense to disable their use for
gdb builds against 3.10 and higher since those are the versions for
which the PyConfig API is now being used by GDB. (The PyConfig API
includes different mechanisms for making the same settings afforded
by use of the soon-to-be deprecated global configuration variables.)
Simon also noted that PyConfig_Clear() would not have be called for
one of the failure paths. I've fixed that problem and also made the
rest of the "bail out" code more direct. In particular,
PyConfig_Clear() will always be called, both for success and failure.
The v3 patch addresses some rebase conflicts related to module
initialization . Commit 3acd9a692d ("Make 'import gdb.events' work")
uses PyImport_ExtendInittab instead of PyImport_AppendInittab. That
commit also initializes a struct for each module to import. Both the
initialization and the call to were moved ahead of the ifdefs to avoid
having to replicate (at least some of) the code three times in various
portions of the ifdefs.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28668
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29287
Currently, Python code can use event registries to detect when gdb
loads a new objfile, and when gdb clears the objfile list. However,
there's no way to detect the removal of an objfile, say when the
inferior calls dlclose.
This patch adds a gdb.free_objfile event registry and arranges for an
event to be emitted in this case.
When I rebased and updated the print_options patch, I forgot to update
print_options to add the new 'nibbles' feature to the result. This
patch fixes the oversight. I'm checking this in.
This adds a 'summary' mode to Value.format_string and to
gdb.print_options. For the former, it lets Python code format values
using this mode. For the latter, it lets a printer potentially detect
if it is being called in a backtrace with 'set print frame-arguments'
set to 'scalars'.
I considered adding a new mode here to let a pretty-printer see
whether it was being called in a 'backtrace' context at all, but I'm
not sure if this is really desirable.
PR python/17291 asks for access to the current print options. While I
think this need is largely satisfied by the existence of
Value.format_string, it seemed to me that a bit more could be done.
First, while Value.format_string uses the user's settings, it does not
react to temporary settings such as "print/x". This patch changes
this.
Second, there is no good way to examine the current settings (in
particular the temporary ones in effect for just a single "print").
This patch adds this as well.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=17291
PR python/27000 points out that gdb.block_for_pc will accept a Python
integer, but not a gdb.Value. This patch corrects this oversight.
I looked at all uses of GDB_PY_LLU_ARG and fixed these up to use
get_addr_from_python instead. I also looked at uses of GDB_PY_LL_ARG,
but those seemed relatively unlikely to be useful with a gdb.Value, so
I didn't change them. My thinking here is that a Value will typically
come from inferior memory, and something like a line number is not too
likely to be found this way.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27000
PR python/29217 points out that gdb.parameter will return bool values,
but gdb.set_parameter will not properly accept them. This patch fixes
the problem by adding a special case to set_parameter.
I looked at a fix involving rewriting set_parameter in C++. However,
this one is simpler.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29217
Sometimes an objfile comes from memory and not from a file. It can be
useful to be able to check this from Python, so this patch adds a new
"is_file" attribute.
Pierre-Marie noticed that, while gdb.events is a Python module, it
can't be imported. This patch changes how this module is created, so
that it can be imported, while also ensuring that the module is always
visible, just as it was in the past.
This new approach required one non-obvious change -- when running
gdb.base/warning.exp, where --data-directory is intentionally not
found, the event registries can now be nullptr. Consequently, this
patch probably also requires
https://sourceware.org/pipermail/gdb-patches/2022-June/189796.html
Note that this patch obsoletes
https://sourceware.org/pipermail/gdb-patches/2022-June/189797.html
I noticed a few spots that were explicitly creating new references to
Py_True or Py_False. It's simpler here to use PyBool_FromLong, so
this patch changes all the places I found.
This patch makes it possible to allow Value.format_string() to return
nibbles output.
When we set the parameter of nibbles to True, we can achieve the
displaying binary values in groups of every four bits.
Here's an example:
(gdb) py print (gdb.Value (1230).format_string (format='t', nibbles=True))
0100 1100 1110
(gdb)
Note that the parameter nibbles is only useful if format='t' is also used.
This patch also includes update to the relevant testcase and
documentation.
Tested on x86_64 openSUSE Tumbleweed.
This converts location_spec_to_string to a method of location_spec,
simplifying the code using it, as it no longer has to use
std::unique_ptr::get().
Change-Id: I621bdad8ea084470a2724163f614578caf8f2dd5
Currently, there's the location_spec hierarchy, and then some
location_spec subclasses have their own struct type holding all their
data fields.
I.e., there is this:
location_spec
explicit_location_spec
linespec_location_spec
address_location_spec
probe_location_spec
and then these separate types:
explicit_location
linespec_location
where:
explicit_location_spec
has-a explicit_location
linespec_location_spec
has-a linespec_location
This patch eliminates explicit_location and linespec_location,
inlining their members in the corresponding location_spec type.
The location_spec subclasses were the ones currently defined in
location.c, so they are moved to the header. Since the definitions of
the classes are now visible, we no longer need location_spec_deleter.
Some constructors that are used for cloning location_specs, like:
explicit explicit_location_spec (const struct explicit_location *loc)
... were converted to proper copy ctors.
In the process, initialize_explicit_location is eliminated, and some
functions that returned the "data type behind a locspec", like
get_linespec_location are converted to downcast functions, like
as_linespec_location_spec.
Change-Id: Ia31ccef9382b25a52b00fa878c8df9b8cf2a6c5a
Currently, GDB internally uses the term "location" for both the
location specification the user input (linespec, explicit location, or
an address location), and for actual resolved locations, like the
breakpoint locations, or the result of decoding a location spec to
SaLs. This is expecially confusing in the breakpoints module, as
struct breakpoint has these two fields:
breakpoint::location;
breakpoint::loc;
"location" is the location spec, and "loc" is the resolved locations.
And then, we have a method called "locations()", which returns the
resolved locations as range...
The location spec type is presently called event_location:
/* Location we used to set the breakpoint. */
event_location_up location;
and it is described like this:
/* The base class for all an event locations used to set a stop event
in the inferior. */
struct event_location
{
and even that is incorrect... Location specs are used for finding
actual locations in the program in scenarios that have nothing to do
with stop events. E.g., "list" works with location specs.
To clean all this confusion up, this patch renames "event_location" to
"location_spec" throughout, and then all the variables that hold a
location spec, they are renamed to include "spec" in their name, like
e.g., "location" -> "locspec". Similarly, functions that work with
location specs, and currently have just "location" in their name are
renamed to include "spec" in their name too.
Change-Id: I5814124798aa2b2003e79496e78f95c74e5eddca
I noticed that emit_exiting_event does not check whether there are any
listeners before creating the event object. All other event emitters
do this, so this patch updates this one as well.
This commit extends the Python API to include disassembler support.
The motivation for this commit was to provide an API by which the user
could write Python scripts that would augment the output of the
disassembler.
To achieve this I have followed the model of the existing libopcodes
disassembler, that is, instructions are disassembled one by one. This
does restrict the type of things that it is possible to do from a
Python script, i.e. all additional output has to fit on a single line,
but this was all I needed, and creating something more complex would,
I think, require greater changes to how GDB's internal disassembler
operates.
The disassembler API is contained in the new gdb.disassembler module,
which defines the following classes:
DisassembleInfo
Similar to libopcodes disassemble_info structure, has read-only
properties: address, architecture, and progspace. And has methods:
__init__, read_memory, and is_valid.
Each time GDB wants an instruction disassembled, an instance of
this class is passed to a user written disassembler function, by
reading the properties, and calling the methods (and other support
methods in the gdb.disassembler module) the user can perform and
return the disassembly.
Disassembler
This is a base-class which user written disassemblers should
inherit from. This base class provides base implementations of
__init__ and __call__ which the user written disassembler should
override.
DisassemblerResult
This class can be used to hold the result of a call to the
disassembler, it's really just a wrapper around a string (the text
of the disassembled instruction) and a length (in bytes). The user
can return an instance of this class from Disassembler.__call__ to
represent the newly disassembled instruction.
The gdb.disassembler module also provides the following functions:
register_disassembler
This function registers an instance of a Disassembler sub-class
as a disassembler, either for one specific architecture, or, as a
global disassembler for all architectures.
builtin_disassemble
This provides access to GDB's builtin disassembler. A common
use case that I see is augmenting the existing disassembler output.
The user code can call this function to have GDB disassemble the
instruction in the normal way. The user gets back a
DisassemblerResult object, which they can then read in order to
augment the disassembler output in any way they wish.
This function also provides a mechanism to intercept the
disassemblers reads of memory, thus the user can adjust what GDB
sees when it is disassembling.
The included documentation provides a more detailed description of the
API.
There is also a new CLI command added:
maint info python-disassemblers
This command is defined in the Python gdb.disassemblers module, and
can be used to list the currently registered Python disassemblers.
This commit is setup for the next commit.
In the next commit I will add a Python API to intercept the print_insn
calls within GDB, each print_insn call is responsible for
disassembling, and printing one instruction. After the next commit it
will be possible for a user to write Python code that either wraps
around the existing disassembler, or even, in extreme situations,
entirely replaces the existing disassembler.
This commit does not add any new Python API.
What this commit does is put the extension language framework in place
for a print_insn hook. There's a new callback added to 'struct
extension_language_ops', which is then filled in with nullptr for Python
and Guile.
Finally, in the disassembler, the code is restructured so that the new
extension language function ext_lang_print_insn is called before we
delegate to gdbarch_print_insn.
After this, the next commit can focus entirely on providing a Python
implementation of the new print_insn callback.
There should be no user visible change after this commit.
Convert the gdbpy_err_fetch class to make use of gdbpy_ref, this
removes the need for manual reference count management, and allows the
destructor to be removed.
There should be no functional change after this commit.
I think this cleanup is worth doing on its own, however, in a later
commit I will want to copy instances of gdbpy_err_fetch, and switching
to using gdbpy_ref means that I can rely on the default copy
constructor, without having to add one that handles the reference
counts, so this is good preparation for that upcoming change.
I noticed that solib_name_from_address returned a non-const string,
but it's more appropriate to return const. This patch implements
this. Tested by rebuilding.
I found a comment that referred to Python 2, but that is now obsolete
-- the code it refers to is gone. I'm checking in this patch to
remove the comment.
There's a similar comment elsewhere, but I plan to remove that one in
another patch I'm going to submit shortly.
This adds the gdb.current_language function, which can be used to find
the current language without (1) ever having the value "auto" or (2)
having to parse the output of "show language".
It also adds the gdb.Frame.language, which can be used to find the
language of a given frame. This is normally preferable if one has a
Frame object handy.
Consider this command defined in Python (in the file test-cmd.py):
class test_cmd (gdb.Command):
"""
This is the first line.
Indented second line.
This is the third line.
"""
def __init__ (self):
super ().__init__ ("test-cmd", gdb.COMMAND_OBSCURE)
def invoke (self, arg, from_tty):
print ("In test-cmd")
test_cmd()
Now, within a GDB session:
(gdb) source test-cmd.py
(gdb) help test-cmd
This is the first line.
Indented second line.
This is the third line.
(gdb)
I think there's three things wrong here:
1. The leading blank line,
2. The trailing blank line, and
3. Every line is indented from the left edge slightly.
The problem of course, is that GDB is using the Python doc string
verbatim as its help text. While the user has formatted the help text
so that it appears clear within the .py file, this means that the text
appear less well formatted when displayed in the "help" output.
The same problem can be observed for gdb.Parameter objects in their
set/show output.
In this commit I aim to improve the "help" output for commands and
parameters.
To do this I have added gdbpy_fix_doc_string_indentation, a new
function that rewrites the doc string text following the following
rules:
1. Leading blank lines are removed,
2. Trailing blank lines are removed, and
3. Leading whitespace is removed in a "smart" way such that the
relative indentation of lines is retained.
With this commit in place the above example now looks like this:
(gdb) source ~/tmp/test-cmd.py
(gdb) help test-cmd
This is the first line.
Indented second line.
This is the third line.
(gdb)
Which I think is much neater. Notice that the indentation of the
second line is retained. Any blank lines within the help text (not
leading or trailing) will be retained.
I've added a NEWS entry to note that there has been a change in
behaviour, but I didn't update the manual. The existing manual is
suitably vague about how the doc string is used, so I think the new
behaviour is covered just as well by the existing text.
Make use of gdb::unique_xmalloc_ptr<char> to hold the documentation
string in cmdpy_init (when creating a custom GDB command in Python).
I think this is all pretty straight forward, the only slight weirdness
is the removal of the call to free toward the end of this function.
Prior to this commit, if an exception was thrown after the GDB command
was created then we would (I think) end up freeing the documentation
string even though the command would remain registered with GDB, which
would surely lead to undefined behaviour.
After this commit we release the doc string at the point that we hand
it over to the command creation routines. If we throw _after_ the
command has been created within GDB then the doc string will be left
live. If we throw during the command creation itself (either from
add_prefix_cmd or add_cmd) then it is up to those functions to free
the doc string (I suspect we don't, but I think in general the
commands are pretty bad at cleaning up after themselves, so I don't
think this is a huge problem).
Even after the previous patches reworking the inheritance of several
breakpoint types, the present breakpoint hierarchy looks a bit
surprising, as we have "breakpoint" as the superclass, and then
"base_breakpoint" inherits from "breakpoint". Like so, simplified:
breakpoint
base_breakpoint
ordinary_breakpoint
internal_breakpoint
momentary_breakpoint
ada_catchpoint
exception_catchpoint
tracepoint
watchpoint
catchpoint
exec_catchpoint
...
The surprising part to me is having "base_breakpoint" being a subclass
of "breakpoint". I'm just refering to naming here -- I mean, you'd
expect that it would be the top level baseclass that would be called
"base".
Just flipping the names of breakpoint and base_breakpoint around
wouldn't be super great for us, IMO, given we think of every type of
*point as a breakpoint at the user visible level. E.g., "info
breakpoints" shows watchpoints, tracepoints, etc. So it makes to call
the top level class breakpoint.
Instead, I propose renaming base_breakpoint to code_breakpoint. The
previous patches made sure that all code breakpoints inherit from
base_breakpoint, so it's fitting. Also, "code breakpoint" contrasts
nicely with a watchpoint also being typically known as a "data
breakpoint".
After this commit, the resulting hierarchy looks like:
breakpoint
code_breakpoint
ordinary_breakpoint
internal_breakpoint
momentary_breakpoint
ada_catchpoint
exception_catchpoint
tracepoint
watchpoint
catchpoint
exec_catchpoint
...
... which makes a lot more sense to me.
I've left this patch as last in the series in case people want to
bikeshed on the naming.
"code" has a nice property that it's exactly as many letters as
"base", so this patch didn't require any reindentation. :-)
Change-Id: Id8dc06683a69fad80d88e674f65e826d6a4e3f66
This converts "ordinary" breakpoint to use vtable_breakpoint_ops.
Recall that an ordinary breakpoint is both the kind normally created
by users, and also a base class used by other classes.
I noticed a few spots in GDB that use "typedef enum". However, in C++
this isn't as useful, as the tag is automatically entered as a
typedef. This patch removes most uses of "typedef enum" -- the
exceptions being in some nat-* code I can't compile, and
glibc_thread_db.h, which I think is more or less a copy of some C code
from elsewhere.
Tested by rebuilding.
Replace with calls to blockvector::blocks, and the appropriate method
call on the returned array_view.
Change-Id: I04d1f39603e4d4c21c96822421431d9a029d8ddd
Same idea as previous patch, but for symtab::objfile. I find
it clearer without this wrapper, as it shows that the objfile is
common to all symtabs of a given compunit. Otherwise, you could think
that each symtab (of a given compunit) can have a specific objfile.
Change-Id: Ifc0dbc7ec31a06eefa2787c921196949d5a6fcc6
symtab::blockvector is a wrapper around compunit_symtab::blockvector.
It is a bit misleadnig, as it gives the impression that a symtab has a
blockvector. Remove it, change all users to fetch the blockvector
through the compunit instead.
Change-Id: Ibd062cd7926112a60d52899dff9224591cbdeebf
Move 'struct reggroup' into the reggroups.h header. Remove the
reggroup_name and reggroup_type accessor functions, and just use the
name/type member functions within 'struct reggroup', update all uses
of these removed functions.
There should be no user visible changes after this commit.
Add a new function gdbarch_reggroups that returns a reference to a
vector containing all the reggroups for an architecture.
Make use of this function throughout GDB instead of the existing
reggroup_next and reggroup_prev functions.
Finally, delete the reggroup_next and reggroup_prev functions.
Most of these changes are pretty straight forward, using range based
for loops instead of the old style look using reggroup_next. There
are two places where the changes are less straight forward.
In gdb/python/py-registers.c, the register group iterator needed to
change slightly. As the iterator is tightly coupled to the gdbarch, I
just fetch the register group vector from the gdbarch when needed, and
use an index counter to find the next item from the vector when
needed.
In gdb/tui/tui-regs.c the tui_reg_next and tui_reg_prev functions are
just wrappers around reggroup_next and reggroup_prev respectively.
I've just inlined the logic of the old functions into the tui
functions. As the tui function had its own special twist (wrap around
behaviour) I think this is OK.
There should be no user visible changes after this commit.
While working on the disassembler I was getting frustrated. Every
time I touched disasm.h it seemed like every file in GDB would need to
be rebuilt. Surely the disassembler can't be required by that many
parts of GDB, right?
Turns out that disasm.h is included in target.h, so pretty much every
file was being rebuilt!
The only thing from disasm.h that target.h needed is the
gdb_disassembly_flag enum, as this is part of the target_ops api.
In this commit I move gdb_disassembly_flag into its own file. This is
then included in target.h and disasm.h, after which, the number of
files that depend on disasm.h is much reduced.
I also audited all the other includes of disasm.h and found that the
includes in mep-tdep.c and python/py-registers.c are no longer needed,
so I've removed these.
Now, after changing disasm.h, GDB rebuilds much quicker.
There should be no user visible changes after this commit.
Now that filtered and unfiltered output can be treated identically, we
can unify the printf family of functions. This is done under the name
"gdb_printf". Most of this patch was written by script.