gdb: use python to colorize disassembler output

This commit adds styling support to the disassembler output.  As such,
two new commands are added to GDB:

  set style disassembler enabled on|off
  show style disassembler enabled

In this commit I make use of the Python Pygments package to provide
the styling.  I did investigate making use of libsource-highlight;
however, I found the highlighting results to be inferior to those of
Pygments: only some mnemonics were highlighted, and highlighting of
register names such as r9d and r8d (on x86-64) was incorrect.

To enable disassembler highlighting via Pygments, I've added a new
extension language hook, which is then implemented for Python.  This
hook is very similar to the existing hook for source code
colorization.

One possibly odd choice I made with the new hook is to pass a
gdb.Architecture through, even though this is currently unused.  The
reason this argument is not used is that, currently, styling is
performed identically for all architectures.

However, even though the Python function used to perform styling of
disassembly output is not part of any documented API, I don't want
to close the door on a user overriding this function to provide
architecture-specific styling.  To do this, the user would inevitably
require access to the gdb.Architecture, and so I decided to add this
argument now.
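
As a rough illustration of the kind of override this leaves room for,
here is a minimal sketch.  It is not part of this commit.  The
prefix-to-lexer table and the pick_lexer_name helper are hypothetical,
and the choice of the "nasm" lexer for i386 is only a placeholder (it
would suit Intel-flavor output at best).  The sketch assumes the
override is installed by assigning it to gdb.colorize_disasm, which is
the name the C++ hook in this commit looks up.

```python
try:
    from pygments import formatters, highlight, lexers

    HAVE_PYGMENTS = True
except ImportError:
    HAVE_PYGMENTS = False

# Hypothetical table: architecture-name prefix -> Pygments lexer alias.
LEXER_BY_ARCH_PREFIX = {"i386": "nasm"}


def pick_lexer_name(arch_name):
    """Return a lexer alias for ARCH_NAME, defaulting to "asm"."""
    for prefix, lexer_name in LEXER_BY_ARCH_PREFIX.items():
        if arch_name.startswith(prefix):
            return lexer_name
    return "asm"


def colorize_disasm(content, gdbarch):
    """Style CONTENT (bytes); return bytes, or None to leave it unstyled."""
    if not HAVE_PYGMENTS:
        return None
    # As in the built-in hook, never let an error escape.
    try:
        lexer = lexers.get_lexer_by_name(pick_lexer_name(gdbarch.name()))
        formatter = formatters.TerminalFormatter()
        return highlight(content, lexer, formatter).rstrip().encode()
    except Exception:
        return None


# Inside GDB (assumption: the hook is looked up on the gdb module):
#   import gdb
#   gdb.colorize_disasm = colorize_disasm
```

Returning None keeps the contract of the built-in function, so GDB
falls back to its own behavior whenever the override cannot style the
text.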

The styling is applied within gdb_disassembler::print_insn.  To achieve
this, gdb_disassembler now writes its output into a temporary buffer,
and styling is then applied to the contents of this buffer.  Finally,
the gdb_disassembler buffer is copied out to its final destination
stream.
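
The buffer-then-style flow can be sketched in Python as follows.  This
is only an illustration of the idea; the real code is C++ inside
gdb_disassembler::print_insn, and the function and parameter names
below are hypothetical.

```python
import io


def print_insn_styled(emit_insn, colorize, dest):
    """Emit one instruction into a temporary buffer, style it, copy it out.

    emit_insn(stream) writes the raw disassembly text and returns the
    instruction length; colorize(text) returns styled text, or None when
    no styling could be applied.
    """
    buffer = io.StringIO()
    length = emit_insn(buffer)

    text = buffer.getvalue()
    styled = colorize(text)

    # Fall back to the unstyled text if the styling hook declined.
    dest.write(text if styled is None else styled)
    return length
```

Buffering the whole instruction first means the styling code sees the
complete text, rather than the fragments the disassembler emits one
piece at a time.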

There's a new test to check that the disassembler output includes some
escape sequences, though I don't check for specific colors; the
precise colors will depend on which instructions are in the
disassembler output and, I guess, on how pygments is configured.
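
The check described above amounts to looking for any styling escape
sequence without pinning down its color.  A minimal sketch of that
idea, which is not the testsuite code itself and uses a hypothetical
helper name, could be:

```python
import re

# Matches ANSI SGR sequences such as "\x1b[32m" or "\x1b[39;49;00m",
# whatever colors or attributes they select.
SGR_ESCAPE = re.compile(r"\x1b\[[0-9;]*m")


def has_style_escapes(text):
    """Return True if TEXT contains at least one SGR escape sequence."""
    return SGR_ESCAPE.search(text) is not None
```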

The only negative change with this commit concerns how we currently
style addresses in GDB.

Currently, when the disassembler wants to print an address, we call
back into GDB, and GDB prints the address value using the `address`
styling, and the symbol name using `function` styling.  After this
commit, if pygments is used, then all disassembler styling is done
through pygments, and this includes the address and symbol name parts
of the disassembler output.

I don't know how much of an issue this will be for people.  There's
already some precedent for this in GDB when we look at source styling.
For example, function names in styled source listings are not styled
using the `function` style; instead, either GNU Source Highlight or
pygments gets to decide how the function name should be styled.

If the Python pygments library is not present then GDB will continue
to behave as it always has: the disassembler output is mostly
unstyled, but the address and symbols are styled using the `address`
and `function` styles, as they are today.

However, if the user does `set style disassembler enabled off`, then
all disassembler styling is switched off.  This obviously covers the
use of pygments, but also includes the minimal styling done by GDB
when pygments is not available.
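
The three outcomes described above can be summarized as a small
decision function.  This is a sketch with hypothetical names, not code
from GDB:

```python
from enum import Enum


class DisasmStyling(Enum):
    NONE = "none"
    PYGMENTS = "pygments"
    ADDRESS_AND_FUNCTION = "address-and-function"


def disasm_styling_mode(disassembler_style_enabled, pygments_available):
    """Pick the styling applied to disassembler output."""
    if not disassembler_style_enabled:
        # `set style disassembler enabled off` disables everything.
        return DisasmStyling.NONE
    if pygments_available:
        # Pygments styles the whole line, addresses and symbols included.
        return DisasmStyling.PYGMENTS
    # No Pygments: only the `address` and `function` styles are applied.
    return DisasmStyling.ADDRESS_AND_FUNCTION
```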
Andrew Burgess 2021-10-25 17:26:57 +01:00 committed by Andrew Burgess
parent 20ea3acc72
commit e867795e8b
13 changed files with 352 additions and 3 deletions


@ -264,7 +264,20 @@ try:
        except:
            return None

    def colorize_disasm(content, gdbarch):
        # Don't want any errors.
        try:
            lexer = lexers.get_lexer_by_name("asm")
            formatter = formatters.TerminalFormatter()
            return highlight(content, lexer, formatter).rstrip().encode()
        except:
            return None

except:

    def colorize(filename, contents):
        return None

    def colorize_disasm(content, gdbarch):
        return None


@ -121,6 +121,8 @@ static enum ext_lang_rc gdbpy_before_prompt_hook
  (const struct extension_language_defn *, const char *current_gdb_prompt);
static gdb::optional<std::string> gdbpy_colorize
  (const std::string &filename, const std::string &contents);
static gdb::optional<std::string> gdbpy_colorize_disasm
  (const std::string &content, gdbarch *gdbarch);
/* The interface between gdb proper and loading of python scripts. */
@ -162,6 +164,8 @@ static const struct extension_language_ops python_extension_ops =
  gdbpy_get_matching_xmethod_workers,
  gdbpy_colorize,
  gdbpy_colorize_disasm,
};
#endif /* HAVE_PYTHON */
@ -1213,6 +1217,69 @@ gdbpy_colorize (const std::string &filename, const std::string &contents)
  return std::string (PyBytes_AsString (result.get ()));
}

/* This is the extension_language_ops.colorize_disasm "method". */

static gdb::optional<std::string>
gdbpy_colorize_disasm (const std::string &content, gdbarch *gdbarch)
{
  if (!gdb_python_initialized)
    return {};

  gdbpy_enter enter_py;

  if (gdb_python_module == nullptr
      || !PyObject_HasAttrString (gdb_python_module, "colorize_disasm"))
    return {};

  gdbpy_ref<> hook (PyObject_GetAttrString (gdb_python_module,
                                            "colorize_disasm"));
  if (hook == nullptr)
    {
      gdbpy_print_stack ();
      return {};
    }

  if (!PyCallable_Check (hook.get ()))
    return {};

  gdbpy_ref<> content_arg (PyBytes_FromString (content.c_str ()));
  if (content_arg == nullptr)
    {
      gdbpy_print_stack ();
      return {};
    }

  gdbpy_ref<> gdbarch_arg (gdbarch_to_arch_object (gdbarch));
  if (gdbarch_arg == nullptr)
    {
      gdbpy_print_stack ();
      return {};
    }

  gdbpy_ref<> result (PyObject_CallFunctionObjArgs (hook.get (),
                                                    content_arg.get (),
                                                    gdbarch_arg.get (),
                                                    nullptr));
  if (result == nullptr)
    {
      gdbpy_print_stack ();
      return {};
    }

  if (result == Py_None)
    return {};

  if (!PyBytes_Check (result.get ()))
    {
      PyErr_SetString (PyExc_TypeError,
                       _("Return value from gdb.colorize_disasm should be a bytes object or None."));
      gdbpy_print_stack ();
      return {};
    }

  return std::string (PyBytes_AsString (result.get ()));
}
/* Printing. */