gcc/contrib/unicode/gen-printable-chars.py
David Malcolm 4f01ae3761 diagnostics: add support for "text art" diagrams
Existing text output in GCC has to be implemented by writing
sequentially to a pretty_printer instance.  This makes it
hard to implement some kinds of diagnostic output (see e.g.
diagnostic-show-locus.cc).

This patch adds more flexible ways of creating text output:
- a canvas class, which can be "painted" to via random-access (rather
that sequentially)
- a table class for 2D grid layout, supporting items that span
multiple rows/columns
- a widget class for organizing diagrams hierarchically.

The patch also expands GCC's diagnostics subsystem so that diagnostics
can have "text art" diagrams - think ASCII art, but potentially
including some Unicode characters, such as box-drawing chars.

The new code is in a new "gcc/text-art" subdirectory and "text_art"
namespace.

The patch adds a new "-fdiagnostics-text-art-charset=VAL" option, with
values:
- "none": don't emit diagrams (added to -fdiagnostics-plain-output)
- "ascii": use pure ASCII in diagrams
- "unicode": allow for conservative use of unicode drawing characters
(such as box-drawing characters).
- "emoji" (the default): as "unicode", but potentially allow for
conservative use of emoji in the output (such as U+26A0 WARNING SIGN).
I made it possible to disable emoji separately from unicode as I believe
there's a generation gap in acceptance of these characters (some older
programmers have a visceral reaction against them, whereas younger
programmers may have no problem with them).

Diagrams are emitted to stderr by default.  With SARIF output they are
captured as a location in "relatedLocations", with the diagram as a
code block in Markdown within a "markdown" property of a message.

This patch doesn't add any such diagram usage to GCC, saving that for
followups, apart from adding a plugin to the test suite to exercise the
functionality.

contrib/ChangeLog:
	* unicode/gen-box-drawing-chars.py: New file.
	* unicode/gen-combining-chars.py: New file.
	* unicode/gen-printable-chars.py: New file.

gcc/ChangeLog:
	* Makefile.in (OBJS-libcommon): Add text-art/box-drawing.o,
	text-art/canvas.o, text-art/ruler.o, text-art/selftests.o,
	text-art/style.o, text-art/styled-string.o, text-art/table.o,
	text-art/theme.o, and text-art/widget.o.
	* color-macros.h (COLOR_FG_BRIGHT_BLACK): New.
	(COLOR_FG_BRIGHT_RED): New.
	(COLOR_FG_BRIGHT_GREEN): New.
	(COLOR_FG_BRIGHT_YELLOW): New.
	(COLOR_FG_BRIGHT_BLUE): New.
	(COLOR_FG_BRIGHT_MAGENTA): New.
	(COLOR_FG_BRIGHT_CYAN): New.
	(COLOR_FG_BRIGHT_WHITE): New.
	(COLOR_BG_BRIGHT_BLACK): New.
	(COLOR_BG_BRIGHT_RED): New.
	(COLOR_BG_BRIGHT_GREEN): New.
	(COLOR_BG_BRIGHT_YELLOW): New.
	(COLOR_BG_BRIGHT_BLUE): New.
	(COLOR_BG_BRIGHT_MAGENTA): New.
	(COLOR_BG_BRIGHT_CYAN): New.
	(COLOR_BG_BRIGHT_WHITE): New.
	* common.opt (fdiagnostics-text-art-charset=): New option.
	(diagnostic-text-art.h): New SourceInclude.
	(diagnostic_text_art_charset) New Enum and EnumValues.
	* configure: Regenerate.
	* configure.ac (gccdepdir): Add text-art to loop.
	* diagnostic-diagram.h: New file.
	* diagnostic-format-json.cc (json_emit_diagram): New.
	(diagnostic_output_format_init_json): Wire it up to
	context->m_diagrams.m_emission_cb.
	* diagnostic-format-sarif.cc: Include "diagnostic-diagram.h" and
	"text-art/canvas.h".
	(sarif_result::on_nested_diagnostic): Move code to...
	(sarif_result::add_related_location): ...this new function.
	(sarif_result::on_diagram): New.
	(sarif_builder::emit_diagram): New.
	(sarif_builder::make_message_object_for_diagram): New.
	(sarif_emit_diagram): New.
	(diagnostic_output_format_init_sarif): Set
	context->m_diagrams.m_emission_cb to sarif_emit_diagram.
	* diagnostic-text-art.h: New file.
	* diagnostic.cc: Include "diagnostic-text-art.h",
	"diagnostic-diagram.h", and "text-art/theme.h".
	(diagnostic_initialize): Initialize context->m_diagrams and
	call diagnostics_text_art_charset_init.
	(diagnostic_finish): Clean up context->m_diagrams.m_theme.
	(diagnostic_emit_diagram): New.
	(diagnostics_text_art_charset_init): New.
	* diagnostic.h (text_art::theme): New forward decl.
	(class diagnostic_diagram): Likewise.
	(diagnostic_context::m_diagrams): New field.
	(diagnostic_emit_diagram): New decl.
	* doc/invoke.texi (Diagnostic Message Formatting Options): Add
	-fdiagnostics-text-art-charset=.
	(-fdiagnostics-plain-output): Add
	-fdiagnostics-text-art-charset=none.
	* gcc.cc: Include "diagnostic-text-art.h".
	(driver_handle_option): Handle OPT_fdiagnostics_text_art_charset_.
	* opts-common.cc (decode_cmdline_options_to_array): Add
	"-fdiagnostics-text-art-charset=none" to expanded_args for
	-fdiagnostics-plain-output.
	* opts.cc: Include "diagnostic-text-art.h".
	(common_handle_option): Handle OPT_fdiagnostics_text_art_charset_.
	* pretty-print.cc (pp_unicode_character): New.
	* pretty-print.h (pp_unicode_character): New decl.
	* selftest-run-tests.cc: Include "text-art/selftests.h".
	(selftest::run_tests): Call text_art_tests.
	* text-art/box-drawing-chars.inc: New file, generated by
	contrib/unicode/gen-box-drawing-chars.py.
	* text-art/box-drawing.cc: New file.
	* text-art/box-drawing.h: New file.
	* text-art/canvas.cc: New file.
	* text-art/canvas.h: New file.
	* text-art/ruler.cc: New file.
	* text-art/ruler.h: New file.
	* text-art/selftests.cc: New file.
	* text-art/selftests.h: New file.
	* text-art/style.cc: New file.
	* text-art/styled-string.cc: New file.
	* text-art/table.cc: New file.
	* text-art/table.h: New file.
	* text-art/theme.cc: New file.
	* text-art/theme.h: New file.
	* text-art/types.h: New file.
	* text-art/widget.cc: New file.
	* text-art/widget.h: New file.

gcc/testsuite/ChangeLog:
	* gcc.dg/plugin/diagnostic-test-text-art-ascii-bw.c: New test.
	* gcc.dg/plugin/diagnostic-test-text-art-ascii-color.c: New test.
	* gcc.dg/plugin/diagnostic-test-text-art-none.c: New test.
	* gcc.dg/plugin/diagnostic-test-text-art-unicode-bw.c: New test.
	* gcc.dg/plugin/diagnostic-test-text-art-unicode-color.c: New test.
	* gcc.dg/plugin/diagnostic_plugin_test_text_art.c: New test plugin.
	* gcc.dg/plugin/plugin.exp (plugin_test_list): Add them.

libcpp/ChangeLog:
	* charset.cc (get_cppchar_property): New function template, based
	on...
	(cpp_wcwidth): ...this function.  Rework to use the above.
	Include "combining-chars.inc".
	(cpp_is_combining_char): New function
	Include "printable-chars.inc".
	(cpp_is_printable_char): New function
	* combining-chars.inc: New file, generated by
	contrib/unicode/gen-combining-chars.py.
	* include/cpplib.h (cpp_is_combining_char): New function decl.
	(cpp_is_printable_char): New function decl.
	* printable-chars.inc: New file, generated by
	contrib/unicode/gen-printable-chars.py.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2023-06-21 21:49:00 -04:00

77 lines
2.3 KiB
Python
Executable file

#!/usr/bin/env python3
#
# Script to generate libcpp/printable-chars.inc
#
# This file is part of GCC.
#
# GCC is free software; you can redistribute it and/or modify it under
# the terms of the GNU General Public License as published by the Free
# Software Foundation; either version 3, or (at your option) any later
# version.
#
# GCC is distributed in the hope that it will be useful, but WITHOUT ANY
# WARRANTY; without even the implied warranty of MERCHANTABILITY or
# FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
# for more details.
#
# You should have received a copy of the GNU General Public License
# along with GCC; see the file COPYING3. If not see
# <http://www.gnu.org/licenses/>. */
from pprint import pprint
import unicodedata
def is_printable_char(code_point) -> bool:
category = unicodedata.category(chr(code_point))
# "Cc" is "control" and "Cf" is "format"
return category[0] != 'C'
class Range:
def __init__(self, start, end, value):
self.start = start
self.end = end
self.value = value
def __repr__(self):
return f'Range({self.start:x}, {self.end:x}, {self.value})'
def make_ranges(value_callback):
ranges = []
for code_point in range(0x10FFFF):
value = is_printable_char(code_point)
if 0:
print(f'{code_point=:x} {value=}')
if ranges and ranges[-1].value == value:
# Extend current range
ranges[-1].end = code_point
else:
# Start a new range
ranges.append(Range(code_point, code_point, value))
return ranges
ranges = make_ranges(is_printable_char)
if 0:
pprint(ranges)
print(f"/* Generated by contrib/unicode/gen-printable-chars.py")
print(f" using version {unicodedata.unidata_version}"
" of the Unicode standard. */")
print("\nstatic const cppchar_t printable_range_ends[] = {", end="")
for i, r in enumerate(ranges):
if i % 8:
print(" ", end="")
else:
print("\n ", end="")
print("0x%x," % r.end, end="")
print("\n};\n")
print("static const bool is_printable[] = {", end="")
for i, r in enumerate(ranges):
if i % 24:
print(" ", end="")
else:
print("\n ", end="")
if r.value:
print("1,", end="")
else:
print("0,", end="")
print("\n};")