Adds two new external authors to etc/update-copyright.py to cover
bfd/ax_tls.m4, and adds gprofng to dirs handled automatically, then
updates copyright messages as follows:
1) Update cgen/utils.scm emitted copyrights.
2) Run "etc/update-copyright.py --this-year" with an extra external
author I haven't committed, 'Kalray SA.', to cover gas testsuite
files (which should have their copyright message removed).
3) Build with --enable-maintainer-mode --enable-cgen-maint=yes.
4) Check out */po/*.pot which we don't update frequently.
This patch makes the BPF assembler to support double-slash line
comments, like the llvm BPF assembler does. At this point both
assemblers support the same commenting styles:
- Line comments preceded by # or //.
- Non-nestable block comments delimited by /* and */.
This patch also adds a couple of tests to make sure all the comment
styles work in both normal and pseudoc syntax. The manual is also
updated to mention double-slash line comments.
To support the "pseudo-C" asm dialect in BPF, the BPF parser must often
attempt multiple different templates for a single instruction. In some
cases this can cause the parser to incorrectly parse part of the
instruction opcode as an expression, which leads to the creation of a
new undefined symbol.
Once the parser recognizes the error, the expression is discarded and it
tries again with a new instruction template. However, symbols created
during the process are added to the symbol table and are not removed
even if the expression is discarded.
This is a problem for BPF: generally the assembled object will be loaded
directly to the Linux kernel, without being linked. These erroneous
parser-created symbols are rejected by the kernel BPF loader, and the
entire object is refused.
This patch remedies the issue by tentatively creating symbols while
parsing instruction operands, and storing them in a temporary list
rather than immediately inserting them into the symbol table. Later,
after the parser is sure that it has correctly parsed the instruction,
those symbols are committed to the real symbol table.
This approach is modeled directly after Jan Beulich's patch for RISC-V:
commit 7a29ee2903
RISC-V: adjust logic to avoid register name symbols
Many thanks to Jan for recognizing the problem as similar, and pointing
me to that patch.
gas/
* config/tc-bpf.c (parsing_insn_operands): New.
(parse_expression): Set it here.
(deferred_sym_rootP, deferred_sym_lastP): New.
(orphan_sym_rootP, orphan_sym_lastP): New.
(bpf_parse_name): New.
(parse_error): Clear deferred symbol list on error.
(md_assemble): Clear parsing_insn_operands. Commit deferred
symbols to symbol table on successful parse.
* config/tc-bpf.h (md_parse_name): Define to...
(bpf_parse_name): ...this. New prototype.
* testsuite/gas/bpf/asm-extra-sym-1.s: New test source.
* testsuite/gas/bpf/asm-extra-sym-1.d: New test.
* testsuite/gas/bpf/bpf.exp: Run new test.
This patch adds support for EF_BPF_CPUVER bits in the ELF
machine-dependent header flags. These bits encode the BPF CPU
version for which the object file has been compiled for.
The BPF assembler is updated so it annotates the object files it
generates with these bits.
The BPF disassembler is updated so it honors EF_BPF_CPUVER to use the
appropriate ISA version if the user didn't specify an explicit ISA
version in the command line. Note that a value of zero in
EF_BPF_CPUVER is interpreted by the disassembler as "use the later
supported version" (the BPF CPU versions start with v1.)
The readelf utility is updated to pretty print EF_BPF_CPUVER when it
prints out the ELF header:
$ readelf -h a.out
ELF Header:
...
Flags: 0x4, CPU Version: 4
Tested in bpf-unknown-none.
include/ChangeLog:
2023-07-30 Jose E. Marchesi <jose.marchesi@oracle.com>
* elf/bpf.h (EF_BPF_CPUVER): Define.
* opcode/bpf.h (BPF_XBPF): Change from 0xf to 0xff so it fits in
EF_BPF_CPUVER.
binutils/ChangeLog:
2023-07-30 Jose E. Marchesi <jose.marchesi@oracle.com>
* readelf.c (get_machine_flags): Recognize and pretty print BPF
machine flags.
opcodes/ChangeLog:
2023-07-30 Jose E. Marchesi <jose.marchesi@oracle.com>
* bpf-dis.c: Initialize asm_bpf_version to -1.
(print_insn_bpf): Set BPF ISA version from the cpu version ELF
header flags if no explicit version set in the command line.
* disassemble.c (disassemble_init_for_target): Remove unused code.
gas/ChangeLog:
2023-07-30 Jose E. Marchesi <jose.marchesi@oracle.com>
* config/tc-bpf.h (elf_tc_final_processing): Define.
* config/tc-bpf.c (bpf_elf_final_processing): New function.
CGEN is cool, but the BPF architecture is simply too bizarre for it.
The weird way of BPF to handle endianness in instruction encoding, the
weird C-like alternative assembly syntax, the weird abuse of
multi-byte (or infra-byte) instruction fields as opcodes, the unusual
presence of opcodes beyond the first 32-bits of some instructions, are
all examples of what makes it a PITA to continue using CGEN for this
port. The bpf.cpu file is becoming so complex and so nested with
p-macros that it is very difficult to read, and quite challenging to
update. Also, every time we are forced to change something in CGEN to
accommodate BPF requirements (which is often) we have to do extensive
testing to make sure we do not break any other target using CGEN.
This is getting un-maintenable.
So I have decided to bite the bullet and revamp/rewrite the port so it
no longer uses CGEN. Overall, this involved:
* To remove the cpu/bpf.{cpu,opc} descriptions.
* To remove the CGEN generated files.
* To replace the CGEN generated opcodes table with a new hand-written
opcodes table for BPF.
* To replace the CGEN generated disassembler wih a new disassembler
that uses the new opcodes.
* To replace the CGEN generated assembler with a new assembler that uses the
new opcodes.
* To replace the CGEN generated simulator with a new simulator that uses the
new opcodes. [This is pushed in GDB in another patch.]
* To adapt the build systems to the new situation.
Additionally, this patch introduces some extensions and improvements:
* A new BPF relocation BPF_RELOC_BPF_DISP16 plus corresponding ELF
relocation R_BPF_GNU_64_16 are added to the BPF BFD port. These
relocations are used for section-relative 16-bit offsets used in
load/store instructions.
* The disassembler now has support for the "pseudo-c" assembly syntax of
BPF. What dialect to use when disassembling is controlled by a command
line option.
* The disassembler now has support for dumping instruction immediates in
either octal, hexadecimal or decimal. The used output base is controlled
by a new command-line option.
* The GAS BPF test suite has been re-structured and expanded in order to
test the disassembler pseudoc syntax support. Minor bugs have been also
fixed there. The assembler generic tests that were disabled for bpf-*-*
targets due to the previous implementation of pseudoc syntax are now
re-enabled. Additional tests have been added to test the new features of
the assembler. .dump files are no longer used.
* The linker BPF test suite has been adapted to the command line options
used by the new disassembler.
The result is very satisfactory. This patchs adds 3448 lines of code
and removes 10542 lines of code.
Tested in:
* Target bpf-unknown-none with 64-bit little-endian host and 32-bit
little-endian host.
* Target x86-64-linux-gnu with --enable-targets=all
Note that I have not tested in a big-endian host yet. I will do so
once this lands upstream so I can use the GCC compiler farm.
I have not included ChangeLog entries in this patch: these would be
massive and not very useful, considering this is pretty much a rewrite
of the port. I beg the indulgence of the global maintainers.
This patch adds support to the GNU assembler for an alternative
assembly syntax used in BPF. This syntax is C-like and very
unconventional for an assembly language, but it is generated by
clang/llvm and is also used in inline asm templates in kernel code, so
we ought to support it.
After this patch, the assembler is able to parse instructions in both
supported syntax: the normal assembly-like syntax and the pseudo-C
syntax. Instruction formats can be mixed in the source program: the
assembler recognizes the right syntax to use.
gas/ChangeLog:
2023-04-20 Guillermo E. Martinez <guillermo.e.martinez@oracle.com>
PR gas/29728
* config/tc-bpf.h (TC_EQUAL_IN_INSN): Define.
* config/tc-bpf.c (LEX_IS_SYMBOL_COMPONENT): Define.
(LEX_IS_WHITESPACE): Likewise.
(LEX_IS_NEWLINE): Likewise.
(LEX_IS_ARITHM_OP): Likewise.
(LEX_IS_STAR): Likewise.
(LEX_IS_CLSE_BR): Likewise.
(LEX_IS_OPEN_BR): Likewise.
(LEX_IS_EQUAL): Likewise.
(LEX_IS_EXCLA): Likewise.
(ST_EOI): Likewise.
(MAX_TOKEN_SZ): Likewise.
(init_pseudoc_lex): New function.
(md_begin): Call init_pseudoc_lex.
(valid_expr): New function.
(build_bpf_non_generic_load): Likewise.
(build_bpf_atomic_insn): Likewise.
(build_bpf_jmp_insn): Likewise.
(build_bpf_arithm_insn): Likewise.
(build_bpf_endianness): Likewise.
(build_bpf_load_store_insn): Likewise.
(look_for_reserved_word): Likewise.
(is_register): Likewise.
(is_cast): Likewise.
(get_token): Likewise.
(bpf_pseudoc_to_normal_syntax): Likewise.
(md_assemble): Try pseudo-C syntax if an instruction cannot be
parsed.
The newer update-copyright.py fixes file encoding too, removing cr/lf
on binutils/bfdtest2.c and ld/testsuite/ld-cygwin/exe-export.exp, and
embedded cr in binutils/testsuite/binutils-all/ar.exp string match.
The result of running etc/update-copyright.py --this-year, fixing all
the files whose mode is changed by the script, plus a build with
--enable-maintainer-mode --enable-cgen-maint=yes, then checking
out */po/*.pot which we don't update frequently.
The copy of cgen was with commit d1dd5fcc38ead reverted as that commit
breaks building of bfp opcodes files.
* read.c (s_nop): New function. Handles the .nop directive.
(potable): Add entry for "nop".
(s_nops): Code tidy.
* read.h (s_nop): Add prototype.
* config/tc-bpf.h (md_single_noop_insn): Define.
* config/tc-mmix.h (md_single_noop_insn): Define.
* config/tc-or1k.h (md_single_noop_insn): Define.
* config/tc-s12z.c (md_assemble): Preserve the input line pointer,
rather than corrupting it.
* write.c (relax_segment): Update error message regarding
non-absolute values passed to .fill and .nops.
* NEWS: Mention the new directive.
* doc/as.texi: Document the new directive.
* doc/internals.texi: Document the new internal macros used to
implement the new directive.
* testsuite/gas/all/nop.s: New test.
* testsuite/gas/all/nop.d: New test control file.
* testsuite/gas/all/gas.exp: Run the new test.
* testsuite/gas/elf/dwarf-5-nop-for-line-table.s: New test.
* testsuite/gas/elf/dwarf-5-nop-for-line-table.d: New test
control file.
* testsuite/gas/elf/elf.exp: Run the new test.
* testsuite/gas/i386/space1.l: Adjust expected output.