gas/
* tc-riscv.c (percent_op_*): Add support for %tlsdesc_hi,
%tlsdesc_load_lo, %tlsdesc_add_lo and %tlsdesc_call. percent_op_rtype
renamed to percent_op_relax_only as this matcher is extended to handle
jalr as well which is not R-type.
(riscv_ip): Apply the percent_op_relax_only rename and update comment.
(md_apply_fix): Add TLSDESC_* to relaxable list. Add TLSDESC_HI20 to
TLS relocation check list.
* testsuite/gas/riscv/tlsdesc.*: New test cases for TLSDESC relocation
generation.
opcodes/
* riscv-opc.c (riscv_opcodes): Add a new syntax for jalr with
%tlsdesc_call annotations.
Hi,
Commits af1bd77 and 3f4ff08 introduced the Pointer Authentication feature with internal names that don't match the actual feature name pauth. The new feature PAuth_LR introduced in Armv9.5-A is an extension of the PAuth feature of Armv8.3-A. Using a different naming for it not based on the formerly "PAC" would create confusion.
Regression tested on aarch64-none-elf, and no regression found.
Ok for binutils-master? I don't have commit access so I need someone to commit on my behalf.
Regards,
Matthieu.
From 58b38358b2788939d81f2df7f5fb4c64a31ae06e Mon Sep 17 00:00:00 2001
From: Matthieu Longo <matthieu.longo@arm.com>
Date: Fri, 23 Feb 2024 11:30:40 +0000
Subject: [PATCH] aarch64: rename internals related to PAuth feature to use
pauth in their naming for coherency
Commits af1bd77 and 3f4ff08 introduced the Pointer Authentication feature
with internal names that don't match the actual feature name pauth. The new
feature PAuth_LR introduced in Armv9.5-A is an extension of the PAuth feature
of Armv8.3-A. Using a different naming for it not based on the formerly "PAC"
would create confusion.
Next to code using %ymm<N> or %zmm<N> it is more natural to have .cfi_*
directives also reference those, not the corresponding %xmm<N>. Accept
their names as kind of aliases, i.e. resolving to the same numbers.
While extending the respective 64-bit testcase, also add %bnd<N> there
(should have happened right with 633789901c ["x86-64: Dwarf2 register
numbers for %bnd<N>"], sorry), requiring binutils/dwarf.c to be adjusted
accordingly as well.
While various other entries in version 003 of the spec aren't quite as
explicit (due to simply leaving the respective field blank), all three
have a clear IGNORED there. IOW they ought to be emitted with EVEX.W=0
by default (and respect -mevexwig=).
Most registers from a register-pair suffixed by .lo and .hi suffixes.
This was not the case of $r14 and $r15 since they are defined by the
ABI: $r14 is the frame pointer, and $r15 is used to return aggregates
from functions. We do not add aliases for $r12 (the stack pointer) and
$r13 (the tls register).
opcodes/ChangeLog:
* kvx-opc.c: Regenerate.
gas/ChangeLog:
* config/kvx-parse.h: Regenerate.
TCA instructions start with an X, this introduces ambiguities when it
comes to XOR (Is it the OR on the TCA or the XOR of the core?). For this
reason, we rename OR to IOR and XOR to EOR.
OR and XOR variants are still valid on KV3-1 and KV3-2. However, they
have been completely removed from KV4-1.
opcodes/ChangeLog:
* kvx-opc.c: Regenerate.
include/ChangeLog:
* opcode/kvx.h: Regenerate.
gas/ChangeLog:
* config/kvx-parse.h: Regenerate.
* testsuite/gas/kvx/kv3-1-insns-32.d: Regenerate.
* testsuite/gas/kvx/kv3-1-insns-32.s: Regenerate.
* testsuite/gas/kvx/kv3-1-insns-64.d: Regenerate.
* testsuite/gas/kvx/kv3-1-insns-64.s: Regenerate.
* testsuite/gas/kvx/kv3-2-insns-32.d: Regenerate.
* testsuite/gas/kvx/kv3-2-insns-32.s: Regenerate.
* testsuite/gas/kvx/kv3-2-insns-64.d: Regenerate.
* testsuite/gas/kvx/kv3-2-insns-64.s: Regenerate.
* testsuite/gas/kvx/kv4-1-insns-32.d: Regenerate.
* testsuite/gas/kvx/kv4-1-insns-32.s: Regenerate.
* testsuite/gas/kvx/kv4-1-insns-64.d: Regenerate.
* testsuite/gas/kvx/kv4-1-insns-64.s: Regenerate.
Hi,
[PATCH][Binutils] aarch64: Add support for the id_aa64isar3_el1 system register
AArch64 defines a read-only system register called id_aa64isar3_el1.
This patch also adds relevant tests.
Regression tested on the aarch64-none-elf and aarch64-none-linux-gnu targets
and no regressions was found.
Is this Ok for trunk? I do not have commit rights, if OK, can someone commit on my behalf?
Thanks,
Yury Khrustalev
From e42c835e8f2ee81150f498675f2faf108bbe79f8 Mon Sep 17 00:00:00 2001
From: Yury Khrustalev <yury.khrustalev@arm.com>
Date: Tue, 6 Feb 2024 11:05:39 +0000
Subject: [PATCH] [PATCH][Binutils] aarch64: Add support for the
id_aa64isar3_el1 system register
AArch64 defines a read-only system register called id_aa64isar3_el1.
This patch also adds relevant tests.
Regression tested on the aarch64-none-elf and aarch64-none-linux-gnu targets
and no regressions was found.
While necessary on the legacy encodings, the EVEX ones don't need it.
Even more so when they're available for 64-bit mode only, when the
legacy encodings have the attribute only for correctly handling things
in 16-bit mode.
Several years ago it was decided that SSE2AVX templates should not be
sensitive to -mvexwig= (upon my suggestion to consistently make all
sensitive as long as they don't require a specific setting of VEX.W).
Adjust the four that still are, switching to use of Vex128 at the same
time.
While e20298da05 ("Remove redundant Byte, Word, Dword and Qword from
insn templates") did so for Byte/Word/Dword/Qword, the same kind of
redundancy was left in place for a few 128-bit SIMD operands.
Albeit not being a currently valid BPF instruction, callx is generated
by both clang and GCC when BPF programs are compiled unoptimized.
Until now, GCC would emit it only whe using the experimental
compiler-testing cpu version xbpf, whereas clang would emit it from
v1. This patch makes GAS to accept callx also starting with cpu v1.
opcodes/ChangeLog
* bpf-opc.c: Move callx into the v1 BPF CPU variant.
gas/ChangeLog
* testsuite/gas/bpf/indcall-1-pseudoc.d: Do not select xbpf cpu
version.
* testsuite/gas/bpf/indcall-1.d: Likewise.
DBNZ instruction decrements its source register operand, and if
the result is non-zero it branches to the location defined by a signed
half-word displacement operand.
DBNZ instruction is in BRANCH class as other branch instrucitons
like B, Bcc, etc. However, DBNZ is the only branch instruction
that stores a branch offset in the second operand. Thus it must
be placed in a distinct class and treated differently.
For example, current logic of arc_insn_get_branch_target in GDB
assumes that a branch offset is always stored in the first operand
for BRANCH class and it's wrong for DBNZ.
include/ChangeLog:
2024-02-14 Yuriy Kolerov <ykolerov@synopsys.com>
* opcode/arc.h (enum insn_class_t): Add DBNZ class.
opcodes/ChangeLog:
2024-02-14 Yuriy Kolerov <ykolerov@synopsys.com>
* arc-tbl.h (dbnz): Use "DBNZ" class.
* arc-dis.c (arc_opcode_to_insn_type): Handle "DBNZ" class.
gas/ChangeLog:
2024-02-14 Yuriy Kolerov <ykolerov@synopsys.com>
* config/tc-arc.c (is_br_jmp_insn_p): Add check against "DBNZ".
Don't wander into three_byte_table[] when REX2 is present.
While there also eliminate related confusion when accessing
dis386_twobyte[]: There's nothing 3-byte-ish involved there. Dropping
the odd variable gets things better in sync with 1-byte handling as
well.
Interestingly unlike VROUND{P,S}{S,D} and VPERM{F,I}128 they weren't
even present in the x86-64-apx-egpr-inval testcase, hence why I
overlooked that these can actually be encoded, (again) using suitable
AVX512 counterparts.
While there also "modernize" the adjacent AVX/AVX2 entries.
Already the %bnd<N> registers used numbers beyond 127, and eGPR ones are
all out of reach for "signed char", at least when CHAR_BITS=8. Switch to
"unsigned char", covering appropriately in places where the value
returned for "none" actually matters (in tc_x86_parse_to_dw2regnum()
this is actually achieved by altering how X_op is set).
There are no legacy ldind nor ldabs BPF instructions with BPF_SIZE_DW.
For some reason we were (incorrectly) supporting these. This patch
updates the opcodes so the instructions get removed and modifies the
GAS manual and testsuite accordingly.
See discussion at
https://lore.kernel.org/bpf/110aad7a-f8a3-46ed-9fda-2f8ee54dcb89@linux.dev
Tested in bpf-uknonwn-none target, x86-64-linux-gnu host.
include/ChangeLog:
2024-01-29 Jose E. Marchesi <jose.marchesi@oracle.com>
* opcode/bpf.h (enum bpf_insn_id): Remove BPF_INSN_LDINDDW and
BPF_INSN_LDABSDW instructions.
opcodes/ChangeLog:
2024-01-29 Jose E. Marchesi <jose.marchesi@oracle.com>
* bpf-opc.c (bpf_opcodes): Remove BPF_INSN_LDINDDW and
BPF_INSN_LDABSDW instructions.
gas/ChangeLog:
2024-01-29 Jose E. Marchesi <jose.marchesi@oracle.com>
* doc/c-bpf.texi (BPF Instructions): There is no indirect 64-bit
load instruction.
(BPF Instructions): There is no absolute 64-bit load instruction.
* testsuite/gas/bpf/mem.s: Update test accordingly.
* testsuite/gas/bpf/mem-be-pseudoc.d: Likewise.
* testsuite/gas/bpf/mem-be.d: Likewise.
* testsuite/gas/bpf/mem-pseudoc.d: Likewise.
* testsuite/gas/bpf/mem-pseudoc.s: Likewise.
* testsuite/gas/bpf/mem.d: Likewise.
* testsuite/gas/bpf/mem.s: Likewise.
SHA512 instructions were added to the architecture at the same time as SHA3
instructions, but later than the SHA1 and SHA256 instructions. Furthermore,
implementations must support either both or neither of the SHA512 and SHA3
instruction sets. However, SHA512 instructions were originally (and
incorrectly) added to Binutils under the +sha2 flag.
This patch moves SHA512 instructions under the +sha3 flag, which matches the
architecture constraints and existing GCC and LLVM behaviour.
Re-using the entire VEX decode hierarchy for the respective major opcode
has led to those two also being decoded as-if valid. Follow the earlier
USE_X86_64_EVEX_{PFX,W}_TABLE approach to avoid this happening.
As suggested during review already, all such entries have their first
slot as Bad_Opcode, so by adding two more enumerators we can avoid doing
that decode step altogether.
With identical source and destination it can be covered by the NDD-to-
legacy conversion logic as well, even if in this case the original insn
doesn't use an NDD encoding. The size savings are even better here, for
the replacement (BSWAP) not having a ModR/M byte.
GCC 14 will detect when the size and count arguments of calloc are
swapped.
binutils-gdb/opcodes/tic4x-dis.c: In function ‘tic4x_disassemble’:
binutils-gdb/opcodes/tic4x-dis.c:710:32: error: ‘xcalloc’ sizes specified with ‘sizeof’ in the earlier argument and not in the later argument [-Werror=calloc-transposed-args]
710 | optab = xcalloc (sizeof (tic4x_inst_t *), (1 << TIC4X_HASH_SIZE));
| ^~~~~~~~~~~~
binutils-gdb/opcodes/tic4x-dis.c:710:32: note: earlier argument should specify number of elements, later size of each element
binutils-gdb/opcodes/tic4x-dis.c:712:40: error: ‘xcalloc’ sizes specified with ‘sizeof’ in the earlier argument and not in the later argument [-Werror=calloc-transposed-args]
712 | optab_special = xcalloc (sizeof (tic4x_inst_t *), TIC4X_SPESOP_SIZE);
| ^~~~~~~~~~~~
binutils-gdb/opcodes/tic4x-dis.c:712:40: note: earlier argument should specify number of elements, later size of each element
opcodes/ChangeLog:
* /tic4x-dis.c (tic4x_disassemble): Swap size and count xcalloc
arguments.
VRNDSCALE{P,S}{S,D} is the AVX512 generalization of these AVX insns. As
long as the immediate has the top 4 bits clear, they are equivalent to
the earlier VEX-encoded insns, and hence can be used to permit use of
eGPR-s in the memory operand. Since this is the normal way of using
these insns, also alter the resulting diagnostic to complain about the
immediate, not the eGPR use.
When there's a suitably disambiguating register operand, suffixes are
generally omitted (unless in suffix-always mode). All NDD insns have a
suitable register operand, so they shouldn't have suffixes by default.
Along with the relevant unit-tests, this adds the following rcpc3
instructions:
STL1 { <Vt>.D }[<index>], [<Xn|SP>]
LDAP1 { <Vt>.D }[<index>], [<Xn|SP>]
LDAPUR <Bt>, [<Xn|SP>{, #<simm>}]
LDAPUR <Ht>, [<Xn|SP>{, #<simm>}]
LDAPUR <St>, [<Xn|SP>{, #<simm>}]
LDAPUR <Dt>, [<Xn|SP>{, #<simm>}]
LDAPUR <Qt>, [<Xn|SP>{, #<simm>}]
STLUR <Bt>, [<Xn|SP>{, #<simm>}]
STLUR <Ht>, [<Xn|SP>{, #<simm>}]
STLUR <St>, [<Xn|SP>{, #<simm>}]
STLUR <Dt>, [<Xn|SP>{, #<simm>}]
STLUR <Qt>, [<Xn|SP>{, #<simm>}]
with `#<simm>' taking on a signed 8-bit integer value in the range
[-256,255] and `index' the values 0 or 1.
Co-authored-by: Srinath Parvathaneni <srinath.parvathaneni@arm.com>
Given the introduction of the new address operand types for rcpc3
instructions, this patch adds the necessary logic to teach
`general_constraint_met_p` how to proper handle these.
The particular choices of address indexing, along with their encoding
for RCPC3 instructions lead to the requirement of a new set of operand
descriptions, along with the relevant inserter/extractor set.
That is, for the integer load/stores, there is only a single valid
indexing offset quantity and offset mode is allowed - The value is
always equivalent to the amount of data read/stored by the
operation and the offset is post-indexed for Load-Acquire RCpc, and
pre-indexed with writeback for Store-Release insns.
This indexing quantity/mode pair is selected by the setting of a
single bit in the instruction. To represent these insns, we add the
following operand types:
- AARCH64_OPND_RCPC3_ADDR_OPT_POSTIND
- AARCH64_OPND_RCPC3_ADDR_OPT_PREIND_WB
In the case of loads and stores involving SIMD/FP registers, the
optional offset is encoded as an 8-bit signed immediate, but neither
post-indexing or pre-indexing with writeback is available. This
created the need for an operand type similar to
AARCH64_OPND_ADDR_OFFSET, with the difference that FLD_index should
not be checked.
We thus introduce the AARCH64_OPND_RCPC3_ADDR_OFFSET operand, a
variant of AARCH64_OPND_ADDR_OFFSET, w/o the FLD_index bitfield.
Beyond the need to encode any registers involved in data transfer and
the address base register for load/stores, it is necessary to specify
the data register addressing mode and whether the address register is
to be pre/post-indexed, whereby loads may be post-indexed and stores
pre-indexed with write-back.
The use of a single bit to specify both the indexing mode and indexing
value requires a novel function be written to accommodate this for
address operand insertion in assembly and another for extraction in
disassembly, along with the definition of two insn fields for use with
these instructions.
This therefore defines the following functions:
- aarch64_ins_rcpc3_addr_opt_offset
- aarch64_ins_rcpc3_addr_offset
- aarch64_ext_rcpc3_addr_opt_offset
- aarch64_ext_rcpc3_addr_offset
It extends the `do_special_{encoding|decoding}' functions and defines
two rcpc3 instruction fields:
- FLD_opc2
- FLD_rcpc3_size
The allowed immediate offsets in integer rcpc3 load store instructions
are not encoded explicitly in the instruction itself, being rather
implicitly equivalent to the amount of data loaded/stored by the
instruction.
This leads to the requirement that this quantity be calculated based on
the number of registers involved in the transfer, either as data
source or destination registers and their respective qualifiers.
This is done via `calc_ldst_datasize (const aarch64_opnd_info *opnds)'
implemented here, using a cumulative sum of qualifier sizes preceding
the address operand in the OPNDS operand list argument.
Indicating the presence of the Armv8.2-a feature adding further
support for the Release Consistency Model, the `+rcpc3' architectural
extension flag is added to the list of possible `-march' options in
Binutils, together with the necessary macro for encoding rcpc3
instructions.
There are some tlbi operations that don't have a corresponding tlbip operation,
but we were incorrectly using the same list for both. Add the missing tlbi
*nxs operations, and use the F_REG_128 flag to filter tlbi operations that
don't have a tlbip analogue. For increased clarity, I have also used a macro
to reduce duplication between the 'nxs' and non-'nxs' variants, and added a
test to verify that no invalid combinations are accepted.
Additionally, fix two missing checks for AARCH64_OPND_SYSREG_TLBIP that were
preventing disassembly of tlbip instructions.