Pre- and post-increment/decrement are side effects, the behavior of
which is undefined when combined with passing an address of the accessed
variable in the same function invocation. There's no need for the
increments here - simply adding 1 achieves the intended effect without
triggering compiler diagnostics (which are fatal with -Werror).
FENCE.TSO isn't an alias. ZIP and UNZIP in the long run likely are, but
presently they aren't. This fixes disassembly of these insns with
-Mno-aliases.
For disassembly to pick up aliases in favor of underlying insns (helping
readability in the common case), the aliases need to come ahead of the
"base" insns. Slightly more code movement is needed because of insns
with the same name needing to stay next to each other.
Note that the "rorw" alias entry also has the missing INSN_ALIAS added
here.
Clone a few testcases to exercise -Mno-aliases some more, better
covering the differences between the default and that disassembly mode.
With the command in the rule merely being "echo", i386-tbl.h won't be
rebuilt if missing, when at the same time i386-init.h is present and
up-to-date. Use a pattern rule instead to express the multiple targets
correctly (the &: rule separator is supported only by GNU make 4.3 and
newer). Note that now, for the opposite case to work (i386-tbl.h is
up-to-date but i386-init.h is missing), i386-init.h also needs
mentioning as a dependency somewhere: Add a fake dependency for
i386-opc.lo ("fake" because i386-opc.c doesn't include that header).
At the same time use $(AM_V_GEN) in the actual rule, replacing the
earlier (open-coded) "echo". And while there also drop a duplicate
dependency of i386-gen.o on i386-opc.h.
While in some cases deriving an AT&T-style suffix from an Intel syntax
memory operand size specifier is necessary, in many cases this is not
only pointless, but has led to the introduction of various workarounds:
Excessive use of IgnoreSize and NoRex64 as well as the ToDword and
ToQword attributes. Suppress suffix derivation when we can clearly tell
that the memory operand's size isn't going to be needed to infer the
possible need for the low byte/word opcode bit or an operand size prefix
(0x66 or REX.W).
As a result ToDword and ToQword can be dropped entirely, plus a fair
number of IgnoreSize and NoRex64 can also be got rid of. Note that
IgnoreSize needs to remain on legacy encoded SIMD insns with GPR
operand, to avoid emitting an operand size prefix in 16-bit mode. (Since
16-bit code using SIMD insns isn't well tested, clone an existing
testcase just enough to cover a few insns which are potentially
problematic but are being touched here.)
Note that while folding the VCVT{,T}S{S,D}2SI templates, VCVT{,T}SH2SI
isn't included there. This is to fulfill the request of not allowing L
and Q suffixes there, despite the inconsistency with VCVT{,T}S{S,D}2SI.
This patch adds support for the Zawrs ISA extension
("wrs.nto" and "wrs.sto" instructions).
The specification can be found here:
https://github.com/riscv/riscv-zawrs/blob/main/zawrs.adoc
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
T-Head has a range of vendor-specific instructions.
Therefore it makes sense to group them into smaller chunks
in form of vendor extensions.
This patch adds the XTheadMemPair extension, a collection of T-Head specific
two-GP-register memory operations.
The 'th' prefix and the "XTheadMemPair" extension are documented in a PR
for the RISC-V toolchain conventions ([1]).
[1] https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/19
Co-developed-by: Lifang Xia <lifang_xia@linux.alibaba.com>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
This patch introduces support for arbitrary literal instruction
arguments, that are not encoded in the opcode.
A typical use case for this feature would be an instruction that
applies an implicit shift by a constant value on an immediate
(that is a real operand). With this patch it is possible to make
this shift visible in the dissasembly and support such artificial
parameter as part of the asssembly code.
Co-developed-by: Lifang Xia <lifang_xia@linux.alibaba.com>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
T-Head has a range of vendor-specific instructions.
Therefore it makes sense to group them into smaller chunks
in form of vendor extensions.
This patch adds the XTheadMemIdx extension, a collection of T-Head specific
GPR memory access instructions.
The 'th' prefix and the "XTheadMemIdx" extension are documented in a PR
for the RISC-V toolchain conventions ([1]).
In total XTheadCmo introduces the following 44 instructions
(BU,HU,WU only for loads (zero-extend instead of sign-extend)):
* {L,S}{D,W,WU,H,HU,B,BU}{IA,IB} rd, rs1, imm5, imm2
* {L,S}R{D,W,WU,H,HU,B,BU} rd, rs1, rs2, imm2
* {L,S}UR{D,W,WU,H,HU,B,BU} rd, rs1, rs2, imm2
[1] https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/19
Co-developed-by: Lifang Xia <lifang_xia@linux.alibaba.com>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
T-Head has a range of vendor-specific instructions.
Therefore it makes sense to group them into smaller chunks
in form of vendor extensions.
This patch adds the XTheadFMemIdx extension, a collection of
T-Head-specific floating-point memory access instructions.
The 'th' prefix and the "XTheadFMemIdx" extension are documented
in a PR for the RISC-V toolchain conventions ([1]).
[1] https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/19
Co-developed-by: Lifang Xia <lifang_xia@linux.alibaba.com>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
T-Head has a range of vendor-specific instructions.
Therefore it makes sense to group them into smaller chunks
in form of vendor extensions.
This patch adds the XTheadMac extension, a collection of
T-Head-specific multiply-accumulate instructions.
The 'th' prefix and the "XTheadMac" extension are documented
in a PR for the RISC-V toolchain conventions ([1]).
[1] https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/19
Co-developed-by: Lifang Xia <lifang_xia@linux.alibaba.com>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
T-Head has a range of vendor-specific instructions.
Therefore it makes sense to group them into smaller chunks
in form of vendor extensions.
This patch adds the XTheadCondMov extension, a collection of
T-Head-specific conditional move instructions.
The 'th' prefix and the "XTheadCondMov" extension are documented
in a PR for the RISC-V toolchain conventions ([1]).
[1] https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/19
Co-developed-by: Lifang Xia <lifang_xia@linux.alibaba.com>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
T-Head has a range of vendor-specific instructions.
Therefore it makes sense to group them into smaller chunks
in form of vendor extensions.
This patch adds the XThead{Ba,Bb,Bs} extensions, a collection of
T-Head-specific bitmanipulation instructions.
The 'th' prefix and the "XThead{Ba,Bb,Bs}" extension are documented
in a PR for the RISC-V toolchain conventions ([1]).
[1] https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/19
Co-developed-by: Lifang Xia <lifang_xia@linux.alibaba.com>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
This patch introduces support for arbitrary signed or unsigned immediate
encoding formats. The formats have the form "XsN@S" and "XuN@S" with N
being the number of bits and S the LSB position.
For example an immediate field of 5 bytes that encodes a signed value
and is stored in the bits 24-20 of the instruction word can use the
format specifier "Xs5@20".
Co-developed-by: Lifang Xia <lifang_xia@linux.alibaba.com>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
T-Head has a range of vendor-specific instructions.
Therefore it makes sense to group them into smaller chunks
in form of vendor extensions.
This patch adds the XTheadSync extension, a collection of
T-Head-specific multi-processor synchronization instructions.
The 'th' prefix and the "XTheadSync" extension are documented in a PR
for the RISC-V toolchain conventions ([1]).
[1] https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/19
Co-developed-by: Lifang Xia <lifang_xia@linux.alibaba.com>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
T-Head has a range of vendor-specific instructions.
Therefore it makes sense to group them into smaller chunks
in form of vendor extensions.
This patch adds the XTheadCmo extension, a collection of T-Head specific
cache management operations.
The 'th' prefix and the "XTheadCmo" extension are documented in a PR
for the RISC-V toolchain conventions ([1]).
In total XTheadCmo introduces the following 21 instructions:
* DCACHE.{C,CI,I}ALL
* DCACHE.{C,CI,I}{PA,VA,SW} rs1
* DCACHE.C{PAL1,VAL1} rs1
* ICACHE.I{ALL,ALLS}
* ICACHE.I{PA,VA} rs1
* L2CACHE.{C,CI,I}ALL
Contrary to Zicbom, the XTheadCmo instructions don't have a constant
displacement, therefore we have a different syntax for the arguments.
To clarify this is intended behaviour, there is a set of negative test
for Zicbom-style arguments in x-thead-cmo-fail.s.
[1] https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/19
v2:
- Add missing DECLARE_INSN() list
- Fix ordering
Co-developed-by: Lifang Xia <lifang_xia@linux.alibaba.com>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
There are a few operand types not used by any RISC-V instructions.
- Cx
- Vf
- Ve
- [
- ]
- b
But most of them has a reasoning to keep them:
- Cx : Same as "Ct" except it has a constraint to have rd == rs2
(similar to "Cw"). Although it hasn't used, its role is clear
enough to implement a new instruction with this operand type.
- Vf, Ve : Used by vector AMO instructions (not ratified and real
instructions are not upstreamed yet).
- [, ] : Unused tokenization symbols. Reserving them is not harmful
and a vendor may use this symbol for special purposes.
... except "b". I could not have found any reference to this operand type
except it works like the "s" operand type. Historically, it seems... it's
just unused from the beginning. Its role is not clear either.
On such cases, we should vacate this room for the new operand type with
much clearer roles.
opcodes/ChangeLog:
* riscv-dis.c (print_insn_args): Remove 'b' operand type.
Some components of GNU Binutils will pass "-Wstack-usage=262144" when
"GCC >= 5.0" is detected. However, Clang does not support "-Wstack-usage",
despite that related configuration part in bfd/warning.m4 handles the latest
Clang (15.0.0 as of this writing) as "GCC >= 5.0".
The option "-Wstack-usage" was ignored when the first version of Clang is
released but even this "ignoring" behavior is removed before Clang 4.0.0.
So, if we give Clang "-Wstack-usage=262144", it generates a warning, making
the build failure.
This commit checks "__clang__" macro to prevent adding the option if the
compiler is identified as Clang.
bfd/ChangeLog:
* warning.m4: Stop appending "-Wstack-usage=262144" option when
compiled with Clang.
* configure: Regenerate.
binutils/ChangeLog:
* configure: Regenerate.
gas/ChangeLog:
* configure: Regenerate.
gold/ChangeLog:
* configure: Regenerate.
gprof/ChangeLog:
* configure: Regenerate.
ld/ChangeLog:
* configure: Regenerate.
opcodes/ChangeLog:
* configure: Regenerate.
The -mfuture and -Mfuture options which are used for adding potential
new ISA instructions were not documented. They also lacked a bitmask
so new instructions could not be enabled by those options. Fixed.
binutils/
* doc/binutils.texi: Document -Mfuture.
gas/
* config/tc-ppc.c: Document -mfuture
* doc/c-ppc.texi: Likewise.
include/
* opcode/ppc.h (PPC_OPCODE_FUTURE): Define.
opcodes/
* ppc-dis.c (ppc_opts) <future>: Use it.
* ppc-opc.c (FUTURE): Define.
While PR binutils/29483 has now been addressed differently, this
originally proposed change still has its merits: Avoiding vsnprintf()
for typically far more than half of the overall output results in a 2-3%
performance gain in my testing (with debug builds of objdump, libbfd,
and libopcodes).
With that part of output no longer using staging_area[], the array also
doesn't need to be quite as large anymore (the largest presently used
size is 27, from "64-bit address is disabled").
While limiting the scope of "res" it became apparent that
- no caller cares about the function's return value,
- the comment about the return value was wrong,
- a particular positive return value would have been meaningless to the
caller.
Therefore convert the function to return "void" at the same time.
This is paired with "gdb: Add non-enum disassembler options".
There is a portable mechanism for disassembler options and used on some
architectures:
- ARC
- Arm
- MIPS
- PowerPC
- RISC-V
- S/390
However, it only supports following forms:
- [NAME]
- [NAME]=[ENUM_VALUE]
Valid values for [ENUM_VALUE] must be predefined in
disasm_option_arg_t.values. For instance, for -M cpu=[CPU] in ARC
architecture, opcodes/arc-dis.c builds valid CPU model list from
include/elf/arc-cpu.def.
In this commit, it adds following format:
- [NAME]=[ARBITRARY_VALUE] (cannot contain "," though)
This is identified by NULL value of disasm_option_arg_t.values
(normally, this is a non-NULL pointer to a NULL-terminated list).
include/ChangeLog:
* dis-asm.h (disasm_option_arg_t): Update comment of values
to allow non-enum disassembler options.
opcodes/ChangeLog:
* riscv-dis.c (print_riscv_disassembler_options): Support
non-enum disassembler options on printing disassembler help.
* arc-dis.c (print_arc_disassembler_options): Likewise.
* mips-dis.c (print_mips_disassembler_options): Likewise.
This patch makes possible to print the highest address (-1) and the addresses
related to gp which value is -1. This is particularly useful if the highest
address space is used for I/O registers and corresponding symbols are defined.
Besides, despite that it is very rare to have GP the highest address, it would
be nice because we enabled highest address printing on regular cases.
gas/ChangeLog:
* testsuite/gas/riscv/dis-addr-topaddr.s: New test for the top
address (-1) printing.
* testsuite/gas/riscv/dis-addr-topaddr-32.d: Likewise.
* testsuite/gas/riscv/dis-addr-topaddr-64.d: Likewise.
* testsuite/gas/riscv/dis-addr-topaddr-gp.s: New test for
GP-relative addressing when GP is the highest address (-1).
* testsuite/gas/riscv/dis-addr-topaddr-gp-32.d: Likewise.
* testsuite/gas/riscv/dis-addr-topaddr-gp-64.d: Likewise.
opcodes/ChangeLog:
* riscv-dis.c (struct riscv_private_data): Add `to_print_addr' to
enable printing the highest address.
(maybe_print_address): Utilize `to_print_addr'.
(riscv_disassemble_insn): Likewise.
If either the base register is `zero', `tp' or `gp' and XLEN is 32, an
incorrectly sign-extended address is produced when printing. This commit
fixes this by fitting an address into a 32-bit value on RV32.
Besides, H. Peter Anvin discovered that we have wrong address computation
for JALR instruction (the initial bug is back in 2018). This commit also
fixes that based on the idea of Palmer Dabbelt.
gas/
pr29342
* testsuite/gas/riscv/lla32.d: Reflect RV32 address computation fix.
* testsuite/gas/riscv/dis-addr-overflow.s: New testcase.
* testsuite/gas/riscv/dis-addr-overflow-32.d: Likewise.
* testsuite/gas/riscv/dis-addr-overflow-64.d: Likewise.
opcodes/
pr29342
* riscv-dis.c (maybe_print_address): Fit address into 32-bit on RV32.
(print_insn_args): Fix JALR address by adding EXTRACT_ITYPE_IMM.
Three-part patch set from Tsukasa OI to support zmmul in assembler.
The 'Zmmul' is a RISC-V extension consisting of only multiply instructions
(a subset of 'M' which has multiply and divide instructions).
bfd/
* elfxx-riscv.c (riscv_implicit_subsets): Add 'Zmmul' implied by 'M'.
(riscv_supported_std_z_ext): Add 'Zmmul' extension.
(riscv_multi_subset_supports): Add handling for new instruction class.
gas/
* testsuite/gas/riscv/attribute-09.d: Updated implicit 'Zmmul' by 'M'.
* testsuite/gas/riscv/option-arch-02.d: Likewise.
* testsuite/gas/riscv/m-ext.s: New test.
* testsuite/gas/riscv/m-ext-32.d: New test (RV32).
* testsuite/gas/riscv/m-ext-64.d: New test (RV64).
* testsuite/gas/riscv/zmmul-32.d: New expected output.
* testsuite/gas/riscv/zmmul-64.d: Likewise.
* testsuite/gas/riscv/m-ext-fail-xlen-32.d: New test (failure
by using RV64-only instructions in RV32).
* testsuite/gas/riscv/m-ext-fail-xlen-32.l: Likewise.
* testsuite/gas/riscv/m-ext-fail-zmmul-32.d: New failure test
(RV32 + Zmmul but with no M).
* testsuite/gas/riscv/m-ext-fail-zmmul-32.l: Likewise.
* testsuite/gas/riscv/m-ext-fail-zmmul-64.d: New failure test
(RV64 + Zmmul but with no M).
* testsuite/gas/riscv/m-ext-fail-zmmul-64.l: Likewise.
* testsuite/gas/riscv/m-ext-fail-noarch-64.d: New failure test
(no Zmmul or M).
* testsuite/gas/riscv/m-ext-fail-noarch-64.l: Likewise.
include/
* opcode/riscv.h (enum riscv_insn_class): Added INSN_CLASS_ZMMUL.
ld/
* testsuite/ld-riscv-elf/attr-merge-arch-01.d: We don't care zmmul in
these testcases, so just replaced m by a.
* testsuite/ld-riscv-elf/attr-merge-arch-01a.s: Likewise.
* testsuite/ld-riscv-elf/attr-merge-arch-01b.s: Likewise.
* testsuite/ld-riscv-elf/attr-merge-arch-02.d: Likewise.
* testsuite/ld-riscv-elf/attr-merge-arch-02a.s: Likewise.
* testsuite/ld-riscv-elf/attr-merge-arch-03.d: Likewise.
* testsuite/ld-riscv-elf/attr-merge-arch-03a.s: Likewise.
* testsuite/ld-riscv-elf/attr-merge-user-ext-01.d: Likewise.
* testsuite/ld-riscv-elf/attr-merge-user-ext-rv32i2p1_a2p0.s: Renamed.
* testsuite/ld-riscv-elf/attr-merge-user-ext-rv32i2p1_a2p1.s: Renamed.
opcodes/
* riscv-opc.c (riscv_opcodes): Updated multiply instructions to zmmul.
When displaying operands, invalid opcodes may overflow operand buffer
due to additional styling characters. Each style is encoded with 3
bytes. Define MAX_OPERAND_BUFFER_SIZE for operand buffer size and
increase it from 100 bytes to 128 bytes to accommodate 9 sets of styles
in an operand.
gas/
PR binutils/29483
* testsuite/gas/i386/i386.exp: Run pr29483.
* testsuite/gas/i386/pr29483.d: New file.
* testsuite/gas/i386/pr29483.s: Likewise.
opcodes/
PR binutils/29483
* i386-dis.c (MAX_OPERAND_BUFFER_SIZE): New.
(obuf): Replace 100 with MAX_OPERAND_BUFFER_SIZE.
(staging_area): Likewise.
(op_out): Likewise.
Now that we can purge templates, let's use this to improve readability a
little by shortening a few of their names, making functionally similar
ones also have identical names in their multiple incarnations.
Many of the vector conversion insns come with X/Y/Z suffixed forms, for
disambiguation purposes in AT&T syntax. All of these gorups follow
certain patterns. Introduce "xy" and "xyz" templates to reduce
redundancy.
To facilitate using a uniform name for both AVX and AVX512, further
introduce a means to purge a previously defined template: A standalone
<name> will be recognized to have this effect.
Note that in the course of the conversion VFPCLASSPH is properly split
to separate AT&T and Intel syntax forms, matching VFPCLASSP{S,D} and
yielding the intended "ambiguous operand size" diagnostic in Intel mode.
Many of the vector integer insns come in byte/word element pairs. Most
of these pairs follow certain encoding patterns. Introduce a "bw"
template to reduce redundancy.
Note that in the course of the conversion
- the AVX VPEXTRW template which is not being touched needs to remain
ahead of the new "combined" ones, as (a) this should be tried first
when matching insns against templates and (b) its Load attributes
requires it to be first,
- this add a benign/meaningless IgnoreSize attribute to the memory form
of PEXTRB; it didn't seem worth avoiding this.
Many of the vector integer insns come in dword/qword element pairs. Most
of these pairs follow certain encoding patterns. Introduce a "dq"
template to reduce redundancy.
Note that in the course of the conversion
- a few otherwise untouched templates are moved, so they end up next to
their siblings),
- drop an unhelpful Cpu64 from the GPR form of VPBROADCASTQ, matching
what we already have for KMOVQ - the diagnostic is better this way for
insns with multiple forms (i.e. the same Cpu64 attributes on {,V}MOVQ,
{,V}PEXTRQ, and {,V}PINSRQ are useful to keep),
- this adds benign/meaningless IgnoreSize attributes to the GPR forms of
KMOVD and VPBROADCASTD; it didn't seem worth avoiding this.
The vast majority of vector FP insns comes in single/double pairs. Many
pairs follow certain encoding patterns. Introduce an "sd" template to
reduce redundancy. Similarly, to further cover similarities between
AVX512F and AVX512-FP16, introduce an "sdh" template.
For element-size Disp8 shift generalize i386-gen's broadcast size
determination, allowing Disp8MemShift to be specified without an operand
in the affected templated templates. While doing the adjustment also
eliminate an unhelpful (lost information) diagnostic combined with a use
after free in what is now get_element_size().
Note that in the course of the conversion
- the AVX512F form of VMOVUPD has a stray (leftover) Load attribute
dropped,
- VMOVSH has a benign IgnoreSize added (the attribute is still strictly
necessary for VMOVSD, and necessary for VMOVSS as long as we permit
strange combinations like "-march=i286+avx"),
- VFPCLASSPH is properly split to separate AT&T and Intel syntax forms,
matching VFPCLASSP{S,D}.
This reverts commit 384f368958, which
broke i386-gen's emitting of diagnostics. As a replacement to address
the original issue of newer gcc no longer splicing lines when dropping
the line continuation backslashes, switch to using + as the line
continuation character, doing the line splicing in i386-gen.
svstep and svshape instructions subtract 1 before encoding some of the
operands. Obviously zero is not supported for these operands. Whilst
PPC_OPERAND_PLUS1 fits perfectly to mark that maximal value should be
incremented, there is no flag which marks the fact that zero values are
not allowed. This patch adds a new flag, PPC_OPERAND_NONZERO, for this
purpose.
This patch adds support for LibreSOC machine and SVP64 extension flag
for PowerPC architecture. SV (Simple-V) is a strict RISC-paradigm
Scalable Vector Extension for the Power ISA. SVP64 is the 64-bit
Prefixed instruction format implementing SV. Funded by NLnet through EU
Grants No: 825310 and 825322, SV is in DRAFT form and is to be publicly
submitted via the OpenPOWER Foundation ISA Working Group via the
newly-created External RFC Process.
For more details, visit https://libre-soc.org.
It is unclear to me why the corresponding MOV (no Q suffix) can be
issued without REX.W, but MOVQ has to have that prefix (bit). Add
NoRex64 and in exchange drop Size64.
The non-SSE2AVX form of the SIMD variant of the instruction needlessly
has the (still multi-purpose) IgnoreSize attribute. All other similar
SSE2 insns use NoRex64 instead. Make this consistent, noting that the
SSE2AVX form can't have the same change made - there the memory operand
doesn't at the same time permit RegXMM (which logic uses when deciding
whether a Q suffix is okay outside of 64-bit mode).
While the other three variants each differ in attributes and hence can't
be folded, these two pairs actually can be (and were previously
overlooked). This effectively matches their AVX512VL counterparts, which
are also expressed as a single template.
While the x/y/z suffix isn't necessary to use in this case, it is still
odd that these forms don't support broadcast (unlike their AVX512F /
AVX512DQ counterparts). The lack thereof can e.g. make macro-ized
programming more difficult.
One more place where pre-existing templates should have been taken as a
basis: In Intel syntax we want to consistently issue an "ambiguous
operand size" error when a size-less memory operand is specified for an
insn where register use alone isn't sufficient for disambiguation.
BFD_VMA_FMT can't be used in format strings that need to be
translated, because the translation won't work when the type of
bfd_vma differs from the machine used to compile .pot files. We've
known about this for a long time, but patches slip through review.
So just get rid of BFD_VMA_FMT, instead using the appropriate PRId64,
PRIu64, PRIx64 or PRIo64 and SCN variants for scanf. The patch is
mostly mechanical, the only thing requiring any thought is casts
needed to preserve PRId64 output from bfd_vma values, or to preserve
one of the unsigned output formats from bfd_signed_vma values.
First of all rename the meanwhile misleading Opcode_SIMD_FloatD, as it
has also been used for KMOV* and BNDMOV. Then simplify the condition
selecting which form if "reversing" to use - except for the MOV to/from
control/debug/test registers all extended opcode space insns use bit 0
(rather than bit 1) to indicate the direction (from/to memory) of an
operation. With that, D can simply be set on the first of the two
templates, while the other can be dropped.