By slightly relaxing the checking in operand_type_register_match() we
can fold the vector shift insns with an XMM source as well. While
strictly speaking an overlap in just one size (see the code comment) is
not enough (both operands could have multiple sizes with just a single
common one), this is good enough for all templates we have, or which
could sensibly / usefully appear (within the scope of the present
operand matching model).
Tightening this a little would be possible, but would require broadcast
related information to be passed into the function.
Setting this field risks cpu_flags_all_zero() mistakenly returning
"false" when the object passed in was e.g. the result of ANDing together
two objects which had the bit set, or ANDNing together an object with
the field set and one with the field clear.
While there also avoid setting CpuNo64: Like Cpu64 this is driven
differently anyway and hence shouldn't be set anywhere by default.
Note that the moving of the two items in i386-gen.c's cpu_flags[] is
only for documentation purposes (and slight reducing of overhead), as
the fields are sorted anyway upon program start.
There's no need for the arbitrary special "unknown" token: Simply
recognize the leading ~ and process everything else the same, merely
recording whether to set individual fields to 1 or 0.
While there exclude CpuIAMCU from CPU_UNKNOWN_FLAGS - CPU_IAMCU_FLAGS
override cpu_arch_flags anyway when -march=iamcu is passed, and there's
no reason to have the stray flag set even if no insn actually is keyed
to it.
The checks done by check_cpu_arch_compatible() were halfway sensible
only at the time where only L1OM support was there. The purpose,
however, has always been to prevent bad uses of .arch (turning off the
base CPU "feature" flag) while at the same time permitting extensions to
be enabled / disabled. In order to achieve this (and to prevent
regressions when L1OM and K1OM support are removed)
- set CpuIAMCU in CPU_IAMCU_FLAGS,
- adjust the IAMCU check in the function itself (the other two similarly
broken checks aren't adjusted as they're slated to be removed anyway),
- avoid calling the function for extentions (which would never have the
base "feature" flag set),
- add a new testcase actually exercising ".arch iamcu" (which would also
regress with the planned removal).
There isn't an actual opcodes implementation for the AMDGCN arch (yet),
this is just the bare minimum to get
$ ./configure --target=amdgcn-hsa-amdhsa --disable-gas
$ make all-binutils
working later in this series.
opcodes/ChangeLog:
* configure.ac: Handle bfd_amdgcn_arch.
* configure: Re-generate.
Change-Id: Ib7d7c5533a803ed8b2a293e9275f667ed781ce79
Let's hope this stays dead, but it's here as a patch separate from
those that removed use of powerpc_macros just in case it needs to be
resurrected.
include/
* opcode/ppc.h (struct powerpc_macro): Delete declaration.
(powerpc_macros, powerpc_num_macros): Likewise..
opcodes/
* ppc-opc.c (powerpc_macros, powerpc_num_macros): Delete.
gas/
* config/tc-ppc.c (ppc_macro): Delete function.
(ppc_macro_hash): Delete.
(ppc_setup_opcodes, md_assemble): Delete macro support.
This moves VLE insn out of the macro table. "e_slwi" and "e_srwi"
already exist in vle_opcodes as distinct instructions rather than
encodings of e_rlwinm.
opcodes/
* ppc-opc.c (vle_opcodes): Typo fix e_rlwinm operand.
Add "e_inslwi", "e_insrwi", "e_rotlwi", "e_rotrwi", "e_clrlwi",
"e_clrrwi", "e_extlwi", "e_extrwi", and "e_clrlslwi".
(powerpc_macros): Delete same. Delete "e_slwi" and "e_srwi" too.
gas/
* testsuite/gas/ppc/vle-simple-5.d: Update.
The extended instructions implemented in powerpc_macros aren't used by
the disassembler. That means instructions like "sldi r3,r3,2" appear
in disassembly as "rldicr r3,r3,2,61", which is annoying since many
other extended instructions are shown.
Note that some of the instructions moved out of the macro table to the
opcode table won't appear in disassembly, because they are aliases
rather than a subset of the underlying raw instruction. If enabled,
rotrdi, extrdi, extldi, clrlsldi, and insrdi would replace all
occurrences of rotldi, rldicl, rldicr, rldic and rldimi. (Or many
occurrences in the case of clrlsldi if n <= b was added to the extract
functions.)
The patch also fixes a small bug in opcode sanity checking.
include/
* opcode/ppc.h (PPC_OPSHIFT_SH6): Define.
opcodes/
* ppc-opc.c (insert_erdn, extract_erdn, insert_eldn, extract_eldn),
(insert_crdn, extract_crdn, insert_rrdn, extract_rrdn),
(insert_sldn, extract_sldn, insert_srdn, extract_srdn),
(insert_erdb, extract_erdb, insert_csldn, extract_csldb),
(insert_irdb, extract_irdn): New functions.
(ELDn, ERDn, ERDn, RRDn, SRDn, ERDb, CSLDn, CSLDb, IRDn, IRDb):
Define and add associated powerpc_operands entries.
(powerpc_opcodes): Add "rotrdi", "srdi", "extrdi", "clrrdi",
"sldi", "extldi", "clrlsldi", "insrdi" and corresponding record
(ie. dot suffix) forms.
(powerpc_macros): Delete same from here.
gas/
* config/tc-ppc.c (insn_validate): Don't modify value passed
to operand->insert for PPC_OPERAND_PLUS1 when calculating mask.
Handle PPC_OPSHIFT_SH6.
* testsuite/gas/ppc/prefix-reloc.d: Update.
* testsuite/gas/ppc/simpshft.d: Update.
ld/
* testsuite/ld-powerpc/elfv2so.d: Update.
* testsuite/ld-powerpc/notoc.d: Update.
* testsuite/ld-powerpc/notoc3.d: Update.
* testsuite/ld-powerpc/tlsdesc2.d: Update.
* testsuite/ld-powerpc/tlsget.d: Update.
* testsuite/ld-powerpc/tlsget2.d: Update.
* testsuite/ld-powerpc/tlsopt5.d: Update.
* testsuite/ld-powerpc/tlsopt6.d: Update.
Without a -M cpu option given, powerpc objdump defaults currently to
-Mpower10 but -Many is also given. Commit 1ff6a3b8e5 regressed
-Many disassembly of instructions that are encoded differently
depending on cpu, such as mftb which has pre- and post-power4
encodings.
PR 28959
* ppc-dis.c (lookup_powerpc): Revert 2021-05-28 change. Instead
only look at deprecated PPC_OPCODE_RAW bit when -Many.
Correct issues with INSN2_ALIAS annotation for branch instructions:
- regular MIPS BEQZ/L and BNEZ/L assembly instructions are idioms for
BEQ/L and BNE/L respectively with the `rs' operand equal to $0,
- microMIPS 32-bit BEQZ and BNEZ assembly instructions are idioms for
BEQ and BNE respectively with the `rt' operand equal to $0,
- regular MIPS BAL assembly instruction is an idiom for architecture
levels of up to the MIPSr5 ISA and a machine instruction on its own
from the MIPSr6 ISA up.
Add missing annotation to BEQZ/L and BNEZ/L accordingly then and add a
new entry for BAL for the MIPSr6 ISA, correcting a disassembly bug:
$ mips-linux-gnu-objdump -m mips:isa64r6 -M no-aliases -d bal.o
bal.o: file format elf32-tradlittlemips
Disassembly of section .text:
00000000 <foo>:
0: 04110000 0x4110000
...
$
Add test cases accordingly.
Parts for regular MIPS BEQZ/L and BNEZ/L instructions from Sagar Patel.
2022-03-06 Maciej W. Rozycki <macro@orcam.me.uk>
binutils/
* testsuite/binutils-all/mips/mips1-branch-alias.d: New test.
* testsuite/binutils-all/mips/mips1-branch-noalias.d: New test.
* testsuite/binutils-all/mips/mips2-branch-alias.d: New test.
* testsuite/binutils-all/mips/mips2-branch-noalias.d: New test.
* testsuite/binutils-all/mips/mips32r6-branch-alias.d: New test.
* testsuite/binutils-all/mips/mips32r6-branch-noalias.d: New
test.
* testsuite/binutils-all/mips/micromips-branch-alias.d: New
test.
* testsuite/binutils-all/mips/micromips-branch-noalias.d: New
test.
* testsuite/binutils-all/mips/mips-branch-alias.s: New test
source.
* testsuite/binutils-all/mips/micromips-branch-alias.s: New test
source.
* testsuite/binutils-all/mips/mips.exp: Run the new tests.
2022-03-06 Sagar Patel <sagarmp@cs.unc.edu>
Maciej W. Rozycki <macro@orcam.me.uk>
opcodes/
* mips-opc.c (mips_builtin_opcodes): Fix INSN2_ALIAS annotation
for "bal", "beqz", "beqzl", "bnez" and "bnezl" instructions.
* micromips-opc.c (micromips_opcodes): Likewise for "beqz" and
"bnez" instructions.
Add has_sib to struct instr_info and use SIB info only if ins->has_sib
is true.
PR binutils/28892
* i386-dis.c (instr_info): Add has_sib.
(get_sib): Set has_sib.
(OP_E_memory): Replace havesib with ins->has_sib.
(OP_VEX): Use ins->sib.index only if ins->has_sib is true.
Now that this lives on the stack, let's have it be a little less
wasteful in terms of space. Switch boolean fields to "bool" (also when
this doesn't change their size) and also limit the widths of "rex",
"rex_used", "op_ad", and "op_index". Do a little bit of re-ordering as
well to limit the number of padding holes.
There's no real need for the pseudo-boolean "haveindex" or for separate
32-bit / 64-bit index pointers. Fold them into a single "indexes" and
set that uniformly to AT&T names, compensating by emitting the register
name via oappend_maybe_intel().
On top of prior similar work more opportunities have appeared in the
meantime. Note that this also happens to address the prior lack of
decoding of EVEX.L'L for VMOV{L,H}P{S,D} and VMOV{LH,HL}PS.
For some reason the original AVFX512F insns were not taken as a basis
here, causing unnecessary divergence. While not an active issue, it is
still relevant to note that OP_XMM() has special treatment of e.g.
scalar_mode (marking broadcast as invalid). Such would better be
consistent for all sufficiently similar insns.
For one EVEX.W set does not imply EVEX.b is uniformly valid. Reject it
for modes which occur for insns allowing for EVEX.W to be set (noticed
with VMOV{H,L}PD and VMOVDDUP, and only in AT&T mode, but not checked
whether further insns would also have been impacted; I expect e.g.
VCMPSD would have had the same issue). And then the present concept of
broadcast makes no sense at all when the memory operand of an insn is
the destination.
Like for AVX512-FP16, there's not that many FP insns where going through
this table is easier / cheaper than using suitable macros. Utilize %XS
and %XD more to eliminate a fair number of table entries.
While doing this I noticed a few anomalies. Where lines get touched /
moved anyway, these are being addressed right here:
- vmovshdup used EXx for its 2nd operand, thus displaying seemingly
valid broadcast when EVEX.b is set with a memory operand; use
EXEvexXNoBcst instead just like vmovsldup already does
- vmovlhps used EXx for its 3rd operand, when all sibling entries use
EXq; switch to EXq there for consistency (the two differ only for
memory operands)
Like already indicated during review of the original submission, there's
really only very few insns where going through this table is easier /
cheaper than using suitable macros. Utilize %XH more and introduce
similar %XS and %XD (which subsequently can be used for further table
size reduction).
While there also switch to using oappend() in 'XH' macro processing.
This patch adds support for three new SME instructions: ADDSPL,
ADDSVL and RDSVL. They behave like ADDPL, ADDVL and RDVL, but read
the streaming vector length instead of the current vector length.
opcodes/
* aarch64-tbl.h (aarch64_opcode_table): Add ADDSPL, ADDSVL and RDSVL.
* aarch64-dis-2.c: Regenerate.
gas/
* testsuite/gas/aarch64/sme.s, testsuite/gas/aarch64/sme.d: Add tests
for ADDSPL, ADDSVL and RDSVL.
To avoid issues like that addressed by 6e3e5c9e41 ("x86: extend SSE
check to PCLMULQDQ, AES, and GFNI insns"), base the check on opcode
attributes and operand types.
As already indicated in a remark when introducing these templates, the
"commutative" attribute is ignored for legacy encoding templates. Hence
it is possible to shorten a number of templates by specifying C directly
rather than through a template parameter. I think this helps readability
a bit.
Improve thread safety in print_insn_i386_att, print_insn_i386_intel and
print_insn_i386 by removing the use of static variables.
Tested on x86_64-pc-linux-gnu.
2022-01-04 Vladimir Mezentsev <vladimir.mezentsev@oracle.com>
* i386-dis.c: Make print_insn_i386_att, print_insn_i386_intel
and print_insn_i386 thread-safe
The result of running etc/update-copyright.py --this-year, fixing all
the files whose mode is changed by the script, plus a build with
--enable-maintainer-mode --enable-cgen-maint=yes, then checking
out */po/*.pot which we don't update frequently.
The copy of cgen was with commit d1dd5fcc38ead reverted as that commit
breaks building of bfp opcodes files.
Move the 64-bit bfd logic out of bfd/configure.ac and into bfd64.m4
under config so it can be shared between all the other subdirs.
This replaces want64 with enable_64_bit_bfd which was already being
declared, but not used directly.