For VEX-encoded ones, all three involved vector registers have to be
distinct. For EVEX-encoded ones an actual mask register has to be in use
and zeroing-masking cannot be used (violation of either will #UD).
Additionally both involved vector registers have to be distinct for
EVEX-encoded gathers.
Commit 6ff00b5e12 ("x86/Intel: correct permitted operand sizes for
AVX512 scatter/gather") brought the assembler side of AVX512 S/G insn
handling in line with AVX2's, but the disassembler side was forgotten.
This has the benefit of
- allowing to fold a number of table entries,
- rendering a few #define-s and enumerators unused.
Some of the enumerators have ended up misplaced under the general
current ordering scheme. Move them (and their table entries) around
accordingly. Add a couple of blank lines as separators when close to
code being touched anyway. Also drop the odd 0F from 0FXOP (there's no
"0f" involved there anywhere) infixes where the respective enum gets
played with anyway.
When the VEX.L=1 decode matches that of both EVEX.L'L=1 and EVEX.L'L=2
(typically when all three are invalid) the (smaller) VEX table entry can
be reused by EVEX, instead of duplicating data. (Note that XM and XMM as
well as EXxmm_md and EXd are equivalent at least for the purposes here.)
The order of decodes influences the overall number of table entries.
Reduce table size quite a bit by first decoding few-alternatives
attributes common to all valid leaves.
This also adds a PREFIX_DATA 7531c61332 ("x86: simplify decode of
opcodes valid with (embedded) 66 prefix only") missed to apply to
vbroadcastf64x4.
The order of decodes influences the overall number of table entries.
Reduce table size quite a bit by first decoding few-alternatives
attributes common to all valid leaves.
The order of decodes influences the overall number of table entries.
Reduce table size quite a bit by first decoding few-alternatives
attributes common to all valid leaves.
The order of decodes influences the overall number of table entries.
Reduce table size quite a bit by first decoding few-alternatives
attributes common to all valid leaves.
All encodings not used in this range are (reserved) NOPs. Hence their
decoding should be fully consistent. For this to work the PREFIX_IGNORED
logic needs slightly extending, such that the attribute will also
- have an effect when used inside prefix_table[], yet without always
falling back to using slot 0,
- cause prefixes marked as ignored while decoding through prefix_table[]
to no longer be considered decoded, when encountered in a subsequent
decoding step.
In adjacent code also drop meaningless PREFIX_OPCODE.
Despite SYSEXIT being an Intel-only insn in long mode, its behavior
there is similar to SYSRET's: Depending on REX.W execution continues in
either 64-bit or compatibility mode. Hence distinguishing by suffix is
as necessary here as it is there.
The previous change
"x86: Ignore CS/DS/ES/SS segment-override prefixes in 64-bit mode"
to ignore segment override prefixes in 64-bit mode lead to dumping
branch hints as excessive prefixes:
ffffffff8109d5a0 <vmx_get_rflags>:
...
ffffffff8109d601: 3e 77 0a ds ja,pt ffffffff8109d60e <vmx_get_rflags+0x6e>
^^^^^
In this particular case, those prefixes are not excessive but are used
to provide branch hints - taken/not-taken - to the CPU.
Assign active_seg_prefix in that particular case to consume them.
gas/
2002-11-29 Borislav Petkov <bp@suse.de>
* testsuite/gas/i386/branch.d: Add new branch insns test.
* testsuite/gas/i386/branch.s: Likewise.
* testsuite/gas/i386/i386.exp: Insert the new branch test.
* testsuite/gas/i386/x86-64-branch.d: Test for branch hints insns.
* testsuite/gas/i386/x86-64-branch.s: Likewise.
* testsuite/gas/i386/ilp32/x86-64-branch.d: Likewise.
opcodes/
2020-11-28 Borislav Petkov <bp@suse.de>
* i386-dis.c (print_insn): Set active_seg_prefix for branch hint insns
to not dump branch hint prefixes 0x2E and 0x3E as unused prefixes.
In 64bit, assembler generates a warning for "sysret":
$ echo sysret | as --64 -o x.o -
{standard input}: Assembler messages:
{standard input}:1: Warning: no instruction mnemonic suffix given and no register operands; using default for `sysret'
Always display suffix for %LQ in 64bit to display "sysretl".
gas/
PR binutils/26704
* testsuite/gas/i386/noreg64-data16.d: Expect sysretl instead of
sysret.
* testsuite/gas/i386/noreg64.d: Likewise.
* testsuite/gas/i386/x86-64-intel64.d: Likewise.
* testsuite/gas/i386/x86-64-opcode.d: Likewise.
opcodes/
PR binutils/26704
* i386-dis.c (putop): Always display suffix for %LQ in 64bit.
The MODRM byte can be checked to display the instruction name only if the
MODRM byte needed. Clear modrm if the MODRM byte isn't needed so that
modrm field checks in putop like, modrm.mod == N with N != 0, can be done
without checking need_modrm.
gas/
PR binutils/26705
* testsuite/gas/i386/x86-64-suffix.s: Add "mov %rsp,%rbp" before
sysretq.
* testsuite/gas/i386/x86-64-suffix-intel.d: Updated.
* testsuite/gas/i386/x86-64-suffix.d: Likewise.
opcodes/
PR binutils/26705
* i386-dis.c (print_insn): Clear modrm if not needed.
(putop): Check need_modrm for modrm.mod != 3. Don't check
need_modrm for modrm.mod == 3.
There are 11 MOD_VEX_0F38* inserted in MOD_0F38* group,
which should be placed in MOD_VEX_0F38* group.
opcode/
PR 26654
*i386-dis.c (enum): Put MOD_VEX_0F38* together.
i386-dis.c:12207 left shift of 128 by 24 places cannot be represented in type 'long int'
i386-dis.c:12220 left shift of 128 by 24 places cannot be represented in type 'long int'
i386-dis.c:12222 left shift of 1 by 31 places cannot be represented in type 'long int'
i386-dis.c:12222 signed integer overflow: 162254319 - -2147483648 cannot be represented in type 'long int'
* i386-dis.c (OP_E_memory): Don't cast to signed type when
negating.
(get32, get32s): Use unsigned types in shift expressions.
This reverts commit 04c662e2b6.
In my underlying suggestion I neglected the fact that in those
cases (,%eiz,1) is the only visible indication that 32-bit
addressing is in effect.
Change
67 48 8b 1c 25 ef cd ab 89 mov 0x89abcdef(,%eiz,1),%rbx
to
67 48 8b 1c 25 ef cd ab 89 mov 0x89abcdef,%rbx
in AT&T syntax and
67 48 8b 1c 25 ef cd ab 89 mov rbx,QWORD PTR [eiz*1+0x89abcdef]
to
67 48 8b 1c 25 ef cd ab 89 mov rbx,QWORD PTR ds:0x89abcdef
in Intel syntax.
gas/
PR gas/26237
* testsuite/gas/i386/evex-no-scale-64.d: Updated.
* testsuite/gas/i386/addr32.d: Likewise.
* testsuite/gas/i386/x86-64-addr32-intel.d: Likewise.
* testsuite/gas/i386/x86-64-addr32.d: Likewise.
opcodes/
PR gas/26237
* i386-dis.c (OP_E_memory): Don't display eiz with no scale
without base nor index registers.
Irrespective of their encoding the resulting output should look the
same. Therefore wire the handling of PUSH/POP with GPR operands
encoded in the main opcode byte to the same logic used for other
operands. This frees up yet another macro character.
"Unambiguous" is is in particular taking as reference the assembler,
which also accepts certain insns - despite them allowing for varying
operand size, and hence in principle being ambiguous - without any
suffix. For example, from the very beginning of the life of x86-64 I had
trouble understanding why a plain and simple RET had to be printed as
RETQ. In case someone really used the 16-bit form, RETW disambiguates
the two quite fine.
Since the addr32 (0x67) prefix zero-extends the lower 32 bits address to
64 bits, change disassembler to zero-extend the lower 32 bits displacement
to 64 bits when there is no base nor index registers.
gas/
PR gas/26237
* testsuite/gas/i386/addr32.s: Add tests for 32-bit wrapped around
address.
* testsuite/gas/i386/x86-64-addr32.s: Likewise.
* testsuite/gas/i386/addr32.d: Updated.
* testsuite/gas/i386/x86-64-addr32-intel.d: Likewise.
* testsuite/gas/i386/x86-64-addr32.d: Likewise.
* testsuite/gas/i386/ilp32/x86-64-addr32-intel.d: Likewise.
* testsuite/gas/i386/ilp32/x86-64-addr32.d: Likewise.
opcodes/
PR gas/26237
* i386-dis.c (OP_E_memory): Without base nor index registers,
32-bit displacement to 64 bits.
%db<n> is an AT&T invention; the Intel documentation and MASM have only
ever specified DRn (in line with CRn and TRn). (In principle gas also
shouldn't accept the names in Intel mode, but at least for now I've kept
things as they are. Perhaps as a first step this should just be warned
about.)
Rm (and hence OP_R()) can be dropped by making 'Z' force modrm.mod to 3
(for OP_E()) instead of ignoring it. While at it move 'Z' handling to
its designated place (after 'Y'; 'W' handling will be moved by a later
change).
Moves to/from TRn are illegal in 64-bit mode and thus get converted to
honor this at the same time (also getting them in line with moves
to/from CRn/DRn ModRM.mod handling wise). This then also frees up the L
macro.
Rdq, Rd, and MaskR can be replaced by Edq, Ed / Rm, and MaskE
respectively, as OP_R() doesn't enforce ModRM.mod == 3, and hence where
MOD matters but hasn't been decoded yet it needs to be anyway. (The case
of converting to Rm is temporary until a subsequent change.)
In this case there's no need to go through prefix_table[] at all - the
.prefix_requirement == PREFIX_OPCODE machinery takes care of this case
already.
A couple of further adjustments are needed though:
- Gv / Ev and alike then can't be used (needs to be Gdq / Edq instead),
- dq_mode and friends shouldn't lead to PREFIX_DATA getting set in
used_prefixes.
The only valid (embedded or explicit) prefix being the data size one
(which is a fairly common pattern), avoid going through prefix_table[].
Instead extend the "required prefix" logic to also handle PREFIX_DATA
alone in a table entry, now used to identify this case. This requires
moving the (adjusted) ->prefix_requirement logic ahead of the printing
of stray prefixes, as the latter needs to observe the new setting of
PREFIX_DATA in used_prefixes.
Also add PREFIX_OPCODE on related entries when previously there was
mistakenly no decode step through prefix_table[].
It was quite odd for the prior operand handling to have to clear this
flag for the actual operand handling to print nothing. Have the actual
operand handling determine whether the operand is actually present.
With this {d,q}_scalar_swap_mode become unused and hence also get dropped.
These are only used when VEX.L or EVEX.L'L have already been decoded,
and hence the "normal" length dependent name determination is quite
fine. Adjust a few enumerators to make clear that vex_len_table[] has
been consulted; be consistent and do so for all *f128 and *i128 insns
in one go.
Fold redundant case blocks and move the extra adjustments logic into
the single case block that actually needs it - there's no need to go
through the extra logic for all the other cases. Also utilize there that
vex.b cannot be set at this point, due to earlier logic. Reduce the
comment there, which was partly stale anyway.
Unlike the earlier ones these also need their operands adjusted. Replace
the (mis-described: there's nothing "scalar" here) {b,w}_scalar_mode by
a single new mode, with the actual unit width controlled by EVEX.W.