The extended instructions implemented in powerpc_macros aren't used by
the disassembler. That means instructions like "sldi r3,r3,2" appear
in disassembly as "rldicr r3,r3,2,61", which is annoying since many
other extended instructions are shown in their extended form.
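As a rough illustration of the mapping the disassembler needs, here is a
minimal Python sketch, not the ppc-opc.c code. It assumes the Power ISA
extended-mnemonic definitions sldi ra,rs,n == rldicr ra,rs,n,63-n and
srdi ra,rs,n == rldicl ra,rs,64-n,n. The function name extended_form and
its argument layout are made up for this example.

```python
def extended_form(mnemonic, ra, rs, sh, m):
    """Return the extended mnemonic for a raw rldicr/rldicl, or None.

    Hypothetical helper: sh is the shift count, m is the ME field for
    rldicr or the MB field for rldicl.
    """
    if mnemonic == "rldicr" and 1 <= sh <= 63 and sh + m == 63:
        # sldi ra,rs,n  ==  rldicr ra,rs,n,63-n
        return f"sldi r{ra},r{rs},{sh}"
    if mnemonic == "rldicl" and 1 <= sh <= 63 and sh + m == 64:
        # srdi ra,rs,n  ==  rldicl ra,rs,64-n,n
        return f"srdi r{ra},r{rs},{m}"
    # Operand values that do not fit an extended form stay raw.
    return None


print(extended_form("rldicr", 3, 3, 2, 61))
```

Operand combinations that do not satisfy the constraint return None, so
the raw instruction would still be shown for them.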
Note that some of the instructions moved out of the macro table to the
opcode table won't appear in disassembly, because they are aliases
rather than a subset of the underlying raw instruction. If enabled,
rotrdi, extrdi, extldi, clrlsldi, and insrdi would replace all
occurrences of rotldi, rldicl, rldicr, rldic and rldimi. (Or many
occurrences in the case of clrlsldi, if an n <= b check were added to
the extract functions.)
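To see why rotrdi is an alias rather than a subset, here is a small
Python sketch. It assumes the Power ISA definitions rotldi ra,rs,n ==
rldicl ra,rs,n,0 and rotrdi ra,rs,n == rldicl ra,rs,64-n,0. The helper
names are invented for this example.

```python
MASK64 = (1 << 64) - 1


def rotl64(x, n):
    """Rotate a 64-bit value left by n bits."""
    n %= 64
    return ((x << n) | (x >> (64 - n))) & MASK64


def rotr64(x, n):
    """Rotate a 64-bit value right by n bits."""
    n %= 64
    return ((x >> n) | (x << (64 - n))) & MASK64


def rotldi_as_rotrdi(n):
    """Rewrite the count of 'rotldi ra,rs,n' as a rotrdi count."""
    return (64 - n) % 64


x = 0x0123456789ABCDEF
# Every left-rotate count has an equivalent right-rotate count.
print(all(rotl64(x, n) == rotr64(x, rotldi_as_rotrdi(n)) for n in range(64)))
```

Since every rotldi count has a matching rotrdi count, a disassembler
that preferred rotrdi would never print rotldi at all.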
The patch also fixes a small bug in opcode sanity checking.
include/
* opcode/ppc.h (PPC_OPSHIFT_SH6): Define.
opcodes/
* ppc-opc.c (insert_erdn, extract_erdn, insert_eldn, extract_eldn),
(insert_crdn, extract_crdn, insert_rrdn, extract_rrdn),
(insert_sldn, extract_sldn, insert_srdn, extract_srdn),
(insert_erdb, extract_erdb, insert_csldn, extract_csldb),
(insert_irdb, extract_irdn): New functions.
(ELDn, ERDn, ERDn, RRDn, SRDn, ERDb, CSLDn, CSLDb, IRDn, IRDb):
Define and add associated powerpc_operands entries.
(powerpc_opcodes): Add "rotrdi", "srdi", "extrdi", "clrrdi",
"sldi", "extldi", "clrlsldi", "insrdi" and corresponding record
(i.e. dot suffix) forms.
(powerpc_macros): Delete same from here.
gas/
* config/tc-ppc.c (insn_validate): Don't modify value passed
to operand->insert for PPC_OPERAND_PLUS1 when calculating mask.
Handle PPC_OPSHIFT_SH6.
* testsuite/gas/ppc/prefix-reloc.d: Update.
* testsuite/gas/ppc/simpshft.d: Update.
ld/
* testsuite/ld-powerpc/elfv2so.d: Update.
* testsuite/ld-powerpc/notoc.d: Update.
* testsuite/ld-powerpc/notoc3.d: Update.
* testsuite/ld-powerpc/tlsdesc2.d: Update.
* testsuite/ld-powerpc/tlsget.d: Update.
* testsuite/ld-powerpc/tlsget2.d: Update.
* testsuite/ld-powerpc/tlsopt5.d: Update.
* testsuite/ld-powerpc/tlsopt6.d: Update.
The save of r2 in __glink_PLTresolve is the culprit. Remove it,
unless we know we need it for --plt-localentry. --plt-localentry
should not be used with power10 pc-relative code that makes tail
calls.
The patch also removes use of r2 as a scratch reg in the ELFv2
__glink_PLTresolve. Using r2 isn't a problem; this just reduces the
number of scratch regs.
bfd/
* elf64-ppc.c (GLINK_PLTRESOLVE_SIZE): Depend on has_plt_localentry0.
(LD_R0_0R11, ADD_R11_R0_R11): Define.
(ppc64_elf_tls_setup): Disable params->plt_localentry0 when power10
code detected.
(ppc64_elf_size_stubs): Update __glink_PLTresolve eh_frame.
(ppc64_elf_build_stubs): Move r2 save to start of __glink_PLTresolve,
and only emit for has_plt_localentry0. Don't use r2 in the stub.
ld/
* testsuite/ld-powerpc/elfv2so.d,
* testsuite/ld-powerpc/notoc2.d,
* testsuite/ld-powerpc/tlsdesc.wf,
* testsuite/ld-powerpc/tlsdesc2.d,
* testsuite/ld-powerpc/tlsdesc2.wf,
* testsuite/ld-powerpc/tlsopt5.d,
* testsuite/ld-powerpc/tlsopt5.wf,
* testsuite/ld-powerpc/tlsopt6.d,
* testsuite/ld-powerpc/tlsopt6.wf: Update __glink_PLTresolve.
Stripping .rela.branch_lt is easy enough but messes with the
testsuite, because stub symbols (which use the section id) change.
Tests that run on more than one target variant can be tricky to fix;
this renaming happened to work.
bfd/
* elf64-ppc.c (ppc64_elf_size_stubs): Strip relbrlt too.
ld/
* testsuite/ld-powerpc/tlsopt5.s: Rename foo to aaaaa.
* testsuite/ld-powerpc/tlsopt5.d: Adjust to suit.
* testsuite/ld-powerpc/tlsopt6.d: Likewise.
This modifies the special __tls_get_addr stub that checks for a
tlsdesc style __tls_index entry and returns early, so that it no
longer uses r11. Avoiding r11 isn't much benefit at the moment, but a
follow-up patch will preserve regs around the first call to
__tls_get_addr when the __tls_index entry isn't yet set up for an
early return.
bfd/
* elf64-ppc.c (LD_R11_0R3, CMPDI_R11_0, STD_R11_0R1, LD_R11_0R1),
(MTLR_R11): Don't define.
(LD_R0_0R3, CMPDI_R0_0): Define.
(build_tls_get_addr_stub): Don't use r11 in stub.
ld/
* testsuite/ld-powerpc/tlsexe.d: Match new __tls_get_addr stub.
* testsuite/ld-powerpc/tlsexeno.d: Likewise.
* testsuite/ld-powerpc/tlsexetoc.d: Likewise.
* testsuite/ld-powerpc/tlsexetocno.d: Likewise.
* testsuite/ld-powerpc/tlsopt5.d: Likewise.
When an instruction has operands, the PowerPC disassembler prints
spaces after the opcode so as to line up operands. If the operands
are all optional and all have their default values, then no operands
are printed, leaving trailing spaces. This patch fixes that.
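The idea can be sketched in a few lines of Python. This is an
illustration only, not the print_insn_powerpc code: format_insn, the
column width of 8, and the use of None for an omitted optional operand
are all assumptions made for the example.

```python
def format_insn(mnemonic, operands, width=8):
    """Format one instruction, padding only when an operand follows.

    Hypothetical model: an operand of None stands for an optional
    operand left at its default value, which is not printed.
    """
    out = mnemonic
    need_pad = True
    for op in operands:
        if op is None:
            continue
        if need_pad:
            # Delay the padding until the first operand is output.
            out += " " * max(1, width - len(mnemonic))
            need_pad = False
        else:
            out += ","
        out += str(op)
    return out


print(repr(format_insn("tlbia", [None])))
print(repr(format_insn("sldi", ["r3", "r3", 2])))
```

With the padding emitted lazily, an instruction whose operands are all
omitted comes out as the bare mnemonic.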
opcodes/
* ppc-dis.c (print_insn_powerpc): Delay printing spaces after
opcode until first operand is output.
gas/
* testsuite/gas/ppc/476.d: Remove trailing spaces.
* testsuite/gas/ppc/a2.d: Likewise.
* testsuite/gas/ppc/booke.d: Likewise.
* testsuite/gas/ppc/booke_xcoff.d: Likewise.
* testsuite/gas/ppc/e500.d: Likewise.
* testsuite/gas/ppc/e500mc.d: Likewise.
* testsuite/gas/ppc/e6500.d: Likewise.
* testsuite/gas/ppc/htm.d: Likewise.
* testsuite/gas/ppc/power6.d: Likewise.
* testsuite/gas/ppc/power8.d: Likewise.
* testsuite/gas/ppc/power9.d: Likewise.
* testsuite/gas/ppc/vle.d: Likewise.
ld/
* testsuite/ld-powerpc/tlsexe32.d: Remove trailing spaces.
* testsuite/ld-powerpc/tlsopt5.d: Likewise.
* testsuite/ld-powerpc/tlsopt5_32.d: Likewise.
This patch sets stub_offset in ppc_size_one_stub rather than in
ppc_build_one_stub. That allows the plt stub alignment to be done in
just ppc_size_one_stub rather than both functions. The patch also
corrects the place where the alignment was done, fixing a possible
error in .eh_frame data, and tidies some offset calculations.
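A minimal Python sketch of the two-pass arrangement follows. It is a
simplified model, not the elf64-ppc.c logic: it always aligns every
stub to a power-of-two boundary, and the names align_pad, size_stubs
and build_stubs are invented for the example.

```python
def align_pad(offset, align_log2):
    """Bytes of padding needed to reach a 2**align_log2 boundary."""
    return -offset & ((1 << align_log2) - 1)


def size_stubs(stub_sizes, align_log2):
    """Sizing pass: pick and record an aligned offset for each stub."""
    offsets, section_size = [], 0
    for size in stub_sizes:
        section_size += align_pad(section_size, align_log2)
        offsets.append(section_size)
        section_size += size
    return offsets, section_size


def build_stubs(stub_sizes, offsets):
    """Build pass: emit stubs at the recorded offsets, checking them."""
    image = bytearray()
    for size, off in zip(stub_sizes, offsets):
        # Assert the recorded offset is sane instead of recomputing it.
        assert off >= len(image)
        image += b"\0" * (off - len(image))  # alignment padding
        image += b"\x60" * size              # placeholder stub bytes
    return bytes(image)


offsets, total = size_stubs([20, 12, 36], 5)
print(offsets, total, len(build_stubs([20, 12, 36], offsets)))
```

Keeping the alignment decision in one pass means the two passes cannot
disagree about where a stub starts.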
bfd/
* elf64-ppc.c (plt_stub_pad): Delay plt_stub_size call until needed.
(ppc_build_one_stub): Don't set stub_offset, instead assert that
it is sane. Don't adjust stub_offset for alignment. Adjust size
calculation. Use "targ" temp when calculating offsets.
(ppc_size_one_stub): Set stub_offset here. Use "targ" temp when
calculating offsets. Adjust for alignment before setting
tls_get_addr_opt_bctrl.
ld/
* testsuite/ld-powerpc/powerpc.exp: Run tlsopt5 with plt alignment.
* testsuite/ld-powerpc/tlsopt5.s: Add extra call.
* testsuite/ld-powerpc/tlsopt5.wf: Adjust expected output.
* testsuite/ld-powerpc/tlsopt5.d: Likewise.
Since the __tls_get_addr_opt stub saves LR and makes a call, eh_frame
info should be generated to describe how to unwind through the stub.
The patch also changes the way the backend iterates over stubs.
Previously it looked at all sections in stub_bfd, to which all dynamic
sections are attached as well; now it iterates over the group list,
which gets just the stub sections. Most binaries will have just one
or two stub groups, so this is a little faster.
bfd/
* elf64-ppc.c (struct map_stub): Add tls_get_addr_opt_bctrl.
(stub_eh_frame_size): New function.
(ppc_size_one_stub): Set group tls_get_addr_opt_bctrl.
(group_sections): Init group tls_get_addr_opt_bctrl.
(ppc64_elf_size_stubs): Update sizing and initialization of
.eh_frame. Iterate over stubs via group list.
(ppc64_elf_build_stubs): Iterate over stubs via group list.
(ppc64_elf_finish_dynamic_sections): Update finalization of
.eh_frame.
ld/
* testsuite/ld-powerpc/tlsopt5.s: Add cfi.
* testsuite/ld-powerpc/tlsopt5.d: Update.
* testsuite/ld-powerpc/tlsopt5.wf: New file.
* testsuite/ld-powerpc/powerpc.exp: Perform new tlsopt5 test.
These tests were all odd in that they used r13 as the GOT pointer.
That didn't matter for the purpose of testing, but would never occur
in practice. Also, the tlsopt5 tests could have their global dynamic
sequences optimized to initial exec, so link them with -shared.
* testsuite/ld-powerpc/powerpc.exp: Add -shared to tlsopt5 tests.
* testsuite/ld-powerpc/tlsopt5.d: Adjust.
* testsuite/ld-powerpc/tlsopt1_32.s: Use r30 as GOT pointer.
* testsuite/ld-powerpc/tlsopt2_32.s: Likewise.
* testsuite/ld-powerpc/tlsopt3_32.s: Likewise.
* testsuite/ld-powerpc/tlsopt4_32.s: Likewise.
* testsuite/ld-powerpc/tlsopt5_32.s: Rewrite.
* testsuite/ld-powerpc/tlsopt1_32.d: Adjust.
* testsuite/ld-powerpc/tlsopt2_32.d: Adjust.
* testsuite/ld-powerpc/tlsopt3_32.d: Adjust.
* testsuite/ld-powerpc/tlsopt5_32.d: Adjust.
ELFv2 functions with localentry:0 are those with a single entry point,
i.e. global entry == local entry, that have no requirement on r2 or
r12, and that guarantee r2 is unchanged on return. Such an external
function can be called via the PLT without saving r2 or restoring it
on return, avoiding a common load-hit-store for small functions. The
optimization is attractive because the TOC pointer load-hit-store is a
major reason why calls to small functions that need no register saves,
or, with shrink-wrap, no register saves on a fast path, are slow on
powerpc64le.
To be safe, this optimization needs ld.so support to check that the
run-time function implementation matches the link-time one. If a
function in a shared library with non-zero st_other localentry bits is
called without saving and restoring r2, r2 will be trashed on return,
leading to segfaults. For that reason the optimization does not happen
for weak functions, since a weak definition is a fairly solid hint that
the function will likely be overridden. I'm also not enabling the
optimization by default unless glibc-2.26 is detected, which should
have the ld.so checks implemented.
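The conditions above can be summarised in a short Python sketch. It is
a simplification, not is_elfv2_localentry0 itself. It assumes the
localentry field occupies the top three bits of st_other (the
STO_PPC64_LOCAL_BIT and STO_PPC64_LOCAL_MASK values below, as I
understand include/elf/ppc64.h) and uses the standard ELF STB_WEAK
binding value. The function can_skip_r2_save and its ld_so_checks
parameter are hypothetical.

```python
STO_PPC64_LOCAL_BIT = 5                           # assumed field position
STO_PPC64_LOCAL_MASK = 7 << STO_PPC64_LOCAL_BIT   # assumed 3-bit field
STB_GLOBAL = 1
STB_WEAK = 2


def localentry_bits(st_other):
    """Extract the 3-bit localentry field from st_other."""
    return (st_other & STO_PPC64_LOCAL_MASK) >> STO_PPC64_LOCAL_BIT


def can_skip_r2_save(st_other, st_bind, ld_so_checks):
    """Hypothetical predicate: may a PLT call omit the r2 save/restore?

    ld_so_checks models "the run-time loader verifies localentry:0".
    """
    if not ld_so_checks:
        return False
    if st_bind == STB_WEAK:
        # A weak definition is likely to be overridden at run time.
        return False
    return localentry_bits(st_other) == 0


print(can_skip_r2_save(0, STB_GLOBAL, True))
print(can_skip_r2_save(3 << STO_PPC64_LOCAL_BIT, STB_GLOBAL, True))
```

The real check looks at more than this, for example whether the symbol
is a defined function in an ELFv2 object.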
bfd/
* elf64-ppc.c (struct ppc_link_hash_table): Add has_plt_localentry0.
(ppc64_elf_merge_symbol_attribute): Merge localentry bits from
dynamic objects.
(is_elfv2_localentry0): New function.
(ppc64_elf_tls_setup): Default params->plt_localentry0.
(plt_stub_size): Adjust size for tls_get_addr_opt stub.
(build_tls_get_addr_stub): Use a simpler stub when r2 is not saved.
(ppc64_elf_size_stubs): Leave stub_type as ppc_stub_plt_call for
optimized localentry:0 stubs.
(ppc64_elf_build_stubs): Save r2 in ELFv2 __glink_PLTresolve.
(ppc64_elf_relocate_section): Leave nop unchanged for optimized
localentry:0 stubs.
(ppc64_elf_finish_dynamic_sections): Set PPC64_OPT_LOCALENTRY in
DT_PPC64_OPT.
* elf64-ppc.h (struct ppc64_elf_params): Add plt_localentry0.
include/
* elf/ppc64.h (PPC64_OPT_LOCALENTRY): Define.
ld/
* emultempl/ppc64elf.em (params): Init plt_localentry0 field.
(enum ppc64_opt): New, replacing OPTION_* defines. Add
OPTION_PLT_LOCALENTRY, and OPTION_NO_PLT_LOCALENTRY.
(PARSE_AND_LIST_*): Support --plt-localentry and --no-plt-localentry.
* testsuite/ld-powerpc/elfv2so.d: Update.
* testsuite/ld-powerpc/powerpc.exp (TLS opt 5): Use --no-plt-localentry.
* testsuite/ld-powerpc/tlsopt5.d: Update.