In the TLS GD/LD to LE optimization, ld replaces a sequence like
addi 3,2,x@got@tlsgd R_PPC64_GOT_TLSGD16 x
bl __tls_get_addr(x@tlsgd) R_PPC64_TLSGD x
R_PPC64_REL24 __tls_get_addr
nop
with
addis 3,13,x@tprel@ha R_PPC64_TPREL16_HA x
addi 3,3,x@tprel@l R_PPC64_TPREL16_LO x
nop
When the tprel offset is small, this can be further optimized to
nop
addi 3,13,x@tprel
nop
bfd/
* elf64-ppc.c (struct ppc_link_hash_table): Add do_tls_opt.
(ppc64_elf_tls_optimize): Set it.
(ppc64_elf_relocate_section): Nop addis on TPREL16_HA, and convert
insn on TPREL16_LO and TPREL16_LO_DS relocs to use r13 when
addis would add zero.
* elf32-ppc.c (struct ppc_elf_link_hash_table): Add do_tls_opt.
(ppc_elf_tls_optimize): Set it.
(ppc_elf_relocate_section): Nop addis on TPREL16_HA, and convert
insn on TPREL16_LO relocs to use r2 when addis would add zero.
gold/
* powerpc.cc (Target_powerpc::Relocate::relocate): Nop addis on
TPREL16_HA, and convert insn on TPREL16_LO and TPREL16_LO_DS
relocs to use r2/r13 when addis would add zero.
ld/
* testsuite/ld-powerpc/tls.s: Add calls with tls markers.
* testsuite/ld-powerpc/tls32.s: Likewise.
* testsuite/ld-powerpc/powerpc.exp: Run tls marker tests.
* testsuite/ld-powerpc/tls.d: Adjust for TPREL16_HA/LO optimization.
* testsuite/ld-powerpc/tlsexe.d: Likewise.
* testsuite/ld-powerpc/tlsexetoc.d: Likewise.
* testsuite/ld-powerpc/tlsld.d: Likewise.
* testsuite/ld-powerpc/tlsmark.d: Likewise.
* testsuite/ld-powerpc/tlsopt4.d: Likewise.
* testsuite/ld-powerpc/tlstoc.d: Likewise.
There isn't a good reason for ld.bfd to behave differently from gold
in the code generated by TLS GD/LD to LE optimization.
bfd/
* elf64-ppc.c (ppc64_elf_relocate_section): When optimizing
__tls_get_addr call sequences to LE, don't move the addi down
to the nop. Replace the bl with addi and leave the nop alone.
ld/
* testsuite/ld-powerpc/tls.d: Update.
* testsuite/ld-powerpc/tlsexe.d: Update.
* testsuite/ld-powerpc/tlsexetoc.d: Update.
* testsuite/ld-powerpc/tlsld.d: Update.
* testsuite/ld-powerpc/tlsmark.d: Update.
* testsuite/ld-powerpc/tlsopt4.d: Update.
* testsuite/ld-powerpc/tlstoc.d: Update.
* ppc.h (R_PPC_TLSGD, R_PPC_TLSLD): Add new relocs.
* ppc64.h (R_PPC64_TLSGD, R_PPC64_TLSLD): Add new relocs.
bfd/
* reloc.c (BFD_RELOC_PPC_TLSGD, BFD_RELOC_PPC_TLSLD): New.
* section.c (struct bfd_section): Add has_tls_get_addr_call.
(BFD_FAKE_SECTION): Init new flag.
* ecoff.c (bfd_debug_section): Likewise.
* bfd-in2.h: Regenerate.
* libbfd.h: Regenerate.
* elf32-ppc.c (ppc_elf_howto_raw): Add R_PPC_TLSGD and R_PPC_TLSLD.
(ppc_elf_reloc_type_lookup): Handle new relocs.
(ppc_elf_check_relocs): Set has_tls_get_addr_call on finding such
without marker relocs.
(ppc_elf_tls_optimize): Allow out-of-order __tls_get_addr relocs
if section has no old-style calls.
(ppc_elf_relocate_section): Set tls_mask for non-tls relocs too.
Don't try to optimize new-style __tls_get_addr call when handling
arg setup relocs. Instead do so for R_PPC_TLSGD and R_PPC_TLSLD
relocs.
* elf64-ppc.c (ppc64_elf_howto_raw): Add R_PPC64_TLSGD, R_PPC64_TLSLD.
(ppc64_elf_reloc_type_lookup): Handle new relocs.
(ppc64_elf_check_relocs): Set has_tls_get_addr_call on finding such
without marker relocs.
(ppc64_elf_tls_optimize): Allow out-of-order __tls_get_addr relocs
if section has no old-style calls. Set toc_ref for new relocs as
appropriate.
(ppc64_elf_relocate_section): Set tls_mask for non-tls relocs too.
Don't try to optimize new-style __tls_get_addr call when handling
arg setup relocs. Instead do so for R_PPC_TLSGD and R_PPC_TLSLD
relocs.
gas/
* config/tc-ppc.c (ppc_elf_suffix): Error if ppc32 tls got relocs
have non-zero addend.
(md_assemble): Parse args of __tls_get_addr calls.
(md_apply_fix): Handle BFD_RELOC_PPC_TLSGD and BFD_RELOC_PPC_TLSLD.
ld/testsuite/
* ld-powerpc/tlsmark.s, * ld-powerpc/tlsmark.d: New test.
* ld-powerpc/tlsmark32.s, * ld-powerpc/tlsmark32.d: New test.
* ld-powerpc/powerpc.exp: Run them.