This patch better supports mixing of power10 and non-power10 code,
as might be seen in a cpu-optimized library using ifuncs to select
functions optimized for a given cpu. Using -Wl,--no-power10-stubs
isn't that good in this situation since non-power10 notoc stubs are
slower and larger than the power10 variants, which you'd like to use
on power10 code paths.
With this change, power10 pc-relative code that makes calls marked
@notoc uses power10 stubs if stubs are necessary, and other calls use
non-power10 instructions in stubs. This will mean that if gcc is
generating code for -mcpu=power10 but with pc-rel disabled then you'll
get the older stubs even on power10 (unless you force with
-Wl,--power10-stubs). That shouldn't be too big a problem: stubs that
use r2 are reasonable. It's just the ones that set up addressing
using "mflr 12; bcl 20,31,.+4; mflr 11; mtlr 12" that should be
avoided if possible.
bfd/
* elf64-ppc.c (struct ppc_link_hash_table): Add has_power10_relocs.
(select_alt_stub): New function.
(ppc_get_stub_entry): Use it here.
(ppc64_elf_check_relocs): Set had_power10_relocs rather than
power10_stubs.
(ppc64_elf_size_stubs): Clear power10_stubs here instead. Don't
merge notoc stubs with other varieties when power10_stubs is "auto".
Instead dup the stub hash table entry.
(plt_stub_size, ppc_build_one_stub, ppc_size_one_stub): Adjust
tests of power10_stubs.
ld/
* emultempl/ppc64elf.em (power10-stubs): Accept optional "auto" arg.
* ld.texi (power10-stubs): Update.
* testsuite/ld-powerpc/callstub-1.d: Force --power10-stubs.
* testsuite/ld-powerpc/callstub-2.d: Relax branch offset comparison.
* testsuite/ld-powerpc/callstub-4.d: New test.
* testsuite/ld-powerpc/notoc.d: Force --no-power10-stubs.
* testsuite/ld-powerpc/notoc3.d,
* testsuite/ld-powerpc/notoc3.s,
* testsuite/ld-powerpc/notoc3.wf: New test.
* testsuite/ld-powerpc/powerpc.exp: Run new tests. Pass
--no-power10-stubs for notoc link.
Needed for libraries that use ifuncs or other means to support
cpu-optimized versions of functions, some power10, some not, and those
functions make calls using linkage stubs.
bfd/
* elf64-ppc.h (struct ppc64_elf_params): Add power10_stubs.
* elf64-ppc.c (struct ppc_link_hash_table): Delete
power10_stubs.
(ppc64_elf_check_relocs): Adjust setting of power10_stubs.
(plt_stub_size, ppc_build_one_stub, ppc_size_one_stub): Adjust
uses of power10_stubs.
ld/
* emultempl/ppc64elf.em (params): Init new field.
(enum ppc64_opt): Add OPTION_POWER10_STUBS and OPTION_NO_POWER10_STUBS.
(PARSE_AND_LIST_LONGOPTS): Support --power10-stubs and
--no-power10-stubs.
(PARSE_AND_LIST_OPTIONS, PARSE_AND_LIST_ARGS_CASES): Likewise.
* testsuite/ld-powerpc/callstub-3.d: New test.
* testsuite/ld-powerpc/powerpc.exp: Run it.
Add ifunc_resolvers to elf_link_hash_table and use it for both x86 and
ppc64. Before glibc commit b5c45e837, DT_TEXTREL is incompatible with
IFUNC resolvers. Set ifunc_resolvers if there are IFUNC resolvers and
issue a warning for IFUNC resolvers with DT_TEXTREL.
bfd/
PR ld/18801
* elf-bfd.h (elf_link_hash_table): Add ifunc_resolvers.
(_bfd_elf_allocate_ifunc_dyn_relocs): Remove the
bfd_boolean * argument. Set ifunc_resolvers if there are IFUNC
resolvers.
* elf-ifunc.c (_bfd_elf_allocate_ifunc_dyn_relocs): Updated.
Set ifunc_resolvers if there are FUNC resolvers.
* elf64-ppc.c (ppc_link_hash_table): Remove local_ifunc_resolver.
(build_global_entry_stubs_and_plt): Replace local_ifunc_resolver
with elf.ifunc_resolvers.
(write_plt_relocs_for_local_syms): Likewise.
(ppc64_elf_relocate_section): Likewise.
(ppc64_elf_finish_dynamic_sections): Likewise.
* elfnn-aarch64.c (elfNN_aarch64_allocate_ifunc_dynrelocs):
Updated.
* elfxx-x86.c (elf_x86_allocate_dynrelocs): Likewise.
(_bfd_x86_elf_size_dynamic_sections): Check elf.ifunc_resolvers
instead of readonly_dynrelocs_against_ifunc.
* elfxx-x86.h (elf_x86_link_hash_table): Remove
readonly_dynrelocs_against_ifunc.
ld/
PR ld/18801
* testsuite/ld-i386/i386.exp: Run ifunc-textrel-1a,
ifunc-textrel-1b, ifunc-textrel-2a and ifunc-textrel-2b.
* testsuite/ld-x86-64/x86-64.exp: Likewise.
* testsuite/ld-i386/ifunc-textrel-1a.d: Likewise.
* testsuite/ld-i386/ifunc-textrel-1b.d: Likewise.
* testsuite/ld-i386/ifunc-textrel-2a.d: Likewise.
* testsuite/ld-i386/ifunc-textrel-2b.d: Likewise.
* testsuite/ld-x86-64/ifunc-textrel-1.s: Likewise.
* testsuite/ld-x86-64/ifunc-textrel-1a.d: Likewise.
* testsuite/ld-x86-64/ifunc-textrel-1b.d: Likewise.
* testsuite/ld-x86-64/ifunc-textrel-2.s: Likewise.
* testsuite/ld-x86-64/ifunc-textrel-2a.d: Likewise.
* testsuite/ld-x86-64/ifunc-textrel-2b.d: Likewise.
* testsuite/ld-i386/pr18801a.d: Expect warning for IFUNC
resolvers.
* testsuite/ld-i386/pr18801b.d: Likewise.
* estsuite/ld-x86-64/pr18801a.d: Likewise.
* estsuite/ld-x86-64/pr18801b.d: Likewise.
For ppc64 I set flags when recording the dynamic relocation rather
than when allocating space. That allows you to distinguish three
cases:
1) The dynamic ifunc relocation is in an executable and will always be
to an ifunc resolver in the executable.
2) The dynamic ifunc relocation is in a shared library which provides
an ifunc resolver, but that may be overridden at runtime to use a
resolver in another binary.
3) The dynamic ifunc relocation is not to a locally defined ifunc
resolver.
Case (3) won't cause a segfault trying to run resolver code that is
non-exec on older glibc.
I made case (1) an error for ppc64, but since newer glibc ld.so does
allow running ifunc resolvers when segments are writable I suppose I
should downgrade that to a warning like case (2).
* elf64-ppc.c (struct ppc_link_hash_table): Delete
maybe_local_ifunc_resolver field.
(build_global_entry_stubs_and_plt): Set local_ifunc_resolver in
cases where maybe_local_ifunc_resolver was set.
(ppc64_elf_relocate_section): Likewise.
(ppc64_elf_finish_dynamic_sections): Downgrade ifunc with textrel
error to a warning.
These relocations should have had REL in their names, to reflect the
fact that they are pc-relative. Fix that now by adding _PCREL.
I've added some back-compatibility code to support anyone using
.reloc with the old relocations.
include/
* elf/ppc64.h (elf_ppc64_reloc_type): Rename
R_PPC64_GOT_TLSGD34 to R_PPC64_GOT_TLSGD_PCREL34,
R_PPC64_GOT_TLSLD34 to R_PPC64_GOT_TLSLD_PCREL34,
R_PPC64_GOT_TPREL34 to R_PPC64_GOT_TPREL_PCREL34, and
R_PPC64_GOT_DTPREL34 to R_PPC64_GOT_DTPREL_PCREL34.
bfd/
* reloc.c: Rename
BFD_RELOC_PPC64_GOT_TLSGD34 to BFD_RELOC_PPC64_GOT_TLSGD_PCREL34,
BFD_RELOC_PPC64_GOT_TLSLD34 to BFD_RELOC_PPC64_GOT_TLSLD_PCREL34,
BFD_RELOC_PPC64_GOT_TPREL34 to BFD_RELOC_PPC64_GOT_TPREL_PCREL34,
BFD_RELOC_PPC64_GOT_DTPREL34 to BFD_RELOC_PPC64_GOT_DTPREL_PCREL34.
* elf64-ppc.c: Update throughout for reloc renaming.
(ppc64_elf_reloc_name_lookup): Handle old reloc names.
* libbfd.h: Regenerate.
* bfd-in2.h: Regenerate.
gas/
* config/tc-ppc.c: Update throughout for reloc renaming.
elfcpp/
* powerpc.h: Rename
R_PPC64_GOT_TLSGD34 to R_PPC64_GOT_TLSGD_PCREL34,
R_PPC64_GOT_TLSLD34 to R_PPC64_GOT_TLSLD_PCREL34,
R_PPC64_GOT_TPREL34 to R_PPC64_GOT_TPREL_PCREL34, and
R_PPC64_GOT_DTPREL34 to R_PPC64_GOT_DTPREL_PCREL34.
gold/
* powerpc.cc: Update throughout for reloc renaming.
Don't do anything special with non-loaded, non-alloced sections.
In particular, any relocs in such sections should not affect GOT
and PLT reference counting (ie. we don't allow them to create GOT
or PLT entries), there's no possibility or desire to optimize TLS
relocs, and there's not much point in propagating relocs to shared
libs that the dynamic linker won't relocate.
Since check_relocs is no longer called on non-loaded, non-alloced
sections, remove SEC_ALLOC check. Resolve relocation in debug section
against symbol defined in shared library to 0.
bfd/
PR ld/26080
* elf-m10300.c (mn10300_elf_relocate_section): Resolve relocation
in debug section against symbol defined in shared library to 0.
* elf32-i386.c (elf_i386_check_relocs): Remove SEC_ALLOC check.
* elf32-lm32.c (lm32_elf_check_relocs): Likewise.
* elf32-m32r.c (m32r_elf_check_relocs): Likewise.
* elf32-nds32.c (nds32_elf_check_relocs): Likewise.
* elf32-nios2.c (nios2_elf32_check_relocs): Likewise.
* elf32-or1k.c (or1k_elf_check_relocs): Likewise.
* elf32-ppc.c (ppc_elf_check_relocs): Likewise.
* elf32-sh.c (sh_elf_check_relocs): Likewise.
* elf32-xtensa.c (elf_xtensa_check_relocs): Likewise.
* elf64-alpha.c (elf64_alpha_check_relocs): Likewise.
* elf64-ppc.c (ppc64_elf_check_relocs): Likewise.
* elf64-x86-64.c (elf_x86_64_check_relocs): Likewise.
* elfxx-mips.c (_bfd_mips_elf_check_relocs): Likewise.
* elf32-vax.c (elf_vax_check_relocs): Set non_got_ref for non-GOT
reference.
(elf_vax_adjust_dynamic_symbol): Generate a copy reloc only if
there is non-GOT reference.
* elflink.c (_bfd_elf_link_check_relocs): Skip non-loaded,
non-alloced sections.
ld/
PR ld/26080
* testsuite/ld-elf/comm-data.exp: Remove copy_reloc.
* testsuite/ld-elf/comm-data2r.rd: Removed.
* testsuite/ld-elf/comm-data2r.sd: Likewise.
* testsuite/ld-elf/comm-data2r.xd: Likewise.
Now that ISA3.1 is out we can finish with the powerxx silliness.
bfd/
* elf64-ppc.c: Rename powerxx to power10 throughout.
gas/
* config/tc-ppc.c (md_assemble): Update for PPC_OPCODE_POWER10
renaming.
* testsuite/gas/ppc/prefix-align.d: Use -mpower10/-Mpower10 in
place of -mfuture/-Mfuture.
* testsuite/gas/ppc/prefix-pcrel.d: Likewise.
* testsuite/gas/ppc/prefix-reloc.d: Likewise.
gold/
* powerpc.cc: Rename powerxx to power10 throughout.
include/
* elf/ppc64.h: Update comment.
* opcode/ppc.h (PPC_OPCODE_POWER10): Rename from PPC_OPCODE_POWERXX.
ld/
* testsuite/ld-powerpc/callstub-1.d: Use -mpower10/-Mpower10 in
place of -mfuture/-Mfuture.
* testsuite/ld-powerpc/notoc2.d: Likewise.
* testsuite/ld-powerpc/powerpc.exp: Likewise.
* testsuite/ld-powerpc/tlsgd.d: Likewise.
* testsuite/ld-powerpc/tlsie.d: Likewise.
* testsuite/ld-powerpc/tlsld.d: Likewise.
opcodes/
* ppc-dis.c (ppc_opts): Add "power10" entry.
(print_insn_powerpc): Update for PPC_OPCODE_POWER10 renaming.
* ppc-opc.c (POWER10): Rename from POWERXX. Update all uses.
Stripping .rela.branch_lt is easy enough but messes with the
testsuite due to stub symbols (that use section id) changing. Tests
that run on more than one target variant can be tricky to fix, this
renaming happened to work.
bfd/
* elf64-ppc.c (ppc64_elf_size_stubs): Strip relbrlt too.
ld/
* testsuite/ld-powerpc/tlsopt5.s: Rename foo to aaaaa.
* testsuite/ld-powerpc/tlsopt5.d: Adjust to suit.
* testsuite/ld-powerpc/tlsopt6.d: Likewise.
I can't see any reason why ELFv2 should create a PLT entry for ifuncs
referenced by GOT relocs as long as the GOT entry remains. The GOT
entry ought to be resolved by ld.so to the value returned by the ifunc
resolver, or if there is global entry stub created for some other
reason, by the linker to the stub address.
* elf64-ppc.c (ppc64_elf_check_relocs): Don't create plt entries
for GOT relocs against ifuncs.
When the symbol referenced by a GOT reloc is an ifunc, we can't
optimise away the GOT indirection. Well, we can, but only if a global
entry stub is created with the ifunc symbol redefined to the stub.
But that results in slower code and an indirection via the PLT so
there isn't much to like about that solution.
* elf64-ppc.c (ppc64_elf_edit_toc): Exclude ifunc from GOT
optimisation.
(ppc64_elf_relocate_section): Likewise.
If this code dealing with possible conversion of inline plt sequences
is ever executed, ld will hang. A binary with such sequences and of
code size larger than approximately 90% the reach of an unconditional
branch is the trigger. Oops.
* elf64-ppc.c (ppc64_elf_inline_plt): Do increment rel in for loop.
For those ELF targets that have .sdata or .sbss sections, or similar
sections, arrange to mark the sections with the SEC_SMALL_DATA flag.
This fixes regressions in nm symbol type caused by removing .sdata
and .sbss from coff_section_type with commit 49d9fd42ac.
* elf32-m32r.c (m32r_elf_section_flags): New function.
(elf_backend_section_flags): Define.
* elf32-nds32.c (nds32_elf_section_flags): New function.
(elf_backend_section_flags): Define.
* elf32-ppc.c (ppc_elf_section_from_shdr): Set SEC_SMALL_DATA for
.sbss and .sdata sections.
* elf32-v850.c (v850_elf_section_from_shdr): Set SEC_SMALL_DATA
for SHF_V850_GPREL sections.
* elf64-alpha.c (elf64_alpha_section_from_shdr): Delete outdated
FIXME.
* elf64-hppa.c (elf64_hppa_section_from_shdr): Set SEC_SMALL_DATA
for SHF_PARISC_SHORT sections.
* elf64-ppc.c (ppc64_elf_section_flags): New function.
(elf_backend_section_flags): Define.
* elfxx-mips.c (_bfd_mips_elf_section_from_shdr): Set SEC_SMALL_DATA
for SHF_MIPS_GPREL sections. Delete FIXME.
This provides a linker generated __tls_get_addr_desc wrapper function
preserving registers around a __tls_get_addr call. The idea being to
support __tls_get_addr_desc without requiring a glibc update.
bfd/
* elf64-ppc.c (struct ppc_link_hash_table): Add tga_group.
(ppc64_elf_archive_symbol_lookup): Extract __tls_get_addr_opt for
__tls_get_addr_desc.
(ppc64_elf_size_stubs): Add section for linker generated
__tls_get_addr_desc wrapper function. Loop at least once if
generating this function.
(emit_tga_desc, emit_tga_desc_eh_frame): New functions.
(ppc64_elf_build_stubs): Generate __tls_get_addr_desc.
ld/
* testsuite/ld-powerpc/tlsdesc3.d,
* testsuite/ld-powerpc/tlsdesc3.wf,
* testsuite/ld-powerpc/tlsdesc4.d,
* testsuite/ld-powerpc/tlsdesc4.s,
* testsuite/ld-powerpc/tlsdesc4.wf: New tests.
* testsuite/ld-powerpc/powerpc.exp: Run them.
This implements register saving and restoring in the __tls_get_addr
call stub, so that when glibc supports the optimized tls call stub gcc
can generate code that assumes only r0, r12 and of course r3 are
changed on a __tls_get_addr call. When gcc expects __tls_get_addr
calls to preserve registers the call will be to __tls_get_addr_desc,
which will be translated by the linker to a call to __tls_get_addr_opt.
bfd/
* elf64-ppc.h (struct ppc64_elf_params): Add no_tls_get_addr_regsave.
* elf64-ppc.c (struct ppc_link_hash_table): Add tga_desc and
tga_desc_fd.
(is_tls_get_addr): Match tga_desc and tga_desc_df too.
(STDU_R1_0R1, ADDI_R1_R1): Define.
(tls_get_addr_prologue, tls_get_addr_epilogue): New functions.
(ppc64_elf_tls_setup): Set up tga_desc and tga_desc_fd. Indirect
tga_desc_fd to opt_fd, and tga_desc to opt. Set
no_tls_get_addr_regsave.
(branch_reloc_hash_match): Add hash3 and hash4.
(ppc64_elf_tls_optimize): Handle tga_desc_fd and tga_desc too.
(ppc64_elf_size_dynamic_sections): Likewise.
(ppc64_elf_relocate_section): Likewise.
(plt_stub_size, build_plt_stub): Likewise. Size regsave
__tls_get_addr stub.
(build_tls_get_addr_stub): Build regsave __tls_get_addr stub and
eh_frame.
(ppc_size_one_stub): Handle tga_desc_fd and tga_desc too. Size
eh_frame for regsave __tls_get_addr.
gas/
* config/tc-ppc.c (parse_tls_arg): Handle tls arg for
__tls_get_addr_desc and __tls_get_addr_opt.
ld/
* emultempl/ppc64elf.em (ppc64_opt, PARSE_AND_LIST_LONGOPTS),
(PARSE_AND_LIST_OPTIONS, PARSE_AND_LIST_ARGS_CASES): Support
--tls-get-addr-regsave and --no-tls-get-addr-regsave.
(params): Init new field.
* ld.texi (--tls-get-addr-regsave, --no-tls-get-addr-regsave):
Document.
* testsuite/ld-powerpc/tlsdesc.s,
* testsuite/ld-powerpc/tlsdesc.d,
* testsuite/ld-powerpc/tlsdesc.wf,
* testsuite/ld-powerpc/tlsdesc2.d,
* testsuite/ld-powerpc/tlsdesc2.wf,
* testsuite/ld-powerpc/tlsexenors.d,
* testsuite/ld-powerpc/tlsexenors.r,
* testsuite/ld-powerpc/tlsexers.d,
* testsuite/ld-powerpc/tlsexers.r,
* testsuite/ld-powerpc/tlsexetocnors.d,
* testsuite/ld-powerpc/tlsexetocrs.d,
* testsuite/ld-powerpc/tlsexetocrs.r,
* testsuite/ld-powerpc/tlsopt6.d,
* testsuite/ld-powerpc/tlsopt6.wf: New.
* testsuite/ld-powerpc/powerpc.exp: Run new tests.
When linking with --no-tls-optimize the linker doesn't generate a call
or long branch stub to __tls_get_addr in some circumstances, giving:
relocation truncated to fit: R_PPC64_REL24 against symbol `__tls_get_addr'
* elf64-ppc.c (ppc64_elf_size_stubs): Correct condition under
which __tls_get_addr calls will be eliminated.
This modifies the special __tls_get_addr stub that checks for a
tlsdesc style __tls_index entry and returns early. Not using r11
isn't much benefit at the moment but a followup patch will preserve
regs around the first call to __tls_get_addr when the __tls_index
entry isn't yet set up for an early return.
bfd/
* elf64-ppc.c (LD_R11_0R3, CMPDI_R11_0, STD_R11_0R1, LD_R11_0R1),
(MTLR_R11): Don't define.
(LD_R0_0R3, CMPDI_R0_0): Define.
(build_tls_get_addr_stub): Don't use r11 in stub.
ld/
* testsuite/ld-powerpc/tlsexe.d: Match new __tls_get_addr stub.
* testsuite/ld-powerpc/tlsexeno.d: Likewise.
* testsuite/ld-powerpc/tlsexetoc.d: Likewise.
* testsuite/ld-powerpc/tlsexetocno.d: Likewise.
* testsuite/ld-powerpc/tlsopt5.d: Likewise.
Function symbols of course don't normally want .dynbss copies but
with some old versions of gcc they are needed to copy the function
descriptor. This patch restricts the cases where they are useful to
compilers using dot-symbols, and enables the warning regardless of
whether a PLT entry is emitted in the executable. PLTs in shared
libraries are affected by a .dynbss copy in the executable.
bfd/
PR 25384
* elf64-ppc.c (ELIMINATE_COPY_RELOCS): Update comment.
(ppc64_elf_adjust_dynamic_symbol): Don't allow .dynbss copies
of function symbols unless dot symbols are present. Do warn
whenever one is created, regardles of whether a PLT entry is
also emitted for the function symbol.
ld/
* testsuite/ld-powerpc/ambiguousv1b.d: Adjust expected output.
* testsuite/ld-powerpc/funref.s: Align func_tab.
* testsuite/ld-powerpc/funref2.s: Likewise.
* testsuite/ld-powerpc/funv1.s: Add dot symbols.
This is fussing about nothing really but since I was looking at signed
vs. unsigned issues, I decided to use the correct types here.
* elf32-ppc.c (ppc_elf_get_synthetic_symtab): Use size_t for vars.
* elf64-ppc.c (sym_exists_at): Use size_t for lo, hi and mid.
and other tidies. I think it's better to default to passing the
section to bfd_octets_per_byte, even in cases where we know it won't
make a difference.
A number of the coff reloc functions used bfd_octets_per_byte wrongly,
not factoring it into the offset into the data buffer. As it happens,
the targets using those files always had bfd_octets_per_byte equal to
one, so there wasn't any detectable wrong behaviour. However, it is
wrong in the source and might cause trouble for anyone creating a new
target. Besides fixing that, the patch also defines OCTETS_PER_BYTE
as one in target files where that is appropriate.
bfd/
* archures.c (bfd_octets_per_byte): Tail call
bfd_arch_mach_octets_per_byte.
* coff-arm.c (OCTETS_PER_BYTE): Define.
(coff_arm_reloc): Introduce new "octets" temp. Use OCTETS_PER_BYTE
with section. Correct "addr". Remove ATTRIBUTE_UNUSED.
* coff-i386.c (coff_i386_reloc): Similarly.
* coff-mips.c (mips_reflo_reloc): Similarly.
* coff-x86_64.c (coff_amd64_reloc): Similarly.
* elf32-msp430.c (OCTETS_PER_BYTE): Define.
(rl78_sym_diff_handler): Use OCTETS_PER_BYTE, with section.
* elf32-nds32.c (nds32_elf_get_relocated_section_contents): Similarly.
* elf32-ppc.c (ppc_elf_addr16_ha_reloc): Similarly.
* elf32-pru.c (pru_elf32_do_ldi32_relocate): Similarly.
* elf32-s12z.c (opru18_reloc): Similarly.
* elf32-sh.c (sh_elf_reloc): Similarly.
* elf32-spu.c (spu_elf_rel9): Similarly.
* elf32-xtensa.c (bfd_elf_xtensa_reloc): Similarly.
* elf64-ppc.c (ppc64_elf_ha_reloc, ppc64_elf_brtaken_reloc),
(ppc64_elf_toc64_reloc): Similarly.
* bfd.c (bfd_get_section_limit): Pass section to bfd_octets_per_byte.
* cofflink.c (_bfd_coff_link_input_bfd),
(_bfd_coff_reloc_link_order): Likewise.
* elf.c (_bfd_elf_section_offset): Likewise.
* elflink.c (resolve_section, bfd_elf_perform_complex_relocation),
(elf_link_input_bfd, elf_reloc_link_order, elf_fixup_link_order),
(bfd_elf_final_link): Likewise.
* elf.c (_bfd_elf_make_section_from_shdr): Don't strncmp twice
to set SEC_ELF_OCTETS.
* reloc.c (bfd_perform_relocation): Tidy SEC_ELF_OCTETS special case.
(bfd_install_relocation): Likewise.
(_bfd_final_link_relocate): Don't recalculate octets.
* syms.c (_bfd_stab_section_find_nearest_line): Introduc new
"octets" temp.
* bfd-in2.h: Regenerate.
ld/
* ldexp.c (fold_name): Pass section to bfd_octets_per_byte.
* ldlang.c (init_opb): Don't call bfd_arch_mach_octets_per_byte
unnecessarily.
All symbols, sizes and relocations in this section are octets instead of
bytes. Required for DWARF debug sections as DWARF information is
organized in octets, not bytes.
bfd/
* section.c (struct bfd_section): New flag SEC_ELF_OCTETS.
* archures.c (bfd_octets_per_byte): New parameter sec.
If section is not NULL and SEC_ELF_OCTETS is set, one octet es
returned [ELF targets only].
* bfd.c (bfd_get_section_limit): Provide section parameter to
bfd_octets_per_byte.
* bfd-in2.h: regenerate.
* binary.c (binary_set_section_contents): Move call to
bfd_octets_per_byte into section loop. Provide section parameter
to bfd_octets_per_byte.
* coff-arm.c (coff_arm_reloc): Provide section parameter
to bfd_octets_per_byte.
* coff-i386.c (coff_i386_reloc): likewise.
* coff-mips.c (mips_reflo_reloc): likewise.
* coff-x86_64.c (coff_amd64_reloc): likewise.
* cofflink.c (_bfd_coff_link_input_bfd): likewise.
(_bfd_coff_reloc_link_order): likewise.
* elf.c (_bfd_elf_section_offset): likewise.
(_bfd_elf_make_section_from_shdr): likewise.
Set SEC_ELF_OCTETS for sections with names .gnu.build.attributes,
.debug*, .zdebug* and .note.gnu*.
* elf32-msp430.c (rl78_sym_diff_handler): Provide section parameter
to bfd_octets_per_byte.
* elf32-nds.c (nds32_elf_get_relocated_section_contents): likewise.
* elf32-ppc.c (ppc_elf_addr16_ha_reloc): likewise.
* elf32-pru.c (pru_elf32_do_ldi32_relocate): likewise.
* elf32-s12z.c (opru18_reloc): likewise.
* elf32-sh.c (sh_elf_reloc): likewise.
* elf32-spu.c (spu_elf_rel9): likewise.
* elf32-xtensa.c (bfd_elf_xtensa_reloc): likewise
* elf64-ppc.c (ppc64_elf_brtaken_reloc): likewise.
(ppc64_elf_addr16_ha_reloc): likewise.
(ppc64_elf_toc64_reloc): likewise.
* elflink.c (bfd_elf_final_link): likewise.
(bfd_elf_perform_complex_relocation): likewise.
(elf_fixup_link_order): likewise.
(elf_link_input_bfd): likewise.
(elf_link_sort_relocs): likewise.
(elf_reloc_link_order): likewise.
(resolve_section): likewise.
* linker.c (_bfd_generic_reloc_link_order): likewise.
(bfd_generic_define_common_symbol): likewise.
(default_data_link_order): likewise.
(default_indirect_link_order): likewise.
* srec.c (srec_set_section_contents): likewise.
(srec_write_section): likewise.
* syms.c (_bfd_stab_section_find_nearest_line): likewise.
* reloc.c (_bfd_final_link_relocate): likewise.
(bfd_generic_get_relocated_section_contents): likewise.
(bfd_install_relocation): likewise.
For section which have SEC_ELF_OCTETS set, multiply output_base
and output_offset with bfd_octets_per_byte.
(bfd_perform_relocation): likewise.
include/
* coff/ti.h (GET_SCNHDR_SIZE, PUT_SCNHDR_SIZE, GET_SCN_SCNLEN),
(PUT_SCN_SCNLEN): Adjust bfd_octets_per_byte calls.
binutils/
* objdump.c (disassemble_data): Provide section parameter to
bfd_octets_per_byte.
(dump_section): likewise
(dump_section_header): likewise. Show SEC_ELF_OCTETS flag if set.
gas/
* as.h: Define SEC_OCTETS as SEC_ELF_OCTETS if OBJ_ELF.
* dwarf2dbg.c: (dwarf2_finish): Set section flag SEC_OCTETS for
.debug_line, .debug_info, .debug_abbrev, .debug_aranges, .debug_str
and .debug_ranges sections.
* write.c (maybe_generate_build_notes): Set section flag
SEC_OCTETS for .gnu.build.attributes section.
* frags.c (frag_now_fix): Don't divide by OCTETS_PER_BYTE if
SEC_OCTETS is set.
* symbols.c (resolve_symbol_value): Likewise.
ld/
* ldexp.c (fold_name): Provide section parameter to
bfd_octets_per_byte.
* ldlang (init_opb): New argument s. Set opb_shift to 0 if
SEC_ELF_OCTETS for the current section is set.
(print_input_section): Pass current section to init_opb.
(print_data_statement,print_reloc_statement,
print_padding_statement): Likewise.
(lang_check_section_addresses): Call init_opb for each
section.
(lang_size_sections_1,lang_size_sections_1,
lang_do_assignments_1): Likewise.
(lang_process): Pass NULL to init_opb.
qsort isn't guaranteed to be a stable sort, that is, elements
comparing equal according to the comparison function may be reordered
relative to their original ordering. Of course sometimes you may not
care, but even in those cases it is good to force some ordering
(ie. not have the comparison function return 0) so that linker output
is reproducible over different libc qsort implementations.
One way to make qsort stable (which the glibc manual incorrectly says
is the only way) is to augment the elements being sorted with a
monotonic counter of some kind, and use that counter as the final
arbiter of ordering in the comparison function.
Another way is to set up an array of pointers into the array of
elements, first pointer to first element, second pointer to second
element and so so, and sort the pointer array rather than the element
array. Final arbiter in the comparison function then is the pointer
difference. This works well with, for example, the symbol pointers
returned by _bfd_elf_canonicalize_symtab which point into a symbol
array.
This patch fixes a few places where sorting by symbol pointers is
appropriate, and adds comments where qsort stability is a non-issue.
* elf-strtab.c (strrevcmp): Comment.
* merge.c (strrevcmp): Likewise.
* elf64-ppc.c (compare_symbols): Correct final pointer comparison.
Comment on why comparing pointers ensures a stable sort.
* elflink.c (struct elf_symbol): Add void* to union.
(elf_sort_elf_symbol): Ensure a stable sort with pointer comparison.
(elf_sym_name_compare): Likewise.
(bfd_elf_match_symbols_in_sections): Style fix.
(elf_link_sort_cmp1): Comment.
A bug crept into commit f749f26eea, which could cause linker
segfaults when creating PIEs. This patch fixes it.
* elf64-ppc.c (ppc64_elf_size_dynamic_sections): Do allocate
space for local got non-tls relocs when PIE.
ppc*_elf_tls_optimize decrements the PLT refcount for __tls_get_addr
when a GD or LD sequence can be optimized. Without tls marker relocs
this must be done when processing the argument setup relocations.
With marker relocs it's better done when processing the marker reloc.
But don't count them both ways.
Seen as "unresolvable R_PPC_REL24 relocation against symbol
`__tls_get_addr_opt'" (and other branch relocs).
* elf32-ppc.c (ppc_elf_tls_optimize): Don't process R_PPC_TLSLD
with non-local symbol. Don't double count __tls_get_addr calls
with marker relocs.
* elf64-ppc.c (ppc64_elf_tls_optimize): Likewise.
has_tls_get_addr_call is no longer named correctly as the flag is
only set on finding a __tls_get_addr call without tlsld/tlsgd marker
relocations.
* elf32-ppc.c (nomark_tls_get_addr): Rename from has_tls_get_addr_call
throughout.
* elf64-ppc.c (nomark_tls_get_addr): Likewise.
1) GOT entries generated for any of the GOT TLS relocations don't need
dynamic relocations for locally defined symbols in PIEs. In the case
of a tls_index doubleword, the dtpmod entry is known to be 1, and the
dtprel entry is also known at link time and relative. Similarly,
dtprel and tprel words are known at link time and relative. (GOT
entries for other than TLS symbols are not relative and thus need
dynamic relocations in PIEs.)
2) Local dynamic TLS code is really only meant for accesses local to
the current binary. There was a cheapskate test for this before using
the common tlsld_got slot, but the test wasn't exactly correct and
might confuse anyone looking at the code. The proper test,
SYMBOL_REFERENCES_LOCAL isn't so expensive that it should be avoided.
3) The same cheap test for local syms when optimising TLS sequences
should be SYMBOL_REFERENCES_LOCAL too.
bfd/
* elf64-ppc.c (ppc64_elf_check_relocs): Move initialisation of vars.
(ppc64_elf_tls_optimize): Correct is_local condition.
(allocate_got): Don't reserve dynamic relocations for any of the
tls got relocs in PIEs when the symbol is local.
(allocate_dynrelocs): Correct validity test for local sym using
tlsld_got slot.
(ppc64_elf_size_dynamic_sections): Don't reserve dynamic relocations
for any of the tls got relocs in PIEs.
(ppc64_elf_layout_multitoc): Likewise.
(ppc64_elf_relocate_section): Correct validity test for local sym
using tlsld_got slot. Don't emit dynamic relocations for any of
the tls got relocs in PIEs when the symbol is local.
* elf32-ppc.c (ppc_elf_tls_optimize): Correct is_local condition.
(got_relocs_needed): Delete.
(allocate_dynrelocs): Correct validity test for local sym using
tlsld_got slot. Don't reserve dynamic relocations for any of the
tls got relocs in PIEs when the symbol is local.
(ppc_elf_size_dynamic_sections): Don't reserve dynamic relocations
for any of the tls got relocs in PIEs.
(ppc_elf_relocate_section): Correct validity test for local sym
using tlsld_got slot. Don't emit dynamic relocations for any of
the tls got relocs in PIEs when the symbol is local.
ld/
* testsuite/ld-powerpc/tlsso.d: Adjust to suit tlsld_got usage change.
* testsuite/ld-powerpc/tlsso.g: Likewise.
* testsuite/ld-powerpc/tlsso.r: Likewise.
* testsuite/ld-powerpc/tlsso32.d: Likewise.
* testsuite/ld-powerpc/tlsso32.g: Likewise.
* testsuite/ld-powerpc/tlsso32.r: Likewise.
In check_relocs, bfd_link_pic true means ld is producing a shared
library or a position independent executable. !bfd_link_pic means a
fixed position (ie. static) executable since the relocatable linking
case is excluded. So it is appropriate to continue using bfd_link_pic
when testing whether non-pcrelative relocations should be dynamic, and
!bfd_link_pic for the special case of ifunc in static executables.
However, -Bsymbolic shouldn't affect PIEs (they are executables so
none of their symbols should be overridden) and PIEs can support copy
relocations, thus bfd_link_executable should be used in those cases
rather than bfd_link_pic.
I've also removed the test of ELIMINATE_COPY_RELOCS in check_relocs.
We can sort out what to do regarding copy relocs later, which allows
the code in check_relocs to be simplified.
* elf64-ppc.c (ppc64_elf_check_relocs): Use bfd_link_executable
in choosing between different actions for shared library and
non-shared library cases. Delete ELIMINATE_COPY_RELOCS test.
(dec_dynrel_count): Likewise. Account for ifunc special case.
(ppc64_elf_adjust_dynamic_symbol): Copy relocs are for executables,
not non-pic.
(allocate_dynrelocs): Comment fixes. Delete ELIMINATE_COPY_RELOCS
test.
This patch corrects the set of dynamic relocations recognised by gold
as supported by glibc, and teaches ld.bfd to report an error similar
to the gold error. Note that ld --noinhibit-exec can be used to
produce an output, supporting older ld with newer glibc if the set of
supported glibc dynamic relocations changes.
bfd/
* elf64-ppc.c (ppc64_glibc_dynamic_reloc): New function.
(ppc64_elf_relocate_section): Error if emitting unsupported
dynamic relocations.
gold/
* powerpc.cc (Target_powerpc::Scan::check_non_pic): Move REL24
to 32-bit supported.
Some versions of clang apparently generate non-PIC on powerpc64le to
access common symbols. Since a common symbol and a strong definition
with the same name should resolve to the strong definition we have the
possibility of non-PIC attempting to access shared library variables.
This is really a clanger since powerpc64le is supposed to be PIC by
default, but let's see if ld can cope by generating .dynbss copies.
* elf64-ppc.c (must_be_dyn_reloc): Return 0 for TOC16 relocs.
(ppc64_elf_check_relocs): Support dynamic/copy relocs for TOC16.
(ppc64_elf_adjust_dynamic_symbol): Don't keep dynamic reloc when
needs_copy even if all relocs are in rw sections.
(dec_dynrel_count): Handle TOC16 relocs.
(ppc64_elf_relocate_section): Support dynamic relocs for TOC16.
(ppc64_elf_finish_dynamic_symbol): Adjust to handle needs_copy
semantic change.
PC-relative relocs typically use the addend in adjusting what they are
relative to. For example:
bcl 20,31,1f
1: mflr 12
addi 12,12,xxx-1b
generates "R_PPC64_REL16 xxx+0x4" for the addi (when little-endian).
The addend reflects the fact that you want the offset relative to the
previous insn not the current one in this case.
So the question is, will we ever want to do something like that for an
instruction using R_PPC64_GOT_PCREL34? I thought so at the time I
first implemented support in ld but at the time I think the hardware
was possibly going to support pcrel+offset+reg addressing. In which
case you might want something like:
load_big_offset_into_r2
pld 3,sym-big_offset@got@pcrel(2)
which would be a way of supporting more than 8G offsets from code to
the GOT. We could do the same with
load_big_offset_into_r2
pla 9,sym-big_offset@got@pcrel
ldx 3,9,2
However, this is really a poor version of TOC-pointer relative code.
So let's go with an addend on R_PPC64_GOT_PCREL34 meaning that
sym+addend should be put in a GOT entry, and the relocation calculate
the pc-relative offset to that GOT entry.
Note that this is an extension to the ABI, which says (by the
expression given for GOT relocs) that non-zero addends on GOT and PLT
relocs are ignored. This is true for all GOT/PLT relocs, not just the
pcrel ones.
* elf64-ppc.c (ppc64_elf_check_relocs): Interpret an addend in
GOT_PCREL and PLT_PCREL relocs as affecting the value stored
in the GOT/PLT entry rather than affecting the offset to that
GOI/PLT entry.
(ppc64_elf_edit_toc, ppc64_elf_relocate_section): Likewise.
The loads and stores handled in the second instruction of a sequence
marked by R_PPC64_PCREL_OPT may be a prefix instruction. For example:
pld ra,symbol@got@pcrel
0:
pld rt,off(ra)
.reloc 0b-8,R_PPC64_PCREL_OPT,(.-8)-(0b-8)
can be optimised to
pld rt,symbol+off@pcrel
pnop
* elf64-ppc.c (xlate_pcrel_opt): Handle prefix loads and stores
in second instruction.
(ppc64_elf_relocate_section): Likewise.
We can easily support an offset on the second instruction of a
sequence marked with R_PPC64_PCREL_OPT. For example,
pla ra,symbol@pcrel
ld rt,off(ra)
can be optimised to
pld rt,symbol+off@pcrel
nop
* elf64-ppc.c (xlate_pcrel_opt): Add poff parameter. Allow offset
on second insn, return it in poff.
(ppc64_elf_relocate_section): Add offset to paddi addend for
PCREL_OPT.
Found on a GOT reference to __ehdr_start, which is tweaked to be
undefined weak at some stages of linking. SYMBOL_REFERENCES_LOCAL
isn't a sufficient test.
* elf64-ppc.c (ppc64_elf_edit_toc): Exclude undefined weak
symbols from GOT optimisation.
These are done in ppc64_elf_edit_toc, which now also garbage collects
unused GOT entries. The checks for legitimate instructions weren't
being done for the GOT relocs, unless the file also happened to have a
toc section.
* elf64-ppc.c (struct ppc64_elf_obj_tdata): Rename has_gotrel
to has_optrel.
(struct _ppc64_elf_section_data): Likewise.
(ppc64_elf_check_relocs): Set has_optrel for more relocs.
(ppc64_elf_edit_toc): Do ha/lo insn checks in GOT loop rather
than TOC loop. Check PLT16 insns too.
This patch supports using pcrel instructions in TLS code sequences. A
number of new relocations are needed, gas operand modifiers to
generate those relocations, and new TLS optimisation. For
optimisation it turns out that the new pcrel GD and LD sequences can
be distinguished from the non-pcrel GD and LD sequences by there being
different relocations on the new sequence. The final "add ra,rb,13"
on IE sequences similarly needs a new relocation, or as I chose, a
modification of R_PPC64_TLS. On pcrel IE code, the R_PPC64_TLS points
one byte into the "add" instruction rather than being on the
instruction boundary.
GD:
pla 3,z@got@tlsgd@pcrel # R_PPC64_GOT_TLSGD34
bl __tls_get_addr@notoc(z@tlsgd) # R_PPC64_TLSGD and R_PPC64_REL24_NOTOC
edited to IE
pld 3,z@got@tprel@pcrel
add 3,3,13
edited to LE
paddi 3,13,z@tprel
nop
LD:
pla 3,z@got@tlsld@pcrel # R_PPC64_GOT_TLSLD34
bl __tls_get_addr@notoc(z@tlsld) # R_PPC64_TLSLD and R_PPC64_REL24_NOTOC
..
paddi 9,3,z2@dtprel
pld 10,z3@got@dtprel@pcrel
add 10,10,3
edited to LE
paddi 3,13,0x1000
nop
IE:
pld 9,z@got@tprel@pcrel # R_PPC64_GOT_TPREL34
add 3,9,z@tls@pcrel # R_PPC64_TLS at insn+1
ldx 4,9,z@tls@pcrel
lwax 5,9,z@tls@pcrel
stdx 5,9,z@tls@pcrel
edited to LE
paddi 9,13,z@tprel
nop
ld 4,0(9)
lwa 5,0(9)
std 5,0(9)
LE:
paddi 10,13,z@tprel
include/
* elf/ppc64.h (R_PPC64_TPREL34, R_PPC64_DTPREL34),
(R_PPC64_GOT_TLSGD34, R_PPC64_GOT_TLSLD34),
(R_PPC64_GOT_TPREL34, R_PPC64_GOT_DTPREL34): Define.
(IS_PPC64_TLS_RELOC): Include new tls relocs.
bfd/
* reloc.c (BFD_RELOC_PPC64_TPREL34, BFD_RELOC_PPC64_DTPREL34),
(BFD_RELOC_PPC64_GOT_TLSGD34, BFD_RELOC_PPC64_GOT_TLSLD34),
(BFD_RELOC_PPC64_GOT_TPREL34, BFD_RELOC_PPC64_GOT_DTPREL34),
(BFD_RELOC_PPC64_TLS_PCREL): New pcrel tls relocs.
* elf64-ppc.c (ppc64_elf_howto_raw): Add howtos for pcrel tls relocs.
(ppc64_elf_reloc_type_lookup): Translate pcrel tls relocs.
(must_be_dyn_reloc, dec_dynrel_count): Add R_PPC64_TPREL64.
(ppc64_elf_check_relocs): Support pcrel tls relocs.
(ppc64_elf_tls_optimize, ppc64_elf_relocate_section): Likewise.
* bfd-in2.h: Regenerate.
* libbfd.h: Regenerate.
gas/
* config/tc-ppc.c (ppc_elf_suffix): Map "tls@pcrel", "got@tlsgd@pcrel",
"got@tlsld@pcrel", "got@tprel@pcrel", and "got@dtprel@pcrel".
(fixup_size, md_assemble): Handle pcrel tls relocs.
(ppc_force_relocation, ppc_fix_adjustable): Likewise.
(md_apply_fix, tc_gen_reloc): Likewise.
ld/
* testsuite/ld-powerpc/tlsgd.d,
* testsuite/ld-powerpc/tlsgd.s,
* testsuite/ld-powerpc/tlsie.d,
* testsuite/ld-powerpc/tlsie.s,
* testsuite/ld-powerpc/tlsld.d,
* testsuite/ld-powerpc/tlsld.s: New tests.
* testsuite/ld-powerpc/powerpc.exp: Run them.