For 'libgcc/config/gcn/gthr-gcn.h' used in libstdc++ context (WIP), we have:
[...]/build-gcc-offload-amdgcn-amdhsa/amdgcn-amdhsa/libstdc++-v3/include/amdgcn-amdhsa/bits/gthr-default.h: In function ‘void* __gthread_getspecific(__gthread_key_t)’:
[...]/build-gcc-offload-amdgcn-amdhsa/amdgcn-amdhsa/libstdc++-v3/include/amdgcn-amdhsa/bits/gthr-default.h:90:10: error: ‘NULL’ was not declared in this scope
90 | return NULL;
| ^~~~
Resolve this with 's%NULL%0', as is used in
'libgcc/gthr-single.h:__gthread_getspecific', for example.
Follow-up to commit 76d4633107
"Create GCN-specific gthreads".
libgcc/
* config/gcn/gthr-gcn.h (__gthread_getspecific): 's%NULL%0'.
Recent Darwin versions place contraints on the use of run paths
specified in environment variables. This breaks some assumptions
in the GCC build.
This change allows the user to configure a Darwin build to use
'@rpath/libraryname.dylib' in library names and then to add an
embedded runpath to executables (and libraries with dependents).
The embedded runpath is added by default unless the user adds
'-nodefaultrpaths' to the link line.
For an installed compiler, it means that any executable built with
that compiler will reference the runtimes installed with the
compiler (equivalent to hard-coding the library path into the name
of the library).
During build-time configurations any "-B" entries will be added to
the runpath thus the newly-built libraries will be found by exes.
Since the install name is set in libtool, that decision needs to be
available here (but might also cause dependent ones in Makefiles,
so we need to export a conditional).
This facility is not available for Darwin 8 or earlier, however the
existing environment variable runpath does work there.
We default this on for systems where the external DYLD_LIBRARY_PATH
does not work and off for Darwin 8 or earlier. For systems that can
use either method, if the value is unset, we use the default (which
is currently DYLD_LIBRARY_PATH).
ChangeLog:
* configure: Regenerate.
* configure.ac: Do not add default runpaths to GCC exes
when we are building -static-libstdc++/-static-libgcc (the
default).
* libtool.m4: Add 'enable-darwin-at-runpath'. Act on the
enable flag to alter Darwin libraries to use @rpath names.
gcc/ChangeLog:
* aclocal.m4: Regenerate.
* configure: Regenerate.
* configure.ac: Handle Darwin rpaths.
* config/darwin.h: Handle Darwin rpaths.
* config/darwin.opt: Handle Darwin rpaths.
* Makefile.in: Handle Darwin rpaths.
gcc/ada/ChangeLog:
* gcc-interface/Makefile.in: Handle Darwin rpaths.
gcc/jit/ChangeLog:
* Make-lang.in: Handle Darwin rpaths.
libatomic/ChangeLog:
* Makefile.am: Handle Darwin rpaths.
* Makefile.in: Regenerate.
* configure: Regenerate.
* configure.ac: Handle Darwin rpaths.
libbacktrace/ChangeLog:
* configure: Regenerate.
* configure.ac: Handle Darwin rpaths.
libcc1/ChangeLog:
* configure: Regenerate.
libffi/ChangeLog:
* Makefile.am: Handle Darwin rpaths.
* Makefile.in: Regenerate.
* configure: Regenerate.
libgcc/ChangeLog:
* config/t-slibgcc-darwin: Generate libgcc_s
with an @rpath name.
* config.host: Handle Darwin rpaths.
libgfortran/ChangeLog:
* Makefile.am: Handle Darwin rpaths.
* Makefile.in: Regenerate.
* configure: Regenerate.
* configure.ac: Handle Darwin rpaths
libgm2/ChangeLog:
* Makefile.am: Handle Darwin rpaths.
* Makefile.in: Regenerate.
* aclocal.m4: Regenerate.
* configure: Regenerate.
* configure.ac: Handle Darwin rpaths.
* libm2cor/Makefile.am: Handle Darwin rpaths.
* libm2cor/Makefile.in: Regenerate.
* libm2iso/Makefile.am: Handle Darwin rpaths.
* libm2iso/Makefile.in: Regenerate.
* libm2log/Makefile.am: Handle Darwin rpaths.
* libm2log/Makefile.in: Regenerate.
* libm2min/Makefile.am: Handle Darwin rpaths.
* libm2min/Makefile.in: Regenerate.
* libm2pim/Makefile.am: Handle Darwin rpaths.
* libm2pim/Makefile.in: Regenerate.
libgomp/ChangeLog:
* Makefile.am: Handle Darwin rpaths.
* Makefile.in: Regenerate.
* configure: Regenerate.
* configure.ac: Handle Darwin rpaths
libitm/ChangeLog:
* Makefile.am: Handle Darwin rpaths.
* Makefile.in: Regenerate.
* configure: Regenerate.
* configure.ac: Handle Darwin rpaths.
libobjc/ChangeLog:
* configure: Regenerate.
* configure.ac: Handle Darwin rpaths.
libphobos/ChangeLog:
* configure: Regenerate.
* configure.ac: Handle Darwin rpaths.
* libdruntime/Makefile.am: Handle Darwin rpaths.
* libdruntime/Makefile.in: Regenerate.
* src/Makefile.am: Handle Darwin rpaths.
* src/Makefile.in: Regenerate.
libquadmath/ChangeLog:
* Makefile.am: Handle Darwin rpaths.
* Makefile.in: Regenerate.
* configure: Regenerate.
* configure.ac: Handle Darwin rpaths.
libsanitizer/ChangeLog:
* asan/Makefile.am: Handle Darwin rpaths.
* asan/Makefile.in: Regenerate.
* configure: Regenerate.
* hwasan/Makefile.am: Handle Darwin rpaths.
* hwasan/Makefile.in: Regenerate.
* lsan/Makefile.am: Handle Darwin rpaths.
* lsan/Makefile.in: Regenerate.
* tsan/Makefile.am: Handle Darwin rpaths.
* tsan/Makefile.in: Regenerate.
* ubsan/Makefile.am: Handle Darwin rpaths.
* ubsan/Makefile.in: Regenerate.
libssp/ChangeLog:
* Makefile.am: Handle Darwin rpaths.
* Makefile.in: Regenerate.
* configure: Regenerate.
* configure.ac: Handle Darwin rpaths.
libstdc++-v3/ChangeLog:
* configure: Regenerate.
* configure.ac: Handle Darwin rpaths.
* src/Makefile.am: Handle Darwin rpaths.
* src/Makefile.in: Regenerate.
libvtv/ChangeLog:
* configure: Regenerate.
* configure.ac: Handle Darwin rpaths.
lto-plugin/ChangeLog:
* configure: Regenerate.
* configure.ac: Handle Darwin rpaths.
zlib/ChangeLog:
* configure: Regenerate.
* configure.ac: Handle Darwin rpaths.
Accept the architecture configure option and resolve build failures. This is
enough to build binaries, but I've not got a device to test it on, so there
are probably runtime issues to fix. The cache control instructions might be
unsafe (or too conservative), and the kernel metadata might be off. Vector
reductions will need to be reworked for RDNA2. In principle, it would be
better to use wavefrontsize32 for this architecture, but that would mean
switching everything to allow SImode masks, so wavefrontsize64 it is.
The multilib is not included in the default configuration so either configure
--with-arch=gfx1030 or include it in --with-multilib-list=gfx1030,....
The majority of this patch has no effect on other devices, but changing from
using scalar writes for the exit value to vector writes means we don't need
the scalar cache write-back instruction anywhere (which doesn't exist in RDNA2).
gcc/ChangeLog:
* config.gcc: Allow --with-arch=gfx1030.
* config/gcn/gcn-hsa.h (NO_XNACK): gfx1030 does not support xnack.
(ASM_SPEC): gfx1030 needs -mattr=+wavefrontsize64 set.
* config/gcn/gcn-opts.h (enum processor_type): Add PROCESSOR_GFX1030.
(TARGET_GFX1030): New.
(TARGET_RDNA2): New.
* config/gcn/gcn-valu.md (@dpp_move<mode>): Disable for RDNA2.
(addc<mode>3<exec_vcc>): Add RDNA2 syntax variant.
(subc<mode>3<exec_vcc>): Likewise.
(<convop><mode><vndi>2_exec): Add RDNA2 alternatives.
(vec_cmp<mode>di): Likewise.
(vec_cmp<u><mode>di): Likewise.
(vec_cmp<mode>di_exec): Likewise.
(vec_cmp<u><mode>di_exec): Likewise.
(vec_cmp<mode>di_dup): Likewise.
(vec_cmp<mode>di_dup_exec): Likewise.
(reduc_<reduc_op>_scal_<mode>): Disable for RDNA2.
(*<reduc_op>_dpp_shr_<mode>): Likewise.
(*plus_carry_dpp_shr_<mode>): Likewise.
(*plus_carry_in_dpp_shr_<mode>): Likewise.
* config/gcn/gcn.cc (gcn_option_override): Recognise gfx1030.
(gcn_global_address_p): RDNA2 only allows smaller offsets.
(gcn_addr_space_legitimate_address_p): Likewise.
(gcn_omp_device_kind_arch_isa): Recognise gfx1030.
(gcn_expand_epilogue): Use VGPRs instead of SGPRs.
(output_file_start): Configure gfx1030.
* config/gcn/gcn.h (TARGET_CPU_CPP_BUILTINS): Add __RDNA2__;
(ASSEMBLER_DIALECT): New.
* config/gcn/gcn.md (rdna): New define_attr.
(enabled): Use "rdna" attribute.
(gcn_return): Remove s_dcache_wb.
(addcsi3_scalar): Add RDNA2 syntax variant.
(addcsi3_scalar_zero): Likewise.
(addptrdi3): Likewise.
(mulsi3): v_mul_lo_i32 should be v_mul_lo_u32 on all ISA.
(*memory_barrier): Add RDNA2 syntax variant.
(atomic_load<mode>): Add RDNA2 cache control variants, and disable
scalar atomics for RDNA2.
(atomic_store<mode>): Likewise.
(atomic_exchange<mode>): Likewise.
* config/gcn/gcn.opt (gpu_type): Add gfx1030.
* config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_GFX1030): New.
(main): Recognise -march=gfx1030.
* config/gcn/t-omp-device: Add gfx1030 isa.
libgcc/ChangeLog:
* config/gcn/amdgcn_veclib.h (CDNA3_PLUS): Set false for __RDNA2__.
libgomp/ChangeLog:
* plugin/plugin-gcn.c (EF_AMDGPU_MACH_AMDGCN_GFX1030): New.
(isa_hsa_name): Recognise gfx1030.
(isa_code): Likewise.
* team.c (defined): Remove s_endpgm.
libgcc/config/avr/libf7/
* libf7.h (F7_SIZEOF): New macro.
* libf7-asm.sx: Use F7_SIZEOF instead of magic number "10".
(F7MOD_D_fma_, __fma): New module and function.
(fma) [-mdouble=64]: Define as alias for __fma.
(fmal) [-mlong-double=64]: Define as alias for __fma.
* libf7-common.mk (F7_ASM_PARTS): Add D_fma.
libgcc/config/avr/libf7/
* libf7.h (F7_FLAGNO_plusx, F7_FLAG_plusx): New macros.
* libf7.c (f7_horner): Handle F7_FLAG_plusx in highest coefficient.
* libf7-const.def [F7MOD_atan_]: Denominator: Set F7_FLAG_plusx
and omit highest term.
[F7MOD_asinacos_]: Use rational function with normalized denominator.
The outline atomic functions have hidden visibility and can only be called
directly. Therefore we can remove the BTI at function entry. This improves
security by reducing the number of indirect entry points in a binary.
The BTI markings on the objects are still emitted.
libgcc/ChangeLog:
* config/aarch64/lse.S (BTI_C): Remove define.
Be const and sign correct by using a matching CIE augmentation type.
Use a builtin instead of relying <string.h> being included.
libgcc/ChangeLog:
* config/aarch64/aarch64-unwind.h (aarch64_cie_signed_with_b_key):
Use const unsigned type and a builtin.
Signed-off-by: Pekka Seppänen <pexu@gcc.mail.kapsi.fi>
This patch adds the library helpers for multiplication, division + modulo
and casts from and to floating point (both binary and decimal).
As described in the intro, the first step is try to reduce further the
passed in precision by skipping over most significant limbs with just zeros
or sign bit copies. For multiplication and division I've implemented
a simple algorithm, using something smarter like Karatsuba or Toom N-Way
might be faster for very large _BitInts (which we don't support right now
anyway), but could mean more code in libgcc, which maybe isn't what people
are willing to accept.
For the to/from floating point conversions the patch uses soft-fp, because
it already has tons of handy macros which can be used for that. In theory
it could be implemented using {,unsigned} long long or {,unsigned} __int128
to/from floating point conversions with some frexp before/after, but at that
point we already need to force it into integer registers and analyze it
anyway. Plus, for 32-bit arches there is no __int128 that could be used
for XF/TF mode stuff.
I know that soft-fp is owned by glibc and I think the op-common.h change
should be propagated there, but the bitint stuff is really GCC specific
and IMHO doesn't belong into the glibc copy.
2023-09-06 Jakub Jelinek <jakub@redhat.com>
PR c/102989
libgcc/
* config/aarch64/t-softfp (softfp_extras): Use += rather than :=.
* config/i386/64/t-softfp (softfp_extras): Likewise.
* config/i386/libgcc-glibc.ver (GCC_14.0.0): Export _BitInt support
routines.
* config/i386/t-softfp (softfp_extras): Add fixxfbitint and
bf, hf and xf mode floatbitint.
(CFLAGS-floatbitintbf.c, CFLAGS-floatbitinthf.c): Add -msse2.
* config/riscv/t-softfp32 (softfp_extras): Use += rather than :=.
* config/rs6000/t-e500v1-fp (softfp_extras): Likewise.
* config/rs6000/t-e500v2-fp (softfp_extras): Likewise.
* config/t-softfp (softfp_floatbitint_funcs): New.
(softfp_bid_list): New.
(softfp_func_list): Add sf and df mode from and to _BitInt libcalls.
(softfp_bid_file_list): New.
(LIB2ADD_ST): Add $(softfp_bid_file_list).
* config/t-softfp-sfdftf (softfp_extras): Add fixtfbitint and
floatbitinttf.
* config/t-softfp-tf (softfp_extras): Likewise.
* libgcc2.c (bitint_reduce_prec): New inline function.
(BITINT_INC, BITINT_END): Define.
(bitint_mul_1, bitint_addmul_1): New helper functions.
(__mulbitint3): New function.
(bitint_negate, bitint_submul_1): New helper functions.
(__divmodbitint4): New function.
* libgcc2.h (LIBGCC2_UNITS_PER_WORD): When building _BitInt support
libcalls, redefine depending on __LIBGCC_BITINT_LIMB_WIDTH__.
(__mulbitint3, __divmodbitint4): Declare.
* libgcc-std.ver.in (GCC_14.0.0): Export _BitInt support routines.
* Makefile.in (lib2funcs): Add _mulbitint3.
(LIB2_DIVMOD_FUNCS): Add _divmodbitint4.
* soft-fp/bitint.h: New file.
* soft-fp/fixdfbitint.c: New file.
* soft-fp/fixsfbitint.c: New file.
* soft-fp/fixtfbitint.c: New file.
* soft-fp/fixxfbitint.c: New file.
* soft-fp/floatbitintbf.c: New file.
* soft-fp/floatbitintdf.c: New file.
* soft-fp/floatbitinthf.c: New file.
* soft-fp/floatbitintsf.c: New file.
* soft-fp/floatbitinttf.c: New file.
* soft-fp/floatbitintxf.c: New file.
* soft-fp/op-common.h (_FP_FROM_INT): Add support for rsize up to
4 * _FP_W_TYPE_SIZE rather than just 2 * _FP_W_TYPE_SIZE.
* soft-fp/bitintpow10.c: New file.
* soft-fp/fixsdbitint.c: New file.
* soft-fp/fixddbitint.c: New file.
* soft-fp/fixtdbitint.c: New file.
* soft-fp/floatbitintsd.c: New file.
* soft-fp/floatbitintdd.c: New file.
* soft-fp/floatbitinttd.c: New file.
The problem -fasynchronous-unwind-tables is on by default for riscv linux
We need turn it off for crt*.o because it would make __EH_FRAME_BEGIN__ point
to .eh_frame data from crtbeginT.o instead of the user-defined object
during static linking.
This turns it off.
OK?
libgcc/ChangeLog:
* config.host (riscv*-*-linux*): Add t-crtstuff to tmake_file.
(riscv*-*-freebsd*): Likewise.
* config/riscv/t-crtstuff: New file.
Enable _Float16 and __bf16 all the time but issue errors when the
types are used in conversion, unary operation, binary operation,
parameter passing or value return when TARGET_SSE2 is not available.
Also undef macros which are used by libgcc/libstdc++ to check the
backend support of the _Float16/__bf16 types when TARGET_SSE2 is not
available.
gcc/ChangeLog:
PR target/109504
* config/i386/i386-builtins.cc
(ix86_register_float16_builtin_type): Remove TARGET_SSE2.
(ix86_register_bf16_builtin_type): Ditto.
* config/i386/i386-c.cc (ix86_target_macros): When TARGET_SSE2
isn't available, undef the macros which are used to check the
backend support of the _Float16/__bf16 types when building
libstdc++ and libgcc.
* config/i386/i386.cc (construct_container): Issue errors for
HFmode/BFmode when TARGET_SSE2 is not available.
(function_value_32): Ditto.
(ix86_scalar_mode_supported_p): Remove TARGET_SSE2 for HFmode/BFmode.
(ix86_libgcc_floating_mode_supported_p): Ditto.
(ix86_emit_support_tinfos): Adjust codes.
(ix86_invalid_conversion): Return diagnostic message string
when there's conversion from/to BF/HFmode w/o TARGET_SSE2.
(ix86_invalid_unary_op): New function.
(ix86_invalid_binary_op): Ditto.
(TARGET_INVALID_UNARY_OP): Define.
(TARGET_INVALID_BINARY_OP): Define.
* config/i386/immintrin.h [__SSE2__]: Remove for fp16/bf16
related instrinsics header files.
* config/i386/i386.h (VALID_SSE2_TYPE_MODE): New macro.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr109504.c: New test.
* gcc.target/i386/sse2-bfloat16-1.c: Adjust error info.
* gcc.target/i386/sse2-float16-1.c: Ditto.
* gcc.target/i386/sse2-float16-4.c: New test.
* gcc.target/i386/sse2-float16-5.c: New test.
* g++.target/i386/float16-1.C: Adjust error info.
libgcc/ChangeLog:
* config/i386/t-softfp: Add -msse2 to libbid HFtype related
files.
Zfinx has provide fcsr like F, so rouding mode should use fcsr instead
of `soft` fenv.
libgcc/ChangeLog:
* config/riscv/sfp-machine.h (FP_INIT_ROUNDMODE): Check zfinx.
(FP_HANDLE_EXCEPTIONS): Ditto.
Also divmod, but only for scalar modes, for now (because there are no complex
int vectors yet).
gcc/ChangeLog:
* config/gcn/gcn.cc (gcn_expand_divmod_libfunc): New function.
(gcn_init_libfuncs): Add div and mod functions for all modes.
Add placeholders for divmod functions.
(TARGET_EXPAND_DIVMOD_LIBFUNC): Define.
libgcc/ChangeLog:
* config/gcn/lib2-divmod-di.c: Reimplement like lib2-divmod.c.
* config/gcn/lib2-divmod.c: Likewise.
* config/gcn/lib2-gcn.h: Add new types and prototypes for all the
new vector libfuncs.
* config/gcn/t-amdgcn: Add new files.
* config/gcn/amdgcn_veclib.h: New file.
* config/gcn/lib2-vec_divmod-di.c: New file.
* config/gcn/lib2-vec_divmod-hi.c: New file.
* config/gcn/lib2-vec_divmod-qi.c: New file.
* config/gcn/lib2-vec_divmod.c: New file.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/predcom-2.c: Avoid vectors on amdgcn.
* gcc.dg/unroll-8.c: Likewise.
* gcc.dg/vect/slp-26.c: Change expected results on amdgdn.
* lib/target-supports.exp
(check_effective_target_vect_int_mod): Add amdgcn.
(check_effective_target_divmod): Likewise.
* gcc.target/gcn/simd-math-3-16.c: New test.
* gcc.target/gcn/simd-math-3-2.c: New test.
* gcc.target/gcn/simd-math-3-32.c: New test.
* gcc.target/gcn/simd-math-3-4.c: New test.
* gcc.target/gcn/simd-math-3-8.c: New test.
* gcc.target/gcn/simd-math-3-char-16.c: New test.
* gcc.target/gcn/simd-math-3-char-2.c: New test.
* gcc.target/gcn/simd-math-3-char-32.c: New test.
* gcc.target/gcn/simd-math-3-char-4.c: New test.
* gcc.target/gcn/simd-math-3-char-8.c: New test.
* gcc.target/gcn/simd-math-3-char-run-16.c: New test.
* gcc.target/gcn/simd-math-3-char-run-2.c: New test.
* gcc.target/gcn/simd-math-3-char-run-32.c: New test.
* gcc.target/gcn/simd-math-3-char-run-4.c: New test.
* gcc.target/gcn/simd-math-3-char-run-8.c: New test.
* gcc.target/gcn/simd-math-3-char-run.c: New test.
* gcc.target/gcn/simd-math-3-char.c: New test.
* gcc.target/gcn/simd-math-3-long-16.c: New test.
* gcc.target/gcn/simd-math-3-long-2.c: New test.
* gcc.target/gcn/simd-math-3-long-32.c: New test.
* gcc.target/gcn/simd-math-3-long-4.c: New test.
* gcc.target/gcn/simd-math-3-long-8.c: New test.
* gcc.target/gcn/simd-math-3-long-run-16.c: New test.
* gcc.target/gcn/simd-math-3-long-run-2.c: New test.
* gcc.target/gcn/simd-math-3-long-run-32.c: New test.
* gcc.target/gcn/simd-math-3-long-run-4.c: New test.
* gcc.target/gcn/simd-math-3-long-run-8.c: New test.
* gcc.target/gcn/simd-math-3-long-run.c: New test.
* gcc.target/gcn/simd-math-3-long.c: New test.
* gcc.target/gcn/simd-math-3-run-16.c: New test.
* gcc.target/gcn/simd-math-3-run-2.c: New test.
* gcc.target/gcn/simd-math-3-run-32.c: New test.
* gcc.target/gcn/simd-math-3-run-4.c: New test.
* gcc.target/gcn/simd-math-3-run-8.c: New test.
* gcc.target/gcn/simd-math-3-run.c: New test.
* gcc.target/gcn/simd-math-3-short-16.c: New test.
* gcc.target/gcn/simd-math-3-short-2.c: New test.
* gcc.target/gcn/simd-math-3-short-32.c: New test.
* gcc.target/gcn/simd-math-3-short-4.c: New test.
* gcc.target/gcn/simd-math-3-short-8.c: New test.
* gcc.target/gcn/simd-math-3-short-run-16.c: New test.
* gcc.target/gcn/simd-math-3-short-run-2.c: New test.
* gcc.target/gcn/simd-math-3-short-run-32.c: New test.
* gcc.target/gcn/simd-math-3-short-run-4.c: New test.
* gcc.target/gcn/simd-math-3-short-run-8.c: New test.
* gcc.target/gcn/simd-math-3-short-run.c: New test.
* gcc.target/gcn/simd-math-3-short.c: New test.
* gcc.target/gcn/simd-math-3.c: New test.
* gcc.target/gcn/simd-math-4-char-run.c: New test.
* gcc.target/gcn/simd-math-4-char.c: New test.
* gcc.target/gcn/simd-math-4-long-run.c: New test.
* gcc.target/gcn/simd-math-4-long.c: New test.
* gcc.target/gcn/simd-math-4-run.c: New test.
* gcc.target/gcn/simd-math-4-short-run.c: New test.
* gcc.target/gcn/simd-math-4-short.c: New test.
* gcc.target/gcn/simd-math-4.c: New test.
* gcc.target/gcn/simd-math-5-16.c: New test.
* gcc.target/gcn/simd-math-5-32.c: New test.
* gcc.target/gcn/simd-math-5-4.c: New test.
* gcc.target/gcn/simd-math-5-8.c: New test.
* gcc.target/gcn/simd-math-5-char-16.c: New test.
* gcc.target/gcn/simd-math-5-char-32.c: New test.
* gcc.target/gcn/simd-math-5-char-4.c: New test.
* gcc.target/gcn/simd-math-5-char-8.c: New test.
* gcc.target/gcn/simd-math-5-char-run-16.c: New test.
* gcc.target/gcn/simd-math-5-char-run-32.c: New test.
* gcc.target/gcn/simd-math-5-char-run-4.c: New test.
* gcc.target/gcn/simd-math-5-char-run-8.c: New test.
* gcc.target/gcn/simd-math-5-char-run.c: New test.
* gcc.target/gcn/simd-math-5-char.c: New test.
* gcc.target/gcn/simd-math-5-long-16.c: New test.
* gcc.target/gcn/simd-math-5-long-32.c: New test.
* gcc.target/gcn/simd-math-5-long-4.c: New test.
* gcc.target/gcn/simd-math-5-long-8.c: New test.
* gcc.target/gcn/simd-math-5-long-run-16.c: New test.
* gcc.target/gcn/simd-math-5-long-run-32.c: New test.
* gcc.target/gcn/simd-math-5-long-run-4.c: New test.
* gcc.target/gcn/simd-math-5-long-run-8.c: New test.
* gcc.target/gcn/simd-math-5-long-run.c: New test.
* gcc.target/gcn/simd-math-5-long.c: New test.
* gcc.target/gcn/simd-math-5-run-16.c: New test.
* gcc.target/gcn/simd-math-5-run-32.c: New test.
* gcc.target/gcn/simd-math-5-run-4.c: New test.
* gcc.target/gcn/simd-math-5-run-8.c: New test.
* gcc.target/gcn/simd-math-5-run.c: New test.
* gcc.target/gcn/simd-math-5-short-16.c: New test.
* gcc.target/gcn/simd-math-5-short-32.c: New test.
* gcc.target/gcn/simd-math-5-short-4.c: New test.
* gcc.target/gcn/simd-math-5-short-8.c: New test.
* gcc.target/gcn/simd-math-5-short-run-16.c: New test.
* gcc.target/gcn/simd-math-5-short-run-32.c: New test.
* gcc.target/gcn/simd-math-5-short-run-4.c: New test.
* gcc.target/gcn/simd-math-5-short-run-8.c: New test.
* gcc.target/gcn/simd-math-5-short-run.c: New test.
* gcc.target/gcn/simd-math-5-short.c: New test.
* gcc.target/gcn/simd-math-5.c: New test.
The HImode libfuncs weren't called and trying to enable them fails because
TARGET_PROMOTE_FUNCTION_MODE wants to widen the arguments but the signedness
isn't known.
libgcc/ChangeLog:
* config/gcn/lib2-gcn.h (QItype, UQItype, HItype, UHItype): Delete.
(__divhi3, __modhi3, __udivhi3, __umodhi3): Delete.
* config/gcn/t-amdgcn: Don't build lib2-divmod-hi.c.
* config/gcn/lib2-divmod-hi.c: Removed.
One of my workmates found there is a warning like:
libgcc/config/rs6000/morestack.S:402: Warning: ignoring
incorrect section type for .init_array.00000
when compiling libgcc/config/rs6000/morestack.S.
Since commit r13-6545 touched that file recently, which was
suspected to be responsible for this warning, I did some
investigation and found this is a warning staying for a long
time. For section .init_stack*, it's preferred to use
section type SHT_INIT_ARRAY. So this patch is use
"@init_array" to replace "@progbits".
Although the warning is trivial, Segher suggested me to
post this to fix it, in order to avoid any possible
misunderstanding/confusion on the warning.
As Alan confirmed, this doesn't require a premise check
on if the existing binutils supports "@init_array" or not,
"because if you want split-stack to work, you must link
with gold, any version of binutils that has gold has an
assembler that understands @init_array". (Thanks Alan!)
libgcc/ChangeLog:
* config/i386/morestack.S: Use @init_array rather than
@progbits for section type of section .init_array.
* config/rs6000/morestack.S: Likewise.
* config/s390/morestack.S: Likewise.
speculation_barrier for MIPS needs sync+jr.hb (r2+),
so we implement __speculation_barrier in libgcc, like arm32 does.
gcc/ChangeLog:
* config/mips/mips-protos.h (mips_emit_speculation_barrier): New
prototype.
* config/mips/mips.cc (speculation_barrier_libfunc): New static
variable.
(mips_init_libfuncs): Initialize it.
(mips_emit_speculation_barrier): New function.
* config/mips/mips.md (speculation_barrier): Call
mips_emit_speculation_barrier.
libgcc/ChangeLog:
* config/mips/lib1funcs.S: New file.
define __speculation_barrier and include mips16.S.
* config/mips/t-mips: define LIB1ASMSRC as mips/lib1funcs.S.
define LIB1ASMFUNCS as _speculation_barrier.
set version info for __speculation_barrier.
* config/mips/libgcc-mips.ver: New file.
* config/mips/t-mips16: don't define LIB1ASMSRC as mips16.S
included in lib1funcs.S now.
Tools from later versions of the OS deprecate or fail to support
earlier OS revisions.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
libgcc/ChangeLog:
* config.host: Arrange to set min Darwin OS versions from
the configured host version.
* config/darwin10-unwind-find-enc-func.c: Do not use current
headers, but declare the nexessary structures locally to the
versions in use for Mac OSX 10.6.
* config/t-darwin: Amend to handle configured min OS
versions.
* config/t-darwin-min-1: New.
* config/t-darwin-min-5: New.
* config/t-darwin-min-8: New.
Replace LR.aq/SC.rl pairs with the SEQ_CST LR.aqrl/SC.rl pairs
recommended by table A.6 of the ISA manual.
2023-04-27 Patrick O'Neill <patrick@rivosinc.com>
libgcc/ChangeLog:
* config/riscv/atomic.c: Change LR.aq/SC.rl pairs into
sequentially consistent LR.aqrl/SC.rl pairs.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
This patch aligns the configuration to the actual PRU capabilities. It
also reduces the size of the affected libgcc functions.
For a real-world project using integer arithmetics the savings
are significant:
Before:
text data bss dec hex filename
3688 865 544 5097 13e9 hc-sr04-range-sensor.elf
With TARGET_HAS_NO_HW_DIVIDE defined:
text data bss dec hex filename
2824 865 544 4233 1089 hc-sr04-range-sensor.elf
Execution speed also appears to have improved. The moddi3 function is
now executed in half the CPU cycles.
libgcc/ChangeLog:
* config/pru/t-pru (HOST_LIBGCC2_CFLAGS): Add
-DTARGET_HAS_NO_HW_DIVIDE.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
With this, execution time for e.g. __moddi3 go from 59 to 40 cycles in
the "fast" case or from 290 to 200 cycles in the "slow" case (when the
!TARGET_HAS_NO_HW_DIVIDE variant calls division and modulus functions
for 32-bit SImode), as exposed by gcc.c-torture/execute/arith-rand-ll.c
compiled for -march=v10.
Unfortunately, it just puts a performance improvement "dent" of 0.07%
in a arith-rand-ll.c-based performance test - where all loops are also
reduced to 1/10.
The size of every affected libgcc function is reduced to less than
half and they are all now leaf functions.
* config/cris/t-cris (HOST_LIBGCC2_CFLAGS): Add
-DTARGET_HAS_NO_HW_DIVIDE.
RISC-V has no support for subword atomic operations; code currently
generates libatomic library calls.
This patch changes the default behavior to inline subword atomic calls
(using the same logic as the existing library call).
Behavior can be specified using the -minline-atomics and
-mno-inline-atomics command line flags.
gcc/libgcc/config/riscv/atomic.c has the same logic implemented in asm.
This will need to stay for backwards compatibility and the
-mno-inline-atomics flag.
2023-04-18 Patrick O'Neill <patrick@rivosinc.com>
gcc/ChangeLog:
PR target/104338
* config/riscv/riscv-protos.h: Add helper function stubs.
* config/riscv/riscv.cc: Add helper functions for subword masking.
* config/riscv/riscv.opt: Add command-line flag.
* config/riscv/sync.md: Add masking logic and inline asm for fetch_and_op,
fetch_and_nand, CAS, and exchange ops.
* doc/invoke.texi: Add blurb regarding command-line flag.
libgcc/ChangeLog:
PR target/104338
* config/riscv/atomic.c: Add reference to duplicate logic.
gcc/testsuite/ChangeLog:
PR target/104338
* gcc.target/riscv/inline-atomics-1.c: New test.
* gcc.target/riscv/inline-atomics-2.c: New test.
* gcc.target/riscv/inline-atomics-3.c: New test.
* gcc.target/riscv/inline-atomics-4.c: New test.
* gcc.target/riscv/inline-atomics-5.c: New test.
* gcc.target/riscv/inline-atomics-6.c: New test.
* gcc.target/riscv/inline-atomics-7.c: New test.
* gcc.target/riscv/inline-atomics-8.c: New test.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
muldi3 will deallocate stack space after the call to __save_r26_r31,
then re-allocate the space a short while later. If an interrupt
occurs in that window, it can clobber items on the stack.
PR target/109402
libgcc/
* config/v850/lib1funcs.S (___muldi3): Remove unnecessary
stack manipulations.
The millicode division and remainder routines trap division by zero.
The unwinder needs these directives to unwind divide by zero traps.
2023-04-05 John David Anglin <danglin@gcc.gnu.org>
libgcc/ChangeLog:
PR target/109374
* config/pa/milli64.S (RETURN_COLUMN): Define.
($$divI): Add CFI directives.
($$divU): Likewise.
($$remI): Likewise.
($$remU): Likewise.
We should always carry the exceptions forward. This bug was found when
working on testing glibc math tests, many tests were failing with
Overflow and Underflow flags not set. This was traced to here.
libgcc/ChangeLog:
* config/or1k/sfp-machine.h (FP_HANDLE_EXCEPTIONS): Remove
statement clearing existing exceptions.
x86_64/i686 has for a few months working std::bfloat16_t support, __bf16
there is no longer a storage only type, but can be used for arithmetics
and is supported in libgcc and libstdc++.
The following patch adds similar support for AArch64.
Unlike the x86 changes, this one keeps the old __bf16 mangling of
u6__bf16 rather than DF16b (so an exception from Itanium ABI), but
otherwise __bf16 and decltype (0.0bf16) are the same type and both
in C++ act as extended floating-point type.
2023-03-13 Jakub Jelinek <jakub@redhat.com>
gcc/
* config/aarch64/aarch64.h (aarch64_bf16_type_node): Remove.
(aarch64_bf16_ptr_type_node): Adjust comment.
* config/aarch64/aarch64.cc (aarch64_gimplify_va_arg_expr): Use
bfloat16_type_node rather than aarch64_bf16_type_node.
(aarch64_libgcc_floating_mode_supported_p,
aarch64_scalar_mode_supported_p): Also support BFmode.
(aarch64_invalid_conversion, aarch64_invalid_unary_op): Remove.
(aarch64_invalid_binary_op): Remove BFmode related rejections.
(TARGET_INVALID_CONVERSION, TARGET_INVALID_UNARY_OP): Don't redefine.
* config/aarch64/aarch64-builtins.cc (aarch64_bf16_type_node): Remove.
(aarch64_int_or_fp_type): Use bfloat16_type_node rather than
aarch64_bf16_type_node.
(aarch64_init_simd_builtin_types): Likewise.
(aarch64_init_bf16_types): Likewise. Don't create bfloat16_type_node,
which is created in tree.cc already.
* config/aarch64/aarch64-sve-builtins.def (svbfloat16_t): Likewise.
gcc/testsuite/
* gcc.target/aarch64/sve/acle/general-c/ternary_bfloat16_opt_n_1.c:
Don't expect one __bf16 related error.
* gcc.target/aarch64/bfloat16_vector_typecheck_1.c: Adjust or remove
dg-error directives for __bf16 being an extended arithmetic type.
* gcc.target/aarch64/bfloat16_vector_typecheck_2.c: Likewise.
* gcc.target/aarch64/bfloat16_scalar_typecheck.c: Likewise.
* g++.target/aarch64/bfloat_cpp_typecheck.C: Don't expect two __bf16
related errors.
libgcc/
* config/aarch64/t-softfp (softfp_extensions): Add bfsf.
(softfp_truncations): Add tfbf dfbf sfbf hfbf.
(softfp_extras): Add floatdibf floatundibf floattibf floatuntibf.
* config/aarch64/libgcc-softfp.ver (GCC_13.0.0): Export
__extendbfsf2 and __trunc{s,d,t,h}fbf2.
* config/aarch64/sfp-machine.h (_FP_NANFRAC_B, _FP_NANSIGN_B): Define.
* soft-fp/floatundibf.c: New file.
* soft-fp/floatdibf.c: New file.
libstdc++-v3/
* config/abi/pre/gnu.ver (CXXABI_1.3.14): Also export __bf16 tinfos
if it isn't mangled as DF16b but u6__bf16.
While DI <-> BF conversions can be handled (and are) through
DI <-> XF <-> BF and for narrower integral modes even sometimes
through DF or SF, because XFmode has 64-bit mantissa and so all
the DImode values are exactly representable in XFmode.
That is not the case for TImode, and while e.g. the HF -> TI
conversions are IMHO useless in libgcc, because HFmode has
-65504.0f16, 65504.0f16 range, all the integers will be already
representable in SImode (or even HImode for unsigned) and so
I think HF -> DI -> TI conversions are faster and valid,
BFmode has roughly the same range as SFmode and so we absolutely need
the TI -> BF conversions to avoid double rounding.
As for BF -> TI conversions, they can be either also implemented
in libgcc, or they can be implemented (as done in this commit)
as BF -> SF -> TI conversions with the same code generation used
elsewhere, just doing the 16-bit left shift of the bits - I think
we don't need to handle sNaNs during the BF -> SF part because
SF -> TI (which is already a libcall too) will handle that too.
The BF -> SF -> TI path avoids wasting
32: 0000000000015e10 321 FUNC GLOBAL DEFAULT 13 __fixbfti@@GCC_13.0.0
89: 0000000000015f60 299 FUNC GLOBAL DEFAULT 13 __fixunsbfti@@GCC_13.0.0
2023-03-10 Jakub Jelinek <jakub@redhat.com>
PR target/107703
* optabs.cc (expand_fix): For conversions from BFmode to integral,
use shifts to convert it to SFmode first and then convert SFmode
to integral.
* soft-fp/floattibf.c: New file.
* soft-fp/floatuntibf.c: New file.
* config/i386/libgcc-glibc.ver: Export __float{,un}tibf @ GCC_13.0.0.
* config/i386/64/t-softfp (softfp_extras): Add floattibf and
floatuntibf.
(CFLAGS-floattibf.c, CFLAGS-floatunstibf.c): Add -msse2.
As PR108727 shows, when cleanup code called by the stack
unwinder calls function _Unwind_Resume, it goes via plt
stub like:
function 00000000.plt_call._Unwind_Resume:
=> 0x0000000010003580 <+0>: std r2,40(r1)
0x0000000010003584 <+4>: ld r12,-31760(r2)
0x0000000010003588 <+8>: mtctr r12
0x000000001000358c <+12>: ld r2,-31752(r2)
0x0000000010003590 <+16>: cmpldi r2,0
0x0000000010003594 <+20>: bnectr+
0x0000000010003598 <+24>: b 0x100031a4
<_Unwind_Resume@plt>
It wants to save TOC base (r2) to r1 + 40, but we only
bump the stack segment by 32 bytes as follows:
stdu %r29,-32(%r3)
It means the access is out of the stack segment allocated
by __generic_morestack, once the touch area isn't writable
like this failure shows, it would cause segment fault.
So fix the bump size with one reasonable value PARAMS.
PR libgcc/108727
libgcc/ChangeLog:
* config/rs6000/morestack.S (__morestack): Use PARAMS for new stack
bump size.
This patch updates the IEEE 128-bit types used in libgcc.
At the moment, we cannot build GCC when the target uses IEEE 128-bit long
doubles, such as building the compiler for a native Fedora 36 system. The
build dies when it is trying to build the _mulkc3.c and _divkc3 modules.
This patch changes libgcc to use long double for the IEEE 128-bit base type if
long double is IEEE 128-bit, and it uses _Float128 otherwise. The built-in
functions are adjusted to be the correct version based on the IEEE 128-bit base
type used.
While it is desirable to ultimately have __float128 and _Float128 use the same
internal type and mode within GCC, at present if you use the option
-mabi=ieeelongdouble, the __float128 type will use the long double type and not
the _Float128 type. We get an internal compiler error if we combine the
signbitf128 built-in with a long double type.
I've gone through several iterations of trying to fix this within GCC, and
there are various problems that have come up. I developed this alternative
patch that changes libgcc so that it does not tickle the issue. I hope we can
fix the compiler at some point, but right now, this is preventing people on
Fedora 36 systems from building compilers where the default long double is IEEE
128-bit.
2023-03-06 Michael Meissner <meissner@linux.ibm.com>
libgcc/
PR target/107299
* config/rs6000/_divkc3.c (COPYSIGN): Use the correct built-in based on
whether long double is IBM or IEEE.
(INFINITY): Likewise.
(FABS): Likewise.
* config/rs6000/_mulkc3.c (COPYSIGN): Likewise.
(INFINITY): Likewise.
* config/rs6000/quad-float128.h (TF): Remove definition.
(TFtype): Define to be long double or _Float128.
(TCtype): Define to be _Complex long double or _Complex _Float128.
* libgcc2.h (TFtype): Allow machine config files to override this.
(TCtype): Likewise.
* soft-fp/quad.h (TFtype): Likewise.
I have noticed some warnings when building GCC for arm-eabi:
pr-support.c:110:7: warning: variable ‘set_pac_sp’ set but not used [-Wunused-but-set-variable]
pr-support.c:109:7: warning: variable ‘set_pac’ set but not used [-Wunused-but-set-variable]
This small patch avoids them by defining these two variables undef
TARGET_HAVE_PACBTI, like the code which actually uses them.
libgcc/
* config/arm/pr-support.c (__gnu_unwind_execute): Use
TARGET_HAVE_PACBTI to define set_pac and set_pac_sp.
Tested by building a toolchain and compiling gnumach for x86_64 [1].
This is the basic version without unwind support which I think is only
required to implement exceptions.
[1]
https://github.com/flavioc/cross-hurd/blob/master/bootstrap-kernel.sh.
gcc/ChangeLog:
* config.gcc: Recognize x86_64-*-gnu* targets and include
i386/gnu64.h.
* config/i386/gnu64.h: Define configuration for new target
including ld.so location.
libgcc/ChangeLog:
* config.host: Recognize x86_64-*-gnu* targets.
* config/i386/gnu-unwind.h: Update to handle __x86_64__ with a
TODO for now.
Signed-off-by: Flavio Cruz <flaviocruz@gmail.com>
This patch adds support for Arm frame unwinding instruction "0xb5" [1]. When
an exception is taken and "0xb5" instruction is encounter during runtime
stack-unwinding, we use effective vsp as modifier in pointer authentication.
On completion of stack unwinding if "0xb5" instruction is not encountered
then CFA will be used as modifier in pointer authentication.
[1] https://github.com/ARM-software/abi-aa/releases/download/2022Q3/ehabi32.pdf
libgcc/ChangeLog:
2022-11-09 Srinath Parvathaneni <srinath.parvathaneni@arm.com>
* config/arm/pr-support.c (__gnu_unwind_execute): Decode opcode
"0xb5".
This patch adds authentication for when the stack is unwound when an
exception is taken. All the changes here are done to the runtime code
in libgcc's unwinder code for Arm target. All the changes are guarded
under defined (__ARM_FEATURE_PAC_DEFAULT) and activated only if the
+pacbti feature is switched on for the architecture. This means that
switching on the target feature via -march or -mcpu is sufficient and
-mbranch-protection need not be enabled. This ensures that the
unwinder is authenticated only if the PACBTI instructions are
available in the non-NOP space as it uses AUTG. Just generating
PAC/AUT instructions using -mbranch-protection will not enable
authentication on the unwinder.
Pre-approved with the requested changes here
<https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586555.html>.
gcc/ChangeLog:
* ginclude/unwind-arm-common.h (_Unwind_VRS_RegClass): Introduce
new pseudo register class _UVRSC_PAC.
libgcc/ChangeLog:
* config/arm/pr-support.c (__gnu_unwind_execute): Decode
exception opcode (0xb4) for saving RA_AUTH_CODE and authenticate
with AUTG if found.
* config/arm/unwind-arm.c (struct pseudo_regs): New.
(phase1_vrs): Introduce new field to store pseudo-reg state.
(phase2_vrs): Likewise.
(_Unwind_VRS_Get): Load pseudo register state from virtual reg set.
(_Unwind_VRS_Set): Store pseudo register state to virtual reg set.
(_Unwind_VRS_Pop): Load pseudo register value from stack into VRS.
Co-Authored-By: Tejas Belagod <tbelagod@arm.com>
Co-Authored-By: Srinath Parvathaneni <srinath.parvathaneni@arm.com>
A recent change only initializes the regs.how[] during Dwarf unwinding
which resulted in an uninitialized offset used in return address signing
and random failures during unwinding. The fix is to encode the return
address signing state in REG_UNSAVED and a new state REG_UNSAVED_ARCHEXT.
libgcc/
PR target/107678
* unwind-dw2.h (REG_UNSAVED_ARCHEXT): Add new enum.
* unwind-dw2.c (uw_update_context_1): Add REG_UNSAVED_ARCHEXT case.
* unwind-dw2-execute_cfa.h: Use REG_UNSAVED_ARCHEXT/REG_UNSAVED to
encode the return address signing state.
* config/aarch64/aarch64-unwind.h (aarch64_demangle_return_addr)
Check current return address signing state.
(aarch64_frob_update_contex): Remove.