The libgcc implementation of __clzhi2 can be tweaked by
one cycle in some situations by re-arranging the instructions.
It also reduces the WCET by 1 cycle.
libgcc/
PR target/115065
* config/avr/lib1funcs.S (__clzhi2): Tweak.
(cherry picked from commit 988838da722dea09bd81ee9d49800a6f24980372)
This supports __powidf2 by means of a double wrapper for already
existing f7_powi (renamed to __f7_powi by f7-renames.h).
It tweaks the implementation so that it does not perform trivial
multiplications with 1.0 any more, but instead uses a move.
It also fixes the last statement of f7_powi, which was wrong.
Notice that f7_powi was unused until now.
PR target/114981
libgcc/config/avr/libf7/
* libf7-common.mk (F7_ASM_PARTS): Add D_powi
* libf7-asm.sx (F7MOD_D_powi_, __powidf2): New module and function.
* libf7.c (f7_powi): Fix last (wrong) statement.
Tweak trivial multiplications with 1.0.
gcc/testsuite/
* gcc.target/avr/pr114981-powil.c: New test.
(cherry picked from commit de4eea7d7ea86e54843507c68d6672eca9d8c7bb)
cppcheck apparently warns on the | !!sticky part of the expression and
using | (!!sticky) quiets it up (it is correct as is).
The following patch adds the ()s, and also adds them around mant >> 1 just
in case it makes it clearer to all readers that the expression is parsed
that way already.
2024-04-15 Jakub Jelinek <jakub@redhat.com>
PR libgcc/114689
* config/m68k/fpgnulib.c (__truncdfsf2): Add parentheses around
!!sticky bitwise or operand to quiet up cppcheck. Add parentheses
around mant >> 1 bitwise or operand.
This patch adds support for C23's _BitInt for the AArch64 port when compiling
for little endianness. Big Endianness requires further target-agnostic
support and we therefor disable it for now.
gcc/ChangeLog:
* config/aarch64/aarch64.cc (TARGET_C_BITINT_TYPE_INFO): Declare MACRO.
(aarch64_bitint_type_info): New function.
(aarch64_return_in_memory_1): Return large _BitInt's in memory.
(aarch64_function_arg_alignment): Adapt to correctly return the ABI
mandated alignment of _BitInt(N) where N > 128 as the alignment of
TImode.
(aarch64_composite_type_p): Return true for _BitInt(N), where N > 128.
libgcc/ChangeLog:
* config/aarch64/t-softfp (softfp_extras): Add floatbitinthf,
floatbitintbf, floatbitinttf and fixtfbitint.
* config/aarch64/libgcc-softfp.ver (GCC_14.0.0): Add __floatbitinthf,
__floatbitintbf, __floatbitinttf and __fixtfbitint.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/bitint-alignments.c: New test.
* gcc.target/aarch64/bitint-args.c: New test.
* gcc.target/aarch64/bitint-sizes.c: New test.
* gcc.target/aarch64/bitfield-bitint-abi.h: New header.
* gcc.target/aarch64/bitfield-bitint-abi-align16.c: New test.
* gcc.target/aarch64/bitfield-bitint-abi-align8.c: New test.
There is currently no unwinding implementation.
libgcc/ChangeLog:
* config.host: Recognize aarch64*-*-gnu* hosts.
* config/aarch64/gnu-unwind.h: New file.
* config/aarch64/heap-trampoline.c
(allocate_trampoline_page): Support GNU/Hurd.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
A few HWCAP entries are missing from aarch64/cpuinfo.c. This results in build
errors on older machines.
libgcc/
* config/aarch64/cpuinfo.c: Add HWCAP_EVTSTRM, HWCAP_CRC32, HWCAP_CPUID,
HWCAP_PACA and HWCAP_PACG.
Tested with some simple toy examples where an exception is thrown in the
signal handler.
libgcc/ChangeLog:
* config/i386/gnu-unwind.h: Support unwinding x86_64 signal frames.
Signed-off-by: Flavio Cruz <flaviocruz@gmail.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
This arranges that the byte order of the instruction sequences is
independent of the byte order of memory.
libgcc/ChangeLog:
* config/aarch64/heap-trampoline.c
(aarch64_trampoline_insns): Arrange to encode instructions as a
byte array so that the order is independent of memory byte order.
(struct aarch64_trampoline): Likewise.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
This allows the same trampoline pattern to be used on all linux variants
rather than restricting it to linux gnu.
PR target/113971
libgcc/ChangeLog:
* config/aarch64/heap-trampoline.c: Allow all linux variants.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
Fix a typo in __gthr_win32_abs_to_rel_time that caused it to return a
relative time in seconds instead of milliseconds. As a consequence,
__gthr_win32_cond_timedwait called SleepConditionVariableCS with a
1000x shorter timeout; this caused ~1000x more spurious wakeups in
CV timed waits such as std::condition_variable::wait_for or wait_until,
resulting generally in much higher CPU usage.
This can be demonstrated by this sample program:
```
int main() {
std::condition_variable cv;
std::mutex mx;
bool pass = false;
auto thread_fn = [&](bool timed) {
int wakeups = 0;
using sc = std::chrono::system_clock;
auto before = sc::now();
std::unique_lock<std::mutex> ml(mx);
if (timed) {
cv.wait_for(ml, std::chrono::seconds(2), [&]{
++wakeups;
return pass;
});
} else {
cv.wait(ml, [&]{
++wakeups;
return pass;
});
}
printf("pass: %d; wakeups: %d; elapsed: %d ms\n", pass, wakeups,
int((sc::now() - before) / std::chrono::milliseconds(1)));
pass = false;
};
{
// timed wait, let expire
std::thread t(thread_fn, true);
t.join();
}
{
// timed wait, wake up explicitly after 1 second
std::thread t(thread_fn, true);
std::this_thread::sleep_for(std::chrono::seconds(1));
{
std::unique_lock<std::mutex> ml(mx);
pass = true;
}
cv.notify_all();
t.join();
}
{
// non-timed wait, wake up explicitly after 1 second
std::thread t(thread_fn, false);
std::this_thread::sleep_for(std::chrono::seconds(1));
{
std::unique_lock<std::mutex> ml(mx);
pass = true;
}
cv.notify_all();
t.join();
}
return 0;
}
```
On builds based on non-affected threading models (e.g. POSIX on Linux,
or winpthreads or MCF on Win32) the output is something like
```
pass: 0; wakeups: 2; elapsed: 2000 ms
pass: 1; wakeups: 2; elapsed: 991 ms
pass: 1; wakeups: 2; elapsed: 996 ms
```
while with the Win32 threading model we get
```
pass: 0; wakeups: 1418; elapsed: 2000 ms
pass: 1; wakeups: 479; elapsed: 988 ms
pass: 1; wakeups: 2; elapsed: 992 ms
```
(notice the huge number of wakeups in the timed wait cases only).
This commit fixes the conversion, adjusting the final division by
NSEC100_PER_SEC to use NSEC100_PER_MSEC instead (already defined in the
file and not used in any other place, so probably just a typo).
libgcc/ChangeLog:
PR libgcc/113850
* config/i386/gthr-win32-cond.c (__gthr_win32_abs_to_rel_time):
fix absolute timespec to relative milliseconds count
conversion (it incorrectly returned seconds instead of
milliseconds); this avoids spurious wakeups in
__gthr_win32_cond_timedwait
Add x32 and IBT support to x86 heap trampoline implementation with a
testcase.
2024-02-13 Jakub Jelinek <jakub@redhat.com>
H.J. Lu <hjl.tools@gmail.com>
libgcc/
PR target/113855
* config/i386/heap-trampoline.c (trampoline_insns): Add IBT
support and pad to the multiple of 4 bytes. Use movabsq
instead of movabs in comments. Add -mx32 variant.
gcc/testsuite/
PR target/113855
* gcc.dg/heap-trampoline-1.c: New test.
* lib/target-supports.exp (check_effective_target_heap_trampoline):
New.
The initial heap trampoline implementation was targeting 64b
platforms. As the PR demonstrates this creates an issue where it
is expected that the same symbols are exported for 32 and 64b.
Rather than conditionalize the exports and code-gen on x86_64,
this patch provides a basic implementation of the IA32 trampoline.
This also avoids potential user confusion, when a 32b target has
64b multilibs, and vice versa; which is the case for Darwin.
PR target/113855
gcc/ChangeLog:
* config/i386/darwin.h (DARWIN_HEAP_T_LIB): Moved to be
available to all sub-targets.
* config/i386/darwin32-biarch.h (DARWIN_HEAP_T_LIB): Delete.
* config/i386/darwin64-biarch.h (DARWIN_HEAP_T_LIB): Delete.
libgcc/ChangeLog:
* config.host: Add trampoline support to x?86-linux.
* config/i386/heap-trampoline.c (trampoline_insns): Provide
a variant for IA32.
(union ix86_trampoline): Likewise.
(__gcc_nested_func_ptr_created): Implement a basic trampoline
for IA32.
Some exports were missed from the GCC-13 cycle, these are added here
along with the bitint-related ones added in GCC-14.
libgcc/ChangeLog:
* config/i386/libgcc-darwin.ver: Export bf and bitint-related
synbols.
As reported in the PR, all libgcc x86 symbol versions added after
GCC_7.0.0 were only added to i386/libgcc-glibc.ver, missing all of
libgcc-sol2.ver, libgcc-bsd.ver, and libgcc-darwin.ver.
This patch fixes this for Solaris/x86, adding all of them
(GCC_1[234].0.0) as GCC_14.0.0 to not retroactively change history.
Since this isn't the first time this happens, I've added a note to the
end of libgcc-glibc.ver to request notifying other maintainers in case
of additions.
Tested on i386-pc-solaris2.11.
2024-02-01 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
libgcc:
PR target/113700
* config/i386/libgcc-sol2.ver (GCC_14.0.0): Added all symbols from
i386/libgcc-glibc.ver (GCC_12.0.0, GCC_13.0.0, GCC_14.0.0).
* config/i386/libgcc-glibc.ver: Request notifications on updates.
Rainer pointed out that __PFX__ and __FIXPTPFX__ prefix replacement is done
solely for libgcc-std.ver.in and not for the *.ver files in config.
I've used the __PFX__ prefix even in config/i386/libgcc-glibc.ver because it
was used for similar symbols in libgcc-std.ver.in, and that results in those
symbols being STB_LOCAL in libgcc_s.so.1. Tests still work because gcc by
default uses -static-libgcc when linking (unlike g++ etc.), but would
have failed when using -shared-libgcc (but I see nothing in the testsuite
actually testing with -shared-libgcc, so am not adding tests).
With the patch, libgcc_s.so.1 now exports
__fixtfbitint@@GCC_14.0.0 FUNC GLOBAL DEFAULT
__fixxfbitint@@GCC_14.0.0 FUNC GLOBAL DEFAULT
__floatbitintbf@@GCC_14.0.0 FUNC GLOBAL DEFAULT
__floatbitinthf@@GCC_14.0.0 FUNC GLOBAL DEFAULT
__floatbitinttf@@GCC_14.0.0 FUNC GLOBAL DEFAULT
__floatbitintxf@@GCC_14.0.0 FUNC GLOBAL DEFAULT
on x86_64-linux which it wasn't before.
2024-02-02 Jakub Jelinek <jakub@redhat.com>
PR target/113700
* config/i386/libgcc-glibc.ver (GCC_14.0.0): Remove __PFX prefixes
from symbol names.
I'm seeing hundreds of
In file included from ../../../libgcc/libgcc2.c:56:
../../../libgcc/libgcc2.h:32:13: warning: conflicting types for built-in function ‘__gcc_nested_func_ptr_created’; expected ‘void(void *, void *, void *)’
+[-Wbuiltin-declaration-mismatch]
32 | extern void __gcc_nested_func_ptr_created (void *, void *, void **);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
warnings.
Either we need to add like in r14-6218
#pragma GCC diagnostic ignored "-Wbuiltin-declaration-mismatch"
(but in that case because of the libgcc2.h prototype (why is it there?)
it would need to be also with #pragma GCC diagnostic push/pop around),
or we could go with just following how the builtins are prototyped on the
compiler side and only cast to void ** when dereferencing (which is in
a single spot in each TU).
2024-02-01 Jakub Jelinek <jakub@redhat.com>
PR libgcc/113402
* libgcc2.h (__gcc_nested_func_ptr_created): Change type of last
argument from void ** to void *.
* config/i386/heap-trampoline.c (__gcc_nested_func_ptr_created):
Change type of dst from void ** to void * and cast dst to void **
before dereferencing it.
* config/aarch64/heap-trampoline.c (__gcc_nested_func_ptr_created):
Likewise.
I'm seeing
../../../libgcc/shared-object.mk:14: warning: overriding recipe for target 'heap-trampoline.o'
../../../libgcc/shared-object.mk:14: warning: ignoring old recipe for target 'heap-trampoline.o'
../../../libgcc/shared-object.mk:17: warning: overriding recipe for target 'heap-trampoline_s.o'
../../../libgcc/shared-object.mk:17: warning: ignoring old recipe for target 'heap-trampoline_s.o'
This patch fixes that.
2024-02-01 Jakub Jelinek <jakub@redhat.com>
PR libgcc/113403
* config/i386/t-heap-trampoline: Add to LIB2ADDEHSHARED
i386/heap-trampoline.c rather than aarch64/heap-trampoline.c.
Use aarch64-asm.h in asm code consistently, this was started in
commit c608ada288
Author: Zac Walker <zacwalker@microsoft.com>
CommitDate: 2024-01-23 15:32:30 +0000
Ifdef `.hidden`, `.type`, and `.size` pseudo-ops for `aarch64-w64-mingw32` target
But that commit failed to remove some existing markings from asm files,
which means some objects got double marked with gnu property notes.
libgcc/ChangeLog:
* config/aarch64/crti.S: Remove stack marking.
* config/aarch64/crtn.S: Remove stack marking, include aarch64-asm.h
* config/aarch64/lse.S: Remove stack and GNU property markings.
In order to handle system security constraints during GCC build
and test and that most platform versions cannot link to libgcc_eh
since the unwinder there is incompatible with the system one.
1. We make the support functions weak definitions.
2. We include them as a CRT for platform conditions that do not
allow libgcc_eh.
3. We ensure that the weak symbols are exported from DSOs (which
includes exes on Darwin) so that the dynamic linker will
pick one instance (which avoids duplication of trampoline
caches).
PR libgcc/113403
gcc/ChangeLog:
* config/darwin.h (DARWIN_SHARED_WEAK_ADDS, DARWIN_WEAK_CRTS): New.
(REAL_LIBGCC_SPEC): Move weak CRT handling to separate spec.
* config/i386/darwin.h (DARWIN_HEAP_T_LIB): New.
* config/i386/darwin32-biarch.h (DARWIN_HEAP_T_LIB): New.
* config/i386/darwin64-biarch.h (DARWIN_HEAP_T_LIB): New.
* config/rs6000/darwin.h (DARWIN_HEAP_T_LIB): New.
libgcc/ChangeLog:
* config.host: Build libheap_t.a for i686/x86_64 Darwin.
* config/aarch64/heap-trampoline.c (HEAP_T_ATTR): New.
(allocate_tramp_ctrl): Allow a target to build this as a weak def.
(__gcc_nested_func_ptr_created): Likewise.
* config/i386/heap-trampoline.c (HEAP_T_ATTR): New.
(allocate_tramp_ctrl): Allow a target to build this as a weak def.
(__gcc_nested_func_ptr_created): Likewise.
* config/t-darwin: Build libheap_t.a (a CRT with heap trampoline
support).
This removes the heap trampoline support functions from libgcc.a and
adds them to libgcc_eh.a. They are also present in libgcc_s.
PR libgcc/113403
libgcc/ChangeLog:
* config/aarch64/t-heap-trampoline: Move the heap trampoline
support functions from libgcc.a to libgcc_eh.a.
* config/i386/t-heap-trampoline: Likewise.
The symbols for the functions supporting heap-based trampolines were
exported at an incorrect symbol version, the following patch fixes that.
As requested in the PR, this also renames __builtin_nested_func_ptr* to
__gcc_nested_func_ptr*. In carrying our the rename, we move the builtins
to use DEF_EXT_LIB_BUILTIN.
PR libgcc/113402
gcc/ChangeLog:
* builtins.cc (expand_builtin): Handle BUILT_IN_GCC_NESTED_PTR_CREATED
and BUILT_IN_GCC_NESTED_PTR_DELETED.
* builtins.def (BUILT_IN_GCC_NESTED_PTR_CREATED,
BUILT_IN_GCC_NESTED_PTR_DELETED): Make these builtins LIB-EXT and
rename the library fallbacks to __gcc_nested_func_ptr_created and
__gcc_nested_func_ptr_deleted.
* doc/invoke.texi: Rename these to __gcc_nested_func_ptr_created
and __gcc_nested_func_ptr_deleted.
* tree-nested.cc (finalize_nesting_tree_1): Use builtin_explicit for
BUILT_IN_GCC_NESTED_PTR_CREATED and BUILT_IN_GCC_NESTED_PTR_DELETED.
* tree.cc (build_common_builtin_nodes): Build the
BUILT_IN_GCC_NESTED_PTR_CREATED and BUILT_IN_GCC_NESTED_PTR_DELETED local
builtins only for non-explicit.
libgcc/ChangeLog:
* config/aarch64/heap-trampoline.c: Rename
__builtin_nested_func_ptr_created to __gcc_nested_func_ptr_created and
__builtin_nested_func_ptr_deleted to __gcc_nested_func_ptr_deleted.
* config/i386/heap-trampoline.c: Likewise.
* libgcc2.h: Likewise.
* libgcc-std.ver.in (GCC_7.0.0): Likewise and then move
__gcc_nested_func_ptr_created and
__gcc_nested_func_ptr_deleted from this symbol version to ...
(GCC_14.0.0): ... this one.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
Co-authored-by: Jakub Jelinek <jakub@redhat.com>
This is enough to get gfx1030 and gfx1100 working; there are still some test
failures to investigate, and probably some tuning to do.
gcc/ChangeLog:
* config/gcn/gcn-opts.h (TARGET_PACKED_WORK_ITEMS): Add TARGET_RDNA3.
* config/gcn/gcn-valu.md (all_convert): New iterator.
(<convop><V_INT_1REG_ALT:mode><V_INT_1REG:mode>2<exec>): New
define_expand, and rename the old one to ...
(*<convop><V_INT_1REG_ALT:mode><V_INT_1REG:mode>_sdwa<exec>): ... this.
(extend<V_INT_1REG_ALT:mode><V_INT_1REG:mode>2<exec>): Likewise, to ...
(extend<V_INT_1REG_ALT:mode><V_INT_1REG:mode>_sdwa<exec>): .. this.
(*<convop><V_INT_1REG_ALT:mode><V_INT_1REG:mode>_shift<exec>): New.
* config/gcn/gcn.cc (gcn_global_address_p): Use "offsetbits" correctly.
(gcn_hsa_declare_function_name): Update the vgpr counting for gfx1100.
* config/gcn/gcn.md (<u>mulhisi3): Disable on RDNA3.
(<u>mulqihi3_scalar): Likewise.
libgcc/ChangeLog:
* config/gcn/amdgcn_veclib.h (CDNA3_PLUS): Handle RDNA3.
libgomp/ChangeLog:
* config/gcn/time.c (RTC_TICKS): Configure RDNA3.
(omp_get_wtime): Add RDNA3-compatible variant.
* plugin/plugin-gcn.c (max_isa_vgprs): Tune for gfx1030 and gfx1100.
Signed-off-by: Andrew Stubbs <ams@baylibre.com>
Recent
change (https://gcc.gnu.org/pipermail/gcc-cvs/2023-December/394915.html)
added a generic SME support using `.hidden`, `.type`, and ``.size`
pseudo-ops in the assembly sources, `aarch64-w64-mingw32` does not
support the pseudo-ops though. This patch wraps usage of those
pseudo-ops using macros and ifdefs them for `__ELF__` define.
libgcc/
* config/aarch64/aarch64-asm.h (HIDDEN, SYMBOL_SIZE, SYMBOL_TYPE)
(ENTRY_ALIGN, GNU_PROPERTY): New macros.
* config/aarch64/__arm_sme_state.S: Use them.
* config/aarch64/__arm_tpidr2_save.S: Likewise.
* config/aarch64/__arm_za_disable.S: Likewise.
* config/aarch64/crti.S: Likewise.
* config/aarch64/lse.S: Likewise.
For now, for single-threaded GCN, nvptx target use only; extension for
multi-threaded offloading use is to follow later. Eventually switch to
libstdc++-v3/libsupc++ proper.
libgcc/
* c++-minimal/README: New.
* c++-minimal/guard.c: New.
* config/gcn/t-amdgcn (LIB2ADD): Add it.
* config/nvptx/t-nvptx (LIB2ADD): Likewise.
If we allow __strub_leave to allocate a frame on sparc, it will
overlap with a lot of the stack range we're supposed to scrub, because
of the large fixed-size outgoing args and register save area.
Unfortunately, setting up the PIC register seems to prevent the frame
pointer from being omitted.
Since the strub runtime doesn't issue calls or use global variables,
at least on sparc, disabling PIC to compile strub.c seems to do the
right thing.
for libgcc/ChangeLog
PR middle-end/112917
* config.host (sparc, sparc64): Enable...
* config/sparc/t-sparc: ... this new fragment.
This adds initial support for function multiversioning on aarch64 using
the target_version and target_clones attributes. This loosely follows
the Beta specification in the ACLE [1], although with some differences
that still need to be resolved (possibly as follow-up patches).
Existing function multiversioning implementations are broken in various
ways when used across translation units. This includes placing
resolvers in the wrong translation units, and using symbol mangling that
callers to unintentionally bypass the resolver in some circumstances.
Fixing these issues for aarch64 will require modifications to our ACLE
specification. It will also require further adjustments to existing
middle end code, to facilitate different mangling and resolver
placement while preserving existing target behaviours.
The list of function multiversioning features specified in the ACLE is
also inconsistent with the list of features supported in target option
extensions. I intend to resolve some or all of these inconsistencies at
a later stage.
The target_version attribute is currently only supported in C++, since
this is the only frontend with existing support for multiversioning
using the target attribute. On the other hand, this patch happens to
enable multiversioning with the target_clones attribute in Ada and D, as
well as the entire C family, using their existing frontend support.
This patch also does not support the following aspects of the Beta
specification:
- The target_clones attribute should allow an implicit unlisted
"default" version.
- There should be an option to disable function multiversioning at
compile time.
- Unrecognised target names in a target_clones attribute should be
ignored (with an optional warning). This current patch raises an
error instead.
[1] https://github.com/ARM-software/acle/blob/main/main/acle.md#function-multi-versioning
gcc/ChangeLog:
* config/aarch64/aarch64-feature-deps.h (fmv_deps_<FEAT_NAME>):
Define aarch64_feature_flags mask foreach FMV feature.
* config/aarch64/aarch64-option-extensions.def: Use new macros
to define FMV feature extensions.
* config/aarch64/aarch64.cc (aarch64_option_valid_attribute_p):
Check for target_version attribute after processing target
attribute.
(aarch64_fmv_feature_data): New.
(aarch64_parse_fmv_features): New.
(aarch64_process_target_version_attr): New.
(aarch64_option_valid_version_attribute_p): New.
(get_feature_mask_for_version): New.
(compare_feature_masks): New.
(aarch64_compare_version_priority): New.
(build_ifunc_arg_type): New.
(make_resolver_func): New.
(add_condition_to_bb): New.
(dispatch_function_versions): New.
(aarch64_generate_version_dispatcher_body): New.
(aarch64_get_function_versions_dispatcher): New.
(aarch64_common_function_versions): New.
(aarch64_mangle_decl_assembler_name): New.
(TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P): New implementation.
(TARGET_OPTION_EXPANDED_CLONES_ATTRIBUTE): New implementation.
(TARGET_OPTION_FUNCTION_VERSIONS): New implementation.
(TARGET_COMPARE_VERSION_PRIORITY): New implementation.
(TARGET_GENERATE_VERSION_DISPATCHER_BODY): New implementation.
(TARGET_GET_FUNCTION_VERSIONS_DISPATCHER): New implementation.
(TARGET_MANGLE_DECL_ASSEMBLER_NAME): New implementation.
* config/aarch64/aarch64.h (TARGET_HAS_FMV_TARGET_ATTRIBUTE):
Set target macro.
* config/arm/aarch-common.h (enum aarch_parse_opt_result): Add
new value to report duplicate FMV feature.
* common/config/aarch64/cpuinfo.h: New file.
libgcc/ChangeLog:
* config/aarch64/cpuinfo.c (enum CPUFeatures): Move to shared
copy in gcc/common
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/options_set_17.c: Reorder expected flags.
* gcc.target/aarch64/cpunative/native_cpu_0.c: Ditto.
* gcc.target/aarch64/cpunative/native_cpu_13.c: Ditto.
* gcc.target/aarch64/cpunative/native_cpu_16.c: Ditto.
* gcc.target/aarch64/cpunative/native_cpu_17.c: Ditto.
* gcc.target/aarch64/cpunative/native_cpu_18.c: Ditto.
* gcc.target/aarch64/cpunative/native_cpu_19.c: Ditto.
* gcc.target/aarch64/cpunative/native_cpu_20.c: Ditto.
* gcc.target/aarch64/cpunative/native_cpu_21.c: Ditto.
* gcc.target/aarch64/cpunative/native_cpu_22.c: Ditto.
* gcc.target/aarch64/cpunative/native_cpu_6.c: Ditto.
* gcc.target/aarch64/cpunative/native_cpu_7.c: Ditto.
This is added to enable function multiversioning, but can also be used
directly. The interface is chosen to match that used in LLVM's
compiler-rt, to facilitate cross-compiler compatibility.
The content of the patch is derived almost entirely from Pavel's prior
contributions to compiler-rt/lib/builtins/cpu_model.c. I have made minor
changes to align more closely with GCC coding style, and to exclude any code
from other LLVM contributors, and am adding this to GCC with Pavel's approval.
libgcc/ChangeLog:
* config/aarch64/t-aarch64: Include cpuinfo.c
* config/aarch64/cpuinfo.c: New file
(__init_cpu_features_constructor) New.
(__init_cpu_features_resolver) New.
(__init_cpu_features) New.
Co-authored-by: Pavel Iliin <Pavel.Iliin@arm.com>
To support the ZA lazy save scheme, the PCS requires the unwinder to
reset the SME state to PSTATE.SM=0, PSTATE.ZA=0, TPIDR2_EL0=0 on entry
to an exception handler. We use the __arm_za_disable SME runtime call
unconditionally to achieve this.
https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#exceptions
The hidden alias is used to avoid a PLT and avoid inconsistent VPCS
marking (we don't rely on special PCS at the call site). In case of
static linking the SME runtime init code is linked in code that raises
exceptions.
libgcc/ChangeLog:
* config/aarch64/__arm_za_disable.S: Add hidden alias.
* config/aarch64/aarch64-unwind.h: Reset the SME state before
EH return via the _Unwind_Frames_Extra hook.
The call ABI for SME (Scalable Matrix Extension) requires a number of
helper routines which are added to libgcc so they are tied to the
compiler version instead of the libc version. See
https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#sme-support-routines
The routines are in shared libgcc and static libgcc eh, even though
they are not related to exception handling. This is to avoid linking
a copy of the routines into dynamic linked binaries, because TPIDR2_EL0
block can be extended in the future which is better to handle in a
single place per process.
The support routines have to decide if SME is accessible or not. Linux
tells userspace if SME is accessible via AT_HWCAP2, otherwise a new
__aarch64_sme_accessible symbol was introduced that a libc can define.
Due to libgcc and libc build order, the symbol availability cannot be
checked so for __aarch64_sme_accessible an unistd.h feature test macro
is used while such detection mechanism is not available for __getauxval
so we rely on configure checks based on the target triplet.
Asm helper code is added to make writing the routines easier.
libgcc/ChangeLog:
* config/aarch64/t-aarch64: Add sources to the build.
* config/aarch64/__aarch64_have_sme.c: New file.
* config/aarch64/__arm_sme_state.S: New file.
* config/aarch64/__arm_tpidr2_restore.S: New file.
* config/aarch64/__arm_tpidr2_save.S: New file.
* config/aarch64/__arm_za_disable.S: New file.
* config/aarch64/aarch64-asm.h: New file.
* config/aarch64/libgcc-sme.ver: New file.
The rx port has a bunch of what I presume are ABI compatibility functions in
libgcc. Those compatibility functions routines such as __eqdf2 from libgcc,
but without a prototype. This patch adds the missing prototypes.
libgcc/
* config/rx/rx-abi-functions.c (__ltdf2, __gtdf2): Add prototype.
(__ledf2, __gedf2, __eqdf2, __nedf2): Likewise.
(__ltsf2, __gtsf2, __lesf2, __gesf2, __eqsf2, __nesf2): Likewise.
Two issues prevent the frv-elf port from building after the C99 changes. First
the trampoline code emitted into libgcc has calls to exit, but no prototype.
Adding a trivial prototype for exit() into the macro fixes that little goof.
Second, frvbegin.c has a call to atexit, so a quick prototype is added into
frvbegin.c to fix that problem.
That's enough to get the compiler building again.
gcc/
* config/frv/frv.h (TRANSFER_FROM_TRAMPOLINE): Add prototype for exit.
libgcc/
* config/frv/frvbegin.c (atexit): Add prototype.
__sync_val_compare_and_swap may be used on 128-bit types and either calls the
outline atomic code or uses an inline loop. On AArch64 LDXP is only atomic if
the value is stored successfully using STXP, but the current implementations
do not perform the store if the comparison fails. In this case the value
returned is not read atomically.
gcc/ChangeLog:
PR target/111404
* config/aarch64/aarch64.cc (aarch64_split_compare_and_swap):
For 128-bit store the loaded value and loop if needed.
libgcc/ChangeLog:
PR target/111404
* config/aarch64/lse.S (__aarch64_cas16_acq_rel): Execute STLXP using
either new value or loaded value.
My previous patch to add an implementation of __sync_syncrhonize with
a warning trips a testsuite failure in fortran (and possibly other
languages as well) as the framework expects no blank lines in the
output, but this warning was generating one. So remove the newline
from the end of the message and rely on the one added by the linker
instead.
Since we're there, remove the trailing period from the message as
well, since the convention seems to be not to have one.
libgcc/
* config/arm/lib1funcs.S (__sync_synchronize): Adjust warning message.
Prior to Armv6 there was no architected method to synchronize data
across processors. Armv6 saw the first introduction of
multi-processor support, using a CP15 operation; but this was
deprecated in Armv7 and is not supported on m-profile devices of any
form. Armv7 (and armv6-m) and later support data synchronization via
the DMB instruction.
This all leads to difficulties when linking programs as the user
generally needs to know which synchronization method is needed, but
there seems no easy way around this, when there are no OS-related
primitives available.
I've addressed this by adding multiple variants of __sync_synchronize
to libgcc, one for each of the above use cases. I've named these
__sync_synchronize_none, __sync_synchronize_cp15dmb and
__sync_synchronize_dmb. I've also added three specs files that can be
used to direct the linker to pick the appropriate implementation.
Using specs fragments for this is preferable to directing the user to
directly use --defsym as the latter has to be placed at the correct
position on the command line to be effective and the spec rule ensures
this automatically.
I've also added a default implementation of __sync_synchronize. The
default implementation will use DMB if that is available in the target
ISA, or fall back to a nul-implementation if it isn't. In the latter
case it will cause the linker (GNU LD) to emit a warning that
specifies how to pick a specific implementation. I've chosen not to
permit this default to use the CP15 solution as that has been
deprecated.
libgcc:
* config.host (arm*-*-eabi* | arm*-*-rtems*):
Add arm/t-sync to the makefile rules.
* config/arm/lib1funcs.S (__sync_synchronize_none)
(__sync_synchronize_cp15dmb, __sync_synchronize_dmb)
(__sync_synchronize): New functions.
* config/arm/t-sync: New file.
* config/arm/sync-none.specs: Likewise.
* config/arm/sync-dmb.specs: Likewise.
* config/arm/sync-cp15dmb.specs: Likewise.
libgcc/config/avr/libf7/
* libf7-const.def [F7MOD_sinh_]: Add MiniMax polynomial.
* libf7.c (f7_sinh): Use it instead of (exp(x) - exp(-x)) / 2
when |x| < 0.5 to avoid loss of precision due to cancellation.
Check for non-zero denorm in __adddf3. Need to check both the upper and
lower 32-bit chunks of a 64-bit float for a non-zero value when
checking to see if the value is -0.
Fix __addsf3 when the sum exponent is exactly 0xff to ensure that
produces infinity and not nan.
Handle converting NaN/inf values between formats.
Handle underflow and overflow when truncating.
Write a replacement for __fixxfsi so that it does not raise extra
exceptions during an extra conversion from long double to double.
libgcc/
* config/m68k/lb1sf68.S (__adddf3): Properly check for non-zero denorm.
(__divdf3): Restore sign bit properly.
(__addsf3): Correct exponent check.
* config/m68k/fpgnulib.c (EXPMASK): Define.
(__extendsfdf2): Handle Inf and NaN properly.
(__truncdfsf2): Handle underflow and overflow correctly.
(__extenddfxf2): Handle underflow, denorms, Inf and NaN correctly.
(__truncxfdf2): Handle underflow and denorms correctly.
(__fixxfsi): Reimplement.
The following patch adds the missing
{unsigned ,}__int128 <-> _Decimal{32,64,128}
conversion support into libgcc.a on top of the _BitInt support
(doing it without that would be larger amount of code and I hope all
the targets which support __int128 will eventually support _BitInt,
after all it is a required part of C23) and because it is in libgcc.a
only, it doesn't hurt that much if it is added for some architectures
only in GCC 15.
Initially I thought about doing this on the compiler side, but doing
it on the library side seems to be easier and more -Os friendly.
The tests currently require bitint effective target, that can be
removed when all the int128 targets support bitint.
2023-11-09 Jakub Jelinek <jakub@redhat.com>
PR libgcc/65833
libgcc/
* config/t-softfp (softfp_bid_list): Add
{U,}TItype <-> _Decimal{32,64,128} conversions.
* soft-fp/floattisd.c: New file.
* soft-fp/floattidd.c: New file.
* soft-fp/floattitd.c: New file.
* soft-fp/floatuntisd.c: New file.
* soft-fp/floatuntidd.c: New file.
* soft-fp/floatuntitd.c: New file.
* soft-fp/fixsdti.c: New file.
* soft-fp/fixddti.c: New file.
* soft-fp/fixtdti.c: New file.
* soft-fp/fixunssdti.c: New file.
* soft-fp/fixunsddti.c: New file.
* soft-fp/fixunstdti.c: New file.
gcc/testsuite/
* gcc.dg/dfp/int128-1.c: New test.
* gcc.dg/dfp/int128-2.c: New test.
* gcc.dg/dfp/int128-3.c: New test.
* gcc.dg/dfp/int128-4.c: New test.