
Here is a complete patch to add std::bfloat16_t support on x86 (AArch64
and ARM left for later).

Almost no BFmode optabs are added by the patch, so for binops/unops
it extends to SFmode first and then truncates back to BFmode.

For {HF,SF,DF,XF,TF}mode -> BFmode conversions libgcc has implementations
of all those conversions so that we avoid double rounding, for
BFmode -> {DF,XF,TF}mode conversions to avoid growing libgcc too much
it emits BFmode -> SFmode conversion first and then converts to the even
wider mode, neither step should be imprecise.

For BFmode -> HFmode, it first emits a precise BFmode -> SFmode conversion
and then SFmode -> HFmode, because neither format is subset or superset
of the other, while SFmode is superset of both.

expr.cc then contains a -ffast-math optimization of the BF -> SF and
SF -> BF conversions if we don't optimize for space (and for the latter
if -frounding-math isn't enabled either).

For x86, perhaps truncsfbf2 optab could be defined for TARGET_AVX512BF16
but IMNSHO should FAIL if !flag_finite_math || flag_rounding_math
|| !flag_unsafe_math_optimizations, because I think the insn doesn't
raise on sNaNs, hardcodes round to nearest and flushes denormals to zero.

By default (unless x86 -fexcess-precision=16) we use float excess
precision for BFmode, so truncate only on explicit casts and assignments.

The patch introduces a single __bf16 builtin - __builtin_nansf16b,
because (__bf16) __builtin_nansf ("") will drop the sNaN into qNaN,
and uses f16b suffix instead of bf16 because there would be ambiguity on
log vs. logb - __builtin_logbf16 could be either log with bf16 suffix
or logb with f16 suffix. In other cases libstdc++ should mostly use
__builtin_*f for std::bfloat16_t overloads (we have a problem with
std::nextafter though but that one we have also for std::float16_t).

2022-10-14  Jakub Jelinek  <jakub@redhat.com>

gcc/
	* tree-core.h (enum tree_index): Add TI_BFLOAT16_TYPE.
	* tree.h (bfloat16_type_node): Define.
	* tree.cc (excess_precision_type): Promote bfloat16_type_mode
	like float16_type_mode.
	(build_common_tree_nodes): Initialize bfloat16_type_node if
	BFmode is supported.
	* expmed.h (maybe_expand_shift): Declare.
	* expmed.cc (maybe_expand_shift): No longer static.
	* expr.cc (convert_mode_scalar): Don't ICE on BF -> HF or HF -> BF
	conversions. If there is no optab, handle BF -> {DF,XF,TF,HF}
	conversions as separate BF -> SF -> {DF,XF,TF,HF} conversions, add
	-ffast-math generic implementation for BF -> SF and SF -> BF
	conversions.
	* builtin-types.def (BT_BFLOAT16, BT_FN_BFLOAT16_CONST_STRING): New.
	* builtins.def (BUILT_IN_NANSF16B): New builtin.
	* fold-const-call.cc (fold_const_call): Handle CFN_BUILT_IN_NANSF16B.
	* config/i386/i386.cc (classify_argument): Handle E_BCmode.
	(ix86_libgcc_floating_mode_supported_p): Also return true for BFmode
	for -msse2.
	(ix86_mangle_type): Mangle BFmode as DF16b.
	(ix86_invalid_conversion, ix86_invalid_unary_op,
	ix86_invalid_binary_op): Remove.
	(TARGET_INVALID_CONVERSION, TARGET_INVALID_UNARY_OP,
	TARGET_INVALID_BINARY_OP): Don't redefine.
	* config/i386/i386-builtins.cc (ix86_bf16_type_node): Remove.
	(ix86_register_bf16_builtin_type): Use bfloat16_type_node rather than
	ix86_bf16_type_node, only create it if still NULL.
	* config/i386/i386-builtin-types.def (BFLOAT16): Likewise.
	* config/i386/i386.md (cbranchbf4, cstorebf4): New expanders.
gcc/c-family/
	* c-cppbuiltin.cc (c_cpp_builtins): If bfloat16_type_node,
	predefine __BFLT16_*__ macros and for C++23 also
	__STDCPP_BFLOAT16_T__. Predefine bfloat16_type_node related
	macros for -fbuilding-libgcc.
	* c-lex.cc (interpret_float): Handle CPP_N_BFLOAT16.
gcc/c/
	* c-typeck.cc (convert_arguments): Don't promote __bf16 to double.
gcc/cp/
	* cp-tree.h (extended_float_type_p): Return true for
	bfloat16_type_node.
	* typeck.cc (cp_compare_floating_point_conversion_ranks): Set
	extended{1,2} if mv{1,2} is bfloat16_type_node. Adjust comment.
gcc/testsuite/
	* lib/target-supports.exp (check_effective_target_bfloat16,
	check_effective_target_bfloat16_runtime, add_options_for_bfloat16):
	New.
	* gcc.dg/torture/bfloat16-basic.c: New test.
	* gcc.dg/torture/bfloat16-builtin.c: New test.
	* gcc.dg/torture/bfloat16-builtin-issignaling-1.c: New test.
	* gcc.dg/torture/bfloat16-complex.c: New test.
	* gcc.dg/torture/builtin-issignaling-1.c: Allow to be includable
	from bfloat16-builtin-issignaling-1.c.
	* gcc.dg/torture/floatn-basic.h: Allow to be includable from
	bfloat16-basic.c.
	* gcc.target/i386/vect-bfloat16-typecheck_2.c: Adjust expected
	diagnostics.
	* gcc.target/i386/sse2-bfloat16-scalar-typecheck.c: Likewise.
	* gcc.target/i386/vect-bfloat16-typecheck_1.c: Likewise.
	* g++.target/i386/bfloat_cpp_typecheck.C: Likewise.
libcpp/
	* include/cpplib.h (CPP_N_BFLOAT16): Define.
	* expr.cc (interpret_float_suffix): Handle bf16 and BF16 suffixes for
	C++.
libgcc/
	* config/i386/t-softfp (softfp_extensions): Add bfsf.
	(softfp_truncations): Add tfbf xfbf dfbf sfbf hfbf.
	(CFLAGS-extendbfsf2.c, CFLAGS-truncsfbf2.c, CFLAGS-truncdfbf2.c,
	CFLAGS-truncxfbf2.c, CFLAGS-trunctfbf2.c, CFLAGS-trunchfbf2.c): Add
	-msse2.
	* config/i386/libgcc-glibc.ver (GCC_13.0.0): Export
	__extendbfsf2 and __trunc{s,d,x,t,h}fbf2.
	* config/i386/sfp-machine.h (_FP_NANSIGN_B): Define.
	* config/i386/64/sfp-machine.h (_FP_NANFRAC_B): Define.
	* config/i386/32/sfp-machine.h (_FP_NANFRAC_B): Define.
	* soft-fp/brain.h: New file.
	* soft-fp/truncsfbf2.c: New file.
	* soft-fp/truncdfbf2.c: New file.
	* soft-fp/truncxfbf2.c: New file.
	* soft-fp/trunctfbf2.c: New file.
	* soft-fp/trunchfbf2.c: New file.
	* soft-fp/truncbfhf2.c: New file.
	* soft-fp/extendbfsf2.c: New file.
libiberty/
	* cp-demangle.h (D_BUILTIN_TYPE_COUNT): Increment.
	* cp-demangle.c (cplus_demangle_builtin_types): Add std::bfloat16_t
	entry.
	(cplus_demangle_type): Demangle DF16b.
	* testsuite/demangle-expected (_Z3xxxDF16b): New test.
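
The BF -> SF and SF -> BF conversions mentioned in the commit message can be illustrated with plain integer arithmetic, since a bfloat16 value is the upper 16 bits of an IEEE single. The sketch below is only an illustration of that idea under -ffast-math style assumptions (no NaNs, round to nearest); the helper names bf16_bits_to_float and float_to_bf16_bits are hypothetical and do not come from the patch, and the exact sequence expr.cc emits may differ.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the patch): widen a bfloat16 bit
   pattern to float.  This is always exact, because bfloat16 is the
   top half of the IEEE single format.  */
static float
bf16_bits_to_float (uint16_t b)
{
  uint32_t u = (uint32_t) b << 16;
  float f;
  memcpy (&f, &u, sizeof f);
  return f;
}

/* Hypothetical helper (not from the patch): narrow a float to a
   bfloat16 bit pattern, rounding to nearest with ties to even.
   Assumes the input is not a NaN and that the current rounding mode
   is round-to-nearest; a NaN payload could be rounded into an
   infinity by this arithmetic, and no exceptions are raised.  */
static uint16_t
float_to_bf16_bits (float f)
{
  uint32_t u;
  memcpy (&u, &f, sizeof u);
  /* Add 0x7fff plus the lowest kept bit, then drop the low half.  */
  u += 0x7fff + ((u >> 16) & 1);
  return (uint16_t) (u >> 16);
}
```

For inputs that may be NaNs, or when -frounding-math is in effect, this shortcut is not valid, which is consistent with the commit message restricting the inline expansion and otherwise relying on the libgcc routines such as __truncsfbf2.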
101 lines
3.2 KiB
C
#ifdef __MINGW32__
  /* Make sure we are using gnu-style bitfield handling.  */
#define _FP_STRUCT_LAYOUT __attribute__ ((gcc_struct))
#endif

/* The type of the result of a floating point comparison.  This must
   match `__libgcc_cmp_return__' in GCC for the target.  */
typedef int __gcc_CMPtype __attribute__ ((mode (__libgcc_cmp_return__)));
#define CMPtype __gcc_CMPtype

#ifdef __x86_64__
#include "config/i386/64/sfp-machine.h"
#else
#include "config/i386/32/sfp-machine.h"
#endif

#define _FP_KEEPNANFRACP 1
#define _FP_QNANNEGATEDP 0

#define _FP_NANSIGN_H 1
#define _FP_NANSIGN_B 1
#define _FP_NANSIGN_S 1
#define _FP_NANSIGN_D 1
#define _FP_NANSIGN_E 1
#define _FP_NANSIGN_Q 1

/* Here is something Intel misdesigned: the specs don't define
   the case where we have two NaNs with same mantissas, but
   different sign. Different operations pick up different NaNs.  */
#define _FP_CHOOSENAN(fs, wc, R, X, Y, OP)			\
  do {								\
    if (_FP_FRAC_GT_##wc(X, Y)					\
	|| (_FP_FRAC_EQ_##wc(X,Y) && (OP == '+' || OP == '*')))	\
      {								\
	R##_s = X##_s;						\
	_FP_FRAC_COPY_##wc(R,X);				\
      }								\
    else							\
      {								\
	R##_s = Y##_s;						\
	_FP_FRAC_COPY_##wc(R,Y);				\
      }								\
    R##_c = FP_CLS_NAN;						\
  } while (0)

#ifndef _SOFT_FLOAT
#define FP_EX_INVALID 0x01
#define FP_EX_DENORM 0x02
#define FP_EX_DIVZERO 0x04
#define FP_EX_OVERFLOW 0x08
#define FP_EX_UNDERFLOW 0x10
#define FP_EX_INEXACT 0x20
#define FP_EX_ALL \
	(FP_EX_INVALID | FP_EX_DENORM | FP_EX_DIVZERO | FP_EX_OVERFLOW \
	 | FP_EX_UNDERFLOW | FP_EX_INEXACT)

void __sfp_handle_exceptions (int);

#define FP_HANDLE_EXCEPTIONS			\
  do {						\
    if (__builtin_expect (_fex, 0))		\
      __sfp_handle_exceptions (_fex);		\
  } while (0)

#define FP_TRAPPING_EXCEPTIONS ((~_fcw >> FP_EX_SHIFT) & FP_EX_ALL)

#define FP_ROUNDMODE (_fcw & FP_RND_MASK)
#endif

#define _FP_TININESS_AFTER_ROUNDING 1

#define __LITTLE_ENDIAN 1234
#define __BIG_ENDIAN 4321

#define __BYTE_ORDER __LITTLE_ENDIAN

/* Define ALIASNAME as a strong alias for NAME.  */
#if defined __APPLE__
/* Mach-O doesn't support aliasing, so we build a secondary function for
   the alias - we need to do a bit of a dance to find out what the type of
   the arguments is and then apply that to the secondary function.
   If these functions ever return anything but CMPtype we need to revisit
   this... */
typedef float alias_HFtype __attribute__ ((mode (HF)));
typedef float alias_SFtype __attribute__ ((mode (SF)));
typedef float alias_DFtype __attribute__ ((mode (DF)));
typedef float alias_TFtype __attribute__ ((mode (TF)));
#define ALIAS_SELECTOR \
  CMPtype (*) (alias_HFtype, alias_HFtype): (alias_HFtype) 0, \
  CMPtype (*) (alias_SFtype, alias_SFtype): (alias_SFtype) 0, \
  CMPtype (*) (alias_DFtype, alias_DFtype): (alias_DFtype) 0, \
  CMPtype (*) (alias_TFtype, alias_TFtype): (alias_TFtype) 0
#define strong_alias(name, aliasname) \
  CMPtype aliasname (__typeof (_Generic (name, ALIAS_SELECTOR)) a, \
		     __typeof (_Generic (name, ALIAS_SELECTOR)) b) \
	{ return name (a, b); }
#else
# define strong_alias(name, aliasname) _strong_alias(name, aliasname)
# define _strong_alias(name, aliasname) \
  extern __typeof (name) aliasname __attribute__ ((alias (#name)));
#endif