A missed piece of the patch for static operator(): in tsubst_function_decl,
we don't want to replace the first parameter with a new closure pointer if
operator() is static.
PR c++/108526
PR c++/106651
gcc/cp/ChangeLog:
* pt.cc (tsubst_function_decl): Don't replace the closure
parameter if DECL_STATIC_FUNCTION_P.
gcc/testsuite/ChangeLog:
* g++.dg/cpp23/static-operator-call5.C: Pass -g.
This updates baseline_symbols.txt for the Fedora 39 arches.
Most of the added symbols are added to all 5 files, exceptions are
DF16_ rtti stuff (only added on x86 and aarch64, which support those),
DF16b rtti stuff (only x86 right now), _M_replace_cold (m vs. j
differences), DF128_ charconv (only x86), GLIBCXX_LDBL_3.4.31
symver (s390x), _M_get_sys_info/_M_get_local_info (l vs. x).
I was using
grep ^+ | sed 's/OBJECT:[0-9]*:/OBJECT:/' | sort | uniq -c | sort -n | less
on the patch to analyze.
powerpc64le-linux not included because I'll need to regenerate it.
2023-03-07 Jakub Jelinek <jakub@redhat.com>
* config/abi/post/x86_64-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/x86_64-linux-gnu/32/baseline_symbols.txt: Update.
* config/abi/post/i486-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/aarch64-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/s390x-linux-gnu/baseline_symbols.txt: Update.
libstdc++-v3/ChangeLog:
PR libstdc++/108882
* config/abi/pre/gnu.ver (GLIBCXX_3.4.31): Adjust patterns to
not match symbols in namespace std::__gnu_cxx11_ieee128.
* config/os/gnu-linux/ldbl-ieee128-extra.ver: Add patterns for
std::__gnu_cxx11_ieee128::money_{get,put}.
Here, -Wdangling-reference triggers where it probably shouldn't, causing
some grief. The code in question uses a reference wrapper with a member
function returning a reference to a subobject of a non-temporary object:
const Plane & meta = fm.planes().inner();
I've tried a few approaches, e.g., checking that the member function's
return type is the same as the type of the enclosing class (which is
the case for member functions returning *this), but that then breaks
Wdangling-reference4.C with std::optional<std::string>.
This patch adjusts do_warn_dangling_reference so that we look through
reference wrapper classes (meaning, has a reference member and a
constructor taking the same reference type, or is std::reference_wrapper
or std::ranges::ref_view) and don't warn for them, supposing that the
member function returns a reference to a non-temporary object.
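A minimal sketch of the pattern described above. The type names (Plane,
PlaneRef, FrameMeta) and the helper stride_of are hypothetical and only
modeled on the quoted line; they are not taken from the new tests.

```cpp
#include <cassert>

struct Plane { int stride; };

// Hypothetical reference wrapper: it has a reference member and a
// constructor taking the same reference type, which is the shape the
// patch treats as reference-like.
struct PlaneRef {
  const Plane &ref;
  explicit PlaneRef(const Plane &p) : ref(p) {}
  // Returns a reference to the wrapped object, not to the wrapper.
  const Plane &inner() const { return ref; }
};

struct FrameMeta {
  Plane plane{64};
  // Returns a temporary wrapper around a member that outlives it.
  PlaneRef planes() const { return PlaneRef(plane); }
};

// The temporary PlaneRef dies at the end of the full-expression, but the
// reference it hands out still points into fm, so nothing dangles.
inline int stride_of(const FrameMeta &fm) {
  const Plane &meta = fm.planes().inner();
  assert(&meta == &fm.plane);
  return meta.stride;
}
```

Calling stride_of on a live FrameMeta reads the member through the
reference after the wrapper temporary is gone, which is the case the
warning should stay quiet about.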
PR c++/107532
gcc/cp/ChangeLog:
* call.cc (reference_like_class_p): New.
(do_warn_dangling_reference): Add new bool parameter. See through
reference_like_class_p.
gcc/testsuite/ChangeLog:
* g++.dg/warn/Wdangling-reference8.C: New test.
* g++.dg/warn/Wdangling-reference9.C: New test.
This fixes another syntax error in slp-3.c. I missed a '{ ... }' in
order to properly exclude s390_vx.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/slp-3.c: Add '{ ... }'.
In my recent rtti.cc change I assumed when emitting the support tinfos
that the tinfos for the fundamental types haven't been created yet.
Normally (in libsupc++.a (fundamental_type_info.o)) that is the case,
but as can be seen on the testcase, one can violate it by using typeid
etc. in the same TU and do it before ~__fundamental_type_info ()
definition.
The following patch fixes that by popping from unemitted_tinfo_decls
only in the normal case when it is there, and treating non-NULL
DECL_INITIAL on a tinfo node as indication that emit_tinfo_decl has
processed it already.
2023-03-07 Jakub Jelinek <jakub@redhat.com>
PR c++/109042
* rtti.cc (emit_support_tinfo_1): Don't assert that last
unemitted_tinfo_decls element is tinfo, instead pop from it only in
that case.
* decl2.cc (c_parse_final_cleanups): Don't call emit_tinfo_decl
for unemitted_tinfo_decls which already have non-NULL DECL_INITIAL.
* g++.dg/rtti/pr109042.C: New test.
When processing a noexcept, constructors aren't elided: build_over_call
has
/* It's unsafe to elide the constructor when handling
a noexcept-expression, it may evaluate to the wrong
value (c++/53025). */
&& (force_elide || cp_noexcept_operand == 0))
so the assert I added recently needs to be relaxed a little bit.
PR c++/109030
gcc/cp/ChangeLog:
* constexpr.cc (cxx_eval_call_expression): Relax assert.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/noexcept77.C: New test.
Similarly to PR107938, this also started with r11-557, whereby cp_finish_decl
can call check_initializer even in a template for a constexpr initializer.
Here we are rejecting
extern const Q q;
template<int>
constexpr auto p = q(0);
even though q has a constexpr operator(). It's deemed non-const by
decl_maybe_constant_var_p because even though 'q' is const it is not
of integral/enum type.
If fun is not a function pointer, we don't know if we're using it as an
lvalue or rvalue, so with this patch we pass 'any' for want_rval. With
that, p_c_e/VAR_DECL doesn't flat out reject the underlying VAR_DECL.
PR c++/107939
gcc/cp/ChangeLog:
* constexpr.cc (potential_constant_expression_1) <case CALL_EXPR>: Pass
'any' when recursing on a VAR_DECL and not a pointer to function.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1y/var-templ74.C: Remove dg-error.
* g++.dg/cpp1y/var-templ77.C: New test.
Fix the precision of the RVV bool modes by adjusting it.
The bit size of vbool*_t will be adjusted to
[1, 2, 4, 8, 16, 32, 64] according to the RVV 1.0 ISA spec. The
adjusted mode precision of vbool*_t will help the underlying passes
make the right decision for both correctness and optimization.
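As a rough illustration of where the list of sizes comes from, the sketch
below assumes the usual RVV relation that vboolN_t holds one mask bit per
element, i.e. VLEN / N bits, and assumes a 64-bit minimum vector chunk.
The helper name vbool_bits is hypothetical and is not part of the patch.

```cpp
#include <cassert>

// Number of mask bits in vboolN_t for a vector length of vlen_bits,
// assuming one mask bit per element and VLEN / N elements.
inline unsigned vbool_bits(unsigned vlen_bits, unsigned n) {
  assert(n != 0 && vlen_bits % n == 0);
  return vlen_bits / n;
}
```

Under these assumptions vbool64_t through vbool1_t give 1, 2, 4, 8, 16, 32
and 64 bits for a 64-bit chunk, so vbool8_t and vbool16_t no longer look
like the same mode to the optimizers.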
Given the sample code below:
void test_1(int8_t * restrict in, int8_t * restrict out)
{
vbool8_t v2 = *(vbool8_t*)in;
vbool16_t v5 = *(vbool16_t*)in;
*(vbool16_t*)(out + 200) = v5;
*(vbool8_t*)(out + 100) = v2;
}
Before the precision adjustment:
addi a4,a1,100
vsetvli a5,zero,e8,m1,ta,ma
addi a1,a1,200
vlm.v v24,0(a0)
vsm.v v24,0(a4)
// Need one vsetvli and vlm.v for correctness here.
vsm.v v24,0(a1)
After the precision adjustment:
csrr t0,vlenb
slli t1,t0,1
csrr a3,vlenb
sub sp,sp,t1
slli a4,a3,1
add a4,a4,sp
sub a3,a4,a3
vsetvli a5,zero,e8,m1,ta,ma
addi a2,a1,200
vlm.v v24,0(a0)
vsm.v v24,0(a3)
addi a1,a1,100
vsetvli a4,zero,e8,mf2,ta,ma
csrr t0,vlenb
vlm.v v25,0(a3)
vsm.v v25,0(a2)
slli t1,t0,1
vsetvli a5,zero,e8,m1,ta,ma
vsm.v v24,0(a1)
add sp,sp,t1
jr ra
However, there may be some optimization opportunities after
the mode precision adjustment. They can be taken care of in
the RISC-V backend in separate follow-up PR(s).
gcc/ChangeLog:
PR target/108185
PR target/108654
* config/riscv/riscv-modes.def (ADJUST_PRECISION): Adjust VNx*BI
modes.
* config/riscv/riscv.cc (riscv_v_adjust_precision): New.
* config/riscv/riscv.h (riscv_v_adjust_precision): New.
* genmodes.cc (adj_precision): New.
(ADJUST_PRECISION): New.
(emit_mode_adjustments): Handle ADJUST_PRECISION.
gcc/testsuite/ChangeLog:
PR target/108185
PR target/108654
* gcc.target/riscv/rvv/base/pr108185-1.c: New test.
* gcc.target/riscv/rvv/base/pr108185-2.c: New test.
* gcc.target/riscv/rvv/base/pr108185-3.c: New test.
* gcc.target/riscv/rvv/base/pr108185-4.c: New test.
* gcc.target/riscv/rvv/base/pr108185-5.c: New test.
* gcc.target/riscv/rvv/base/pr108185-6.c: New test.
* gcc.target/riscv/rvv/base/pr108185-7.c: New test.
* gcc.target/riscv/rvv/base/pr108185-8.c: New test.
Signed-off-by: Pan Li <pan2.li@intel.com>
Co-authored-by: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
The stack protector needs a guard value on the stack and changes the stack
layout. So we need to disable it for those tests, to avoid test failures
with --enable-default-ssp.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/shrink_wrap_1.c (dg-options): Add
-fno-stack-protector.
* gcc.target/aarch64/stack-check-cfa-1.c (dg-options): Add
-fno-stack-protector.
* gcc.target/aarch64/stack-check-cfa-2.c (dg-options): Add
-fno-stack-protector.
* gcc.target/aarch64/test_frame_17.c (dg-options): Add
-fno-stack-protector.
Storing the stack guard variable needs one stp instruction, breaking the
scan-assembler-not pattern in the test. Disable stack protector to
avoid a test failure with --enable-default-ssp.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/pr104005.c (dg-options): Add
-fno-stack-protector.
The test scans for "const_int 0" in the RTL dump, but stack protector
can produce more "const_int 0". To avoid a failure with
--enable-default-ssp, disable stack protector for this.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/auto-init-7.c (dg-options): Add
-fno-stack-protector.
The stack protector influences code generation and causes function body
checks to fail.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/pr103147-10.c (dg-options): Add
-fno-stack-protector.
* g++.target/aarch64/pr103147-10.C: Likewise.
If GCC is configured with --enable-default-ssp, the stack protector can
make many sve-pcs tests fail.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/sve/pcs/aarch64-sve-pcs.exp (sve_flags):
Add -fno-stack-protector.
In PIE, symbol "fixed_regs" is addressed via GOT. It will break the
scan-assembler pattern and cause test failure with --enable-default-pie.
gcc/testsuite/ChangeLog:
PR testsuite/70150
* gcc.target/aarch64/fuse_adrp_add_1.c (dg-options): Add
-fno-pie.
These tests set the large code model with -mcmodel=large or a target pragma
for AArch64. But if GCC is configured with --enable-default-pie, it triggers
"sorry: unimplemented: code model large with -fpic". Disable PIE to
avoid the issue.
gcc/testsuite/ChangeLog:
PR testsuite/70150
* gcc.dg/tls/pr78796.c (dg-additional-options): Add -fno-pie
-no-pie for aarch64-*-*.
* gcc.target/aarch64/pr63304_1.c (dg-options): Add -fno-pie.
* gcc.target/aarch64/pr70120-2.c (dg-options): Add -fno-pie.
* gcc.target/aarch64/pr78733.c (dg-options): Add -fno-pie.
* gcc.target/aarch64/pr79041-2.c (dg-options): Add -fno-pie.
* gcc.target/aarch64/pr94530.c (dg-options): Add -fno-pie.
* gcc.target/aarch64/pr94577.c (dg-options): Add -fno-pie.
* gcc.target/aarch64/reload-valid-spoff.c (dg-options): Add
-fno-pie.
If GCC is built with --enable-default-pie, a lot of aapcs64 tests fail
because relocations unsupported in PIE are used.
gcc/testsuite/ChangeLog:
PR testsuite/70150
* gcc.target/aarch64/aapcs64/aapcs64.exp (additional_flags):
Add -fno-pie -no-pie.
While gcc.dg/plugin/must-tail-call-2.c passes for all targets even
without this, the error message is, for a target like cris-elf that
doesn't implement sibling calls: "error: cannot tail-call: machine
description does not have a sibcall_epilogue instruction pattern"
rather than "error: cannot tail-call: callee returns a structure".
Also, it'd be confusing to exclude must-tail-call-1.c but not
must-tail-call-2.c
* gcc.dg/plugin/must-tail-call-1.c, gcc.dg/plugin/must-tail-call-2.c:
Gate on effective target tail_call.
The RTL "expand" dump is the first RTL dump, and it also appears to be
the earliest trace of the target having implemented sibcalls.
The "," is included in the pattern searched for, to try and avoid
possible false matches, but there don't appear to be any identifiers
or target names nearby so this is just belts and suspenders. Using
"tail_call" as a shorter and more commonly used term than a derivative
of "sibling calls", and expecting only gcc folks to have heard of
"sibcalls".
* lib/target-supports.exp (check_effective_target_tail_call): New.
For 32-bit newlib targets (such as cris-elf and pru-elf),
int32_t is "long int". See other regexps in the
testsuite matching "aka (long )?int" (with single-quotes
where needed) where the pattern in
allocation-size-multiline-3.c matches plain "int". Uses the
special syntax recently introduced for multi-line patterns.
testsuite:
* gcc.dg/analyzer/allocation-size-multiline-3.c: Handle
int32_t being "long int".
Those multi-line-patterns are literal. Sometimes a regexp
needs to be matched. This is a start: just three elements
are supported: "(" ")" and the compound ")?" (and on second
thought, it can be argued that "(...)" alone is not useful).
Note that Tcl "string map" is documented to have the desired
effect: a once-over but no re-recognitions of previously
replaced mapped elements. Also, drop a doubled "containing".
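The once-over mapping can be sketched as follows. This is an illustration
in C++ rather than the Tcl in multiline.exp; map_once and build_pattern
are hypothetical names, and the escaping of the remaining literal text is
left out.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

using Table = std::vector<std::pair<std::string, std::string>>;

// Single left-to-right pass: at each position try every key, emit the
// replacement for the first key that matches, and never rescan text
// that has already been emitted.
inline std::string map_once(const std::string &in, const Table &table) {
  std::string out;
  std::size_t i = 0;
  while (i < in.size()) {
    bool matched = false;
    for (const auto &kv : table) {
      if (!kv.first.empty() &&
          in.compare(i, kv.first.size(), kv.first) == 0) {
        out += kv.second;
        i += kv.first.size();
        matched = true;
        break;
      }
    }
    if (!matched)
      out += in[i++];
  }
  return out;
}

// The three supported elements: "{re:" opens a group, ":re?}" closes an
// optional group, ":re}" closes a plain group.
inline std::string build_pattern(const std::string &line) {
  const Table table = {{"{re:", "("}, {":re?}", ")?"}, {":re}", ")"}};
  return map_once(line, table);
}

inline void check_mapping() {
  const std::string re = build_pattern("aka {re:long :re?}int");
  assert(re == "aka (long )?int");
  assert(std::regex_match("aka int", std::regex(re)));
  assert(std::regex_match("aka long int", std::regex(re)));
}
```

Because replaced text is never looked at again, a replacement that happens
to contain another key is left alone, which is the property the commit
relies on.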
testsuite:
* lib/multiline.exp (_build_multiline_regex): Map
"{re:" to "(", similarly ")?" from ":re?}" and the
same without question mark.
This patch updates the IEEE 128-bit types used in libgcc.
At the moment, we cannot build GCC when the target uses IEEE 128-bit long
doubles, such as building the compiler for a native Fedora 36 system. The
build dies when it is trying to build the _mulkc3.c and _divkc3 modules.
This patch changes libgcc to use long double for the IEEE 128-bit base type if
long double is IEEE 128-bit, and it uses _Float128 otherwise. The built-in
functions are adjusted to be the correct version based on the IEEE 128-bit base
type used.
While it is desirable to ultimately have __float128 and _Float128 use the same
internal type and mode within GCC, at present if you use the option
-mabi=ieeelongdouble, the __float128 type will use the long double type and not
the _Float128 type. We get an internal compiler error if we combine the
signbitf128 built-in with a long double type.
I've gone through several iterations of trying to fix this within GCC, and
there are various problems that have come up. I developed this alternative
patch that changes libgcc so that it does not tickle the issue. I hope we can
fix the compiler at some point, but right now, this is preventing people on
Fedora 36 systems from building compilers where the default long double is IEEE
128-bit.
2023-03-06 Michael Meissner <meissner@linux.ibm.com>
libgcc/
PR target/107299
* config/rs6000/_divkc3.c (COPYSIGN): Use the correct built-in based on
whether long double is IBM or IEEE.
(INFINITY): Likewise.
(FABS): Likewise.
* config/rs6000/_mulkc3.c (COPYSIGN): Likewise.
(INFINITY): Likewise.
* config/rs6000/quad-float128.h (TF): Remove definition.
(TFtype): Define to be long double or _Float128.
(TCtype): Define to be _Complex long double or _Complex _Float128.
* libgcc2.h (TFtype): Allow machine config files to override this.
(TCtype): Likewise.
* soft-fp/quad.h (TFtype): Likewise.
gcc/ChangeLog:
* config/gcn/gcn-valu.md (<expander><mode>3_exec): Add patterns for
{s|u}{max|min} in QI, HI and DI modes.
(<expander><mode>3): Add pattern for {s|u}{max|min} in DI mode.
(cond_<fexpander><mode>): Add pattern for cond_f{max|min}.
(cond_<expander><mode>): Add pattern for cond_{s|u}{max|min}.
* config/gcn/gcn.cc (gcn_spill_class): Allow the exec register to be
saved in SGPRs.
gcc/testsuite/ChangeLog:
* gcc.target/gcn/cond_fmaxnm_1.c: New test.
* gcc.target/gcn/cond_fmaxnm_1_run.c: New test.
* gcc.target/gcn/cond_fmaxnm_2.c: New test.
* gcc.target/gcn/cond_fmaxnm_2_run.c: New test.
* gcc.target/gcn/cond_fmaxnm_3.c: New test.
* gcc.target/gcn/cond_fmaxnm_3_run.c: New test.
* gcc.target/gcn/cond_fmaxnm_4.c: New test.
* gcc.target/gcn/cond_fmaxnm_4_run.c: New test.
* gcc.target/gcn/cond_fmaxnm_5.c: New test.
* gcc.target/gcn/cond_fmaxnm_5_run.c: New test.
* gcc.target/gcn/cond_fmaxnm_6.c: New test.
* gcc.target/gcn/cond_fmaxnm_6_run.c: New test.
* gcc.target/gcn/cond_fmaxnm_7.c: New test.
* gcc.target/gcn/cond_fmaxnm_7_run.c: New test.
* gcc.target/gcn/cond_fmaxnm_8.c: New test.
* gcc.target/gcn/cond_fmaxnm_8_run.c: New test.
* gcc.target/gcn/cond_fminnm_1.c: New test.
* gcc.target/gcn/cond_fminnm_1_run.c: New test.
* gcc.target/gcn/cond_fminnm_2.c: New test.
* gcc.target/gcn/cond_fminnm_2_run.c: New test.
* gcc.target/gcn/cond_fminnm_3.c: New test.
* gcc.target/gcn/cond_fminnm_3_run.c: New test.
* gcc.target/gcn/cond_fminnm_4.c: New test.
* gcc.target/gcn/cond_fminnm_4_run.c: New test.
* gcc.target/gcn/cond_fminnm_5.c: New test.
* gcc.target/gcn/cond_fminnm_5_run.c: New test.
* gcc.target/gcn/cond_fminnm_6.c: New test.
* gcc.target/gcn/cond_fminnm_6_run.c: New test.
* gcc.target/gcn/cond_fminnm_7.c: New test.
* gcc.target/gcn/cond_fminnm_7_run.c: New test.
* gcc.target/gcn/cond_fminnm_8.c: New test.
* gcc.target/gcn/cond_fminnm_8_run.c: New test.
* gcc.target/gcn/cond_smax_1.c: New test.
* gcc.target/gcn/cond_smax_1_run.c: New test.
* gcc.target/gcn/cond_smin_1.c: New test.
* gcc.target/gcn/cond_smin_1_run.c: New test.
* gcc.target/gcn/cond_umax_1.c: New test.
* gcc.target/gcn/cond_umax_1_run.c: New test.
* gcc.target/gcn/cond_umin_1.c: New test.
* gcc.target/gcn/cond_umin_1_run.c: New test.
* gcc.target/gcn/smax_1.c: New test.
* gcc.target/gcn/smax_1_run.c: New test.
* gcc.target/gcn/smin_1.c: New test.
* gcc.target/gcn/smin_1_run.c: New test.
* gcc.target/gcn/umax_1.c: New test.
* gcc.target/gcn/umax_1_run.c: New test.
* gcc.target/gcn/umin_1.c: New test.
* gcc.target/gcn/umin_1_run.c: New test.
gcc/ada/
PR ada/108858
* sem_ch6.adb (Analyze_Subprogram_Body_Helper): For functions with
separate spec, if their return type was visible through a limited-
with context clause, their extra formals were not added when the
spec was analyzed. Now the full view must be available, and the
extra formals can be created and Returns_By_Ref computed.
The following closes a gap in double reduction detection where we
in the outer loop analysis fail to verify the inner LC PHI use is
the latch definition of the inner loop PHI. That latch definition
is used to detect that an inner loop is part of a double reduction
when later doing the inner loop analysis.
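For readers unfamiliar with the term, a minimal sketch of a double
reduction in source form is shown below. It is not the testcase from the
PR; sum_2d and check_sum_2d are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// `sum` is carried around both loops: the value leaving the inner loop
// is the value the outer loop carries into its next iteration.
template <std::size_t Rows, std::size_t Cols>
int sum_2d(const int (&a)[Rows][Cols]) {
  int sum = 0;
  for (std::size_t i = 0; i < Rows; ++i)
    for (std::size_t j = 0; j < Cols; ++j)
      sum += a[i][j];  // inner-loop update feeding the outer-loop value
  return sum;
}

inline void check_sum_2d() {
  const int a[2][3] = {{1, 2, 3}, {4, 5, 6}};
  assert(sum_2d(a) == 21);
}
```

The check added by the patch concerns the value that leaves the inner
loop: it has to be the one computed by the inner loop's update, as it is
for `sum` here.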
PR tree-optimization/109025
* tree-vect-loop.cc (vect_is_simple_reduction): Verify
the inner LC PHI use is the inner loop PHI latch definition
before classifying an outer PHI as double reduction.
* gcc.dg/vect/pr109025.c: New testcase.
Stack protector will affect stack layout and break the expectation of
these tests, causing test failures if GCC is configured with
--enable-default-ssp.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/prolog-opt.c (dg-options): Add
-fno-stack-protector.
* gcc.target/loongarch/stack-check-cfa-1.c (dg-options):
Likewise.
* gcc.target/loongarch/stack-check-cfa-2.c (dg-options):
Likewise.
In the toolchain convention, we describe -mfpu= as:
"Selects the allowed set of basic floating-point instructions and
registers. This option should not change the FP calling convention
unless it's necessary."
Though not explicitly stated, the rationale of this rule is to allow
combinations like "-mabi=lp64s -mfpu=64". This will be useful for
running applications with LP64S/F ABI on a double-float-capable
LoongArch hardware and using a math library with LP64S/F ABI but native
double float HW instructions, for a better performance.
And now a case in the Linux kernel has again proven the usefulness of this
kind of combination. The AMDGPU DCN kernel driver needs to perform some
floating-point operations, but the entire kernel uses the LP64S ABI. So the
translation units of the AMDGPU DCN driver need to be compiled with
-mfpu=64 (the kernel lacks soft-FP routines in libgcc), but -mabi=lp64s
(or you can't link it with the other part of the kernel).
Unfortunately, currently GCC uses TARGET_{HARD,SOFT,DOUBLE}_FLOAT to
determine the floating-point calling convention. This causes "-mfpu=64"
to silently allow using $fa* to pass parameters and return values EVEN IF
-mabi=lp64s is used. To make things worse, the generated object file
has SOFT-FLOAT set in the eflags field so the linker will happily link
it with other LP64S ABI object files, but obviously this will lead to
bad results at runtime. And for now all loongarch64 CPU models (-march
settings) imply -mfpu=64 by default, so the issue makes a single
"-mabi=lp64s" option basically broken (fortunately most projects, e.g.
the Linux kernel, have used -msoft-float which implies both -mabi=lp64s
and -mfpu=none as we've recommended in the toolchain convention doc).
The fix is simple: use TARGET_*_FLOAT_ABI instead.
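The difference can be modeled with a small sketch. This is a simplified
illustration, not the code in loongarch.h: the enums, the Target struct
and both helper functions are hypothetical.

```cpp
#include <cassert>

enum class FloatAbi { Soft, Single, Double };  // -mabi=lp64s, lp64f, lp64d
enum class Fpu { None, Fpu32, Fpu64 };         // -mfpu=none, 32, 64

struct Target {
  FloatAbi abi;
  Fpu fpu;
};

// Before: the available FP instructions decide whether a double is
// passed in an FPR, so -mabi=lp64s -mfpu=64 wrongly uses $fa*.
inline bool double_in_fpr_before(const Target &t) {
  return t.fpu == Fpu::Fpu64;
}

// After: only the ABI decides; the FPU setting merely controls which
// instructions may be used inside the function body.
inline bool double_in_fpr_after(const Target &t) {
  return t.abi == FloatAbi::Double;
}

inline void check_kernel_case() {
  const Target kernel{FloatAbi::Soft, Fpu::Fpu64};
  assert(double_in_fpr_before(kernel));
  assert(!double_in_fpr_after(kernel));
}
```

With the ABI as the only input, the kernel-style combination of a
soft-float ABI and a 64-bit FPU keeps passing values in GPRs, matching
the SOFT-FLOAT flag recorded in the object file.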
I consider this a bug fix: the behavior difference from the toolchain
convention doc is a bug, and generating object files with SOFT-FLOAT
flag but parameters/return values passed through FPRs is definitely a
bug.
Bootstrapped and regtested on loongarch64-linux-gnu. Ok for trunk and
release/gcc-12 branch?
gcc/ChangeLog:
PR target/109000
* config/loongarch/loongarch.h (FP_RETURN): Use
TARGET_*_FLOAT_ABI instead of TARGET_*_FLOAT.
(UNITS_PER_FP_ARG): Likewise.
gcc/testsuite/ChangeLog:
PR target/109000
* gcc.target/loongarch/flt-abi-isa-1.c: New test.
* gcc.target/loongarch/flt-abi-isa-2.c: New test.
* gcc.target/loongarch/flt-abi-isa-3.c: New test.
* gcc.target/loongarch/flt-abi-isa-4.c: New test.
gcc/fortran/ChangeLog:
PR fortran/106856
* class.cc (gfc_build_class_symbol): Handle update of attributes of
existing class container.
(gfc_find_derived_vtab): Fix several memory leaks.
(find_intrinsic_vtab): Ditto.
* decl.cc (attr_decl1): Manage update of symbol attributes from
CLASS attributes.
* primary.cc (gfc_variable_attr): OPTIONAL shall not be taken or
updated from the class container.
* symbol.cc (free_old_symbol): Adjust management of symbol versions
to not prematurely free array specs while working on the declaration
of CLASS variables.
gcc/testsuite/ChangeLog:
PR fortran/106856
* gfortran.dg/interface_41.f90: Remove dg-pattern from valid testcase.
* gfortran.dg/class_74.f90: New test.
* gfortran.dg/class_75.f90: New test.
Co-authored-by: Tobias Burnus <tobias@codesourcery.com>
On aarch64, powerpc64le and s390x-linux I'm seeing another syntax error
which didn't show up on x86_64-linux nor i686-linux:
ERROR: gcc.dg/vect/slp-perm-8.c -flto -ffat-lto-objects: error executing dg-final: syntax error in target selector "target ! vect_load_lanes && vect_partial_vectors_usage_1 && ! s390_vx"
ERROR: gcc.dg/vect/slp-perm-8.c: error executing dg-final: syntax error in target selector "target ! vect_load_lanes && vect_partial_vectors_usage_1 && ! s390_vx"
The following patch fixes that.
2023-03-05 Jakub Jelinek <jakub@redhat.com>
* gcc.dg/vect/slp-perm-8.c: Fix up syntax error in
scan-tree-dump-times target selector.
This patch supports the Zbkb, Zbkc and Zbkx extensions.
It includes the instructions' machine descriptions and built-in functions.
It is worth mentioning that this patch only adds the instructions that are
in Zbkb but not in Zbb.
Instructions that are in both Zbb and Zbkb will be generated by the code
generator instead of built-in functions.
gcc/ChangeLog:
* config/riscv/bitmanip.md: Add ZBKB's instructions.
* config/riscv/riscv-builtins.cc (AVAIL): Add new.
* config/riscv/riscv.md: Add new type for crypto instructions.
* config/riscv/crypto.md: Add Scalar Cryptography extension's machine
description file.
* config/riscv/riscv-scalar-crypto.def: Add Scalar Cryptography
extension's built-in function file.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zbkb32.c: New test.
* gcc.target/riscv/zbkb64.c: New test.
* gcc.target/riscv/zbkc32.c: New test.
* gcc.target/riscv/zbkc64.c: New test.
* gcc.target/riscv/zbkx32.c: New test.
* gcc.target/riscv/zbkx64.c: New test.
Co-Authored-By: SiYu Wu <siyu@isrc.iscas.ac.cn>
This showed up as dynamic icount regression in SPEC 531.deepsjeng with upstream
gcc (vs. gcc 12.2). gcc was resorting to synthetic multiply using shift+add(s)
even when multiply had clear cost benefit.
|00000000000133b8 <see(state_t*, int, int, int, int) [clone .constprop.0]+0x382>:
| 133b8: srl a3,a1,s6
| 133bc: and a3,a3,s5
| 133c0: slli a4,a3,0x9
| 133c4: add a4,a4,a3
| 133c6: slli a4,a4,0x9
| 133c8: add a4,a4,a3
| 133ca: slli a3,a4,0x1b
| 133ce: add a4,a4,a3
vs. gcc 12 doing something like below.
|00000000000131c4 <see(state_t*, int, int, int, int) [clone .constprop.0]+0x35c>:
| 131c4: ld s1,8(sp)
| 131c6: srl a3,a1,s4
| 131ca: and a3,a3,s11
| 131ce: mul a3,a3,s1
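To see that the shift+add sequence really is a multiply by a constant,
the last six instructions of the first listing can be transcribed as
below. The names synth_mul and kMultiplier are hypothetical, and the
constant is derived only from the shift amounts in the listing.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the slli/add sequence from the first listing, one statement
// per instruction.
inline uint64_t synth_mul(uint64_t x) {
  uint64_t a4 = x << 9;    // slli a4,a3,0x9
  a4 += x;                 // add  a4,a4,a3
  a4 <<= 9;                // slli a4,a4,0x9
  a4 += x;                 // add  a4,a4,a3
  uint64_t a3 = a4 << 27;  // slli a3,a4,0x1b
  return a4 + a3;          // add  a4,a4,a3
}

// The constant implied by the shifts: (2^18 + 2^9 + 1) * (2^27 + 1).
constexpr uint64_t kMultiplier =
    ((uint64_t{1} << 18) + (uint64_t{1} << 9) + 1) *
    ((uint64_t{1} << 27) + 1);

inline void check_synth_mul() {
  assert(synth_mul(1) == kMultiplier);
  assert(synth_mul(12345) == 12345 * kMultiplier);
}
```

Six dependent instructions stand in for the single mul in the second
listing.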
Bisected this to f90cb39235 ("RISC-V: costs: support shift-and-add in
strength-reduction"). The intent was to optimize cost for
shift-add-pow2-{1,2,3} corresponding to bitmanip insns SH*ADD, but ended
up doing that for all shift values which seems to favor synthesizing
multiply among others.
The bug itself is trivial, IN_RANGE() calling pow2p_hwi() which returns bool
vs. exact_log2() returning the base-2 logarithm (the shift amount).
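A small sketch of why the bool-returning helper defeats the range check.
The functions below are simplified stand-ins with assumed semantics, not
GCC's pow2p_hwi(), exact_log2() or IN_RANGE(); all names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a power-of-two predicate.
inline bool is_pow2(uint64_t x) { return x != 0 && (x & (x - 1)) == 0; }

// Stand-in for an exact log2: the exponent for powers of two, else -1.
inline int log2_exact(uint64_t x) {
  if (!is_pow2(x))
    return -1;
  int n = 0;
  while (x >>= 1)
    ++n;
  return n;
}

inline bool in_range(int v, int lo, int hi) { return v >= lo && v <= hi; }

// Buggy check: the bool converts to 0 or 1, so every power of two
// lands inside [1, 3].
inline bool shadd_ok_buggy(uint64_t mult) {
  return in_range(is_pow2(mult), 1, 3);
}

// Fixed check: only multipliers 2, 4 and 8 (shifts 1 to 3) qualify.
inline bool shadd_ok_fixed(uint64_t mult) {
  return in_range(log2_exact(mult), 1, 3);
}

inline void check_shadd() {
  assert(shadd_ok_buggy(512));   // wrongly costed as a cheap SH*ADD
  assert(!shadd_ok_fixed(512));
  assert(shadd_ok_fixed(8));
}
```

With the fixed check, a multiplier such as 512 is no longer costed like
an SH*ADD operand.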
This fix also requires an update to the test introduced by the same commit,
which now generates MUL vs. synthesizing it.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_rtx_costs): Fixed IN_RANGE() to
use exact_log2().
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zba-shNadd-07.c: f2(i*783) now generates MUL vs.
5 insn sh1add+slli+add+slli+sub.
* gcc.target/riscv/pr108987.c: New test.
Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
Reviewed-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/scalar_move-1.c: New test.
* gcc.target/riscv/rvv/base/scalar_move-2.c: New test.
* gcc.target/riscv/rvv/base/scalar_move-3.c: New test.
* gcc.target/riscv/rvv/base/scalar_move-4.c: New test.
* gcc.target/riscv/rvv/base/scalar_move-5.c: New test.
* gcc.target/riscv/rvv/base/scalar_move-6.c: New test.
* gcc.target/riscv/rvv/base/scalar_move-7.c: New test.
* gcc.target/riscv/rvv/base/scalar_move-8.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-100.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-101.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-78.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-79.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-80.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-81.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-82.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-83.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-84.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-85.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-86.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-87.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-88.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-89.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-90.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-91.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-92.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-93.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-94.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-95.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-96.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-97.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-98.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-99.c: New test.