As PR108727 shows, when cleanup code called by the stack
unwinder calls function _Unwind_Resume, it goes via plt
stub like:
function 00000000.plt_call._Unwind_Resume:
=> 0x0000000010003580 <+0>: std r2,40(r1)
0x0000000010003584 <+4>: ld r12,-31760(r2)
0x0000000010003588 <+8>: mtctr r12
0x000000001000358c <+12>: ld r2,-31752(r2)
0x0000000010003590 <+16>: cmpldi r2,0
0x0000000010003594 <+20>: bnectr+
0x0000000010003598 <+24>: b 0x100031a4
<_Unwind_Resume@plt>
It wants to save TOC base (r2) to r1 + 40, but we only
bump the stack segment by 32 bytes as follows:
stdu %r29,-32(%r3)
It means the access is out of the stack segment allocated
by __generic_morestack, once the touch area isn't writable
like this failure shows, it would cause segment fault.
So fix the bump size with one reasonable value PARAMS.
PR libgcc/108727
libgcc/ChangeLog:
* config/rs6000/morestack.S (__morestack): Use PARAMS for new stack
bump size.
According to Haochen's finding in [1], currently ppc-fortran.exp
doesn't support Fortran specific warning or error messages well.
By looking into it, it's due to that gfortran uses some different
warning/error prefixes as follows:
set gcc_warning_prefix "\[Ww\]arning:"
set gcc_error_prefix "(Fatal )?\[Ee\]rror:"
comparing to:
set gcc_warning_prefix "warning:"
set gcc_error_prefix "(fatal )?error:"
So this is to override these two prefixes and make it support
dg-{warning,error} checks.
[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613302.html
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/ppc-fortran/ppc-fortran.exp: Override
gcc_{warning,error}_prefix with Fortran specific one used in
gfortran_init.
Test cases scalar-test-data-class-1[45].c adopts type __int128
which requires to check int128 effective target, otherwise the
testing on them will fail at -m32. This patch is to add int128
effective target requirement.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/bfp/scalar-test-data-class-14.c: Adjust with
int128 effective target requirement.
* gcc.target/powerpc/bfp/scalar-test-data-class-15.c: Likewise.
Two test cases scalar-test-data-class-12.c and vec-test-data-class-9.c
fail on Power9 BE testing at -m32, they adopts a built-in function
scalar_insert_exp which requires powerpc64 support. This patch
is to make them to check has_arch_ppc64 effective target requirement.
PR testsuite/108729
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/bfp/scalar-test-data-class-12.c: Adjust with
has_arch_ppc64 effective target.
* gcc.target/powerpc/bfp/vec-test-data-class-9.c: Likewise.
The built-in function scalar_test_neg_qp is under stanza
ieee128-hw, that is TARGET_FLOAT128_HW. Since we don't
have float128 hardware support on 32-bit as follows:
if (TARGET_FLOAT128_HW && !TARGET_64BIT)
{
if ((rs6000_isa_flags_explicit & OPTION_MASK_FLOAT128_HW) != 0)
error ("%qs requires %qs", "%<-mfloat128-hardware%>", "-m64");
rs6000_isa_flags &= ~OPTION_MASK_FLOAT128_HW;
}
So adjust the case with lp64 effective target accordingly.
PR testsuite/108730
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/bfp/scalar-test-neg-8.c: Adjust with lp64
effective target requirement.
Compiled with cpu type Power9 or later, GCC generates
xxspltib rather than vspltis*, so adjust the test
case scanning content accordingly.
PR testsuite/108813
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr101384-2.c: Adjust with xxspltib.
On BE, the extracted index for the leftmost element is 0
rather than 1, adjust the test case accordingly.
PR testsuite/108810
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/fold-vec-extract-double.p9.c (testd_cst): Adjust
the extracted index for BE.
The mips msa-ds.c test is trying to ensure that MSA branches can have their
delay slots filled. The regexp it used looked for the function name, a nop,
then the function name again. If found that sequence, then the test failed.
The problem is with Vlad's recent IRA work there's simply less code in the
test (good) and as a result one of the *other* branches in the test had an
unfilled delay slot -- the delay slot for the MSA branch was still being
filled.
This patch tightens up the regexp. In particular it looks for the MSA branch
and a nop on the next line (avoiding the over-eager .* construct). That
indicates that the MSA branch did not have its delay slot filled. When that
sequence is found, then the test fails.
This fixes the recent regressions for mips64 and mips64el in the tester.
Installing on the trunk,
gcc/testsuite:
* gcc.target/mips/msa-ds.c: Fix over eager pattern matching.
The recently added tests missed checking for "fopenmp" (see
other tests where "-fopenmp" is passed), which makes them
fail on non-openmp systems.
* gcc.dg/analyzer/omp-parallel-for-get-min.c,
gcc.dg/analyzer/omp-parallel-for-1.c: Require effective target fopenmp.
A missed piece of the patch for static operator(): in tsubst_function_decl,
we don't want to replace the first parameter with a new closure pointer if
operator() is static.
PR c++/108526
PR c++/106651
gcc/cp/ChangeLog:
* pt.cc (tsubst_function_decl): Don't replace the closure
parameter if DECL_STATIC_FUNCTION_P.
gcc/testsuite/ChangeLog:
* g++.dg/cpp23/static-operator-call5.C: Pass -g.
This updates baseline_symbols.txt for the Fedora 39 arches.
Most of the added symbols are added to all 5 files, exceptions are
DF16_ rtti stuff (only added on x86 and aarch64 which supports those),
DF16b rtti stuff (only x86 right now), _M_replace_cold (m vs. j
differences), DF128_ charconv (only x86), GLIBCXX_LDBL_3.4.31
symver (s390x), _M_get_sys_info/_M_get_local_info (l vs. x).
I was using
grep ^+ | sed 's/OBJECT:[0-9]*:/OBJECT:/' | sort | uniq -c | sort -n | less
on the patch to analyze.
powerpc64le-linux not included because I'll need to regenerate it.
2023-03-07 Jakub Jelinek <jakub@redhat.com>
* config/abi/post/x86_64-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/x86_64-linux-gnu/32/baseline_symbols.txt: Update.
* config/abi/post/i486-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/aarch64-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/s390x-linux-gnu/baseline_symbols.txt: Update.
libstdc++-v3/ChangeLog:
PR libstdc++/108882
* config/abi/pre/gnu.ver (GLIBCXX_3.4.31): Adjust patterns to
not match symbols in namespace std::__gnu_cxx11_ieee128.
* config/os/gnu-linux/ldbl-ieee128-extra.ver: Add patterns for
std::__gnu_cxx11_ieee128::money_{get,put}.
Here, -Wdangling-reference triggers where it probably shouldn't, causing
some grief. The code in question uses a reference wrapper with a member
function returning a reference to a subobject of a non-temporary object:
const Plane & meta = fm.planes().inner();
I've tried a few approaches, e.g., checking that the member function's
return type is the same as the type of the enclosing class (which is
the case for member functions returning *this), but that then breaks
Wdangling-reference4.C with std::optional<std::string>.
This patch adjusts do_warn_dangling_reference so that we look through
reference wrapper classes (meaning, has a reference member and a
constructor taking the same reference type, or is std::reference_wrapper
or std::ranges::ref_view) and don't warn for them, supposing that the
member function returns a reference to a non-temporary object.
PR c++/107532
gcc/cp/ChangeLog:
* call.cc (reference_like_class_p): New.
(do_warn_dangling_reference): Add new bool parameter. See through
reference_like_class_p.
gcc/testsuite/ChangeLog:
* g++.dg/warn/Wdangling-reference8.C: New test.
* g++.dg/warn/Wdangling-reference9.C: New test.
This fixes another syntax error in slp-3.c. I missed a '{ ... }' in
order to properly exclude s390_vx.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/slp-3.c: Add '{ ... }'.
In my recent rtti.cc change I assumed when emitting the support tinfos
that the tinfos for the fundamental types haven't been created yet.
Normally (in libsupc++.a (fundamental_type_info.o)) that is the case,
but as can be seen on the testcase, one can violate it by using typeid
etc. in the same TU and do it before ~__fundamental_type_info ()
definition.
The following patch fixes that by popping from unemitted_tinfo_decls
only in the normal case when it is there, and treating non-NULL
DECL_INITIAL on a tinfo node as indication that emit_tinfo_decl has
processed it already.
2023-03-07 Jakub Jelinek <jakub@redhat.com>
PR c++/109042
* rtti.cc (emit_support_tinfo_1): Don't assert that last
unemitted_tinfo_decls element is tinfo, instead pop from it only in
that case.
* decl2.cc (c_parse_final_cleanups): Don't call emit_tinfo_decl
for unemitted_tinfO_decls which have already non-NULL DECL_INITIAL.
* g++.dg/rtti/pr109042.C: New test.
When processing a noexcept, constructors aren't elided: build_over_call
has
/* It's unsafe to elide the constructor when handling
a noexcept-expression, it may evaluate to the wrong
value (c++/53025). */
&& (force_elide || cp_noexcept_operand == 0))
so the assert I added recently needs to be relaxed a little bit.
PR c++/109030
gcc/cp/ChangeLog:
* constexpr.cc (cxx_eval_call_expression): Relax assert.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/noexcept77.C: New test.
Similarly to PR107938, this also started with r11-557, whereby cp_finish_decl
can call check_initializer even in a template for a constexpr initializer.
Here we are rejecting
extern const Q q;
template<int>
constexpr auto p = q(0);
even though q has a constexpr operator(). It's deemed non-const by
decl_maybe_constant_var_p because even though 'q' is const it is not
of integral/enum type.
If fun is not a function pointer, we don't know if we're using it as an
lvalue or rvalue, so with this patch we pass 'any' for want_rval. With
that, p_c_e/VAR_DECL doesn't flat out reject the underlying VAR_DECL.
PR c++/107939
gcc/cp/ChangeLog:
* constexpr.cc (potential_constant_expression_1) <case CALL_EXPR>: Pass
'any' when recursing on a VAR_DECL and not a pointer to function.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1y/var-templ74.C: Remove dg-error.
* g++.dg/cpp1y/var-templ77.C: New test.
Fix the bug of the rvv bool mode precision with the adjustment.
The bits size of vbool*_t will be adjusted to
[1, 2, 4, 8, 16, 32, 64] according to the rvv spec 1.0 isa. The
adjusted mode precison of vbool*_t will help underlying pass to
make the right decision for both the correctness and optimization.
Given below sample code:
void test_1(int8_t * restrict in, int8_t * restrict out)
{
vbool8_t v2 = *(vbool8_t*)in;
vbool16_t v5 = *(vbool16_t*)in;
*(vbool16_t*)(out + 200) = v5;
*(vbool8_t*)(out + 100) = v2;
}
Before the precision adjustment:
addi a4,a1,100
vsetvli a5,zero,e8,m1,ta,ma
addi a1,a1,200
vlm.v v24,0(a0)
vsm.v v24,0(a4)
// Need one vsetvli and vlm.v for correctness here.
vsm.v v24,0(a1)
After the precision adjustment:
csrr t0,vlenb
slli t1,t0,1
csrr a3,vlenb
sub sp,sp,t1
slli a4,a3,1
add a4,a4,sp
sub a3,a4,a3
vsetvli a5,zero,e8,m1,ta,ma
addi a2,a1,200
vlm.v v24,0(a0)
vsm.v v24,0(a3)
addi a1,a1,100
vsetvli a4,zero,e8,mf2,ta,ma
csrr t0,vlenb
vlm.v v25,0(a3)
vsm.v v25,0(a2)
slli t1,t0,1
vsetvli a5,zero,e8,m1,ta,ma
vsm.v v24,0(a1)
add sp,sp,t1
jr ra
However, there may be some optimization opportunates after
the mode precision adjustment. It can be token care of in
the RISC-V backend in the underlying separted PR(s).
gcc/ChangeLog:
PR target/108185
PR target/108654
* config/riscv/riscv-modes.def (ADJUST_PRECISION): Adjust VNx*BI
modes.
* config/riscv/riscv.cc (riscv_v_adjust_precision): New.
* config/riscv/riscv.h (riscv_v_adjust_precision): New.
* genmodes.cc (adj_precision): New.
(ADJUST_PRECISION): New.
(emit_mode_adjustments): Handle ADJUST_PRECISION.
gcc/testsuite/ChangeLog:
PR target/108185
PR target/108654
* gcc.target/riscv/rvv/base/pr108185-1.c: New test.
* gcc.target/riscv/rvv/base/pr108185-2.c: New test.
* gcc.target/riscv/rvv/base/pr108185-3.c: New test.
* gcc.target/riscv/rvv/base/pr108185-4.c: New test.
* gcc.target/riscv/rvv/base/pr108185-5.c: New test.
* gcc.target/riscv/rvv/base/pr108185-6.c: New test.
* gcc.target/riscv/rvv/base/pr108185-7.c: New test.
* gcc.target/riscv/rvv/base/pr108185-8.c: New test.
Signed-off-by: Pan Li <pan2.li@intel.com>
Co-authored-by: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
Stack protector needs a guard value on the stack and change the stack
layout. So we need to disable it for those tests, to avoid test failure
with --enable-default-ssp.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/shrink_wrap_1.c (dg-options): Add
-fno-stack-protector.
* gcc.target/aarch64/stack-check-cfa-1.c (dg-options): Add
-fno-stack-protector.
* gcc.target/aarch64/stack-check-cfa-2.c (dg-options): Add
-fno-stack-protector.
* gcc.target/aarch64/test_frame_17.c (dg-options): Add
-fno-stack-protector.
Storing stack guarding variable need one stp instruction, breaking the
scan-assembler-not pattern in the test. Disable stack protector to
avoid a test failure with --enable-default-ssp.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/pr104005.c (dg-options): Add
-fno-stack-protector.
The test scans for "const_int 0" in the RTL dump, but stack protector
can produce more "const_int 0". To avoid a failure with
--enable-default-ssp, disable stack protector for this.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/auto-init-7.c (dg-options): Add
-fno-stack-protector.
Stack protector influence code generation and cause function body checks
fail.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/pr103147-10.c (dg-options): Add
-fno-stack-protector.
* g++.target/aarch64/pr103147-10.C: Likewise.
If GCC is configured with --enable-default-ssp, the stack protector can
make many sve-pcs tests fail.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/sve/pcs/aarch64-sve-pcs.exp (sve_flags):
Add -fno-stack-protector.
In PIE, symbol "fixed_regs" is addressed via GOT. It will break the
scan-assembler pattern and cause test failure with --enable-default-pie.
gcc/testsuite/ChangeLog:
PR testsuite/70150
* gcc.target/aarch64/fuse_adrp_add_1.c (dg-options): Add
-fno-pie.
These tests set large code model with -mcmodel=large or target pragma for
AArch64. But if GCC is configured with --enable-default-pie, it triggers
"sorry: unimplemented: code model large with -fpic". Disable PIE to make
avoid the issue.
gcc/testsuite/ChangeLog:
PR testsuite/70150
* gcc.dg/tls/pr78796.c (dg-additional-options): Add -fno-pie
-no-pie for aarch64-*-*.
* gcc.target/aarch64/pr63304_1.c (dg-options): Add -fno-pie.
* gcc.target/aarch64/pr70120-2.c (dg-options): Add -fno-pie.
* gcc.target/aarch64/pr78733.c (dg-options): Add -fno-pie.
* gcc.target/aarch64/pr79041-2.c (dg-options): Add -fno-pie.
* gcc.target/aarch64/pr94530.c (dg-options): Add -fno-pie.
* gcc.target/aarch64/pr94577.c (dg-options): Add -fno-pie.
* gcc.target/aarch64/reload-valid-spoff.c (dg-options): Add
-fno-pie.
If GCC is built with --enable-default-pie, a lot of aapcs64 tests fail
because relocation unsupported in PIE is used.
gcc/testsuite/ChangeLog:
PR testsuite/70150
* gcc.target/aarch64/aapcs64/aapcs64.exp (additional_flags):
Add -fno-pie -no-pie.
While gcc.dg/plugin/must-tail-call-2.c passes for all targets even
without this, the error message is, for a target like cris-elf that
doesn't implement sibling calls: "error: cannot tail-call: machine
description does not have a sibcall_epilogue instruction pattern"
rather than "error: cannot tail-call: callee returns a structure".
Also, it'd be confusing to exclude must-tail-call-1.c but not
must-tail-call-2.c
* gcc.dg/plugin/must-tail-call-1.c, gcc.dg/plugin/must-tail-call-2.c:
Gate on effective target tail_call.
The RTL "expand" dump is the first RTL dump, and it also appears to be
the earliest trace of the target having implemented sibcalls.
Including the "," in the pattern searched for, to try and avoid
possible false matches, but there doesn't appear to be any identifiers
or target names nearby so this is just belts and suspenders. Using
"tail_call" as a shorter and more commonly used term than a derivative
of "sibling calls", and expecting only gcc folks to have heard of
"sibcalls".
* lib/target-supports.exp (check_effective_target_tail_call): New.
For 32-bit newlib targets (such as cris-elf and pru-elf),
that int32_t is "long int". See other regexps in the
testsuite matching "aka (long )?int" (with single-quotes
where needed) where the pattern in
allocation-size-multiline-3.c matches plain "int". Uses the
special syntax recently introduced for multi-line patterns.
testsuite:
* gcc.dg/analyzer/allocation-size-multiline-3.c: Handle
int32_t being "long int".
Those multi-line-patterns are literal. Sometimes a regexp
needs to be matched. This is a start: just three elements
are supported: "(" ")" and the compound ")?" (and on second
thought, it can be argued that "(...)" alone is not useful).
Note that Tcl "string map" is documented to have the desired
effect: a once-over but no re-recognitions of previously
replaced mapped elements. Also, drop a doubled "containing".
testsuite:
* lib/multiline.exp (_build_multiline_regex): Map
"{re:" to "(", similarly ")?" from ":re?}" and the
same without question mark.
This patch updates the IEEE 128-bit types used in libgcc.
At the moment, we cannot build GCC when the target uses IEEE 128-bit long
doubles, such as building the compiler for a native Fedora 36 system. The
build dies when it is trying to build the _mulkc3.c and _divkc3 modules.
This patch changes libgcc to use long double for the IEEE 128-bit base type if
long double is IEEE 128-bit, and it uses _Float128 otherwise. The built-in
functions are adjusted to be the correct version based on the IEEE 128-bit base
type used.
While it is desirable to ultimately have __float128 and _Float128 use the same
internal type and mode within GCC, at present if you use the option
-mabi=ieeelongdouble, the __float128 type will use the long double type and not
the _Float128 type. We get an internal compiler error if we combine the
signbitf128 built-in with a long double type.
I've gone through several iterations of trying to fix this within GCC, and
there are various problems that have come up. I developed this alternative
patch that changes libgcc so that it does not tickle the issue. I hope we can
fix the compiler at some point, but right now, this is preventing people on
Fedora 36 systems from building compilers where the default long double is IEEE
128-bit.
2023-03-06 Michael Meissner <meissner@linux.ibm.com>
libgcc/
PR target/107299
* config/rs6000/_divkc3.c (COPYSIGN): Use the correct built-in based on
whether long double is IBM or IEEE.
(INFINITY): Likewise.
(FABS): Likewise.
* config/rs6000/_mulkc3.c (COPYSIGN): Likewise.
(INFINITY): Likewise.
* config/rs6000/quad-float128.h (TF): Remove definition.
(TFtype): Define to be long double or _Float128.
(TCtype): Define to be _Complex long double or _Complex _Float128.
* libgcc2.h (TFtype): Allow machine config files to override this.
(TCtype): Likewise.
* soft-fp/quad.h (TFtype): Likewise.
gcc/ChangeLog:
* config/gcn/gcn-valu.md (<expander><mode>3_exec): Add patterns for
{s|u}{max|min} in QI, HI and DI modes.
(<expander><mode>3): Add pattern for {s|u}{max|min} in DI mode.
(cond_<fexpander><mode>): Add pattern for cond_f{max|min}.
(cond_<expander><mode>): Add pattern for cond_{s|u}{max|min}.
* config/gcn/gcn.cc (gcn_spill_class): Allow the exec register to be
saved in SGPRs.
gcc/testsuite/ChangeLog:
* gcc.target/gcn/cond_fmaxnm_1.c: New test.
* gcc.target/gcn/cond_fmaxnm_1_run.c: New test.
* gcc.target/gcn/cond_fmaxnm_2.c: New test.
* gcc.target/gcn/cond_fmaxnm_2_run.c: New test.
* gcc.target/gcn/cond_fmaxnm_3.c: New test.
* gcc.target/gcn/cond_fmaxnm_3_run.c: New test.
* gcc.target/gcn/cond_fmaxnm_4.c: New test.
* gcc.target/gcn/cond_fmaxnm_4_run.c: New test.
* gcc.target/gcn/cond_fmaxnm_5.c: New test.
* gcc.target/gcn/cond_fmaxnm_5_run.c: New test.
* gcc.target/gcn/cond_fmaxnm_6.c: New test.
* gcc.target/gcn/cond_fmaxnm_6_run.c: New test.
* gcc.target/gcn/cond_fmaxnm_7.c: New test.
* gcc.target/gcn/cond_fmaxnm_7_run.c: New test.
* gcc.target/gcn/cond_fmaxnm_8.c: New test.
* gcc.target/gcn/cond_fmaxnm_8_run.c: New test.
* gcc.target/gcn/cond_fminnm_1.c: New test.
* gcc.target/gcn/cond_fminnm_1_run.c: New test.
* gcc.target/gcn/cond_fminnm_2.c: New test.
* gcc.target/gcn/cond_fminnm_2_run.c: New test.
* gcc.target/gcn/cond_fminnm_3.c: New test.
* gcc.target/gcn/cond_fminnm_3_run.c: New test.
* gcc.target/gcn/cond_fminnm_4.c: New test.
* gcc.target/gcn/cond_fminnm_4_run.c: New test.
* gcc.target/gcn/cond_fminnm_5.c: New test.
* gcc.target/gcn/cond_fminnm_5_run.c: New test.
* gcc.target/gcn/cond_fminnm_6.c: New test.
* gcc.target/gcn/cond_fminnm_6_run.c: New test.
* gcc.target/gcn/cond_fminnm_7.c: New test.
* gcc.target/gcn/cond_fminnm_7_run.c: New test.
* gcc.target/gcn/cond_fminnm_8.c: New test.
* gcc.target/gcn/cond_fminnm_8_run.c: New test.
* gcc.target/gcn/cond_smax_1.c: New test.
* gcc.target/gcn/cond_smax_1_run.c: New test.
* gcc.target/gcn/cond_smin_1.c: New test.
* gcc.target/gcn/cond_smin_1_run.c: New test.
* gcc.target/gcn/cond_umax_1.c: New test.
* gcc.target/gcn/cond_umax_1_run.c: New test.
* gcc.target/gcn/cond_umin_1.c: New test.
* gcc.target/gcn/cond_umin_1_run.c: New test.
* gcc.target/gcn/smax_1.c: New test.
* gcc.target/gcn/smax_1_run.c: New test.
* gcc.target/gcn/smin_1.c: New test.
* gcc.target/gcn/smin_1_run.c: New test.
* gcc.target/gcn/umax_1.c: New test.
* gcc.target/gcn/umax_1_run.c: New test.
* gcc.target/gcn/umin_1.c: New test.
* gcc.target/gcn/umin_1_run.c: New test.
gcc/ada/
PR ada/108858
* sem_ch6.adb (Analyze_Subprogram_Body_Helper): For functions with
separate spec, if their return type was visible through a limited-
with context clause, their extra formals were not added when the
spec was analyzed. Now the full view must be available, and the
extra formals can be created and Returns_By_Ref computed.
The following closes a gap in double reduction detection where we
in the outer loop analysis fail to verify the inner LC PHI use is
the latch definition of the inner loop PHI. That latch definition
is used to detect that an inner loop is part of a double reduction
when later doing the inner loop analysis.
PR tree-optimization/109025
* tree-vect-loop.cc (vect_is_simple_reduction): Verify
the inner LC PHI use is the inner loop PHI latch definition
before classifying an outer PHI as double reduction.
* gcc.dg/vect/pr109025.c: New testcase.
Stack protector will affect stack layout and break the expectation of
these tests, causing test failures if GCC is configured with
--enable-default-ssp.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/prolog-opt.c (dg-options): Add
-fno-stack-protector.
* gcc.target/loongarch/stack-check-cfa-1.c (dg-options):
Likewise.
* gcc.target/loongarch/stack-check-cfa-2.c (dg-options):
Likewise.
In the toolchain convention, we describe -mfpu= as:
"Selects the allowed set of basic floating-point instructions and
registers. This option should not change the FP calling convention
unless it's necessary."
Though not explicitly stated, the rationale of this rule is to allow
combinations like "-mabi=lp64s -mfpu=64". This will be useful for
running applications with LP64S/F ABI on a double-float-capable
LoongArch hardware and using a math library with LP64S/F ABI but native
double float HW instructions, for a better performance.
And now a case in Linux kernel has again proven the usefulness of this
kind of combination. The AMDGPU DCN kernel driver needs to perform some
floating-point operation, but the entire kernel uses LP64S ABI. So the
translation units of the AMDGPU DCN driver need to be compiled with
-mfpu=64 (the kernel lacks soft-FP routines in libgcc), but -mabi=lp64s
(or you can't link it with the other part of the kernel).
Unfortunately, currently GCC uses TARGET_{HARD,SOFT,DOUBLE}_FLOAT to
determine the floating calling convention. This causes "-mfpu=64"
silently allow using $fa* to pass parameters and return values EVEN IF
-mabi=lp64s is used. To make things worse, the generated object file
has SOFT-FLOAT set in the eflags field so the linker will happily link
it with other LP64S ABI object files, but obviously this will lead to
bad results at runtime. And for now all loongarch64 CPU models (-march
settings) implies -mfpu=64 on by default, so the issue makes a single
"-mabi=lp64s" option basically broken (fortunately most projects for eg
the Linux kernel have used -msoft-float which implies both -mabi=lp64s
and -mfpu=none as we've recommended in the toolchain convention doc).
The fix is simple: use TARGET_*_FLOAT_ABI instead.
I consider this a bug fix: the behavior difference from the toolchain
convention doc is a bug, and generating object files with SOFT-FLOAT
flag but parameters/return values passed through FPRs is definitely a
bug.
Bootstrapped and regtested on loongarch64-linux-gnu. Ok for trunk and
release/gcc-12 branch?
gcc/ChangeLog:
PR target/109000
* config/loongarch/loongarch.h (FP_RETURN): Use
TARGET_*_FLOAT_ABI instead of TARGET_*_FLOAT.
(UNITS_PER_FP_ARG): Likewise.
gcc/testsuite/ChangeLog:
PR target/109000
* gcc.target/loongarch/flt-abi-isa-1.c: New test.
* gcc.target/loongarch/flt-abi-isa-2.c: New test.
* gcc.target/loongarch/flt-abi-isa-3.c: New test.
* gcc.target/loongarch/flt-abi-isa-4.c: New test.
gcc/fortran/ChangeLog:
PR fortran/106856
* class.cc (gfc_build_class_symbol): Handle update of attributes of
existing class container.
(gfc_find_derived_vtab): Fix several memory leaks.
(find_intrinsic_vtab): Ditto.
* decl.cc (attr_decl1): Manage update of symbol attributes from
CLASS attributes.
* primary.cc (gfc_variable_attr): OPTIONAL shall not be taken or
updated from the class container.
* symbol.cc (free_old_symbol): Adjust management of symbol versions
to not prematurely free array specs while working on the declation
of CLASS variables.
gcc/testsuite/ChangeLog:
PR fortran/106856
* gfortran.dg/interface_41.f90: Remove dg-pattern from valid testcase.
* gfortran.dg/class_74.f90: New test.
* gfortran.dg/class_75.f90: New test.
Co-authored-by: Tobias Burnus <tobias@codesourcery.com>