210458 commits

Author SHA1 Message Date
Kyrylo Tkachov
1a97c8ed42 aarch64: PR target/115457 Implement missing __ARM_FEATURE_BF16 macro
The ACLE asks the user to test for __ARM_FEATURE_BF16 before using the
<arm_bf16.h> header but GCC doesn't set this up.
LLVM does, so this is an inconsistency between the compilers.

This patch enables that macro for TARGET_BF16_FP.
Bootstrapped and tested on aarch64-none-linux-gnu.
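
As an illustration of the usage the ACLE describes (not part of the patch), user code can guard the header on the macro. HAVE_ACLE_BF16 and acle_bf16_available are hypothetical names introduced only for this sketch.

```cpp
// Feature-test sketch: only include <arm_bf16.h> when the compiler
// advertises BF16 support through the ACLE macro.
#if defined(__ARM_FEATURE_BF16)
#include <arm_bf16.h>
#define HAVE_ACLE_BF16 1  // hypothetical project-local macro
#else
#define HAVE_ACLE_BF16 0
#endif

// True when the BF16 header was included above.
constexpr bool acle_bf16_available() { return HAVE_ACLE_BF16 != 0; }
```

On targets without BF16 support the helper simply reports false, so callers can fall back to another code path.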

gcc/

	PR target/115457
	* config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins):
	Define __ARM_FEATURE_BF16 for TARGET_BF16_FP.

gcc/testsuite/

	PR target/115457
	* gcc.target/aarch64/acle/bf16_feature.c: New test.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>
(cherry picked from commit c10942134fa759843ac1ed1424b86fcb8e6368ba)
2024-07-04 16:56:03 +05:30
Tamar Christina
1742b699c3 c++ frontend: check for missing condition for novector [PR115623]
It looks like I forgot to check in the C++ frontend whether a condition exists
for the loop being adorned with novector.  This causes a segfault because cond
isn't expected to be null.

This fixes it by ignoring the pragma when there's no loop condition,
the same way we do in the C frontend.

gcc/cp/ChangeLog:

	PR c++/115623
	* semantics.cc (finish_for_cond): Add check for C++ cond.

gcc/testsuite/ChangeLog:

	PR c++/115623
	* g++.dg/vect/vect-novector-pragma_2.cc: New test.

(cherry picked from commit 84acbfbecbdbc3fb2a395bd97e338b2b26fad374)
2024-07-04 11:03:52 +01:00
GCC Administrator
0f71e52717 Daily bump. 2024-07-04 00:24:34 +00:00
John David Anglin
6e1fb1f9db Revert "Delete MALLOC_ABI_ALIGNMENT define from pa32-linux.h"
This reverts commit 0ee3266b3d.
2024-07-03 14:36:26 -04:00
John David Anglin
acde9f81da hppa: Fix ICE caused by mismatched predicate and constraint in xmpyu patterns
2024-06-30  John David Anglin  <danglin@gcc.gnu.org>

gcc/ChangeLog:

	PR target/115691
	* config/pa/pa.md: Remove incorrect xmpyu patterns.
2024-07-03 14:30:39 -04:00
Lewis Hyatt
3389a23fd4 preprocessor: Create the parser before handling command-line includes [PR115312]
Since r14-2893, we create a parser object in preprocess-only mode for the
purpose of parsing #pragma while preprocessing. The parser object was
formerly created after calling c_finish_options(), which leads to problems
on platforms that don't use stdc-predef.h (such as MinGW, as reported in
the PR). On such platforms, the call to c_finish_options() will process
the first command-line-specified include file. If that includes a PCH, then
c-ppoutput.cc will encounter a state it did not anticipate. Fix it by
creating the parser prior to calling c_finish_options().

gcc/c-family/ChangeLog:

	PR pch/115312
	* c-opts.cc (c_common_init): Call c_init_preprocess() before
	c_finish_options() so that a parser is available to process any
	includes specified on the command line.

gcc/testsuite/ChangeLog:

	PR pch/115312
	* g++.dg/pch/pr115312.C: New test.
	* g++.dg/pch/pr115312.Hs: New test.
2024-07-03 07:12:04 -04:00
Georg-Johann Lay
55744507ab AVR: target/98762 - Handle partial clobber in movqi output.
PR target/98762
gcc/
	* config/avr/avr.cc (avr_out_movqi_r_mr_reg_disp_tiny): Properly
	restore the base register when it is partially clobbered.
gcc/testsuite/
	* gcc.target/avr/torture/pr98762.c: New test.

(cherry picked from commit e9fb6efa1cf542353fd44ddcbb5136344c463fd0)
2024-07-03 10:35:37 +02:00
Kewen Lin
052f78d010 rs6000: Fix wrong RTL patterns for vector merge high/low short on LE
Commit r12-4496 changes some define_expands and define_insns
for vector merge high/low short, which are altivec_vmrg[hl]h.
These defines are mainly for built-in function vec_merge{h,l}
and some internal gen function needs.  These functions should
consider endianness.  Taking vec_mergeh as an example, PVIPR
defines vec_mergeh as "Merges the first halves (in element order)
of two vectors"; note that this is in element order.  So it is
mapped to vmrghh on BE and to vmrglh on LE.
Although the mapped insns differ, as discussed in
PR106069, the RTL pattern should still be the same.  That held
before commit r12-4496, but since that commit the patterns
differ between BE and LE.
Similar to the 32-bit element case in the commit log of r15-1504, this
16-bit element pattern on LE doesn't actually match what the
underlying insn is intended to represent, so once an optimization
such as combine makes changes based on it, the result can be
wrong.  The newly constructed test case
pr106069-2.c is a typical example for this issue on element type
short.
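
As a rough model of "merges the first halves (in element order)" (a sketch, not code from the patch), the selector below lists which elements of the 2N-element concatenation of two N-element vectors end up in the result. Selector and merge_first_halves_selector are hypothetical names.

```cpp
#include <cstddef>

// Indices into the concatenation of two N-element vectors.
template <std::size_t N>
struct Selector {
  std::size_t idx[N];
};

// Selector for "merge the first halves, in element order":
// a0, b0, a1, b1, ... where b's elements start at index N.
template <std::size_t N>
constexpr Selector<N> merge_first_halves_selector() {
  static_assert(N % 2 == 0, "element count must be even");
  Selector<N> sel{};
  for (std::size_t i = 0; i < N / 2; ++i) {
    sel.idx[2 * i] = i;          // element i of the first vector
    sel.idx[2 * i + 1] = N + i;  // element i of the second vector
  }
  return sel;
}
```

Under this model, eight 16-bit elements give the index list 0, 8, 1, 9, 2, 10, 3, 11, independent of endianness.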

So this patch fixes the wrong RTL patterns and ensures the
associated RTL patterns are the same as before, so that they
have the same semantics as their mapped insns.  With this
patch, expanders like altivec_vmrghh expand
into altivec_vmrghh_direct_be or altivec_vmrglh_direct_le
depending on endianness.  "direct" shows which
insn will be generated, while _be and _le distinguish the
different RTL patterns for each endianness.

Co-authored-by: Xionghu Luo <xionghuluo@tencent.com>

	PR target/106069
	PR target/115355

gcc/ChangeLog:

	* config/rs6000/altivec.md (altivec_vmrghh_direct): Rename to ...
	(altivec_vmrghh_direct_be): ... this.  Add condition BYTES_BIG_ENDIAN.
	(altivec_vmrghh_direct_le): New define_insn.
	(altivec_vmrglh_direct): Rename to ...
	(altivec_vmrglh_direct_be): ... this.  Add condition BYTES_BIG_ENDIAN.
	(altivec_vmrglh_direct_le): New define_insn.
	(altivec_vmrghh): Adjust by calling gen_altivec_vmrghh_direct_be
	for BE and gen_altivec_vmrglh_direct_le for LE.
	(altivec_vmrglh): Adjust by calling gen_altivec_vmrglh_direct_be
	for BE and gen_altivec_vmrghh_direct_le for LE.
	(vec_widen_umult_hi_v16qi): Adjust the call to
	gen_altivec_vmrghh_direct by gen_altivec_vmrghh for BE
	and by gen_altivec_vmrglh for LE.
	(vec_widen_smult_hi_v16qi): Likewise.
	(vec_widen_umult_lo_v16qi): Adjust the call to
	gen_altivec_vmrglh_direct by gen_altivec_vmrglh for BE
	and by gen_altivec_vmrghh for LE.
	(vec_widen_smult_lo_v16qi): Likewise.
	* config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Replace
	CODE_FOR_altivec_vmrghh_direct by
	CODE_FOR_altivec_vmrghh_direct_be for BE and
	CODE_FOR_altivec_vmrghh_direct_le for LE.  And replace
	CODE_FOR_altivec_vmrglh_direct by
	CODE_FOR_altivec_vmrglh_direct_be for BE and
	CODE_FOR_altivec_vmrglh_direct_le for LE.

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/pr106069-2.c: New test.

(cherry picked from commit 812c70bf4981958488331d4ea5af8709b5321da1)
2024-07-02 20:58:15 -05:00
Kewen Lin
0e495e8e3f rs6000: Fix wrong RTL patterns for vector merge high/low char on LE
Commit r12-4496 changes some define_expands and define_insns
for vector merge high/low char, which are altivec_vmrg[hl]b.
These defines are mainly for built-in function vec_merge{h,l}
and some internal gen function needs.  These functions should
consider endianness.  Taking vec_mergeh as an example, PVIPR
defines vec_mergeh as "Merges the first halves (in element order)
of two vectors"; note that this is in element order.  So it is
mapped to vmrghb on BE and to vmrglb on LE.
Although the mapped insns differ, as discussed in
PR106069, the RTL pattern should still be the same.  That held
before commit r12-4496, but since that commit the patterns
differ between BE and LE.
Similar to the 32-bit element case in the commit log of r15-1504, this
8-bit element pattern on LE doesn't actually match what the
underlying insn is intended to represent, so once an optimization
such as combine makes changes based on it, the result can be
wrong.  The newly constructed test case
pr106069-1.c is a typical example for this issue.

So this patch fixes the wrong RTL patterns and ensures the
associated RTL patterns are the same as before, so that they
have the same semantics as their mapped insns.  With this
patch, expanders like altivec_vmrghb expand
into altivec_vmrghb_direct_be or altivec_vmrglb_direct_le
depending on endianness.  "direct" shows which
insn will be generated, while _be and _le distinguish the
different RTL patterns for each endianness.

Co-authored-by: Xionghu Luo <xionghuluo@tencent.com>

	PR target/106069
	PR target/115355

gcc/ChangeLog:

	* config/rs6000/altivec.md (altivec_vmrghb_direct): Rename to ...
	(altivec_vmrghb_direct_be): ... this.  Add condition BYTES_BIG_ENDIAN.
	(altivec_vmrghb_direct_le): New define_insn.
	(altivec_vmrglb_direct): Rename to ...
	(altivec_vmrglb_direct_be): ... this.  Add condition BYTES_BIG_ENDIAN.
	(altivec_vmrglb_direct_le): New define_insn.
	(altivec_vmrghb): Adjust by calling gen_altivec_vmrghb_direct_be
	for BE and gen_altivec_vmrglb_direct_le for LE.
	(altivec_vmrglb): Adjust by calling gen_altivec_vmrglb_direct_be
	for BE and gen_altivec_vmrghb_direct_le for LE.
	* config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Replace
	CODE_FOR_altivec_vmrghb_direct by
	CODE_FOR_altivec_vmrghb_direct_be for BE and
	CODE_FOR_altivec_vmrghb_direct_le for LE.  And replace
	CODE_FOR_altivec_vmrglb_direct by
	CODE_FOR_altivec_vmrglb_direct_be for BE and
	CODE_FOR_altivec_vmrglb_direct_le for LE.

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/pr106069-1.c: New test.

(cherry picked from commit 62520e4e9f7e2fe8a16ee57a4bd35da2e921ae22)
2024-07-02 20:57:47 -05:00
GCC Administrator
88bfbab8fb Daily bump. 2024-07-03 00:25:28 +00:00
Alex Coplan
8eb469546f aarch64: Fix typo in aarch64-ldp-fusion.cc:combine_reg_notes [PR114936]
This fixes a typo in combine_reg_notes in the load/store pair fusion
pass.  As it stands, the calls to filter_notes store any
REG_FRAME_RELATED_EXPR to fr_expr with the following association:

 - i2 -> fr_expr[0]
 - i1 -> fr_expr[1]

but then the checks inside the following if statement expect the
opposite (more natural) association, i.e.:

 - i2 -> fr_expr[1]
 - i1 -> fr_expr[0]

this patch fixes the oversight by swapping the fr_expr indices in the
calls to filter_notes.

In hindsight it would probably have been less confusing / error-prone to
have combine_reg_notes take an array of two insns, then we wouldn't have
to mix 1-based and 0-based indexing as well as remembering to call
filter_notes in reverse program order.  This however is a minimal fix
for backporting purposes.
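
The alternative mentioned above, taking an array of two insns so that slot k always belongs to insns[k], might look roughly like the sketch below. Insn, FrameNotes and collect_frame_notes are hypothetical stand-ins, not the types used in aarch64-ldp-fusion.cc, and a note is modelled as a plain integer.

```cpp
// Hypothetical stand-in for an insn carrying an optional frame-related note.
struct Insn {
  int uid;
  int frame_note;  // 0 means "no REG_FRAME_RELATED_EXPR"
};

struct FrameNotes {
  int fr_expr[2];
};

// With both insns in one array there is a single, 0-based indexing scheme:
// the note of insns[k] always lands in fr_expr[k].
constexpr FrameNotes collect_frame_notes(const Insn (&insns)[2]) {
  return FrameNotes{{insns[0].frame_note, insns[1].frame_note}};
}
```

In this form there is no separate i1/i2 naming to keep in sync with the array indices.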

gcc/ChangeLog:

	PR target/114936
	* config/aarch64/aarch64-ldp-fusion.cc (combine_reg_notes):
	Ensure insn iN has its REG_FRAME_RELATED_EXPR (if any) stored in
	FR_EXPR[N-1], thus matching the correspondence expected by the
	copy_rtx calls.

(cherry picked from commit 73c8e24b692e691c665d0f1f5424432837bd8c06)
2024-07-02 19:38:54 +01:00
GCC Administrator
5db1392e8e Daily bump. 2024-07-02 00:23:08 +00:00
Georg-Johann Lay
7249b3cdc1 AVR: target/88236, target/115726 - Fix __memx code generation.
	PR target/88236
	PR target/115726
gcc/
	* config/avr/avr.md (mov<mode>) [avr_mem_memx_p]: Expand in such a
	way that the destination does not overlap with any hard register
	clobbered / used by xload8qi_A resp. xload<mode>_A.
	* config/avr/avr.cc (avr_out_xload): Avoid early-clobber
	situation for Z by executing just one load when the output register
	overlaps with Z.
gcc/testsuite/
	* gcc.target/avr/torture/pr88236-pr115726.c: New test.

(cherry picked from commit 3d23abd3dd9c8c226ea302203b214b346f4fe8d7)
2024-07-01 13:30:03 +02:00
Jakub Jelinek
37bbd2c166 c: Fix ICE related to incomplete structures in C23 [PR114930]
Here is a version of the c_update_type_canonical fixes which passed
bootstrap/regtest.
The non-trivial part is the handling of the case when
build_qualified_type (TYPE_CANONICAL (t), TYPE_QUALS (x))
returns a type with NULL TYPE_CANONICAL.  That should happen only
if TYPE_CANONICAL (t) == t, because otherwise c_update_type_canonical should
have been already called on the other type.  c, the returned type, is usually x
and in that case it should have TYPE_CANONICAL set to itself, or worst
for whatever reason x is not the right canonical type (say it has attributes
or whatever disqualifies it from check_qualified_type).  In that case
either it finds some pre-existing type from the variant chain of t which
is later in the chain and we haven't processed it yet (but then
get_qualified_type moves it right after t in:
        /* Put the found variant at the head of the variant list so
           frequently searched variants get found faster.  The C++ FE
           benefits greatly from this.  */
        tree t = *tp;
        *tp = TYPE_NEXT_VARIANT (t);
        TYPE_NEXT_VARIANT (t) = TYPE_NEXT_VARIANT (mv);
        TYPE_NEXT_VARIANT (mv) = t;
        return t;
optimization), or creates a fresh new type using build_variant_type_copy,
which again places the new type right after t:
  /* Add the new type to the chain of variants of TYPE.  */
  TYPE_NEXT_VARIANT (t) = TYPE_NEXT_VARIANT (m);
  TYPE_NEXT_VARIANT (m) = t;
  TYPE_MAIN_VARIANT (t) = m;
At this point we want to make c its own canonical type (i.e. TYPE_CANONICAL
(c) = c;), but also need to process pointers to it and only then return back
to processing x.  Processing the whole chain from c again could be costly,
we could have hundreds of types in the chain already processed, and while
the loop would just quickly skip them
  for (tree x = t, l = NULL_TREE; x; l = x, x = TYPE_NEXT_VARIANT (x))
    {
      if (x != t && TYPE_STRUCTURAL_EQUALITY_P (x))
...
      else if (x != t)
        continue;
it feels costly.  So, this patch instead moves c from right after t
to right before x in the chain (that shouldn't change anything, because
clearly build_qualified_type didn't find any matches in the chain before
x) and continues processing the c at that position, so should handle the
x that encountered this in the next iteration.
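
The chain manipulation described above can be modelled on a plain singly linked list. This is only a sketch: Variant, move_before_x and move_demo_ok are hypothetical names, and real variant chains are linked through TYPE_NEXT_VARIANT rather than a next pointer.

```cpp
// Simplified model of a type in a variant chain.
struct Variant {
  int id;
  Variant* next;
};

// Move c from right after t to right before x, where before_x is the node
// whose successor is x.  Precondition: t.next == &c and before_x is c itself
// or comes later in the chain.
constexpr void move_before_x(Variant& t, Variant& c, Variant& before_x) {
  if (&before_x == &c)
    return;                  // c already sits right before x
  t.next = c.next;           // unlink c from its place after t
  c.next = before_x.next;    // c now points at x
  before_x.next = &c;        // the old predecessor of x points at c
}

// Builds t -> c -> a -> b -> x and checks the result t -> a -> b -> c -> x.
constexpr bool move_demo_ok() {
  Variant x{4, nullptr};
  Variant b{3, &x};
  Variant a{2, &b};
  Variant c{1, &a};
  Variant t{0, &c};
  move_before_x(t, c, b);
  return t.next == &a && a.next == &b && b.next == &c && c.next == &x;
}
```

After the move, a loop walking the chain reaches c immediately before x, so already processed nodes are not revisited.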

We could avoid some of the moving in the chain if we processed the chain
twice, once deal only with x != t && TYPE_STRUCTURAL_EQUALITY_P (x)
&& TYPE_CANONICAL (t) == t && check_qualified_type (t, x, TYPE_QUALS (x))
types (in that case set TYPE_CANONICAL (x) = x) and once the rest.  There
is still the theoretical case where build_qualified_type would return
a new type and in that case we are back to the moving the type around and
needing to handle it though.

2024-06-25  Jakub Jelinek  <jakub@redhat.com>
	    Martin Uecker  <uecker@tugraz.at>

	PR c/114930
	PR c/115502
gcc/c/
	* c-decl.cc (c_update_type_canonical): Assert t is main variant
	with 0 TYPE_QUALS.  Simplify and don't use check_qualified_type.
	Deal with the case where build_qualified_type returns
	TYPE_STRUCTURAL_EQUALITY_P type.
gcc/testsuite/
	* gcc.dg/pr114574-1.c: Require lto effective target.
	* gcc.dg/pr114574-2.c: Likewise.
	* gcc.dg/pr114930.c: New test.
	* gcc.dg/pr115502.c: New test.

(cherry picked from commit 777cc6a01d1cf783a36d0fa67ab20f0312f35d7a)
2024-07-01 09:38:15 +02:00
GCC Administrator
78bd4b1c23 Daily bump. 2024-07-01 00:24:05 +00:00
Harald Anlauf
603b344c07 Fortran: fix ALLOCATE with SOURCE of deferred character length [PR114019]
gcc/fortran/ChangeLog:

	PR fortran/114019
	* trans-stmt.cc (gfc_trans_allocate): Fix handling of case of
	scalar character expression being used for SOURCE.

gcc/testsuite/ChangeLog:

	PR fortran/114019
	* gfortran.dg/allocate_with_source_33.f90: New test.

(cherry picked from commit 7682d115402743090f20aca63a3b5e6c205dedff)
2024-06-30 20:20:26 +02:00
Harald Anlauf
9f147487de Fortran: fix passing of optional dummy as actual to optional argument [PR55978]
gcc/fortran/ChangeLog:

	PR fortran/55978
	* trans-array.cc (gfc_conv_array_parameter): Do not dereference
	data component of a missing allocatable dummy array argument for
	passing as actual to optional dummy.  Harden logic of presence
	check for optional pointer dummy by using TRUTH_ANDIF_EXPR instead
	of TRUTH_AND_EXPR.
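
The difference matters because TRUTH_ANDIF_EXPR is the short-circuit form of logical AND, like && in C and C++, whereas TRUTH_AND_EXPR evaluates both operands.  A small C++ analogue (not the Fortran front-end code; present_and_positive is a hypothetical name):

```cpp
// Short-circuit presence check: the dereference on the right-hand side is
// only evaluated when the pointer is known to be non-null.
constexpr bool present_and_positive(const int* p) {
  return p != nullptr && *p > 0;
}
```

Evaluating both operands unconditionally would dereference the null pointer whenever the argument is absent.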

gcc/testsuite/ChangeLog:

	PR fortran/55978
	* gfortran.dg/optional_absent_12.f90: New test.

(cherry picked from commit f02c70dafd384f0c44d7a0920f4a75a30e267045)
2024-06-30 20:20:26 +02:00
Harald Anlauf
b31e1900fa Fortran: fix for CHARACTER(len=*) dummies with bind(C) [PR115390]
gcc/fortran/ChangeLog:

	PR fortran/115390
	* trans-decl.cc (gfc_conv_cfi_to_gfc): Move derivation of type sizes
	for character via gfc_trans_vla_type_sizes to after character length
	has been set.

gcc/testsuite/ChangeLog:

	PR fortran/115390
	* gfortran.dg/bind_c_char_11.f90: New test.

(cherry picked from commit 954f9011c4923b72f42cc6ca8460333e7c7aad98)
2024-06-30 20:20:26 +02:00
GCC Administrator
4fe3fff904 Daily bump. 2024-06-30 00:22:56 +00:00
GCC Administrator
47cbc76568 Daily bump. 2024-06-29 00:22:50 +00:00
Patrick Palka
e6b115be1c c++: decltype of capture proxy of ref [PR115504]
The finish_decltype_type capture proxy handling added in r14-5330 was
incorrectly stripping references in the type of the captured variable.

	PR c++/115504

gcc/cp/ChangeLog:

	* semantics.cc (finish_decltype_type): Don't strip the reference
	type (if any) of a capture proxy's captured variable.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp1y/decltype-auto8.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit 737449e5f233feb682b5dd2cc153892ad90a79bd)
2024-06-28 15:43:52 -04:00
Patrick Palka
a00a8d46ea c++: alias CTAD and copy deduction guide [PR115198]
Here we're neglecting to update DECL_NAME during the alias CTAD guide
transformation, which causes copy_guide_p to return false for the
transformed copy deduction guide since DECL_NAME is still __dguide_C
with TREE_TYPE C<B, T> but it should be __dguide_A with TREE_TYPE A<T>
(i.e. C<false, T>).  This ultimately results in ambiguity during
overload resolution between the copy deduction guide vs copy ctor guide.

This patch makes us update DECL_NAME of a transformed guide accordingly
during alias/inherited CTAD.

	PR c++/115198

gcc/cp/ChangeLog:

	* pt.cc (alias_ctad_tweaks): Update DECL_NAME of the transformed
	guides.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp2a/class-deduction-alias22.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit 06ebb7c6f31fe42ffdea6f51ac1ba1f6b058c090)
2024-06-28 15:43:52 -04:00
Patrick Palka
33a9c4dd5f c++: using non-dep array var of unknown bound [PR115358]
For a non-dependent array variable of unknown bound, it seems we need to
try instantiating its definition upon use in a template context for sake
of proper checking and typing of the overall expression, like we do for
function specializations with deduced return type.

	PR c++/115358

gcc/cp/ChangeLog:

	* decl2.cc (mark_used): Call maybe_instantiate_decl for an array
	variable with unknown bound.
	* semantics.cc (finish_decltype_type): Remove now redundant
	handling of array variables with unknown bound.
	* typeck.cc (cxx_sizeof_expr): Likewise.

gcc/testsuite/ChangeLog:

	* g++.dg/template/array37.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit e3915c1ad56591cbd68229a64c941c38330abd69)
2024-06-28 15:43:52 -04:00
Jonathan Wakely
d5e352addf libstdc++: Fix std::format for chrono::duration with unsigned rep [PR115668]
Using std::chrono::abs is only valid if numeric_limits<rep>::is_signed
is true, so using it unconditionally made it ill-formed to format a
duration with an unsigned rep.

The duration formatter might as well negate the duration itself instead
of using chrono::abs, because it already needs to check for a negative
value.
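
A minimal sketch of that idea, not the code in chrono_io.h (SignedParts and split_sign are hypothetical names): check the sign once and negate directly, which stays well-formed when rep is unsigned.

```cpp
#include <chrono>

// Sign flag plus non-negative magnitude of a duration.
template <class Rep, class Period>
struct SignedParts {
  bool negative;
  std::chrono::duration<Rep, Period> magnitude;
};

// Splits a duration into sign and magnitude without std::chrono::abs, so it
// also compiles for unsigned Rep (where the negative branch is never taken).
template <class Rep, class Period>
constexpr SignedParts<Rep, Period> split_sign(std::chrono::duration<Rep, Period> d) {
  using D = std::chrono::duration<Rep, Period>;
  if (d < D::zero())
    return {true, -d};
  return {false, d};
}
```

For example, split_sign(std::chrono::seconds(-5)) reports a negative sign with a magnitude of five seconds, and a duration<unsigned> argument is accepted as well.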

libstdc++-v3/ChangeLog:

	PR libstdc++/115668
	* include/bits/chrono_io.h (formatter<duration<R,P>, C>::format):
	Do not use chrono::abs.
	* testsuite/20_util/duration/io.cc: Check formatting a duration
	with unsigned rep.

(cherry picked from commit dafa750c8a6f0a088677871bfaad054881737ab1)
2024-06-28 10:44:00 +01:00
Kewen Lin
ef8b60dd48 rs6000: Fix wrong RTL patterns for vector merge high/low word on LE
Commit r12-4496 changes some define_expands and define_insns
for vector merge high/low word, which are altivec_vmrg[hl]w,
vsx_xxmrg[hl]w_<VSX_W:mode>.  These defines are mainly for
built-in function vec_merge{h,l}, __builtin_vsx_xxmrghw,
__builtin_vsx_xxmrghw_4si and some internal gen function
needs.  These functions should consider endianness.  Taking
vec_mergeh as an example, PVIPR defines vec_mergeh as "Merges
the first halves (in element order) of two vectors"; note
that this is in element order.  So it is mapped to vmrghw on
BE and to vmrglw on LE.  Although the mapped
insns differ, as discussed in PR106069, the RTL
pattern should still be the same.  That held before
commit r12-4496: define_expand altivec_vmrghw got expanded
into:

  (vec_select:VSX_W
     (vec_concat:<VS_double>
        (match_operand:VSX_W 1 "register_operand" "wa,v")
        (match_operand:VSX_W 2 "register_operand" "wa,v"))
        (parallel [(const_int 0) (const_int 4)
                   (const_int 1) (const_int 5)])))]

on both BE and LE then.  But commit r12-4496 changed it to
expand into:

  (vec_select:VSX_W
     (vec_concat:<VS_double>
        (match_operand:VSX_W 1 "register_operand" "wa,v")
        (match_operand:VSX_W 2 "register_operand" "wa,v"))
        (parallel [(const_int 0) (const_int 4)
                   (const_int 1) (const_int 5)])))]

on BE, and

  (vec_select:VSX_W
     (vec_concat:<VS_double>
        (match_operand:VSX_W 1 "register_operand" "wa,v")
        (match_operand:VSX_W 2 "register_operand" "wa,v"))
        (parallel [(const_int 2) (const_int 6)
                   (const_int 3) (const_int 7)])))]

on LE.  Although the mapped insns are still vmrghw on BE and
vmrglw on LE, the associated RTL pattern on LE is completely
wrong and inconsistent with the mapped insn.  If optimization
passes leave this pattern alone, it is still fine even though
the pattern doesn't represent its mapped insn, which is why
simple testing on the bif doesn't expose this issue.  But once
an optimization pass such as combine makes changes based
on this wrong pattern, the result can be unexpected, because
the pattern doesn't match the semantics that the expanded insn
is intended to represent.
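
To see what the two selectors compute, here is a scalar model of a vec_select over a vec_concat for four-element vectors.  It is a sketch only: V4, Sel and select_from_concat are hypothetical names, and plain int elements stand in for the vector modes.

```cpp
// Four-element vector and a four-entry selector.
struct V4 {
  int e[4];
};
struct Sel {
  int idx[4];
};

// The two selectors that appear in the patterns quoted above.
constexpr Sel kFirstHalves{{0, 4, 1, 5}};
constexpr Sel kSecondHalves{{2, 6, 3, 7}};

// Model of (vec_select (vec_concat a b) (parallel [...])): indices 0..3
// pick from a, indices 4..7 pick from b.
constexpr V4 select_from_concat(const V4& a, const V4& b, const Sel& sel) {
  V4 r{};
  for (int i = 0; i < 4; ++i) {
    const int s = sel.idx[i];
    r.e[i] = s < 4 ? a.e[s] : b.e[s - 4];
  }
  return r;
}
```

With a = {10, 11, 12, 13} and b = {20, 21, 22, 23}, the first selector yields {10, 20, 11, 21} and the second yields {12, 22, 13, 23}, so the two patterns describe different operations.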

So this patch fixes the wrong RTL patterns and ensures the
associated RTL patterns are the same as before, so that they
have the same semantics as their mapped insns.  With this
patch, expanders like altivec_vmrghw expand
into altivec_vmrghw_direct_v4si_be or altivec_vmrglw_direct_v4si_le
depending on endianness.  "direct" shows which
insn will be generated, while _be and _le distinguish the
different RTL patterns for each endianness.

Co-authored-by: Xionghu Luo <xionghuluo@tencent.com>

	PR target/106069
	PR target/115355

gcc/ChangeLog:

	* config/rs6000/altivec.md (altivec_vmrghw_direct_<VSX_W:mode>): Rename
	to ...
	(altivec_vmrghw_direct_<VSX_W:mode>_be): ... this.  Add the condition
	BYTES_BIG_ENDIAN.
	(altivec_vmrghw_direct_<VSX_W:mode>_le): New define_insn.
	(altivec_vmrglw_direct_<VSX_W:mode>): Rename to ...
	(altivec_vmrglw_direct_<VSX_W:mode>_be): ... this.  Add the condition
	BYTES_BIG_ENDIAN.
	(altivec_vmrglw_direct_<VSX_W:mode>_le): New define_insn.
	(altivec_vmrghw): Adjust by calling gen_altivec_vmrghw_direct_v4si_be
	for BE and gen_altivec_vmrglw_direct_v4si_le for LE.
	(altivec_vmrglw): Adjust by calling gen_altivec_vmrglw_direct_v4si_be
	for BE and gen_altivec_vmrghw_direct_v4si_le for LE.
	(vec_widen_umult_hi_v8hi): Adjust the call to
	gen_altivec_vmrghw_direct_v4si by gen_altivec_vmrghw for BE
	and by gen_altivec_vmrglw for LE.
	(vec_widen_smult_hi_v8hi): Likewise.
	(vec_widen_umult_lo_v8hi): Adjust the call to
	gen_altivec_vmrglw_direct_v4si by gen_altivec_vmrglw for BE
	and by gen_altivec_vmrghw for LE.
	(vec_widen_smult_lo_v8hi): Likewise.
	* config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Replace
	CODE_FOR_altivec_vmrghw_direct_v4si by
	CODE_FOR_altivec_vmrghw_direct_v4si_be for BE and
	CODE_FOR_altivec_vmrghw_direct_v4si_le for LE.  And replace
	CODE_FOR_altivec_vmrglw_direct_v4si by
	CODE_FOR_altivec_vmrglw_direct_v4si_be for BE and
	CODE_FOR_altivec_vmrglw_direct_v4si_le for LE.
	* config/rs6000/vsx.md (vsx_xxmrghw_<VSX_W:mode>): Adjust by calling
	gen_altivec_vmrghw_direct_v4si_be for BE and
	gen_altivec_vmrglw_direct_v4si_le for LE.
	(vsx_xxmrglw_<VSX_W:mode>): Adjust by calling
	gen_altivec_vmrglw_direct_v4si_be for BE and
	gen_altivec_vmrghw_direct_v4si_le for LE.

gcc/testsuite/ChangeLog:

	* g++.target/powerpc/pr106069.C: New test.
	* gcc.target/powerpc/pr115355.c: New test.

(cherry picked from commit 52c112800d9f44457c4832309a48c00945811313)
2024-06-27 20:15:02 -05:00
GCC Administrator
15d304de84 Daily bump. 2024-06-28 00:23:15 +00:00
Jonathan Wakely
a8b77a6963 libstdc++: Replace viewcvs links in docs with cgit links
For this backport to the release branch, the links to the git repo refer
to the branch.

libstdc++-v3/ChangeLog:

	* doc/xml/faq.xml: Replace viewcvs links with cgit links.
	* doc/xml/manual/allocator.xml: Likewise.
	* doc/xml/manual/mt_allocator.xml: Likewise.
	* doc/html/*: Regenerate.

(cherry picked from commit 9d8021d1875677286c3dde90dfed2aca864edad0)
2024-06-27 23:46:21 +01:00
Alexandre Oliva
b70af0bd2e [libstdc++] [testsuite] defer to check_vect_support* [PR115454]
The newly-added testcase overrides the default dg-do action set by
check_vect_support_and_set_flags (in libstdc++-dg/conformance.exp), so
it attempts to run the test even if runtime vector support is not
available.

Remove the explicit dg-do directive, so that the default is honored,
and the test is run if vector support is found, and only compiled
otherwise.


for  libstdc++-v3/ChangeLog

	PR libstdc++/115454
	* testsuite/experimental/simd/pr115454_find_last_set.cc: Defer
	to check_vect_support_and_set_flags's default dg-do action.

(cherry picked from commit 95faa1bea7bdc7f92fcccb3543bfcbc8184c5e5b)
2024-06-27 08:14:34 -03:00
Kyrylo Tkachov
c2878a9a17 aarch64: Add support for -mcpu=grace
This adds support for the NVIDIA Grace CPU to aarch64.
We reuse the tuning decisions for the Neoverse V2 core, but include a
number of architecture features that are not enabled by default in
-mcpu=neoverse-v2.

This allows Grace users to more simply target the CPU with -mcpu=grace
rather than remembering what extensions to tag on top of
-mcpu=neoverse-v2.

Bootstrapped and tested on aarch64-none-linux-gnu.
gcc/

	* config/aarch64/aarch64-cores.def (grace): New entry.
	* config/aarch64/aarch64-tune.md: Regenerate.
	* doc/invoke.texi (AArch64 Options): Document the above.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>
2024-06-27 09:47:56 +02:00
Jiawei
6e6f10c3ad tree-ssa-pre.c/115214(ICE in find_or_generate_expression, at tree-ssa-pre.c:2780): Return NULL_TREE when dealing with special cases.
Return NULL_TREE when genop3 equals EXACT_DIV_EXPR.
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652641.html

version log v3: remove additional POLY_INT_CST check.
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652795.html

gcc/ChangeLog:

	* tree-ssa-pre.cc (create_component_ref_by_pieces_1): New conditions.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/vsetvl/pr115214.c: New test.
2024-06-27 13:08:46 +08:00
GCC Administrator
f9cc628d16 Daily bump. 2024-06-27 00:22:50 +00:00
GCC Administrator
532357bc27 Daily bump. 2024-06-26 00:23:48 +00:00
Jonathan Wakely
f91d9b3e91 libstdc++: Remove confusing text from status tables for release branch
When I tried to make the release branch versions of these docs refer to
the release branch instead of "mainline GCC", for some reason I left the
text "not any particular release" there. That's just confusing, because
the docs are for a particular release, the latest on that branch. Remove
that confusing text in several places.

libstdc++-v3/ChangeLog:

	* doc/xml/manual/status_cxx1998.xml: Remove confusing "not in
	any particular release" text.
	* doc/xml/manual/status_cxx2011.xml: Likewise.
	* doc/xml/manual/status_cxx2014.xml: Likewise.
	* doc/xml/manual/status_cxx2017.xml: Likewise.
	* doc/xml/manual/status_cxx2020.xml: Likewise.
	* doc/xml/manual/status_cxx2023.xml: Likewise.
	* doc/xml/manual/status_cxxtr1.xml: Likewise.
	* doc/xml/manual/status_cxxtr24733.xml: Likewise.
	* doc/html/manual/status.html: Regenerate.
2024-06-25 23:35:45 +01:00
Sandra Loosemore
b383719aeb Fix PR c/115587, uninitialized variable in c_parser_omp_loop_nest
This function had a reference to an uninitialized variable on the
error path.  The problem was diagnosed by clang but not gcc.  It seems
the cleanest solution is to initialize all the loop-clause variables
at the point of declaration rather than at different places in the
code.

The C++ front end didn't have this problem, but I've made similar
changes there to keep the code in sync.

gcc/c/ChangeLog:

	PR c/115587
	* c-parser.cc (c_parser_omp_loop_nest): Move initializations to
	point of declaration.

gcc/cp/ChangeLog:

	PR c/115587
	* parser.cc (cp_parser_omp_loop_nest): Move initializations to
	point of declaration.

(cherry picked from commit 21f1073d388af8af207183b0ed592e1cc47d20ab)
2024-06-25 14:53:05 +00:00
Eric Botcazou
4bf93fc3d3 SPARC: fix internal error with -mv8plus on 64-bit Linux
This passes -m32 when -mv8plus is specified on Linux (like on Solaris).

gcc/
	PR target/115608
	* config/sparc/linux64.h (CC1_SPEC): Pass -m32 for -mv8plus.
2024-06-25 11:50:46 +02:00
Andrew Pinski
b7157f3930 c-family: Add Warning property to Wnrvo option [PR115624]
This was missing when Wnrvo was added in
r14-1594-g2ae5384d457b9c67586de012816dfc71a6943164 .

Pushed after a bootstrap/test on x86_64-linux-gnu.

gcc/c-family/ChangeLog:

	PR c++/115624
	* c.opt (Wnrvo): Add Warning property.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
(cherry picked from commit f7747210947a7c66e865c6ac571cce39e2b87caf)
2024-06-24 21:57:57 -07:00
GCC Administrator
faf5994e65 Daily bump. 2024-06-25 00:24:11 +00:00
Kewen Lin
2b5e8f918e rs6000: Don't clobber return value when eh_return called [PR114846]
As the associated test case in PR114846 shows, currently
with eh_return involved, some register restoring for EH
RETURN DATA in the epilogue can clobber the register holding
the return value.  Following the existing handling in
some other targets, this patch makes the eh_return expander
call a new define_insn_and_split eh_return_internal, which
directly calls rs6000_emit_epilogue with epilogue_type
EPILOGUE_TYPE_EH_RETURN instead of the previous special
treatment of a normal return with crtl->calls_eh_return.

	PR target/114846

gcc/ChangeLog:

	* config/rs6000/rs6000-logue.cc (rs6000_emit_epilogue): As
	EPILOGUE_TYPE_EH_RETURN would be passed as epilogue_type directly
	now, adjust the relevant handlings on it.
	* config/rs6000/rs6000.md (eh_return expander): Append by calling
	gen_eh_return_internal and emit_barrier.
	(eh_return_internal): New define_insn_and_split, call function
	rs6000_emit_epilogue with epilogue type EPILOGUE_TYPE_EH_RETURN.

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/pr114846.c: New test.

(cherry picked from commit e5fc5d42d25c86ae48178db04ce64d340a834614)
2024-06-23 20:42:45 -05:00
GCC Administrator
1a2329d007 Daily bump. 2024-06-24 00:24:03 +00:00
GCC Administrator
1735b862ad Daily bump. 2024-06-23 00:22:43 +00:00
GCC Administrator
70d9d929ce Daily bump. 2024-06-22 00:24:39 +00:00
Wilco Dijkstra
9421f02916 AArch64: Fix cpu features initialization [PR115342]
The CPU features initialization code uses CPUID registers (rather than
HWCAP).  The equality comparisons it uses are incorrect: for example FEAT_SVE
is not set if SVE2 is available.  Using HWCAPs for these is both simpler and
correct.  The initialization must also be done atomically to avoid multiple
threads causing corruption due to non-atomic RMW accesses to the global.
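
Both points can be sketched in a few lines.  This is an illustrative model rather than the libgcc code: the feature bits, the 4-bit field encoding (where a larger value is assumed to imply the smaller features) and all names are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical feature bits; kFeatInit marks "already initialized".
constexpr std::uint64_t kFeatBase = 1ull << 0;
constexpr std::uint64_t kFeatExt = 1ull << 1;
constexpr std::uint64_t kFeatInit = 1ull << 63;

std::atomic<std::uint64_t> g_features{0};

// Decode with >= rather than ==, so that a field value of 2 still reports
// the base feature as present.
constexpr std::uint64_t decode_field(unsigned field) {
  return (field >= 1 ? kFeatBase : 0) | (field >= 2 ? kFeatExt : 0);
}

// Build the complete mask locally and publish it with a single atomic store,
// instead of read-modify-write updates on the shared global.
void init_features(unsigned field) {
  assert(field <= 15);  // models a 4-bit field
  if (g_features.load(std::memory_order_relaxed) != 0)
    return;
  g_features.store(decode_field(field) | kFeatInit, std::memory_order_relaxed);
}
```

If two threads race through init_features with the same input, both store the same complete value, so no partially updated mask is ever visible.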

libgcc:
	PR target/115342
	* config/aarch64/cpuinfo.c (__init_cpu_features_constructor):
	Use HWCAP where possible.  Use atomic write for initialization.
	Fix FEAT_PREDRES comparison.
	(__init_cpu_features_resolver): Use atomic load for correct
	initialization.
	(__init_cpu_features): Likewise.
(cherry picked from commit d7cbcfe7c33645eaf95f175f19884d443817857b)
2024-06-21 17:15:45 +01:00
Matthias Kretz
a851931bc0 libstdc++: Fix test on x86_64 and non-simd targets
* Running a test compiled with AVX512 instructions requires
avx512f_runtime not just avx512f.

* The 'reduce2' test violated an invariant of fixed_size_simd_mask and
thus failed on all targets without 16-Byte vector builtins enabled (in
bits/simd.h).

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>

libstdc++-v3/ChangeLog:

	PR libstdc++/115575
	* testsuite/experimental/simd/pr115454_find_last_set.cc: Require
	avx512f_runtime. Don't memcpy fixed_size masks.

(cherry picked from commit 77f321435b4ac37992c2ed6737ca0caa1dd50551)
2024-06-21 18:07:30 +02:00
YunQiang Su
a16f47f5f3 Build: Set gcc_cv_as_mips_explicit_relocs if gcc_cv_as_mips_explicit_relocs_pcrel
We check gcc_cv_as_mips_explicit_relocs if gcc_cv_as_mips_explicit_relocs_pcrel
only, while gcc_cv_as_mips_explicit_relocs is used by later code.

gcc
	* configure.ac: Set gcc_cv_as_mips_explicit_relocs if
	gcc_cv_as_mips_explicit_relocs_pcrel.
	* configure: Regenerate.

(cherry picked from commit 573f11ec34eeb6a6c3bd3d7619738f927236727b)
2024-06-21 21:55:12 +08:00
Richard Biener
272e8c90af tree-optimization/115278 - fix DSE in if-conversion wrt volatiles
The following adds the missing guard for volatile stores to the
embedded DSE in the loop if-conversion pass.
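
For illustration only (this is not the testcase from the PR, and flush_values is a hypothetical name), a loop of the following shape has a conditional store that looks dead because a later store overwrites the same location, yet both stores are volatile accesses and must be kept.

```cpp
#include <cassert>
#include <cstdint>

// Writes each non-zero value to a volatile location, then resets it to zero.
void flush_values(volatile std::uint32_t* reg, const std::uint32_t* vals, int n) {
  assert(reg != nullptr);
  for (int i = 0; i < n; ++i) {
    if (vals[i] != 0)
      *reg = vals[i];  // conditional volatile store, overwritten just below
    *reg = 0;          // later volatile store to the same location
  }
}
```

With an ordinary (non-volatile) pointer, dropping the conditional store would be a valid optimization; with volatile it is not.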

	PR tree-optimization/115278
	* tree-if-conv.cc (ifcvt_local_dce): Do not DSE volatile stores.

	* g++.dg/vect/pr115278.cc: New testcase.

(cherry picked from commit 65dbe0ab7cdaf2aa84b09a74e594f0faacf1945c)
2024-06-21 11:42:49 +02:00
Richard Biener
65e25860f4 tree-optimization/115508 - fix ICE with SLP scheduling and extern vector
When there's a permute after an extern vector we can run into a case
that didn't consider the scheduled node being a permute which lacks
a representative.

	PR tree-optimization/115508
	* tree-vect-slp.cc (vect_schedule_slp_node): Guard check on
	representative.

	* gcc.target/i386/pr115508.c: New testcase.

(cherry picked from commit 65e72b95c63a5501cf1482f3814ae8c8e672bf06)
2024-06-21 11:42:48 +02:00
Richard Biener
85d32e6f75 Avoid SLP_REPRESENTATIVE access for VEC_PERM in SLP scheduling
SLP permute nodes can end up without a SLP_REPRESENTATIVE now,
the following avoids touching it in this case in vect_schedule_slp_node.

	* tree-vect-slp.cc (vect_schedule_slp_node): Avoid looking
	at SLP_REPRESENTATIVE for VEC_PERM nodes.

(cherry picked from commit 31e9bae0ea5e5413abfa3ca9050e66cc6760553e)
2024-06-21 11:42:48 +02:00
GCC Administrator
30fca2ce1d Daily bump. 2024-06-21 00:26:37 +00:00
Matthias Kretz
e77f314ccd libstdc++: Fix find_last_set(simd_mask) to ignore padding bits
With the change to the AVX512 find_last_set implementation, the change
to AVX512 operator!= is unnecessary. However, the latter was not
producing optimal code and unnecessarily set the padding bits. In
theory, the compiler could determine that with the new !=
implementation, the bit operation for clearing the padding bits is a
no-op and can be elided.
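
The padding issue can be shown with a scalar sketch.  This is not the simd_x86.h implementation; find_last_set here is a hypothetical free function operating on an integer bitmask with a given number of valid lanes.

```cpp
// Index of the highest set bit among the low `lanes` bits, or -1 if none.
// Precondition: 1 <= lanes <= 32.
constexpr int find_last_set(unsigned bits, int lanes) {
  const unsigned valid = lanes >= 32 ? ~0u : ((1u << lanes) - 1u);
  bits &= valid;  // clear padding bits above the valid lanes
  int width = 0;  // number of bits needed to represent `bits`
  while (bits != 0) {
    ++width;
    bits >>= 1;
  }
  return width - 1;
}
```

For a four-lane mask stored as 0xF5, the masked value is 0x5 and the result is 2; without the masking step the stray padding bits would give 7.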

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>

libstdc++-v3/ChangeLog:

	PR libstdc++/115454
	* include/experimental/bits/simd_x86.h (_S_not_equal_to): Use
	neq comparison instead of bitwise negation after eq.
	(_S_find_last_set): Clear unused high bits before computing
	bit_width.
	* testsuite/experimental/simd/pr115454_find_last_set.cc: New
	test.

(cherry picked from commit 1340ddea0158de3f49aeb75b4013e5fc313ff6f4)
2024-06-20 13:28:06 +02:00
Andreas Krebbel
d26fa1c73b vshuf-mem.C: Make -march=z14 depend on s390_vxe
gcc/testsuite/ChangeLog:

	* g++.dg/torture/vshuf-mem.C: Use -march=z14 only if we are
	on a machine which can actually run it.

(cherry picked from commit 7e59f0c05da840ca13ba73d25947df8a4eaf199e)
2024-06-20 13:05:17 +02:00