Commit graph

210217 commits

Author SHA1 Message Date
Jakub Jelinek
0f616e75f3 bitint: Fix up lower_addsub_overflow [PR115352]
The following testcase is miscompiled because of a flawed optimization.
If one changes the 65 in the testcase to e.g. 66, one gets:
...
  _25 = .USUBC (0, _24, _14);
  _12 = IMAGPART_EXPR <_25>;
  _26 = REALPART_EXPR <_25>;
  if (_23 >= 1)
    goto <bb 8>; [80.00%]
  else
    goto <bb 11>; [20.00%]

  <bb 8> :
  if (_23 != 1)
    goto <bb 10>; [80.00%]
  else
    goto <bb 9>; [20.00%]

  <bb 9> :
  _27 = (signed long) _26;
  _28 = _27 >> 1;
  _29 = (unsigned long) _28;
  _31 = _29 + 1;
  _30 = _31 > 1;
  goto <bb 11>; [100.00%]

  <bb 10> :
  _32 = _26 != _18;
  _33 = _22 | _32;

  <bb 11> :
  # _17 = PHI <_30(9), _22(7), _33(10)>
  # _19 = PHI <_29(9), _18(7), _18(10)>
...
so there is one path for limbs below the boundary (in this case there are
actually no limbs there, maybe we could consider optimizing that further,
say with simply folding that _23 >= 1 condition to 1 == 1 and letting
cfg cleanup handle it), another case where it is exactly the limb on the
boundary (that is the bb 9 handling, where it extracts the interesting
bits (the first 3 statements) and then checks if the result is zero or all
ones), and finally the case of limbs above that, where it compares the
current result limb against the previously recorded 0 or all ones and ors
differences into the accumulated result.
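
As a rough C model of the boundary-limb and higher-limb cases above (a sketch
only, with hypothetical names, not the code that gimple-lower-bitint.cc emits),
assuming 64-bit little-endian limbs and GCC's arithmetic right shift of
negative values:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model.  LIMBS holds N little-endian 64-bit limbs of the
   wide result.  All bits from bit SHIFT of limb BOUNDARY upwards must be
   identical (all zeros or all ones); otherwise the result overflowed.  */
bool
high_bits_overflow (const uint64_t *limbs, size_t n, size_t boundary,
                    unsigned shift)
{
  /* Boundary limb (bb 9): extract the interesting bits with an
     arithmetic shift (GCC behaviour), then check for 0 or all ones.  */
  int64_t top = (int64_t) limbs[boundary] >> shift;
  uint64_t ext = (uint64_t) top;
  bool ovf = ext + 1 > 1;

  /* Higher limbs (bb 10): each one must equal the recorded extension,
     and differences are ORed into the accumulated flag.  */
  for (size_t i = boundary + 1; i < n; i++)
    ovf |= limbs[i] != ext;
  return ovf;
}
```

With limbs { 5, ~0, 0 }, every high limb is individually all zeros or all ones,
yet the function reports overflow because limb 2 differs from the extension
recorded at the boundary.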

Now, the optimization which the first hunk removes was based on the idea
that for that case the extraction of the interesting bits from the limb
doesn't need anything special, so the _27/_28/_29 statements above aren't
needed, the whole limb is interesting bits, so it handled the >= 1
case like the bb 9 above without the first 3 statements and bb 10 wasn't
there at all.  There are 2 problems with that: for the higher limbs it
only checks if the result limb bits are all zeros or all ones, but
doesn't check if they are the same as the other extension bits, and
it forgets the previous flag whether there was an overflow.
First I wanted to fix it just by adding the _33 = _22 | _30; statement
to the end of bb 9 above, which fixed the originally filed huge testcase
and the first 2 foo calls in the testcase included in the patch, it no
longer forgets about previously checked differences from 0/1.
But as the last 2 foo calls show, it still didn't check whether each
even (or each odd depending on the exact position) result limb is
equal to the first one, so every second limb it could choose some other
0 vs. all ones value and as long as it repeated in another limb above it
it would be ok.

So, the optimization just can't work properly and the following patch
removes it.

2024-06-07  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/115352
	* gimple-lower-bitint.cc (lower_addsub_overflow): Don't disable
	single_comparison if cmp_code is GE_EXPR.

	* gcc.dg/torture/bitint-71.c: New test.

(cherry picked from commit a47b1aaa7a76201da7e091d9f8d4488105786274)
2024-06-07 10:34:53 +02:00
GCC Administrator
7d40974268 Daily bump. 2024-06-07 00:22:48 +00:00
Jakub Jelinek
56c73729c3 c: Fix up pointer types to may_alias structures [PR114493]
The following testcase ICEs in ipa-free-lang, because the
fld_incomplete_type_of
          gcc_assert (TYPE_CANONICAL (t2) != t2
                      && TYPE_CANONICAL (t2) == TYPE_CANONICAL (TREE_TYPE (t)));
assertion doesn't hold.
This is because t is a struct S * type which was created while struct S
was still incomplete and without the may_alias attribute (and TYPE_CANONICAL
of a pointer type is a type created with can_alias_all = false argument),
while later on, on the struct definition, the may_alias attribute was used.
fld_incomplete_type_of then creates an incomplete distinct copy of the
structure (but with the original attributes) but pointers created for it
are because of the "may_alias" attribute TYPE_REF_CAN_ALIAS_ALL, including
their TYPE_CANONICAL, because while that is created with !can_alias_all
argument, we later set it because of the "may_alias" attribute on the
to_type.
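
The declaration order involved can be sketched as follows.  This is an
illustration with hypothetical names, not the testcase from the patch; the
bit pattern mentioned afterwards assumes a 32-bit int and an IEEE 754 float.

```c
/* Pointer type created while struct S is still incomplete and has no
   may_alias attribute yet.  */
struct S;
struct S *early_ptr;

/* The attribute only appears later, on the definition.  */
struct __attribute__ ((__may_alias__)) S { int i; };

/* Because S is may_alias, reading a float object through a struct S
   lvalue is not an aliasing violation.  */
int
read_bits (float *f)
{
  return ((struct S *) f)->i;
}
```

Under the stated assumptions, read_bits applied to a float holding 1.0f
yields 0x3f800000.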

This doesn't ICE with C++ since PR70512 fix because the C++ FE sets
TYPE_REF_CAN_ALIAS_ALL on all pointer types to the class type (and its
variants) when the may_alias is added.

The following patch does that in the C FE as well.

2024-06-06  Jakub Jelinek  <jakub@redhat.com>

	PR c/114493
	* c-decl.cc (c_fixup_may_alias): New function.
	(finish_struct): Call it if "may_alias" attribute is
	specified.

	* gcc.dg/pr114493-1.c: New test.
	* gcc.dg/pr114493-2.c: New test.

(cherry picked from commit d5a3c6d43acb8b2211d9fb59d59482d74c010f01)
2024-06-06 22:18:54 +02:00
Richard Ball
35ed54f136 aarch64: Add missing ACLE macro for NEON-SVE Bridge
__ARM_NEON_SVE_BRIDGE was missed in the original patch and is
added by this patch.

gcc/ChangeLog:

	* config/aarch64/aarch64-c.cc (aarch64_define_unconditional_macros):
	Add missing __ARM_NEON_SVE_BRIDGE.

(cherry picked from commit 43530bc40b1d0465911e493e56a6631202ce85b1)
2024-06-06 16:33:30 +01:00
GCC Administrator
d5760344db Daily bump. 2024-06-06 00:22:30 +00:00
Rainer Orth
e11a42b8c7 testsuite: i386: Require ifunc support in gcc.target/i386/avx10_1-25.c etc.
Two new AVX10.1 tests FAIL on Solaris/x86:

FAIL: gcc.target/i386/avx10_1-25.c (test for excess errors)
FAIL: gcc.target/i386/avx10_1-26.c (test for excess errors)

Excess errors:
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.target/i386/avx10_1-25.c:6:9: error: the call requires 'ifunc', which is not supported by this target

Fixed by requiring ifunc support.

Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu.

2024-06-04  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

	gcc/testsuite:
	* gcc.target/i386/avx10_1-25.c: Require ifunc support.
	* gcc.target/i386/avx10_1-26.c: Likewise.
2024-06-05 10:16:51 +08:00
GCC Administrator
7f0f88e282 Daily bump. 2024-06-05 00:22:26 +00:00
Jonathan Wakely
c6e6258ea4 libstdc++: Only define std::span::at for C++26 [PR115335]
In r14-5689-g1fa85dcf656e2f I added std::span::at and made the correct
changes to the __cpp_lib_span macro (with tests for the correct value in
C++20/23/26). But I didn't make the declaration of std::span::at
actually depend on the macro, so it was defined for C++20 and C++23, not
only for C++26. This fixes that oversight.

libstdc++-v3/ChangeLog:

	PR libstdc++/115335
	* include/std/span (span::at): Guard with feature test macro.

(cherry picked from commit 2197814011eec75022aa8550f10621409b69d4a1)
2024-06-04 15:29:28 +01:00
Jakub Jelinek
a88e13bd7e fold-const: Fix up CLZ handling in tree_call_nonnegative_warnv_p [PR115337]
The function currently incorrectly assumes all the __builtin_clz* and .CLZ
calls have non-negative result.  That is the case of the former which is UB
on zero and has [0, prec-1] return value otherwise, and is the case of the
single argument .CLZ as well (again, UB on zero), but for two argument
.CLZ that is the case only if the second argument is also nonnegative (or if
we know the argument can't be zero, but let's do that just in the ranger IMHO).
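
To illustrate at the C level (an analogy only: .CLZ is an internal function,
and clz_or below is a hypothetical wrapper, not a GCC API), a count of leading
zeros with a caller-supplied value at zero is nonnegative only when that
value is:

```c
#include <limits.h>

/* Hypothetical wrapper: like __builtin_clz, but returns VALUE_AT_ZERO
   for X == 0 instead of invoking undefined behaviour.  */
int
clz_or (unsigned int x, int value_at_zero)
{
  return x ? __builtin_clz (x) : value_at_zero;
}
```

For nonzero x the result lies in [0, prec-1]; clz_or (0, -1) is negative, so
a caller may not assume a nonnegative result without looking at the second
argument.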

The following patch does that.

2024-06-04  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/115337
	* fold-const.cc (tree_call_nonnegative_warnv_p) <CASE_CFN_CLZ>:
	If arg1 is non-NULL, RECURSE on it, otherwise return true.

	* gcc.dg/bitint-106.c: New test.

(cherry picked from commit b82a816000791e7a286c7836b3a473ec0e2a577b)
2024-06-04 16:20:25 +02:00
Jakub Jelinek
f9af4a05e0 builtins: Force SAVE_EXPR for __builtin_{add,sub,mul}_overflow and __builtin{add,sub}c [PR108789]
The following testcase is miscompiled, because we use save_expr
on the .{ADD,SUB,MUL}_OVERFLOW call we are creating, but if the first
two operands are not INTEGER_CSTs (in that case we just fold it right away)
but are TREE_READONLY/!TREE_SIDE_EFFECTS, save_expr doesn't actually
create a SAVE_EXPR at all and so we lower it to
*arg2 = REALPART_EXPR (.ADD_OVERFLOW (arg0, arg1)), \
IMAGPART_EXPR (.ADD_OVERFLOW (arg0, arg1))
which evaluates the ifn twice and just hopes it will be CSEd back.
As *arg2 aliases *arg0, that is not the case.
The builtins are really never const/pure as they store into what
the third argument points to, so after handling the INTEGER_CST+INTEGER_CST
case, I think we should just always use SAVE_EXPR.  Just building SAVE_EXPR
by hand and setting TREE_SIDE_EFFECTS on it doesn't work, because
c_fully_fold optimizes it away again, so the following patch marks the
ifn calls as TREE_SIDE_EFFECTS (but doesn't do it for the
__builtin_{add,sub,mul}_overflow_p cases, which were designed for use
especially in constant expressions and don't really evaluate the
realpart side, so we don't really need a SAVE_EXPR in that case).
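
A minimal sketch of the aliasing situation, using a hypothetical helper rather
than the testcase from the patch: the result pointer is the same object as the
first operand, so the sum and the overflow flag must both come from a single
evaluation of the addition.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: adds B to *A in place and reports overflow.
   The third argument aliases the first operand.  */
bool
add_in_place (int *a, int b)
{
  return __builtin_add_overflow (*a, b, a);
}
```

On overflow the builtin stores the wrapped result, so adding 1 to an int
holding INT_MAX leaves INT_MIN in it and returns true.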

2024-06-04  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/108789
	* builtins.cc (fold_builtin_arith_overflow): For ovf_only,
	don't call save_expr and don't build REALPART_EXPR, otherwise
	set TREE_SIDE_EFFECTS on call before calling save_expr.
	(fold_builtin_addc_subc): Set TREE_SIDE_EFFECTS on call before
	calling save_expr.

	* gcc.c-torture/execute/pr108789.c: New test.

(cherry picked from commit b8e28381cb5c0cddfe5201faf799d8b27f5d7d6c)
2024-06-04 16:19:42 +02:00
Jakub Jelinek
1c1bc2553f invoke.texi: Clarify -march=lujiazui
I was recently searching which exact CPUs are affected by the PR114576
wrong-code issue and went from the PTA_* bitmasks in GCC, so arrived
at the goldmont, goldmont-plus, tremont and lujiazui CPUs (as -march=
cases which do enable -maes and don't enable -mavx).
But when double-checking that against the invoke.texi documentation,
that was true for the first 3, but lujiazui said it supported AVX.
I was really confused by that, until I found the
https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604407.html
explanation.  So, it seems the CPUs do have AVX and F16C but -march=lujiazui
doesn't enable those and even actively attempts to filter those out from
the announced CPUID features, in glibc as well as e.g. in libgcc.

Thus, I think we should document what actually happens, otherwise
users could assume that
gcc -march=lujiazui predefines __AVX__ and __F16C__, which it doesn't.

2024-06-04  Jakub Jelinek  <jakub@redhat.com>

	* doc/invoke.texi (lujiazui): Clarify that while the CPUs do support
	AVX and F16C, -march=lujiazui actually doesn't enable those.

(cherry picked from commit 09b4ab53155ea16e1fb12c2afcd9b6fe29a31c74)
2024-06-04 16:19:18 +02:00
Jakub Jelinek
a7dd44c02e rs6000: Fix up PCH in --enable-host-pie builds [PR115324]
PCH doesn't work properly in --enable-host-pie configurations on
powerpc*-linux*.
The problem is that the rs6000_builtin_info and rs6000_instance_info
arrays mix pointers to .rodata/.data (bifname and attr_string point
to string literals in .rodata section, and the next member is either NULL
or &rs6000_instance_info[XXX]) and GC member (tree fntype).
Now, for normal GC this works just fine, we emit
  {
    &rs6000_instance_info[0].fntype,
    1 * (RS6000_INST_MAX),
    sizeof (rs6000_instance_info[0]),
    &gt_ggc_mx_tree_node,
    &gt_pch_nx_tree_node
  },
  {
    &rs6000_builtin_info[0].fntype,
    1 * (RS6000_BIF_MAX),
    sizeof (rs6000_builtin_info[0]),
    &gt_ggc_mx_tree_node,
    &gt_pch_nx_tree_node
  },
GC roots which are strided and thus cover only the fntype members of all
the elements of the two arrays.
For PCH though it actually results in saving those huge arrays (one is
130832 bytes, another 81568 bytes) into the .gch files and loading them back
in full.  While the bifname and attr_string and next pointers are marked as
GTY((skip)), they are actually saved to point to the .rodata and .data
sections of the process which writes the PCH, but because cc1/cc1plus etc.
are position independent executables with --enable-host-pie, when they are
loaded from the PCH file, they can point to completely different addresses
where nothing is mapped at all or where some random different thing appears.
While gengtype supports the callback option, that one is meant for
relocatable function pointers and doesn't work in the case of GTY arrays
inside of .data section anyway.

So, either we'd need to add some further GTY extensions, or the following
patch instead reworks it such that the fntype members which were the only
reason for PCH in those arrays are moved to separate arrays.

Size-wise in .data sections it is (in bytes):

                             vanilla    patched
rs6000_builtin_info          130832     110704
rs6000_instance_info          81568      40784
rs6000_overload_info           7392       7392
rs6000_builtin_info_fntype        0      10064
rs6000_instance_info_fntype       0      20392
sum                          219792     189336

where previously we saved/restored for PCH those 130832+81568 bytes, now we
save/restore just 10064+20392 bytes, so this change is beneficial for the
data section size.

Unfortunately, it grows the size of the rs6000_init_generated_builtins
function, vanilla had 218328 bytes, patched has 228668.

When I applied
 void
 rs6000_init_generated_builtins ()
 {
+  bifdata *rs6000_builtin_info_p;
+  tree *rs6000_builtin_info_fntype_p;
+  ovlddata *rs6000_instance_info_p;
+  tree *rs6000_instance_info_fntype_p;
+  ovldrecord *rs6000_overload_info_p;
+  __asm ("" : "=r" (rs6000_builtin_info_p) : "0" (rs6000_builtin_info));
+  __asm ("" : "=r" (rs6000_builtin_info_fntype_p) : "0" (rs6000_builtin_info_fntype));
+  __asm ("" : "=r" (rs6000_instance_info_p) : "0" (rs6000_instance_info));
+  __asm ("" : "=r" (rs6000_instance_info_fntype_p) : "0" (rs6000_instance_info_fntype));
+  __asm ("" : "=r" (rs6000_overload_info_p) : "0" (rs6000_overload_info));
+  #define rs6000_builtin_info rs6000_builtin_info_p
+  #define rs6000_builtin_info_fntype rs6000_builtin_info_fntype_p
+  #define rs6000_instance_info rs6000_instance_info_p
+  #define rs6000_instance_info_fntype rs6000_instance_info_fntype_p
+  #define rs6000_overload_info rs6000_overload_info_p
+
hack by hand, the size of the function is 209700 though, so if really
wanted, we could add __attribute__((__noipa__)) to the function when
building with recent enough GCC and pass pointers to the first elements
of the 5 arrays to the function as arguments.  If you want such a change,
could that be done incrementally?

2024-06-03  Jakub Jelinek  <jakub@redhat.com>

	PR target/115324
	* config/rs6000/rs6000-gen-builtins.cc (write_decls): Remove
	GTY markup from struct bifdata and struct ovlddata and remove their
	fntype members.  Change next member in struct ovlddata and
	first_instance member of struct ovldrecord to have int type rather
	than struct ovlddata *.  Remove GTY markup from rs6000_builtin_info
	and rs6000_instance_info arrays, declare new
	rs6000_builtin_info_fntype and rs6000_instance_info_fntype arrays,
	which have GTY markup.
	(write_bif_static_init): Adjust for the above changes.
	(write_ovld_static_init): Likewise.
	(write_init_bif_table): Likewise.
	(write_init_ovld_table): Likewise.
	* config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Likewise.
	* config/rs6000/rs6000-c.cc (find_instance): Likewise.  Make static.
	(altivec_resolve_overloaded_builtin): Adjust for the above changes.

(cherry picked from commit 4cf2de9b5268224816a3d53fdd2c3d799ebfd9c8)
2024-06-04 16:19:12 +02:00
Jakub Jelinek
14a7296d04 combine: Fix up simplify_compare_const [PR115092]
The following testcases are miscompiled (with tons of GIMPLE
optimization disabled) because combine sees GE comparison of
1-bit sign_extract (i.e. something with [-1, 0] value range)
with (const_int -1) (which is always true) and optimizes it into
NE comparison of 1-bit zero_extract ([0, 1] value range) against
(const_int 0).
The reason is that simplify_compare_const first (correctly)
simplifies the comparison to
GE (ashift:SI something (const_int 31)) (const_int -2147483648)
and then an optimization for when the second operand is a power of 2
triggers.  That optimization is fine for powers of 2 which aren't
the signed minimum of the mode, or if it is NE, EQ, GEU or LTU
against the signed minimum of the mode, but for GE or LT optimizing
it into NE (or EQ) against const0_rtx is wrong, those cases
are always true or always false (but the function doesn't have
a standardized way to tell callers the comparison is now unconditional).
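
The same point expressed in C, as a sketch with hypothetical function names
(the real transformation happens on RTL in combine): a signed comparison
against the minimum of the type is always true, so rewriting it into a
comparison against zero changes the result.

```c
#include <limits.h>

/* 1-bit signed field, analogous to a 1-bit sign_extract: its value
   range is [-1, 0].  */
struct flag { signed int bit : 1; };

/* GE against the signed minimum: true for every X.  */
int
ge_signed_min (int x)
{
  return x >= INT_MIN;
}

/* What the flawed simplification effectively produced instead.  */
int
wrongly_simplified (int x)
{
  return x != 0;
}
```

The two functions disagree for x == 0, which is exactly the miscompilation.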

The following patch just disables the optimization in that case.

2024-05-15  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/114902
	PR rtl-optimization/115092
	* combine.cc (simplify_compare_const): Don't optimize
	GE op0 SIGNED_MIN or LT op0 SIGNED_MIN into NE op0 const0_rtx or
	EQ op0 const0_rtx.

	* gcc.dg/pr114902.c: New test.
	* gcc.dg/pr115092.c: New test.

(cherry picked from commit 0b93a0ae153ef70a82ff63e67926a01fdab9956b)
2024-06-04 16:18:14 +02:00
Rainer Orth
e80523288c testsuite: gm2: Remove timeout overrides [PR114886]
A large number of gm2 tests are timing out even on current Solaris/SPARC
systems.  As detailed in the PR, the problem is that the gm2 testsuite
artificially lowers many timeouts way below the DejaGnu default of 300
seconds, often as short as 10 seconds.  The problem lies both in the
values (they may be appropriate for some targets, but too low for
others, especially under high load) and the fact that it uses absolute
values, overriding e.g. settings from a build-wide site.exp.

Therefore this patch removes all those overrides, restoring the
defaults.

Tested on sparc-sun-solaris2.11 (where all the previous timeouts are
gone) and i386-pc-solaris2.11.

2024-04-29  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

	gcc/testsuite:
	PR modula2/114886
	* lib/gm2.exp: Don't load timeout-dg.exp.
	Don't set gm2_previous_timeout.
	Don't call dg-timeout.
	(gm2_push_timeout, gm2_pop_timeout): Remove.
	(gm2_init): Don't call dg-timeout.
	* lib/gm2-torture.exp: Don't load timeout-dg.exp.
	Don't set gm2_previous_timeout.
	Don't call dg-timeout.
	(gm2_push_timeout, gm2_pop_timeout): Remove.

	* gm2/coroutines/pim/run/pass/coroutines-pim-run-pass.exp: Don't
	load timeout-dg.exp.
	Don't call gm2_push_timeout, gm2_pop_timeout.
	* gm2/examples/map/pass/examples-map-pass.exp: Don't call
	gm2_push_timeout, gm2_pop_timeout.
	* gm2/iso/run/pass/iso-run-pass.exp: Don't load timeout-dg.exp.
	Don't call gm2_push_timeout, gm2_pop_timeout.
	* gm2/pimlib/base/run/pass/pimlib-base-run-pass.exp: Don't load
	timeout-dg.exp.
	Don't call gm2_push_timeout, gm2_pop_timeout.
	* gm2/projects/iso/run/pass/halma/projects-iso-run-pass-halma.exp:
	Don't call gm2_push_timeout, gm2_pop_timeout.
	* gm2/switches/whole-program/pass/run/switches-whole-program-pass-run.exp:
	Don't load timeout-dg.exp.
	Don't call gm2_push_timeout, gm2_pop_timeout.

(cherry picked from commit aff63ac11099d100b6891f3bcc3dc6cbc4fad654)
2024-06-04 09:12:28 +02:00
Rainer Orth
d92b508dd1 libstdc++: Build libbacktrace and 19_diagnostics/stacktrace with -funwind-tables [PR111641]
Several of the 19_diagnostics/stacktrace tests FAIL on Solaris/SPARC (32
and 64-bit), Solaris/x86 (32-bit only), and several other targets:

FAIL: 19_diagnostics/stacktrace/current.cc  -std=gnu++23 execution test
FAIL: 19_diagnostics/stacktrace/current.cc  -std=gnu++26 execution test
FAIL: 19_diagnostics/stacktrace/entry.cc  -std=gnu++23 execution test
FAIL: 19_diagnostics/stacktrace/entry.cc  -std=gnu++26 execution test
FAIL: 19_diagnostics/stacktrace/output.cc  -std=gnu++23 execution test
FAIL: 19_diagnostics/stacktrace/output.cc  -std=gnu++26 execution test
FAIL: 19_diagnostics/stacktrace/stacktrace.cc  -std=gnu++23 execution test
FAIL: 19_diagnostics/stacktrace/stacktrace.cc  -std=gnu++26 execution test

As it turns out, both the copy of libbacktrace in libstdc++ and the
testcases proper need to be compiled with -funwind-tables, as is done for
libbacktrace itself.

This isn't an issue on Linux/x86_64 and Solaris/amd64 since 64-bit x86
always defaults to -funwind-tables.  32-bit x86 does, too, when
-fomit-frame-pointer is enabled as on Linux/i686, but unlike
Solaris/i386.

So this patch always enables the option both for the libbacktrace copy
and the testcases.

Tested on i386-pc-solaris2.11, sparc-sun-solaris2.11, and
x86_64-pc-linux-gnu.

2024-05-23  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

	libstdc++-v3:
	PR libstdc++/111641
	* src/libbacktrace/Makefile.am (AM_CFLAGS): Add -funwind-tables.
	* src/libbacktrace/Makefile.in: Regenerate.

	* testsuite/19_diagnostics/stacktrace/current.cc (dg-options): Add
	-funwind-tables.
	* testsuite/19_diagnostics/stacktrace/entry.cc: Likewise.
	* testsuite/19_diagnostics/stacktrace/hash.cc: Likewise.
	* testsuite/19_diagnostics/stacktrace/output.cc: Likewise.
	* testsuite/19_diagnostics/stacktrace/stacktrace.cc: Likewise.

(cherry picked from commit a99ebb88f8f25e76ebed5afc22e64fa77a2f0d3f)
2024-06-04 09:10:24 +02:00
GCC Administrator
b2bbf9890e Daily bump. 2024-06-04 00:23:13 +00:00
François Dumont
955202eb2c libstdc++: Fix -Wstringop-overflow warning coming from std::vector [PR109849]
libstdc++-v3/ChangeLog:

	PR libstdc++/109849
	* include/bits/vector.tcc
	(std::vector<>::_M_range_insert(iterator, _FwdIt, _FwdIt,
	forward_iterator_tag))[__cplusplus < 201103L]: Add __builtin_unreachable
	expression to tell the compiler that the allocated buffer is large enough to
	receive current elements plus the elements of the range to insert.

(cherry picked from commit 0426be454448f8cfb9db21f4f669426afb7b57c8)
2024-06-03 21:52:58 +02:00
Haochen Jiang
97474ba207 Add AVX10.1 target_clones support
Since AVX10 is the first major ISA introduced after AVX-512, we propose
to add target_clones support for it.

Although AVX10.1-256 won't cover the 512-bit part of AVX512F, this is not an
issue, since it is only used for priority and not for implication.

gcc/ChangeLog:

	* common/config/i386/i386-common.cc: Change Granite Rapids
	series CPU type to P_PROC_AVX10_1_512.
	* common/config/i386/i386-cpuinfo.h (enum feature_priority):
	Revise comment part. Add P_AVX10_1_256, P_AVX10_1_512,
	P_PROC_AVX10_1_512.
	* common/config/i386/i386-isas.h: Link to avx10.1-256, avx10.1-512.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx10_1-25.c: New test.
	* gcc.target/i386/avx10_1-26.c: Ditto.
2024-06-03 14:53:58 +08:00
GCC Administrator
1dbf796579 Daily bump. 2024-06-03 00:22:58 +00:00
GCC Administrator
a31676a5d0 Daily bump. 2024-06-02 00:22:52 +00:00
Georg-Johann Lay
d7f42794d9 AVR: target/115317 - Make isinf(-Inf) return -1.
PR target/115317
libgcc/config/avr/libf7/
	* libf7-asm.sx (__isinf): Map -Inf to -1.

gcc/testsuite/
	* gcc.target/avr/torture/pr115317-isinf.c: New test.
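
For reference, the sign-aware behaviour can be modelled in portable C as
follows.  isinf_sign is a hypothetical name for this sketch; it is not the
libf7 routine.

```c
#include <math.h>

/* Returns -1 for -Inf, 1 for +Inf and 0 for everything else.  */
int
isinf_sign (double x)
{
  if (!isinf (x))
    return 0;
  return signbit (x) ? -1 : 1;
}
```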

(cherry picked from commit f12454278dc725fec3520a5d870e967d79292ee6)
2024-06-01 12:52:26 +02:00
Jonathan Wakely
2f097c0b3f libstdc++: Replace link to gcc-4.3.2 docs in manual [PR115269]
Link to the docs for GCC trunk instead. For the release branches, the
link should be to the docs for the appropriate release branch.

Also replace the incomplete/outdated list of explicit -std options with
a single entry for the -std option.

libstdc++-v3/ChangeLog:

	PR libstdc++/115269
	* doc/xml/manual/using.xml: Replace link to gcc-4.3.2 docs.
	Replace list of -std=... options with a single entry for -std.
	* doc/html/manual/using.html: Regenerate.

(cherry picked from commit b460ede64f9471589822831e04eecff4a3dbecf2)
2024-06-01 11:01:37 +01:00
Georg-Johann Lay
9d08c55f7c AVR: tree-optimization/115307 - Work around isinf bloat from early passes.
PR tree-optimization/115307
gcc/
	* config/avr/avr.md (SFDF): New mode iterator.
	(isinf<mode>2) [sf, df]: New expanders.

gcc/testsuite/
	* gcc.target/avr/torture/pr115307-isinf.c: New test.

(cherry picked from commit fabb545026f714b7d1cbe586f4c5bbf6430bdde3)
2024-06-01 10:55:49 +02:00
GCC Administrator
5ca4e161b6 Daily bump. 2024-06-01 00:22:30 +00:00
Uros Bizjak
ec92744de5 alpha: Fix invalid RTX in divmodsi insn patterns [PR115297]
any_divmod instructions are modelled with invalid RTX:

  [(set (match_operand:DI 0 "register_operand" "=c")
        (sign_extend:DI (match_operator:SI 3 "divmod_operator"
                        [(match_operand:DI 1 "register_operand" "a")
                         (match_operand:DI 2 "register_operand" "b")])))
   (clobber (reg:DI 23))
   (clobber (reg:DI 28))]

where SImode divmod_operator (div,mod,udiv,umod) has DImode operands.

Wrap the input operands with truncate:SI to make machine modes consistent.

	PR target/115297

gcc/ChangeLog:

	* config/alpha/alpha.md (<any_divmod:code>si3): Wrap DImode
	operands 3 and 4 with truncate:SI RTX.
	(*divmodsi_internal_er): Ditto for operands 1 and 2.
	(*divmodsi_internal_er_1): Ditto.
	(*divmodsi_internal): Ditto.
	* config/alpha/constraints.md ("b"): Correct register
	number in the description.

gcc/testsuite/ChangeLog:

	* gcc.target/alpha/pr115297.c: New test.

(cherry picked from commit 0ac802064c2a018cf166c37841697e867de65a95)
2024-05-31 15:53:15 +02:00
Richard Sandiford
36575f5fe4 vect: Fix access size alignment assumption [PR115192]
create_intersect_range_checks checks whether two access ranges
a and b are alias-free using something equivalent to:

  end_a <= start_b || end_b <= start_a

It has two ways of doing this: a "vanilla" way that calculates
the exact exclusive end pointers, and another way that uses the
last inclusive aligned pointers (and changes the comparisons
accordingly).  The comment for the latter is:

      /* Calculate the minimum alignment shared by all four pointers,
	 then arrange for this alignment to be subtracted from the
	 exclusive maximum values to get inclusive maximum values.
	 This "- min_align" is cumulative with a "+ access_size"
	 in the calculation of the maximum values.  In the best
	 (and common) case, the two cancel each other out, leaving
	 us with an inclusive bound based only on seg_len.  In the
	 worst case we're simply adding a smaller number than before.

The problem is that the associated code implicitly assumed that the
access size was a multiple of the pointer alignment, and so the
alignment could be carried over to the exclusive end pointer.
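
A simplified model of the two checks, working on byte offsets.  The function
names and the exact form of the inclusive variant are assumptions made for
this sketch, not the code in tree-data-ref.cc.

```c
#include <stdbool.h>
#include <stddef.h>

/* Exact check with exclusive end offsets: ranges are [start, end).  */
bool
disjoint_exclusive (size_t start_a, size_t end_a,
                    size_t start_b, size_t end_b)
{
  return end_a <= start_b || end_b <= start_a;
}

/* Inclusive variant: subtract ALIGN from each exclusive end and use a
   strict comparison.  This matches the exact check only if all four
   offsets are multiples of ALIGN.  Assumes both ends are >= ALIGN.  */
bool
disjoint_inclusive (size_t start_a, size_t end_a,
                    size_t start_b, size_t end_b, size_t align)
{
  return end_a - align < start_b || end_b - align < start_a;
}
```

With 8-byte alignment and a 12-byte access, a = [0, 12) and b = [8, 20)
overlap, but the inclusive variant computes 12 - 8 = 4 < 8 and reports them
as disjoint, because the end of a is not a multiple of the alignment.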

The testcase started failing after g:9fa5b473b5b8e289b6542
because that commit improved the alignment information for
the accesses.

gcc/
	PR tree-optimization/115192
	* tree-data-ref.cc (create_intersect_range_checks): Take the
	alignment of the access sizes into account.

gcc/testsuite/
	PR tree-optimization/115192
	* gcc.dg/vect/pr115192.c: New test.

(cherry picked from commit a0fe4fb1c8d7804515845dd5d2a814b3c7a1ccba)
2024-05-31 08:22:55 +01:00
Hongyu Wang
cd161b335c i386: Fix ix86_option override after change [PR 113719]
In ix86_override_options_after_change, calls to ix86_default_align
and ix86_recompute_optlev_based_flags will cause mismatched target
opt_set when doing cl_optimization_restore. Move them back to
ix86_option_override_internal to solve the issue.

gcc/ChangeLog:

	PR target/113719
	* config/i386/i386-options.cc (ix86_override_options_after_change):
	Remove call to ix86_default_align and
	ix86_recompute_optlev_based_flags.
	(ix86_option_override_internal): Call ix86_default_align and
	ix86_recompute_optlev_based_flags.

(cherry picked from commit 499d00127d39ba894b0f7216d73660b380bdc325)
2024-05-31 11:09:40 +08:00
GCC Administrator
06333a181d Daily bump. 2024-05-31 00:23:20 +00:00
YunQiang Su
201cfa7255 MIPS16: Mark $2/$3 as clobbered if GP is used
PR Target/84790.
The gp init sequence
        li      $2,%hi(_gp_disp)
        addiu   $3,$pc,%lo(_gp_disp)
        sll     $2,16
        addu    $2,$3
is generated directly in `mips_output_function_prologue`, and does
not appear in the RTL.

So the IRA/IPA passes are not aware that $2/$3 have been clobbered,
and they may use these registers across a (local) function call.

Let's mark $2/$3 as clobbered both:
  - Just after the UNSPEC_GP RTL of a function;
  - Just after a function call.

Reported-by: Matthias Schiffer <mschiffer@universe-factory.net>
Origin-Patch-by: Felix Fietkau <nbd@nbd.name>.

gcc
	* config/mips/mips.cc (mips16_gp_pseudo_reg): Mark
	MIPS16_PIC_TEMP and MIPS_PROLOGUE_TEMP clobbered.
	(mips_emit_call_insn): Mark MIPS16_PIC_TEMP and
	MIPS_PROLOGUE_TEMP clobbered if MIPS16 and CALL_CLOBBERED_GP.

(cherry picked from commit 915440eed21de367cb41857afb5273aff5bcb737)
2024-05-30 09:48:08 +08:00
GCC Administrator
8f6c56cda5 Daily bump. 2024-05-30 00:23:02 +00:00
Eric Botcazou
fba2843b9b Fix link failure of GNAT tools on 32-bit SPARC/Linux
There is an incorrect binding to the 64-bit compare-and-exchange builtin.

gcc/ada/
	PR ada/115270
	* Makefile.rtl (PowerPC/Linux): Use libgnat/s-atopri__32.ads for
	the 32-bit library.
	(SPARC/Linux): Likewise.
2024-05-29 12:11:22 +02:00
Richard Biener
90a447677a tree-optimization/115149 - VOP live and missing PHIs
The following fixes a bug in vop-live get_live_in, which was using
NULL to indicate the first processed edge but at the same time
using it for the case where the live-in virtual operand cannot be computed.
The fix avoids sinking a load to a place where
we'd have to insert virtual PHIs to make the virtual operand SSA
web OK.

	PR tree-optimization/115149
	* tree-ssa-live.cc (virtual_operand_live::get_live_in):
	Explicitly track the first processed edge.

	* gcc.dg/pr115149.c: New testcase.

(cherry picked from commit ec9b8bafe20755d13ab9a1b834b5da79ae972c0e)
2024-05-29 08:29:25 +02:00
Richard Biener
2a1fdd5fd0 tree-optimization/115197 - fix ICE w/ constant in LC PHI and loop distribution
Forgot a check for an SSA name before trying to replace a PHI arg with
its current definition.

	PR tree-optimization/115197
	* tree-loop-distribution.cc (copy_loop_before): Constant PHI
	args remain the same.

	* gcc.dg/pr115197.c: New testcase.

(cherry picked from commit 2b2476d4d18c92b8aba3567ebccd2100c2f7c258)
2024-05-29 08:29:25 +02:00
Richard Biener
9e971c671d tree-optimization/114921 - _Float16 -> __bf16 isn't noop fixup
The following further strengthens the check for which convert expressions
we allow to vectorize as a simple copy by resorting to
tree_nop_conversion_p on the vector components.

	PR tree-optimization/114921
	* tree-vect-stmts.cc (vectorizable_assignment): Use
	tree_nop_conversion_p to identify converts we can vectorize
	with a simple assignment.

(cherry picked from commit d0d6dcc019cd32eebf85d625f56e0f7573938319)
2024-05-29 08:29:25 +02:00
liuhongt
b4d4ece044 Align tight&hot loop without considering max skipping bytes.
When a hot loop is small enough to fit into one cacheline, we should align
the loop with ceil_log2 (loop_size) without considering the maximum
skip bytes. It will help code prefetch.
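
As a small illustration of the arithmetic (ceil_log2_u is a hypothetical
stand-alone helper, not GCC's own ceil_log2), the alignment exponent for a
loop of a given size in bytes can be computed like this:

```c
/* Smallest N such that (1u << N) >= X.  Assumes 1 <= X <= 2^31, which
   is far more than any loop that fits into a cache line.  */
unsigned
ceil_log2_u (unsigned x)
{
  unsigned n = 0;
  while ((1u << n) < x)
    n++;
  return n;
}
```

A 24-byte loop gives an exponent of 5, that is, 32-byte alignment, so the
whole loop sits inside a single cache line, assuming 64-byte lines.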

gcc/ChangeLog:

	* config/i386/i386.cc (ix86_avoid_jump_mispredicts): Change
	gen_pad to gen_max_skip_align.
	(ix86_align_loops): New function.
	(ix86_reorg): Call ix86_align_loops.
	* config/i386/i386.md (pad): Rename to ..
	(max_skip_align): .. this, and accept 2 operands for align and
	skip.
2024-05-29 11:12:51 +08:00
Haochen Jiang
80600352d1 Adjust generic loop alignment from 16:11:8 to 16 for Intel processors
Previously, we used 16:11:8 in generic tune for Intel processors, which
led to cross-cache-line issues and resulted in random performance
penalties, from commit to commit, in benchmarks with small loops.

After changing to always aligning to 16 bytes, the issue is largely
solved.

gcc/ChangeLog:

	* config/i386/x86-tune-costs.h (generic_cost): Change from
	16:11:8 to 16.
2024-05-29 11:12:37 +08:00
GCC Administrator
e2b66da9bd Daily bump. 2024-05-29 00:23:15 +00:00
Tobias Burnus
dbeb3d127d Fortran: Fix SHAPE for zero-size arrays
PR fortran/115150

gcc/fortran/ChangeLog:

	* trans-intrinsic.cc (gfc_conv_intrinsic_bound): Fix SHAPE
	for zero-size arrays

gcc/testsuite/ChangeLog:

	* gfortran.dg/shape_12.f90: New test.

(cherry picked from commit b701306a9b38bd74cdc26c7ece5add22f2203b56)
2024-05-28 12:59:46 +02:00
Jonathan Wakely
89dff1488e libstdc++: Guard use of sized deallocation [PR114940]
Clang does not enable -fsized-deallocation by default, which means it
can't compile our <stacktrace> and <generator> headers.

Make the __cpp_lib_generator macro depend on the compiler-defined
__cpp_sized_deallocation macro, and change <stacktrace> to use unsized
deallocation when __cpp_sized_deallocation isn't defined.
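
The general pattern can be sketched as below. This is an illustration only and not the libstdc++ code: acquire_block and release_block are hypothetical names, and the only thing taken from the standard is the __cpp_sized_deallocation feature-test macro.

```cpp
#include <cstddef>
#include <new>

// Hypothetical allocation helper paired with release_block below.
inline void* acquire_block(std::size_t bytes)
{
  return ::operator new(bytes);
}

// Hypothetical deallocation helper: use the sized form only when the
// compiler advertises it, otherwise fall back to unsized deallocation.
inline void release_block(void* p, std::size_t bytes) noexcept
{
#ifdef __cpp_sized_deallocation
  ::operator delete(p, bytes);
#else
  (void) bytes;            // size is unused in the fallback path
  ::operator delete(p);
#endif
}
```

A caller would pair the two, for example `void* p = acquire_block(64); release_block(p, 64);`, and the same source then compiles whether or not sized deallocation is enabled.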

libstdc++-v3/ChangeLog:

	PR libstdc++/114940
	* include/bits/version.def (generator): Depend on
	__cpp_sized_deallocation.
	* include/bits/version.h: Regenerate.
	* include/std/stacktrace (_GLIBCXX_SIZED_DELETE): New macro.
	(basic_stacktrace::_Impl::_M_deallocate): Use it.

(cherry picked from commit b2fdd508d7e63158e9d2a6dd04f901d02900def3)
2024-05-28 10:19:44 +01:00
Xi Ruoyao
e78980fdd5 LoongArch: Guard REGNO with REG_P in loongarch_expand_conditional_move [PR115169]
gcc/ChangeLog:

	PR target/115169
	* config/loongarch/loongarch.cc
	(loongarch_expand_conditional_move): Guard REGNO with REG_P.

(cherry picked from commit ded91d857772c0183cc342cdc54d9128f6c57fa2)
2024-05-28 09:34:32 +08:00
GCC Administrator
133da68a4c Daily bump. 2024-05-28 00:21:29 +00:00
Richard Biener
4790076933 tree-optimization/115232 - demangle failure during -Waccess
For the following testcase we fail to demangle
_ZZN5OuterIvE6methodIvEEvvQ3cstITL0__EEN5InnernwEm and
_ZZN5OuterIvE6methodIvEEvvQ3cstITL0__EEN5InnerdlEPv and in turn end
up building NULL references.  The following puts in a safeguard for
failed demangling into -Waccess.

	PR tree-optimization/115232
	* gimple-ssa-warn-access.cc (new_delete_mismatch_p): Handle
	failure to demangle gracefully.

	* g++.dg/pr115232.C: New testcase.

(cherry picked from commit 311d7f5c17b8969c7ed8e4f23178d6ec4752e33f)
2024-05-27 11:24:48 +02:00
GCC Administrator
0cae44a288 Daily bump. 2024-05-27 00:21:39 +00:00
GCC Administrator
2e0f832cf7 Daily bump. 2024-05-26 00:22:08 +00:00
Harald Anlauf
b0b21d5bdf Fortran: fix bounds check for assignment, class component [PR86100]
gcc/fortran/ChangeLog:

	PR fortran/86100
	* trans-array.cc (gfc_conv_ss_startstride): Use abridged_ref_name
	to generate a more user-friendly name for bounds-check messages.
	* trans-expr.cc (gfc_copy_class_to_class): Fix bounds check for
	rank>1 by looping over the dimensions.

gcc/testsuite/ChangeLog:

	PR fortran/86100
	* gfortran.dg/bounds_check_25.f90: New test.

(cherry picked from commit 93765736815a049e14d985b758a213cfe60c1e1c)
2024-05-25 20:07:45 +02:00
GCC Administrator
cab894172d Daily bump. 2024-05-25 00:22:08 +00:00
Jason Merrill
9031c02782 c++: deleting array temporary [PR115187]
Decaying the array temporary to a pointer and then deleting that crashes in
verify_gimple_stmt, because the TARGET_EXPR is first evaluated inside the
TRY_FINALLY_EXPR, but the cleanup point is outside.  Fixed by using
get_target_expr instead of save_expr.

I also adjust the stabilize_expr comment to prevent me from again thinking
it's a suitable replacement.

	PR c++/115187

gcc/cp/ChangeLog:

	* init.cc (build_delete): Use get_target_expr instead of save_expr.
	* tree.cc (stabilize_expr): Update comment.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp1z/array-prvalue3.C: New test.
2024-05-24 11:15:13 -04:00
Nathaniel Shead
782ad2033e c++: Propagate using decls from partitions [PR114868]
The modules code currently neglects to set OVL_USING_P on the dependency
created for a using-decl, which causes it not to remember that the
OVL_EXPORT_P flag had been set on it when emitted from the primary
interface unit. This patch ensures that it occurs.

	PR c++/114868

gcc/cp/ChangeLog:

	* module.cc (depset::hash::add_binding_entity): Propagate
	OVL_USING_P for using-declarations.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/using-15_a.C: New test.
	* g++.dg/modules/using-15_b.C: New test.
	* g++.dg/modules/using-15_c.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
(cherry picked from commit 0d0215b10dbbe39d655ceda4af283f288ec7680c)
2024-05-25 00:32:35 +10:00
Nathaniel Shead
fd6fd88b1a c++: Fix instantiation of imported temploid friends [PR114275]
This patch fixes a number of issues with the handling of temploid friend
declarations.

The primary issue is that instantiations of friend declarations should
attach the declaration to the same module as the befriending class, by
[module.unit] p7.1 and [temp.friend] p2; this could be a different
module from the current TU, and so needs special handling.

The other main issue here is that we can't assume that just because name
lookup didn't find a definition for a hidden class template, that it
doesn't exist at all: it could be a non-exported entity that we've
nevertheless streamed in from an imported module.  We need to ensure
that when instantiating template friend classes we return the same
TEMPLATE_DECL that we got from our imports, otherwise we will get later
issues with 'duplicate_decls' (rightfully) complaining that they're
different when trying to merge.

This doesn't appear necessary for function templates due to the existing
name lookup handling already finding these hidden declarations.

(cherry-picked from commits b5f6a56940e70838a07e885de03a92e2bd64674a and
ec2365e07537e8b17745d75c28a2b45bf33be119)

	PR c++/105320
	PR c++/114275

gcc/cp/ChangeLog:

	* cp-tree.h (propagate_defining_module): Declare.
	(remove_defining_module): Declare.
	(lookup_imported_hidden_friend): Declare.
	* decl.cc (duplicate_decls): Also check if hidden decls can be
	redeclared in this module. Call remove_defining_module on
	to-be-freed newdecl.
	* module.cc (imported_temploid_friends): New.
	(init_modules): Initialize it.
	(trees_out::decl_value): Write it; don't consider imported
	temploid friends as attached to a module.
	(trees_in::decl_value): Read it for non-discarded decls.
	(get_originating_module_decl): Follow the owning decl for an
	imported temploid friend.
	(propagate_defining_module): New.
	(remove_defining_module): New.
	* name-lookup.cc (get_mergeable_namespace_binding): New.
	(lookup_imported_hidden_friend): New.
	* pt.cc (tsubst_friend_function): Propagate defining module for
	new friend functions.
	(tsubst_friend_class): Lookup imported hidden friends.  Check
	for valid module attachment of existing names.  Propagate
	defining module for new classes.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/tpl-friend-10_a.C: New test.
	* g++.dg/modules/tpl-friend-10_b.C: New test.
	* g++.dg/modules/tpl-friend-10_c.C: New test.
	* g++.dg/modules/tpl-friend-10_d.C: New test.
	* g++.dg/modules/tpl-friend-11_a.C: New test.
	* g++.dg/modules/tpl-friend-11_b.C: New test.
	* g++.dg/modules/tpl-friend-12_a.C: New test.
	* g++.dg/modules/tpl-friend-12_b.C: New test.
	* g++.dg/modules/tpl-friend-12_c.C: New test.
	* g++.dg/modules/tpl-friend-12_d.C: New test.
	* g++.dg/modules/tpl-friend-12_e.C: New test.
	* g++.dg/modules/tpl-friend-12_f.C: New test.
	* g++.dg/modules/tpl-friend-13_a.C: New test.
	* g++.dg/modules/tpl-friend-13_b.C: New test.
	* g++.dg/modules/tpl-friend-13_c.C: New test.
	* g++.dg/modules/tpl-friend-13_d.C: New test.
	* g++.dg/modules/tpl-friend-13_e.C: New test.
	* g++.dg/modules/tpl-friend-13_f.C: New test.
	* g++.dg/modules/tpl-friend-13_g.C: New test.
	* g++.dg/modules/tpl-friend-14_a.C: New test.
	* g++.dg/modules/tpl-friend-14_b.C: New test.
	* g++.dg/modules/tpl-friend-14_c.C: New test.
	* g++.dg/modules/tpl-friend-14_d.C: New test.
	* g++.dg/modules/tpl-friend-9.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
2024-05-25 00:16:59 +10:00
Nathaniel Shead
557cddcc71 c++: Standardise errors for module_may_redeclare
Currently different places calling 'module_may_redeclare' all emit very
similar but slightly different error messages, and handle different
kinds of declarations differently.  This patch makes the function
emit its own error messages so that they're all in one place, and
prepares it for use with temploid friends.

gcc/cp/ChangeLog:

	* cp-tree.h (module_may_redeclare): Add default parameter.
	* decl.cc (duplicate_decls): Don't emit errors for failed
	module_may_redeclare.
	(xref_tag): Likewise.
	(start_enum): Likewise.
	* semantics.cc (begin_class_definition): Likewise.
	* module.cc (module_may_redeclare): Clean up logic. Emit error
	messages on failure.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/enum-12.C: Update error message.
	* g++.dg/modules/friend-5_b.C: Likewise.
	* g++.dg/modules/shadow-1_b.C: Likewise.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit 2faf040335f9b49c33ba6d40cf317920f27ce431)
2024-05-25 00:16:59 +10:00