210451 commits

Author SHA1 Message Date
Robin Dapp
937713a523 RISC-V: Do not allow v0 as dest when merging [PR115068].
This patch splits the vfw...wf pattern so we do not emit e.g. vfwadd.wf
v0,v8,fa5,v0.t anymore.

gcc/ChangeLog:

	PR target/115068

	* config/riscv/vector.md:  Split vfw<insn>.wf pattern.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/pr115068-run.c: New test.
	* gcc.target/riscv/rvv/base/pr115068.c: New test.

(cherry picked from commit a2fd0812a54cf51520f15e900df4cfb5874b75ed)
2024-07-19 13:55:50 +08:00
Fangrui Song
3a7e796b48 RISC-V: Add -X to link spec
--discard-locals (-X) instructs the linker to remove local .L* symbols,
which occur a lot due to label differences for linker relaxation. The
arm port has a similar need and passes -X to ld.

In contrast, the RISC-V port does not pass -X to ld and relies on the
default --discard-locals in GNU ld's riscv port. The arm way is more
conventional (compiler driver instead of the linker customizes the
default linker behavior) and works with lld.

gcc/ChangeLog:

	* config/riscv/elf.h (LINK_SPEC): Add -X.
	* config/riscv/freebsd.h (LINK_SPEC): Add -X.
	* config/riscv/linux.h (LINK_SPEC): Add -X.

(cherry picked from commit 50c218e3ffe57860591a987ecf44fcc0abb31f2c)
2024-07-19 13:55:50 +08:00
Christoph Müllner
92003fad99 RISC-V: Fix parsing of Zic* extensions
The extension parsing table entries for a range of Zic* extensions
does not match the mask definition in riscv.opt.
This results in broken TARGET_ZIC* macros, because the values of
riscv_zi_subext and riscv_zicmo_subext are set wrong.

This patch fixes this by moving Zic64b into riscv_zicmo_subext
and all other affected Zic* extensions to riscv_zi_subext.

gcc/ChangeLog:

	* common/config/riscv/riscv-common.cc: Move ziccamoa, ziccif,
	zicclsm, and ziccrse into riscv_zi_subext.
	* config/riscv/riscv.opt: Define MASK_ZIC64B for
	riscv_zicmo_subext.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
(cherry picked from commit 285300eb928b171236e895f28c960ad02dcb0d67)
2024-07-19 13:55:49 +08:00
Pan Li
68ef0c321a RISC-V: Bugfix ICE for RVV intrinsic vfw on _Float16 scalar
For the vfw vx format RVV intrinsic, the scalar type _Float16 also
requires the zvfh extension.  Unfortunately,  we only check the
vector tree type and miss the scalar _Float16 type checking.  For
example:

vfloat32mf2_t test_vfwsub_wf_f32mf2(vfloat32mf2_t vs2, _Float16 rs1, size_t vl)
{
  return __riscv_vfwsub_wf_f32mf2(vs2, rs1, vl);
}

It should report an error message saying that the zvfh extension is
required instead of an ICE for an unrecognizable insn.

This patch adds that validation for _Float16 in the RVV intrinsic API.
It reports an error like the one below when zvfh is not enabled.

error: built-in function '__riscv_vfwsub_wf_f32mf2(vs2,  rs1,  vl)'
  requires the zvfhmin or zvfh ISA extension

Passed the full rv64gcv regression tests, including C/C++/Fortran.

	PR target/114988

gcc/ChangeLog:

	* config/riscv/riscv-vector-builtins.cc
	(validate_instance_type_required_extensions): New func impl to
	validate the intrinsic func type ops.
	(expand_builtin): Validate instance type before expand.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/pr114988-1.c: New test.
	* gcc.target/riscv/rvv/base/pr114988-2.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
(cherry picked from commit 41b3cf262e61aee9d26380f1c820e0eaae740f50)
2024-07-19 13:55:49 +08:00
Liao Shihua
c38dbfc1ce RISC-V: Fix missing boolean_expression in zmmul extension
Update v1->v2
    Add testcase for this patch.

Missing boolean_expression TARGET_ZMMUL in riscv_rtx_costs() causes different instructions when
multiplying an integer by a constant. ( https://github.com/riscv-collab/riscv-gnu-toolchain/issues/1482 )

int foo(int *ib) {
    *ib = *ib * 33938;
    return 0;
}

rv64im:
        lw      a4,0(a1)
        li      a5,32768
        addiw   a5,a5,1170
        mulw    a5,a5,a4
        sw      a5,0(a1)
        ret

rv64i_zmmul:
        lw      a4,0(a1)
        slliw   a5,a4,5
        addw    a5,a5,a4
        slliw   a5,a5,3
        addw    a5,a5,a4
        slliw   a5,a5,3
        addw    a5,a5,a4
        slliw   a5,a5,3
        addw    a5,a5,a4
        slliw   a5,a5,1
        sw      a5,0(a1)
        ret
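
The two listings above compute the same product. As a minimal sketch (the function names are hypothetical, and unsigned arithmetic is assumed so that wraparound is well defined), the shift/add chain of the rv64i_zmmul listing can be checked against a plain multiply:

```c
#include <assert.h>
#include <stdint.h>

/* Multiply by 33938 with the shift/add steps of the rv64i_zmmul listing. */
static uint32_t mul33938_shift_add(uint32_t x)
{
    uint32_t t = (x << 5) + x;   /* 33 * x    */
    t = (t << 3) + x;            /* 265 * x   */
    t = (t << 3) + x;            /* 2121 * x  */
    t = (t << 3) + x;            /* 16969 * x */
    return t << 1;               /* 33938 * x */
}

/* Single-multiply form, as in the rv64im listing (32768 + 1170 = 33938). */
static uint32_t mul33938_direct(uint32_t x)
{
    return x * 33938u;
}
```

Both functions agree for every 32-bit input. Which form the compiler picks depends on the relative costs reported for multiplication versus shifts and adds.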

Fixed.

gcc/ChangeLog:

	* config/riscv/riscv.cc (riscv_rtx_costs): Add TARGET_ZMMUL.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/zmmul-3.c: New test.

(cherry picked from commit 06bb125521dec5648b725ddee4345b00decfdc77)
2024-07-19 13:55:49 +08:00
Pan Li
4db38759dc RISC-V: Bugfix vec_extract v mode iterator restriction mismatch
We have a vec_extract pattern which takes ZVFHMIN as the mode
iterator of the V mode, a.k.a. the VF_ZVFHMIN iterator.  But it will
expand to the pred_extract_first pattern which takes ZVFH as the mode
iterator of the V mode, a.k.a. VF.  The mismatch will result in an ICE
similar to the one below:

insn 30 29 31 2 (set (reg:HF 156 [ _2 ])
        (unspec:HF [
                (vec_select:HF (reg:RVVMF2HF 134 [ _1 ])
                    (parallel [
                            (const_int 0 [0])
                        ]))
                (reg:SI 67 vtype)
            ] UNSPEC_VPREDICATE)) "compress_run-2.c":22:3 -1
     (nil))
during RTL pass: vregs
compress_run-2.c:25:1: internal compiler error: in extract_insn, at
recog.cc:2812
0xb3bc47 _fatal_insn(char const*, rtx_def const*, char const*, int, char
const*)
        ../../../gcc/gcc/rtl-error.cc:108
0xb3bc69 _fatal_insn_not_found(rtx_def const*, char const*, int, char
const*)
        ../../../gcc/gcc/rtl-error.cc:116
0xb3a545 extract_insn(rtx_insn*)
        ../../../gcc/gcc/recog.cc:2812
0x1010e9e instantiate_virtual_regs_in_insn
        ../../../gcc/gcc/function.cc:1612
0x1010e9e instantiate_virtual_regs
        ../../../gcc/gcc/function.cc:1995
0x1010e9e execute
        ../../../gcc/gcc/function.cc:2042

The test suites below passed for this patch.
1. The full rv64gcv regression test.
2. The rv64gcv build with glibc.

There may be other similar issues caused by this mismatch; we will take care
of them with test cases one by one.

	PR target/115456

gcc/ChangeLog:

	* config/riscv/vector-iterators.md: Leverage V_ZVFH instead of V
	which contains the VF_ZVFHMIN for alignment.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/pr115456-2.c: New test.
	* gcc.target/riscv/rvv/base/pr115456-3.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
(cherry picked from commit c2c61d8902dbda017b1647252d17bce141493433)
2024-07-19 13:55:49 +08:00
Pan Li
87346ed74c RISC-V: Bugfix vec_extract vls mode iterator restriction mismatch
We have a vec_extract pattern which takes ZVFHMIN as the mode
iterator of the VLS mode, a.k.a. V_VLS.  But it will expand to the
pred_extract_first pattern which takes ZVFH as the mode
iterator of the VLS mode, a.k.a. V_VLSF.  The mismatch will
result in an ICE similar to the one below:

error: unrecognizable insn:
   27 | }
      | ^
(insn 19 18 20 2 (set (reg:HF 150 [ _13 ])
        (unspec:HF [
                (vec_select:HF (reg:V4HF 134 [ _1 ])
                    (parallel [
                            (const_int 0 [0])
                        ]))
                (reg:SI 67 vtype)
            ] UNSPEC_VPREDICATE)) "compress_run-2.c":24:5 -1
     (nil))
during RTL pass: vregs
compress_run-2.c:27:1: internal compiler error: in extract_insn, at
recog.cc:2812
0x1a627ef _fatal_insn(char const*, rtx_def const*, char const*, int,
char const*)
        ../../../gcc/gcc/rtl-error.cc:108
0x1a62834 _fatal_insn_not_found(rtx_def const*, char const*, int, char
const*)
        ../../../gcc/gcc/rtl-error.cc:116
0x1a0f356 extract_insn(rtx_insn*)
        ../../../gcc/gcc/recog.cc:2812
0x159ee61 instantiate_virtual_regs_in_insn
        ../../../gcc/gcc/function.cc:1612
0x15a04aa instantiate_virtual_regs
        ../../../gcc/gcc/function.cc:1995
0x15a058e execute
        ../../../gcc/gcc/function.cc:2042

This patch fixes the issue by aligning the mode
iterator restriction to ZVFH.

The test suites below passed for this patch.
1. The full rv64gcv regression test.
2. The rv64gcv build with glibc.

	PR target/115456

gcc/ChangeLog:

	* config/riscv/autovec.md: Take ZVFH mode iterator instead of
	the ZVFHMIN for the alignment.
	* config/riscv/vector-iterators.md: Add 2 new iterator
	V_VLS_ZVFH and VLS_ZVFH.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/pr115456-1.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
(cherry picked from commit 3dac1049c1211e6d06c2536b86445a6334c3866d)
2024-07-19 13:55:49 +08:00
Artemiy Volkov
c32995c445 [PATCH] RISC-V: Fix unrecognizable pattern in riscv_expand_conditional_move()
Presently, the code fragment:

int x[5];

void
d(int a, int b, int c) {
  for (int i = 0; i < 5; i++)
    x[i] = (a != b) ? c : a;
}

causes an ICE when compiled with -O2 -march=rv32i_zicond:

test.c: In function 'd':
test.c: error: unrecognizable insn:
   11 | }
      | ^
(insn 8 5 9 2 (set (reg:SI 139 [ iftmp.0_2 ])
        (if_then_else:SI (ne:SI (reg/v:SI 136 [ a ])
                (reg/v:SI 137 [ b ]))
            (reg/v:SI 136 [ a ])
            (reg/v:SI 138 [ c ]))) -1
     (nil))
during RTL pass: vregs

This happens because, as part of one of the optimizations in
riscv_expand_conditional_move(), an if_then_else is generated with both
comparands being register operands, resulting in an unmatchable insn since
Zicond patterns require constant 0 as the second comparand.  Fix this by adding
an extra check before performing this optimization.

The code snippet mentioned above is also included in this patch as a new Zicond
testcase.
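
To illustrate why the comparison has to be against zero, here is a small C model of the two Zicond instructions and of the select from the testcase. It is a sketch only: the helper names are hypothetical and this is not the sequence GCC emits. The register-register comparison is first reduced to a value that is zero exactly when a == b.

```c
#include <assert.h>
#include <stdint.h>

/* czero.eqz rd, rs1, rs2: rd = (rs2 == 0) ? 0 : rs1 */
static int32_t czero_eqz(int32_t rs1, int32_t rs2)
{
    return rs2 == 0 ? 0 : rs1;
}

/* czero.nez rd, rs1, rs2: rd = (rs2 != 0) ? 0 : rs1 */
static int32_t czero_nez(int32_t rs1, int32_t rs2)
{
    return rs2 != 0 ? 0 : rs1;
}

/* (a != b) ? c : a, built only from comparisons against zero. */
static int32_t select_ne(int32_t a, int32_t b, int32_t c)
{
    int32_t diff = a ^ b;   /* zero if and only if a == b */
    return czero_eqz(c, diff) | czero_nez(a, diff);
}
```

Exactly one of the two czero results is non-zeroed for any input, so the OR yields c when a != b and a otherwise.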

gcc/ChangeLog:

	* config/riscv/riscv.cc (riscv_expand_conditional_move): Add a
	CONST0_RTX check.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/zicond-ice-5.c: New test.

(cherry picked from commit eb647daa87b466d0a71246fad302cd81bfce9be5)
2024-07-19 13:55:49 +08:00
Robin Dapp
2d7dda8473 RISC-V: Use tu policy for first-element vec_set [PR115725].
This patch changes the tail policy for vmv.s.x from ta to tu.
By default the bug does not show up with qemu because qemu's
current vmv.s.x implementation always uses the tail-undisturbed
policy.  With a local qemu version that overwrites the tail
with ones when the tail-agnostic policy is specified, the bug
shows.
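
The difference between the two policies can be modelled in a few lines of C. This is a toy model under stated assumptions: the enum and function names are hypothetical, and the tail-agnostic case models an implementation that chooses to overwrite tail elements with all ones, which the policy permits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum tail_policy { TAIL_UNDISTURBED, TAIL_AGNOSTIC_ALL_ONES };

/* Model of moving a scalar into element 0 of a vector register.
   Elements 1..nelems-1 are tail elements for this operation. */
static void scalar_move_to_elem0(int32_t *vd, size_t nelems, int32_t x,
                                 enum tail_policy policy)
{
    if (nelems == 0)
        return;
    vd[0] = x;
    if (policy == TAIL_AGNOSTIC_ALL_ONES)
        for (size_t i = 1; i < nelems; i++)
            vd[i] = -1;   /* all bits set */
}
```

A vec_set of the first element must keep the remaining elements, so only the tail-undisturbed behaviour is correct for it on every implementation.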

gcc/ChangeLog:

	* config/riscv/autovec.md: Add TU policy.
	* config/riscv/riscv-protos.h (enum insn_type): Define
	SCALAR_MOVE_MERGED_OP_TU.

gcc/testsuite/ChangeLog:

	PR target/115725

	* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-1.c: Adjust
	test expectation.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-4.c: Ditto.

(cherry picked from commit acc3b703c05debc6276451f9daae5d0ffc797eac)
2024-07-19 13:55:49 +08:00
Fei Gao
b218c42532 [RISC-V] add implied extension repeatedly until stable
Call handle_implied_ext repeatedly until there's no
new subset added into the subset list.
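
The idea is a fixed-point iteration: one pass over the implication rules may miss an extension that only becomes implied by something added later in the same pass. A minimal sketch follows; the bitmask representation, the table and all names are hypothetical and are not taken from riscv-common.cc. The two rules used (D implies F, F implies Zicsr) are ordered so that a single pass is not enough.

```c
#include <assert.h>
#include <stddef.h>

enum { EXT_D = 1 << 0, EXT_F = 1 << 1, EXT_ZICSR = 1 << 2 };

struct implication { unsigned ext; unsigned implied; };

/* Deliberately ordered so that one pass starting from D misses Zicsr. */
static const struct implication table[] = {
    { EXT_F, EXT_ZICSR },
    { EXT_D, EXT_F },
};

static unsigned count_subsets(unsigned set)
{
    unsigned n = 0;
    for (; set != 0; set >>= 1)
        n += set & 1u;
    return n;
}

/* One pass over the table, adding directly implied extensions. */
static unsigned apply_once(unsigned set)
{
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
        if (set & table[i].ext)
            set |= table[i].implied;
    return set;
}

/* Repeat until the number of subsets stops changing. */
static unsigned finalize(unsigned set)
{
    unsigned before;
    do {
        before = count_subsets(set);
        set = apply_once(set);
    } while (count_subsets(set) != before);
    return set;
}
```

Comparing the subset count before and after a pass is enough to detect stability here because a pass only ever adds subsets.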

gcc/ChangeLog:

	* common/config/riscv/riscv-common.cc (riscv_subset_list::riscv_subset_list):
	init m_subset_num to 0.
	(riscv_subset_list::add): increase m_subset_num once a subset added.
	(riscv_subset_list::finalize): call handle_implied_ext repeatedly
	until no change in m_subset_num.
	* config/riscv/riscv-subset.h: add m_subset_num member.

Signed-off-by: Fei Gao <gaofei@eswincomputing.com>
(cherry picked from commit 682731d11f9c02b24358d1af1e2bf6fca0221ee7)
2024-07-19 13:55:49 +08:00
GCC Administrator
a2a2916755 Daily bump. 2024-07-19 00:25:13 +00:00
Marek Polacek
493035c878 eh: ICE with std::initializer_list and ASan [PR115865]
Here we ICE with -fsanitize=address on

  std::initializer_list x = { 1, 2, 3 };

since r14-8681, which removed .ASAN_MARK calls on TREE_STATIC variables.
That means that lower_try_finally now instead of

  try
    {
      .ASAN_MARK (UNPOISON, &C.0, 12);
      x = {};
      x._M_len = 3;
      x._M_array = &C.0;
    }
  finally
    {
      .ASAN_MARK (POISON, &C.0, 12);
    }

gets:

  try
    {
      x = {};
      x._M_len = 3;
      x._M_array = &C.0;
    }
  finally
    {

    }

and we ICE on the empty finally in lower_try_finally_onedest while
getting get_eh_else.

	PR c++/115865

gcc/ChangeLog:

	* tree-eh.cc (get_eh_else): Check that the result of
	gimple_seq_first_stmt is non-null.

gcc/testsuite/ChangeLog:

	* g++.dg/asan/initlist2.C: New test.

Co-authored-by: Jakub Jelinek  <jakub@redhat.com>
(cherry picked from commit 1e60a6abfece40c7bf55d6ca0a439078d3f5159a)
2024-07-18 11:49:54 -04:00
LIU Hao
747c4b5857 Do not use caller-saved registers for COMDAT functions
A reference to a COMDAT function may be resolved to another definition
outside the current translation unit, so it's not eligible for `-fipa-ra`.

In `decl_binds_to_current_def_p()` there is already a check for weak
symbols. This commit checks for COMDAT functions that are not implemented
as weak symbols, for example, on *-*-mingw32.

gcc/ChangeLog:

	PR rtl-optimization/115049
	* varasm.cc (decl_binds_to_current_def_p): Add a check for COMDAT
	declarations too, like weak ones.

(cherry picked from commit 5080840d8fbf25a321dd27543a1462d393d338bc)
2024-07-18 13:22:28 +00:00
Marek Polacek
c314867fc0 c++: ICE with __has_unique_object_representations [PR115476]
Here we started to ICE with r13-25: in check_trait_type, for "X[]" we
return true here:

  if (kind == 1 && TREE_CODE (type) == ARRAY_TYPE && !TYPE_DOMAIN (type))
    return true; // Array of unknown bound. Don't care about completeness.

and then end up crashing in record_has_unique_obj_representations:

4836	  if (cur != wi::to_offset (sz))

because sz is null.

https://eel.is/c++draft/type.traits#tab:meta.unary.prop-row-47-column-3-sentence-1
says that the preconditions for __has_unique_object_representations are:
"T shall be a complete type, cv void, or an array of unknown bound" and
that "For an array type T, the same result as
has_unique_object_representations_v<remove_all_extents_t<T>>" so T[]
should be treated as T.  So we should use kind==2 for the trait.

	PR c++/115476

gcc/cp/ChangeLog:

	* semantics.cc (finish_trait_expr)
	<case CPTK_HAS_UNIQUE_OBJ_REPRESENTATIONS>: Move below to call
	check_trait_type with kind==2.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp1z/has-unique-obj-representations4.C: New test.

(cherry picked from commit fc382a373e6824bb998007d1dcb0805b0cf4b8e8)
2024-07-18 13:50:42 +02:00
Roger Sayle
a4c9ade728 i386: PR target/115351: RTX costs for *concatditi3 and *insvti_highpart.
This patch addresses PR target/115351, which is a code quality regression
on x86 when passing floating point complex numbers.  The ABI considers
these arguments to have TImode, requiring interunit moves to place the
FP values (which are actually passed in SSE registers) into the upper
and lower parts of a TImode pseudo, and then similar moves back again
before they can be used.

The cause of the regression is that changes in how TImode initialization
is represented in RTL now prevents the RTL optimizers from eliminating
these redundant moves.  The specific cause is that the *concatditi3
pattern, (zext(hi)<<64)|zext(lo), has an inappropriately high (default)
rtx_cost, preventing fwprop1 from propagating it.  This pattern just
sets the hipart and lopart of a double-word register, typically two
instructions (less if reload can allocate things appropriately) but
the current ix86_rtx_costs actually returns COSTS_N_INSNS(13), i.e. 52.

propagating insn 5 into insn 6, replacing:
(set (reg:TI 110)
    (ior:TI (and:TI (reg:TI 110)
            (const_wide_int 0x0ffffffffffffffff))
        (ashift:TI (zero_extend:TI (subreg:DI (reg:DF 112 [ zD.2796+8 ]) 0))
            (const_int 64 [0x40]))))
successfully matched this instruction to *concatditi3_3:
(set (reg:TI 110)
    (ior:TI (ashift:TI (zero_extend:TI (subreg:DI (reg:DF 112 [ zD.2796+8 ]) 0))
            (const_int 64 [0x40]))
        (zero_extend:TI (subreg:DI (reg:DF 111 [ zD.2796 ]) 0))))
change not profitable (cost 50 -> cost 52)

This issue is resolved by having ix86_rtx_costs return more reasonable
values for these (place-holder) patterns.

2024-06-07  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	PR target/115351
	* config/i386/i386.cc (ix86_rtx_costs): Provide estimates for
	the *concatditi3 and *insvti_highpart patterns, about two insns.

gcc/testsuite/ChangeLog
	PR target/115351
	* g++.target/i386/pr115351.C: New test case.

(cherry picked from commit fb3e4c549d16d5050e10114439ad77149f33c597)
2024-07-18 13:50:42 +02:00
David Malcolm
b0452ed2fd analyzer: fix ICE seen with -fsanitize=undefined [PR114899]
gcc/analyzer/ChangeLog:
	PR analyzer/114899
	* access-diagram.cc
	(written_svalue_spatial_item::get_label_string): Bulletproof
	against SSA_NAME_VAR being null.

gcc/testsuite/ChangeLog:
	PR analyzer/114899
	* c-c++-common/analyzer/out-of-bounds-diagram-pr114899.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
(cherry picked from commit 1779e22150b917e28e959623c819ef943fab02df)
2024-07-18 13:50:42 +02:00
Jan Hubicka
0b7ec50ae2 Fix points_to_local_or_readonly_memory_p wrt TARGET_MEM_REF
TARGET_MEM_REF can be used to offset constant base into a memory object (to
produce lea instruction).  This confuses points_to_local_or_readonly_memory_p
which treats the constant address as a base of the access.

Bootstrapped/regtested x86_64-linux, committed.
Honza

gcc/ChangeLog:

	PR ipa/113787
	* ipa-fnsummary.cc (points_to_local_or_readonly_memory_p): Do not
	look into TARGET_MEM_REFS with constant operand 0.

gcc/testsuite/ChangeLog:

	* gcc.c-torture/execute/pr113787.c: New test.

(cherry picked from commit 96d53252aefcbc2fe419c4c3b4bcd3fc03d4d187)
2024-07-18 13:50:42 +02:00
Roger Sayle
0f593e4cd8 PR tree-optimization/113673: Avoid load merging when potentially trapping.
This patch fixes PR tree-optimization/113673, a P2 ice-on-valid regression
caused by load merging of (ptr[0]<<8)+ptr[1] when -ftrapv has been
specified.  When the operator is | or ^ this is safe, but for addition
of signed integer types, a trap may be generated/required, so merging this
idiom into a single non-trapping instruction is inappropriate, confusing
the compiler by transforming a basic block with an exception edge into one
without.

This revision implements Richard Biener's feedback to add an early check
for stmt_can_throw_internal (cfun, stmt) to prevent transforming in the
presence of any statement that could trap, not just overflow on addition.
The one other tweak included in this patch is to mark the local function
find_bswap_or_nop_load as static ensuring that it isn't called from outside
this file, and guaranteeing that it is dominated by stmt_can_throw_internal
checking.
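
For reference, the idiom in question looks like the sketch below (the function names are hypothetical). With unsigned char inputs the two bytes occupy disjoint bits, so |, ^ and + all produce the same value; the difference is only that the + form is a signed addition after integer promotion, which is the operation that may need to trap under -ftrapv.

```c
#include <assert.h>

/* Big-endian 16-bit load written three ways. */
static int load16_or(const unsigned char *p)
{
    return (p[0] << 8) | p[1];
}

static int load16_xor(const unsigned char *p)
{
    return (p[0] << 8) ^ p[1];
}

/* Signed int addition after promotion; its value never exceeds 65535. */
static int load16_add(const unsigned char *p)
{
    return (p[0] << 8) + p[1];
}
```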

2024-06-24  Roger Sayle  <roger@nextmovesoftware.com>
	    Richard Biener  <rguenther@suse.de>

gcc/ChangeLog
	PR tree-optimization/113673
	* gimple-ssa-store-merging.cc (find_bswap_or_nop_load): Make static.
	(find_bswap_or_nop_1): Avoid transformations (load merging) when
	stmt_can_throw_internal indicates that a statement can trap.

gcc/testsuite/ChangeLog
	PR tree-optimization/113673
	* g++.dg/pr113673.C: New test case.

(cherry picked from commit d8b05aef77443e1d3d8f3f5d2c56ac49a503fee3)
2024-07-18 13:50:41 +02:00
Jakub Jelinek
0fbad21b07 testsuite: Fix up builtin-clear-padding-3.c for -funsigned-char
As reported on gcc-regression, this test FAILs on aarch64, but my
r15-2090 change didn't change anything on the generated assembly,
just added the forgotten dg-do run directive to the test, so the
test has been failing forever, just we didn't know it.

I can actually reproduce it on x86_64 with -funsigned-char too,
s2.b.a has int type and -1 is stored to it, so we should compare
it against -1 rather than (char) -1; the latter is appropriate for
testing char fields into which we've stored -1.
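
The effect can be reproduced without any compiler flag by spelling out the signedness explicitly. In this sketch (helper names are hypothetical) the int field holds -1, and the comparison only succeeds when the right-hand side promotes to -1:

```c
#include <assert.h>

/* What "(char) -1" means when plain char is signed. */
static int equals_signed_char_minus_one(int field)
{
    return field == (signed char) -1;     /* promotes to -1 */
}

/* What "(char) -1" means with -funsigned-char, or on targets where
   plain char is unsigned by default. */
static int equals_unsigned_char_minus_one(int field)
{
    return field == (unsigned char) -1;   /* promotes to 255 */
}

/* The comparison that is right for an int field. */
static int equals_minus_one(int field)
{
    return field == -1;
}
```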

2024-07-18  Jakub Jelinek  <jakub@redhat.com>

	* c-c++-common/torture/builtin-clear-padding-3.c (main): Compare
	s2.b.a against -1 rather than (char) -1.

(cherry picked from commit 958ee138748fae4371e453eb9b357f576abbe83e)
2024-07-18 09:37:04 +02:00
Nathaniel Shead
f0c3a1c16a c++/modules: Conditionally start timer during lazy load [PR115165]
While lazy loading, instantiation of pendings can sometimes recursively
perform name lookup and begin further lazy loading.  When using the
'-ftime-report' functionality this causes ICEs as we could start an
already-running timer for the importing.

This patch fixes the issue by using the 'timevar_cond*' API instead to
support such recursive calls.
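
The pattern behind a conditional start/stop pair can be sketched with a toy timer. All names below are hypothetical and this is not GCC's timevar implementation; it only models the idea that the start call reports whether the timer was already running and the stop call acts only for the invocation that actually started it.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_timer { bool running; int starts; };

/* Start the timer unless it is already running.
   Returns true if it was already running. */
static bool toy_cond_start(struct toy_timer *t)
{
    if (t->running)
        return true;
    t->running = true;
    t->starts++;
    return false;
}

/* Stop the timer only if the matching start actually started it. */
static void toy_cond_stop(struct toy_timer *t, bool was_running)
{
    if (!was_running)
        t->running = false;
}

/* Stand-in for a lazy load that may recursively trigger further loads. */
static void toy_lazy_load(struct toy_timer *t, int depth)
{
    bool was_running = toy_cond_start(t);
    if (depth > 0)
        toy_lazy_load(t, depth - 1);
    toy_cond_stop(t, was_running);
}
```

However deep the recursion goes, the timer is started once and stopped once by the outermost call.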

	PR c++/115165

gcc/cp/ChangeLog:

	* module.cc (lazy_load_binding): Use 'timevar_cond*' APIs.
	(lazy_load_pendings): Likewise.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/timevar-1_a.H: New test.
	* g++.dg/modules/timevar-1_b.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
2024-07-18 13:08:29 +10:00
GCC Administrator
4871b0f74c Daily bump. 2024-07-18 00:24:42 +00:00
Patrick Palka
1bbfe788d1 c++: constrained partial spec type context [PR111890]
maybe_new_partial_specialization wasn't propagating TYPE_CONTEXT when
creating a new class type corresponding to a constrained partial spec,
which do_friend relies on via template_class_depth to distinguish a
template friend from a non-template friend, and so in the below testcase
we were incorrectly instantiating the non-template operator+ as if it
were a template leading to an ICE.

	PR c++/111890

gcc/cp/ChangeLog:

	* pt.cc (maybe_new_partial_specialization): Propagate TYPE_CONTEXT
	to the newly created partial specialization.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp2a/concepts-partial-spec15.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit 247335823f420eb1dd56f4bf32ac78d441f5ccc2)
2024-07-17 15:02:25 -04:00
Patrick Palka
2249c63488 c++: alias template with dependent attributes [PR115897]
Here we're prematurely stripping the dependent alias template-id A<T> to
its defining-type-id T when used as a template argument, which in turn
causes us to essentially ignore A's vector_size attribute in the outer
template-id.

This has always been a problem for class template-ids it seems, and after
r14-2170 variable template-ids are affected as well.

This patch marks alias templates that have a dependent attribute as
complex (as with e.g. constrained alias templates) so that we don't look
through them prematurely.

	PR c++/115897

gcc/cp/ChangeLog:

	* pt.cc (complex_alias_template_p): Return true for an alias
	template with attributes.
	(get_underlying_template): Don't look through an alias template
	with attributes.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/alias-decl-77.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit 7954bb4fcb6fa80f6bb840133314885011821188)
2024-07-17 15:02:24 -04:00
Patrick Palka
79c5a09f95 c++: bad 'this' conversion for nullary memfn [PR106760]
Here we notice the 'this' conversion for the call f<void>() is bad, so
we correctly defer deduction for the template candidate, but we end up
never adding it to 'bad_cands' since missing_conversion_p for it returns
false (its only argument is 'this' which has already been determined to
be bad).  This is not a huge deal, but it causes us to no longer accept the
call with -fpermissive in release builds, and a tree check ICE in checking
builds.

So if we have a non-strictly viable template candidate that has not been
instantiated, then we need to add it to 'bad_cands' even if no argument
conversion is missing.

	PR c++/106760

gcc/cp/ChangeLog:

	* call.cc (add_candidates): Relax test for adding a candidate
	to 'bad_cands' to also accept an uninstantiated template candidate
	that has no missing conversions.

gcc/testsuite/ChangeLog:

	* g++.dg/ext/conv3.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit 50073ffae0a9b8feb9b36fdafdebd9885f6d7dc8)
2024-07-17 15:02:24 -04:00
Uros Bizjak
3a963d441a alpha: Fix duplicate !tlsgd!62 assemble error [PR115526]
Add missing "cannot_copy" attribute to instructions that have to
stay in 1-1 correspondence with another insn.

	PR target/115526

gcc/ChangeLog:

	* config/alpha/alpha.md (movdi_er_high_g): Add cannot_copy attribute.
	(movdi_er_tlsgd): Ditto.
	(movdi_er_tlsldm): Ditto.
	(call_value_osf_<tls>): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/alpha/pr115526.c: New test.

(cherry picked from commit 0841fd4c42ab053be951b7418233f0478282d020)
2024-07-17 18:13:41 +02:00
Jakub Jelinek
01dfc5b4ad bitint: Use gsi_insert_on_edge rather than gsi_insert_on_edge_immediate [PR115887]
The following testcase ICEs on x86_64-linux, because we try to
gsi_insert_on_edge_immediate a statement on an edge which already has
statements queued with gsi_insert_on_edge, and the deferral has been
intentional so that we don't need to deal with cfg changes in between.

The following patch uses the delayed insertion as well.

2024-07-17  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/115887
	* gimple-lower-bitint.cc (gimple_lower_bitint): Use gsi_insert_on_edge
	instead of gsi_insert_on_edge_immediate and set edge_insertions to
	true.

	* gcc.dg/bitint-108.c: New test.

(cherry picked from commit 5104fe4c7808a66ed3041a8da8e4720585cc8a1f)
2024-07-17 17:43:04 +02:00
Jakub Jelinek
d668f87598 gimple-fold: Fix up __builtin_clear_padding lowering [PR115527]
The builtin-clear-padding-6.c testcase fails as clear_padding_type
doesn't correctly recompute the buf->size and buf->off members after
expanding clearing of an array using a runtime loop.
buf->size should be in that case the offset after which it should continue
with next members or padding before them modulo UNITS_PER_WORD and
buf->off that offset minus buf->size.  That is what the code was doing,
but with off being the start of the loop cleared array, not its end.
So, the last hunk in gimple-fold.cc fixes that.
When adding the testcase, I've noticed that the
c-c++-common/torture/builtin-clear-padding-* tests, although clearly
written as runtime tests to test the builtins at runtime, didn't have
{ dg-do run } directive and were just compile tests because of that.
When adding that to the tests, builtin-clear-padding-1.c was already
failing without that clear_padding_type hunk too, but
builtin-clear-padding-5.c was still failing even after the change.
That is due to a bug in clear_padding_flush which the patch fixes as
well - when clear_padding_flush is called with full=true (that happens
at the end of the whole __builtin_clear_padding or on those array
padding clears done by a runtime loop), it wants to flush all the pending
padding clearings rather than just some.  If it is at the end of the whole
object, it decreases wordsize when needed to make sure the code never writes
including RMW cycles to something outside of the object:
      if ((unsigned HOST_WIDE_INT) (buf->off + i + wordsize)
          > (unsigned HOST_WIDE_INT) buf->sz)
        {
          gcc_assert (wordsize > 1);
          wordsize /= 2;
          i -= wordsize;
          continue;
        }
but if it is full==true flush in the middle, this doesn't happen, but we
still process just the buffer bytes before the current end.  If that end
is not on a wordsize boundary, e.g. on the builtin-clear-padding-5.c test
the last chunk is 2 bytes, '\0', '\xff', i is 16 and end is 18,
nonzero_last might be equal to the end - i, i.e. 2 here, but still all_ones
might be true, so in some spots we just didn't emit any clearing in that
last chunk.

2024-07-17  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/115527
	* gimple-fold.cc (clear_padding_flush): Introduce endsize
	variable and use it instead of wordsize when comparing it against
	nonzero_last.
	(clear_padding_type): Increment off by sz.

	* c-c++-common/torture/builtin-clear-padding-1.c: Add dg-do run
	directive.
	* c-c++-common/torture/builtin-clear-padding-2.c: Likewise.
	* c-c++-common/torture/builtin-clear-padding-3.c: Likewise.
	* c-c++-common/torture/builtin-clear-padding-4.c: Likewise.
	* c-c++-common/torture/builtin-clear-padding-5.c: Likewise.
	* c-c++-common/torture/builtin-clear-padding-6.c: New test.

(cherry picked from commit 8b5919bae11754f4b65a17e63663d3143f9615ac)
2024-07-17 17:40:47 +02:00
Jakub Jelinek
297ea7e5bb c++: Fix ICE on constexpr placement new [PR115754]
C++26 is making placement new constexpr in paper P2747R2.
While working on a patch for that, I've noticed we ICE starting with
GCC 14 on the following testcase.
The problem is that e.g. for the void * to sometype * casts checks,
we really assume the casts have their operand constant evaluated
as prvalue, but on the testcase the cast itself is evaluated with
vc_discard and that means op can end up e.g. a VAR_DECL which the
later code doesn't like and asserts on.
If the result type is void, we don't really need the cast operand
for anything, so can use vc_discard for the recursive call,
VIEW_CONVERT_EXPR can appear on the lhs, so we need to honor the
lval but otherwise the patch uses vc_prvalue.
I'd like to get this patch in before the rest of P2747R2 implementation,
so that it can be backported to 14.2 later on.

2024-07-02  Jakub Jelinek  <jakub@redhat.com>
	    Jason Merrill  <jason@redhat.com>

	PR c++/115754
	* constexpr.cc (cxx_eval_constant_expression) <case CONVERT_EXPR>:
	For conversions to void, pass vc_discard to the recursive call
	and otherwise for tcode other than VIEW_CONVERT_EXPR pass vc_prvalue.

	* g++.dg/cpp26/pr115754.C: New test.

(cherry picked from commit 1250540a98e0a1dfa4d7834672d88d8543ea70b1)
2024-07-17 17:38:04 +02:00
Robin Dapp
bf64404280 vect: Merge loop mask and cond_op mask in fold-left reduction [PR115382].
Currently we discard the cond-op mask when the loop is fully masked
which causes wrong code in
gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c
when compiled with
-O3 -march=cascadelake --param vect-partial-vector-usage=2.

This patch ANDs both masks.
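
In scalar terms, a lane should contribute to the in-order reduction only when it is active in the loop mask and also selected by the conditional operation's mask. A minimal sketch, with a hypothetical function name and plain bool arrays standing in for the vector masks:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* In-order (fold-left) sum over the lanes enabled by both masks. */
static double fold_left_masked(double acc, const double *v,
                               const bool *loop_mask, const bool *cond_mask,
                               size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (loop_mask[i] && cond_mask[i])
            acc += v[i];
    return acc;
}
```

Dropping either mask would let lanes into the sum that the original scalar loop never added.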

gcc/ChangeLog:

	PR tree-optimization/115382

	* tree-vect-loop.cc (vectorize_fold_left_reduction): Use
	prepare_vec_mask.
	* tree-vect-stmts.cc (check_load_store_for_partial_vectors):
	Remove static of prepare_vec_mask.
	* tree-vectorizer.h (prepare_vec_mask): Export.

(cherry picked from commit 2b438a0d2aa80f051a09b245a58f643540d4004b)
2024-07-17 08:18:21 +02:00
Richard Biener
c58bede01c tree-optimization/115868 - ICE with .MASK_CALL in simdclone
The following adjusts mask recording which didn't take into account
that we can merge call arguments from two vectors like

  _50 = {vect_d_1.253_41, vect_d_1.254_43};
  _51 = VIEW_CONVERT_EXPR<unsigned char>(mask__19.257_49);
  _52 = (unsigned int) _51;
  _53 = _Z3bazd.simdclone.7 (_50, _52);
  _54 = BIT_FIELD_REF <_53, 256, 0>;
  _55 = BIT_FIELD_REF <_53, 256, 256>;

The testcase g++.dg/vect/pr68762-2.cc exercises this on x86_64 with
partial vector usage enabled and AVX512 support.

	PR tree-optimization/115868
	* tree-vect-stmts.cc (vectorizable_simd_clone_call): Correctly
	compute the number of mask copies required for vect_record_loop_mask.

(cherry picked from commit abf3964711f05b6858d9775c3595ec2b45483e14)
2024-07-17 08:14:27 +02:00
Nathaniel Shead
5fad0b552c c++/modules: Propagate BINDING_VECTOR_*_DUPS_P on realloc [PR99242]
When importing modules, when a binding vector for a name runs out of
slots it gets reallocated with a larger size, and existing bindings are
copied across.  However, the flags to indicate whether deduping needs to
occur did not: this causes ICEs, as it allows a duplicate binding to be
added which then violates assumptions later on.
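
The class of bug is easy to reproduce with any growable container that carries per-container flags. The sketch below uses a hypothetical toy_vec type, not the real binding vector layout, and assumes new_cap is at least the old length; the point is only that a reallocation has to copy the flag along with the slots.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct toy_vec {
    size_t cap, len;
    bool dups_p;     /* "deduplication needed" flag */
    int *slots;
};

/* Allocate a larger vector and carry over both contents and flag. */
static struct toy_vec *toy_grow(const struct toy_vec *old, size_t new_cap)
{
    struct toy_vec *nv = malloc(sizeof *nv);
    if (nv == NULL)
        return NULL;
    nv->slots = malloc(new_cap * sizeof *nv->slots);
    if (nv->slots == NULL) {
        free(nv);
        return NULL;
    }
    nv->cap = new_cap;
    nv->len = old->len;
    memcpy(nv->slots, old->slots, old->len * sizeof *old->slots);
    nv->dups_p = old->dups_p;   /* forgetting this loses the dedup state */
    return nv;
}
```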

	PR c++/99242

gcc/cp/ChangeLog:

	* name-lookup.cc (append_imported_binding_slot): Propagate dups
	flags.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/pr99242_a.H: New test.
	* g++.dg/modules/pr99242_b.H: New test.
	* g++.dg/modules/pr99242_c.H: New test.
	* g++.dg/modules/pr99242_d.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
(cherry picked from commit 1aa0f1627857c3e2d90982bdb07ca78ca10b26f3)
2024-07-17 11:23:23 +10:00
GCC Administrator
4039c7473a Daily bump. 2024-07-17 00:24:45 +00:00
Richard Biener
59ed01d5e3 tree-optimization/115841 - reduction epilogue placement issue
When emitting the compensation to the vectorized main loop for
a vector reduction value to be re-used in the vectorized epilogue
we fail to place it in the correct block when the main loop is
known to be entered (no loop_vinfo->main_loop_edge) but the
epilogue is not (a loop_vinfo->skip_this_loop_edge).  The code
currently disregards this situation.

With the recent znver4 cost fix I couldn't trigger this situation
with the testcase but I adjusted it so it could eventually trigger
on other targets.

	PR tree-optimization/115841
	* tree-vect-loop.cc (vect_transform_cycle_phi): Correctly
	place the partial vector reduction for the accumulator
	re-use when the main loop cannot be skipped but the
	epilogue can.

	* gcc.dg/vect/pr115841.c: New testcase.

(cherry picked from commit 016c947b02e79a5c0c0c2d4ad5cb71aa04db3efd)
2024-07-16 16:22:35 +02:00
Richard Biener
06829e593d tree-optimization/115843 - fix wrong-code with fully-masked loop and peeling
When AVX512 uses a fully masked loop and peeling we fail to create the
correct initial loop mask when the mask is composed of multiple
components in some cases.  The following fixes this by properly applying
the bias for the component to the shift amount.

	PR tree-optimization/115843
	* tree-vect-loop-manip.cc
	(vect_set_loop_condition_partial_vectors_avx512): Properly
	bias the shift of the initial mask for alignment peeling.

	* gcc.dg/vect/pr115843.c: New testcase.

(cherry picked from commit a177be05f6952c3f7e62186d2e138d96c475b81a)
2024-07-16 16:22:24 +02:00
Richard Biener
e01012c459 tree-optimization/115701 - fix maybe_duplicate_ssa_info_at_copy
The following restricts copying of points-to info from defs that
might be in regions invoking UB and are never executed.

	PR tree-optimization/115701
	* tree-ssanames.cc (maybe_duplicate_ssa_info_at_copy):
	Only copy info from within the same BB.

	* gcc.dg/torture/pr115701.c: New testcase.

(cherry picked from commit b77f17c5feec9614568bf2dee7f7d811465ee4a5)
2024-07-16 16:22:05 +02:00
Richard Biener
6f74a5f5dc tree-optimization/115701 - factor out maybe_duplicate_ssa_info_at_copy
The following factors out the code that preserves SSA info of the LHS
of a SSA copy LHS = RHS when LHS is about to be eliminated to RHS.

	PR tree-optimization/115701
	* tree-ssanames.h (maybe_duplicate_ssa_info_at_copy): Declare.
	* tree-ssanames.cc (maybe_duplicate_ssa_info_at_copy): New
	function, split out from ...
	* tree-ssa-copy.cc (fini_copy_prop): ... here.
	* tree-ssa-sccvn.cc (eliminate_dom_walker::eliminate_stmt): ...
	and here.

(cherry picked from commit b5c64b413fd5bc03a1a8ef86d005892071e42cbe)
2024-07-16 16:22:05 +02:00
Richard Biener
ca275b68ef tree-optimization/115867 - ICE with simdcall vectorization in masked loop
When only a loop mask is to be supplied for the inbranch arg to a
simd function we fail to handle integer mode masks correctly.  We
need to guess the number of elements represented by it.  This assumes
that excess arguments are all for masks; I wasn't able to create
a simdclone with more than one integer mode mask argument.

The testcase gcc.dg/vect/vect-simd-clone-20.c exercises this with -mavx512vl.

	PR tree-optimization/115867
	* tree-vect-stmts.cc (vectorizable_simd_clone_call): Properly
	guess the number of mask elements for integer mode masks.

(cherry picked from commit 4f4478f0f31263997bfdc4159f90e58dd79b38f9)
2024-07-16 16:22:05 +02:00
Richard Biener
4a04110ec8 Fixup unaligned load/store cost for znver5
Currently unaligned YMM and ZMM load and store costs are cheaper than
aligned ones, which causes the vectorizer to purposely mis-align accesses
by adding an alignment prologue.  It looks like the unaligned costs
were simply copied from the bogus znver4 costs.  The following makes
the unaligned costs equal to the aligned costs like in the fixed znver4
version.

	* config/i386/x86-tune-costs.h (znver5_cost): Update unaligned
	load and store cost from the aligned costs.

(cherry picked from commit 896393791ee34ffc176c87d232dfee735db3aaab)
2024-07-16 16:22:05 +02:00
Richard Biener
d702a95775 Fixup unaligned load/store cost for znver4
Currently unaligned YMM and ZMM load and store costs are cheaper than
aligned ones, which causes the vectorizer to purposely mis-align accesses
by adding an alignment prologue.  It looks like the unaligned costs
were simply left untouched from znver3, where they equal the aligned
costs, when the aligned costs were tweaked for znver4.  The following
makes the unaligned costs equal to the aligned costs.

This avoids the miscompile seen in PR115843 but it's of course not
a real fix for the issue uncovered there.  But it makes it qualify
as a regression fix.

	PR tree-optimization/115843
	* config/i386/x86-tune-costs.h (znver4_cost): Update unaligned
	load and store cost from the aligned costs.

(cherry picked from commit 1e3aa9c9278db69d4bdb661a750a7268789188d6)
2024-07-16 16:22:05 +02:00
Alexandre Oliva
c8fdef7fc2 [alpha] adjust MEM alignment for block move [PR115459]
Before issuing loads or stores for a block move, adjust the MEM
alignments if analysis of the addresses enabled the inference of
stricter alignment.  This ensures that the MEMs are sufficiently
aligned for the corresponding insns, which avoids trouble in case of
e.g. substitutions into SUBREGs.


for  gcc/ChangeLog

	PR target/115459
	* config/alpha/alpha.cc (alpha_expand_block_move): Adjust
	MEMs to match inferred alignment.

(cherry picked from commit ccfe7151803956d178947d0afda0bd66ce097275)
2024-07-16 08:54:20 -03:00
Christoph Müllner
b3cff8357e RISC-V: Allow adding enabled extension via target arch attributes
The set of enabled extensions can be extended via target arch function
attributes by listing each extension with a '+' prefix and a comma as
list separator.  E.g.:
  __attribute__((target("arch=+zba,+zbb"))) void foo();

The programmer intends to ensure that one or more extensions
are enabled when building the code.  This is independent of the arch
string that is passed at build time via the -march= option.

Therefore, it is reasonable to allow target arch attributes to enable
extensions that have already been enabled via the -march= string.

The subset list code already supports such duplication for implied
extensions.  This patch adds an interface so the subset list
parser can be switched into a mode where duplication is allowed.
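
A minimal sketch of such a mode switch, assuming a simplified,
hypothetical `subset_list` in C (the real riscv_subset_list is a C++
class that also tracks versions and implied extensions):

```c
#include <stdbool.h>
#include <string.h>

#define MAX_SUBSETS 32

/* Hypothetical, simplified subset list.  */
struct subset_list
{
  const char *names[MAX_SUBSETS];
  int count;
  bool allow_adding_dup;   /* Mirrors the idea of m_allow_adding_dup.  */
};

/* Return true if EXT is already in LIST.  */
static bool
subset_list_contains (const struct subset_list *list, const char *ext)
{
  for (int i = 0; i < list->count; i++)
    if (strcmp (list->names[i], ext) == 0)
      return true;
  return false;
}

/* Add EXT to LIST.  Returns false on error: a duplicate while in
   strict mode, or a full list.  In the permissive mode a duplicate
   is accepted and leaves the list unchanged.  */
static bool
subset_list_add (struct subset_list *list, const char *ext)
{
  if (subset_list_contains (list, ext))
    return list->allow_adding_dup;
  if (list->count == MAX_SUBSETS)
    return false;
  list->names[list->count++] = ext;
  return true;
}
```

With `allow_adding_dup` unset, adding "zba" twice fails the second
time; with it set, the second add succeeds without growing the list.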

This commit fixes the following regressed test cases:
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-39.c
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-42.c
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-43.c
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-44.c
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-45.c
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-46.c

gcc/ChangeLog:

	* common/config/riscv/riscv-common.cc (riscv_subset_list::add):
	Allow adding enabled extension if m_allow_adding_dup is set.
	* config/riscv/riscv-subset.h: Add m_allow_adding_dup and setter.
	* config/riscv/riscv-target-attr.cc (riscv_target_attr_parser::parse_arch):
	Allow adding enabled extensions.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/pr115554.c: Change expected fail to expected pass.
	* gcc.target/riscv/target-attr-16.c: New test.

(cherry picked from commit 61c21a719e205f70bd046c6a0275d1a3fd6341a4)
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
2024-07-16 13:02:16 +02:00
Christoph Müllner
0e1f599d63 RISC-V: Rewrite target attribute handling
The target-arch attribute handling in RISC-V is only a few months old,
but already saw a rewrite (9941f0295a), which addressed an important
issue.  This rewrite introduced a hash table in the backend, which is
used to keep track of target-arch attributes of all functions.
The index of this hash table is the pointer to the function declaration
object (fndecl).  However, objects like these don't have the lifetime
that is assumed here, which resulted in two different functions being
observed with fndecl objects at the same address (triggering the
assertion in riscv_func_target_put() -- see also PR115562).

This patch removes the hash table approach in favor of storing target
specific options using the DECL_FUNCTION_SPECIFIC_TARGET() macro, which
is also used by other backends and is specifically designed for this
purpose (https://gcc.gnu.org/onlinedocs/gccint/Function-Properties.html).
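
A minimal sketch of why an address-keyed side table is fragile, and
why per-declaration storage avoids the problem.  All names here
(`struct fndecl`, `side_table_put`, `fndecl_set_arch`) are
hypothetical and only model the idea; the failed insertion stands in
for the assertion mentioned above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical declaration object; ARCH is storage owned by the
   declaration itself, in the spirit of DECL_FUNCTION_SPECIFIC_TARGET.  */
struct fndecl
{
  const char *name;
  const char *arch;
};

/* Side table keyed by declaration address (the removed approach).  */
#define TABLE_SIZE 8
struct entry
{
  const void *key;
  const char *arch;
};
static struct entry side_table[TABLE_SIZE];

/* Record ARCH for KEY.  Returns false if KEY is already present or
   the table is full.  */
static bool
side_table_put (const void *key, const char *arch)
{
  int free_slot = -1;
  for (int i = 0; i < TABLE_SIZE; i++)
    {
      if (side_table[i].key == key)
        return false;
      if (side_table[i].key == NULL && free_slot < 0)
        free_slot = i;
    }
  if (free_slot < 0)
    return false;
  side_table[free_slot].key = key;
  side_table[free_slot].arch = arch;
  return true;
}

/* Look up KEY; returns NULL when there is no entry.  */
static const char *
side_table_get (const void *key)
{
  for (int i = 0; i < TABLE_SIZE; i++)
    if (side_table[i].key == key)
      return side_table[i].arch;
  return NULL;
}

/* Per-declaration storage: the value lives and dies with DECL.  */
static void
fndecl_set_arch (struct fndecl *decl, const char *arch)
{
  decl->arch = arch;
}
```

If a declaration's storage is reused for a different function, the
side table still holds the stale entry for that address, whereas the
field inside the object starts out fresh with each new declaration.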

To have an accessible field in the target options, we need to
adjust riscv.opt and introduce the field riscv_arch_string
(for the already existing option '-march=').

Using this macro allows removing much code from riscv-common.cc that
controls access to the objects 'func_target_table' and 'current_subset_list'.

One thing to mention is that we had two subset lists:
current_subset_list and cmdline_subset_list, with the latter being
introduced recently for target attribute handling.
This patch reduces them back to one (cmdline_subset_list) which
contains the list of extensions that have been enabled by the command
line arguments.

Note that the patch keeps the existing behavior of rejecting
duplications of extensions when added via the '+' operator in a function
target attribute.  E.g. "-march=rv64gc_zbb" and "arch=+zbb" will trigger
an error (see pr115554.c).  However, at the same time this patch breaks
the acceptance of adding implied extensions, which causes the following
six regressions (with the error "extension 'EXT' appear more than one time"):
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-39.c
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-42.c
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-43.c
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-44.c
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-45.c
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-46.c

New tests were added to document the behavior and to ensure it won't
regress.  This patch did not show any regressions for rv32/rv64
and fixes the ICEs from PR115554 and PR115562.

	PR target/115554
	PR target/115562

gcc/ChangeLog:

	* common/config/riscv/riscv-common.cc (struct riscv_func_target_info):
	Remove.
	(struct riscv_func_target_hasher): Likewise.
	(riscv_func_decl_hash): Likewise.
	(riscv_func_target_hasher::hash): Likewise.
	(riscv_func_target_hasher::equal): Likewise.
	(riscv_current_subset_list): Likewise.
	(riscv_cmdline_subset_list): Remove obsolete space.
	(riscv_func_target_table_lazy_init): Remove.
	(riscv_func_target_get): Likewise.
	(riscv_func_target_put): Likewise.
	(riscv_func_target_remove_and_destory): Likewise.
	(riscv_arch_str): Generate from cmdline_subset_list.
	(riscv_set_arch_by_subset_list): Don't set current_subset_list.
	(riscv_parse_arch_string): Remove current_subset_list.
	* config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins):
	Get subset list via riscv_cmdline_subset_list().
	* config/riscv/riscv-subset.h (riscv_current_subset_list):
	Remove prototype.
	(riscv_func_target_get): Likewise.
	(riscv_func_target_put): Likewise.
	(riscv_func_target_remove_and_destory): Likewise.
	* config/riscv/riscv-target-attr.cc (riscv_target_attr_parser::parse_arch):
	Build base arch string from existing target options, if any.
	(riscv_target_attr_parser::update_settings): Store new arch
	string in target options.
	(riscv_process_one_target_attr): Whitespace fix.
	(riscv_process_target_attr): Drop opts argument.
	(riscv_option_valid_attribute_p): Properly save, change and restore
	target options.
	* config/riscv/riscv.cc (get_arch_str): New function.
	(riscv_declare_function_name): Get arch string for option-arch
	directive from function's target options.
	* config/riscv/riscv.opt: Add riscv_arch_string variable to
	march option.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/target-attr-01.c: Add test for option-arch directive.
	* gcc.target/riscv/target-attr-02.c: Likewise.
	* gcc.target/riscv/target-attr-03.c: Likewise.
	* gcc.target/riscv/target-attr-04.c: Likewise.
	* gcc.target/riscv/target-attr-05.c: Fix formatting.
	* gcc.target/riscv/target-attr-06.c: Likewise.
	* gcc.target/riscv/target-attr-07.c: Likewise.
	* gcc.target/riscv/pr115554.c: New test.
	* gcc.target/riscv/pr115562.c: New test.
	* gcc.target/riscv/target-attr-08.c: New test.
	* gcc.target/riscv/target-attr-09.c: New test.
	* gcc.target/riscv/target-attr-10.c: New test.
	* gcc.target/riscv/target-attr-11.c: New test.
	* gcc.target/riscv/target-attr-12.c: New test.
	* gcc.target/riscv/target-attr-13.c: New test.
	* gcc.target/riscv/target-attr-14.c: New test.
	* gcc.target/riscv/target-attr-15.c: New test.

(cherry picked from commit aa8e2de78cae4dca7f9b0efe0685f3382f9ecb9a)
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
2024-07-16 13:02:16 +02:00
Christoph Müllner
b604d59b23 RISC-V: Fix comment/naming in attribute parsing code
Function target attributes have to be separated by semi-colons.
Let's fix the comment and variable naming to better explain what
the code does.

gcc/ChangeLog:

	* config/riscv/riscv-target-attr.cc (riscv_process_target_attr):
	Fix comments and variable names.

(cherry picked from commit 5ef0b7d2048a7142174ee3e8e021fc1a9c3e3334)
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
2024-07-16 13:02:16 +02:00
Christoph Müllner
20fb450d17 RISC-V: Deduplicate arch subset list processing
There is code duplication in riscv_set_arch_by_subset_list() and
riscv_parse_arch_string(), where the latter function parses an ISA string
into a subset_list before doing the same as the former function.

riscv_parse_arch_string() is used to process command line options and
riscv_set_arch_by_subset_list() processes target attributes.
So, it is obvious that both functions should do the same.
Let's deduplicate the code to enforce this.

gcc/ChangeLog:

	* common/config/riscv/riscv-common.cc (riscv_set_arch_by_subset_list):
	Fix overlong line.
	(riscv_parse_arch_string): Replace duplicated code by a call to
	riscv_set_arch_by_subset_list.

(cherry picked from commit 85fa334fbcaa8e4b98ab197a8c9410dde87f0ae3)
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
2024-07-16 13:02:16 +02:00
Christoph Müllner
ea5907d6d4 RISC-V: testsuite: Properly gate LTO tests
There are two test cases with the following skip directive:
  dg-skip-if "" { *-*-* } { "-flto -fno-fat-lto-objects" }
This reads as: skip if both '-flto' and '-fno-fat-lto-objects'
are present.  This is not the case if only '-flto' is present.
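
A minimal sketch of the "all options in the element must be present"
semantics described above.  It is a simplification in C with
hypothetical helper names: it matches whole words exactly, whereas the
real test harness is written in Tcl and is not reproduced here.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if the LEN-byte option OPT occurs as a whole
   space-separated word in OPTIONS.  */
static bool
has_option (const char *options, const char *opt, size_t len)
{
  const char *p = options;
  while (*p)
    {
      p += strspn (p, " ");
      size_t n = strcspn (p, " ");
      if (n == len && strncmp (p, opt, len) == 0)
        return true;
      p += n;
    }
  return false;
}

/* Return true if every option listed in ELEMENT is present in
   OPTIONS, i.e. the skip condition for that element holds.  */
static bool
element_matches (const char *element, const char *options)
{
  const char *p = element;
  while (*p)
    {
      p += strspn (p, " ");
      size_t n = strcspn (p, " ");
      if (n > 0 && !has_option (options, p, n))
        return false;
      p += n;
    }
  return true;
}
```

Under this model the element "-flto -fno-fat-lto-objects" does not
match a run with "-O2 -flto", so the test is not skipped, while the
element "-flto" matches both "-O2 -flto" and
"-O2 -flto -flto-partition=none".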

Since both tests depend on instruction sequences (one does
check-function-bodies, the other tests for an assembler error
message), they won't work reliably with fat LTO objects.

Let's change the skip line to gate the test on '-flto'
to avoid failing tests like this:

FAIL: gcc.target/riscv/interrupt-misaligned.c   -O2 -flto   check-function-bodies interrupt
FAIL: gcc.target/riscv/interrupt-misaligned.c   -O2 -flto -flto-partition=none   check-function-bodies interrupt
FAIL: gcc.target/riscv/pr93202.c   -O2 -flto   (test for errors, line 10)
FAIL: gcc.target/riscv/pr93202.c   -O2 -flto   (test for errors, line 9)
FAIL: gcc.target/riscv/pr93202.c   -O2 -flto -flto-partition=none   (test for errors, line 10)
FAIL: gcc.target/riscv/pr93202.c   -O2 -flto -flto-partition=none   (test for errors, line 9)

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/interrupt-misaligned.c: Remove
	"-fno-fat-lto-objects" from skip condition.
	* gcc.target/riscv/pr93202.c: Likewise.

(cherry picked from commit 0717d50fc4ff983b79093bdef43b04e4584cc3cd)
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
2024-07-16 13:02:16 +02:00
Alexandre Oliva
7bc63f1c70 [i386] adjust flag_omit_frame_pointer in a single function [PR113719]
The first two patches for PR113719 have each regressed
gcc.dg/ipa/iinline-attr.c on a different target.  The reason for this
instability is that there are competing flag_omit_frame_pointer
overriders on x86:

- ix86_recompute_optlev_based_flags computes and sets a
  -f[no-]omit-frame-pointer default depending on
  USE_IX86_FRAME_POINTER and, in 32-bit mode, optimize_size

- ix86_option_override_internal enables flag_omit_frame_pointer for
  -momit-leaf-frame-pointer to take effect

ix86_option_override[_internal] calls
ix86_recompute_optlev_based_flags before setting
flag_omit_frame_pointer.  It is called during global process_options.

But ix86_recompute_optlev_based_flags is also called by
parse_optimize_options, during attribute processing, and at that
point, ix86_option_override is not called, so the final overrider for
global options is not applied to the optimize attributes.  If they
differ, the testcase fails.

In order to fix this, we need to process all overriders of this option
whenever we process any of them.  Since this setting is affected by
optimization options, it makes sense to compute it in
parse_optimize_options, rather than in process_options.
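
A minimal sketch of the problem and the fix, assuming a hypothetical
`struct opts` with three booleans.  It leaves out the 32-bit
optimize_size case and any handling of explicitly set options, so it
only models the ordering issue, not the real i386 option code.

```c
#include <stdbool.h>

struct opts
{
  bool use_frame_pointer_default;  /* Models USE_IX86_FRAME_POINTER.  */
  bool omit_leaf_frame_pointer;    /* Models -momit-leaf-frame-pointer.  */
  bool omit_frame_pointer;         /* Models flag_omit_frame_pointer.  */
};

/* First overrider: choose the configured default.  */
static void
set_default (struct opts *o)
{
  o->omit_frame_pointer = !o->use_frame_pointer_default;
}

/* Second (final) overrider: leaf frame pointer omission needs the
   general flag to be enabled.  */
static void
apply_leaf_override (struct opts *o)
{
  if (o->omit_leaf_frame_pointer)
    o->omit_frame_pointer = true;
}

/* Before: the recompute step ran only the first overrider, so the
   attribute path (which calls just this) missed the second one.  */
static void
recompute_before (struct opts *o)
{
  set_default (o);
}

/* After: a single function applies every overrider, so the global
   path and the attribute path always agree.  */
static void
recompute_after (struct opts *o)
{
  set_default (o);
  apply_leaf_override (o);
}
```

With `recompute_before`, a global path that additionally calls
`apply_leaf_override` can end up with a different flag value than an
attribute path that does not; with `recompute_after` both paths
compute the same value.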


for  gcc/ChangeLog

	PR target/113719
	* config/i386/i386-options.cc (ix86_option_override_internal):
	Move flag_omit_frame_pointer final overrider...
	(ix86_recompute_optlev_based_flags): ... here.

(cherry picked from commit bf8e80f9d164f8778d86a3dc50e501cf19a9eff1)
2024-07-16 06:37:13 -03:00
Alexandre Oliva
102bcf1478 [i386] restore recompute to override opts after change [PR113719]
The first patch for PR113719 regressed gcc.dg/ipa/iinline-attr.c on
toolchains configured with --enable-frame-pointer, because the
optimization node created within handle_optimize_attribute had
flag_omit_frame_pointer incorrectly set, whereas
default_optimization_node didn't.  With this difference,
can_inline_edge_by_limits_p flagged an optimization mismatch and we
refused to inline the function that had a redundant optimization flag
into one that didn't, which is exactly what is tested for there.

This patch restores the calls to ix86_default_align and
ix86_recompute_optlev_based_flags that used to be, and ought to be,
issued during TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE, but preserves the
intent of the original change, of having those functions called at
different spots within ix86_option_override_internal.  To that end,
the remaining bits were refactored into a separate function, that was
in turn adjusted to operate on explicitly-passed opts and opts_set,
rather than going for their global counterparts.


for  gcc/ChangeLog

	PR target/113719
	* config/i386/i386-options.cc
	(ix86_override_options_after_change_1): Add opts and opts_set
	parms, operate on them, after factoring out of...
	(ix86_override_options_after_change): ... this.  Restore calls
	of ix86_default_align and ix86_recompute_optlev_based_flags.
	(ix86_option_override_internal): Call the factored-out bits.

(cherry picked from commit bf2fc0a27b35de039c3d45e6d7ea9ad0a8a305ba)
2024-07-16 06:27:06 -03:00
H.J. Lu
1fff665a51 x86: Update branch hint for Redwood Cove.
According to Intel® 64 and IA-32 Architectures Optimization Reference
Manual[1], Branch Hint is updated for Redwood Cove.

--------cut from [1]-------------------------
Starting with the Redwood Cove microarchitecture, if the predictor has
no stored information about a branch, the branch has the Intel® SSE2
branch taken hint (i.e., instruction prefix 3EH), When the codec
decodes the branch, it flips the branch’s prediction from not-taken to
taken. It then flushes the pipeline in front of it and steers this
pipeline to fetch the taken path of the branch.
--------cut end -----------------------------

Split tune branch_prediction_hints into branch_prediction_hints_taken
and branch_prediction_hints_not_taken, and always generate the branch
hint for conditional branches.  Both tunes are disabled by default.
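
A minimal sketch of how two separate tunes could select a hint
prefix.  The function name, the probability scale and the 45%/55%
cut-off are assumptions for illustration, not the logic in
ix86_print_operand.  The prefixes are the SSE2 branch hints: 3EH (the
DS segment override) for taken and 2EH (the CS segment override) for
not taken.

```c
#include <stdbool.h>
#include <stddef.h>

#define PROB_BASE 10000   /* Assumed probability scale.  */

/* Return the assembler prefix to emit for a conditional branch that
   is taken with probability PROB / PROB_BASE, or NULL for no hint.
   HINTS_TAKEN and HINTS_NOT_TAKEN model the two split tunes.  */
static const char *
branch_hint_prefix (int prob, bool hints_taken, bool hints_not_taken)
{
  /* Assumed cut-off: no hint for branches close to 50/50.  */
  if (prob >= PROB_BASE * 45 / 100 && prob <= PROB_BASE * 55 / 100)
    return NULL;

  if (prob > PROB_BASE / 2)
    return hints_taken ? "ds" : NULL;       /* 3EH: branch taken.  */
  return hints_not_taken ? "cs" : NULL;     /* 2EH: branch not taken.  */
}
```

With both tunes disabled, which is the default stated above, the
function never returns a prefix.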

[1] https://www.intel.com/content/www/us/en/content-details/821612/intel-64-and-ia-32-architectures-optimization-reference-manual-volume-1.html

gcc/

	* config/i386/i386.cc (ix86_print_operand): Always generate
	branch hint for conditional branches.
	* config/i386/i386.h (TARGET_BRANCH_PREDICTION_HINTS): Split
	into ..
	(TARGET_BRANCH_PREDICTION_HINTS_TAKEN): .. this, and ..
	(TARGET_BRANCH_PREDICTION_HINTS_NOT_TAKEN): .. this.
	* config/i386/x86-tune.def (X86_TUNE_BRANCH_PREDICTION_HINTS):
	Split into ..
	(X86_TUNE_BRANCH_PREDICTION_HINTS_TAKEN): .. this, and ..
	(X86_TUNE_BRANCH_PREDICTION_HINTS_NOT_TAKEN): .. this.

(cherry picked from commit a910c30c7c27cd0f6d2d2694544a09fb11d611b9)
2024-07-16 09:28:08 +08:00
GCC Administrator
0fcadb3d51 Daily bump. 2024-07-16 00:26:23 +00:00
Harald Anlauf
71ec9ed7a7 Fortran: improve attribute conflict checking [PR93635]
gcc/fortran/ChangeLog:

	PR fortran/93635
	* symbol.cc (conflict_std): Helper function for reporting attribute
	conflicts depending on the Fortran standard version.
	(conf_std): Helper macro for checking standard-dependent conflicts.
	(gfc_check_conflict): Use it.

gcc/testsuite/ChangeLog:

	PR fortran/93635
	* gfortran.dg/c-interop/c1255-2.f90: Adjust pattern.
	* gfortran.dg/pr87907.f90: Likewise.
	* gfortran.dg/pr93635.f90: New test.

Co-authored-by: Steven G. Kargl <kargl@gcc.gnu.org>
(cherry picked from commit 9561cf550a66a89e7c8d31202a03c4fddf82a3f2)
2024-07-15 20:41:43 +02:00