Commit graph

204955 commits

Author SHA1 Message Date
Lehua Ding
5ee894130f RISC-V: Add the missed combine of [u]int64 -> _Float16 and vcond
Hi,

This patch splits the INT64 to FP16 conversion into two smaller conversions
(INT64 -> FP32 and FP32 -> FP16) at expand time instead of delaying the
split to the split1 pass. This makes it possible to combine the
FP32 to FP16 and vcond patterns, so we don't need to add a
combine pattern for INT64 to FP16 and vcond.

Consider this code:
  void
  foo (_Float16 *__restrict r, int64_t *__restrict a, _Float16 *__restrict b,
       int64_t *__restrict pred, int n)
  {
    for (int i = 0; i < n; i += 1)
      {
        r[i] = pred[i] ? (_Float16) a[i] : b[i];
      }
  }

Before this patch:
  ...
  vfncvt.f.f.w    v2,v2
  vmerge.vvm      v1,v1,v2,v0
  vse16.v v1,0(a0)
  ...

After this patch:
  ...
  vfncvt.f.f.w    v1,v2,v0.t
  vse16.v v1,0(a0)
  ...

gcc/ChangeLog:

	* config/riscv/autovec.md (<float_cvt><mode><vnnconvert>2):
	Change to define_expand.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv32-1.c:
	Add vfncvt.f.f.w assert.
	* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv32-2.c:
	Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv64-1.c:
	Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv64-2.c:
	Ditto.
2023-10-31 11:45:46 +08:00
liuhongt
f5d33d0c79 Fix wrong code due to incorrect define_split
-(define_split
-  [(set (match_operand:V2HI 0 "register_operand")
-        (eq:V2HI
-          (eq:V2HI
-            (us_minus:V2HI
-              (match_operand:V2HI 1 "register_operand")
-              (match_operand:V2HI 2 "register_operand"))
-            (match_operand:V2HI 3 "const0_operand"))
-          (match_operand:V2HI 4 "const0_operand")))]
-  "TARGET_SSE4_1"
-  [(set (match_dup 0)
-        (umin:V2HI (match_dup 1) (match_dup 2)))
-   (set (match_dup 0)
-        (eq:V2HI (match_dup 0) (match_dup 2)))])

The splitter is wrong when op1 == op2: the original pattern returns 0, but after the split it returns 1.
So remove the splitter.
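
A scalar, single-lane sketch of why the removed splitter misbehaves. It assumes the usual vector-compare convention that eq yields all ones (0xFFFF) for true and 0 for false; the helper names are hypothetical and model only one 16-bit lane of V2HI:

```c
#include <stdint.h>

/* Unsigned saturating subtract, modelling us_minus on one 16-bit lane. */
uint16_t us_minus16(uint16_t a, uint16_t b)
{
    return a > b ? (uint16_t)(a - b) : 0;
}

/* Vector-style equality: all ones for true, zero for false. */
uint16_t eq16(uint16_t a, uint16_t b)
{
    return a == b ? 0xFFFF : 0;
}

/* Original pattern: eq (eq (us_minus (op1, op2), 0), 0), i.e. op1 > op2. */
uint16_t original_pattern(uint16_t op1, uint16_t op2)
{
    return eq16(eq16(us_minus16(op1, op2), 0), 0);
}

/* Removed split: umin (op1, op2) == op2, i.e. op1 >= op2. */
uint16_t split_pattern(uint16_t op1, uint16_t op2)
{
    uint16_t m = op1 < op2 ? op1 : op2;
    return eq16(m, op2);
}
```

The two forms agree whenever op1 != op2, but for equal inputs original_pattern gives 0 while split_pattern gives a true result.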

Also extend another define_split to define_insn_and_split to handle
below pattern

(set (reg:V4QI 112)
    (unspec:V4QI [
            (subreg:V4QI (reg:V2HF 111 [ bf ]) 0)
            (subreg:V4QI (reg:V2HF 110 [ af ]) 0)
            (subreg:V4QI (eq:V2HI (eq:V2HI (reg:V2HI 105)
                        (const_vector:V2HI [
                                (const_int 0 [0]) repeated x2
                            ]))
                    (const_vector:V2HI [
                            (const_int 0 [0]) repeated x2
                        ])) 0)
        ] UNSPEC_BLENDV))

define_split doesn't work since pass_combine assumes it produces at
most 2 insns after the split, but here it produces 3 since we need to move
const0_rtx (V2HImode) to a reg. The move insn can be eliminated later.

gcc/ChangeLog:

	PR target/112276
	* config/i386/mmx.md (*mmx_pblendvb_v8qi_1): Change
	define_split to define_insn_and_split to handle
	immediate_operand for comparison.
	(*mmx_pblendvb_v8qi_2): Ditto.
	(*mmx_pblendvb_<mode>_1): Ditto.
	(*mmx_pblendvb_v4qi_2): Ditto.
	(<code><mode>3): Remove define_split after it.
	(<code>v8qi3): Ditto.
	(<code><mode>3): Ditto.
	(<code>v2hi3): Ditto.

gcc/testsuite/ChangeLog:

	* g++.target/i386/part-vect-vcondhf.C: Adjust testcase.
	* gcc.target/i386/pr112276.c: New test.
2023-10-31 11:24:45 +08:00
Andrew Pinski
541b754c77 MATCH: Add some more value_replacement simplifications to match
This moves a few more value_replacements simplifications to match.
/* a == 1 ? b : a * b -> a * b */
/* a == 1 ? b : b / a  -> b / a */
/* a == -1 ? b : a & b -> a & b */

Also adds a testcase to show that we can catch these where value_replacement would not
(but other passes would).
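
As an illustration only, the three simplifications can be written as pairs of C functions (the function names are made up for this sketch). Unsigned types are used for the multiply and divide so that overflow is not a concern, and the divide forms require a nonzero `a` in both versions:

```c
#include <stdint.h>

/* a == 1 ? b : a * b  ->  a * b */
uint32_t mul_before(uint32_t a, uint32_t b) { return a == 1 ? b : a * b; }
uint32_t mul_after(uint32_t a, uint32_t b)  { return a * b; }

/* a == 1 ? b : b / a  ->  b / a  (a must be nonzero in both forms) */
uint32_t div_before(uint32_t a, uint32_t b) { return a == 1 ? b : b / a; }
uint32_t div_after(uint32_t a, uint32_t b)  { return b / a; }

/* a == -1 ? b : a & b  ->  a & b  (-1 has every bit set) */
int32_t and_before(int32_t a, int32_t b) { return a == -1 ? b : (a & b); }
int32_t and_after(int32_t a, int32_t b)  { return a & b; }
```

In each pair the guarded value is the identity element of the operation, so the conditional adds nothing.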

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

	* match.pd (`a == 1 ? b : a OP b`): New pattern.
	(`a == -1 ? b : a & b`): New pattern.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/phi-opt-value-4.c: New test.
2023-10-30 19:15:25 -07:00
Andrew Pinski
598fdb5290 MATCH: first of the value replacement moving from phiopt
This moves a few simple patterns that are done in value replacement
in phiopt over to match.pd. Just the simple ones which might show up
in other code.

This allows some optimizations to happen even without depending
on sinking happening, and in some cases where phiopt is not
invoked (cond-1.c is an example there).
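
A minimal sketch of the two patterns named in the ChangeLog below (`a == 0 ? b : b + a` and `a == 0 ? b : b - a`), with a brute-force check that the guarded and plain forms agree. The function names and the checking helper are hypothetical:

```c
#include <stdint.h>

/* Before: the guarded forms that value replacement recognises. */
uint32_t add_guarded(uint32_t a, uint32_t b) { return a == 0 ? b : b + a; }
uint32_t sub_guarded(uint32_t a, uint32_t b) { return a == 0 ? b : b - a; }

/* After: adding or subtracting zero already yields b. */
uint32_t add_plain(uint32_t a, uint32_t b) { return b + a; }
uint32_t sub_plain(uint32_t a, uint32_t b) { return b - a; }

/* Return 1 if both pairs agree for every a, b below LIMIT, else 0. */
int forms_agree(uint32_t limit)
{
    for (uint32_t a = 0; a < limit; a++)
        for (uint32_t b = 0; b < limit; b++)
            if (add_guarded(a, b) != add_plain(a, b)
                || sub_guarded(a, b) != sub_plain(a, b))
                return 0;
    return 1;
}
```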

Changes since v1:
* v2: Add an extra testcase to showcase improvements at -O1.

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

	* match.pd: (`a == 0 ? b : b + a`,
	`a == 0 ? b : b - a`): New patterns.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/cond-1.c: New test.
	* gcc.dg/tree-ssa/phi-opt-value-1.c: New test.
	* gcc.dg/tree-ssa/phi-opt-value-1a.c: New test.
	* gcc.dg/tree-ssa/phi-opt-value-2.c: New test.
2023-10-30 19:15:25 -07:00
GCC Administrator
a5c157b95a Daily bump. 2023-10-31 00:17:32 +00:00
Mayshao
94c0b26f45 i386: Zhaoxin yongfeng enablement
Enable -march/-mtune=yongfeng. Costs and tunings are set according
to the characteristics of the processor. Add a new .md file to describe
yongfeng processor.

gcc/ChangeLog:

	* common/config/i386/cpuinfo.h (get_zhaoxin_cpu): Recognize yongfeng.
	* common/config/i386/i386-common.cc: Add yongfeng.
	* common/config/i386/i386-cpuinfo.h (enum processor_subtypes):
	Add ZHAOXIN_FAM7H_YONGFENG.
	* config.gcc: Add yongfeng.
	* config/i386/driver-i386.cc (host_detect_local_cpu):
	Let -march=native recognize yongfeng processors.
	* config/i386/i386-c.cc (ix86_target_macros_internal): Add yongfeng.
	* config/i386/i386-options.cc (m_YONGFENG): New definition.
	(m_ZHAOXIN): Ditto.
	* config/i386/i386.h (enum processor_type): Add PROCESSOR_YONGFENG.
	* config/i386/i386.md: Add yongfeng.
	* config/i386/lujiazui.md: Fix typo.
	* config/i386/x86-tune-costs.h (struct processor_costs):
	Add yongfeng costs.
	* config/i386/x86-tune-sched.cc (ix86_issue_rate): Add yongfeng.
	(ix86_adjust_cost): Ditto.
	* config/i386/x86-tune.def (X86_TUNE_SCHEDULE): Replace
	m_LUJIAZUI with m_ZHAOXIN.
	(X86_TUNE_PARTIAL_REG_DEPENDENCY): Ditto.
	(X86_TUNE_SSE_PARTIAL_REG_DEPENDENCY): Ditto.
	(X86_TUNE_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY): Ditto.
	(X86_TUNE_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY): Ditto.
	(X86_TUNE_MOVX): Ditto.
	(X86_TUNE_MEMORY_MISMATCH_STALL): Ditto.
	(X86_TUNE_FUSE_CMP_AND_BRANCH_32): Ditto.
	(X86_TUNE_FUSE_CMP_AND_BRANCH_64): Ditto.
	(X86_TUNE_FUSE_CMP_AND_BRANCH_SOFLAGS): Ditto.
	(X86_TUNE_FUSE_ALU_AND_BRANCH): Ditto.
	(X86_TUNE_ACCUMULATE_OUTGOING_ARGS): Ditto.
	(X86_TUNE_USE_LEAVE): Ditto.
	(X86_TUNE_PUSH_MEMORY): Ditto.
	(X86_TUNE_LCP_STALL): Ditto.
	(X86_TUNE_INTEGER_DFMODE_MOVES): Ditto.
	(X86_TUNE_OPT_AGU): Ditto.
	(X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB): Ditto.
	(X86_TUNE_MISALIGNED_MOVE_STRING_PRO_EPILOGUES): Ditto.
	(X86_TUNE_USE_SAHF): Ditto.
	(X86_TUNE_USE_BT): Ditto.
	(X86_TUNE_AVOID_FALSE_DEP_FOR_BMI): Ditto.
	(X86_TUNE_ONE_IF_CONV_INSN): Ditto.
	(X86_TUNE_AVOID_MFENCE): Ditto.
	(X86_TUNE_EXPAND_ABS): Ditto.
	(X86_TUNE_USE_SIMODE_FIOP): Ditto.
	(X86_TUNE_USE_FFREEP): Ditto.
	(X86_TUNE_EXT_80387_CONSTANTS): Ditto.
	(X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL): Ditto.
	(X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL): Ditto.
	(X86_TUNE_SSE_TYPELESS_STORES): Ditto.
	(X86_TUNE_SSE_LOAD0_BY_PXOR): Ditto.
	(X86_TUNE_USE_GATHER_2PARTS): Add m_YONGFENG.
	(X86_TUNE_USE_GATHER_4PARTS): Ditto.
	(X86_TUNE_USE_GATHER_8PARTS): Ditto.
	(X86_TUNE_AVOID_128FMA_CHAINS): Ditto.
	* doc/extend.texi: Add details about yongfeng.
	* doc/invoke.texi: Ditto.
	* config/i386/yongfeng.md: New file to describe yongfeng processor.

gcc/testsuite/ChangeLog:

	* g++.target/i386/mv32.C: Handle new -march.
	* gcc.target/i386/funcspec-56.inc: Ditto.
2023-10-30 22:20:01 +01:00
François Dumont
6504b4a498 libstdc++: [_GLIBCXX_INLINE_VERSION] Add comment on emul TLS symbols
libstdc++-v3/ChangeLog:

	* config/abi/pre/gnu-versioned-namespace.ver: Add comment on recently
	added emul TLS symbols.
2023-10-30 22:07:49 +01:00
François Dumont
5ea11700e5 libstdc++: [_GLIBCXX_INLINE_VERSION] Un-weak handle_contract_violation
libstdc++-v3/ChangeLog:

	* src/experimental/contract.cc
	[_GLIBCXX_INLINE_VERSION](handle_contract_violation): Rework comment.
	Remove weak attribute.
2023-10-30 21:49:31 +01:00
Iain Sandoe
434975cb1b configure, fixincludes: Add change missed in r14-4825.
This corrects an oversight in the r14-4825 commit.

fixincludes/ChangeLog:

	* configure: Regenerate.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
2023-10-30 19:05:00 +00:00
Martin Jambor
997c8219f0 ipa: Prune any IPA-CP aggregate constants known by modref to be killed (111157)
PR 111157 shows that IPA-modref and IPA-CP (when plugged into value
numbering) can optimize out a store both before a call (because the
call will overwrite it) and in the call (because the store is of the
same value) and by eliminating both create miscompilation.

This patch fixes that by pruning from the list of IPA-CP aggregate
value constants any constant whose memory contents IPA-modref knows can
be "killed."  Unfortunately, doing so is tricky.  First, IPA-modref
loads override kills and so only stores not loaded are truly not
necessary.  Looking things up there means doing most of what
modref_may_alias does, but doing exactly what it does is tricky
because it also takes aliasing into account and has bail-out counters.

To err on the side of caution in order to avoid this miscompilation we
have to prune a constant when in doubt.  However, pruning can
interfere with the mechanism of how clone materialization
distinguishes between the cases when a parameter was entirely removed
and when it was both IPA-CPed and IPA-SRAed (in order to make up for
the removal in debug info, which can bump into an assert when
compiling g++.dg/torture/pr103669.C when we are not careful).

Therefore this patch:

  1) marks constants that IPA-modref has in its kill list with a new
     "killed" flag, and
  2) prunes the list from entries with this flag after materialization
     and IPA-CP transformation is done using the template introduced in
     the previous patch

It does not try to look up anything in the load lists, this will be
done as a follow-up in order to ease review.

gcc/ChangeLog:

2023-10-27  Martin Jambor  <mjambor@suse.cz>

	PR ipa/111157
	* ipa-prop.h (struct ipa_argagg_value): New flag killed.
	* ipa-modref.cc (ipcp_argagg_and_kill_overlap_p): New function.
	(update_signature): Mark any IPA-CP aggregate constants at
	positions known to be killed as killed.  Move check that there is
	clone_info after this pruning.
	* ipa-cp.cc (ipa_argagg_value_list::dump): Dump the killed flag.
	(ipa_argagg_value_list::push_adjusted_values): Clear the new flag.
	(push_agg_values_from_plats): Likewise.
	(ipa_push_agg_values_from_jfunc): Likewise.
	(estimate_local_effects): Likewise.
	(push_agg_values_for_index_from_edge): Likewise.
	* ipa-prop.cc (write_ipcp_transformation_info): Stream the killed
	flag.
	(read_ipcp_transformation_info): Likewise.
	(ipcp_get_aggregate_const): Update comment, assert that encountered
	record does not have killed flag set.
	(ipcp_transform_function): Prune all aggregate constants with killed
	set.

gcc/testsuite/ChangeLog:

2023-09-18  Martin Jambor  <mjambor@suse.cz>

	PR ipa/111157
	* gcc.dg/lto/pr111157_0.c: New test.
	* gcc.dg/lto/pr111157_1.c: Second file of the same new test.
2023-10-30 18:36:54 +01:00
Martin Jambor
1437df40f1 ipa-cp: Templatize filtering of m_agg_values
PR 111157 points to another place where IPA-CP collected aggregate
compile-time constants need to be filtered, in addition to the one
place that already does this in ipa-sra.  In order to re-use code,
this patch turns the common bit into a template.

The functionality is still covered by testcase gcc.dg/ipa/pr108959.c.

gcc/ChangeLog:

2023-09-13  Martin Jambor  <mjambor@suse.cz>

	PR ipa/111157
	* ipa-prop.h (ipcp_transformation): New member function template
	remove_argaggs_if.
	* ipa-sra.cc (zap_useless_ipcp_results): Use remove_argaggs_if to
	filter aggregate constants.
2023-10-30 18:36:40 +01:00
Patrick O'Neill
68880e4053 RISC-V: Make rv32i_zcmp testcase more robust
GCC recently changed its register allocator which causes this
testcase to fail.
This patch updates the regex to be more robust to change by accepting
any s register in the range of 1-9 for cm.push and cm.popret insns.
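
To illustrate the idea of loosening the match, here is a small sketch using POSIX regular expressions. The pattern and the helper name are hypothetical and are not the expression used in rv32i_zcmp.c; they only show a register list whose last register may be any of s1 to s9:

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if LINE looks like a cm.push whose register list ends in
   any of s1..s9, 0 otherwise.  The pattern is illustrative only. */
int matches_cm_push(const char *line)
{
    regex_t re;
    const char *pat = "cm\\.push[[:space:]]+\\{ra,[[:space:]]*s0-s[1-9]\\}";
    if (regcomp(&re, pat, REG_EXTENDED | REG_NOSUB) != 0)
        return 0;
    int ok = regexec(&re, line, 0, NULL, 0) == 0;
    regfree(&re);
    return ok;
}
```

A pattern of this shape keeps passing if the register allocator picks a different number of callee-saved registers.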

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rv32i_zcmp.c: Accept any register in the
	range of 1-9 for cm.push and cm.popret insns.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2023-10-30 09:58:54 -07:00
Roger Sayle
a3da9adeb4 ARC: Convert (signed<<31)>>31 to -(signed&1) without barrel shifter.
This patch optimizes PR middle-end/101955 for the ARC backend.  On ARC
CPUs with a barrel shifter, using two shifts is optimal as:

        asl_s   r0,r0,31
        asr_s   r0,r0,31

but without a barrel shifter, GCC -O2 -mcpu=em currently generates:

        and     r2,r0,1
        ror     r2,r2
        add.f   0,r2,r2
        sbc     r0,r0,r0

with this patch, we now generate the smaller, faster and non-flags
clobbering:

        bmsk_s  r0,r0,0
        neg_s   r0,r0
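
The equivalence behind this transformation can be sketched in C as follows. The function names are hypothetical. The left shift is done on an unsigned type to avoid signed overflow, and the conversion back to int32_t plus the arithmetic right shift are implementation-defined in ISO C (they behave as two's complement with GCC):

```c
#include <stdint.h>

/* Two-shift form: move bit 0 into the sign bit, then shift it back
   arithmetically so that it fills the whole word. */
int32_t sext_bit0_shifts(int32_t x)
{
    return (int32_t)((uint32_t)x << 31) >> 31;
}

/* AND-then-negate form, corresponding to bmsk_s followed by neg_s. */
int32_t sext_bit0_and_neg(int32_t x)
{
    return -(x & 1);
}
```

Both return -1 when bit 0 of x is set and 0 otherwise.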

2023-10-30  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	PR middle-end/101955
	* config/arc/arc.md (*extvsi_1_0): New define_insn_and_split
	to convert sign extract of the least significant bit into an
	AND $1 then a NEG when !TARGET_BARREL_SHIFTER.

gcc/testsuite/ChangeLog
	PR middle-end/101955
	* gcc.target/arc/pr101955.c: New test case.
2023-10-30 16:21:28 +00:00
Roger Sayle
31cc9824d1 ARC: Improved ARC rtx_costs/insn_cost for SHIFTs and ROTATEs.
This patch overhauls the ARC backend's insn_cost target hook, and makes
some related improvements to rtx_costs, BRANCH_COST, etc.  The primary
goal is to allow the backend to indicate that shifts and rotates are
slow (discouraged) when the CPU doesn't have a barrel shifter. I should
also acknowledge Richard Sandiford for inspiring the use of set_cost
in this rewrite of arc_insn_cost; this implementation borrows heavily
from the target hooks for AArch64 and ARM.

The motivating example is derived from PR rtl-optimization/110717.

struct S { int a : 5; };
unsigned int foo (struct S *p) {
  return p->a;
}

With a barrel shifter, GCC -O2 generates the reasonable:

foo:    ldb_s   r0,[r0]
        asl_s   r0,r0,27
        j_s.d   [blink]
        asr_s   r0,r0,27

What's interesting is that during combine, the middle-end actually
has two shifts by three bits, and a sign-extension from QI to SI.

Trying 8, 9 -> 11:
    8: r158:SI=r157:QI#0<<0x3
      REG_DEAD r157:QI
    9: r159:SI=sign_extend(r158:SI#0)
      REG_DEAD r158:SI
   11: r155:SI=r159:SI>>0x3
      REG_DEAD r159:SI

Whilst it's reasonable to simplify this to two shifts by 27 bits when
the CPU has a barrel shifter, it's actually a significant pessimization
when these shifts are implemented by loops.  This combination can be
prevented if the backend provides accurate-ish estimates for insn_cost.

Previously, without a barrel shifter, GCC -O2 -mcpu=em generates:

foo:	ldb_s   r0,[r0]
        mov     lp_count,27
        lp      2f
        add     r0,r0,r0
        nop
2:      # end single insn loop
        mov     lp_count,27
        lp      2f
        asr     r0,r0
        nop
2:      # end single insn loop
        j_s     [blink]

which contains two loops and requires about 113 cycles to execute.
With this patch to rtx_cost/insn_cost, GCC -O2 -mcpu=em generates:

foo:	ldb_s   r0,[r0]
        mov_s   r2,0    ;3
        add3    r0,r2,r0
        sexb_s  r0,r0
        asr_s   r0,r0
        asr_s   r0,r0
        j_s.d   [blink]
        asr_s   r0,r0

which requires only ~6 cycles, for the shorter shifts by 3 and sign
extension.
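
A C sketch of the two ways of extracting the signed 5-bit field, assuming (as the shift counts in the listings suggest) that the field sits in the low 5 bits of the loaded byte. The function names are hypothetical, and the narrowing conversions and arithmetic right shifts rely on GCC's two's-complement behavior:

```c
#include <stdint.h>

/* Sign-extend the low 5 bits with two 27-bit shifts
   (cheap only when a barrel shifter is available). */
int32_t field5_shift27(uint8_t byte)
{
    return (int32_t)((uint32_t)byte << 27) >> 27;
}

/* Same result with a 3-bit left shift, a byte sign extension and a
   3-bit arithmetic right shift, mirroring add3, sexb_s and three asr_s. */
int32_t field5_shift3(uint8_t byte)
{
    int32_t shifted = (int8_t)(uint8_t)(byte << 3);  /* sexb_s */
    return shifted >> 3;
}
```

The second form needs only short shifts, which is why accurate costs steer the compiler towards it when shifts are implemented as loops.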

2023-10-30  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/arc/arc.cc (arc_rtx_costs): Improve cost estimates.
	Provide reasonable values for SHIFTS and ROTATES by constant
	bit counts depending upon TARGET_BARREL_SHIFTER.
	(arc_insn_cost): Use insn attributes if the instruction is
	recognized.  Avoid calling get_attr_length for type "multi",
	i.e. define_insn_and_split patterns without explicit type.
	Fall-back to set_rtx_cost for single_set and pattern_cost
	otherwise.
	* config/arc/arc.h (COSTS_N_BYTES): Define helper macro.
	(BRANCH_COST): Improve/correct definition.
	(LOGICAL_OP_NON_SHORT_CIRCUIT): Preserve previous behavior.
2023-10-30 16:17:42 +00:00
Roger Sayle
d24c3c5334 ARC: Improved SImode shifts and rotates with -mswap.
This patch improves the code generated by the ARC back-end for CPUs
without a barrel shifter but with -mswap.  The -mswap option provides
a SWAP instruction that implements SImode rotations by 16, but also
logical shift instructions (left and right) by 16 bits.  Clearly these
are also useful building blocks for implementing shifts by 17, 18, etc.
which would otherwise require a loop.

As a representative example:
int shl20 (int x) { return x << 20; }

GCC with -O2 -mcpu=em -mswap would previously generate:

shl20:  mov     lp_count,10
        lp      2f
        add     r0,r0,r0
        add     r0,r0,r0
2:      # end single insn loop
        j_s     [blink]

with this patch we now generate:

shl20:  mov_s   r2,0    ;3
        lsl16   r0,r0
        add3    r0,r2,r0
        j_s.d   [blink]
        asl_s r0,r0

Although both are four instructions (excluding the j_s),
the original takes ~22 cycles, and replacement ~4 cycles.
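
The new sequence can be modelled step by step in C. This is a sketch under the assumption that add3 computes its first source plus the second source shifted left by 3; the function name is hypothetical:

```c
#include <stdint.h>

/* Model of the new sequence for x << 20:
   lsl16 (shift by 16), add3 with a zero base (0 + (r << 3)),
   then asl_s (shift by 1): 16 + 3 + 1 = 20. */
uint32_t shl20_steps(uint32_t x)
{
    uint32_t r = x << 16;      /* lsl16 r0,r0 */
    r = 0 + (r << 3);          /* add3 r0,r2,r0 with r2 = 0 */
    r = r << 1;                /* asl_s r0,r0 */
    return r;
}
```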

2023-10-30  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/arc/arc.cc (arc_split_ashl): Use lsl16 on TARGET_SWAP.
	(arc_split_ashr): Use swap and sign-extend on TARGET_SWAP.
	(arc_split_lshr): Use lsr16 on TARGET_SWAP.
	(arc_split_rotl): Use swap on TARGET_SWAP.
	(arc_split_rotr): Likewise.
	* config/arc/arc.md (ANY_ROTATE): New code iterator.
	(<ANY_ROTATE>si2_cnt16): New define_insn for alternate form of
	swap instruction on TARGET_SWAP.
	(ashlsi2_cnt16): Rename from *ashlsi16_cnt16 and move earlier.
	(lshrsi2_cnt16): New define_insn for LSR16 instruction.
	(*ashlsi2_cnt16): See above.

gcc/testsuite/ChangeLog
	* gcc.target/arc/lsl16-1.c: New test case.
	* gcc.target/arc/lsr16-1.c: Likewise.
	* gcc.target/arc/swap-1.c: Likewise.
	* gcc.target/arc/swap-2.c: Likewise.
2023-10-30 16:12:30 +00:00
Richard Ball
fb1941d08f arm: move the switch tables for Arm to the RO data section.
Follow-up patch to "arm: Use deltas for Arm switch tables".
This patch moves the switch tables for Arm from the .text section
into the .rodata section.

gcc/ChangeLog:

	* config/arm/aout.h: Change to use the Lrtx label.
	* config/arm/arm.h (CASE_VECTOR_PC_RELATIVE): Remove arm targets
	from (!target_pure_code) condition.
	(ADDR_VEC_ALIGN): Add align for tables in rodata section.
	* config/arm/arm.cc (arm_output_casesi): Alter the function to include
	.Lrtx label and remove adr instructions.
	* config/arm/arm.md
	(arm_casesi_internal): Use force_reg to generate ldr instructions that
	would otherwise be out of range, and change rtl to accommodate force reg.
	Additionally remove unnecessary register temp.
	(casesi): Remove pure code check for Arm.
	* config/arm/elf.h (JUMP_TABLES_IN_TEXT_SECTION): Remove arm
	targets from JUMP_TABLES_IN_TEXT_SECTION definition.

gcc/testsuite/ChangeLog:

	* gcc.target/arm/arm-switchstatement.c: Alter the tests to
	change adr instruction to ldr.
2023-10-30 15:31:26 +00:00
Francois-Xavier Coudert
7666d94db0 Testsuite, i386: Mark test as requiring ifunc
Test is currently failing on x86_64-apple-darwin.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr105554.c: Require ifunc.
2023-10-30 15:57:33 +01:00
Francois-Xavier Coudert
89e97f655d Testsuite, Darwin: Fix trampoline warning
Heap-based trampolines are enabled on darwin20 and later,
meaning that no warning is emitted.

gcc/testsuite/ChangeLog:

	* gcc.dg/Wtrampolines.c: Skip on darwin20 and later.
2023-10-30 14:45:47 +01:00
Francois-Xavier Coudert
5c7bbb0fcd Testsuite, i386: Fix test by passing -march
The test currently fails on Darwin, where the default arch is core2.

gcc/testsuite/ChangeLog:

	PR target/112287
	* gcc.target/i386/pr111698.c: Pass -march=sandybridge.
2023-10-30 12:50:01 +01:00
Francois-Xavier Coudert
a0c557690c Testsuite, Darwin: skip PIE test
gcc/testsuite/ChangeLog:

	* gcc.dg/pie-2.c: Skip test on darwin.
2023-10-30 12:41:17 +01:00
Jeevitha
36a52cdc23 rs6000: Change bitwise xor to an equality operator [PR106907]
PR106907 has a few warnings spotted by cppcheck. These warnings
concern the need for precedence clarification. The xor has been
replaced with an equality check, which achieves the same result.
Additionally, comment indentation has been fixed.
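
A small sketch of the kind of equivalence involved, assuming both operands are plain truth values. The names `swapped` and `big_endian` are hypothetical stand-ins and not the actual expressions in rs6000.cc:

```c
#include <stdbool.h>

/* XOR form: true when the two flags agree, but easy to misread because
   of how ! and ^ bind relative to surrounding operators. */
bool agree_xor(bool swapped, bool big_endian)
{
    return swapped ^ !big_endian;
}

/* Equality form: the same truth table, stated directly. */
bool agree_eq(bool swapped, bool big_endian)
{
    return swapped == big_endian;
}
```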

2023-10-11  Jeevitha Palanisamy  <jeevitha@linux.ibm.com>

gcc/
	PR target/106907
	* config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Change bitwise
	xor to an equality and fix comment indentation.
2023-10-30 05:38:19 -05:00
Richard Biener
ff4cea05a6 PR testsuite/111462 - add powerpc64le to list of ssa-sink-18.c XFAIL
PR testsuite/111462
gcc/testsuite/
	* gcc.dg/tree-ssa/ssa-sink-18.c: XFAIL also powerpc64le.
2023-10-30 11:03:03 +01:00
Juzhe-Zhong
eb1cdb3e43 RISC-V: Fix bugs of handling scalar of SEW64 vx instruction in RV32
sew64_scalar_helper handles the SEW64 vx instruction pattern on RV32 systems.
According to the RVV ISA, we cannot directly use a vx instruction of SEW64 on an RV32 system,
since an RV32 general-purpose register is only 32 bits wide.

Consider this following case:

vsetvl e64m1
vadd.vx v,v,x

will be transformed by sew64_scalar_helper into:

vsetvl e64m1
sw
sw
vlse v
vadd.vv
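
A rough C model of the sw/sw/vlse part of that sequence: the 64-bit scalar is held as two 32-bit halves, written to a stack slot, and read back as one 64-bit element. The function name and the byte-wise memory model are assumptions made for this sketch:

```c
#include <stdint.h>

/* Model of the RV32 fallback: the 64-bit scalar lives in two 32-bit
   registers, is stored with two sw instructions, and is then reloaded
   as one 64-bit element (little-endian, as on RISC-V). */
uint64_t spill_and_reload(uint32_t lo, uint32_t hi)
{
    unsigned char slot[8];
    for (int i = 0; i < 4; i++) {
        slot[i]     = (unsigned char)(lo >> (8 * i));   /* sw lo, 0(sp) */
        slot[4 + i] = (unsigned char)(hi >> (8 * i));   /* sw hi, 4(sp) */
    }
    uint64_t elem = 0;
    for (int i = 7; i >= 0; i--)
        elem = (elem << 8) | slot[i];                   /* 64-bit load */
    return elem;
}
```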

This bug is reported by Robin.
(insn 143 179 230 9 (set (reg:SI 15 a5 [234])
        (unspec:SI [
                (const_int 64 [0x40])
            ] UNSPEC_VLMAX)) 751 {vlmax_avlsi}
     (expr_list:REG_EQUIV (unspec:SI [
                (const_int 64 [0x40])
            ] UNSPEC_VLMAX)
        (nil)))
(insn 230 143 78 9 (parallel [
            (set (reg:SI 66 vl)
                (unspec:SI [
                        (reg:SI 15 a5 [234])
                        (const_int 64 [0x40])
                        (const_int 0 [0])
                    ] UNSPEC_VSETVL))
            (set (reg:SI 67 vtype)
                (unspec:SI [
                        (const_int 64 [0x40])
                        (const_int 0 [0])
                        (const_int 1 [0x1]) repeated x2
                    ] UNSPEC_VSETVL))
        ]) "bug.c":14:14 discrim 1 1469 {vsetvl_discard_resultsi}
     (nil))
(insn 78 230 84 9 (set (reg:RVVM1DI 102 v6 [203])
        (if_then_else:RVVM1DI (unspec:RVVMF64BI [
                    (const_vector:RVVMF64BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (const_int 0 [0])
                    (const_int 2 [0x2]) repeated x2
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (vec_duplicate:RVVM1DI (mem/u/c:DI (reg/f:SI 29 t4 [230]) [0  S8 A64]))
            (unspec:RVVM1DI [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))) "bug.c":14:14 discrim 1 1872 {*pred_broadcastrvvm1di}
     (expr_list:REG_DEAD (reg/f:SI 29 t4 [230])
        (nil)))

The root cause is that VLMAX handling was missed, since this code was written
a long time ago (callers were always intrinsics code, with no VLMAX situation).

With this patch, the following failure is fixed:

FAIL: gcc.target/riscv/rvv/autovec/unop/popcount-run-1.c execution test

gcc/ChangeLog:

	* config/riscv/riscv-protos.h (sew64_scalar_helper): Fix bug.
	* config/riscv/riscv-v.cc (sew64_scalar_helper): Ditto.
	* config/riscv/vector.md: Ditto.
2023-10-30 15:48:29 +08:00
Paul Thomas
f3e44d0797 Fortran: Fix a problem with SELECT TYPE selectors [PR104555].
2023-10-30  Paul Thomas  <pault@gcc.gnu.org>

gcc/fortran
	PR fortran/104555
	* resolve.cc (resolve_select_type): If the selector expression
	has no class component references and the expression is a
	derived type, copy the typespec of the symbol to that of the
	expression.

gcc/testsuite/
	PR fortran/104555
	* gfortran.dg/pr104555.f90: New test.
2023-10-30 07:12:40 +00:00
liuhongt
8c40b72036 Improve memcmpeq for 512-bit vector with vpcmpeq + kortest.
When 2 vectors are equal, kmask is all ones and kortest will set CF,
otherwise CF will be cleared.

So CF bit can be used to check for the result of the comparison.
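
A scalar model of this check, written in plain C rather than with intrinsics. The function names are hypothetical; the mask has one bit per 32-bit lane, as vpcmpeqd would produce for a 512-bit vector:

```c
#include <stdint.h>
#include <string.h>

/* Build a 16-bit mask with one bit per 32-bit lane, set when the lanes
   are equal (a scalar model of vpcmpeqd on a 512-bit vector). */
uint16_t cmpeq_mask_512(const void *p, const void *q)
{
    uint16_t mask = 0;
    for (int lane = 0; lane < 16; lane++) {
        uint32_t a, b;
        memcpy(&a, (const unsigned char *)p + 4 * lane, sizeof a);
        memcpy(&b, (const unsigned char *)q + 4 * lane, sizeof b);
        if (a == b)
            mask |= (uint16_t)(1u << lane);
    }
    return mask;
}

/* Model of kortestw: the carry flag is set only when the OR of the
   two masks is all ones. */
int kortest_cf(uint16_t k1, uint16_t k2)
{
    return (uint16_t)(k1 | k2) == 0xFFFF;
}

/* 64-byte equality test built from the two steps above. */
int memcmpeq64(const void *p, const void *q)
{
    uint16_t k = cmpeq_mask_512(p, q);
    return kortest_cf(k, k);
}
```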

Before:
        vmovdqu (%rsi), %ymm0
        vpxorq  (%rdi), %ymm0, %ymm0
        vptest  %ymm0, %ymm0
        jne     .L2
        vmovdqu 32(%rsi), %ymm0
        vpxorq  32(%rdi), %ymm0, %ymm0
        vptest  %ymm0, %ymm0
        je      .L5
.L2:
        movl    $1, %eax
        xorl    $1, %eax
        vzeroupper
        ret

After:
        vmovdqu64       (%rsi), %zmm0
        xorl    %eax, %eax
        vpcmpeqd        (%rdi), %zmm0, %k0
        kortestw        %k0, %k0
        setc    %al
        vzeroupper
        ret

gcc/ChangeLog:

	PR target/104610
	* config/i386/i386-expand.cc (ix86_expand_branch): Handle
	512-bit vector with vpcmpeq + kortest.
	* config/i386/i386.md (cbranchxi4): New expander.
	* config/i386/sse.md: (cbranch<mode>4): Extend to V16SImode
	and V8DImode.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr104610-2.c: New test.
2023-10-30 11:10:01 +08:00
Haochen Gui
8111b5c23b Expand: Checking available optabs for scalar modes in by pieces operations
The former patch (f08ca5903c) examines the scalar modes with the target
hook scalar_mode_supported_p.  It causes some i386 regressions,
as XImode and OImode are not enabled in the i386 target function.  This
patch examines the scalar mode by checking whether the corresponding optabs
are available for the mode.

gcc/
	PR target/111449
	* expr.cc (qi_vector_mode_supported_p): Rename to...
	(by_pieces_mode_supported_p): ...this, and extend it to do
	the checking for both scalar and vector mode.
	(widest_fixed_size_mode_for_size): Call
	by_pieces_mode_supported_p to examine the mode.
	(op_by_pieces_d::smallest_fixed_size_mode_for_size): Likewise.
2023-10-30 11:02:29 +08:00
GCC Administrator
39a11d8e0b Daily bump. 2023-10-30 00:17:23 +00:00
François Dumont
3c444fb2ff libstdc++: [_GLIBCXX_INLINE_VERSION] Add emul TLS symbols
libstdc++-v3/ChangeLog:

	* config/abi/pre/gnu-versioned-namespace.ver: Add missing emul TLS
	symbols.
2023-10-29 22:20:28 +01:00
François Dumont
5d1b723cef libstdc++: [_GLIBCXX_INLINE_VERSION] Provide handle_contract_violation symbol
libstdc++-v3/ChangeLog:

	* src/experimental/contract.cc
	[_GLIBCXX_INLINE_VERSION](handle_contract_violation): Provide symbol
	without version namespace decoration for gcc.
2023-10-29 22:10:33 +01:00
Iain Buclaw
ea8ffdcadb d: Fix ICE: verify_gimple_failed (conversion of register to a different size in 'view_convert_expr')
Static arrays in D are passed around by value, rather than decaying to a
pointer.  On x86_64 __builtin_va_list is an exception to this rule, but
semantically it's still treated as a static array.

This makes certain assignment operations fail due to a mismatch in types.
As all examples in the test program are rejected by C/C++ front-ends,
these are now errors in D too to be consistent.

	PR d/110712

gcc/d/ChangeLog:

	* d-codegen.cc (d_build_call): Update call to convert_for_argument.
	* d-convert.cc (is_valist_parameter_type): New function.
	(check_valist_conversion): New function.
	(convert_for_assignment): Update signature.  Add check whether
	assigning va_list is permissible.
	(convert_for_argument): Likewise.
	* d-tree.h (convert_for_assignment): Update signature.
	(convert_for_argument): Likewise.
	* expr.cc (ExprVisitor::visit (AssignExp *)): Update call to
	convert_for_assignment.

gcc/testsuite/ChangeLog:

	* gdc.dg/pr110712.d: New test.
2023-10-29 20:13:14 +01:00
Iain Buclaw
e773c6c700 d: Merge upstream dmd, druntime e48bc0987d, phobos 2458e8f82.
D front-end changes:

    - Import dmd v2.106.0-beta.1.

D runtime changes:

    - Import druntime v2.106.0-beta.1.

Phobos changes:

    - Import phobos v2.106.0-beta.1.

gcc/d/ChangeLog:

	* dmd/MERGE: Merge upstream dmd e48bc0987d.
	* expr.cc (ExprVisitor::visit (NewExp *)): Update for new front-end
	interface.
	* runtime.def (NEWARRAYT): Remove.
	(NEWARRAYIT): Remove.

libphobos/ChangeLog:

	* libdruntime/MERGE: Merge upstream druntime e48bc0987d.
	* src/MERGE: Merge upstream phobos 2458e8f82.
2023-10-29 16:41:29 +01:00
Iain Sandoe
c6929b0855 testsuite, X86, Darwin: Skip a test for mcmodel=large.
The large model is not implemented so far for Darwin (and the
codegen will be different when it is).

gcc/testsuite/ChangeLog:

	* gcc.target/i386/large-data.c: Skip for Darwin.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
2023-10-29 07:12:48 +00:00
Iain Sandoe
78491bee70 testsuite, X86, Darwin: Skip tests with incompatible output.
Darwin platforms do not currently emit .cfi_xxx instructions, so these
tests do not work there.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/apx-interrupt-1.c: Skip for Darwin.
	* gcc.target/i386/apx-push2pop2-1.c: Likewise.
	* gcc.target/i386/apx-push2pop2_force_drap-1.c: Likewise.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
2023-10-29 07:07:07 +00:00
Martin Uecker
d96757842a tree-optimization/109334: Improve computation for access attribute
The fix for PR104970 restricted size computations to the case
where the access attribute was specified explicitly (no VLA).
It also restricted it to void pointers or elements with constant
sizes.  The second restriction is enough to fix the original bug.
Revert the first change to again allow size computations for VLA
parameters and for VLA parameters together with an explicit access
attribute.

gcc/ChangeLog:

	PR tree-optimization/109334
	* tree-object-size.cc (parm_object_size): Allow size
	computation for implicit access attributes.

gcc/testsuite/ChangeLog:

	PR tree-optimization/109334
	* gcc.dg/builtin-dynamic-object-size-0.c
	(test_parmsz_simple3): Supported again.
	(test_parmsz_external4): New test.
	* gcc.dg/builtin-dynamic-object-size-20.c: New test.
	* gcc.dg/pr104970.c: New test.
2023-10-29 07:47:36 +01:00
Max Filippov
cc7aca846a gcc: xtensa: fix salt/saltu version check
gcc/
	* config/xtensa/xtensa.h (TARGET_SALT): Change HW version from
	260000 (which corresponds to RF-2014.0) to 270000 (which
	corresponds to RG-2015.0, the release where salt/saltu opcodes
	were introduced).
2023-10-28 19:57:30 -07:00
Pan Li
b8b63e8766 RISC-V: Fix one range-loop-construct warning of avlprop
This patch fixes one warning in avlprop, shown below.

../../gcc/config/riscv/riscv-avlprop.cc: In member function 'virtual
unsigned int pass_avlprop::execute(function*)':
../../gcc/config/riscv/riscv-avlprop.cc:346:23: error: loop variable
'candidate' creates a copy from type 'const std::pair<avlprop_type,
rtl_ssa::insn_info*>' [-Werror=range-loop-construct]
  346 |       for (const auto candidate : m_candidates)
      |                       ^~~~~~~~~
../../gcc/config/riscv/riscv-avlprop.cc:346:23: note: use reference type
to prevent copying
  346 |       for (const auto candidate : m_candidates)
      |                       ^~~~~~~~~
      |                       &

gcc/ChangeLog:

	* config/riscv/riscv-avlprop.cc (pass_avlprop::execute): Use
	reference type to prevent copying.

Signed-off-by: Pan Li <pan2.li@intel.com>
2023-10-29 08:40:11 +08:00
GCC Administrator
b0f702922a Daily bump. 2023-10-29 00:17:16 +00:00
Iain Buclaw
10f1489dcb d: Fix ICE: in verify_gimple_in_seq on powerpc-darwin9 [PR112270]
This ICE was seen during stage2 on powerpc-darwin9 only.  There were
still some uses of GCC's boolean_type_node in the D front-end, which
caused a type mismatch to trigger as D bool size is fixed to 1 byte on
all targets.

So two new nodes have been introduced - d_bool_false_node and
d_bool_true_node - which have replaced all remaining uses of
boolean_false_node and boolean_true_node respectively.

	PR d/112270

gcc/d/ChangeLog:

	* d-builtins.cc (d_build_d_type_nodes): Initialize d_bool_false_node,
	d_bool_true_node.
	* d-codegen.cc (build_array_struct_comparison): Use d_bool_false_node
	instead of boolean_false_node.
	* d-convert.cc (d_truthvalue_conversion): Use d_bool_false_node and
	d_bool_true_node instead of boolean_false_node and boolean_true_node.
	* d-tree.h (enum d_tree_index): Add DTI_BOOL_FALSE and DTI_BOOL_TRUE.
	(d_bool_false_node): New macro.
	(d_bool_true_node): New macro.
	* modules.cc (build_dso_cdtor_fn): Use d_bool_false_node and
	d_bool_true_node instead of boolean_false_node and boolean_true_node.
	(register_moduleinfo): Use d_bool_type instead of boolean_type_node.

gcc/testsuite/ChangeLog:

	* gdc.dg/pr112270.d: New test.
2023-10-29 00:36:30 +02:00
Iain Buclaw
5d2a360f0a d: Add warning for call expression without side effects
In the last merge of the dmd front-end with upstream (r14-4830), this
warning got removed from the semantic passes.  Reimplement the warning
for the code generation pass instead, where it cannot have an effect on
conditional compilation.

gcc/d/ChangeLog:

	* d-codegen.cc (call_side_effect_free_p): New function.
	* d-tree.h (CALL_EXPR_WARN_IF_UNUSED): New macro.
	(call_side_effect_free_p): New prototype.
	* expr.cc (ExprVisitor::visit (CallExp *)): Set
	CALL_EXPR_WARN_IF_UNUSED on matched call expressions.
	(ExprVisitor::visit (NewExp *)): Don't dereference the result of an
	allocation call here.
	* toir.cc (add_stmt): Emit warning when call expression added to
	statement list without being used.

gcc/testsuite/ChangeLog:

	* gdc.dg/Wunused_value.d: New test.
2023-10-28 09:53:36 +02:00
GCC Administrator
7f974c5fd4 Daily bump.
2023-10-28 00:16:37 +00:00
Vladimir N. Makarov
4d3d2cdb57 [RA]: Fixing i686 bootstrap failure because of pushing the equivalence patch
GCC with my recent patch improving cost calculation for pseudos with
equivalence may generate different code with and without debug info,
and as a result bootstrap fails on i686.  The patch fixes this bug.
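
A toy model of why the predicate matters, as a sketch only.  The enum
and helpers are hypothetical and do not use GCC's rtx types; they only
mirror the idea that INSN_P also accepts debug insns while
NONDEBUG_INSN_P does not, so a count based on the former changes when
debug info is enabled.

```cpp
#include <vector>

// Toy model of insn kinds; GCC's real rtx codes are not used here.
enum class InsnKind { kInsn, kJump, kCall, kDebug, kNote };

// Mirrors the intent of INSN_P: any insn, including debug insns.
bool insn_p(InsnKind k) { return k != InsnKind::kNote; }

// Mirrors the intent of NONDEBUG_INSN_P: insns excluding debug insns.
bool nondebug_insn_p(InsnKind k) {
  return insn_p(k) && k != InsnKind::kDebug;
}

// Count the insns that a cost calculation would take into account.
template <typename Pred>
int count_costed(const std::vector<InsnKind> &stream, Pred pred) {
  int n = 0;
  for (InsnKind k : stream)
    if (pred(k))
      ++n;
  return n;
}
```

With the same stream compiled with and without debug insns, only the
nondebug predicate yields the same count in both cases.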

gcc/ChangeLog:

	PR rtl-optimization/112107
	* ira-costs.cc: (calculate_equiv_gains): Use NONDEBUG_INSN_P
	instead of INSN_P.
2023-10-27 15:12:29 -04:00
Patrick O'Neill
92fcbe8a32 RISC-V: Make stack_save_restore_2 more robust
GCC recently changed to emit __riscv_restore_5 which causes this
testcase to fail.
This patch updates the regex to be more robust to change by accepting
any number after __riscv_save_ and __riscv_restore_.
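
The idea can be sketched with std::regex as below.  This is not the
pattern from the testcase, which this message does not quote; the
helper name and the exact expression are assumptions made for
illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if 'line' mentions a save or restore routine
// followed by any decimal number, e.g. __riscv_save_4 or __riscv_restore_5.
bool mentions_save_restore(const std::string &line) {
  static const std::regex pattern("__riscv_(save|restore)_[0-9]+");
  return std::regex_search(line, pattern);
}
```

Matching on "[0-9]+" rather than a fixed digit keeps such a check valid
when the compiler picks a different routine number.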

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/stack_save_restore_2.c: Accept any number
	after __riscv_save_ and __riscv_restore_.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2023-10-27 11:13:56 -07:00
Gaius Mulley
32cc0b82a3 PR modula2/112110: fails to build on freebsd when compiling wrapclock.cc
This patch fixes a mangled #if #endif conditional section within
wrapclock.cc.  The conditional section in wrapclock_timezone
should return 0 rather than return timezone.
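
A rough sketch of a well-formed conditional section of this kind.  The
function name and the HYPOTHETICAL_HAVE_TM_GMTOFF macro are invented
for this example and are not the names used in wrapclock.cc; tm_gmtoff
is a common extension to struct tm and is only read when that macro is
defined.

```cpp
#include <ctime>

// HYPOTHETICAL_HAVE_TM_GMTOFF is a made-up feature macro for this sketch;
// a real build would take such a flag from its configure checks.
long timezone_offset_or_zero(const std::tm *t) {
#if defined(HYPOTHETICAL_HAVE_TM_GMTOFF)
  return t->tm_gmtoff;  // the offset is available in the tm struct
#else
  (void) t;             // nothing to read on this configuration
  return 0;             // fall back to 0 instead of a global 'timezone'
#endif
}
```

Both branches end in a return, so the function compiles on either
configuration.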

libgm2/ChangeLog:

	PR modula2/112110
	* libm2iso/wrapclock.cc (timezone): Return 0 if unable
	to get the timezone from the tm struct.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2023-10-27 18:42:09 +01:00
Harald Anlauf
c6430d3e6d Fortran: diagnostics of MODULE PROCEDURE declaration conflicts [PR104649]
gcc/fortran/ChangeLog:

	PR fortran/104649
	* decl.cc (gfc_match_formal_arglist): Handle conflicting declarations
	of a MODULE PROCEDURE when one of the declarations is an alternate
	return.

gcc/testsuite/ChangeLog:

	PR fortran/104649
	* gfortran.dg/pr104649.f90: New test.

Co-authored-by: Steven G. Kargl <kargl@gcc.gnu.org>
2023-10-27 19:17:46 +02:00
Andrew Stubbs
9f3c4c673d amdgcn: Fix bug in gfx1030 support patch
The previous patch to add gfx1030 support introduced an issue with passing
exit codes from kernels run under gcn-run (offload kernels were unaffected).

gcc/ChangeLog:

	PR target/112088
	* config/gcn/gcn.cc (gcn_expand_epilogue): Fix kernel epilogue register
	conflict.
2023-10-27 18:00:55 +01:00
Andrew Stubbs
9ae1fbdd38 amdgcn: silence warnings
The operands really should be VOIDmode, so the warnings are false positives.

gcc/ChangeLog:

	* config/gcn/gcn-valu.md
	(vec_extract<V_1REG:mode><V_1REG_ALT:mode>_nop): Mention "operands" in
	condition to silence the warnings.
	(vec_extract<V_2REG:mode><V_2REG_ALT:mode>_nop): Likewise.
	* config/gcn/gcn.md (*movti_insn): Likewise.
2023-10-27 18:00:55 +01:00
Richard Sandiford
2672c60917 recog: Fix propagation into ASM_OPERANDS
An inline asm with multiple output operands is represented as a
parallel set in which the SET_SRCs are the same (shared) ASM_OPERANDS.
insn_propagation didn't account for this, and instead propagated
into each ASM_OPERANDS individually.  This meant that it could
apply a substitution X->Y to Y itself, which (a) could create
circularity and (b) would be semantically wrong in any case,
since Y might use a different value of X.

This patch checks explicitly for parallels involving ASM_OPERANDS,
just like combine does.
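
The double application described above can be shown with a toy model.
Everything below is a hypothetical stand-in, not GCC's RTL or the code
in recog.cc, and the "visit each distinct node once" approach is a
simplification chosen for the sketch rather than the check the patch
adds.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <vector>

// Toy stand-ins for RTL; these are not GCC's real data structures.
struct AsmOperands { std::string input; };
struct Set { std::string dest; std::shared_ptr<AsmOperands> src; };

// Replace each occurrence of 'from' in 'text' with 'to' in a single pass,
// without rescanning the text that was just inserted.
std::string substitute(const std::string &text, const std::string &from,
                       const std::string &to) {
  assert(!from.empty());
  std::string out;
  std::size_t pos = 0;
  for (;;) {
    std::size_t hit = text.find(from, pos);
    if (hit == std::string::npos)
      break;
    out.append(text, pos, hit - pos);
    out += to;
    pos = hit + from.size();
  }
  out.append(text, pos, std::string::npos);
  return out;
}

// Naive: rewrite every SET's source, so a shared node is rewritten once
// per SET that points at it.
void propagate_per_set(std::vector<Set> &sets, const std::string &from,
                       const std::string &to) {
  for (Set &s : sets)
    s.src->input = substitute(s.src->input, from, to);
}

// Shared-aware: rewrite each distinct node exactly once.
void propagate_once(std::vector<Set> &sets, const std::string &from,
                    const std::string &to) {
  std::set<const AsmOperands *> seen;
  for (Set &s : sets)
    if (seen.insert(s.src.get()).second)
      s.src->input = substitute(s.src->input, from, to);
}
```

With two SETs sharing one node and a replacement that itself mentions
the replaced name, the per-SET version substitutes into its own result,
while the shared-aware version applies the substitution once.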

gcc/
	* recog.cc (insn_propagation::apply_to_pattern_1): Handle shared
	ASM_OPERANDS.
2023-10-27 16:37:11 +01:00
Patrick Palka
6ff8b93c7b c++: another build_new_1 folding fix [PR111929]
In build_new_1, we also need to avoid folding 'outer_nelts_check' when
in a template context to prevent an ICE on the below testcase.  This
patch replaces the problematic fold_build2 call with build2 (we'll later
fold it if appropriate during cp_fully_fold).

In passing, this patch removes an unnecessary conversion of 'nelts'
since it should always already be a size_t (and 'convert' isn't the best
conversion entry point to use anyway since it lacks a complain parameter).

	PR c++/111929

gcc/cp/ChangeLog:

	* init.cc (build_new_1): Remove unnecessary call to convert
	on 'nelts'.  Use build2 instead of fold_build2 for
	'outer_nelts_check'.

gcc/testsuite/ChangeLog:

	* g++.dg/template/non-dependent28a.C: New test.
2023-10-27 11:31:02 -04:00
Patrick Palka
68e97c5442 c++: add testcase verifying non-dep new-expr checking
gcc/testsuite/ChangeLog:

	* g++.dg/template/new14.C: New test.
2023-10-27 11:26:40 -04:00
Patrick Palka
0f2e208068 c++: more ahead-of-time -Wparentheses warnings
Now that we don't have to worry about looking through NON_DEPENDENT_EXPR,
we can easily extend the -Wparentheses warning in convert_for_assignment
to consider (non-dependent) templated assignment operator expressions as
well, like r14-4111-g6e92a6a2a72d3b did in maybe_convert_cond.
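
A hypothetical example of the kind of code this concerns; it is not
taken from the testsuite.  The assignment to a bool from a chained
non-bool assignment is non-dependent even though it sits in a template,
so the unparenthesized form in the comment is the sort of expression
the warning would be expected to flag at definition time, assuming
-Wparentheses is enabled.

```cpp
// Hypothetical example; not one of the testcases listed below.
template <typename T>
bool assign_chain(T, int value) {
  int tmp = 0;
  bool flag = false;
  // flag = tmp = value;  // unparenthesized form that -Wparentheses targets
  flag = (tmp = value);   // extra parentheses mark the assignment as intended
  return flag;
}
```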

gcc/cp/ChangeLog:

	* cp-tree.h (maybe_warn_unparenthesized_assignment): Declare.
	* semantics.cc (is_assignment_op_expr_p): Generalize to return
	true for any assignment operator expression, not just one that
	has been resolved to an operator overload.
	(maybe_warn_unparenthesized_assignment): Factored out from ...
	(maybe_convert_cond): ... here.
	(finish_parenthesized_expr): Mention
	maybe_warn_unparenthesized_assignment.
	* typeck.cc (convert_for_assignment): Replace -Wparentheses
	warning logic with maybe_warn_unparenthesized_assignment.

gcc/testsuite/ChangeLog:

	* g++.dg/warn/Wparentheses-13.C: Strengthen by expecting that
	we issue the -Wparentheses warnings ahead of time.
	* g++.dg/warn/Wparentheses-23.C: Likewise.
	* g++.dg/warn/Wparentheses-32.C: Remove xfails.
2023-10-27 11:14:04 -04:00