Commit graph

205607 commits

Author SHA1 Message Date
Jan Hubicka
53ba8d6695 inter-procedural value range propagation
Implement very basic propagation of return value ranges from VRP
pass.  This helps std::vector's push_back since we work out value range of
allocated block.  This propagates only within single translation unit.  I hoped
we will also do the propagation at WPA stage, but that needs more work on
ipa-cp side.

I also added code auto-detecting returns_nonnull and the corresponding -Wsuggest-attribute.
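A minimal sketch of the kind of code this affects, assuming a GCC-compatible compiler. The function names (last_digit, checked_digit, slot_or_fallback) are hypothetical and are not taken from the patch or its testcase. Once the range of last_digit's return value is known at the call site, the d > 9 test in the caller may be folded away. The second helper never returns NULL, so it is the sort of function the new warning might flag; here the attribute is written out by hand.

```c
#include <stddef.h>

/* Hypothetical helper: every return path yields a value in [0, 9].
   noinline keeps the call visible so only the recorded return value
   range, not inlining, can tell the caller about it.  */
__attribute__ ((noinline))
static int
last_digit (unsigned int x)
{
  return (int) (x % 10);
}

/* With the range [0, 9] known for the call result, the d > 9 branch
   is dead and can be removed by the optimizer.  */
int
checked_digit (unsigned int x)
{
  int d = last_digit (x);
  if (d > 9)
    return -1;
  return d;
}

static int fallback;

/* Hypothetical helper that never returns NULL.  The attribute is the
   existing GCC returns_nonnull function attribute.  */
__attribute__ ((returns_nonnull))
int *
slot_or_fallback (int *p)
{
  return p ? p : &fallback;
}
```

Whether a given compiler actually removes the dead branch depends on the optimization level and on the pass being enabled; the sketch only shows the shape of the opportunity.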

gcc/ChangeLog:

	* cgraph.cc (add_detected_attribute_1): New function.
	(cgraph_node::add_detected_attribute): Likewise.
	* cgraph.h (cgraph_node::add_detected_attribute): Declare.
	* common.opt: Add -Wsuggest-attribute=returns_nonnull.
	* doc/invoke.texi: Document new flag.
	* gimple-range-fold.cc (fold_using_range::range_of_call):
	Use known return value ranges.
	* ipa-prop.cc (struct ipa_return_value_summary): New type.
	(class ipa_return_value_sum_t): New type.
	(ipa_return_value_sum): New summary.
	(ipa_record_return_value_range): New function.
	(ipa_return_value_range): New function.
	* ipa-prop.h (ipa_return_value_range): Declare.
	(ipa_record_return_value_range): Declare.
	* ipa-pure-const.cc (warn_function_returns_nonnull): New function.
	* ipa-utils.h (warn_function_returns_nonnull): Declare.
	* symbol-summary.h: Fix comment.
	* tree-vrp.cc (execute_ranger_vrp): Record return values.

gcc/testsuite/ChangeLog:

	* g++.dg/ipa/devirt-2.C: Add noipa attribute to prevent ipa-vrp.
	* g++.dg/ipa/devirt-7.C: Disable ipa-vrp.
	* g++.dg/ipa/ipa-icf-2.C: Disable ipa-vrp.
	* g++.dg/ipa/ipa-icf-3.C: Disable ipa-vrp.
	* g++.dg/ipa/ivinline-1.C: Disable ipa-vrp.
	* g++.dg/ipa/ivinline-3.C: Disable ipa-vrp.
	* g++.dg/ipa/ivinline-5.C: Disable ipa-vrp.
	* g++.dg/ipa/ivinline-8.C: Disable ipa-vrp.
	* g++.dg/ipa/nothrow-1.C: Disable ipa-vrp.
	* g++.dg/ipa/pure-const-1.C: Disable ipa-vrp.
	* g++.dg/ipa/pure-const-2.C: Disable ipa-vrp.
	* g++.dg/lto/inline-crossmodule-1_0.C: Disable ipa-vrp.
	* gcc.c-torture/compile/pr106433.c: Add noipa attribute to prevent ipa-vrp.
	* gcc.c-torture/execute/frame-address.c: Likewise.
	* gcc.dg/vla-1.c: Add noipa attribute to prevent ipa-vrp.
	* gcc.dg/ipa/fopt-info-inline-1.c: Disable ipa-vrp.
	* gcc.dg/ipa/ipa-icf-25.c: Disable ipa-vrp.
	* gcc.dg/ipa/ipa-icf-38.c: Disable ipa-vrp.
	* gcc.dg/ipa/pure-const-1.c: Disable ipa-vrp.
	* gcc.dg/ipa/remref-0.c: Add noipa attribute to prevent ipa-vrp.
	* gcc.dg/tree-prof/time-profiler-1.c: Disable ipa-vrp.
	* gcc.dg/tree-prof/time-profiler-2.c: Disable ipa-vrp.
	* gcc.dg/tree-ssa/pr110269.c: Disable ipa-vrp.
	* gcc.dg/tree-ssa/pr20701.c: Disable ipa-vrp.
	* gcc.dg/tree-ssa/vrp05.c: Disable ipa-vrp.
	* gcc.dg/tree-ssa/return-value-range-1.c: New test.
2023-11-20 19:37:45 +01:00
Richard Biener
57c028acbe tree-optimization/112618 - unused .MASK_CALL
We have to make sure to remove unused .MASK_CALL internal function
calls after vectorization.

	PR tree-optimization/112618
	* tree-vect-loop.cc (vect_transform_loop_stmt): For not
	relevant and unused .MASK_CALL make sure we remove the
	scalar stmt.

	* gcc.dg/pr112618.c: New testcase.
2023-11-20 14:58:10 +01:00
Richard Biener
3b34902417 tree-optimization/112281 - loop distribution and zero dependence distances
The following fixes an omission in dependence testing for loop
distribution.  When the overall dependence distance is not zero but
the dependence direction in the innermost common loop is = there is
a conflict between the partitions and we have to merge them.

	PR tree-optimization/112281
	* tree-loop-distribution.cc
	(loop_distribution::pg_add_dependence_edges): For = in the
	innermost common loop record a partition conflict.

	* gcc.dg/torture/pr112281-1.c: New testcase.
	* gcc.dg/torture/pr112281-2.c: Likewise.
2023-11-20 14:58:10 +01:00
Richard Biener
b7a1b89e60 middle-end/112622 - convert and vector-to-float
The following avoids ICEing when trying to convert a vector to
a scalar float.

	PR middle-end/112622
	* convert.cc (convert_to_real_1): Use element_precision
	where a vector type might appear.  Provide specific
	diagnostic for unexpected vector argument.

	* gcc.dg/pr112622.c: New testcase.
	* gcc.dg/simd-2.c: Adjust.
	* gcc.target/i386/vect-bfloat16-typecheck_1.c: Likewise.
	* gcc.target/i386/vect-bfloat16-typecheck_2.c: Likewise.
2023-11-20 14:57:52 +01:00
Juzhe-Zhong
a27f587816 RISC-V: Fix intermediate mode on slide1 instruction for SEW64 on RV32
This bug was discovered on PR112597, with -march=rv32gcv_zvl256b --param=riscv-autovec-preference=fixed-vlmax

ICE:
bug.c:10:1: error: unrecognizable insn:
   10 | }
      | ^
(insn 10 9 11 2 (set (reg:V4SI 140)
        (unspec:V4SI [
                (unspec:V4BI [
                        (const_vector:V4BI [
                                (const_int 1 [0x1]) repeated x4
                            ])
                        (const_int 4 [0x4])
                        (const_int 2 [0x2]) repeated x3
                        (reg:SI 66 vl)
                        (reg:SI 67 vtype)
                    ] UNSPEC_VPREDICATE)
                (unspec:V4SI [
                        (reg:SI 0 zero)
                    ] UNSPEC_VUNDEF)
                (subreg:V4SI (reg:V2DI 138 [ v ]) 0)
                (subreg:SI (reg/v:DI 136 [ b ]) 0)
            ] UNSPEC_VSLIDE1DOWN)) "bug.c":8:10 -1
     (nil))

The root cause is that we don't enable V4SImode; instead, we already have RVVMF2SI, which is exactly the same as V4SI
with -march=rv32gcv_zvl256b and --param=riscv-autovec-preference=fixed-vlmax.

The VDEMOTE attribute mapping to V4SI is incorrect, so we remove the attributes and use get_vector_mode to get
the right mode.

	PR target/112597

gcc/ChangeLog:

	* config/riscv/vector-iterators.md: Remove VDEMOTE and VMDEMOTE.
	* config/riscv/vector.md: Fix slide1 intermediate mode bug.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/pr112597-1.c: New test.
2023-11-20 21:54:35 +08:00
Robin Dapp
b3677563cd RISC-V: Disallow 64-bit indexed loads and stores for rv32gcv.
We currently allow 64-bit indices/offsets for vector indexed loads and
stores even on rv32 but we should not.

This patch adjusts the iterators as well as the insn conditions to
reflect the RVV spec.

It also fixes an oversight in the VLS modes of the demote iterator that
was found while testing the patch.

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (gather_scatter_valid_offset_mode_p):
	Add check for XLEN == 32.
	* config/riscv/vector-iterators.md: Change VLS part of the
	demote iterator to 2x elements modes.
	* config/riscv/vector.md: Adjust iterators and insn conditions.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-1.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_32-1.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-10.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_32-10.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-11.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_32-11.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-12.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_32-12.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-2.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_32-2.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-3.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_32-3.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-4.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_32-4.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-5.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_32-5.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-6.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_32-6.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-7.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_32-7.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-8.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_32-8.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-9.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_32-9.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-1.c:
	Adjust include.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-10.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-11.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-5.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-6.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-7.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-8.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-9.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-1.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-1.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-10.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-10.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-11.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-11.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-2.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-2.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-3.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-3.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-4.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-4.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-5.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-5.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-6.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-6.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-7.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-7.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-8.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-8.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-9.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-9.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-1.c:
	Adjust include.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-10.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-5.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-6.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-7.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-8.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-9.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-1.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-1.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-10.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-10.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-2.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-2.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-3.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-3.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-4.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-4.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-5.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-5.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-6.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-6.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-7.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-7.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-8.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-8.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-9.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-9.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-1.c:
	Adjust include.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-10.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-5.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-6.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-7.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-8.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-9.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-1.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_32-1.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-10.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_32-10.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-3.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_32-2.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-4.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_32-4.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-5.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_32-5.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-6.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_32-6.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-7.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_32-7.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-8.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_32-8.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-9.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_32-9.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-2.c: Moved to...
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_64-2.c: ...here.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-1.c:
	Adjust include.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-10.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-5.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-6.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-7.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-8.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-9.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-1.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-10.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-11.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-12.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-2.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-3.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-4.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-5.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-6.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-7.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-8.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-9.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-1.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-10.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-11.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-2.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-3.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-4.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-5.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-6.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-7.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-8.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-9.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_64-1.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_64-10.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_64-2.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_64-3.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_64-4.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_64-5.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_64-6.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_64-7.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_64-8.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_64-9.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_64-1.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_64-10.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_64-3.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_64-4.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_64-5.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_64-6.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_64-7.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_64-8.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_64-9.c: New test.
2023-11-20 14:16:30 +01:00
Christophe Lyon
4d7647edfd arm: [MVE intrinsics] rework vld1q vst1q
Implement vld1q, vst1q using the new MVE builtins framework.

2023-11-16  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	* config/arm/arm-mve-builtins-base.cc (vld1_impl, vld1q)
	(vst1_impl, vst1q): New.
	* config/arm/arm-mve-builtins-base.def (vld1q, vst1q): New.
	* config/arm/arm-mve-builtins-base.h (vld1q, vst1q): New.
	* config/arm/arm_mve.h
	(vld1q): Delete.
	(vst1q): Delete.
	(vld1q_s8): Delete.
	(vld1q_s32): Delete.
	(vld1q_s16): Delete.
	(vld1q_u8): Delete.
	(vld1q_u32): Delete.
	(vld1q_u16): Delete.
	(vld1q_f32): Delete.
	(vld1q_f16): Delete.
	(vst1q_f32): Delete.
	(vst1q_f16): Delete.
	(vst1q_s8): Delete.
	(vst1q_s32): Delete.
	(vst1q_s16): Delete.
	(vst1q_u8): Delete.
	(vst1q_u32): Delete.
	(vst1q_u16): Delete.
	(__arm_vld1q_s8): Delete.
	(__arm_vld1q_s32): Delete.
	(__arm_vld1q_s16): Delete.
	(__arm_vld1q_u8): Delete.
	(__arm_vld1q_u32): Delete.
	(__arm_vld1q_u16): Delete.
	(__arm_vst1q_s8): Delete.
	(__arm_vst1q_s32): Delete.
	(__arm_vst1q_s16): Delete.
	(__arm_vst1q_u8): Delete.
	(__arm_vst1q_u32): Delete.
	(__arm_vst1q_u16): Delete.
	(__arm_vld1q_f32): Delete.
	(__arm_vld1q_f16): Delete.
	(__arm_vst1q_f32): Delete.
	(__arm_vst1q_f16): Delete.
	(__arm_vld1q): Delete.
	(__arm_vst1q): Delete.
	* config/arm/mve.md (mve_vld1q_f<mode>): Rename into ...
	(@mve_vld1q_f<mode>): ... this.
	(mve_vld1q_<supf><mode>): Rename into ...
	(@mve_vld1q_<supf><mode>): ... this.
	(mve_vst1q_f<mode>): Rename into ...
	(@mve_vst1q_f<mode>): ... this.
	(mve_vst1q_<supf><mode>): Rename into ...
	(@mve_vst1q_<supf><mode>): ... this.
2023-11-20 11:23:57 +00:00
Christophe Lyon
3282fecd82 arm: [MVE intrinsics] fix vst1 tests
vst1q intrinsics return void, so we should not do 'return vst1q_f16 (base, value);'

This was OK so far, but will trigger an error/warning with the new
implementation of these intrinsics.

This patch just removes the 'return' keyword.

2023-11-16  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/testsuite/
	* gcc.target/arm/mve/intrinsics/vst1q_f16.c: Remove 'return'.
	* gcc.target/arm/mve/intrinsics/vst1q_f32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst1q_s16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst1q_s32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst1q_s8.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst1q_u16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst1q_u32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst1q_u8.c: Likewise.
2023-11-20 11:23:56 +00:00
Christophe Lyon
1145875206 arm: [MVE intrinsics] add load and store shapes
This patch adds the load and store shapes descriptions.

2023-11-16  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	* config/arm/arm-mve-builtins-shapes.cc (load, store): New.
	* config/arm/arm-mve-builtins-shapes.h (load, store): New.
2023-11-20 11:23:56 +00:00
Christophe Lyon
0c2037d9d9 arm: [MVE intrinsics] Add support for contiguous loads and stores
This patch adds base support for load/store intrinsics to the
framework, starting with loads and stores for contiguous memory
elements, without extension nor truncation.

Compared to the aarch64/SVE implementation, there's no support for
gather/scatter loads/stores yet.  This will be added later as needed.

2023-11-16  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	* config/arm/arm-mve-builtins-functions.h (multi_vector_function)
	(full_width_access): New classes.
	* config/arm/arm-mve-builtins.cc
	(find_type_suffix_for_scalar_type, infer_pointer_type)
	(require_pointer_type, get_contiguous_base, add_mem_operand)
	(add_fixed_operand, use_contiguous_load_insn)
	(use_contiguous_store_insn): New.
	* config/arm/arm-mve-builtins.h (memory_vector_mode)
	(infer_pointer_type, require_pointer_type, get_contiguous_base)
	(add_mem_operand)
	(add_fixed_operand, use_contiguous_load_insn)
	(use_contiguous_store_insn): New.
2023-11-20 11:23:56 +00:00
Christophe Lyon
524c892e64 arm: [MVE intrinsics] Add support for void and load/store pointers as argument types.
This patch adds support for '_', 'al' and 'as' for void, load pointer
and store pointer argument/return value types in intrinsic signatures.

It also adds a new memory_scalar_type() helper to function_instance,
which is used by 'al' and 'as'.

2023-11-16  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	* config/arm/arm-mve-builtins-shapes.cc (build_const_pointer):
	New.
	(parse_type): Add support for '_', 'al' and 'as'.
	* config/arm/arm-mve-builtins.h (function_instance): Add
	memory_scalar_type.
	(function_base): Likewise.
2023-11-20 11:23:56 +00:00
Christophe Lyon
b859218661 arm: Fix arm_simd_types and MVE scalar_types
So far we define arm_simd_types and scalar_types using type
definitions like intSI_type_node, etc...

This is causing problems with later patches which re-implement
load/store MVE intrinsics, leading to error messages such as:
  error: passing argument 1 of 'vst1q_s32' from incompatible pointer type
  note: expected 'int *' but argument is of type 'int32_t *' {aka 'long int *'}

This patch uses get_typenode_from_name (INT32_TYPE) instead, which
defines the types as appropriate for the target/C library.

2023-11-16  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	* config/arm/arm-builtins.cc (arm_init_simd_builtin_types): Fix
	initialization of arm_simd_types[].eltype.
	* config/arm/arm-mve-builtins.def (DEF_MVE_TYPE): Fix scalar
	types.
2023-11-20 11:23:56 +00:00
Juzhe-Zhong
a63cbcc52e RISC-V Regression: Remove scalable compile option
Since we already set scalable vectorization by default, this flag is redundant.

Also, we are starting full-coverage testing with different compile options,
e.g. --param=riscv-autovec-preference=fixed-vlmax.
Remove the flag to avoid confusion between compile options.

gcc/testsuite/ChangeLog:

	* lib/target-supports.exp: Remove scalable compile option.
2023-11-20 19:20:39 +08:00
Jakub Jelinek
509b470dce c, c++: Add new value for vector types for __builtin_classify_type
While filing a clang request to return 18 on _BitInts for
__builtin_classify_type instead of -1 they return currently, I've
noticed that we return -1 for vector types.  Initially I wanted to change
behavior just for __builtin_classify_type (type) form, as that is new in
GCC 14 and we've returned for 20+ years -1 for __builtin_classify_type
on vector expressions, but I was convinced otherwise, so this changes
the behavior even for that and now returns 19.
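A small sketch of the behavior described above, assuming a compiler that supports the GNU vector_size extension. The type name v4si and the two helper functions are illustrative only. On a compiler that includes this change the vector case should report 19; older compilers are expected to report -1.

```c
/* vector_size is a GNU extension; v4si is an illustrative name.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Returns the type class reported for a 4 x int vector expression.
   With this change that should be 19 (vector_type_class); before it,
   the result was -1 (no_type_class).  */
int
classify_vector (void)
{
  v4si v = { 1, 2, 3, 4 };
  return __builtin_classify_type (v);
}

/* For comparison: a plain int is integer_type_class, which is 1.  */
int
classify_int (void)
{
  return __builtin_classify_type (0);
}
```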

2023-11-20  Jakub Jelinek  <jakub@redhat.com>

gcc/
	* typeclass.h (enum type_class): Add vector_type_class.
	* builtins.cc (type_to_class): Return vector_type_class for
	VECTOR_TYPE.
	* doc/extend.texi (__builtin_classify_type): Mention bit-precise
	integer types and vector types.
gcc/testsuite/
	* c-c++-common/builtin-classify-type-1.c (main): Add tests for vector
	types.
2023-11-20 10:44:31 +01:00
Robin Dapp
f25a5b199a vect: Add bool pattern handling for COND_OPs.
In order to handle masks properly for conditional operations this patch
teaches vect_recog_mask_conversion_pattern to also handle conditional
operations.  Now we convert e.g.

 _mask = *_6;
 _ifc123 = COND_OP (_mask, ...);

into
 _mask = *_6;
 patt200 = (<signed-boolean:1>) _mask;
 patt201 = COND_OP (patt200, ...);

This way the mask will be properly recognized as boolean mask and the
correct vector mask will be generated.

gcc/ChangeLog:

	PR middle-end/112406

	* tree-vect-patterns.cc (vect_recog_mask_conversion_pattern):
	Convert masks for conditional operations as well.

gcc/testsuite/ChangeLog:

	* gfortran.dg/pr112406.f90: New test.
2023-11-20 10:21:58 +01:00
Jakub Jelinek
103a3966bc tree-ssa-math-opts: popcount (X) == 1 to (X ^ (X - 1)) > (X - 1) optimization for direct optab [PR90693]
On Fri, Nov 17, 2023 at 03:01:04PM +0100, Jakub Jelinek wrote:
> As a follow-up, I'm considering changing in this routine the popcount
> call to IFN_POPCOUNT with 2 arguments and during expansion test costs.

Here is the follow-up which does the rtx costs testing.

2023-11-20  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/90693
	* tree-ssa-math-opts.cc (match_single_bit_test): Mark POPCOUNT with
	result only used in equality comparison against 1 with direct optab
	support as .POPCOUNT call with 2 arguments.
	* internal-fn.h (expand_POPCOUNT): Declare.
	* internal-fn.def (DEF_INTERNAL_INT_EXT_FN): New macro, document it,
	undefine at the end.
	(POPCOUNT): Use it instead of DEF_INTERNAL_INT_FN.
	* internal-fn.cc (DEF_INTERNAL_INT_EXT_FN): Define to nothing before
	inclusion to define expanders.
	(expand_POPCOUNT): New function.
2023-11-20 10:03:20 +01:00
Jakub Jelinek
d0b6b7f8a6 tree-ssa-math-opts: popcount (X) == 1 to (X ^ (X - 1)) > (X - 1) optimization [PR90693]
Per the earlier discussions on this PR, the following patch folds
popcount (x) == 1 (and != 1) into (x ^ (x - 1)) > x - 1 (or <=)
if the corresponding popcount optab isn't implemented (I think any
double-word popcount or call will be necessarily slower than the
above cheap 3 op check and even for -Os larger or same size).
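The equivalence behind the fold can be checked directly in C. This is only a sketch of the identity on 32-bit unsigned values, not the GIMPLE transformation itself; the function names are hypothetical and __builtin_popcount is assumed to be available (GCC or a compatible compiler).

```c
#include <stdbool.h>
#include <stdint.h>

/* Reference form: exactly one bit set.  */
static bool
single_bit_popcount (uint32_t x)
{
  return __builtin_popcount (x) == 1;
}

/* Folded form from the commit message.  Unsigned wraparound makes
   x == 0 compare 0xffffffff > 0xffffffff, which is false as required.  */
static bool
single_bit_xor (uint32_t x)
{
  return (x ^ (x - 1)) > (x - 1);
}

/* Hypothetical checker: counts inputs below LIMIT where the two
   forms disagree.  */
unsigned int
count_mismatches (uint32_t limit)
{
  unsigned int bad = 0;
  for (uint32_t x = 0; x < limit; x++)
    if (single_bit_popcount (x) != single_bit_xor (x))
      bad++;
  return bad;
}
```

The != 1 case mentioned above is the same comparison with <= in place of >.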

I've noticed e.g. C++ aligned new starts with std::has_single_bit
which does popcount (x) == 1.

As a follow-up, I'm considering changing in this routine the popcount
call to IFN_POPCOUNT with 2 arguments and during expansion test costs.

2023-11-20  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/90693
	* tree-ssa-math-opts.cc (match_single_bit_test): New function.
	(math_opts_dom_walker::after_dom_children): Call it for EQ_EXPR
	and NE_EXPR assignments and GIMPLE_CONDs.

	* gcc.target/i386/pr90693.c: New test.
2023-11-20 10:00:09 +01:00
Jakub Jelinek
99fad213d5 internal-fn: Always undefine DEF_INTERNAL* macros at the end of internal-fn.def
I have noticed we are inconsistent, some DEF_INTERNAL*
macros (most of them) were undefined at the end of internal-fn.def (but in
some cases uselessly undefined again after inclusion), while others were not
(and sometimes undefined after the inclusion).  I've changed it to always
undefine at the end of internal-fn.def.

2023-11-20  Jakub Jelinek  <jakub@redhat.com>

	* internal-fn.def: Document missing DEF_INTERNAL* macros and make sure
	they are all undefined at the end.
	* internal-fn.cc (lookup_hilo_internal_fn, lookup_evenodd_internal_fn,
	widening_fn_p, get_len_internal_fn): Don't undef DEF_INTERNAL_*FN
	macros after inclusion of internal-fn.def.
2023-11-20 09:57:34 +01:00
Alexandre Oliva
4b51c7c913 testsuite: arm: fix arm_movt cut&pasto
I got spurious fails of tests that required arm_thumb1_movt_ok on a
target cpu that did not support movt.  Looking into it, I found the
arm_movt property to have been cut&pasted into other procs that
checked for different properties.  They shouldn't share the same test
results cache entry, so I'm changing their prop names.  Or rather its
prop name, because the other occurrence was already fixed recently.


for  gcc/testsuite/ChangeLog

	* lib/target-supports.exp
	(check_effective_target_arm_thumb1_cbz_ok): Fix prop name
	cut&pasto.
2023-11-20 05:16:36 -03:00
Alexandre Oliva
0e0e3420df testsuite: analyzer: expect alignment warning with -fshort-enums
On targets that have -fshort-enums enabled by default, the type casts
in the pr108251 analyzer tests warn that the byte-aligned enums may
not be sufficiently aligned to be a struct connection *.  The function
can't know better, the warning is reasonable, the code doesn't
expect enums to be shorter and less aligned than the struct.

Rather than use -fno-short-enums, I decided to embrace the warning on
targets that have short_enums enabled by default.

However, C++ doesn't issue the warning, because even with
-fshort-enums, enumeration types are not TYPE_PACKED, and the
expression is not sufficiently simplified by the C++ front-end for
check_and_warn_address_or_pointer_of_packed_member to identify the
insufficiently aligned pointer.  So don't expect the warning there.
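
A minimal sketch of the alignment mismatch described above, not taken from the pr108251 tests.  It assumes GCC or a compatible compiler; the struct, its fields and all demo_* names are hypothetical.  The packed attribute stands in for -fshort-enums, which GCC documents as equivalent to packing every enum.

```c
/* Stand-in for the struct the tests cast to; the fields are hypothetical.  */
struct demo_connection
{
  void *ctx;
  int flags;
};

/* Mimic a short_enums target: the enum gets the smallest integer type
   that can hold its values.  */
enum __attribute__ ((packed)) demo_small_enum
{
  DEMO_A,
  DEMO_B
};

/* Return 1 if a pointer to the enum may be insufficiently aligned to be
   reinterpreted as a pointer to the struct.  */
static int
demo_enum_less_aligned_p (void)
{
  return _Alignof (enum demo_small_enum) < _Alignof (struct demo_connection);
}
```

When the function returns 1, a cast from the enum's address to struct demo_connection * is the kind of conversion the warning is about.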


for  gcc/testsuite/ChangeLog

	* c-c++-common/analyzer/null-deref-pr108251-smp_fetch_ssl_fc_has_early-O2.c:
	Expect "unaligned pointer value" warning on short_enums
	targets, but not in c++.
	* c-c++-common/analyzer/null-deref-pr108251-smp_fetch_ssl_fc_has_early.c:
	Likewise.
2023-11-20 05:14:31 -03:00
Alexandre Oliva
69741355e6 testsuite: scev: expect fail on ilp32
I've recently patched scev-3.c and scev-5.c because they only passed by
accident on ia32.  They also fail on some (but not all) arm-eabi
variants.  It seems hard to characterize the conditions in which the
optimization is supposed to pass, but expecting them to fail on ilp32
targets, though probably a little excessive and possibly noisy, is not
quite as alarming as getting a fail in test reports, so I propose
changing the xfail marker from ia32 to ilp32.

I'm also proposing to add a similar marker to scev-4.c.  Though it
doesn't appear to be failing for me, I've got reports that suggest it
still does for others, and it certainly did for us as well.


for  gcc/testsuite/ChangeLog

	* gcc.dg/tree-ssa/scev-3.c: xfail on all ilp32 targets,
	though some of these do pass.
	* gcc.dg/tree-ssa/scev-4.c: Likewise.
	* gcc.dg/tree-ssa/scev-5.c: Likewise.
2023-11-20 05:14:25 -03:00
Haochen Jiang
2f8f7ee2db Initial support for AVX10.1
gcc/ChangeLog:

	* common/config/i386/cpuinfo.h (get_available_features):
	Add avx10_set and version and detect avx10.1.
	(cpu_indicator_init): Handle avx10.1-512.
	* common/config/i386/i386-common.cc
	(OPTION_MASK_ISA2_AVX10_1_256_SET): New.
	(OPTION_MASK_ISA2_AVX10_1_256_SET): Ditto.
	(OPTION_MASK_ISA2_AVX10_1_512_UNSET): Ditto.
	(OPTION_MASK_ISA2_AVX10_1_512_UNSET): Ditto.
	(OPTION_MASK_ISA2_AVX2_UNSET): Modify for AVX10.1.
	(ix86_handle_option): Handle -mavx10.1-256 and -mavx10.1-512.
	Add indicator for explicit no-avx512 and no-avx10.1 options.
	* common/config/i386/i386-cpuinfo.h (enum processor_features):
	Add FEATURE_AVX10_1_256 and FEATURE_AVX10_1_512.
	* common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for
	AVX10_1_256 and AVX10_1_512.
	* config/i386/cpuid.h (bit_AVX10): New.
	(bit_AVX10_256): Ditto.
	(bit_AVX10_512): Ditto.
	* config/i386/driver-i386.cc (check_avx10_avx512_features): New.
	(host_detect_local_cpu): Do not append "-mno-" options under
	specific scenarios to avoid emitting a warning.
	* config/i386/i386-isa.def
	(EVEX512): Add DEF_PTA(EVEX512).
	(AVX10_1_256): Add DEF_PTA(AVX10_1_256).
	(AVX10_1_512): Add DEF_PTA(AVX10_1_512).
	* config/i386/i386-options.cc (isa2_opts): Add -mavx10.1-256 and
	-mavx10.1-512.
	(ix86_function_specific_save): Save explicit no indicator.
	(ix86_function_specific_restore): Restore explicit no indicator.
	(ix86_valid_target_attribute_inner_p): Handle avx10.1, avx10.1-256 and
	avx10.1-512.
	(ix86_valid_target_attribute_tree): Handle avx512 function
	attributes with avx10.1 command line option.
	(ix86_option_override_internal): Handle AVX10.1 options.
	* config/i386/i386.h: Add PTA_EVEX512 for AVX512 target
	machines.
	* config/i386/i386.opt: Add variable ix86_no_avx512_explicit and
	ix86_no_avx10_1_explicit, option -mavx10.1, -mavx10.1-256 and
	-mavx10.1-512.
	* doc/extend.texi: Document avx10.1, avx10.1-256 and avx10.1-512.
	* doc/invoke.texi: Document -mavx10.1, -mavx10.1-256 and -mavx10.1-512.
	* doc/sourcebuild.texi: Document target avx10.1, avx10.1-256
	and avx10.1-512.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx10_1-1.c: New test.
	* gcc.target/i386/avx10_1-10.c: Ditto.
	* gcc.target/i386/avx10_1-11.c: Ditto.
	* gcc.target/i386/avx10_1-12.c: Ditto.
	* gcc.target/i386/avx10_1-13.c: Ditto.
	* gcc.target/i386/avx10_1-14.c: Ditto.
	* gcc.target/i386/avx10_1-15.c: Ditto.
	* gcc.target/i386/avx10_1-16.c: Ditto.
	* gcc.target/i386/avx10_1-17.c: Ditto.
	* gcc.target/i386/avx10_1-18.c: Ditto.
	* gcc.target/i386/avx10_1-19.c: Ditto.
	* gcc.target/i386/avx10_1-2.c: Ditto.
	* gcc.target/i386/avx10_1-20.c: Ditto.
	* gcc.target/i386/avx10_1-21.c: Ditto.
	* gcc.target/i386/avx10_1-22.c: Ditto.
	* gcc.target/i386/avx10_1-23.c: Ditto.
	* gcc.target/i386/avx10_1-3.c: Ditto.
	* gcc.target/i386/avx10_1-4.c: Ditto.
	* gcc.target/i386/avx10_1-5.c: Ditto.
	* gcc.target/i386/avx10_1-6.c: Ditto.
	* gcc.target/i386/avx10_1-7.c: Ditto.
	* gcc.target/i386/avx10_1-8.c: Ditto.
	* gcc.target/i386/avx10_1-9.c: Ditto.
2023-11-20 15:47:44 +08:00
Jason Merrill
e85c596ae2 c++: compare one level of template parms
There should never be a reason to compare more than one level of template
parameters; additional levels are for the enclosing context, which is either
irrelevant (for a template template parameter) or already compared (for a
member template).

Also, the comp_template_parms handling of type parameters was wrongly
checking for TEMPLATE_TYPE_PARM when a type parameter appears here as a
TYPE_DECL.

gcc/cp/ChangeLog:

	* pt.cc (comp_template_parms): Just one level.
	(template_parameter_lists_equivalent_p): Likewise.
2023-11-19 21:52:35 -05:00
Jason Merrill
c51eafc1a1 c++: add DECL_IMPLICIT_TEMPLATE_PARM_P macro
Let's use a more informative name instead of DECL_VIRTUAL_P directly.

gcc/cp/ChangeLog:

	* cp-tree.h (DECL_TEMPLATE_PARM_CHECK): New.
	(DECL_IMPLICIT_TEMPLATE_PARM_P): New.
	(decl_template_parm_check): New.
	* mangle.cc (write_closure_template_head): Use it.
	* parser.cc (synthesize_implicit_template_parm): Likewise.
	* pt.cc (template_parameters_equivalent_p): Likewise.
2023-11-19 21:52:35 -05:00
liuhongt
0d734c7938 Add i?86-*-* and x86_64-*-* to vect_logical_reduc
x86 backend supports reduc_{and,ior,xor}_scal_m for vector integer
modes.

gcc/testsuite/ChangeLog:

	* lib/target-supports.exp (vect_logical_reduc): Add i?86-*-*
	and x86_64-*-*.
2023-11-20 10:51:54 +08:00
liuhongt
2b59e2b4df Support reduc_{plus,xor,and,ior}_scal_m for vector integer mode.
BB vectorizer relies on the backend support of
.REDUC_{PLUS,IOR,XOR,AND} to vectorize reduction.
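
For illustration only, straight-line reductions of the following shape are the kind of basic block that can map onto .REDUC_XOR and .REDUC_AND.  The function names are hypothetical and this is not the PR112325 testcase; whether a given compiler version actually vectorizes them depends on options and cost model.

```c
/* XOR of four adjacent ints in one basic block: a candidate for a
   single vector load followed by a XOR reduction.  */
static int
demo_xor4 (const int *a)
{
  return a[0] ^ a[1] ^ a[2] ^ a[3];
}

/* Same shape with AND.  */
static int
demo_and4 (const int *a)
{
  return a[0] & a[1] & a[2] & a[3];
}
```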

gcc/ChangeLog:

	PR target/112325
	* config/i386/sse.md (reduc_<code>_scal_<mode>): New expander.
	(REDUC_ANY_LOGIC_MODE): New iterator.
	(REDUC_PLUS_MODE): Extend to VxHI/SI/DImode.
	(REDUC_SSE_PLUS_MODE): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr112325-1.c: New test.
	* gcc.target/i386/pr112325-2.c: New test.
2023-11-20 10:51:54 +08:00
xuli
e6269bb69c RISC-V: Implement -mmemcpy-strategy= options[PR112537]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112537

-mmemcpy-strategy=[auto|libcall|scalar|vector]

auto: Current status, use scalar or vector instructions.
libcall: Always use a library call.
scalar: Only use scalar instructions.
vector: Only use vector instructions.

	PR target/112537

gcc/ChangeLog:

	* config/riscv/riscv-opts.h (enum riscv_stringop_strategy_enum): Strategy enum.
	* config/riscv/riscv-string.cc (riscv_expand_block_move): Disabled based on options.
	(expand_block_move): Ditto.
	* config/riscv/riscv.opt: Add -mmemcpy-strategy=.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/cpymem-strategy-1.c: New test.
	* gcc.target/riscv/rvv/base/cpymem-strategy-2.c: New test.
	* gcc.target/riscv/rvv/base/cpymem-strategy-3.c: New test.
	* gcc.target/riscv/rvv/base/cpymem-strategy-4.c: New test.
	* gcc.target/riscv/rvv/base/cpymem-strategy-5.c: New test.
	* gcc.target/riscv/rvv/base/cpymem-strategy.h: New test.
2023-11-20 02:50:09 +00:00
Lulu Cheng
8bccee51f0 LoongArch: Modify MUSL_DYNAMIC_LINKER.
Use no suffix at all in the musl dynamic linker name for hard
float ABI. Use -sf and -sp suffixes in musl dynamic linker name
for soft float and single precision ABIs. The following table
outlines the musl interpreter names for the LoongArch64 ABI names.

musl interpreter            | LoongArch64 ABI
--------------------------- | -----------------
ld-musl-loongarch64.so.1    | loongarch64-lp64d
ld-musl-loongarch64-sp.so.1 | loongarch64-lp64f
ld-musl-loongarch64-sf.so.1 | loongarch64-lp64s
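
The table can be read as a simple lookup.  A sketch, with a hypothetical helper name that is not part of gnu-user.h:

```c
#include <stddef.h>
#include <string.h>

/* Return the musl interpreter file name for a LoongArch64 ABI name,
   or NULL for an ABI not listed in the table above.  */
static const char *
demo_musl_interp (const char *abi)
{
  if (strcmp (abi, "loongarch64-lp64d") == 0)
    return "ld-musl-loongarch64.so.1";      /* hard float: no suffix */
  if (strcmp (abi, "loongarch64-lp64f") == 0)
    return "ld-musl-loongarch64-sp.so.1";   /* single precision */
  if (strcmp (abi, "loongarch64-lp64s") == 0)
    return "ld-musl-loongarch64-sf.so.1";   /* soft float */
  return NULL;
}
```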

gcc/ChangeLog:

	* config/loongarch/gnu-user.h (MUSL_ABI_SPEC): Modify suffix.
2023-11-20 10:12:39 +08:00
GCC Administrator
b54b3800f7 Daily bump. 2023-11-20 00:17:10 +00:00
Juzhe-Zhong
bb6028b40b RISC-V: Optimize constant AVL for LRA pattern
This optimization was discovered in the tuple move splitter bug fix patch.

Before this patch:

vsetivli        zero,4,e16,mf2,ta,ma
        lhu     a3,96(a5)
        vlseg8e16.v     v1,(a5)
        lw      a4,%lo(e)(a2)
        vsetvli a6,zero,e64,m2,ta,ma
        addi    a0,a7,8
        vse16.v v1,0(a7)
        vse16.v v2,0(a0)
        addi    a0,a0,8
        vse16.v v3,0(a0)
        addi    a0,a0,8
        vse16.v v4,0(a0)
        addi    a0,a0,8
        vse16.v v5,0(a0)
        addi    a0,a0,8
        vse16.v v6,0(a0)
        addi    a0,a0,8
        vse16.v v7,0(a0)
        addi    a0,a0,8
        vse16.v v8,0(a0)

After this patch:

vsetivli	zero,4,e64,m2,ta,ma
	addi	a0,a7,8
	vlseg8e16.v	v1,(a5)
	vse16.v	v1,0(a7)
	vse16.v	v2,0(a0)
	addi	a0,a0,8
	vse16.v	v3,0(a0)
	addi	a0,a0,8
	vse16.v	v4,0(a0)
	addi	a0,a0,8
	vse16.v	v5,0(a0)
	addi	a0,a0,8
	vse16.v	v6,0(a0)
	addi	a0,a0,8
	vse16.v	v7,0(a0)
	addi	a0,a0,8
	vse16.v	v8,0(a0)

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (emit_vlmax_insn_lra): Optimize constant AVL.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/post-ra-avl.c: New test.
2023-11-20 07:28:19 +08:00
Nathaniel Shead
eaeaad3fca c++: Set DECL_CONTEXT for __cxa_thread_atexit [PR99187]
Modules streaming requires DECL_CONTEXT to be set on declarations that
are streamed. This ensures that __cxa_thread_atexit is given translation
unit context much like is already done with many other support
functions.

	PR c++/99187

gcc/cp/ChangeLog:

	* cp-tree.h (enum cp_tree_index): Add CPTI_THREAD_ATEXIT.
	(thread_atexit_node): New.
	* decl.cc (get_thread_atexit_node): Cache in thread_atexit_node.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/pr99187.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Signed-off-by: Nathan Sidwell <nathan@acm.org>
2023-11-19 16:43:01 -05:00
Philipp Tomsich
c177f28d60 [committed] RISC-V: Infrastructure for instruction fusion
I've been meaning to extract this and upstream it for a long time.  The work is
primarily Philipp from VRULL with one case added by Raphael and light bugfixing
on my part.

Essentially there are 10 distinct fusions supported and they can be selected
individually by building a suitable mask in the uarch tuning structure.
Additional cases can be added -- the bulk of the effort is in recognizing the
two fusible instructions.

The cases supported in this patch are all from the Veyron V1 processor, though
the hope is they will be useful elsewhere.  I would encourage those familiar
with other uarch implementations to enable fusion cases for those uarchs and
extend the set of supported cases if any are missing.
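
A sketch of the mask idea in plain C, for readers unfamiliar with it.  Only the field name fusible_ops comes from the ChangeLog below; the demo_* enum, struct and function are hypothetical and do not reproduce the riscv.cc definitions.

```c
#include <stdbool.h>

/* Hypothetical fusion kinds, one bit each so they can be ORed together.  */
enum demo_fusion_pairs
{
  DEMO_FUSE_NOTHING = 0,
  DEMO_FUSE_A = 1 << 0,
  DEMO_FUSE_B = 1 << 1,
  DEMO_FUSE_C = 1 << 2
};

/* Cut-down tuning structure holding the per-uarch selection.  */
struct demo_tune_param
{
  unsigned int fusible_ops;   /* OR of demo_fusion_pairs bits */
};

/* Return true if fusion kind OP is selected in TUNE.  */
static bool
demo_fusion_enabled_p (const struct demo_tune_param *tune,
                       enum demo_fusion_pairs op)
{
  return (tune->fusible_ops & op) != 0;
}
```

A uarch that wants only two of the cases would initialize the field to DEMO_FUSE_A | DEMO_FUSE_C; one that wants none uses DEMO_FUSE_NOTHING.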

gcc/
	* config/riscv/riscv-protos.h (extract_base_offset_in_addr): Prototype.
	* config/riscv/riscv.cc (riscv_fusion_pairs): New enum.
	(riscv_tune_param): Add fusible_ops field.
	(riscv_tune_param_rocket_tune_info): Initialize new field.
	(riscv_tune_param_sifive_7_tune_info): Likewise.
	(thead_c906_tune_info): Likewise.
	(generic_oo_tune_info): Likewise.
	(optimize_size_tune_info): Likewise.
	(riscv_macro_fusion_p): New function.
	(riscv_fusion_enabled_p): Likewise.
	(riscv_macro_fusion_pair_p): Likewise.
	(TARGET_SCHED_MACRO_FUSION_P): Define.
	(TARGET_SCHED_MACRO_FUSION_PAIR_P): Likewise.
	(extract_base_offset_in_addr): Moved into riscv.cc from...
	* config/riscv/thead.cc: Here.

	Co-authored-by: Raphael Zinsly <rzinsly@ventanamicro.com>
	Co-authored-by: Jeff Law <jlaw@ventanamicro.com>
2023-11-19 14:17:21 -07:00
Jeff Law
07da9b7f13 [committed] Fix missing mode on a few unspec/unspec_volatile operands
This is a fix for a minor problem Jivan and I found while testing the ext-dce work originally from Joern.

The ext-dce pass will transform zero/sign extensions into subreg accesses when
the upper bits are actually unused.  So it's more likely with the ext-dce work
to get a sequence like this prior to combine:

(insn 10 9 11 2 (set (reg:SI 144)
        (unspec_volatile [
                (const_int 0 [0])
            ] UNSPECV_FRFLAGS)) "j.c":11:3 discrim 1 362 {riscv_frflags}
     (nil))
(insn 11 10 55 2 (set (reg:DI 140 [ _12 ])
        (subreg:DI (reg:SI 144) 0)) "j.c":11:3 discrim 1 206 {*movdi_64bit}
     (expr_list:REG_DEAD (reg:SI 144)
        (nil)))

When we try to combine insn 10->11 we'll ultimately call simplify_subreg with
something like

(subreg:DI (unspec_volatile [...]) 0)

Note the lack of a mode on the unspec_volatile.  That in turn will cause
simplify_subreg to trigger an assertion.

The modeless unspec is generated by the RISC-V backend and the more I've
pondered this issue over the last few days the more I'm convinced it's a
backend bug.  Basically if the LHS of the set has a mode, then the RHS of the
set should have a mode as well.

I've audited the various backends and only found a few problems which are fixed
by this patch.  I've tested the relevant ports in my tester.  c6x, sh, mips and
s390[x].

There are other patterns that are potentially problematical in various ports.
They have a REG destination and an UNSPEC source, but the REG has no mode in
the pattern.  Since it wasn't clear what mode to give the UNSPEC, I left those
alone.

gcc/

	* config/c6x/c6x.md (mvilc): Add mode to UNSPEC source.
	* config/mips/mips.md (rdhwr_synci_step_<mode>): Likewise.
	* config/riscv/riscv.md (riscv_frcsr, riscv_frflags): Likewise.
	* config/s390/s390.md (@split_stack_call<mode>): Likewise.
	(@split_stack_cond_call<mode>): Likewise.
	* config/sh/sh.md (sp_switch_1): Likewise.
2023-11-19 11:58:54 -07:00
David Edelsohn
06e7cc79fd testsuite: Don't use -mfloat128 with AIX.
AIX doesn't support IEEE 128 floating point.  Don't add the -mfloat128
on AIX.

gcc/testsuite/ChangeLog:

	* lib/target-supports.exp (add_options_for___float128): Only add
	-mfloat128 to powerpc*-*-linux*.

Signed-off-by: David Edelsohn <dje.gcc@gmail.com>
2023-11-19 13:10:33 -05:00
Lewis Hyatt
56ca59a031 Makefile.tpl: Avoid race condition in generating site.exp from the top level
A command like "make -j 2 check-gcc-c check-gcc-c++" run in the top level of
a fresh build directory does not work reliably. That will spawn two
independent make processes inside the "gcc" directory, and each of those
will attempt to create site.exp if it doesn't exist and will interfere with
each other, often producing a corrupted or empty site.exp. Resolve that by
making these targets depend on a new phony target which makes sure site.exp
is created first before starting the recursive makes.

ChangeLog:

	* Makefile.in: Regenerate.
	* Makefile.tpl: Add dependency on site.exp to check-gcc-* targets
2023-11-19 11:07:09 -05:00
David Malcolm
78d132d73e libcpp: split decls out to rich-location.h
The various decls relating to rich_location are in
libcpp/include/line-map.h, but they don't relate to line maps.

Split them out to their own header: libcpp/include/rich-location.h

No functional change intended.

gcc/ChangeLog:
	* Makefile.in (CPPLIB_H): Add libcpp/include/rich-location.h.
	* coretypes.h (class rich_location): New forward decl.

gcc/analyzer/ChangeLog:
	* analyzer.h: Include "rich-location.h".

gcc/c-family/ChangeLog:
	* c-lex.cc: Include "rich-location.h".

gcc/cp/ChangeLog:
	* mapper-client.cc: Include "rich-location.h".

gcc/ChangeLog:
	* diagnostic.h: Include "rich-location.h".
	* edit-context.h (class fixit_hint): New forward decl.
	* gcc-rich-location.h: Include "rich-location.h".
	* genmatch.cc: Likewise.
	* pretty-print.h: Likewise.

gcc/rust/ChangeLog:
	* rust-location.h: Include "rich-location.h".

libcpp/ChangeLog:
	* Makefile.in (TAGS_SOURCES): Add "include/rich-location.h".
	* include/cpplib.h (class rich_location): New forward decl.
	* include/line-map.h (class range_label)
	(enum range_display_kind, struct location_range)
	(class semi_embedded_vec, class rich_location, class label_text)
	(class range_label, class fixit_hint): Move to...
	* include/rich-location.h: ...this new file.
	* internal.h: Include "rich-location.h".

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2023-11-19 06:26:40 -05:00
Juzhe-Zhong
af7fa3135b RISC-V: Fix bug of tuple move splitter
PR target/112561

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (expand_tuple_move): Fix bug.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/pr112561.c: New test.
2023-11-19 10:44:20 +08:00
David Malcolm
f65f63c4d8 analyzer: new warning: -Wanalyzer-undefined-behavior-strtok [PR107573]
This patch:
- adds support to the analyzer for tracking API-private state
  for which we don't have a decl (such as strtok's internal state),
- uses it to implement a new -Wanalyzer-undefined-behavior-strtok which
  warns when strtok (NULL, delim) is called as the first call to
  strtok after main.
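
To make the diagnosed pattern concrete, here is a small sketch that is not one of the new tests.  The helper name is hypothetical; the function shows the well-defined call order, and the problematic order appears only in a comment because executing it is undefined behavior.

```c
#include <stddef.h>
#include <string.h>

/* Count the tokens in BUF separated by any character in DELIM.
   BUF is modified in place, as strtok requires.  */
static int
demo_count_tokens (char *buf, const char *delim)
{
  int n = 0;
  /* The first call passes the string itself and seeds strtok's internal
     state; every later call passes NULL to continue from that state.  */
  for (char *tok = strtok (buf, delim); tok != NULL;
       tok = strtok (NULL, delim))
    n++;
  return n;
}

/* The pattern described above, with no earlier strtok call to set up
   the internal state:

     int main (void)
     {
       char *tok = strtok (NULL, " ");
       return tok != NULL;
     }
*/
```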

gcc/analyzer/ChangeLog:
	PR analyzer/107573
	* analyzer.h (register_known_functions): Add region_model_manager
	param.
	* analyzer.opt (Wanalyzer-undefined-behavior-strtok): New.
	* call-summary.cc
	(call_summary_replay::convert_region_from_summary_1): Handle
	RK_PRIVATE.
	* engine.cc (impl_run_checkers): Pass model manager to
	register_known_functions.
	* kf.cc (class undefined_function_behavior): New.
	(class kf_strtok): New.
	(register_known_functions): Add region_model_manager param.
	Use it to register "strtok".
	* region-model-manager.cc
	(region_model_manager::get_or_create_conjured_svalue): Add "idx"
	param.
	* region-model-manager.h
	(region_model_manager::get_or_create_conjured_svalue): Add "idx"
	param.
	(region_model_manager::get_root_region): New accessor.
	* region-model.cc (region_model::scan_for_null_terminator): Handle
	"expr" being null.
	(region_model::get_representative_path_var_1): Handle RK_PRIVATE.
	* region-model.h (region_model::called_from_main_p): Make public.
	* region.cc (region::get_memory_space): Handle RK_PRIVATE.
	(region::can_have_initial_svalue_p): Handle MEMSPACE_PRIVATE.
	(private_region::dump_to_pp): New.
	* region.h (MEMSPACE_PRIVATE): New.
	(RK_PRIVATE): New.
	(class private_region): New.
	(is_a_helper <const private_region *>::test): New.
	* store.cc (store::replay_call_summary_cluster): Handle
	RK_PRIVATE.
	* svalue.h (struct conjured_svalue::key_t): Add "idx" param to
	ctor and "m_idx" field.
	(class conjured_svalue::conjured_svalue): Likewise.

gcc/ChangeLog:
	PR analyzer/107573
	* doc/invoke.texi: Add -Wanalyzer-undefined-behavior-strtok.

gcc/testsuite/ChangeLog:
	PR analyzer/107573
	* c-c++-common/analyzer/strtok-1.c: New test.
	* c-c++-common/analyzer/strtok-2.c: New test.
	* c-c++-common/analyzer/strtok-3.c: New test.
	* c-c++-common/analyzer/strtok-4.c: New test.
	* c-c++-common/analyzer/strtok-cppreference.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2023-11-18 20:35:59 -05:00
GCC Administrator
9d58d2d8ba Daily bump. 2023-11-19 00:17:38 +00:00
Petter Tomner
f73808b3b4 MAINTAINERS: Update my email address.
Update my email address in the MAINTAINERS file.

2023-11-18	Petter Tomner <tomner@bahnhof.se>

ChangeLog:
	* MAINTAINERS: Update my email address.
2023-11-18 23:30:35 +01:00
Jonathan Wakely
279e407a06 libstdc++: Check string value_type in std::make_format_args [PR112607]
libstdc++-v3/ChangeLog:

	PR libstdc++/112607
	* include/std/format (basic_format_arg::_S_to_arg_type): Check
	value_type for basic_string_view and basic_string
	specializations.
	* testsuite/std/format/arguments/112607.cc: New test.
2023-11-18 21:42:33 +00:00
Jonathan Wakely
41a5ea4cab libstdc++: Add fast path for std::format("{}", x) [PR110801]
This optimizes the simple case of formatting a single string, integer
or bool, with no format-specifier (so no padding, alignment, alternate
form etc.)

libstdc++-v3/ChangeLog:

	PR libstdc++/110801
	* include/std/format (_Sink_iter::_M_reserve): New member
	function.
	(_Sink::_Reservation): New nested class.
	(_Sink::_M_reserve, _Sink::_M_bump): New virtual functions.
	(_Seq_sink::_M_reserve, _Seq_sink::_M_bump): New virtual
	overrides.
	(_Iter_sink<O, ContigIter>::_M_reserve): Likewise.
	(__do_vformat_to): Use new functions to optimize "{}" case.
2023-11-18 21:22:32 +00:00
Xi Ruoyao
84c5dede83 LoongArch: Fix "-mexplicit-relocs=none -mcmodel=medium" producing %call36 when the assembler does not support it
Even if !HAVE_AS_SUPPORT_CALL36, const_call_insn_operand should still
return false when -mexplicit-relocs=none -mcmodel=medium to make
loongarch_legitimize_call_address emit la.local or la.global.

gcc/ChangeLog:

	* config/loongarch/predicates.md (const_call_insn_operand):
	Remove buggy "HAVE_AS_SUPPORT_CALL36" conditions.  Change "1" to
	"true" to make the coding style consistent.
2023-11-19 01:45:29 +08:00
Xi Ruoyao
51bda9f136 LoongArch: Add fine-grained control for LAM_BH and LAMCAS
gcc/ChangeLog:

	* config/loongarch/genopts/isa-evolution.in: (lam-bh, lamcas):
	Add.
	* config/loongarch/loongarch-str.h: Regenerate.
	* config/loongarch/loongarch.opt: Regenerate.
	* config/loongarch/loongarch-cpucfg-map.h: Regenerate.
	* config/loongarch/loongarch-cpu.cc
	(ISA_BASE_LA64V110_FEATURES): Include OPTION_MASK_ISA_LAM_BH
	and OPTION_MASK_ISA_LAMCAS.
	* config/loongarch/sync.md (atomic_add<mode:SHORT>): Use
	TARGET_LAM_BH instead of ISA_BASE_IS_LA64V110.  Remove empty
	lines from assembly output.
	(atomic_exchange<mode>_short): Likewise.
	(atomic_exchange<mode:SHORT>): Likewise.
	(atomic_fetch_add<mode>_short): Likewise.
	(atomic_fetch_add<mode:SHORT>): Likewise.
	(atomic_cas_value_strong<mode>_amcas): Use TARGET_LAMCAS instead
	of ISA_BASE_IS_LA64V110.
	(atomic_compare_and_swap<mode>): Likewise.
	(atomic_compare_and_swap<mode:GPR>): Likewise.
	(atomic_compare_and_swap<mode:SHORT>): Likewise.
	* config/loongarch/loongarch.cc (loongarch_asm_code_end): Dump
	status of -mlam-bh and -mlamcas if -fverbose-asm.
2023-11-19 01:11:13 +08:00
Xi Ruoyao
181ed726b2 LoongArch: Don't emit dbar 0x700 if -mld-seq-sa
This option (CPUCFG word 0x3 bit 23) means "the hardware guarantees that
two loads on the same address won't be reordered with each other".  Thus
we can omit the "load-load" barrier dbar 0x700.

This is only a micro-optimization because dbar 0x700 is already treated
as nop if the hardware supports LD_SEQ_SA.

gcc/ChangeLog:

	* config/loongarch/loongarch.cc (loongarch_print_operand): Don't
	print dbar 0x700 if TARGET_LD_SEQ_SA.
	* config/loongarch/sync.md (atomic_load<mode>): Likewise.
2023-11-19 01:11:13 +08:00
Xi Ruoyao
5d3d605553 LoongArch: Take advantage of -mdiv32 if it's enabled
With -mdiv32, we can assume div.w[u] and mod.w[u] work on low 32 bits
of a 64-bit GPR even if it's not sign-extended.

gcc/ChangeLog:

	* config/loongarch/loongarch.md (DIV): New mode iterator.
	(<optab:ANY_DIV><mode:GPR>3): Don't expand if TARGET_DIV32.
	(<optab:ANY_DIV>di3_fake): Disable if TARGET_DIV32.
	(*<optab:ANY_DIV><mode:GPR>3): Allow SImode if TARGET_DIV32.
	(<optab:ANY_DIV>si3_extended): New insn if TARGET_DIV32.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/div-div32.c: New test.
	* gcc.target/loongarch/div-no-div32.c: New test.
2023-11-19 01:11:12 +08:00
Xi Ruoyao
ccead01d9b LoongArch: Add evolution features of base ISA revisions
* config/loongarch/loongarch-def.h:
	(loongarch_isa_base_features): Declare.  Define it in ...
	* config/loongarch/loongarch-cpu.cc
	(loongarch_isa_base_features): ... here.
	(fill_native_cpu_config): If we know the base ISA of the CPU
	model from PRID, use it instead of la64 (v1.0).  Check if all
	expected features of this base ISA are available, emit a warning
	if not.
	* config/loongarch/loongarch-opts.cc (config_target_isa): Enable
	the features implied by the base ISA if not -march=native.
2023-11-19 01:11:12 +08:00
Xi Ruoyao
8835242025 LoongArch: genopts: Add infrastructure to generate code for new features in ISA evolution
LoongArch v1.10 introduced the concept of ISA evolution.  During ISA
evolution, many independent features can be added and enumerated via
CPUCFG.

Add a data file into genopts storing the CPUCFG word, bit, the name
of the command line option controlling if this feature should be used
for compilation, and the text description.  Make genstr.sh process this
info and add the command line options into loongarch.opt and
loongarch-str.h, and generate a new file loongarch-cpucfg-map.h for
mapping CPUCFG output to the corresponding option.  When handling
-march=native, use the information in loongarch-cpucfg-map.h to generate
the corresponding option mask.  Enable the features implied by -march
setting unless the user has explicitly disabled the feature.
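
As a rough sketch of that flow, and not the generated loongarch-cpucfg-map.h: each row maps a CPUCFG word and bit to an option-mask bit.  All demo_* names and mask values are hypothetical.  The ld-seq-sa position (word 0x3, bit 23) is the one quoted elsewhere in this log; the div32 position is an assumption made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* One row per evolution feature.  */
struct demo_cpucfg_map
{
  unsigned int word;     /* CPUCFG word index */
  unsigned int bit;      /* bit within that word */
  uint64_t option_mask;  /* hypothetical option-mask bit */
  const char *name;      /* option name without the -m prefix */
};

#define DEMO_MASK_DIV32     (UINT64_C (1) << 0)
#define DEMO_MASK_LD_SEQ_SA (UINT64_C (1) << 1)

static const struct demo_cpucfg_map demo_map[] = {
  { 2, 26, DEMO_MASK_DIV32, "div32" },          /* position assumed */
  { 3, 23, DEMO_MASK_LD_SEQ_SA, "ld-seq-sa" },  /* word 0x3 bit 23 */
};

/* Build the option mask for -march=native from cached CPUCFG words.  */
static uint64_t
demo_native_features (const uint32_t *cpucfg, size_t n_words)
{
  uint64_t mask = 0;
  for (size_t i = 0; i < sizeof demo_map / sizeof demo_map[0]; i++)
    if (demo_map[i].word < n_words
        && ((cpucfg[demo_map[i].word] >> demo_map[i].bit) & 1) != 0)
      mask |= demo_map[i].option_mask;
  return mask;
}

/* Features implied by -march apply unless explicitly disabled.  */
static uint64_t
demo_effective_features (uint64_t implied, uint64_t explicitly_disabled)
{
  return implied & ~explicitly_disabled;
}
```

Keeping the mapping in one table means a new feature needs only a new row, which matches the data-file approach described above.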

The added options (-mdiv32 and -mld-seq-sa) are not really handled yet.
They'll be used in the following patches.

gcc/ChangeLog:

	* config/loongarch/genopts/isa-evolution.in: New data file.
	* config/loongarch/genopts/genstr.sh: Translate info in
	isa-evolution.in when generating loongarch-str.h, loongarch.opt,
	and loongarch-cpucfg-map.h.
	* config/loongarch/genopts/loongarch.opt.in (isa_evolution):
	New variable.
	* config/loongarch/t-loongarch: (loongarch-cpucfg-map.h): New
	rule.
	(loongarch-str.h): Depend on isa-evolution.in.
	(loongarch.opt): Depend on isa-evolution.in.
	(loongarch-cpu.o): Depend on loongarch-cpucfg-map.h.
	* config/loongarch/loongarch-str.h: Regenerate.
	* config/loongarch/loongarch-def.h (loongarch_isa):  Add field
	for evolution features.  Add helper function to enable features
	in this field.
	Probe native CPU capability and save the corresponding options
	into preset.
	* config/loongarch/loongarch-cpu.cc (fill_native_cpu_config):
	Probe native CPU capability and save the corresponding options
	into preset.
	(cache_cpucfg): Simplify with C++11-style for loop.
	(cpucfg_useful_idx, N_CPUCFG_WORDS): Move to ...
	* config/loongarch/loongarch.cc
	(loongarch_option_override_internal): Enable the ISA evolution
	feature options implied by -march and not explicitly disabled.
	(loongarch_asm_code_end): New function, print ISA information as
	comments in the assembly if -fverbose-asm.  It makes it easier to
	debug things like -march=native.
	(TARGET_ASM_CODE_END): Define.
	* config/loongarch/loongarch.opt: Regenerate.
	* config/loongarch/loongarch-cpucfg-map.h: Generate.
	(cpucfg_useful_idx, N_CPUCFG_WORDS) ... here.
2023-11-19 01:11:12 +08:00
Xi Ruoyao
56752a6bbf LoongArch: Fix internal error running "gcc -march=native" on LA664
On LA664, the PRID preset is ISA_BASE_LA64V110 but the base architecture
is guessed as ISA_BASE_LA64V100.  This causes a warning to be output:

    cc1: warning: base architecture 'la64' differs from PRID preset '?'

But we've not set the "?" above in loongarch_isa_base_strings, thus it's
a nullptr and then an ICE is triggered.

Add ISA_BASE_LA64V110 to genopts and initialize
loongarch_isa_base_strings[ISA_BASE_LA64V110] correctly to fix the ICE.
The warning itself will be fixed later.

gcc/ChangeLog:

	* config/loongarch/genopts/loongarch-strings:
	(STR_ISA_BASE_LA64V110): Add.
	* config/loongarch/genopts/loongarch.opt.in:
	(ISA_BASE_LA64V110): Add.
	* config/loongarch/loongarch-def.c
	(loongarch_isa_base_strings): Initialize [ISA_BASE_LA64V110]
	to STR_ISA_BASE_LA64V110.
	* config/loongarch/loongarch.opt: Regenerate.
	* config/loongarch/loongarch-str.h: Regenerate.
2023-11-19 01:11:09 +08:00
Sebastian Huber
20a3c74c34 gcov: Improve -fprofile-update=atomic
The code coverage support uses counters to determine which edges in the control
flow graph were executed.  If a counter overflows, then the code coverage
information is invalid.  Therefore the counter type should be a 64-bit integer.
In multi-threaded applications, it is important that the counter increments are
atomic.  This is not the case by default.  The user can enable atomic counter
increments through the -fprofile-update=atomic and
-fprofile-update=prefer-atomic options.

If the target supports 64-bit atomic operations, then everything is fine.  If
not and -fprofile-update=prefer-atomic was chosen by the user, then non-atomic
counter increments will be used.  However, if the target does not support the
required atomic operations and -fprofile-update=atomic was chosen by the user,
then a warning was issued and a forced fallback to non-atomic operations was
done.  This is probably not what a user wants.  There is still hardware on the
market which does not have atomic operations and is used for multi-threaded
applications.  A user who selects -fprofile-update=atomic wants consistent
code coverage data and not random data.

This patch removes the fallback to non-atomic operations for
-fprofile-update=atomic if the target platform supports libatomic.  To
mitigate potential performance issues an optimization for systems which
only support 32-bit atomic operations is provided.  Here, the edge
counter increments are done like this:

  low = __atomic_add_fetch_4 (&counter.low, 1, MEMMODEL_RELAXED);
  high_inc = low == 0 ? 1 : 0;
  __atomic_add_fetch_4 (&counter.high, high_inc, MEMMODEL_RELAXED);

In gimple_gen_time_profiler() this split operation cannot be used, since the
updated counter value is also required.  Here, a library call is emitted.  This
is not a performance issue since the update is only done if counters[0] == 0.
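
The split increment can be tried in user code with the GCC __atomic builtins.  This is a sketch under assumptions, not the code emitted by tree-profile.cc: the struct layout and function names are hypothetical, and __ATOMIC_RELAXED is the user-level counterpart of the MEMMODEL_RELAXED shown above.

```c
#include <stdint.h>

/* A 64-bit counter kept as two 32-bit halves, as on a target that only
   has 32-bit atomic operations.  */
struct demo_split_counter
{
  uint32_t low;
  uint32_t high;
};

/* Increment the counter by one using two relaxed 32-bit atomic adds.
   The high half is bumped only when the low half wraps around to 0.  */
static void
demo_split_counter_inc (struct demo_split_counter *c)
{
  uint32_t low = __atomic_add_fetch (&c->low, 1, __ATOMIC_RELAXED);
  uint32_t high_inc = low == 0 ? 1 : 0;
  __atomic_add_fetch (&c->high, high_inc, __ATOMIC_RELAXED);
}

/* Combine both halves.  The halves are not read as one atomic unit, so
   the result is only meaningful once all updaters have finished.  */
static uint64_t
demo_split_counter_read (struct demo_split_counter *c)
{
  uint32_t low = __atomic_load_n (&c->low, __ATOMIC_RELAXED);
  uint32_t high = __atomic_load_n (&c->high, __ATOMIC_RELAXED);
  return ((uint64_t) high << 32) | low;
}
```

The increment never needs the combined 64-bit value, which is why two independent 32-bit operations suffice for edge counters but not for the time profiler case.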

gcc/c-family/ChangeLog:

	* c-cppbuiltin.cc (c_cpp_builtins):  Define
	__LIBGCC_HAVE_LIBATOMIC for libgcov.

gcc/ChangeLog:

	* doc/invoke.texi (-fprofile-update): Clarify default method.  Document
	the atomic method behaviour.
	* tree-profile.cc (enum counter_update_method): New.
	(counter_update): Likewise.
	(gen_counter_update): Use counter_update_method.  Split the
	atomic counter update in two 32-bit atomic operations if
	necessary.
	(tree_profiling): Select counter_update_method.

libgcc/ChangeLog:

	* libgcov.h (GCOV_SUPPORTS_ATOMIC): Always define it.
	Set it also to 1, if __LIBGCC_HAVE_LIBATOMIC is defined.
2023-11-18 12:45:15 +01:00