Commit graph

203219 commits

Author SHA1 Message Date
Andrew Pinski
f956c23264 Fix PR 110954: wrong code with cmp | !cmp
This was an oversight on my part: I forgot that
cmp might have a true value other than all ones
and will have a value of 1 in most cases.
This means if we have `(f < 0) | !(f < 0)` we would
optimize this to -1 rather than just 1.

This is version 2 of the patch.
I decided to go down a different route than just checking if
the precision was 1 inside bitwise_inverted_equal_p.
So instead bitwise_inverted_equal_p gets passed an argument
that will be set if there was a comparison that was being compared
and the user of bitwise_inverted_equal_p decides what needs to be done.
In most uses of bitwise_inverted_equal_p, the check will be
`!wascmp || element_precision (type) == 1` .
But in the case of `a & ~a` and `a ^| ~a` we can handle the case
of wascmp by using constant_boolean_node instead.
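
The affected shape can be sketched in C as follows. This is a hypothetical
reproducer, not the pr110954-1.c test added by the patch; the function names
are made up for illustration. In C a comparison yields an int that is 0 or 1,
so OR-ing a comparison with its logical negation must give 1 (never -1), and
AND-ing them must give 0.

```c
#include <assert.h>

/* Hypothetical reproducer shape, not the testcase from the patch.
   noinline (a GCC/Clang attribute) keeps f unknown when the body is folded.  */
__attribute__ ((noinline)) int
cmp_or_not_cmp (int f)
{
  /* Each operand is 0 or 1, and exactly one of them is 1.  */
  return (f < 0) | !(f < 0);
}

__attribute__ ((noinline)) int
cmp_and_not_cmp (int f)
{
  /* The operands are never both 1, so the result is always 0.  */
  return (f < 0) & !(f < 0);
}

/* Aborts through assert if either expression is folded to a wrong value.  */
void
check_cmp_folds (void)
{
  int samples[] = { -7, -1, 0, 1, 7 };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    {
      assert (cmp_or_not_cmp (samples[i]) == 1);
      assert (cmp_and_not_cmp (samples[i]) == 0);
    }
}
```

Calling check_cmp_folds () from a small driver built at -O2 should run to
completion on a compiler that folds these expressions correctly.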

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

	PR tree-optimization/110954

gcc/ChangeLog:

	* generic-match-head.cc (bitwise_inverted_equal_p): Add
	wascmp argument and set it accordingly.
	* gimple-match-head.cc (bitwise_inverted_equal_p): Add
	wascmp argument to the macro.
	(gimple_bitwise_inverted_equal_p): Add
	wascmp argument and set it accordingly.
	* match.pd (`a & ~a`, `a ^| ~a`): Update call
	to bitwise_inverted_equal_p and handle wascmp case.
	(`(~x | y) & x`, `(~x | y) & x`, `a?~t:t`): Update
	call to bitwise_inverted_equal_p and check
	for !wascmp or a precision of 1.

gcc/testsuite/ChangeLog:

	* gcc.c-torture/execute/pr110954-1.c: New test.
2023-08-10 23:51:07 -07:00
Martin Uecker
68783211f6 c: Support for -Wuseless-cast [PR84510]
Add support for -Wuseless-cast to C (and ObjC).
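
As a rough illustration, the kind of code the option is aimed at looks like
the sketch below. It assumes the C warning follows the behaviour already
documented for C++, namely warning when an expression is cast to its own
type; the function names are hypothetical and this is not the
gcc.dg/Wuseless-cast.c test.

```c
#include <assert.h>

/* Hypothetical examples: each cast converts an expression to the type it
   already has, which is the situation -Wuseless-cast is meant to flag.  */
int
same_type_cast (int x)
{
  return (int) x;            /* cast from int to int */
}

double
scaled (double d)
{
  return (double) d * 2.0;   /* cast from double to double */
}

/* The casts change nothing at run time, so the results are unaffected.  */
void
check_useless_cast (void)
{
  assert (same_type_cast (7) == 7);
  assert (scaled (1.5) == 3.0);
}
```

Compiling the file with `gcc -Wuseless-cast -c` on a compiler that includes
this change is the way to see whether the two casts are diagnosed.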

	PR c/84510

gcc/c/:
	* c-typeck.cc (build_c_cast): Add warning.

gcc/c-family/:
	* c.opt: Enable warning for C and ObjC.

gcc/:
	* doc/invoke.texi: Update.

gcc/testsuite/:
	* gcc.dg/Wuseless-cast.c: New test.
2023-08-11 07:00:25 +02:00
Pan Li
ee8a844d02 RISC-V: Support RVV VFMSAC rounding mode intrinsic API
This patch adds support for the rounding mode API of
VFMSAC, covering the samples below.

* __riscv_vfmsac_vv_f32m1_rm
* __riscv_vfmsac_vv_f32m1_rm_m
* __riscv_vfmsac_vf_f32m1_rm
* __riscv_vfmsac_vf_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>

gcc/ChangeLog:

	* config/riscv/riscv-vector-builtins-bases.cc
	(class vfmsac_frm): New class for vfmsac frm.
	(vfmsac_frm_obj): New declaration.
	(BASE): Ditto.
	* config/riscv/riscv-vector-builtins-bases.h: Ditto.
	* config/riscv/riscv-vector-builtins-functions.def
	(vfmsac_frm): New function definition.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/float-point-msac.c: New test.
2023-08-11 11:19:28 +08:00
GCC Administrator
4271b7422f Daily bump. 2023-08-11 00:16:45 +00:00
Jonathan Wakely
ecfd8c7ffe libstdc++: Fix out-of-bounds read in format string "{:{}." [PR110974]
libstdc++-v3/ChangeLog:

	PR libstdc++/110974
	* include/std/format (_Spec::_S_parse_width_or_precision): Check
	for empty range before dereferencing iterator.
	* testsuite/std/format/string.cc: Check for expected exception.
	Fix expected exception message in test_pr110862() and actually
	call it.
2023-08-10 23:31:37 +01:00
Jonathan Wakely
f48a542396 libstdc++: Fix std::format for localized floats [PR110968]
The __formatter_fp::_M_localize function just returns an empty string if
the formatting locale is the C locale, as there is nothing to do. But
the caller was assuming that the returned string contains the localized
string. The caller should use the original string if _M_localize returns
an empty string.

libstdc++-v3/ChangeLog:

	PR libstdc++/110968
	* include/std/format (__formatter_fp::format): Check return
	value of _M_localize.
	* testsuite/std/format/functions/format.cc: Check classic
	locale.
2023-08-10 23:31:37 +01:00
Jonathan Wakely
9cb2a7c8d5 libstdc++: Use alias template for iterator_category [PR110970]
This renames __iterator_category_t to __iter_category_t, for consistency
with std::iter_value_t, std::iter_difference_t and std::iter_reference_t
in C++20. Then use __iter_category_t in <bits/stl_iterator.h>, which
fixes the problem of the missing 'typename' that Clang 15 incorrectly
still requires.

libstdc++-v3/ChangeLog:

	PR libstdc++/110970
	* include/bits/stl_iterator.h (__detail::__move_iter_cat): Use
	__iter_category_t.
	(iterator_traits<common_iterator<I, S>>::_S_iter_cat): Likewise.
	(__detail::__basic_const_iterator_iter_cat): Likewise.
	* include/bits/stl_iterator_base_types.h (__iterator_category_t):
	Rename to __iter_category_t.
2023-08-10 23:31:37 +01:00
Jan Hubicka
39204ae9dd Fix division by zero in loop splitting
The profile update I added to tree-ssa-loop-split can divide by zero
when the conditional is predicted with 0 probability, which
is triggered by the jump threading update in the testcase.

gcc/ChangeLog:

	PR middle-end/110923
	* tree-ssa-loop-split.cc (split_loop): Watch for division by zero.

gcc/testsuite/ChangeLog:

	PR middle-end/110923
	* gcc.dg/tree-ssa/pr110923.c: New test.
2023-08-11 00:23:14 +02:00
Patrick O'Neill
0ac323238e RISC-V: Add Ztso atomic mappings
The RISC-V Ztso extension currently has no effect on generated code.
With the additional ordering constraints guaranteed by Ztso, we can emit
more optimized atomic mappings than the RVWMO mappings.

This PR implements the Ztso psABI mappings[1].

[1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/391

2023-08-08 Patrick O'Neill <patrick@rivosinc.com>

gcc/ChangeLog:

	* common/config/riscv/riscv-common.cc: Add Ztso and mark Ztso as
	dependent on 'a' extension.
	* config/riscv/riscv-opts.h (MASK_ZTSO): New mask.
	(TARGET_ZTSO): New target.
	* config/riscv/riscv.cc (riscv_memmodel_needs_amo_acquire): Add
	Ztso case.
	(riscv_memmodel_needs_amo_release): Add Ztso case.
	(riscv_print_operand): Add Ztso case for LR/SC annotations.
	* config/riscv/riscv.md: Import sync-rvwmo.md and sync-ztso.md.
	* config/riscv/riscv.opt: Add Ztso target variable.
	* config/riscv/sync.md (mem_thread_fence_1): Expand to RVWMO or
	Ztso specific insn.
	(atomic_load<mode>): Expand to RVWMO or Ztso specific insn.
	(atomic_store<mode>): Expand to RVWMO or Ztso specific insn.
	* config/riscv/sync-rvwmo.md: New file. Separate out RVWMO
	specific load/store/fence mappings.
	* config/riscv/sync-ztso.md: New file. Separate out Ztso
	specific load/store/fence mappings.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/amo-table-ztso-amo-add-1.c: New test.
	* gcc.target/riscv/amo-table-ztso-amo-add-2.c: New test.
	* gcc.target/riscv/amo-table-ztso-amo-add-3.c: New test.
	* gcc.target/riscv/amo-table-ztso-amo-add-4.c: New test.
	* gcc.target/riscv/amo-table-ztso-amo-add-5.c: New test.
	* gcc.target/riscv/amo-table-ztso-compare-exchange-1.c: New test.
	* gcc.target/riscv/amo-table-ztso-compare-exchange-2.c: New test.
	* gcc.target/riscv/amo-table-ztso-compare-exchange-3.c: New test.
	* gcc.target/riscv/amo-table-ztso-compare-exchange-4.c: New test.
	* gcc.target/riscv/amo-table-ztso-compare-exchange-5.c: New test.
	* gcc.target/riscv/amo-table-ztso-compare-exchange-6.c: New test.
	* gcc.target/riscv/amo-table-ztso-compare-exchange-7.c: New test.
	* gcc.target/riscv/amo-table-ztso-fence-1.c: New test.
	* gcc.target/riscv/amo-table-ztso-fence-2.c: New test.
	* gcc.target/riscv/amo-table-ztso-fence-3.c: New test.
	* gcc.target/riscv/amo-table-ztso-fence-4.c: New test.
	* gcc.target/riscv/amo-table-ztso-fence-5.c: New test.
	* gcc.target/riscv/amo-table-ztso-load-1.c: New test.
	* gcc.target/riscv/amo-table-ztso-load-2.c: New test.
	* gcc.target/riscv/amo-table-ztso-load-3.c: New test.
	* gcc.target/riscv/amo-table-ztso-store-1.c: New test.
	* gcc.target/riscv/amo-table-ztso-store-2.c: New test.
	* gcc.target/riscv/amo-table-ztso-store-3.c: New test.
	* gcc.target/riscv/amo-table-ztso-subword-amo-add-1.c: New test.
	* gcc.target/riscv/amo-table-ztso-subword-amo-add-2.c: New test.
	* gcc.target/riscv/amo-table-ztso-subword-amo-add-3.c: New test.
	* gcc.target/riscv/amo-table-ztso-subword-amo-add-4.c: New test.
	* gcc.target/riscv/amo-table-ztso-subword-amo-add-5.c: New test.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2023-08-10 14:07:38 -07:00
Jan Hubicka
937591d236 Fix profile update in duplicate_loop_body_to_header_edge for loops with 0 count_in
This patch makes duplicate_loop_body_to_header_edge not drop profile counts to
uninitialized when count_in is 0.  This happens because profile_probability in 0 count
is undefined.

gcc/ChangeLog:

	* cfgloopmanip.cc (duplicate_loop_body_to_header_edge): Special case loops with
	0 iteration count.
2023-08-10 19:01:43 +02:00
Jan Hubicka
546bf79bd7 Fix profile updating bug in tree-ssa-threadupdate
ssa_fix_duplicate_block_edges later calls update_profile to correct the profile after threading.
In the testcase this does not work since we lose track of the duplicated edge.  This
happens because redirect_edge_and_branch returns NULL if the edge already has the correct
destination, which is the case here.

gcc/ChangeLog:

	* tree-ssa-threadupdate.cc (ssa_fix_duplicate_block_edges): Fix profile update.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/phi_on_compare-1.c: Check profile consistency.
2023-08-10 18:39:33 +02:00
Jan Hubicka
e41103081b Fix undefined behaviour in profile_count::differs_from_p
This patch avoids overflow in profile_count::differs_from_p and also makes it
return false if one of the values is undefined while the other is defined.

gcc/ChangeLog:

	* profile-count.cc (profile_count::differs_from_p): Fix overflow and
	handling of undefined values.
2023-08-10 18:35:13 +02:00
Jakub Jelinek
8afe9d5d2f phiopt: Fix phiopt ICE on vops [PR102989]
I've run into an ICE on gcc.dg/torture/bitint-42.c with -O1 or -Os
when enabling expensive tests, and unfortunately I can't reproduce without
_BitInt.  The IL before phiopt3 has:
  <bb 87> [local count: 203190070]:
  # .MEM_428 = VDEF <.MEM_367>
  bitint.159 = VIEW_CONVERT_EXPR<unsigned long[8]>(*.LC3);
  goto <bb 89>; [100.00%]

  <bb 88> [local count: 203190070]:
  # .MEM_427 = VDEF <.MEM_367>
  bitint.159 = VIEW_CONVERT_EXPR<unsigned long[8]>(*.LC4);

  <bb 89> [local count: 406380139]:
  # .MEM_368 = PHI <.MEM_428(87), .MEM_427(88)>
  # VUSE <.MEM_368>
  _123 = VIEW_CONVERT_EXPR<unsigned long[8]>(r495[i_107].D.2780)[0];
and factor_out_conditional_operation is called on the vop PHI, it
sees it has exactly two operands and defining statements of both
PHI arguments are converts (VCEs in this case), so it thinks it is
a good idea to try to optimize that and while doing that it constructs
void type SSA_NAMEs and the like.

2023-08-10  Jakub Jelinek  <jakub@redhat.com>

	PR c/102989
	* tree-ssa-phiopt.cc (single_non_singleton_phi_for_edges): Never
	return virtual phis and return NULL if there is a virtual phi
	where the arguments from E0 and E1 edges aren't equal.
2023-08-10 17:29:23 +02:00
Richard Biener
b0894a12e9 Make ISEL used internal functions const/nothrow where appropriate
Both .VEC_SET and .VEC_EXTRACT and the various .VCOND internal functions
are operating on registers only and they are not supposed to raise
any exceptions.  The following makes them const/nothrow.  I've
verified this avoids useless SSA updates in ISEL.

	* internal-fn.def (VCOND, VCONDU, VCONDEQ, VCOND_MASK,
	VEC_SET, VEC_EXTRACT): Make ECF_CONST | ECF_NOTHROW.
2023-08-10 15:29:58 +02:00
Juzhe-Zhong
da7b43fb02 RISC-V: Add MASK vec_duplicate pattern[PR110962]
This patch fixes the bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110962

SUBROUTINE a(b,c,d)
  LOGICAL,DIMENSION(INOUT)  :: b
  LOGICAL e
  REAL, DIMENSION(IN)     ::  c
  REAL, DIMENSION(INOUT)  ::  d
  REAL, DIMENSION(SIZE(c))   :: f
  WHERE (b.AND.e)
     WHERE (f>=0.)
        d = g
     ENDWHERE
  ENDWHERE
END SUBROUTINE a

   PR target/110962

gcc/ChangeLog:
	PR target/110962
	* config/riscv/autovec.md (vec_duplicate<mode>): New pattern.
2023-08-10 21:17:19 +08:00
Pan Li
6176527a75 RISC-V: Support RVV VFNMACC rounding mode intrinsic API
This patch adds support for the rounding mode API of
VFNMACC, covering the samples below.

* __riscv_vfnmacc_vv_f32m1_rm
* __riscv_vfnmacc_vv_f32m1_rm_m
* __riscv_vfnmacc_vf_f32m1_rm
* __riscv_vfnmacc_vf_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>

gcc/ChangeLog:

	* config/riscv/riscv-vector-builtins-bases.cc
	(class vfnmacc_frm): New class for vfnmacc.
	(vfnmacc_frm_obj): New declaration.
	(BASE): Ditto.
	* config/riscv/riscv-vector-builtins-bases.h: Ditto.
	* config/riscv/riscv-vector-builtins-functions.def
	(vfnmacc_frm): New function definition.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/float-point-nmacc.c: New test.
2023-08-10 20:38:47 +08:00
Pan Li
07e93224f5 RISC-V: Support RVV VFMACC rounding mode intrinsic API
This patch adds support for the rounding mode API of
VFMACC, covering the samples below.

* __riscv_vfmacc_vv_f32m1_rm
* __riscv_vfmacc_vv_f32m1_rm_m
* __riscv_vfmacc_vf_f32m1_rm
* __riscv_vfmacc_vf_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>

gcc/ChangeLog:

	* config/riscv/riscv-vector-builtins-bases.cc
	(class vfmacc_frm): New class for vfmacc frm.
	(vfmacc_frm_obj): New declaration.
	(BASE): Ditto.
	* config/riscv/riscv-vector-builtins-bases.h: Ditto.
	* config/riscv/riscv-vector-builtins-functions.def
	(vfmacc_frm): New function definition.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/float-point-macc.c: New test.
2023-08-10 20:37:35 +08:00
Juzhe-Zhong
887f13916b RISC-V: Support TU for integer ternary OP[PR110964]
PR target/110964

gcc/ChangeLog:
	PR target/110964
	* config/riscv/riscv-v.cc (expand_cond_len_ternop): Add integer ternary.

gcc/testsuite/ChangeLog:
	PR target/110964
	* gcc.target/riscv/rvv/autovec/pr110964.c: New test.
2023-08-10 20:29:47 +08:00
Richard Biener
9b8ebdb60c Remove insert location argument from vectorizable_live_operation
The insert location argument isn't actually used but we compute
that ourselves.  There's a single spot, namely when asking
for the loop mask via vect_get_loop_mask that the passed argument
is used but that looks like an oversight.  The following fixes that
and adjusts vectorizable_live_operation and can_vectorize_live_stmts
to no longer take a stmt iterator argument.

	* tree-vectorizer.h (vectorizable_live_operation): Remove
	gimple_stmt_iterator * argument.
	* tree-vect-loop.cc (vectorizable_live_operation): Likewise.
	Adjust plumbing around vect_get_loop_mask.
	(vect_analyze_loop_operations): Adjust.
	* tree-vect-slp.cc (vect_slp_analyze_node_operations_1): Likewise.
	(vect_bb_slp_mark_live_stmts): Likewise.
	(vect_schedule_slp_node): Likewise.
	* tree-vect-stmts.cc (can_vectorize_live_stmts): Likewise.
	Remove gimple_stmt_iterator * argument.
	(vect_transform_stmt): Adjust.
2023-08-10 14:23:16 +02:00
Juzhe-Zhong
6bdbf1722a RISC-V: Add missing modes to the iterators
gcc/ChangeLog:

	* config/riscv/vector-iterators.md: Add missing modes.
2023-08-10 20:21:34 +08:00
Jakub Jelinek
d5ad55a83d lto-streamer-in: Adjust assert [PR102989]
With _BitInt(575) or any other _BitInt(513) or larger constants we can
run into this assertion.  MAX_BITSIZE_MODE_ANY_INT is just a value from
which WIDE_INT_MAX_PRECISION is derived.

2023-08-10  Jakub Jelinek  <jakub@redhat.com>

	PR c/102989
	* lto-streamer-in.cc (lto_input_tree_1): Assert TYPE_PRECISION
	is up to WIDE_INT_MAX_PRECISION rather than MAX_BITSIZE_MODE_ANY_INT.
2023-08-10 09:23:08 +02:00
Jakub Jelinek
b129d6b5f5 expr: Small optimization [PR102989]
Small optimization to avoid testing modifier multiple times.

2023-08-10  Jakub Jelinek  <jakub@redhat.com>

	PR c/102989
	* expr.cc (expand_expr_real_1) <case MEM_REF>: Add an early return for
	EXPAND_WRITE or EXPAND_MEMORY modifiers to avoid testing it multiple
	times.
2023-08-10 09:22:03 +02:00
liuhongt
0c563a935c i386: Do not sanitize upper part of V2HFmode and V4HFmode reg with -fno-trapping-math [PR110832]
Also add ix86_partial_vec_fp_math to the condition of V2HF/V4HF named
patterns in order to avoid generation of partial vector V8HFmode
trapping instructions.

gcc/ChangeLog:

	PR target/110832
	* config/i386/mmx.md: (movq_<mode>_to_sse): Also do not
	sanitize upper part of V4HFmode register with
	-fno-trapping-math.
	(<insn>v4hf3): Enable for ix86_partial_vec_fp_math.
	(divv4hf3): Ditto.
	(<insn>v2hf3): Ditto.
	(divv2hf3): Ditto.
	(movd_v2hf_to_sse): Do not sanitize upper part of V2HFmode
	register with -fno-trapping-math.
2023-08-10 14:06:18 +08:00
Pan Li
4cede0de9a RISC-V: Refactor RVV frm_mode attr for rounding mode intrinsic
The frm_mode attr has some assumptions for each define insn as below.

1. The define insn has at least 9 operands.
2. The operands[9] must be frm reg.
3. The operands[9] must be const int.

Actually, the frm operand can be operands[8], operands[9] or
operands[10], and not every define insn has an frm operand.

This patch refactors frm to eliminate the above
assumptions and to unblock the underlying rounding mode intrinsic
API support.

After the refactor, the default frm will be none, and the selected insn type
will be dyn. For the floating-point insns that honor the frm, we
set the frm_mode attr explicitly in define_insn.

Signed-off-by: Pan Li <pan2.li@intel.com>
Co-Authored-by: Kito Cheng <kito.cheng@sifive.com>

gcc/ChangeLog:

	* config/riscv/riscv-protos.h
	(enum floating_point_rounding_mode): Add NONE, DYN_EXIT and DYN_CALL.
	(get_frm_mode): New declaration.
	* config/riscv/riscv-v.cc (get_frm_mode): New function to get frm mode.
	* config/riscv/riscv-vector-builtins.cc
	(function_expander::use_ternop_insn): Take care of frm reg.
	* config/riscv/riscv.cc (riscv_static_frm_mode_p): Migrate to FRM_XXX.
	(riscv_emit_frm_mode_set): Ditto.
	(riscv_emit_mode_set): Ditto.
	(riscv_frm_adjust_mode_after_call): Ditto.
	(riscv_frm_mode_needed): Ditto.
	(riscv_frm_mode_after): Ditto.
	(riscv_mode_entry): Ditto.
	(riscv_mode_exit): Ditto.
	* config/riscv/riscv.h (NUM_MODES_FOR_MODE_SWITCHING): Ditto.
	* config/riscv/vector.md
	(rne,rtz,rdn,rup,rmm,dyn,dyn_exit,dyn_call,none): Removed
	(symbol_ref): * config/riscv/vector.md: Set frm_mode attr explicitly.
2023-08-10 12:36:14 +08:00
GCC Administrator
9b099a83b4 Daily bump. 2023-08-10 00:17:26 +00:00
Juzhe-Zhong
83c77b31b8 RISC-V: Fix VLMAX AVL incorrect local anticipate [VSETVL PASS]
We have a bug in the VSETVL pass that is triggered by strided_load_run-1.c on RV32 systems.

FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c execution test

This is because the VSETVL pass incorrectly hoists the vsetvl instruction:

...
   10156:	0d9075d7          	vsetvli	a1,zero,e64,m2,ta,ma ---> pollute 'a1' register which will be used by following insns.
   1015a:	01d586b3          	add	a3,a1,t4  --------> use 'a1'
   1015e:	5e070257          	vmv.v.v	v4,v14
   10162:	b7032257          	vmacc.vv	v4,v6,v16
   10166:	26440257          	vand.vv	v4,v4,v8
   1016a:	22880227          	vs2r.v	v4,(a6)
   1016e:	00b6b7b3          	sltu	a5,a3,a1
   10172:	22888227          	vs2r.v	v4,(a7)
   10176:	9e60b157          	vmv2r.v	v2,v6
   1017a:	97ba                	add	a5,a5,a4
   1017c:	a6a62157          	vmadd.vv	v2,v12,v10
   10180:	26240157          	vand.vv	v2,v2,v8
   10184:	22830127          	vs2r.v	v2,(t1)
   10188:	873e                	mv	a4,a5
   1018a:	982a                	add	a6,a6,a0
   1018c:	98aa                	add	a7,a7,a0
   1018e:	932a                	add	t1,t1,a0
   10190:	85b6                	mv	a1,a3       -----> set 'a1'
...

gcc/ChangeLog:

	* config/riscv/riscv-vsetvl.cc (anticipatable_occurrence_p): Fix
	incorrect anticipate info.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c:
	Adapt test.
	* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-24.c: Ditto.
	* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-25.c: Ditto.
	* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-26.c: Ditto.
	* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-36.c: Ditto.
	* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-14.c: Ditto.
	* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-15.c: Ditto.
2023-08-10 07:38:01 +08:00
David Malcolm
73da34a538 analyzer: remove default return value from region_model::on_call_pre
Previously, the code for simulating calls to external functions in
region_model::on_call_pre wrote a default svalue to the LHS of the
call statement, which could be further overwritten by known_function
subclasses.

Unfortunately, this led to messy hacks, such as when the default svalue
was an allocation: the LHS would be written to with two different
heap-allocated regions, requiring special-case cleanups to avoid the
stray state from the first heap allocation leading to state explosions;
see r14-3001-g021077b94741c9.

The following patch eliminates this write of a default svalue to the LHS
of callsite.  Instead, all known_function implementations that have a
return value are now responsible for setting the LHS themselves.  A new
call_details::set_any_lhs_with_defaults function is provided to make it
easy to get the old behavior.

On working through the various known_function subclasses, I noticed that
memset was using the default behavior.  This patch updates this so that
it's now known to return its first parameter.
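
The property the analyzer now models is the one the C standard specifies for
memset, which returns the value of its first argument. A minimal sketch of
that property (the helper name is hypothetical, and this is not the
memset-1.c test):

```c
#include <assert.h>
#include <string.h>

/* Returns nonzero if memset handed back the pointer it was given,
   which is what the C standard specifies.  */
int
memset_returns_dest (void)
{
  char buf[16];
  void *ret = memset (buf, 0, sizeof buf);
  return ret == buf;
}

/* Aborts through assert if the property does not hold.  */
void
check_memset_return (void)
{
  assert (memset_returns_dest ());
}
```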

Cleaning this up eliminates various doubling of saved_diagnostics (e.g.
for dubious_allocation_size) where it was generating a diagnostic for
both writes to the LHS, deduplicating them to the first diagnostic (with
the default LHS), and then failing to create a region_creation_event
when emitting the diagnostic, leading to the fallback wording in
dubious_allocation_size::describe_final_event, such as:

  (1) allocated 42 bytes and assigned to ‘int32_t *’ {aka ‘int *’} here; ‘sizeof (int32_t {aka int})’ is ‘4’

Without the double write to the LHS, it creates a region_creation_event,
so we get the allocation and the assignment as two separate events in
the diagnostic path, e.g.:

  (1) allocated 42 bytes here
  (2) assigned to ‘int32_t *’ {aka ‘int *’} here; ‘sizeof (int32_t {aka int})’ is ‘4’

gcc/analyzer/ChangeLog:
	* analyzer.h (class pure_known_function_with_default_return): New
	subclass.
	* call-details.cc (const_fn_p): Move here from region-model.cc.
	(maybe_get_const_fn_result): Likewise.
	(get_result_size_in_bytes): Likewise.
	(call_details::set_any_lhs_with_defaults): New function, based on
	code in region_model::on_call_pre.
	* call-details.h (call_details::set_any_lhs_with_defaults): New
	decl.
	* diagnostic-manager.cc
	(diagnostic_manager::emit_saved_diagnostic): Log the index of the
	saved_diagnostic.
	* kf.cc (pure_known_function_with_default_return::impl_call_pre):
	New.
	(kf_memset::impl_call_pre): Set the LHS to the first param.
	(kf_putenv::impl_call_pre): Call cd.set_any_lhs_with_defaults.
	(kf_sprintf::impl_call_pre): Call cd.set_any_lhs_with_defaults.
	(class kf_stack_restore): Derive from
	pure_known_function_with_default_return.
	(class kf_stack_save): Likewise.
	(kf_strlen::impl_call_pre): Call cd.set_any_lhs_with_defaults.
	* region-model-reachability.cc (reachable_regions::handle_sval):
	Remove logic for symbolic regions for pointers.
	* region-model.cc (region_model::canonicalize): Remove purging of
	dynamic extents workaround for surplus values from
	region_model::on_call_pre's default LHS code.
	(const_fn_p): Move to call-details.cc.
	(maybe_get_const_fn_result): Likewise.
	(get_result_size_in_bytes): Likewise.
	(region_model::update_for_nonzero_return): Call
	cd.set_any_lhs_with_defaults.
	(region_model::on_call_pre): Remove the assignment to the LHS of a
	default return value, instead requiring all known_function
	implementations to write to any LHS of the call.  Use
	cd.set_any_lhs_with_defaults on the non-kf paths.
	* sm-fd.cc (kf_socket::outcome_of_socket::update_model): Use
	cd.set_any_lhs_with_defaults when failing to get at fd state.
	(kf_bind::outcome_of_bind::update_model): Likewise.
	(kf_listen::outcome_of_listen::update_model): Likewise.
	(kf_accept::outcome_of_accept::update_model): Likewise.
	(kf_connect::outcome_of_connect::update_model): Likewise.
	(kf_read::impl_call_pre): Use cd.set_any_lhs_with_defaults.
	* sm-file.cc (class kf_stdio_output_fn): Derive from
	pure_known_function_with_default_return.
	(class kf_ferror): Likewise.
	(class kf_fileno): Likewise.
	(kf_fgets::impl_call_pre): Use cd.set_any_lhs_with_defaults.
	(kf_read::impl_call_pre): Likewise.
	(class kf_getc): Derive from
	pure_known_function_with_default_return.
	(class kf_getchar): Likewise.
	* varargs.cc (kf_va_arg::impl_call_pre): Use
	cd.set_any_lhs_with_defaults.

gcc/testsuite/ChangeLog:
	* gcc.dg/analyzer/allocation-size-1.c: Update expected results
	to reflect splitting of allocation size and assignment messages
	from a single event into pairs of events
	* gcc.dg/analyzer/allocation-size-2.c: Likewise.
	* gcc.dg/analyzer/allocation-size-3.c: Likewise.
	* gcc.dg/analyzer/allocation-size-4.c: Likewise.
	* gcc.dg/analyzer/allocation-size-multiline-1.c: Likewise.
	* gcc.dg/analyzer/allocation-size-multiline-2.c: Likewise.
	* gcc.dg/analyzer/allocation-size-multiline-3.c: Likewise.
	* gcc.dg/analyzer/memset-1.c (test_1): Verify that the return
	value is the initial argument.
	* gcc.dg/plugin/analyzer_kernel_plugin.c
	(copy_across_boundary_fn::impl_call_pre): Ensure the LHS is set on
	the "known zero size" case.
	* gcc.dg/plugin/analyzer_known_fns_plugin.c
	(known_function_attempt_to_copy::impl_call_pre): Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2023-08-09 16:17:04 -04:00
Tsukasa OI
e5fe7f2fd6 RISC-V: Remove non-existing 'Zve32d' extension
Since this extension does not exist, this commit prunes this from
the defined extension version table.

gcc/ChangeLog:

	* common/config/riscv/riscv-common.cc (riscv_ext_version_table):
	Remove 'Zve32d' from the version list.
2023-08-09 13:59:43 -06:00
Jin Ma
f088b768d0 RISC-V: Handle no_insn in TARGET_SCHED_VARIABLE_ISSUE.
Reference: d0bc0cb66b

RISC-V should also handle no_insn patterns for pipelining.

gcc/ChangeLog:

	* config/riscv/riscv.cc (riscv_sched_variable_issue): New function.
	(TARGET_SCHED_VARIABLE_ISSUE): New macro.

	Co-authored-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
	Co-authored-by: Jeff Law <jlaw@ventanamicro.com>
2023-08-09 13:52:06 -06:00
Jivan Hakobyan
a16dc729fd RISC-V: Folding memory for FP + constant case
Accessing a local array element is turned into a load from an address of the
form (fp + (index << C1)) + C2.

When the access is in a loop, we get a loop-invariant computation.  For
some reason, moving that part out cannot be done in the loop-invariant passes.  But
we can handle it in the target-specific hook (legitimize_address).  That provides
an opportunity to rewrite the memory access in a form more suitable for the target
architecture.

This patch solves the mentioned case by rewriting it to ((fp +
C2) + (index << C1)).
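
The reassociation can be sketched with plain integer arithmetic. This is only
an illustration of why the two forms compute the same address and which part
becomes loop invariant; it does not show the RTL handled by
riscv_legitimize_address, and all names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: (fp + (index << c1)) + c2.  */
uintptr_t
addr_before (uintptr_t fp, uintptr_t index, unsigned c1, uintptr_t c2)
{
  return (fp + (index << c1)) + c2;
}

/* Rewritten form: (fp + c2) + (index << c1).  The sum fp + c2 does not
   depend on index, so it can be computed once outside a loop over index.  */
uintptr_t
addr_after (uintptr_t fp, uintptr_t index, unsigned c1, uintptr_t c2)
{
  uintptr_t base = fp + c2;   /* loop-invariant part */
  return base + (index << c1);
}

/* Unsigned addition is associative, so both forms always agree.  */
void
check_addr_forms (void)
{
  for (uintptr_t index = 0; index < 64; index++)
    assert (addr_before (0x1000, index, 2, 16)
	    == addr_after (0x1000, index, 2, 16));
}
```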

I have evaluated it on SPEC2017 and got an improvement on leela (over 7b
instructions, .39% of the dynamic count), which dwarfs the regression for gcc (14m
instructions, .0012% of the dynamic count).

gcc/ChangeLog:
	* config/riscv/riscv.cc (riscv_legitimize_address): Handle folding.
	(mem_shadd_or_shadd_rtx_p): New function.
2023-08-09 13:29:27 -06:00
Andrew Pinski
7fb65f1028 MATCH: [PR110937/PR100798] (a ? ~b : b) should be optimized to b ^ -(a)
This adds a simple match pattern for this case.
I noticed it in a couple of different places:
once while I was looking at the code generation of a parser, and
again while I was looking at locations where bitwise_inverted_equal_p
should be used more.
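
The identity behind the pattern can be checked directly in C for a boolean
`a`. This is a sketch with hypothetical function names, not one of the new
tests: when a is 1, -(a) is all ones and the XOR inverts every bit of b; when
a is 0, the XOR leaves b unchanged.

```c
#include <assert.h>
#include <limits.h>

/* Source form: select between ~b and b.  */
int
select_form (_Bool a, int b)
{
  return a ? ~b : b;
}

/* Target form: b ^ -(a).  -(int) a is 0 for a == 0 and -1 (all ones)
   for a == 1.  */
int
xor_form (_Bool a, int b)
{
  return b ^ -(int) a;
}

/* Aborts through assert if the two forms ever disagree.  */
void
check_select_xor_equivalence (void)
{
  int samples[] = { 0, 1, -1, 42, INT_MIN, INT_MAX };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    {
      assert (select_form (0, samples[i]) == xor_form (0, samples[i]));
      assert (select_form (1, samples[i]) == xor_form (1, samples[i]));
    }
}
```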

Committed as approved after bootstrapped and tested on x86_64-linux-gnu with no regressions.

	PR tree-optimization/110937
	PR tree-optimization/100798

gcc/ChangeLog:

	* match.pd (`a ? ~b : b`): Handle this
	case.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/bool-14.c: New test.
	* gcc.dg/tree-ssa/bool-15.c: New test.
	* gcc.dg/tree-ssa/phi-opt-33.c: New test.
	* gcc.dg/tree-ssa/20030709-2.c: Update testcase
	so `a ? -1 : 0` is not used to hit the match
	pattern.
2023-08-09 12:21:04 -07:00
Uros Bizjak
5c27c911f6 i386: Add missing dot to -mpartial-vector-fp-math description
gcc/ChangeLog:

	* config/i386/i386.opt (mpartial-vector-fp-math): Add dot.
2023-08-09 19:42:55 +02:00
Richard Ball
464e207496 aarch64: Add support for Cortex-A520 CPU
This patch adds support for the Cortex-A520 CPU to GCC.

gcc/ChangeLog:

	* config/aarch64/aarch64-cores.def (AARCH64_CORE): Add Cortex-A520 CPU.
	* config/aarch64/aarch64-tune.md: Regenerate.
	* doc/invoke.texi: Document Cortex-A520 CPU.
2023-08-09 16:32:33 +01:00
Carl Love
29e2bc5f9a rs6000: Fix __builtin_altivec_vcmpne{b,h,w} implementation
The current built-in definitions for vcmpneb, vcmpneh, vcmpnew are defined
under the Power 9 section of rs6000-builtins.  This implies they are only
supported on Power 9 and above when in fact they are defined and work with
Altivec as well with the appropriate Altivec instruction generation.

The vec_cmpne builtin should generate the vcmpequ{b,h,w} instruction with
Altivec enabled and generate the vcmpne{b,h,w} on Power 9 and newer
processors.

This patch moves the definitions to the Altivec stanza to make it clear
the built-ins are supported for all Altivec processors.  The patch
removes the confusion as to which processors support the vcmpequ{b,h,w}
instructions.

There is existing test coverage for the vec_cmpne built-in for
vector bool char, vector bool short, vector bool int,
vector bool long long in builtins-3-p9.c and p8vector-builtin-2.c.
Coverage for vector signed int, vector unsigned int is in
p8vector-builtin-2.c.

Test vec-cmpne.c is updated to check the generation of the vcmpequ{b,h,w}
instructions for Altivec.  A new test vec-cmpne-runnable.c is added to
verify the built-ins work as expected.

Patch has been tested on Power 8 LE/BE, Power 9 LE/BE and Power 10 LE
with no regressions.

gcc/ChangeLog:

	* config/rs6000/rs6000-builtins.def (vcmpneb, vcmpneh, vcmpnew):
	Move definitions to Altivec stanza.
	* config/rs6000/altivec.md (vcmpneb, vcmpneh, vcmpnew): New
	define_expand.

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/vec-cmpne-runnable.c: New execution test.
	* gcc.target/powerpc/vec-cmpne.c (define_test_functions,
	execute_test_functions): Move to vec-cmpne.h.  Add
	scan-assembler-times for vcmpequb, vcmpequh, vcmpequw.
	* gcc.target/powerpc/vec-cmpne.h: New include file for vec-cmpne.c
	and vec-cmpne-runnable.c. Split define_test_functions definition
	into define_test_functions and define_init_verify_functions.
2023-08-09 11:30:48 -04:00
Jonathan Wakely
b3a2b307b9 libstdc++: Fix constexpr functions to conform to older standards
Some constexpr functions were inadvertently relying on relaxed constexpr
rules from later standards.

libstdc++-v3/ChangeLog:

	* include/bits/chrono.h (duration_cast): Do not use braces
	around statements for C++11 constexpr rules.
	* include/bits/stl_algobase.h (__lg): Rewrite as a single
	statement for C++11 constexpr rules.
	* include/experimental/bits/fs_path.h (path::string): Use
	_GLIBCXX17_CONSTEXPR not _GLIBCXX_CONSTEXPR for 'if constexpr'.
	* include/std/charconv (__to_chars_8): Initialize variable for
	C++17 constexpr rules.
2023-08-09 15:19:16 +01:00
Jonathan Wakely
9bd194434a libstdc++: Fix a -Wsign-compare warning in std::list
libstdc++-v3/ChangeLog:

	* include/bits/list.tcc (list::sort(Cmp)): Fix -Wsign-compare
	warning for loop condition.
2023-08-09 15:19:16 +01:00
Jonathan Wakely
798b1f0476 libstdc++: Suppress clang -Wc99-extensions warnings in <complex>
This prevents Clang from warning about the use of the non-standard
__complex__ keyword.

libstdc++-v3/ChangeLog:

	* include/std/complex: Add diagnostic pragma for clang.
2023-08-09 15:19:16 +01:00
Jonathan Wakely
5b46eacc49 libstdc++: Fix some -Wmismatched-tags warnings
libstdc++-v3/ChangeLog:

	* include/bits/shared_ptr_atomic.h (atomic): Change class-head
	to struct.
	* include/bits/stl_tree.h (_Rb_tree_merge_helper): Change
	class-head to struct in friend declaration.
	* include/std/chrono (tzdb_list::_Node): Likewise.
	* include/std/future (_Task_state_base, _Task_state): Likewise.
	* include/std/scoped_allocator (__inner_type_impl): Likewise.
	* include/std/valarray (_BinClos, _SClos, _GClos, _IClos)
	(_ValFunClos, _RefFunClos): Change class-head to struct.
2023-08-09 15:19:15 +01:00
Jonathan Wakely
af89c7792d libstdc++: Fix some -Wunused-parameter warnings
libstdc++-v3/ChangeLog:

	* include/bits/alloc_traits.h (allocate): Add [[maybe_unused]]
	attribute.
	* include/bits/regex_executor.tcc: Remove name of unused
	parameter.
	* include/bits/shared_ptr_atomic.h (atomic_is_lock_free):
	Likewise.
	* include/bits/stl_uninitialized.h: Likewise.
	* include/bits/streambuf_iterator.h (operator==): Likewise.
	* include/bits/uses_allocator.h: Likewise.
	* include/c_global/cmath (isfinite, isinf, isnan): Likewise.
	* include/std/chrono (zoned_time): Likewise.
	* include/std/future (__future_base::_S_allocate_result):
	Likewise.
	(packaged_task): Likewise.
	* include/std/optional (_Optional_payload_base): Likewise.
	* include/std/scoped_allocator (__inner_type_impl): Likewise.
	* include/std/tuple (_Tuple_impl): Likewise.
2023-08-09 15:19:15 +01:00
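
The two remedies named in the ChangeLog above (removing the parameter name, or adding [[maybe_unused]]) look roughly like this. Both functions are hypothetical examples, not code from libstdc++, and [[maybe_unused]] assumes C++17 or later.

```cpp
#include <cassert>
#include <cstddef>

// Option 1: leave the parameter unnamed. The signature is unchanged,
// and there is no named parameter left for -Wunused-parameter to flag.
inline bool always_lock_free(const void*)
{
  return true;
}

// Option 2: keep the name (useful for documentation) and mark it
// [[maybe_unused]] so the compiler knows the omission is deliberate.
inline std::size_t bytes_for(std::size_t count,
                             [[maybe_unused]] const void* hint)
{
  return count * sizeof(int);
}
```

Leaving the parameter unnamed works in every C++ standard, which may be why a header that must also compile as C++11 would prefer it where the name adds nothing.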
Jonathan Wakely
008e439f34 libstdc++: Explicitly default some copy ctors and assignments
The standard says that the implicit copy assignment operator is
deprecated for classes that have a user-provided copy constructor, and
vice versa.

libstdc++-v3/ChangeLog:

	* include/bits/new_allocator.h (__new_allocator): Define copy
	assignment operator as defaulted.
	* include/std/complex (complex<float>, complex<double>)
	(complex<long double>): Define copy constructor as defaulted.
2023-08-09 15:19:15 +01:00
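
A minimal sketch of the pattern described in the commit above, using a hypothetical class (Meters is not part of libstdc++). The class has a user-provided copy constructor, so relying on the implicit copy assignment operator would be deprecated; declaring it as defaulted keeps the same behaviour while making the intent explicit.

```cpp
#include <cassert>

// Hypothetical class for illustration only.
struct Meters
{
  double value;

  explicit Meters(double v) : value(v) { }

  // User-provided copy constructor.
  Meters(const Meters& other) : value(other.value) { }

  // Explicitly defaulted copy assignment. Without this declaration the
  // operator would still be generated implicitly, but that implicit
  // generation is deprecated and GCC's -Wdeprecated-copy may warn.
  Meters& operator=(const Meters&) = default;
};
```

Assigning one Meters object to another still performs a memberwise copy, exactly as the implicit operator would.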
Jonathan Wakely
b9e5a4b4f0 libstdc++: Minor fixes for some warnings in <format>
libstdc++-v3/ChangeLog:

	* include/std/format: Fix some warnings.
	(__format::__write(Ctx&, basic_string_view<CharT>)): Remove
	unused function template.
2023-08-09 15:19:15 +01:00
Juzhe-Zhong
c4d6181430 RISC-V: Support NPATTERNS = 1 stepped vector[PR110950]
This patch fixes an ICE: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110950

0x1cf8939 expand_const_vector
        ../../../riscv-gcc/gcc/config/riscv/riscv-v.cc:1587

	PR target/110950

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (expand_const_vector): Add NPATTERNS = 1
	stepped vector support.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/pr110950.c: New test.
2023-08-09 21:37:22 +08:00
Paul Thomas
b8ec3c9523 Fortran: Allow pure final procs contained in pure proc. [PR109684]
2023-08-09  Steve Kargl  <sgk@troutmask.apl.washington.edu>

gcc/fortran
	PR fortran/109684
	* resolve.cc (resolve_types): Exclude contained procedures with
	the artificial attribute from test for pureness.
2023-08-09 12:04:09 +01:00
Gaius Mulley
e3476ed223 PR modula2/110779: libgm2 fix solaris bootstrap check for tm_gmtoff
This patch defensively checks for every C function and every struct
used in wrapclock.cc.  It adds return values to GetTimespec and
SetTimespec to allow the module to return a code representing
unavailable.

gcc/m2/ChangeLog:

	PR modula2/110779
	* gm2-libs-iso/SysClock.mod (GetClock): Test GetTimespec
	return value.
	(SetClock): Test SetTimespec return value.
	* gm2-libs-iso/wrapclock.def (GetTimespec): Add integer
	return type.
	(SetTimespec): Add integer return type.

libgm2/ChangeLog:

	PR modula2/110779
	* config.h.in: Regenerate.
	* configure: Regenerate.
	* configure.ac (AC_CACHE_CHECK): Check for tm_gmtoff field in
	struct tm.
	(GM2_CHECK_LIB): Check for daylight, timezone and tzname.
	* libm2iso/wrapclock.cc (timezone): Guard against absence of
	struct tm and tm_gmtoff.
	(daylight): Check for daylight.
	(timezone): Check for timezone.
	(isdst): Check for isdst.
	(tzname): Check for tzname.
	(GetTimeRealtime): Check for struct timespec.
	(SetTimeRealtime): Check for struct timespec.
	(InitTimespec): Check for struct timespec.
	(KillTimespec): Check for struct timespec.
	(SetTimespec): Check for struct timespec.
	(GetTimespec): Check for struct timespec.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2023-08-09 09:35:13 +01:00
liuhongt
6ef7956e9d Rename local variable subleaf_level to max_subleaf_level.
gcc/ChangeLog:

	* common/config/i386/cpuinfo.h (get_available_features):
	Rename local variable subleaf_level to max_subleaf_level.
2023-08-09 15:47:48 +08:00
Richard Biener
b66e613a1a rtl-optimization/110587 - speedup find_hard_regno_for_1
The following applies a micro-optimization to find_hard_regno_for_1,
re-ordering the check so we can easily jump-thread by using an else.
This reduces the time spent in this function by 15% for the testcase
in the PR.

	PR rtl-optimization/110587
	* lra-assigns.cc (find_hard_regno_for_1): Re-order checks.
2023-08-09 08:46:58 +02:00
Kewen Lin
0412f0e374 rs6000: Teach legitimate_address_p about LEN_{LOAD,STORE} [PR110248]
This patch teaches rs6000_legitimate_address_p to
handle the queried rtx constructed for LEN_{LOAD,STORE}.
Since lxvl and stxvl don't support x-form or ds-form,
the address is considered not legitimate when the outer code is PLUS.

	PR tree-optimization/110248

gcc/ChangeLog:

	* config/rs6000/rs6000.cc (rs6000_legitimate_address_p): Check if
	the given code is for ifn LEN_{LOAD,STORE}, if yes then make it not
	legitimate when outer code is PLUS.
2023-08-09 01:16:05 -05:00
Kewen Lin
4a8e6fa801 ivopts: Call valid_mem_ref_p with ifn [PR110248]
As PR110248 shows, to get the expected query results on
whether the internal functions LEN_{LOAD,STORE} can adopt
some addressing modes, we need to pass down the related
IFN code as well.  This patch makes IVOPTs pass down the
ifn code for USE_PTR_ADDRESS type uses, and adjusts the
related functions {strict_,}memory_address_addr_space_p
and valid_mem_ref_p accordingly.

	PR tree-optimization/110248

gcc/ChangeLog:

	* recog.cc (memory_address_addr_space_p): Add one more argument ch of
	type code_helper and pass it to targetm.addr_space.legitimate_address_p
	instead of ERROR_MARK.
	(offsettable_address_addr_space_p): Update one function pointer with
	one more argument of type code_helper as its assignees
	memory_address_addr_space_p and strict_memory_address_addr_space_p
	have been adjusted, and adjust some call sites with ERROR_MARK.
	* recog.h (tree.h): New include header file for tree_code ERROR_MARK.
	(memory_address_addr_space_p): Adjust with one more unnamed argument
	of type code_helper with default ERROR_MARK.
	(strict_memory_address_addr_space_p): Likewise.
	* reload.cc (strict_memory_address_addr_space_p): Add one unnamed
	argument of type code_helper.
	* tree-ssa-address.cc (valid_mem_ref_p): Add one more argument ch of
	type code_helper and pass it to memory_address_addr_space_p.
	* tree-ssa-address.h (valid_mem_ref_p): Adjust the declaration with
	one more unnamed argument of type code_helper with default value
	ERROR_MARK.
	* tree-ssa-loop-ivopts.cc (get_address_cost): Use ERROR_MARK as code
	by default, change it with ifn code for USE_PTR_ADDRESS type use, and
	pass it to all valid_mem_ref_p calls.
2023-08-09 01:16:05 -05:00
Kewen Lin
165b1f6ad1 targhooks: Extend legitimate_address_p with code_helper [PR110248]
As PR110248 shows, some middle-end passes like IVOPTs can
query the target hook legitimate_address_p with an
artificially constructed rtx to determine whether some
addressing modes are supported by the target for some gimple
statement.  But the existing legitimate_address_p only
checks the given mode, so it is unable to distinguish
some special cases.  For example, the LEN_LOAD ifn on the
Power port is expanded with the lxvl hardware insn, which
only supports one register to hold the address (the other
register holds the length), so the base (reg) + index (reg)
addressing mode is not supported.  But the hook only
considers the given mode, which would be some vector mode
for the LEN_LOAD ifn, and base + index addressing is
supported for normal vector load and store insns, so the
hook unexpectedly returns true for the query.

This patch introduces one extra argument of type
code_helper for the hook legitimate_address_p, so that
targets can handle special cases like the one described
above.

	PR tree-optimization/110248

gcc/ChangeLog:

	* coretypes.h (class code_helper): Add forward declaration.
	* doc/tm.texi: Regenerate.
	* lra-constraints.cc (valid_address_p): Call target hook
	targetm.addr_space.legitimate_address_p with an extra parameter
	ERROR_MARK as its prototype changes.
	* recog.cc (memory_address_addr_space_p): Likewise.
	* reload.cc (strict_memory_address_addr_space_p): Likewise.
	* target.def (legitimate_address_p, addr_space.legitimate_address_p):
	Extend with one more argument of type code_helper, update the
	documentation accordingly.
	* targhooks.cc (default_legitimate_address_p): Adjust for the
	new code_helper argument.
	(default_addr_space_legitimate_address_p): Likewise.
	* targhooks.h (default_legitimate_address_p): Likewise.
	(default_addr_space_legitimate_address_p): Likewise.
	* config/aarch64/aarch64.cc (aarch64_legitimate_address_hook_p): Adjust
	with extra unnamed code_helper argument with default ERROR_MARK.
	* config/alpha/alpha.cc (alpha_legitimate_address_p): Likewise.
	* config/arc/arc.cc (arc_legitimate_address_p): Likewise.
	* config/arm/arm-protos.h (arm_legitimate_address_p): Likewise.
	(tree.h): New include for tree_code ERROR_MARK.
	* config/arm/arm.cc (arm_legitimate_address_p): Adjust with extra
	unnamed code_helper argument with default ERROR_MARK.
	* config/avr/avr.cc (avr_addr_space_legitimate_address_p): Likewise.
	* config/bfin/bfin.cc (bfin_legitimate_address_p): Likewise.
	* config/bpf/bpf.cc (bpf_legitimate_address_p): Likewise.
	* config/c6x/c6x.cc (c6x_legitimate_address_p): Likewise.
	* config/cris/cris-protos.h (cris_legitimate_address_p): Likewise.
	(tree.h): New include for tree_code ERROR_MARK.
	* config/cris/cris.cc (cris_legitimate_address_p): Adjust with extra
	unnamed code_helper argument with default ERROR_MARK.
	* config/csky/csky.cc (csky_legitimate_address_p): Likewise.
	* config/epiphany/epiphany.cc (epiphany_legitimate_address_p):
	Likewise.
	* config/frv/frv.cc (frv_legitimate_address_p): Likewise.
	* config/ft32/ft32.cc (ft32_addr_space_legitimate_address_p): Likewise.
	* config/gcn/gcn.cc (gcn_addr_space_legitimate_address_p): Likewise.
	* config/h8300/h8300.cc (h8300_legitimate_address_p): Likewise.
	* config/i386/i386.cc (ix86_legitimate_address_p): Likewise.
	* config/ia64/ia64.cc (ia64_legitimate_address_p): Likewise.
	* config/iq2000/iq2000.cc (iq2000_legitimate_address_p): Likewise.
	* config/lm32/lm32.cc (lm32_legitimate_address_p): Likewise.
	* config/loongarch/loongarch.cc (loongarch_legitimate_address_p):
	Likewise.
	* config/m32c/m32c.cc (m32c_legitimate_address_p): Likewise.
	(m32c_addr_space_legitimate_address_p): Likewise.
	* config/m32r/m32r.cc (m32r_legitimate_address_p): Likewise.
	* config/m68k/m68k.cc (m68k_legitimate_address_p): Likewise.
	* config/mcore/mcore.cc (mcore_legitimate_address_p): Likewise.
	* config/microblaze/microblaze-protos.h (tree.h): New include for
	tree_code ERROR_MARK.
	(microblaze_legitimate_address_p): Adjust with extra unnamed
	code_helper argument with default ERROR_MARK.
	* config/microblaze/microblaze.cc (microblaze_legitimate_address_p):
	Likewise.
	* config/mips/mips.cc (mips_legitimate_address_p): Likewise.
	* config/mmix/mmix.cc (mmix_legitimate_address_p): Likewise.
	* config/mn10300/mn10300.cc (mn10300_legitimate_address_p): Likewise.
	* config/moxie/moxie.cc (moxie_legitimate_address_p): Likewise.
	* config/msp430/msp430.cc (msp430_legitimate_address_p): Likewise.
	(msp430_addr_space_legitimate_address_p): Adjust with extra code_helper
	argument with default ERROR_MARK and adjust the call to function
	msp430_legitimate_address_p.
	* config/nds32/nds32.cc (nds32_legitimate_address_p): Adjust with extra
	unnamed code_helper argument with default ERROR_MARK.
	* config/nios2/nios2.cc (nios2_legitimate_address_p): Likewise.
	* config/nvptx/nvptx.cc (nvptx_legitimate_address_p): Likewise.
	* config/or1k/or1k.cc (or1k_legitimate_address_p): Likewise.
	* config/pa/pa.cc (pa_legitimate_address_p): Likewise.
	* config/pdp11/pdp11.cc (pdp11_legitimate_address_p): Likewise.
	* config/pru/pru.cc (pru_addr_space_legitimate_address_p): Likewise.
	* config/riscv/riscv.cc (riscv_legitimate_address_p): Likewise.
	* config/rl78/rl78-protos.h (rl78_as_legitimate_address): Likewise.
	(tree.h): New include for tree_code ERROR_MARK.
	* config/rl78/rl78.cc (rl78_as_legitimate_address): Adjust with
	extra unnamed code_helper argument with default ERROR_MARK.
	* config/rs6000/rs6000.cc (rs6000_legitimate_address_p): Likewise.
	(rs6000_debug_legitimate_address_p): Adjust with extra code_helper
	argument and adjust the call to function rs6000_legitimate_address_p.
	* config/rx/rx.cc (rx_is_legitimate_address): Adjust with extra
	unnamed code_helper argument with default ERROR_MARK.
	* config/s390/s390.cc (s390_legitimate_address_p): Likewise.
	* config/sh/sh.cc (sh_legitimate_address_p): Likewise.
	* config/sparc/sparc.cc (sparc_legitimate_address_p): Likewise.
	* config/v850/v850.cc (v850_legitimate_address_p): Likewise.
	* config/vax/vax.cc (vax_legitimate_address_p): Likewise.
	* config/visium/visium.cc (visium_legitimate_address_p): Likewise.
	* config/xtensa/xtensa.cc (xtensa_legitimate_address_p): Likewise.
	* config/stormy16/stormy16-protos.h (xstormy16_legitimate_address_p):
	Likewise.
	(tree.h): New include for tree_code ERROR_MARK.
	* config/stormy16/stormy16.cc (xstormy16_legitimate_address_p):
	Adjust with extra unnamed code_helper argument with default
	ERROR_MARK.
2023-08-09 01:16:05 -05:00
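
The idea behind the extra code_helper argument in the commit above can be sketched with a toy model. Every name below (Code, AddrKind, toy_legitimate_address_p) is a hypothetical stand-in and none of it is GCC's real hook or types; the sketch only shows how a defaulted trailing argument lets existing callers stay unchanged while a target rejects base + index addresses for LEN_LOAD and LEN_STORE.

```cpp
#include <cassert>

// Toy stand-ins for illustration; these are NOT GCC's real types.
enum class Code { ERROR_MARK, IFN_LEN_LOAD, IFN_LEN_STORE };
enum class AddrKind { REG, REG_PLUS_REG };

// Toy hook. The trailing argument defaults to ERROR_MARK, so callers
// that know nothing about internal functions get the old behaviour.
bool toy_legitimate_address_p(AddrKind addr, Code ch = Code::ERROR_MARK)
{
  const bool len_access =
    ch == Code::IFN_LEN_LOAD || ch == Code::IFN_LEN_STORE;

  // Modelled on the description above: the length-controlled access
  // has only one register available for the address, so reg + reg
  // is rejected for it.
  if (len_access && addr == AddrKind::REG_PLUS_REG)
    return false;

  // Ordinary vector loads and stores accept both forms in this model.
  return true;
}
```

A caller such as an induction-variable optimizer would pass the internal function code only when it is costing an address for that specific access, and omit it everywhere else.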
liuhongt
b39f8bdad1 Workaround possible CPUID bug in Sandy Bridge.
Don't access leaf 7 subleaf 1 unless subleaf 0 says it is
supported via EAX.

Intel documentation says invalid subleaves return 0. We had been
relying on that behavior instead of checking the max subleaf number.

It appears that some Sandy Bridge CPUs return at least the subleaf 0
EDX value for subleaf 1. Best guess is that this is a bug in a
microcode patch since all of the bits we're seeing set in EDX were
introduced after Sandy Bridge was originally released.

This is causing avxvnniint16 to be incorrectly enabled with
-march=native on these CPUs.

gcc/ChangeLog:

	* common/config/i386/cpuinfo.h (get_available_features): Check
	EAX for valid subleaf before using CPUID.
2023-08-09 14:01:24 +08:00
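
The guard described in the commit above can be sketched as follows. To keep the example portable, the CPUID instruction is replaced by a hypothetical callback, and the two simulated CPUs are invented for illustration; this is not the code in cpuinfo.h. The one hardware fact relied on is that leaf 7 subleaf 0 reports the maximum supported subleaf number in EAX.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

struct CpuidRegs { std::uint32_t eax, ebx, ecx, edx; };

// Hypothetical stand-in for the CPUID instruction:
// (leaf, subleaf) -> register values.
using CpuidQuery = std::function<CpuidRegs(std::uint32_t, std::uint32_t)>;

// Returns EDX of leaf 7 subleaf 1, or 0 when subleaf 0 does not
// advertise subleaf 1 via EAX.
std::uint32_t leaf7_subleaf1_edx(const CpuidQuery& cpuid)
{
  const std::uint32_t max_subleaf_level = cpuid(7, 0).eax;
  if (max_subleaf_level < 1)
    return 0;               // subleaf 1 is not valid: do not read it
  return cpuid(7, 1).edx;
}

// Simulated CPU with the suspected bug: the maximum subleaf is 0,
// yet a query for subleaf 1 echoes the EDX value of subleaf 0.
CpuidRegs buggy_cpu(std::uint32_t leaf, std::uint32_t)
{
  if (leaf == 7)
    return CpuidRegs{0, 0, 0, 0x400u};
  return CpuidRegs{0, 0, 0, 0};
}

// Simulated well-behaved CPU that really supports subleaf 1.
CpuidRegs good_cpu(std::uint32_t leaf, std::uint32_t subleaf)
{
  if (leaf == 7 && subleaf == 0)
    return CpuidRegs{1, 0, 0, 0x400u};
  if (leaf == 7 && subleaf == 1)
    return CpuidRegs{0, 0, 0, 0x10u};
  return CpuidRegs{0, 0, 0, 0};
}
```

With the check in place the simulated buggy CPU contributes no subleaf 1 feature bits, while the well-behaved one still reports its own.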