Commit graph

Martin Liska
bd0791e899 libsanitizer: change LOCAL_PATCHES revision
libsanitizer/ChangeLog:

	* LOCAL_PATCHES: Change revision.
2023-04-26 15:52:21 +02:00
Martin Liska
21d3567068 libsanitizer: Apply local patches 2023-04-26 15:51:57 +02:00
Martin Liska
d53b3d94aa libsanitizer: merge from upstream (3185e47b5a8444e9fd). 2023-04-26 15:51:56 +02:00
Pan Li
a8e1551bdb RISC-V: Legitimise the const0_rtx for RVV load/store address
This patch tries to legitimise the const0_rtx (aka the zero register)
as the base register for the RVV load/store instructions.

For example:
vint32m1_t test_vle32_v_i32m1_shortcut (size_t vl)
{
  return __riscv_vle32_v_i32m1 ((int32_t *)0, vl);
}

Before this patch:
li      a5,0
vsetvli zero,a1,e32,m1,ta,ma
vle32.v v24,0(a5)  <- can propagate the const 0 to a5 here
vs1r.v  v24,0(a0)

After this patch:
vsetvli zero,a1,e32,m1,ta,ma
vle32.v v24,0(zero)
vs1r.v  v24,0(a0)

As shown above, this patch allows the const 0 (aka the zero register) to
be propagated into the base register of the RVV unit-stride load in the
combine pass.  This may benefit the underlying RVV auto-vectorization.
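
For reference, a store-side counterpart of the shortcut above could look
like the following minimal sketch (hypothetical, not taken from the new
test; it only assumes the standard __riscv_vse32_v_i32m1 intrinsic):

#include <stddef.h>
#include <stdint.h>
#include <riscv_vector.h>

/* The zero base address can likewise be used directly as the base
   register of the unit-stride store.  */
void test_vse32_v_i32m1_shortcut (vint32m1_t value, size_t vl)
{
  __riscv_vse32_v_i32m1 ((int32_t *)0, value, vl);
}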

However, the indexed load does not yet perform this optimization; it
will be taken care of in another patch.

gcc/ChangeLog:

	* config/riscv/riscv.cc (riscv_classify_address): Allow
	const0_rtx for the RVV load/store.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/zero_base_load_store_optimization.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
Co-authored-by: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
2023-04-26 21:23:00 +08:00
Aldy Hernandez
0ef3756adf Remove legacy range support.
This patch removes all the code paths guarded by legacy_mode_p(), thus
allowing us to re-use the int_range<1> idiom for a range of one
sub-range.  This allows us to represent these simple ranges in a more
efficient manner.

gcc/ChangeLog:

	* range-op.cc (range_op_cast_tests): Remove legacy support.
	* value-range-storage.h (vrange_allocator::alloc_irange): Same.
	* value-range.cc (irange::operator=): Same.
	(get_legacy_range): Same.
	(irange::copy_legacy_to_multi_range): Delete.
	(irange::copy_to_legacy): Delete.
	(irange::irange_set_anti_range): Delete.
	(irange::set): Remove legacy support.
	(irange::verify_range): Same.
	(irange::legacy_lower_bound): Delete.
	(irange::legacy_upper_bound): Delete.
	(irange::legacy_equal_p): Delete.
	(irange::operator==): Remove legacy support.
	(irange::singleton_p): Same.
	(irange::value_inside_range): Same.
	(irange::contains_p): Same.
	(intersect_ranges): Delete.
	(irange::legacy_intersect): Delete.
	(union_ranges): Delete.
	(irange::legacy_union): Delete.
	(irange::legacy_verbose_union_): Delete.
	(irange::legacy_verbose_intersect): Delete.
	(irange::irange_union): Remove legacy support.
	(irange::irange_intersect): Same.
	(irange::intersect): Same.
	(irange::invert): Same.
	(ranges_from_anti_range): Delete.
	(gt_pch_nx): Adjust for legacy removal.
	(gt_ggc_mx): Same.
	(range_tests_legacy): Delete.
	(range_tests_misc): Adjust for legacy removal.
	(range_tests): Same.
	* value-range.h (class irange): Same.
	(irange::legacy_mode_p): Delete.
	(ranges_from_anti_range): Delete.
	(irange::nonzero_p): Adjust for legacy removal.
	(irange::lower_bound): Same.
	(irange::upper_bound): Same.
	(irange::union_): Same.
	(irange::intersect): Same.
	(irange::set_nonzero): Same.
	(irange::set_zero): Same.
	* vr-values.cc (simplify_using_ranges::legacy_fold_cond_overflow): Same.
2023-04-26 13:49:51 +02:00
Aldy Hernandez
5db3d28e04 Remove range_has_numeric_bounds_p.
gcc/ChangeLog:

	* value-range.cc (irange::copy_legacy_to_multi_range): Rewrite use
	of range_has_numeric_bounds_p with irange API.
	(range_has_numeric_bounds_p): Delete.
	* value-range.h (range_has_numeric_bounds_p): Delete.
2023-04-26 13:49:51 +02:00
Aldy Hernandez
ebef388ec3 Remove range_int_cst_p.
gcc/ChangeLog:

	* tree-data-ref.cc (compute_distributive_range): Replace uses of
	range_int_cst_p with irange API.
	* tree-ssa-strlen.cc (get_range_strlen_dynamic): Same.
	* tree-vrp.h (range_int_cst_p): Delete.
	* vr-values.cc (check_for_binary_op_overflow): Replace uses of
	range_int_cst_p with irange API.
	(vr_set_zero_nonzero_bits): Same.
	(range_fits_type_p): Same.
	(simplify_using_ranges::simplify_casted_cond): Same.
	* tree-vrp.cc (range_int_cst_p): Remove.
2023-04-26 13:49:42 +02:00
Aldy Hernandez
fb5607ae6a Convert compare_nonzero_chars to wide_ints.
gcc/ChangeLog:

	* tree-ssa-strlen.cc (compare_nonzero_chars): Convert to wide_ints.
2023-04-26 11:46:06 +02:00
Aldy Hernandez
637037f4e6 Remove some uses of deprecated irange API.
gcc/ChangeLog:

	* builtins.cc (expand_builtin_strnlen): Rewrite deprecated irange
	API uses to new API.
	* gimple-predicate-analysis.cc (find_var_cmp_const): Same.
	* internal-fn.cc (get_min_precision): Same.
	* match.pd: Same.
	* tree-affine.cc (expr_to_aff_combination): Same.
	* tree-data-ref.cc (dr_step_indicator): Same.
	* tree-dfa.cc (get_ref_base_and_extent): Same.
	* tree-scalar-evolution.cc (iv_can_overflow_p): Same.
	* tree-ssa-phiopt.cc (two_value_replacement): Same.
	* tree-ssa-pre.cc (insert_into_preds_of_block): Same.
	* tree-ssa-reassoc.cc (optimize_range_tests_to_bit_test): Same.
	* tree-ssa-strlen.cc (compare_nonzero_chars): Same.
	* tree-switch-conversion.cc (bit_test_cluster::emit): Same.
	* tree-vect-patterns.cc (vect_recog_divmod_pattern): Same.
	* tree.cc (get_range_pos_neg): Same.
2023-04-26 11:46:06 +02:00
Aldy Hernandez
1a8087c7d1 Replace ad-hoc value_range dumpers with irange::dump.
This caused a regression in gcc.c-torture/unsorted/dump-noaddr.c.

The test is asserting that two dumps are identical, but they are not
because irange dumps the type which varies between runs:

               <          VR  [irange] void (*<T3dc>) (int) [1, +INF]
               >          VR  [irange] void (*<T3da>) (int) [1, +INF]

I have changed the pretty printer for irange types to pass TDF_NOUID,
thus avoiding this problem.

gcc/ChangeLog:

	* ipa-prop.cc (ipa_print_node_jump_functions_for_edge): Use
	vrange::dump instead of ad-hoc dumper.
	* tree-ssa-strlen.cc (dump_strlen_info): Same.
	* value-range-pretty-print.cc (visit): Pass TDF_NOUID to
	dump_generic_node.
2023-04-26 11:45:22 +02:00
Aldy Hernandez
04e5ddf8a3 Fix swapping of ranges.
The legacy range code has logic to swap out of order endpoints in the
irange constructor.  The new irange code expects the caller to fix any
inconsistencies, thus speeding up the common case.  However, this means
that when we remove legacy, any stragglers must be fixed.  This patch
fixes the 3 culprits found during the conversion.

gcc/ChangeLog:

	* range-op.cc (operator_cast::op1_range): Use
	create_possibly_reversed_range.
	(operator_bitwise_and::simple_op1_range_solver): Same.
	* value-range.cc (swap_out_of_order_endpoints): Delete.
	(irange::set): Remove call to swap_out_of_order_endpoints.
2023-04-26 11:12:39 +02:00
Aldy Hernandez
5bdc515513 Convert users of legacy API to get_legacy_range() function.
This patch converts the users of the legacy API to a function called
get_legacy_range() which will return the pieces of the soon to be
removed API (min, max, and kind).  This is a temporary measure while
these users are converted.

In upcoming patches I will convert most users, but most of the
middle-end warning uses will remain.  Naive attempts to remove them
showed that a lot of these uses are quite dependent on the anti-range
idiom, and converting them to the new API broke the tests, even when
the conversion was conceptually correct.  Perhaps someone who
understands these passes could take a stab at it.  In the meantime,
the legacy uses can be trivially found by grepping for
get_legacy_range.

gcc/ChangeLog:

	* builtins.cc (determine_block_size): Convert use of legacy API to
	get_legacy_range.
	* gimple-array-bounds.cc (check_out_of_bounds_and_warn): Same.
	(array_bounds_checker::check_array_ref): Same.
	* gimple-ssa-warn-restrict.cc
	(builtin_memref::extend_offset_range): Same.
	* ipa-cp.cc (ipcp_store_vr_results): Same.
	* ipa-fnsummary.cc (set_switch_stmt_execution_predicate): Same.
	* ipa-prop.cc (struct ipa_vr_ggc_hash_traits): Same.
	(ipa_write_jump_function): Same.
	* pointer-query.cc (get_size_range): Same.
	* tree-data-ref.cc (split_constant_offset): Same.
	* tree-ssa-strlen.cc (get_range): Same.
	(maybe_diag_stxncpy_trunc): Same.
	(strlen_pass::get_len_or_size): Same.
	(strlen_pass::count_nonzero_bytes_addr): Same.
	* tree-vect-patterns.cc (vect_get_range_info): Same.
	* value-range.cc (irange::maybe_anti_range): Remove.
	(get_legacy_range): New.
	(irange::copy_to_legacy): Use get_legacy_range.
	(ranges_from_anti_range): Same.
	* value-range.h (class irange): Remove maybe_anti_range.
	(get_legacy_range): New.
	* vr-values.cc (check_for_binary_op_overflow): Convert use of
	legacy API to get_legacy_range.
	(compare_ranges): Same.
	(compare_range_with_value): Same.
	(bounds_of_var_in_loop): Same.
	(find_case_label_ranges): Same.
	(simplify_using_ranges::simplify_switch_using_ranges): Same.
2023-04-26 10:35:53 +02:00
Aldy Hernandez
964b02cb26 Remove irange::constant_p.
gcc/ChangeLog:

	* value-range-pretty-print.cc (vrange_printer::visit): Remove
	constant_p use.
	* value-range.cc (irange::constant_p): Remove.
	(irange::get_nonzero_bits_from_range): Remove constant_p use.
	* value-range.h (class irange): Remove constant_p.
	(irange::num_pairs): Remove constant_p use.
2023-04-26 10:28:12 +02:00
Aldy Hernandez
a38bb14f01 Remove symbolics from irange.
gcc/ChangeLog:

	* value-range.cc (irange::copy_legacy_to_multi_range): Remove
	symbolics support.
	(irange::set): Same.
	(irange::legacy_lower_bound): Same.
	(irange::legacy_upper_bound): Same.
	(irange::contains_p): Same.
	(range_tests_legacy): Same.
	(irange::normalize_addresses): Remove.
	(irange::normalize_symbolics): Remove.
	(irange::symbolic_p): Remove.
	* value-range.h (class irange): Remove symbolic_p,
	normalize_symbolics, and normalize_addresses.
	* vr-values.cc (simplify_using_ranges::two_valued_val_range_p):
	Remove symbolics support.
2023-04-26 10:28:12 +02:00
Aldy Hernandez
983ad30d42 Remove irange::may_contain_p.
The deprecated irange::may_contain_p method differed from contains_p
in that it could handle symbolics, which no longer exist in VRP.

gcc/ChangeLog:

	* value-range.cc (irange::may_contain_p): Remove.
	* value-range.h (range_includes_zero_p):  Rewrite may_contain_p
	usage with contains_p.
	* vr-values.cc (compare_range_with_value): Same.
2023-04-26 10:28:12 +02:00
Aldy Hernandez
bfd9415761 Remove range_fold_{unary,binary}_expr.
gcc/ChangeLog:

	* tree-vrp.cc (supported_types_p): Remove.
	(defined_ranges_p): Remove.
	(range_fold_binary_expr): Remove.
	(range_fold_unary_expr): Remove.
	* tree-vrp.h (range_fold_unary_expr): Remove.
	(range_fold_binary_expr): Remove.
2023-04-26 10:28:12 +02:00
Aldy Hernandez
3c9372dfee Remove deprecated range_fold_{unary,binary}_expr uses from ipa-*.
gcc/ChangeLog:

	* ipa-cp.cc (ipa_vr_operation_and_type_effects): Convert to ranger API.
	(ipa_value_range_from_jfunc): Same.
	(propagate_vr_across_jump_function): Same.
	* ipa-fnsummary.cc (evaluate_conditions_for_known_args): Same.
	* ipa-prop.cc (ipa_compute_jump_functions_for_edge): Same.
	* vr-values.cc (bounds_of_var_in_loop): Same.
2023-04-26 10:28:12 +02:00
Aldy Hernandez
e6910b622a Remove range_query::get_value_range.
gcc/ChangeLog:

	* gimple-array-bounds.cc (array_bounds_checker::get_value_range):
	Add irange argument.
	(check_out_of_bounds_and_warn): Remove check for vr.
	(array_bounds_checker::check_array_ref): Remove pointer qualifier
	for vr and adjust accordingly.
	* gimple-array-bounds.h (get_value_range): Add irange argument.
	* value-query.cc (class equiv_allocator): Delete.
	(range_query::get_value_range): Delete.
	(range_query::range_query): Remove allocator access.
	(range_query::~range_query): Same.
	* value-query.h (get_value_range): Delete.
	* vr-values.cc
	(simplify_using_ranges::op_with_boolean_value_range_p): Remove
	call to get_value_range.
	(check_for_binary_op_overflow): Same.
	(simplify_using_ranges::legacy_fold_cond_overflow): Same.
	(simplify_using_ranges::simplify_abs_using_ranges): Same.
	(simplify_using_ranges::simplify_cond_using_ranges_1): Same.
	(simplify_using_ranges::simplify_casted_cond): Same.
	(simplify_using_ranges::simplify_switch_using_ranges): Same.
	(simplify_using_ranges::two_valued_val_range_p): Same.
2023-04-26 10:28:12 +02:00
Aldy Hernandez
3d8c2d3aef Refactor vrp_evaluate_conditional* and rename it.
gcc/ChangeLog:

	* vr-values.cc
	(simplify_using_ranges::vrp_evaluate_conditional_warnv_with_ops):
	Rename to...
	(simplify_using_ranges::legacy_fold_cond_overflow): ...this.
	(simplify_using_ranges::vrp_visit_cond_stmt): Rename to...
	(simplify_using_ranges::legacy_fold_cond): ...this.
	(simplify_using_ranges::fold_cond): Rename
	vrp_evaluate_conditional_warnv_with_ops to
	legacy_fold_cond_overflow.
	* vr-values.h (class vr_values): Replace vrp_visit_cond_stmt and
	vrp_evaluate_conditional_warnv_with_ops with legacy_fold_cond and
	legacy_fold_cond_overflow respectively.
2023-04-26 10:28:12 +02:00
Aldy Hernandez
f2b894b148 Remove compare_names* from legacy cond folding.
In a test run I have asserted that the legacy conditional folding only
gets overflows, so this removal is safe.

gcc/ChangeLog:

	* vr-values.cc (get_vr_for_comparison): Remove.
	(compare_name_with_value): Same.
	(vrp_evaluate_conditional_warnv_with_ops): Remove calls to
	compare_name_with_value.
	* vr-values.h: Remove compare_name_with_value.
	Remove get_vr_for_comparison.
2023-04-26 10:28:12 +02:00
Roger Sayle
1f0bfbb26e [xstormy16] Add support for byte and word swapping instructions.
This patch adds support for xstormy16's swpb (swap bytes) and swpw (swap
words) instructions.  The most obvious application of these is to implement
the __builtin_bswap16 and __builtin_bswap32 intrinsics.

Currently, __builtin_bswap16 is implemented as:
foo:    mov r7,r2
        shl r7,#8
        shr r2,#8
        or r2,r7
        ret

but with this patch becomes:
foo:	swpb r2
	ret

Likewise, __builtin_bswap32 now becomes:
foo:	swpb r2 | swpb r3 | swpw r2,r3
        ret

Finally, the swpw instruction on its own can be used to exchange
two word mode registers without a temporary, so a new pattern and
peephole2 have been added to catch this.  As described in the
PR rtl-optimization/106518, register allocation can (in theory)
be more efficient on targets that provide a swap/exchange instruction.
The slightly unusual swap<mode> naming matches that used in i386.md.
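
For reference, a minimal sketch of the kind of source these patterns
target (hypothetical, not the new test cases themselves):

unsigned short bs16 (unsigned short x) { return __builtin_bswap16 (x); }
unsigned int   bs32 (unsigned int x)   { return __builtin_bswap32 (x); }

/* A plain exchange of two word-sized values; the new peephole2 may
   recognize such a sequence as a candidate for swpw.  */
void exch (unsigned short *a, unsigned short *b)
{
  unsigned short t = *a; *a = *b; *b = t;
}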

2023-04-26  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/stormy16/stormy16.md (bswaphi2): New define_insn.
	(bswapsi2): New define_insn.
	(swaphi): New define_insn to exchange two registers (swpw).
	(define_peephole2): Recognize exchange of registers as swaphi.

gcc/testsuite/ChangeLog
	* gcc.target/xstormy16/bswap16.c: New test case.
	* gcc.target/xstormy16/bswap32.c: Likewise.
	* gcc.target/xstormy16/swpb.c: Likewise.
	* gcc.target/xstormy16/swpw-1.c: Likewise.
	* gcc.target/xstormy16/swpw-2.c: Likewise.
2023-04-26 09:10:06 +01:00
Martin Liska
1e832b4db7 MAINTAINERS: fix alphabetic sorting
ChangeLog:

	* MAINTAINERS: fix sorting
2023-04-26 09:30:32 +02:00
Jakub Jelinek
f2f721d13b Update gennews for GCC 13.
2023-04-26  Jakub Jelinek  <jakub@redhat.com>

	* gennews (files): Add files for GCC 13.
2023-04-26 09:05:49 +02:00
Richard Biener
db29daa5e6 More last_stmt removal
This adjusts more users of last_stmt where it is clear that debug
stmt skipping is unnecessary.  In most cases this also allowed
significant code simplification.

	gcc/c/
	* gimple-parser.cc (c_parser_parse_gimple_body): Avoid
	last_stmt.

	gcc/
	* gimple-range-path.cc (path_range_query::compute_outgoing_relations):
	Avoid last_stmt.
	* ipa-pure-const.cc (pass_nothrow::execute): Likewise.
	* predict.cc (apply_return_prediction): Likewise.
	* sese.cc (set_ifsese_condition): Likewise.  Simplify.
	* tree-cfg.cc (assert_unreachable_fallthru_edge_p): Avoid last_stmt.
	(make_edges_bb): Likewise.
	(make_cond_expr_edges): Likewise.
	(end_recording_case_labels): Likewise.
	(make_gimple_asm_edges): Likewise.
	(cleanup_dead_labels): Likewise.
	(group_case_labels): Likewise.
	(gimple_can_merge_blocks_p): Likewise.
	(gimple_merge_blocks): Likewise.
	(find_taken_edge): Likewise.  Also handle empty fallthru blocks.
	(gimple_duplicate_sese_tail): Avoid last_stmt.
	(find_loop_dist_alias): Likewise.
	(gimple_block_ends_with_condjump_p): Likewise.
	(gimple_purge_dead_eh_edges): Likewise.
	(gimple_purge_dead_abnormal_call_edges): Likewise.
	(pass_warn_function_return::execute): Likewise.
	(execute_fixup_cfg): Likewise.
	* tree-eh.cc (redirect_eh_edge_1): Likewise.
	(pass_lower_resx::execute): Likewise.
	(pass_lower_eh_dispatch::execute): Likewise.
	(cleanup_empty_eh): Likewise.
	* tree-if-conv.cc (if_convertible_bb_p): Likewise.
	(predicate_bbs): Likewise.
	(ifcvt_split_critical_edges): Likewise.
	* tree-loop-distribution.cc (create_edge_for_control_dependence):
	Likewise.
	(loop_distribution::transform_reduction_loop): Likewise.
	* tree-parloops.cc (transform_to_exit_first_loop_alt): Likewise.
	(try_transform_to_exit_first_loop_alt): Likewise.
	(transform_to_exit_first_loop): Likewise.
	(create_parallel_loop): Likewise.
	* tree-scalar-evolution.cc (get_loop_exit_condition): Likewise.
	* tree-ssa-dce.cc (mark_last_stmt_necessary): Likewise.
	(eliminate_unnecessary_stmts): Likewise.
	* tree-ssa-dom.cc
	(dom_opt_dom_walker::set_global_ranges_from_unreachable_edges):
	Likewise.
	* tree-ssa-ifcombine.cc (ifcombine_ifandif): Likewise.
	(pass_tree_ifcombine::execute): Likewise.
	* tree-ssa-loop-ch.cc (entry_loop_condition_is_static): Likewise.
	(should_duplicate_loop_header_p): Likewise.
	* tree-ssa-loop-ivcanon.cc (create_canonical_iv): Likewise.
	(tree_estimate_loop_size): Likewise.
	(try_unroll_loop_completely): Likewise.
	* tree-ssa-loop-ivopts.cc (tree_ssa_iv_optimize_loop): Likewise.
	* tree-ssa-loop-manip.cc (ip_normal_pos): Likewise.
	(canonicalize_loop_ivs): Likewise.
	* tree-ssa-loop-niter.cc (determine_value_range): Likewise.
	(bound_difference): Likewise.
	(number_of_iterations_popcount): Likewise.
	(number_of_iterations_cltz): Likewise.
	(number_of_iterations_cltz_complement): Likewise.
	(simplify_using_initial_conditions): Likewise.
	(number_of_iterations_exit_assumptions): Likewise.
	(loop_niter_by_eval): Likewise.
	(estimate_numbers_of_iterations): Likewise.
2023-04-26 08:39:58 +02:00
Ju-Zhe Zhong
5fce06b868 RISC-V: Fine tune vmadc/vmsbc RA constraint
gcc/ChangeLog:

	* config/riscv/vector.md: Refine vmadc/vmsbc RA constraint.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/narrow_constraint-13.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-14.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-15.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-16.c: New test.
2023-04-26 13:59:35 +08:00
Kewen Lin
33a44e3aa8 rs6000: Guard power9-vector for vsx_scalar_cmp_exp_qp_* [PR108758]
__builtin_vsx_scalar_cmp_exp_qp_{eq,gt,lt,unordered} used to be guarded
with condition TARGET_P9_VECTOR before the new bif framework was
introduced (r12-5752-gd08236359eb229).  Since r12-5752 they are placed
under stanza ieee128-hw, that is, they check condition
TARGET_FLOAT128_HW.  This caused test case float128-cmp2-runnable.c to
fail at -m32, as the condition TARGET_FLOAT128_HW isn't satisfied with
-m32.

By checking the commit history, I didn't see any notes on why this
condition was changed for them, so this patch moves these bifs from
stanza ieee128-hw back to stanza power9-vector as before.
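
A minimal usage sketch of one of the affected builtins follows; it is
hypothetical, not taken from the failing test, and the prototype shown
is an assumption (two __ieee128 operands, int result):

/* Assumed prototype:
   int __builtin_vsx_scalar_cmp_exp_qp_eq (__ieee128, __ieee128);
   compares the exponents of the two quad-precision values.  */
int exp_eq (__ieee128 a, __ieee128 b)
{
  return __builtin_vsx_scalar_cmp_exp_qp_eq (a, b);
}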

	PR target/108758

gcc/ChangeLog:

	* config/rs6000/rs6000-builtins.def
	(__builtin_vsx_scalar_cmp_exp_qp_eq, __builtin_vsx_scalar_cmp_exp_qp_gt
	__builtin_vsx_scalar_cmp_exp_qp_lt,
	__builtin_vsx_scalar_cmp_exp_qp_unordered): Move from stanza ieee128-hw
	to power9-vector.
2023-04-26 00:21:14 -05:00
Kewen Lin
fd75f6ae56 rs6000: Fix predicate for const vector in sldoi_to_mov [PR109069]
As PR109069 shows, commit r12-6537-g080a06fcb076b3, which introduced
the define_insn_and_split sldoi_to_mov, adopted easy_vector_constant
for the const vector of interest, but that is wrong since the predicate
easy_vector_constant doesn't guarantee each byte in the const vector is
the same.  One counter example is the const vector in pr109069-1.c.
This patch introduces a new predicate const_vector_each_byte_same to
ensure all bytes in the given const vector are the same, considering
both int and float.  Meanwhile, for constants which don't meet
easy_vector_constant we need to generate a move instead of just a set,
and it uses VECTOR_MEM_ALTIVEC_OR_VSX_P rather than
VECTOR_UNIT_ALTIVEC_OR_VSX_P for V2DImode support under VSX, since the
vector long long type of vec_sld is guarded under stanza vsx.

	PR target/109069

gcc/ChangeLog:

	* config/rs6000/altivec.md (sldoi_to_mov<mode>): Replace predicate
	easy_vector_constant with const_vector_each_byte_same, add
	handlings in preparation for !easy_vector_constant, and update
	VECTOR_UNIT_ALTIVEC_OR_VSX_P with VECTOR_MEM_ALTIVEC_OR_VSX_P.
	* config/rs6000/predicates.md (const_vector_each_byte_same): New
	predicate.

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/pr109069-1.c: New test.
	* gcc.target/powerpc/pr109069-2-run.c: New test.
	* gcc.target/powerpc/pr109069-2.c: New test.
	* gcc.target/powerpc/pr109069-2.h: New test.
2023-04-26 00:21:05 -05:00
Juzhe-Zhong
06792c142c RISC-V: Optimize comparison patterns for register allocation
The current RA constraint for RVV comparison instructions does not allow
any overlap at all between the destination and source operand registers.

For example:
  vmseq.vv vd, vs2, vs1
If LMUL = 8, vs2 = v8, vs1 = v16:

With the current GCC RA constraint, GCC does not allow vd to be any regno
in v8 ~ v23.  However, this is too conservative and not required by the RVV ISA.

Since the dest EEW of a comparison is always EEW = 1, it always follows the
overlap rules for Dest EEW < Source EEW.  So in this case we should give the
GCC RA the chance to allocate v8 or v16 for vd, so that we get better vector
register usage in RA.

gcc/ChangeLog:

	* config/riscv/vector.md (*pred_cmp<mode>_merge_tie_mask): New pattern.
	(*pred_ltge<mode>_merge_tie_mask): Ditto.
	(*pred_cmp<mode>_scalar_merge_tie_mask): Ditto.
	(*pred_eqne<mode>_scalar_merge_tie_mask): Ditto.
	(*pred_cmp<mode>_extended_scalar_merge_tie_mask): Ditto.
	(*pred_eqne<mode>_extended_scalar_merge_tie_mask): Ditto.
	(*pred_cmp<mode>_narrow_merge_tie_mask): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/binop_vv_constraint-4.c: Adapt testcase.
	* gcc.target/riscv/rvv/base/narrow_constraint-17.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-18.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-19.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-20.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-21.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-22.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-23.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-24.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-25.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-26.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-27.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-28.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-29.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-30.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-31.c: New test.
2023-04-26 11:58:17 +08:00
Ju-Zhe Zhong
4f9eac2f26 RISC-V: Fix redundant vmv1r.v instruction in vmsge.vx codegen
The current expansion of vmsge makes the RA produce a redundant vmv1r.v.

testcase:
void f1 (void * in, void *out, int32_t x)
{
    vbool32_t mask = *(vbool32_t*)in;
    asm volatile ("":::"memory");
    vint32m1_t v = __riscv_vle32_v_i32m1 (in, 4);
    vint32m1_t v2 = __riscv_vle32_v_i32m1_m (mask, in, 4);
    vbool32_t m3 = __riscv_vmsge_vx_i32m1_b32 (v, x, 4);
    vbool32_t m4 = __riscv_vmsge_vx_i32m1_b32_mu (mask, m3, v, x, 4);
    m4 = __riscv_vmsge_vv_i32m1_b32_m (m4, v2, v2, 4);
    __riscv_vsm_v_b32 (out, m4, 4);
}

Before this patch:
f1:
	vsetvli a5,zero,e8,mf4,ta,ma
	vlm.v   v0,0(a0)
	vsetivli	zero,4,e32,m1,ta,mu
	vle32.v v3,0(a0)
	vle32.v v2,0(a0),v0.t
	vmslt.vx	v1,v3,a2
	vmnot.m v1,v1
	vmslt.vx	v1,v3,a2,v0.t
	vmxor.mm	v1,v1,v0
	vmv1r.v v0,v1
	vmsge.vv	v2,v2,v2,v0.t
	vsm.v   v2,0(a1)
	ret

After this patch:
f1:
	vsetvli a5,zero,e8,mf4,ta,ma
	vlm.v   v0,0(a0)
	vsetivli	zero,4,e32,m1,ta,mu
	vle32.v v3,0(a0)
	vle32.v v2,0(a0),v0.t
	vmslt.vx	v1,v3,a2
	vmnot.m v1,v1
	vmslt.vx	v1,v3,a2,v0.t
	vmxor.mm	v0,v1,v0
	vmsge.vv	v2,v2,v2,v0.t
	vsm.v   v2,0(a1)
	ret

gcc/ChangeLog:

	* config/riscv/vector.md: Fix redundant vmv1r.v.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/binop_vx_constraint-150.c: Adapt assembly
	check.
2023-04-26 11:58:06 +08:00
Ju-Zhe Zhong
a010f0e085 RISC-V: Fine tune gather load RA constraint
For DEST EEW < SOURCE EEW, registers can partially overlap
according to the RVV ISA.

gcc/ChangeLog:

	* config/riscv/vector.md: Fix RA constraint.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/narrow_constraint-12.c: New test.
2023-04-26 11:57:51 +08:00
Pan Li
a8d5e14f52 RISC-V: Bugfix for RVV vbool*_t vn_reference_equal
On most architectures the precision_size of the vbool*_t types is
calculated as a multiple of the type size.  For example:
precision_size = type_size * 8 (aka, bit count per byte).

Unfortunately, some architectures like RISC-V adjust the precision_size
of vbool*_t in order to align with the ISA.  For example as below.
type_size      = [1, 1, 1, 1,  2,  4,  8]
precision_size = [1, 2, 4, 8, 16, 32, 64]

Then the precision_size of the RISC-V vbool*_t types is not a multiple
of the type_size.  This patch handles this case when comparing the
vn_reference.

Given we have the below code:
void test_vbool8_then_vbool16(int8_t * restrict in, int8_t * restrict out) {
    vbool8_t v1 = *(vbool8_t*)in;
    vbool16_t v2 = *(vbool16_t*)in;

    *(vbool8_t*)(out + 100) = v1;
    *(vbool16_t*)(out + 200) = v2;
}

Before this PATCH:
csrr    t0,vlenb
slli    t1,t0,1
csrr    a3,vlenb
sub     sp,sp,t1
slli    a4,a3,1
add     a4,a4,sp
addi    a2,a1,100
vsetvli a5,zero,e8,m1,ta,ma
sub     a3,a4,a3
vlm.v   v24,0(a0)
vsm.v   v24,0(a2)
vsm.v   v24,0(a3)
addi    a1,a1,200
csrr    t0,vlenb
vsetvli a4,zero,e8,mf2,ta,ma
slli    t1,t0,1
vlm.v   v24,0(a3)
vsm.v   v24,0(a1)
add     sp,sp,t1
jr      ra

After this PATCH:
addi    a3,a1,100
vsetvli a4,zero,e8,m1,ta,ma
addi    a1,a1,200
vlm.v   v24,0(a0)
vsm.v   v24,0(a3)
vsetvli a5,zero,e8,mf2,ta,ma
vlm.v   v24,0(a0)
vsm.v   v24,0(a1)
ret

	PR target/109272

gcc/ChangeLog:

	* tree-ssa-sccvn.cc (vn_reference_eq): Add type vector subparts
	check for vn_reference equality.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/pr108185-4.c: Update test check
	condition.
	* gcc.target/riscv/rvv/base/pr108185-5.c: Likewise.
	* gcc.target/riscv/rvv/base/pr108185-6.c: Likewise.

Signed-off-by: Pan Li <pan2.li@intel.com>
2023-04-26 11:29:45 +08:00
Ju-Zhe Zhong
2fb7df82b8 RISC-V: Add auto-vectorization compile option for RVV
This patch adds 2 compile options for RVV auto-vectorization.
1. -param=riscv-autovec-preference=
   This option specifies the auto-vectorization approach for RVV.
   Currently, we only support scalable and fixed-vlmax.

    - scalable means VLA auto-vectorization.  The vector length is unknown
      to the compiler and runtime invariant.  This approach allows us to
      compile code that can run on an RVV CPU of any vector length.

    - fixed-vlmax means the compiler knows the RVV CPU vector length and
      compiles for fixed-length VLS auto-vectorization.  That is, if we
      specify vector-length=512, the executable can only run on an RVV CPU
      with vector-length = 512.

    - TODO: we may need to support min-length VLS auto-vectorization,
      meaning the executable can also run on RVV CPUs with a larger vector
      length.
2. -param=riscv-autovec-lmul=
   Specify the LMUL choice for RVV auto-vectorization.
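
As a usage sketch (the loop and the command line below are hypothetical;
only the --param name and its scalable/fixed-vlmax values come from this
patch):

/* Compiled e.g. with something like:
   riscv64-unknown-elf-gcc -O3 -march=rv64gcv \
     --param=riscv-autovec-preference=scalable vadd.c  */
void vadd (int *__restrict a, int *__restrict b, int *__restrict c, int n)
{
  for (int i = 0; i < n; i++)
    c[i] = a[i] + b[i];
}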

gcc/ChangeLog:

	* config/riscv/riscv-opts.h (enum riscv_autovec_preference_enum): Add enum for
	auto-vectorization preference.
	(enum riscv_autovec_lmul_enum): Add enum for choosing LMUL of RVV
	auto-vectorization.
	* config/riscv/riscv.opt: Add compile option for RVV auto-vectorization.
2023-04-25 21:06:42 -06:00
Jivan Hakobyan
392200f807 avoid splitting small constants in bcrli_nottwobits patterns
I have noticed that when we try to clear two bits through a small
constant and ZBS is enabled, GCC splits it into two "andi" instructions.
For example, for the following C code:
  int foo(int a) {
    return a & ~ 0x101;
  }

GCC generates the following:
  foo:
     andi a0,a0,-2
     andi a0,a0,-257
     ret

but should be this one:
  foo:
     andi a0,a0,-258
     ret

This patch solves the mentioned issue.

gcc/ChangeLog
	* config/riscv/bitmanip.md: Updated predicates of bclri<mode>_nottwobits
	and bclridisi_nottwobits patterns.
	* config/riscv/predicates.md: (not_uimm_extra_bit_or_nottwobits): Adjust
	predicate to avoid splitting arith constants.
	(const_nottwobits_not_arith_operand): New predicate.

gcc/testsuite
	* gcc.target/riscv/zbs-bclri-nottwobits.c: New test.
2023-04-25 20:44:56 -06:00
Gaius Mulley
68201409bc PR modula2/108121 Re-implement overflow detection for constant literals
This patch fixes the overflow detection for constant literals.
The ZTYPE is changed to int128 (or int64 if int128 is unavailable) and
constant literals are built from widest_int.  The widest_int is converted
into the tree type and checked for overflow.
m2expr_interpret_integer and append_m2_digit are removed.

gcc/m2/ChangeLog:

	PR modula2/108121
	* gm2-compiler/M2ALU.mod (Less): Reformatted.
	* gm2-compiler/SymbolTable.mod (DetermineSizeOfConstant): Remove
	from import.
	(ConstantStringExceedsZType): Import.
	(GetConstLitType): Re-implement using ConstantStringExceedsZType.
	* gm2-gcc/m2decl.cc (m2decl_DetermineSizeOfConstant): Remove.
	(m2decl_ConstantStringExceedsZType): New function.
	(m2decl_BuildConstLiteralNumber): Re-implement.
	* gm2-gcc/m2decl.def (DetermineSizeOfConstant): Remove.
	(ConstantStringExceedsZType): New function.
	* gm2-gcc/m2decl.h (m2decl_DetermineSizeOfConstant): Remove.
	(m2decl_ConstantStringExceedsZType): New function.
	* gm2-gcc/m2expr.cc (append_digit): Remove.
	(m2expr_interpret_integer): Remove.
	(append_m2_digit): Remove.
	(m2expr_StrToWideInt): New function.
	(m2expr_interpret_m2_integer): Remove.
	* gm2-gcc/m2expr.def (CheckConstStrZtypeRange): New function.
	* gm2-gcc/m2expr.h (m2expr_StrToWideInt): New function.
	* gm2-gcc/m2type.cc (build_m2_word64_type_node): New function.
	(build_m2_ztype_node): New function.
	(m2type_InitBaseTypes): Call build_m2_ztype_node.
	* gm2-lang.cc (gm2_type_for_size): Re-write using early returns.

gcc/testsuite/ChangeLog:

	PR modula2/108121
	* gm2/pim/fail/largeconst.mod: Increased constant value test
	to fail now that cc1gm2 uses widest_int to represent a ZTYPE.
	* gm2/pim/fail/largeconst2.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2023-04-26 02:55:59 +01:00
GCC Administrator
49cea02d8b Daily bump. 2023-04-26 00:17:46 +00:00
Hans-Peter Nilsson
064eed39d2 recog.cc: Correct comments referring to parameter match_len
* recog.cc (peep2_attempt, peep2_update_life): Correct
	head-comment description of parameter match_len.
2023-04-26 01:12:25 +02:00
Joseph Myers
dd39ec6dc7 Regenerate gcc.pot
* gcc.pot: Regenerate.
2023-04-25 21:43:55 +00:00
Patrick Palka
3d674e29d7 c++: value dependence of by-ref lambda capture [PR108975]
We are still ICEing on the generic lambda version of the testcase from
this PR, even after r13-6743-g6f90de97634d6f, due to the by-ref capture
of the constant local variable 'dim' being considered value-dependent
when regenerating the lambda (at which point processing_template_decl is
set since the lambda is generic), which prevents us from constant folding
its uses.  Later during prune_lambda_captures we end up not thoroughly
walking the body of the lambda and overlook the (non-folded) uses of
'dim' within the array bound and using-decls.

We could fix this by making prune_lambda_captures walk the body of the
lambda more thoroughly so that it finds these uses of 'dim', but ideally
we should be able to constant fold all uses of 'dim' ahead of time and
prune the implicit capture after all.

To that end this patch makes value_dependent_expression_p return false
for such by-ref captures of constant local variables, allowing their
uses to get constant folded ahead of time.  It seems we just need to
disable the predicate's conservative early exit for reference variables
(added by r5-5022-g51d72abe5ea04e) when DECL_HAS_VALUE_EXPR_P.  This
effectively makes us treat by-value and by-ref captures more consistently
when it comes to value dependence.
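
A hypothetical sketch of the code shape involved (not the new test
itself): a generic lambda capturing the constant local 'dim' by
reference and using it where a constant is needed:

void f ()
{
  const int dim = 2;
  [&dim] (auto) {
    int a[dim];   /* use of 'dim' in an array bound */
    (void) a;
  } (0);
}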

	PR c++/108975

gcc/cp/ChangeLog:

	* pt.cc (value_dependent_expression_p) <case VAR_DECL>:
	Suppress conservative early exit for reference variables
	when DECL_HAS_VALUE_EXPR_P.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/lambda/lambda-const11a.C: New test.
2023-04-25 15:59:22 -04:00
Vineet Gupta
0530254413 riscv: relax splitter restrictions for creating pseudos
[partial addressing of PR/109279]

RISC-V splitters have restrictions to not create pseudos due to a combine
limitation.  And despite this being a split-during-combine limitation,
all split passes take the hit due to the way define*_split are used in gcc.

With the original combine issue fixed by 61bee6aed2 ("combine: Don't
record for UNDO_MODE pointers into regno_reg_rtx array [PR104985]"),
the RISC-V splitters can now be relaxed.

This improves the codegen in general. e.g.

	long long f(void) { return 0x0101010101010101ull; }

Before

	li	a0,0x01010000
	addi	a0,0x0101
	slli	a0,a0,16
	addi	a0,a0,0x0101
	slli	a0,a0,16
	addi	a0,a0,0x0101
	ret

With patch

	li	a5,0x01010000
	addi	a5,a5,0x0101
	mv	a0,a5
	slli	a5,a5,32
	add	a0,a5,a0
	ret

This reduces the qemu icounts, even if slightly, across SPEC2017.

500.perlbench_r	0	1235310737733	1231742384460	0.29%
		1	744489708820	743515759958
		2	714072106766	712875768625	0.17%
502.gcc_r	0	197365353269	197178223030
		1	235614445254	235465240341
		2	226769189971	226604663947
		3	188315686133	188123584015
		4	289372107644	289187945424
503.bwaves_r	0	326291538768	326291539697
		1	515809487294	515809488863
		2	401647004144	401647005463
		3	488750661035	488750662484
505.mcf_r	0	681926695281	681925418147
507.cactuBSSN_r	0	3832240965352	3832226068734
508.namd_r	0	1919838790866	1919832527292
510.parest_r	0	3515999635520	3515878553435
511.povray_r	0	3073889223775	3074758622749
519.lbm_r	0	1194077464296	1194077464041
520.omnetpp_r	0	1014144252460	1011530791131	0.26%
521.wrf_r	0	3966715533120	3966265425092
523.xalancbmk_r	0	1064914296949	1064506711802
525.x264_r	0	509290028335	509258131632
		1	2001424246635	2001677767181
		2	1914660798226	1914869407575
526.blender_r	0	1726083839515	1725974286174
527.cam4_r	0	2336526136415	2333656336419
531.deepsjeng_r	0	1689007489539	1686541299243	0.15%
538.imagick_r	0	3247960667520	3247942048723
541.leela_r	0	2072315300365	2070248271250
544.nab_r	0	1527909091282	1527906483039
548.exchange2_r	0	2086120304280	2086314757502
549.fotonik3d_r	0	2261694058444	2261670330720
554.roms_r	0	2640547903140	2640512733483
557.xz_r	0	388736881767	386880875636	0.48%
		1	959356981818	959993132842
		2	547643353034	546374038310	0.23%
997.specrand_fr	0	512881578	512599641
999.specrand_ir	0	512881578	512599641

This is testsuite clean, no regression w/ patch.

               ========= Summary of gcc testsuite =========
                            | # of unexpected case / # of unique unexpected case
                            |          gcc |          g++ |     gfortran |
 rv64imafdc/  lp64d/ medlow |    2 /     2 |    1 /     1 |    6 /     1 |
   rv64imac/   lp64/ medlow |    3 /     3 |    1 /     1 |   43 /     8 |
 rv32imafdc/ ilp32d/ medlow |    1 /     1 |    3 /     2 |    6 /     1 |
   rv32imac/  ilp32/ medlow |    1 /     1 |    3 /     2 |   43 /     8 |

This came up as part of IRC chat on PR/109279 and was suggested by
Andrew Pinski.

gcc/ChangeLog:

	* config/riscv/riscv.md: riscv_move_integer() drop in_splitter arg.
	riscv_split_symbol() drop in_splitter arg.
	* config/riscv/riscv.cc: riscv_move_integer() drop in_splitter arg.
	riscv_split_symbol() drop in_splitter arg.
	riscv_force_temporary() drop in_splitter arg.
	* config/riscv/riscv-protos.h: riscv_move_integer() drop in_splitter arg.
	riscv_split_symbol() drop in_splitter arg.

Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
2023-04-25 10:00:56 -07:00
Eric Botcazou
e262cdf49c Avoid creating useless debug temporaries
insert_debug_temp_for_var_def has some strange code whereby it creates
debug temporaries for SINGLE_RHS (RHS for gimple_assign_single_p) but
not for other RHS in the same situation.

gcc/
	* tree-ssa.cc (insert_debug_temp_for_var_def): Do not create
	superfluous debug temporaries for single GIMPLE assignments.
2023-04-25 17:39:28 +02:00
Richard Biener
e8d0035301 tree-optimization/109609 - correctly interpret arg size in fnspec
By majority vote, and given a hint from the API name
arg_max_access_size_given_by_arg_p, this interprets a memory access
size specified as given by another argument (such as for strncpy in
the testcase, which has "1cO313") as specifying the _maximum_ size
read/written rather than the exact size.  There are two uses
interpreting it that way already and one differing.  The following
adjusts the differing one and clarifies the documentation.

	PR tree-optimization/109609
	* attr-fnspec.h (arg_max_access_size_given_by_arg_p):
	Clarify semantics.
	* tree-ssa-alias.cc (check_fnspec): Correctly interpret
	the size given by arg_max_access_size_given_by_arg_p as
	maximum, not exact, size.

	* gcc.dg/torture/pr109609.c: New testcase.
2023-04-25 16:53:06 +02:00
Tobias Burnus
1c101fcfaa 'omp scan' struct block seq update for OpenMP 5.x
While OpenMP 5.0 required a single structured block before and after the
'omp scan' directive, OpenMP 5.1 changed this to a 'structured block sequence',
denoting 2 or more executable statements in OpenMP 5.1 (whoops!) and zero or
more in OpenMP 5.2.  This commit updates C/C++ to accept zero statements (but
still requires the '{' ... '}' for the final-loop-body) and updates Fortran
to accept zero or more than one statement.

If there is no preceding or succeeding executable statement, a warning is
shown.
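
A hypothetical minimal sketch of the newly accepted C/C++ form, based on
the description above (zero executable statements after the directive,
with the braces that are still required for the final loop body):

void f (int *a, int n)
{
  int sum = 0;
  #pragma omp simd reduction (inscan, +: sum)
  for (int i = 0; i < n; i++)
    {
      sum += a[i];
      #pragma omp scan inclusive (sum)
      { }   /* zero statements here: now accepted, but warned about */
    }
}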

gcc/c/ChangeLog:

	* c-parser.cc (c_parser_omp_scan_loop_body): Handle
	zero exec statements before/after 'omp scan'.

gcc/cp/ChangeLog:

	* parser.cc (cp_parser_omp_scan_loop_body): Handle
	zero exec statements before/after 'omp scan'.

gcc/fortran/ChangeLog:

	* openmp.cc (gfc_resolve_omp_do_blocks): Handle zero
	or more than one exec statements before/after 'omp scan'.
	* trans-openmp.cc (gfc_trans_omp_do): Likewise.

libgomp/ChangeLog:

	* testsuite/libgomp.c-c++-common/scan-1.c: New test.
	* testsuite/libgomp.c/scan-23.c: New test.
	* testsuite/libgomp.fortran/scan-2.f90: New test.

gcc/testsuite/ChangeLog:

	* g++.dg/gomp/attrs-7.C: Update dg-error/dg-warning.
	* gfortran.dg/gomp/loop-2.f90: Likewise.
	* gfortran.dg/gomp/reduction5.f90: Likewise.
	* gfortran.dg/gomp/reduction6.f90: Likewise.
	* gfortran.dg/gomp/scan-1.f90: Likewise.
	* gfortran.dg/gomp/taskloop-2.f90: Likewise.
	* c-c++-common/gomp/scan-6.c: New test.
	* gfortran.dg/gomp/scan-8.f90: New test.
2023-04-25 16:29:14 +02:00
Jakub Jelinek
78aaaf862e testsuite: Fix up ext-floating2.C on powerpc64-linux
Another testcase that is failing on powerpc64-linux.  The test expects
a diagnostic when float64 && float128, or in another spot when
float32 && float128.  Now, the float128 effective target is satisfied on
powerpc64-linux, despite __CPP_FLOAT128_T__ not being defined, because
one needs to add some extra options for it.  I think 32-bit arm has
a similar case for float16.

2023-04-25  Jakub Jelinek  <jakub@redhat.com>

	* g++.dg/cpp23/ext-floating2.C: Add dg-add-options for
	float16, float32, float64 and float128.
2023-04-25 16:00:48 +02:00
Kyrylo Tkachov
9e9503e7b2 aarch64: PR target/PR99195 Annotate more simple integer binary patterns with vcz subst rules
This patch adds more straightforward annotations to some more integer binary ops to
eliminate redundant fmovs around 64-bit SIMD results.

Bootstrapped and tested on aarch64-none-linux.

gcc/ChangeLog:

	PR target/99195
	* config/aarch64/aarch64-simd.md (orn<mode>3): Rename to...
	(orn<mode>3<vczle><vczbe>): ... This.
	(bic<mode>3): Rename to...
	(bic<mode>3<vczle><vczbe>): ... This.
	(<su><maxmin><mode>3): Rename to...
	(<su><maxmin><mode>3<vczle><vczbe>): ... This.

gcc/testsuite/ChangeLog:

	PR target/99195
	* gcc.target/aarch64/simd/pr99195_1.c: Add tests for orn, bic, max and min.
2023-04-25 14:54:57 +01:00
Kyrylo Tkachov
c69db3ef7f aarch64: Implement V2DI,V4SI division optabs for TARGET_SVE
Similar to the mulv2di case, we can use SVE instructions to implement the V4SI and V2DI optabs
for signed and unsigned integer division.
This allows us to generate much cleaner code for the testcase than the current:
food:
        fmov    x1, d1
        fmov    x0, d0
        umov    x2, v0.d[1]
        sdiv    x0, x0, x1
        umov    x1, v1.d[1]
        sdiv    x1, x2, x1
        fmov    d0, x0
        ins     v0.d[1], x1
        ret
which now becomes:
food:
        ptrue   p0.b, all
        sdiv    z0.d, p0/m, z0.d, z1.d
        ret
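
The assembly above corresponds to source along these lines (a
hypothetical sketch using the GNU vector extension; the actual testcase
may differ):

typedef long long v2di __attribute__ ((vector_size (16)));

/* Element-wise signed division of two 64-bit-lane vectors; with this
   patch it maps onto the SVE sdiv shown above.  */
v2di div_v2di (v2di a, v2di b)
{
  return a / b;
}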

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/ChangeLog:

	* config/aarch64/aarch64-simd.md (<su_optab>div<mode>3): New define_expand.
	* config/aarch64/iterators.md (VQDIV): New mode iterator.
	(vnx2di): New mode attribute.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/sve-neon-modes_3.c: New test.
2023-04-25 14:51:09 +01:00
Jakub Jelinek
784e03f378 testsuite: Fix up ext-floating15.C tests on powerpc64-linux [PR109278]
I've noticed this test FAILs on powerpc64-linux, with
FAIL: g++.dg/cpp23/ext-floating15.C  -std=gnu++98 (test for excess errors)
Excess errors:
/home/jakub/gcc/gcc/testsuite/g++.dg/cpp23/ext-floating15.C:8:5: error: '_Float128' is not supported on this target
/home/jakub/gcc/gcc/testsuite/g++.dg/cpp23/ext-floating15.C:8:5: error: '_Float128' is not supported on this target
/home/jakub/gcc/gcc/testsuite/g++.dg/cpp23/ext-floating15.C:8:1: error: variable or field 'bar' declared void
/home/jakub/gcc/gcc/testsuite/g++.dg/cpp23/ext-floating15.C:8:5: error: '_Float128' is not supported on this target
/home/jakub/gcc/gcc/testsuite/g++.dg/cpp23/ext-floating15.C:8:6: error: expected primary-expression before '_Float128'
and similarly other std versions.
powerpc64-linux is a float128 target, but some extra options need to be added for it.

Fixed by adding them.

2023-04-25  Jakub Jelinek  <jakub@redhat.com>

	PR c++/109278
	* g++.dg/cpp23/ext-floating15.C: Add dg-add-options float128.
2023-04-25 14:42:21 +02:00
Richard Biener
6d4bd27a60 rtl-optimization/109585 - alias analysis typo
When r10-514-gc6b84edb6110dd2b4fb improved access path analysis
it introduced a typo that triggers when there's an access to a
trailing array in the first access path, leading to false
disambiguation.

	PR rtl-optimization/109585
	* tree-ssa-alias.cc (aliasing_component_refs_p): Fix typo.

	* gcc.dg/torture/pr109585.c: New testcase.
2023-04-25 14:23:59 +02:00
Jakub Jelinek
97f8f2d0a0 powerpc: Fix up *branch_anddi3_dot for -m32 -mpowerpc64 [PR109566]
The following testcase reduced from newlib ICEs on powerpc-linux,
with -O2 -m32 -mpowerpc64 since r12-6433 PR102239 optimization was
added and on the original testcase since some ranger improvements in
GCC 13 made it no longer latent on newlib.
The problem is that the *branch_anddi3_dot define_insn_and_split
relies on the *rotldi3_mask_dot define_insn_and_split being recognized
during splitting.  The rs6000_is_valid_rotate_dot_mask function checks whether
the mask is a CONST_INT which is a valid mask, but *rotl<mode>3_mask_dot in
addition to checking that it is a valid mask also has
  (<MODE>mode == Pmode || UINTVAL (operands[3]) <= 0x7fffffff)
test in the condition.  For TARGET_64BIT that doesn't add any further
requirements, but for !TARGET_64BIT && TARGET_POWERPC64 if the AND
second operand is larger than INT_MAX it will not be recognized.

The rs6000_is_valid_rotate_dot_mask function is used solely in one spot,
condition of *branch_anddi3_dot, so the following patch adjusts it
to check for that as well.

2023-04-25  Jakub Jelinek  <jakub@redhat.com>

	PR target/109566
	* config/rs6000/rs6000.cc (rs6000_is_valid_rotate_dot_mask): For
	!TARGET_64BIT, don't return true if UINTVAL (mask) << (63 - nb)
	is larger than signed int maximum.

	* gcc.target/powerpc/pr109566.c: New test.
2023-04-25 14:20:51 +02:00
Martin Liska
171fe0681e gcov: add info about "calls" to JSON output format
gcc/ChangeLog:

	* doc/gcov.texi: Document the new "calls" field and document
	the API bump. Mention also "block_ids" for lines.
	* gcov.cc (output_intermediate_json_line): Output info about
	calls and extend branches as well.
	(generate_results): Bump version to 2.
	(output_line_details): Use block ID instead of a nonsensical
	index.

gcc/testsuite/ChangeLog:

	* g++.dg/gcov/gcov-17.C: Add call to a noreturn function.
	* g++.dg/gcov/test-gcov-17.py: Cover new format.
	* lib/gcov.exp: Add options for gcov that emit the extra info.
2023-04-25 13:11:29 +02:00
Roger Sayle
dee5cef280 [Committed] Correct zeroextendqihi2 insn length regression on xstormy16.
My recent tweak to the zeroextendqihi2 pattern on xstormy16 incorrectly
handled the case where the operand was a MEM.  MEM operands use a longer
encoding than REG operands, and the incorrect instruction length resulted
in assembler errors (as reported by Jeff Law).  This patch restores the
original length resolving this regression.  Sorry for the inconvenience.
Committed as obvious, after testing that a cross-compiler to xstormy16-elf
builds from x86_64-pc-linux-gnu, and that gcc.c-torture/execute/memset-2.c
no longer causes "operand out of range" issues in gas.  Committed as
obvious.

2023-04-25  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/stormy16/stormy16.md (zero_extendqihi2): Restore/fix
	length attribute for the first (memory operand) alternative.
2023-04-25 12:04:52 +01:00