Commit graph

204981 commits

Author SHA1 Message Date
Juzhe-Zhong
c9bb20f7c9 NFC: Fix whitespace
Note that there is a whitespace issue in the previous commit:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=f66b2fc122b8a17591afbb881d580b32e8ddb708

Sorry for missing this whitespace fix.

Committed as it is obvious.

gcc/ChangeLog:

	* tree-vect-slp.cc (vect_build_slp_tree_1): Fix whitespace.
2023-11-01 08:52:46 +08:00
GCC Administrator
eac0917bd3 Daily bump. 2023-11-01 00:17:52 +00:00
David Malcolm
37e1634ef1 analyzer: move class record_layout to its own .h/.cc
No functional change intended.

gcc/ChangeLog:
	* Makefile.in (ANALYZER_OBJS): Add analyzer/record-layout.o.

gcc/analyzer/ChangeLog:
	* record-layout.cc: New file, based on material in region-model.cc.
	* record-layout.h: Likewise.
	* region-model.cc: Include "analyzer/record-layout.h".
	(class record_layout): Move to record-layout.cc and .h

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2023-10-31 17:05:41 -04:00
David Malcolm
b0f19336f2 libcpp: eliminate MACRO_MAP_EXPANSION_POINT_LOCATION
This patch eliminates the function "MACRO_MAP_EXPANSION_POINT_LOCATION"
(which hasn't been a macro since r6-739-g0501dbd932a7e9) in favor of
a new line_map_macro::get_expansion_point_location accessor.

No functional change intended.

gcc/c-family/ChangeLog:
	* c-warn.cc (warn_for_multistatement_macros): Update for removal
	of MACRO_MAP_EXPANSION_POINT_LOCATION.

gcc/cp/ChangeLog:
	* module.cc (ordinary_loc_of): Update for removal of
	MACRO_MAP_EXPANSION_POINT_LOCATION.
	(module_state::note_location): Update for renaming of field.
	(module_state::write_macro_maps): Likewise.

gcc/ChangeLog:
	* input.cc (dump_location_info): Update for removal of
	MACRO_MAP_EXPANSION_POINT_LOCATION.
	* tree-diagnostic.cc (maybe_unwind_expanded_macro_loc):
	Likewise.

libcpp/ChangeLog:
	* include/line-map.h
	(line_map_macro::get_expansion_point_location): New accessor.
	(line_map_macro::expansion): Rename field to...
	(line_map_macro::mexpansion): Rename field to...
	(MACRO_MAP_EXPANSION_POINT_LOCATION): Delete this function.
	* line-map.cc (linemap_enter_macro): Update for renaming of field.
	(linemap_macro_map_loc_to_exp_point): Update for removal of
	MACRO_MAP_EXPANSION_POINT_LOCATION.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2023-10-31 17:05:41 -04:00
David Malcolm
8b4ac021cd opts.cc: fix comment about DOCUMENTATION_ROOT_URL
gcc/ChangeLog:
	* opts.cc (get_option_url): Update comment; the requirement to
	pass DOCUMENTATION_ROOT_URL's value via -D was removed in
	r10-8065-ge33a1eae25b8a8.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2023-10-31 17:05:41 -04:00
David Malcolm
b9e2088d29 pretty-print: gracefully handle null URLs
gcc/ChangeLog:
	* pretty-print.cc (pretty_printer::pretty_printer): Initialize
	m_skipping_null_url.
	(pp_begin_url): Handle URL being null.
	(pp_end_url): Likewise.
	(selftest::test_null_urls): New.
	(selftest::pretty_print_cc_tests): Call it.
	* pretty-print.h (pretty_printer::m_skipping_null_url): New.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2023-10-31 17:05:40 -04:00
Juzhe-Zhong
f66b2fc122 VECT: Support SLP MASK_LEN_GATHER_LOAD with conditional mask
This patch leverages the current MASK_GATHER_LOAD to support SLP MASK_LEN_GATHER_LOAD with a conditional mask.

Unconditional MASK_LEN_GATHER_LOAD (base, offset, scale, zero, -1) SLP is not included in this patch
since it seems that we can't support it in the middle-end:

FAIL: gcc.dg/tree-ssa/pr44306.c (internal compiler error: in vectorizable_load, at tree-vect-stmts.cc:9885)

Maybe we should support GATHER_LOAD explicitly in the RISC-V backend to work around this issue.

I am going to add explicit GATHER_LOAD support in the RISC-V backend as a workaround.

This patch also adds conditional gather load test since there is no conditional gather load test.

Ok for trunk ?

gcc/ChangeLog:

	* tree-vect-slp.cc (vect_get_operand_map): Add MASK_LEN_GATHER_LOAD.
	(vect_build_slp_tree_1): Ditto.
	(vect_build_slp_tree_2): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.dg/vect/vect-gather-6.c: New test.
2023-10-31 21:07:46 +01:00
Cupertino Miranda
327d38f101 bpf: Improvements in CO-RE builtins implementation.
This patch moves the processing of the attribute preserve_access_index to
its own independent gimple lowering pass.
This approach is more consistent with the implementation of the CO-RE
builtins when used explicitly in the code.  The attributed type accesses
are now converted early to the __builtin_core_reloc builtin instead of being
kept as an expression in the code throughout the middle-end.
This prevents the compiler from optimizing out or manipulating the
expression using the locally defined type; instead it assumes nothing is
known about this expression, as should be the case for all CO-RE
relocations.

In the process, __builtin_preserve_access_index has also been
improved to generate code for more complex expressions that
require more than one CO-RE relocation.
This turned out to be a requirement, since the bpf-next selftests rely on
loop unrolling to convert an undefined index array access into a
defined one.  Expecting the unroll to happen seemed extreme, so GCC still
generates correct code in such scenarios, even when the index
access is never predictable or unrolling does not occur.

gcc/ChangeLog:
	* config/bpf/bpf-passes.def (pass_lower_bpf_core): Added pass.
	* config/bpf/bpf-protos.h: Added prototype for new pass.
	* config/bpf/bpf.cc (bpf_delegitimize_address): New function.
	* config/bpf/bpf.md (mov_reloc_core<MM:mode>): Prefixed
	name with '*'.
	* config/bpf/core-builtins.cc (cr_builtins): Added access_node to
	struct.
	(is_attr_preserve_access): Improved check.
	(core_field_info): Make use of root_for_core_field_info
	function.
	(process_field_expr): Adapted to new functions.
	(pack_type): Small improvement.
	(bpf_handle_plugin_finish_type): Adapted to GTY(()).
	(bpf_init_core_builtins): Changed to new function names.
	(construct_builtin_core_reloc): Improved implementation.
	(bpf_resolve_overloaded_core_builtin): Changed how
	__builtin_preserve_access_index is converted.
	(compute_field_expr): Corrected implementation. Added
	access_node argument.
	(bpf_core_get_index): Added valid argument.
	(root_for_core_field_info, pack_field_expr)
	(core_expr_with_field_expr_plus_base, make_core_safe_access_index)
	(replace_core_access_index_comp_expr, maybe_get_base_for_field_expr)
	(core_access_clean, core_is_access_index, core_mark_as_access_index)
	(make_gimple_core_safe_access_index, execute_lower_bpf_core)
	(make_pass_lower_bpf_core): Added functions.
	(pass_data_lower_bpf_core): New pass struct.
	(pass_lower_bpf_core): New gimple_opt_pass class.
	(pack_field_expr_for_preserve_field)
	(bpf_replace_core_move_operands): Removed function.
	(bpf_enum_value_kind): Added GTY(()).
	* config/bpf/core-builtins.h (bpf_field_info_kind, bpf_type_id_kind)
	(bpf_type_info_kind, bpf_enum_value_kind): New enum.
	* config/bpf/t-bpf: Added pass bpf-passes.def to PASSES_EXTRA.

gcc/testsuite/ChangeLog:
	* gcc.target/bpf/core-attr-5.c: New test.
	* gcc.target/bpf/core-attr-6.c: New test.
	* gcc.target/bpf/core-builtin-1.c: Corrected
	* gcc.target/bpf/core-builtin-enumvalue-opt.c: Corrected regular
	expression.
	* gcc.target/bpf/core-builtin-enumvalue.c: Corrected regular
	expression.
	* gcc.target/bpf/core-builtin-exprlist-1.c: New test.
	* gcc.target/bpf/core-builtin-exprlist-2.c: New test.
	* gcc.target/bpf/core-builtin-exprlist-3.c: New test.
	* gcc.target/bpf/core-builtin-exprlist-4.c: New test.
	* gcc.target/bpf/core-builtin-fieldinfo-offset-1.c: Extra tests
2023-10-31 18:47:03 +00:00
Neal Frager
0f1727e25f gcc: config: microblaze: fix cpu version check
The MICROBLAZE_VERSION_COMPARE was incorrectly using strcasecmp
instead of strverscmp to check the mcpu version against feature
options.  By simply changing the define to use strverscmp,
the new version 10.0 is treated correctly as a higher version
than previous versions.
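
To illustrate the difference described above, here is a minimal sketch that compares version strings with both functions. The helper names version_ge_casecmp and version_ge_verscmp are hypothetical and are not taken from microblaze.cc; strverscmp is a GNU extension, hence the _GNU_SOURCE define.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* strverscmp is a GNU extension.  */
#endif
#include <string.h>
#include <strings.h>

/* Hypothetical helper: byte-wise, case-insensitive comparison.
   '1' sorts before '6', so "v10.0" looks older than "v6.00.a".  */
static int
version_ge_casecmp (const char *va, const char *vb)
{
  return strcasecmp (va, vb) >= 0;
}

/* Hypothetical helper: digit runs are compared as numbers,
   so "v10.0" is treated as newer than "v6.00.a".  */
static int
version_ge_verscmp (const char *va, const char *vb)
{
  return strverscmp (va, vb) >= 0;
}
```

With these helpers, version_ge_casecmp ("v10.0", "v6.00.a") is false while version_ge_verscmp ("v10.0", "v6.00.a") is true, which matches the behaviour change described in this commit.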

gcc/ChangeLog:

	* config/microblaze/microblaze.cc: Fix mcpu version check.

gcc/testsuite/ChangeLog:

	* gcc.target/microblaze/isa/bshift.c: Bump to mcpu=v10.0.
	* gcc.target/microblaze/isa/div.c: Ditto.
	* gcc.target/microblaze/isa/fcmp1.c: Ditto.
	* gcc.target/microblaze/isa/fcmp2.c: Ditto.
	* gcc.target/microblaze/isa/fcmp3.c: Ditto.
	* gcc.target/microblaze/isa/fcmp4.c: Ditto.
	* gcc.target/microblaze/isa/fcvt.c: Ditto.
	* gcc.target/microblaze/isa/float.c: Ditto.
	* gcc.target/microblaze/isa/fsqrt.c: Ditto.
	* gcc.target/microblaze/isa/mul-bshift-pcmp.c: Ditto.
	* gcc.target/microblaze/isa/mul-bshift.c: Ditto.
	* gcc.target/microblaze/isa/mul.c: Ditto.
	* gcc.target/microblaze/isa/mulh-bshift-pcmp.c: Ditto.
	* gcc.target/microblaze/isa/mulh.c: Ditto.
	* gcc.target/microblaze/isa/nofcmp.c: Ditto.
	* gcc.target/microblaze/isa/nofloat.c: Ditto.
	* gcc.target/microblaze/isa/pcmp.c: Ditto.
	* gcc.target/microblaze/isa/vanilla.c: Ditto.
	* gcc.target/microblaze/microblaze.exp: Ditto.

Signed-off-by: Neal Frager <neal.frager@amd.com>
Signed-off-by: Michael J. Eager <eager@eagercon.com>
2023-10-31 10:57:45 -07:00
Patrick O'Neill
2b19c38769 RISC-V: Require the A extension for testcases with atomic insns
Add testsuite infrastructure for the A extension and use it to require the A
extension for dg-do run and add the A extension for non-A dg-do compile.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/amo-table-a-6-amo-add-1.c: Add A extension to
	dg-options for dg-do compile.
	* gcc.target/riscv/amo-table-a-6-amo-add-2.c: Ditto.
	* gcc.target/riscv/amo-table-a-6-amo-add-3.c: Ditto.
	* gcc.target/riscv/amo-table-a-6-amo-add-4.c: Ditto.
	* gcc.target/riscv/amo-table-a-6-amo-add-5.c: Ditto.
	* gcc.target/riscv/amo-table-a-6-compare-exchange-1.c: Ditto.
	* gcc.target/riscv/amo-table-a-6-compare-exchange-2.c: Ditto.
	* gcc.target/riscv/amo-table-a-6-compare-exchange-3.c: Ditto.
	* gcc.target/riscv/amo-table-a-6-compare-exchange-4.c: Ditto.
	* gcc.target/riscv/amo-table-a-6-compare-exchange-5.c: Ditto.
	* gcc.target/riscv/amo-table-a-6-compare-exchange-6.c: Ditto.
	* gcc.target/riscv/amo-table-a-6-compare-exchange-7.c: Ditto.
	* gcc.target/riscv/amo-table-a-6-subword-amo-add-1.c: Ditto.
	* gcc.target/riscv/amo-table-a-6-subword-amo-add-2.c: Ditto.
	* gcc.target/riscv/amo-table-a-6-subword-amo-add-3.c: Ditto.
	* gcc.target/riscv/amo-table-a-6-subword-amo-add-4.c: Ditto.
	* gcc.target/riscv/amo-table-a-6-subword-amo-add-5.c: Ditto.
	* gcc.target/riscv/inline-atomics-2.c: Ditto.
	* gcc.target/riscv/inline-atomics-3.c: Require A extension for dg-do
	run.
	* gcc.target/riscv/inline-atomics-4.c: Ditto.
	* gcc.target/riscv/inline-atomics-5.c: Ditto.
	* gcc.target/riscv/inline-atomics-6.c: Ditto.
	* gcc.target/riscv/inline-atomics-7.c: Ditto.
	* gcc.target/riscv/inline-atomics-8.c: Ditto.
	* lib/target-supports.exp: Add testing infrastructure to require the A
	extension or add it to an existing -march.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2023-10-31 10:15:34 -07:00
Patrick O'Neill
b93fddba39 RISC-V: Let non-atomic targets use optimized amo loads/stores
Non-atomic targets are currently prevented from using the optimized fencing for
seq_cst load/seq_cst store. This patch removes that constraint.
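
As a minimal sketch of the kind of source code affected, the following C11 snippet performs a seq_cst store and a seq_cst load. The variable and function names are hypothetical; which fence sequence a given RISC-V configuration emits for them is not asserted here.

```c
#include <stdatomic.h>

/* Hypothetical shared variable used only for this illustration.  */
static _Atomic int shared_value;

/* A seq_cst store: the kind of operation that the atomic_store<mode>
   patterns are expected to expand.  */
static void
publish (int v)
{
  atomic_store_explicit (&shared_value, v, memory_order_seq_cst);
}

/* A seq_cst load: the kind of operation that the atomic_load<mode>
   patterns are expected to expand.  */
static int
observe (void)
{
  return atomic_load_explicit (&shared_value, memory_order_seq_cst);
}
```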

gcc/ChangeLog:

	* config/riscv/sync-rvwmo.md (atomic_load_rvwmo<mode>): Remove
	TARGET_ATOMIC constraint.
	(atomic_store_rvwmo<mode>): Ditto.
	* config/riscv/sync-ztso.md (atomic_load_ztso<mode>): Ditto.
	(atomic_store_ztso<mode>): Ditto.
	* config/riscv/sync.md (atomic_load<mode>): Ditto.
	(atomic_store<mode>): Ditto.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2023-10-31 10:15:33 -07:00
Christoph Müllner
60d6c63df0 riscv: thead: Add support for the XTheadFMemIdx ISA extension
The XTheadFMemIdx ISA extension provides additional load and store
instructions for floating-point registers with new addressing modes.

The following memory access types are supported:
* load/store: [w,d] (single-precision FP, double-precision FP)

The following addressing modes are supported:
* register offset with additional immediate offset (4 instructions):
  flr<type>, fsr<type>
* zero-extended register offset with additional immediate offset
  (4 instructions): flur<type>, fsur<type>

These addressing modes are also part of the similar XTheadMemIdx
ISA extension support, whose code is reused and extended to support
floating-point registers.

One challenge that this patch needs to solve is GP registers in FP-mode
(e.g. "(reg:DF a2)"), which cannot be handled by the XTheadFMemIdx
instructions. Such registers are the result of independent
optimizations, which can happen after register allocation.
This patch uses a simple but efficient method to address this:
add a dependency for XTheadMemIdx to XTheadFMemIdx optimizations.
This allows the instructions from XTheadMemIdx to be used in case
of such registers.

The added tests ensure that this feature won't regress without notice.
Testing: GCC regression test suite and SPEC CPU 2017 intrate (base&peak).

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>

gcc/ChangeLog:

	* config/riscv/riscv.cc (riscv_index_reg_class):
	Return GR_REGS for XTheadFMemIdx.
	(riscv_regno_ok_for_index_p): Add support for XTheadFMemIdx.
	* config/riscv/riscv.h (HARDFP_REG_P): New macro.
	* config/riscv/thead.cc (is_fmemidx_mode): New function.
	(th_memidx_classify_address_index): Add support for XTheadFMemIdx.
	(th_fmemidx_output_index): New function.
	(th_output_move): Add support for XTheadFMemIdx.
	* config/riscv/thead.md (TH_M_ANYF): New mode iterator.
	(TH_M_NOEXTF): Likewise.
	(*th_fmemidx_movsf_hardfloat): New INSN.
	(*th_fmemidx_movdf_hardfloat_rv64): Likewise.
	(*th_fmemidx_I_a): Likewise.
	(*th_fmemidx_I_c): Likewise.
	(*th_fmemidx_US_a): Likewise.
	(*th_fmemidx_US_c): Likewise.
	(*th_fmemidx_UZ_a): Likewise.
	(*th_fmemidx_UZ_c): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/xtheadfmemidx-index-update.c: New test.
	* gcc.target/riscv/xtheadfmemidx-index-xtheadbb-update.c: New test.
	* gcc.target/riscv/xtheadfmemidx-index-xtheadbb.c: New test.
	* gcc.target/riscv/xtheadfmemidx-index.c: New test.
	* gcc.target/riscv/xtheadfmemidx-uindex-update.c: New test.
	* gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb-update.c: New test.
	* gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb.c: New test.
	* gcc.target/riscv/xtheadfmemidx-uindex.c: New test.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
2023-10-31 18:08:02 +01:00
Christoph Müllner
2d65622fda riscv: thead: Add support for the XTheadMemIdx ISA extension
The XTheadMemIdx ISA extension provides additional load and store
instructions with new addressing modes.

The following memory access types are supported:
* load: b,bu,h,hu,w,wu,d
* store: b,h,w,d

The following addressing modes are supported:
* immediate offset with PRE_MODIFY or POST_MODIFY (22 instructions):
  l<ltype>.ia, l<ltype>.ib, s<stype>.ia, s<stype>.ib
* register offset with additional immediate offset (11 instructions):
  lr<ltype>, sr<stype>
* zero-extended register offset with additional immediate offset
  (11 instructions): lur<ltype>, sur<stype>
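
As a rough illustration of the source-level access shapes that correspond to the addressing modes listed above, consider the following sketch. The function names are hypothetical, and whether the compiler actually selects XTheadMemIdx instructions for this code depends on the -march setting and optimization level; nothing about the generated code is asserted here.

```c
#include <stddef.h>

/* Indexed access: a base pointer plus a scaled index, the shape
   targeted by the register-offset forms.  */
static long
sum_indexed (const long *base, const size_t *idx, size_t n)
{
  long sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += base[idx[i]];
  return sum;
}

/* Pointer walk: a load followed by a pointer bump, the shape
   targeted by the POST_MODIFY forms.  */
static long
sum_walk (const long *p, size_t n)
{
  long sum = 0;
  while (n--)
    sum += *p++;
  return sum;
}
```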

The RISC-V base ISA does not support index registers, so the changes
are kept separate from the RISC-V standard support as much as possible.

To combine the shift/multiply instructions into the memory access
instructions, this patch comes with a few insn_and_split optimizations
that allow the combiner to do this task.

Handling the different cases of extensions results in a couple of INSNs
that look redundant at first glance, but they are just the equivalent
of what we already have for Zbb. The only difference is that
we have many more load instructions.

We already have a constraint with the name 'th_f_fmv', therefore,
the new constraints follow this pattern and have the same length
as required ('th_m_mia', 'th_m_mib', 'th_m_mir', 'th_m_miu').

The added tests ensure that this feature won't regress without notice.
Testing: GCC regression test suite, GCC bootstrap build, and
SPEC CPU 2017 intrate (base&peak) on C920.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>

gcc/ChangeLog:

	* config/riscv/constraints.md (th_m_mia): New constraint.
	(th_m_mib): Likewise.
	(th_m_mir): Likewise.
	(th_m_miu): Likewise.
	* config/riscv/riscv-protos.h (enum riscv_address_type):
	Add new address types ADDRESS_REG_REG, ADDRESS_REG_UREG,
	and ADDRESS_REG_WB and their documentation.
	(struct riscv_address_info): Add new field 'shift' and
	document the field usage for the new address types.
	(riscv_valid_base_register_p): New prototype.
	(th_memidx_legitimate_modify_p): Likewise.
	(th_memidx_legitimate_index_p): Likewise.
	(th_classify_address): Likewise.
	(th_output_move): Likewise.
	(th_print_operand_address): Likewise.
	* config/riscv/riscv.cc (riscv_index_reg_class):
	Return GR_REGS for XTheadMemIdx.
	(riscv_regno_ok_for_index_p): Add support for XTheadMemIdx.
	(riscv_classify_address): Call th_classify_address() on top.
	(riscv_output_move): Call th_output_move() on top.
	(riscv_print_operand_address): Call th_print_operand_address()
	on top.
	* config/riscv/riscv.h (HAVE_POST_MODIFY_DISP): New macro.
	(HAVE_PRE_MODIFY_DISP): Likewise.
	* config/riscv/riscv.md (zero_extendqi<SUPERQI:mode>2): Disable
	for XTheadMemIdx.
	(*zero_extendqi<SUPERQI:mode>2_internal): Convert to expand,
	create INSN with same name and disable it for XTheadMemIdx.
	(extendsidi2): Likewise.
	(*extendsidi2_internal): Disable for XTheadMemIdx.
	* config/riscv/thead.cc (valid_signed_immediate): New helper
	function.
	(th_memidx_classify_address_modify): New function.
	(th_memidx_legitimate_modify_p): Likewise.
	(th_memidx_output_modify): Likewise.
	(is_memidx_mode): Likewise.
	(th_memidx_classify_address_index): Likewise.
	(th_memidx_legitimate_index_p): Likewise.
	(th_memidx_output_index): Likewise.
	(th_classify_address): Likewise.
	(th_output_move): Likewise.
	(th_print_operand_address): Likewise.
	* config/riscv/thead.md (*th_memidx_operand): New splitter.
	(*th_memidx_zero_extendqi<SUPERQI:mode>2): New INSN.
	(*th_memidx_extendsidi2): Likewise.
	(*th_memidx_zero_extendsidi2): Likewise.
	(*th_memidx_zero_extendhi<GPR:mode>2): Likewise.
	(*th_memidx_extend<SHORT:mode><SUPERQI:mode>2): Likewise.
	(*th_memidx_bb_zero_extendsidi2): Likewise.
	(*th_memidx_bb_zero_extendhi<GPR:mode>2): Likewise.
	(*th_memidx_bb_extendhi<GPR:mode>2): Likewise.
	(*th_memidx_bb_extendqi<SUPERQI:mode>2): Likewise.
	(TH_M_ANYI): New mode iterator.
	(TH_M_NOEXTI): Likewise.
	(*th_memidx_I_a): New combiner optimization.
	(*th_memidx_I_b): Likewise.
	(*th_memidx_I_c): Likewise.
	(*th_memidx_US_a): Likewise.
	(*th_memidx_US_b): Likewise.
	(*th_memidx_US_c): Likewise.
	(*th_memidx_UZ_a): Likewise.
	(*th_memidx_UZ_b): Likewise.
	(*th_memidx_UZ_c): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/xtheadmemidx-helpers.h: New test.
	* gcc.target/riscv/xtheadmemidx-index-update.c: New test.
	* gcc.target/riscv/xtheadmemidx-index-xtheadbb-update.c: New test.
	* gcc.target/riscv/xtheadmemidx-index-xtheadbb.c: New test.
	* gcc.target/riscv/xtheadmemidx-index.c: New test.
	* gcc.target/riscv/xtheadmemidx-modify-xtheadbb.c: New test.
	* gcc.target/riscv/xtheadmemidx-modify.c: New test.
	* gcc.target/riscv/xtheadmemidx-uindex-update.c: New test.
	* gcc.target/riscv/xtheadmemidx-uindex-xtheadbb-update.c: New test.
	* gcc.target/riscv/xtheadmemidx-uindex-xtheadbb.c: New test.
	* gcc.target/riscv/xtheadmemidx-uindex.c: New test.
2023-10-31 18:08:01 +01:00
Carl Love
c82f123d93 rs6000, Add missing overloaded bcd builtin tests, documentation
Currently we have the documentation for __builtin_vec_bcdsub_{eq,gt,lt} but
not for __builtin_bcdsub_{gl}e; this patch supplements the descriptions
for them.  Although they are mainly for __builtin_bcdcmp{ge,le}, we already
have some testing coverage for __builtin_vec_bcdsub_{eq,gt,lt}, so this patch
adds the corresponding explicit test cases as well.

gcc/ChangeLog:
	* doc/extend.texi (__builtin_bcdsub_le, __builtin_bcdsub_ge): Add
	documentation for the built-ins.

gcc/testsuite/ChangeLog:
	* gcc.target/powerpc/bcd-3.c (do_sub_ge, do_suble): Add functions
	to test builtins __builtin_bcdsub_ge and __builtin_bcdsub_le.
2023-10-31 12:30:41 -04:00
Neal Frager
f694960924 gcc: config: microblaze: fix cpu version check
The MICROBLAZE_VERSION_COMPARE was incorrectly using strcasecmp
instead of strverscmp to check the mcpu version against feature
options.  By simply changing the define to use strverscmp,
the new version 10.0 is treated correctly as a higher version
than previous versions.

Fix incorrect warning with -mcpu=10.0:
  warning: '-mxl-multiply-high' can be used only with
  '-mcpu=v6.00.a' or greater

Signed-off-by: Neal Frager <neal.frager@amd.com>
Signed-off-by: Michael J. Eager <eager@eagercon.com>
2023-10-31 09:29:10 -07:00
Vladimir N. Makarov
9119b008b4 [RA]: Fixing LRA cycling for multi-reg variable containing a fixed reg
PR111971 test case uses a multi-reg variable containing a fixed reg.  LRA
rejects such multi-reg because of this when matching the constraint for
an asm insn.  The rejection results in LRA cycling.  The patch fixes this issue.

gcc/ChangeLog:

	PR rtl-optimization/111971
	* lra-constraints.cc (process_alt_operands): Don't check start
	hard regs for regs originated from register variables.

gcc/testsuite/ChangeLog:

	PR rtl-optimization/111971
	* gcc.target/powerpc/pr111971.c: New test.
2023-10-31 11:45:40 -04:00
Thomas Schwinge
3e888f9462 Add OpenACC 'acc_map_data' variant to 'libgomp.oacc-c-c++-common/deep-copy-8.c'
libgomp/
	* testsuite/libgomp.oacc-c-c++-common/deep-copy-8.c: Add OpenACC
	'acc_map_data' variant.
2023-10-31 14:54:41 +01:00
Robin Dapp
5de05bdaa7 RISC-V: Add vector fmin/fmax expanders.
This patch adds expanders for fmin and fmax.  As per RISC-V V Spec 1.0
vfmin/vfmax are IEEE 754-2019 compliant which differs from IEEE 754-2008
that fmin/fmax require (particularly in the signaling-NaN handling).
Therefore the pattern conditions include a !HONOR_SNANS.
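
For reference, here is a minimal scalar sketch of the quiet-NaN rule that the C fmin function follows: if exactly one operand is NaN, the other operand is returned. The names min_num and reduce_min are hypothetical. The sketch covers quiet NaNs only and does not model the signaling-NaN difference mentioned above.

```c
#include <math.h>

/* Hypothetical scalar reference for the quiet-NaN rule of fmin:
   if exactly one operand is NaN, return the other one.  */
static double
min_num (double a, double b)
{
  if (isnan (a))
    return b;
  if (isnan (b))
    return a;
  return a < b ? a : b;
}

/* Min reduction over N > 0 elements, the scalar shape of what a
   reduc_fmin_scal_<mode> expander computes on a vector.  */
static double
reduce_min (const double *x, int n)
{
  double m = x[0];
  for (int i = 1; i < n; i++)
    m = min_num (m, x[i]);
  return m;
}
```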

gcc/ChangeLog:

	* config/riscv/autovec.md (<ieee_fmaxmin_op><mode>3): fmax/fmin
	expanders.
	(cond_<ieee_fmaxmin_op><mode>): Ditto.
	(cond_len_<ieee_fmaxmin_op><mode>): Ditto.
	(reduc_fmax_scal_<mode>): Ditto.
	(reduc_fmin_scal_<mode>): Ditto.
	* config/riscv/riscv-v.cc (needs_fp_rounding): Add fmin/fmax.
	* config/riscv/vector-iterators.md (fmin): New UNSPEC.
	(UNSPEC_VFMIN): Ditto.
	* config/riscv/vector.md (@pred_<ieee_fmaxmin_op><mode>): Add
	UNSPEC insn patterns.
	(@pred_<ieee_fmaxmin_op><mode>_scalar): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/cond/cond_fmax-1.c: Remove
	-ffast-math.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmax-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmax-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmax-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmax_run-1.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmax_run-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmax_run-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmax_run-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmin-1.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmin-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmin-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmin-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmin_run-1.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmin_run-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmin_run-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmin_run-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/binop/fmax-1.c: New test.
	* gcc.target/riscv/rvv/autovec/binop/fmax_run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/binop/fmax_zvfh-1.c: New test.
	* gcc.target/riscv/rvv/autovec/binop/fmax_zvfh_run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/binop/fmin-1.c: New test.
	* gcc.target/riscv/rvv/autovec/binop/fmin_run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/binop/fmin_zvfh-1.c: New test.
	* gcc.target/riscv/rvv/autovec/binop/fmin_zvfh_run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmax_zvfh-1.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmax_zvfh-2.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmax_zvfh-3.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmax_zvfh-4.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmax_zvfh_run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmax_zvfh_run-2.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmax_zvfh_run-3.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmax_zvfh_run-4.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmin_zvfh-1.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmin_zvfh-2.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmin_zvfh-3.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmin_zvfh-4.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmin_zvfh_run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmin_zvfh_run-2.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmin_zvfh_run-3.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmin_zvfh_run-4.c: New test.
	* gcc.target/riscv/rvv/autovec/reduc/reduc-10.c: New test.
	* gcc.target/riscv/rvv/autovec/reduc/reduc_run-10.c: New test.
	* gcc.target/riscv/rvv/autovec/reduc/reduc_zvfh-10.c: New test.
	* gcc.target/riscv/rvv/autovec/reduc/reduc_zvfh_run-10.c: New test.
2023-10-31 13:34:28 +01:00
Robin Dapp
184378027e genemit: Split insn-emit.cc into several partitions.
On riscv insn-emit.cc has grown to over 1.2 million lines of code and
compiling it takes considerable time.
Therefore, this patch adjusts genemit to create several partitions
(insn-emit-1.cc to insn-emit-n.cc).  The available patterns are
written to the given files in a sequential fashion.

Similar to match.pd a configure option --with-insnemit-partitions=num
is introduced that makes the number of partitions configurable.
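
One way to picture sequential partitioning is the following sketch, which maps a pattern index to an output file index. This is an assumption made for illustration, not the code in genemit.cc; the function name and the rounding choice are hypothetical.

```c
/* Hypothetical helper: 0-based index of the output file that receives
   pattern number PATTERN when TOTAL patterns are written sequentially
   into PARTS files of near-equal size.  Assumes TOTAL > 0, PARTS > 0
   and 0 <= PATTERN < TOTAL.  */
static int
partition_for_pattern (int pattern, int total, int parts)
{
  /* Patterns per file, rounded up so that PARTS files suffice.  */
  int per_file = (total + parts - 1) / parts;
  return pattern / per_file;
}
```

With 25 patterns and 10 partitions, each file receives up to 3 consecutive patterns, so pattern 0 lands in file 0 and pattern 24 in file 8.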

gcc/ChangeLog:

	PR bootstrap/84402
	PR target/111600

	* Makefile.in: Handle split insn-emit.cc.
	* configure: Regenerate.
	* configure.ac: Add --with-insnemit-partitions.
	* genemit.cc (output_peephole2_scratches): Print to file instead
	of stdout.
	(print_code): Ditto.
	(gen_rtx_scratch): Ditto.
	(gen_exp): Ditto.
	(gen_emit_seq): Ditto.
	(emit_c_code): Ditto.
	(gen_insn): Ditto.
	(gen_expand): Ditto.
	(gen_split): Ditto.
	(output_add_clobbers): Ditto.
	(output_added_clobbers_hard_reg_p): Ditto.
	(print_overload_arguments): Ditto.
	(print_overload_test): Ditto.
	(handle_overloaded_code_for): Ditto.
	(handle_overloaded_gen): Ditto.
	(print_header): New function.
	(handle_arg): New function.
	(main): Split output into 10 files.
	* gensupport.cc (count_patterns): New function.
	* gensupport.h (count_patterns): Define.
	* read-md.cc (md_reader::print_md_ptr_loc): Add file argument.
	* read-md.h (class md_reader): Change definition.
2023-10-31 13:34:28 +01:00
Alexandre Oliva
15404016d9 hardcfr: support checking at abnormal edges [PR111943]
Control flow redundancy may choose abnormal edges for early checking,
but that breaks because we can't insert checks on such edges.

Introduce conditional checking on the dest block of abnormal edges,
and leave it for the optimizer to drop the conditional.


for  gcc/ChangeLog

	PR tree-optimization/111943
	* gimple-harden-control-flow.cc: Adjust copyright year.
	(rt_bb_visited): Add vfalse and vtrue data members.
	Zero-initialize them in the ctor.
	(rt_bb_visited::insert_exit_check_on_edge): Upon encountering
	abnormal edges, insert initializers for vfalse and vtrue on
	entry, and insert the check sequence guarded by a conditional
	in the dest block.

for  libgcc/ChangeLog

	* hardcfr.c: Adjust copyright year.

for  gcc/testsuite/ChangeLog

	PR tree-optimization/111943
	* gcc.dg/harden-cfr-pr111943.c: New.
2023-10-31 09:32:08 -03:00
Richard Biener
e3da1d7bb2 tree-optimization/112305 - SCEV cprop and conditional undefined overflow
The following adjusts final value replacement to also rewrite the
replacement to defined overflow behavior if there are conditionally
evaluated stmts (with possibly undefined overflow), not only when
we "folded casts".  The patch hooks into expression_expensive for
this.
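
As a source-level illustration of what "defined overflow behavior" means, the sketch below performs a signed addition in unsigned arithmetic, where wraparound is well defined. The function name is hypothetical, and the compiler applies this kind of rewrite internally on GIMPLE rather than on C source.

```c
#include <limits.h>

/* Signed addition carried out in unsigned arithmetic, where
   wraparound is well defined.  Converting the result back to int is
   implementation-defined in ISO C; GCC documents it as reduction
   modulo 2^N.  */
static int
add_defined_overflow (int a, int b)
{
  return (int) ((unsigned int) a + (unsigned int) b);
}
```

When compiled with GCC, add_defined_overflow (INT_MAX, 1) yields INT_MIN instead of invoking undefined behavior, so the expression is safe to evaluate even on paths where the original code would not have executed it.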

	PR tree-optimization/112305
	* tree-scalar-evolution.h (expression_expensive): Adjust.
	* tree-scalar-evolution.cc (expression_expensive): Record
	when we see a COND_EXPR.
	(final_value_replacement_loop): When the replacement contains
	a COND_EXPR, rewrite it to defined overflow.
	* tree-ssa-loop-ivopts.cc (may_eliminate_iv): Adjust.

	* gcc.dg/torture/pr112305.c: New testcase.
2023-10-31 13:10:04 +01:00
Iain Buclaw
1cf5dc05c6 d: Clean-up unused variable assignments after interface change
The lowering done for invoking `new' on a single dimension array was
moved from the code generator to the front-end semantic pass in
r14-4996.  This removes the detritus left behind in the code generator
from that deletion.

gcc/d/ChangeLog:

	* expr.cc (ExprVisitor::visit (NewExp *)): Remove unused assignments.
2023-10-31 12:30:52 +01:00
Xi Ruoyao
6bf2cebe2b LoongArch: Define HAVE_AS_TLS to 0 if it's undefined [PR112299]
Now that loongarch.md uses HAVE_AS_TLS, we need this to fix the failure
when building a cross compiler if the cross assembler is not installed yet.
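
The fallback pattern described by the ChangeLog entry below can be sketched as follows. The helper function is hypothetical and only shows that the macro can be used in an ordinary expression once the fallback is in place.

```c
/* Fall back to 0 when the configure check did not define the macro,
   for example because no cross assembler was available.  */
#ifndef HAVE_AS_TLS
#define HAVE_AS_TLS 0
#endif

/* Hypothetical helper, for illustration only: the macro can now be
   used in ordinary C expressions without an #ifdef guard.  */
static int
assembler_supports_tls (void)
{
  return HAVE_AS_TLS != 0;
}
```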

gcc/ChangeLog:

	PR target/112299
	* config/loongarch/loongarch-opts.h (HAVE_AS_TLS): Define to 0
	if not defined yet.
2023-10-31 14:23:18 +08:00
Lehua Ding
5ee961b6f2 RISC-V: Add assert of the number of vmerge in autovec cond testcases
This patch adds more asserts about the vmerge insns, which are intended
to ensure better performance for cond autovec.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/cond/cond_arith-1.c: Add vmerge assert.
	* gcc.target/riscv/rvv/autovec/cond/cond_arith-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_arith-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_arith-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_arith-5.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_arith-6.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_arith-7.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_arith-8.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_convert_float2float-rv32-1.c:
	Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_convert_float2float-rv32-2.c:
	Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_convert_float2float-rv64-1.c:
	Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_convert_float2float-rv64-2.c:
	Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int-rv32-1.c:
	Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int-rv32-2.c:
	Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int-rv64-1.c:
	Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int-rv64-2.c:
	Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv32-1.c:
	Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv32-2.c:
	Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv64-1.c:
	Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv64-2.c:
	Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2int-rv32-1.c:
	Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2int-rv32-2.c:
	Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2int-rv64-1.c:
	Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2int-rv64-2.c:
	Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_copysign-rv32gcv.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_copysign-rv64gcv.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fadd-1.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fadd-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fadd-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fadd-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-1.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-5.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-6.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-7.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-8.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma_run-5.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmax-1.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmax-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmax-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmax-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmin-1.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmin-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmin-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmin-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-1.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-5.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-6.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmul-1.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmul-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmul-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmul-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_logical_min_max-1.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_logical_min_max-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_logical_min_max-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_logical_min_max-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_logical_min_max-5.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_shift-1.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_shift-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_shift-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_shift-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_shift-5.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_shift-6.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_shift-7.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_shift-8.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_shift-9.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_unary-1.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_unary-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_unary-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_unary-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_unary-5.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_unary-6.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_unary-7.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_unary-8.c: Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_arith-10.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_arith-11.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_arith_run-10.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_arith_run-11.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmul-5.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_fmul_run-5.c: New test.
2023-10-31 14:15:57 +08:00
Lehua Ding
711d703d07 match.pd: Support combine cond_len_op + vec_cond similar to cond_op
This patch adds a fold that combines cond_len_op and vec_cond into a
single cond_len_op, as is already done for cond_op.

Consider this code (RISC-V target):
  void
  foo (uint8_t *__restrict x, uint8_t *__restrict y, uint8_t *__restrict z,
       uint8_t *__restrict pred, uint8_t *__restrict merged, int n)
  {
    for (int i = 0; i < n; ++i)
      x[i] = pred[i] != 1 ? y[i] / z[i] : merged[i];
  }

Before this patch:
  ...
  vect_iftmp.18_71 = .COND_LEN_DIV (mask__31.11_61, vect__5.14_65, vect__7.17_69, { 0, ... }, _86, 0);
  vect_iftmp.23_78 = .VCOND_MASK (mask__31.11_61, vect_iftmp.18_71, vect_iftmp.22_77);
  ...

After this patch:
  ...
  _30 = .COND_LEN_DIV (mask__31.16_61, vect__5.19_65, vect__7.22_69, vect_iftmp.27_77, _85, 0);
  ...
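
The fold shown above can be illustrated with a scalar model. This is only a
sketch, not GCC's implementation: cond_len_div, vcond_mask, fold_agrees and
NUNITS are hypothetical names, and the model assumes a lane is active when
its index is below len + bias and its mask bit is set, with the else value
used otherwise. It checks that the two-statement form and the fused form
agree on the first len lanes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUNITS 8  /* hypothetical vector length, used only in this sketch */

/* Scalar model of .COND_LEN_DIV (mask, a, b, els, len, bias).
   b[i] must be nonzero on every active lane.  */
static void
cond_len_div (uint8_t *out, const bool *mask, const uint8_t *a,
              const uint8_t *b, const uint8_t *els, size_t len, size_t bias)
{
  for (size_t i = 0; i < NUNITS; i++)
    out[i] = (i < len + bias && mask[i]) ? a[i] / b[i] : els[i];
}

/* Scalar model of .VCOND_MASK (mask, x, y).  */
static void
vcond_mask (uint8_t *out, const bool *mask, const uint8_t *x,
            const uint8_t *y)
{
  for (size_t i = 0; i < NUNITS; i++)
    out[i] = mask[i] ? x[i] : y[i];
}

/* Return true when the unfused and fused forms agree on lanes [0, len).  */
static bool
fold_agrees (const bool *mask, const uint8_t *a, const uint8_t *b,
             const uint8_t *merged, size_t len)
{
  uint8_t zero[NUNITS] = { 0 };
  uint8_t tmp[NUNITS], before[NUNITS], after[NUNITS];

  assert (len <= NUNITS);

  /* Before: COND_LEN_DIV with a zero else value, then VCOND_MASK.  */
  cond_len_div (tmp, mask, a, b, zero, len, 0);
  vcond_mask (before, mask, tmp, merged);

  /* After: COND_LEN_DIV with merged as the else value.  */
  cond_len_div (after, mask, a, b, merged, len, 0);

  for (size_t i = 0; i < len; i++)
    if (before[i] != after[i])
      return false;
  return true;
}
```

Lanes at or beyond len are deliberately not compared, since the sketch
treats them as don't-care.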

gcc/ChangeLog:

	* gimple-match.h (gimple_match_op::gimple_match_op):
	Add interfaces for more arguments.
	(gimple_match_op::set_op): Add interfaces for more arguments.
	* match.pd: Add support for combining cond_len_op + vec_cond.
2023-10-31 14:13:05 +08:00
Haochen Jiang
9cc2b97458 Fix incorrect option mask and avx512cd target push
gcc/ChangeLog:

	* config/i386/avx512cdintrin.h (target): Push evex512 for
	avx512cd.
	* config/i386/avx512vlintrin.h (target): Split avx512cdvl part
	out from avx512vl.
	* config/i386/i386-builtin.def (BDESC): Do not check evex512
	for builtins not needed.
2023-10-31 13:42:15 +08:00
Lehua Ding
5ee894130f RISC-V: Add the missed combine of [u]int64 -> _Float16 and vcond
Hi,

This patch splits the INT64 to FP16 conversion into two smaller
conversions (INT64 -> FP32 and FP32 -> FP16) at expand time instead of
delaying the split to the split1 pass. This makes it possible to combine
the FP32 to FP16 conversion with the vcond pattern, so we don't need to
add a combine pattern for INT64 to FP16 and vcond.

Consider this code:
  void
  foo (_Float16 *__restrict r, int64_t *__restrict a, _Float16 *__restrict b,
       int64_t *__restrict pred, int n)
  {
    for (int i = 0; i < n; i += 1)
      {
        r[i] = pred[i] ? (_Float16) a[i] : b[i];
      }
  }

Before this patch:
  ...
  vfncvt.f.f.w    v2,v2
  vmerge.vvm      v1,v1,v2,v0
  vse16.v v1,0(a0)
  ...

After this patch:
  ...
  vfncvt.f.f.w    v1,v2,v0.t
  vse16.v v1,0(a0)
  ...

gcc/ChangeLog:

	* config/riscv/autovec.md (<float_cvt><mode><vnnconvert>2):
	Change to define_expand.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv32-1.c:
	Add vfncvt.f.f.w assert.
	* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv32-2.c:
	Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv64-1.c:
	Ditto.
	* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv64-2.c:
	Ditto.
2023-10-31 11:45:46 +08:00
liuhongt
f5d33d0c79 Fix wrong code due to incorrect define_split
-(define_split
-  [(set (match_operand:V2HI 0 "register_operand")
-        (eq:V2HI
-          (eq:V2HI
-            (us_minus:V2HI
-              (match_operand:V2HI 1 "register_operand")
-              (match_operand:V2HI 2 "register_operand"))
-            (match_operand:V2HI 3 "const0_operand"))
-          (match_operand:V2HI 4 "const0_operand")))]
-  "TARGET_SSE4_1"
-  [(set (match_dup 0)
-        (umin:V2HI (match_dup 1) (match_dup 2)))
-   (set (match_dup 0)
-        (eq:V2HI (match_dup 0) (match_dup 2)))])

The splitter is wrong when op1 == op2: the original pattern returns 0,
but after the split it returns 1. So remove the splitter.
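
The mismatch can be reproduced with a one-lane scalar model. This is a
sketch under assumptions: the helper names are hypothetical, a single
16-bit lane stands in for V2HI, and a vector compare is modelled as
yielding all-ones for true and 0 for false.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned saturating subtraction on one 16-bit lane (us_minus).  */
static uint16_t
us_minus16 (uint16_t a, uint16_t b)
{
  return a > b ? (uint16_t) (a - b) : 0;
}

/* Lane compare: all-ones when equal, 0 otherwise.  */
static uint16_t
eq16 (uint16_t a, uint16_t b)
{
  return a == b ? 0xffff : 0;
}

static uint16_t
umin16 (uint16_t a, uint16_t b)
{
  return a < b ? a : b;
}

/* Original pattern: eq (eq (us_minus (a, b), 0), 0), true iff a > b.  */
static uint16_t
original_pattern (uint16_t a, uint16_t b)
{
  return eq16 (eq16 (us_minus16 (a, b), 0), 0);
}

/* Removed splitter: eq (umin (a, b), b), true iff a >= b.  */
static uint16_t
split_pattern (uint16_t a, uint16_t b)
{
  return eq16 (umin16 (a, b), b);
}

/* The two forms disagree exactly when the operands are equal.  */
static void
check_lane (uint16_t a, uint16_t b)
{
  assert ((original_pattern (a, b) != split_pattern (a, b)) == (a == b));
}
```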

Also extend another define_split to define_insn_and_split to handle
below pattern

(set (reg:V4QI 112)
    (unspec:V4QI [
            (subreg:V4QI (reg:V2HF 111 [ bf ]) 0)
            (subreg:V4QI (reg:V2HF 110 [ af ]) 0)
            (subreg:V4QI (eq:V2HI (eq:V2HI (reg:V2HI 105)
                        (const_vector:V2HI [
                                (const_int 0 [0]) repeated x2
                            ]))
                    (const_vector:V2HI [
                            (const_int 0 [0]) repeated x2
                        ])) 0)
        ] UNSPEC_BLENDV))

define_split doesn't work since pass_combine assumes it produces at
most 2 insns after split, but here it produces 3 since we need to move
const0_rtx (V2HImode) to a register. The move insn can be eliminated later.

gcc/ChangeLog:

	PR target/112276
	* config/i386/mmx.md (*mmx_pblendvb_v8qi_1): Change
	define_split to define_insn_and_split to handle
	immediate_operand for comparison.
	(*mmx_pblendvb_v8qi_2): Ditto.
	(*mmx_pblendvb_<mode>_1): Ditto.
	(*mmx_pblendvb_v4qi_2): Ditto.
	(<code><mode>3): Remove define_split after it.
	(<code>v8qi3): Ditto.
	(<code><mode>3): Ditto.
	(<code>v2hi3): Ditto.

gcc/testsuite/ChangeLog:

	* g++.target/i386/part-vect-vcondhf.C: Adjust testcase.
	* gcc.target/i386/pr112276.c: New test.
2023-10-31 11:24:45 +08:00
Andrew Pinski
541b754c77 MATCH: Add some more value_replacement simplifications to match
This moves a few more value_replacements simplifications to match.
/* a == 1 ? b : a * b -> a * b */
/* a == 1 ? b : b / a  -> b / a */
/* a == -1 ? b : a & b -> a & b */

Also adds a testcase to show that we can catch these where value_replacement
would not (but other passes would).
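
Why these rewrites are safe can be checked with a small C sketch. The
function names are hypothetical and only spell out both sides of each
pattern; unsigned operands are used for the multiply and divide cases to
keep the sketch free of overflow concerns.

```c
#include <assert.h>

/* a == 1 ? b : a * b  ->  a * b */
static unsigned
mul_before (unsigned a, unsigned b)
{
  return a == 1 ? b : a * b;
}

static unsigned
mul_after (unsigned a, unsigned b)
{
  return a * b;
}

/* a == 1 ? b : b / a  ->  b / a  (a must be nonzero in both forms) */
static unsigned
div_before (unsigned a, unsigned b)
{
  assert (a != 0);
  return a == 1 ? b : b / a;
}

static unsigned
div_after (unsigned a, unsigned b)
{
  assert (a != 0);
  return b / a;
}

/* a == -1 ? b : a & b  ->  a & b */
static int
and_before (int a, int b)
{
  return a == -1 ? b : (a & b);
}

static int
and_after (int a, int b)
{
  return a & b;
}
```

In each case the special value of a is the identity element of the
operation, so the conditional arm equals what the unconditional
expression would compute anyway.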

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

	* match.pd (`a == 1 ? b : a OP b`): New pattern.
	(`a == -1 ? b : a & b`): New pattern.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/phi-opt-value-4.c: New test.
2023-10-30 19:15:25 -07:00
Andrew Pinski
598fdb5290 MATCH: first of the value replacement moving from phiopt
This moves a few simple patterns that are done in value replacement
in phiopt over to match.pd. Just the simple ones which might show up
in other code.

This allows some optimizations to happen even without depending
on sinking from happening and in some cases where phiopt is not
invoked (cond-1.c is an example there).

Changes since v1:
* v2: Add an extra testcase to showcase improvements at -O1.

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

	* match.pd (`a == 0 ? b : b + a`,
	`a == 0 ? b : b - a`): New patterns.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/cond-1.c: New test.
	* gcc.dg/tree-ssa/phi-opt-value-1.c: New test.
	* gcc.dg/tree-ssa/phi-opt-value-1a.c: New test.
	* gcc.dg/tree-ssa/phi-opt-value-2.c: New test.
2023-10-30 19:15:25 -07:00
GCC Administrator
a5c157b95a Daily bump. 2023-10-31 00:17:32 +00:00
Mayshao
94c0b26f45 i386: Zhaoxin yongfeng enablement
Enable -march/-mtune=yongfeng. Costs and tunings are set according
to the characteristics of the processor. Add a new .md file to describe
yongfeng processor.

gcc/ChangeLog:

	* common/config/i386/cpuinfo.h (get_zhaoxin_cpu): Recognize yongfeng.
	* common/config/i386/i386-common.cc: Add yongfeng.
	* common/config/i386/i386-cpuinfo.h (enum processor_subtypes):
	Add ZHAOXIN_FAM7H_YONGFENG.
	* config.gcc: Add yongfeng.
	* config/i386/driver-i386.cc (host_detect_local_cpu):
	Let -march=native recognize yongfeng processors.
	* config/i386/i386-c.cc (ix86_target_macros_internal): Add yongfeng.
	* config/i386/i386-options.cc (m_YONGFENG): New definition.
	(m_ZHAOXIN): Ditto.
	* config/i386/i386.h (enum processor_type): Add PROCESSOR_YONGFENG.
	* config/i386/i386.md: Add yongfeng.
	* config/i386/lujiazui.md: Fix typo.
	* config/i386/x86-tune-costs.h (struct processor_costs):
	Add yongfeng costs.
	* config/i386/x86-tune-sched.cc (ix86_issue_rate): Add yongfeng.
	(ix86_adjust_cost): Ditto.
	* config/i386/x86-tune.def (X86_TUNE_SCHEDULE): Replace
	m_LUJIAZUI with m_ZHAOXIN.
	(X86_TUNE_PARTIAL_REG_DEPENDENCY): Ditto.
	(X86_TUNE_SSE_PARTIAL_REG_DEPENDENCY): Ditto.
	(X86_TUNE_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY): Ditto.
	(X86_TUNE_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY): Ditto.
	(X86_TUNE_MOVX): Ditto.
	(X86_TUNE_MEMORY_MISMATCH_STALL): Ditto.
	(X86_TUNE_FUSE_CMP_AND_BRANCH_32): Ditto.
	(X86_TUNE_FUSE_CMP_AND_BRANCH_64): Ditto.
	(X86_TUNE_FUSE_CMP_AND_BRANCH_SOFLAGS): Ditto.
	(X86_TUNE_FUSE_ALU_AND_BRANCH): Ditto.
	(X86_TUNE_ACCUMULATE_OUTGOING_ARGS): Ditto.
	(X86_TUNE_USE_LEAVE): Ditto.
	(X86_TUNE_PUSH_MEMORY): Ditto.
	(X86_TUNE_LCP_STALL): Ditto.
	(X86_TUNE_INTEGER_DFMODE_MOVES): Ditto.
	(X86_TUNE_OPT_AGU): Ditto.
	(X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB): Ditto.
	(X86_TUNE_MISALIGNED_MOVE_STRING_PRO_EPILOGUES): Ditto.
	(X86_TUNE_USE_SAHF): Ditto.
	(X86_TUNE_USE_BT): Ditto.
	(X86_TUNE_AVOID_FALSE_DEP_FOR_BMI): Ditto.
	(X86_TUNE_ONE_IF_CONV_INSN): Ditto.
	(X86_TUNE_AVOID_MFENCE): Ditto.
	(X86_TUNE_EXPAND_ABS): Ditto.
	(X86_TUNE_USE_SIMODE_FIOP): Ditto.
	(X86_TUNE_USE_FFREEP): Ditto.
	(X86_TUNE_EXT_80387_CONSTANTS): Ditto.
	(X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL): Ditto.
	(X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL): Ditto.
	(X86_TUNE_SSE_TYPELESS_STORES): Ditto.
	(X86_TUNE_SSE_LOAD0_BY_PXOR): Ditto.
	(X86_TUNE_USE_GATHER_2PARTS): Add m_YONGFENG.
	(X86_TUNE_USE_GATHER_4PARTS): Ditto.
	(X86_TUNE_USE_GATHER_8PARTS): Ditto.
	(X86_TUNE_AVOID_128FMA_CHAINS): Ditto.
	* doc/extend.texi: Add details about yongfeng.
	* doc/invoke.texi: Ditto.
	* config/i386/yongfeng.md: New file to describe yongfeng processor.

gcc/testsuite/ChangeLog:

	* g++.target/i386/mv32.C: Handle new -march.
	* gcc.target/i386/funcspec-56.inc: Ditto.
2023-10-30 22:20:01 +01:00
François Dumont
6504b4a498 libstdc++: [_GLIBCXX_INLINE_VERSION] Add comment on emul TLS symbols
libstdc++-v3/ChangeLog:

	* config/abi/pre/gnu-versioned-namespace.ver: Add comment on recently
	added emul TLS symbols.
2023-10-30 22:07:49 +01:00
François Dumont
5ea11700e5 libstdc++: [_GLIBCXX_INLINE_VERSION] Un-weak handle_contract_violation
libstdc++-v3/ChangeLog:

	* src/experimental/contract.cc
	[_GLIBCXX_INLINE_VERSION](handle_contract_violation): Rework comment.
	Remove weak attribute.
2023-10-30 21:49:31 +01:00
Iain Sandoe
434975cb1b configure, fixincludes: Add change missed in r14-4825.
This corrects an oversight in the r14-4825 commit.

fixincludes/ChangeLog:

	* configure: Regenerate.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
2023-10-30 19:05:00 +00:00
Martin Jambor
997c8219f0 ipa: Prune any IPA-CP aggregate constants known by modref to be killed (111157)
PR 111157 shows that IPA-modref and IPA-CP (when plugged into value
numbering) can optimize out a store both before a call (because the
call will overwrite it) and in the call (because the store is of the
same value) and by eliminating both create miscompilation.

This patch fixes that by pruning from the list of IPA-CP aggregate
value constants any constant whose memory contents IPA-modref knows can
be "killed."  Unfortunately, doing so is tricky.  First, IPA-modref
loads override kills and so only stores not loaded are truly not
necessary.  Looking stuff up there means doing most of what
modref_may_alias does, but doing exactly what it does is tricky
because it also takes aliasing into account and has bail-out counters.

To err on the side of caution in order to avoid this miscompilation we
have to prune a constant when in doubt.  However, pruning can
interfere with the mechanism of how clone materialization
distinguishes between the cases when a parameter was entirely removed
and when it was both IPA-CPed and IPA-SRAed (in order to make up for
the removal in debug info, which can bump into an assert when
compiling g++.dg/torture/pr103669.C when we are not careful).

Therefore this patch:

  1) marks constants that IPA-modref has in its kill list with a new
     "killed" flag, and
  2) prunes the list from entries with this flag after materialization
     and IPA-CP transformation is done using the template introduced in
     the previous patch

It does not try to look up anything in the load lists, this will be
done as a follow-up in order to ease review.
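
The two steps above can be sketched in plain C. This is an illustration
only and not the code in ipa-modref.cc or ipa-prop.cc: struct agg_value,
struct kill_entry, mark_killed and prune_killed are hypothetical names, and
an exact (index, offset) match stands in for the real overlap check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an IPA-CP aggregate constant record.  */
struct agg_value
{
  unsigned index;        /* parameter index */
  unsigned unit_offset;  /* offset within the aggregate */
  long value;            /* the known constant */
  bool killed;           /* set when the callee is known to overwrite it */
};

/* Hypothetical stand-in for one entry of a kill list.  */
struct kill_entry
{
  unsigned index;
  unsigned unit_offset;
};

/* Step 1: flag every value that appears in the kill list.  */
static void
mark_killed (struct agg_value *vals, size_t n,
             const struct kill_entry *kills, size_t nkills)
{
  assert (vals != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    for (size_t k = 0; k < nkills; k++)
      if (vals[i].index == kills[k].index
          && vals[i].unit_offset == kills[k].unit_offset)
        vals[i].killed = true;
}

/* Step 2: compact the array in place, dropping flagged entries.
   Returns the number of entries kept.  */
static size_t
prune_killed (struct agg_value *vals, size_t n)
{
  size_t kept = 0;
  for (size_t i = 0; i < n; i++)
    if (!vals[i].killed)
      vals[kept++] = vals[i];
  return kept;
}
```

Separating marking from pruning mirrors the ordering constraint described
above: the flagged entries stay visible until the later pruning step.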

gcc/ChangeLog:

2023-10-27  Martin Jambor  <mjambor@suse.cz>

	PR ipa/111157
	* ipa-prop.h (struct ipa_argagg_value): New flag killed.
	* ipa-modref.cc (ipcp_argagg_and_kill_overlap_p): New function.
	(update_signature): Mark any IPA-CP aggregate constants at
	positions known to be killed as killed.  Move check that there is
	clone_info after this pruning.
	* ipa-cp.cc (ipa_argagg_value_list::dump): Dump the killed flag.
	(ipa_argagg_value_list::push_adjusted_values): Clear the new flag.
	(push_agg_values_from_plats): Likewise.
	(ipa_push_agg_values_from_jfunc): Likewise.
	(estimate_local_effects): Likewise.
	(push_agg_values_for_index_from_edge): Likewise.
	* ipa-prop.cc (write_ipcp_transformation_info): Stream the killed
	flag.
	(read_ipcp_transformation_info): Likewise.
	(ipcp_get_aggregate_const): Update comment, assert that encountered
	record does not have killed flag set.
	(ipcp_transform_function): Prune all aggregate constants with killed
	set.

gcc/testsuite/ChangeLog:

2023-09-18  Martin Jambor  <mjambor@suse.cz>

	PR ipa/111157
	* gcc.dg/lto/pr111157_0.c: New test.
	* gcc.dg/lto/pr111157_1.c: Second file of the same new test.
2023-10-30 18:36:54 +01:00
Martin Jambor
1437df40f1 ipa-cp: Templatize filtering of m_agg_values
PR 111157 points to another place where IPA-CP collected aggregate
compile-time constants need to be filtered, in addition to the one
place that already does this in ipa-sra.  In order to re-use code,
this patch turns the common bit into a template.

The functionality is still covered by testcase gcc.dg/ipa/pr108959.c.

gcc/ChangeLog:

2023-09-13  Martin Jambor  <mjambor@suse.cz>

	PR ipa/111157
	* ipa-prop.h (ipcp_transformation): New member function template
	remove_argaggs_if.
	* ipa-sra.cc (zap_useless_ipcp_results): Use remove_argaggs_if to
	filter aggregate constants.
2023-10-30 18:36:40 +01:00
Patrick O'Neill
68880e4053 RISC-V: Make rv32i_zcmp testcase more robust
GCC recently changed its register allocator which causes this
testcase to fail.
This patch updates the regex to be more robust to change by accepting
any s register in the range of 1-9 for cm.push and cm.popret insns.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rv32i_zcmp.c: Accept any register in the
	range of 1-9 for cm.push and cm.popret insns.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2023-10-30 09:58:54 -07:00
Roger Sayle
a3da9adeb4 ARC: Convert (signed<<31)>>31 to -(signed&1) without barrel shifter.
This patch optimizes PR middle-end/101955 for the ARC backend.  On ARC
CPUs with a barrel shifter, using two shifts is optimal as:

        asl_s   r0,r0,31
        asr_s   r0,r0,31

but without a barrel shifter, GCC -O2 -mcpu=em currently generates:

        and     r2,r0,1
        ror     r2,r2
        add.f   0,r2,r2
        sbc     r0,r0,r0

with this patch, we now generate the smaller, faster and non-flags
clobbering:

        bmsk_s  r0,r0,0
        neg_s   r0,r0

2023-10-30  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	PR middle-end/101955
	* config/arc/arc.md (*extvsi_1_0): New define_insn_and_split
	to convert sign extract of the least significant bit into an
	AND $1 then a NEG when !TARGET_BARREL_SHIFTER.

gcc/testsuite/ChangeLog
	PR middle-end/101955
	* gcc.target/arc/pr101955.c: New test case.
2023-10-30 16:21:28 +00:00
Roger Sayle
31cc9824d1 ARC: Improved ARC rtx_costs/insn_cost for SHIFTs and ROTATEs.
This patch overhauls the ARC backend's insn_cost target hook, and makes
some related improvements to rtx_costs, BRANCH_COST, etc.  The primary
goal is to allow the backend to indicate that shifts and rotates are
slow (discouraged) when the CPU doesn't have a barrel shifter. I should
also acknowledge Richard Sandiford for inspiring the use of set_cost
in this rewrite of arc_insn_cost; this implementation borrows heavily
from the target hooks for AArch64 and ARM.

The motivating example is derived from PR rtl-optimization/110717.

struct S { int a : 5; };
unsigned int foo (struct S *p) {
  return p->a;
}

With a barrel shifter, GCC -O2 generates the reasonable:

foo:    ldb_s   r0,[r0]
        asl_s   r0,r0,27
        j_s.d   [blink]
        asr_s   r0,r0,27

What's interesting is that during combine, the middle-end actually
has two shifts by three bits, and a sign-extension from QI to SI.

Trying 8, 9 -> 11:
    8: r158:SI=r157:QI#0<<0x3
      REG_DEAD r157:QI
    9: r159:SI=sign_extend(r158:SI#0)
      REG_DEAD r158:SI
   11: r155:SI=r159:SI>>0x3
      REG_DEAD r159:SI

Whilst it's reasonable to simplify this to two shifts by 27 bits when
the CPU has a barrel shifter, it's actually a significant pessimization
when these shifts are implemented by loops.  This combination can be
prevented if the backend provides accurate-ish estimates for insn_cost.

Previously, without a barrel shifter, GCC -O2 -mcpu=em generates:

foo:	ldb_s   r0,[r0]
        mov     lp_count,27
        lp      2f
        add     r0,r0,r0
        nop
2:      # end single insn loop
        mov     lp_count,27
        lp      2f
        asr     r0,r0
        nop
2:      # end single insn loop
        j_s     [blink]

which contains two loops and requires about 113 cycles to execute.
With this patch to rtx_cost/insn_cost, GCC -O2 -mcpu=em generates:

foo:	ldb_s   r0,[r0]
        mov_s   r2,0    ;3
        add3    r0,r2,r0
        sexb_s  r0,r0
        asr_s   r0,r0
        asr_s   r0,r0
        j_s.d   [blink]
        asr_s   r0,r0

which requires only ~6 cycles, for the shorter shifts by 3 and sign
extension.
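
The two code sequences compute the same value, which a short C sketch can
confirm. The function names are hypothetical, the 5-bit field is assumed to
sit in the low bits of the loaded byte, and the sketch relies on GCC's
arithmetic right shift and modular narrowing conversions for signed types.

```c
#include <assert.h>
#include <stdint.h>

/* Barrel-shifter form: shift left by 27, then arithmetic shift right
   by 27.  */
static int32_t
extract_shift27 (uint8_t byte)
{
  return (int32_t) ((uint32_t) byte << 27) >> 27;
}

/* Short-shift form: shift left by 3, sign-extend from 8 bits, then
   arithmetic shift right by 3 (add3, sexb_s and three asr_s above).  */
static int32_t
extract_short_shifts (uint8_t byte)
{
  int8_t narrowed = (int8_t) (uint8_t) (byte << 3);
  return (int32_t) narrowed >> 3;
}

/* Both forms must agree for every possible byte value.  */
static void
check_all_bytes (void)
{
  for (unsigned b = 0; b < 256; b++)
    assert (extract_shift27 ((uint8_t) b)
            == extract_short_shifts ((uint8_t) b));
}
```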

2023-10-30  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/arc/arc.cc (arc_rtx_costs): Improve cost estimates.
	Provide reasonable values for SHIFTS and ROTATES by constant
	bit counts depending upon TARGET_BARREL_SHIFTER.
	(arc_insn_cost): Use insn attributes if the instruction is
	recognized.  Avoid calling get_attr_length for type "multi",
	i.e. define_insn_and_split patterns without explicit type.
	Fall-back to set_rtx_cost for single_set and pattern_cost
	otherwise.
	* config/arc/arc.h (COSTS_N_BYTES): Define helper macro.
	(BRANCH_COST): Improve/correct definition.
	(LOGICAL_OP_NON_SHORT_CIRCUIT): Preserve previous behavior.
2023-10-30 16:17:42 +00:00
Roger Sayle
d24c3c5334 ARC: Improved SImode shifts and rotates with -mswap.
This patch improves the code generated by the ARC back-end for CPUs
without a barrel shifter but with -mswap.  The -mswap option provides
a SWAP instruction that implements SImode rotations by 16, but also
logical shift instructions (left and right) by 16 bits.  Clearly these
are also useful building blocks for implementing shifts by 17, 18, etc.
which would otherwise require a loop.

As a representative example:
int shl20 (int x) { return x << 20; }

GCC with -O2 -mcpu=em -mswap would previously generate:

shl20:  mov     lp_count,10
        lp      2f
        add     r0,r0,r0
        add     r0,r0,r0
2:      # end single insn loop
        j_s     [blink]

with this patch we now generate:

shl20:  mov_s   r2,0    ;3
        lsl16   r0,r0
        add3    r0,r2,r0
        j_s.d   [blink]
        asl_s r0,r0

Although both are four instructions (excluding the j_s),
the original takes ~22 cycles, and the replacement ~4 cycles.
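
How the new sequence adds up to a shift by 20 can be traced in C. This is
a sketch with hypothetical function names; it models add3 as adding its
second source shifted left by 3 to its first source, which is zero here.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: a plain shift left by 20.  */
static uint32_t
shl20_direct (uint32_t x)
{
  return x << 20;
}

/* Step-by-step model of the generated sequence: 16 + 3 + 1 = 20.  */
static uint32_t
shl20_composed (uint32_t x)
{
  uint32_t r2 = 0;        /* mov_s r2,0 */
  uint32_t r0 = x << 16;  /* lsl16 r0,r0 */
  r0 = r2 + (r0 << 3);    /* add3 r0,r2,r0 */
  return r0 << 1;         /* asl_s r0,r0 */
}

static void
check_shl20 (uint32_t x)
{
  assert (shl20_direct (x) == shl20_composed (x));
}
```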

2023-10-30  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/arc/arc.cc (arc_split_ashl): Use lsl16 on TARGET_SWAP.
	(arc_split_ashr): Use swap and sign-extend on TARGET_SWAP.
	(arc_split_lshr): Use lsr16 on TARGET_SWAP.
	(arc_split_rotl): Use swap on TARGET_SWAP.
	(arc_split_rotr): Likewise.
	* config/arc/arc.md (ANY_ROTATE): New code iterator.
	(<ANY_ROTATE>si2_cnt16): New define_insn for alternate form of
	swap instruction on TARGET_SWAP.
	(ashlsi2_cnt16): Rename from *ashlsi16_cnt16 and move earlier.
	(lshrsi2_cnt16): New define_insn for LSR16 instruction.
	(*ashlsi2_cnt16): See above.

gcc/testsuite/ChangeLog
	* gcc.target/arc/lsl16-1.c: New test case.
	* gcc.target/arc/lsr16-1.c: Likewise.
	* gcc.target/arc/swap-1.c: Likewise.
	* gcc.target/arc/swap-2.c: Likewise.
2023-10-30 16:12:30 +00:00
Richard Ball
fb1941d08f arm: move the switch tables for Arm to the RO data section.
Follow-up patch to "arm: Use deltas for Arm switch tables".
This patch moves the switch tables for Arm from the .text section
into the .rodata section.

gcc/ChangeLog:

	* config/arm/aout.h: Change to use the Lrtx label.
	* config/arm/arm.h (CASE_VECTOR_PC_RELATIVE): Remove arm targets
	from (!target_pure_code) condition.
	(ADDR_VEC_ALIGN): Add align for tables in rodata section.
	* config/arm/arm.cc (arm_output_casesi): Alter the function to include
	.Lrtx label and remove adr instructions.
	* config/arm/arm.md
	(arm_casesi_internal): Use force_reg to generate ldr instructions that
	would otherwise be out of range, and change rtl to accommodate force reg.
	Additionally remove unnecessary register temp.
	(casesi): Remove pure code check for Arm.
	* config/arm/elf.h (JUMP_TABLES_IN_TEXT_SECTION): Remove arm
	targets from JUMP_TABLES_IN_TEXT_SECTION definition.

gcc/testsuite/ChangeLog:

	* gcc.target/arm/arm-switchstatement.c: Alter the tests to
	change adr instruction to ldr.
2023-10-30 15:31:26 +00:00
Francois-Xavier Coudert
7666d94db0 Testsuite, i386: Mark test as requiring ifunc
Test is currently failing on x86_64-apple-darwin.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr105554.c: Require ifunc.
2023-10-30 15:57:33 +01:00
Francois-Xavier Coudert
89e97f655d Testsuite, Darwin: Fix trampoline warning
Heap-based trampolines are enabled on darwin20 and later,
meaning that no warning is emitted.

gcc/testsuite/ChangeLog:

	* gcc.dg/Wtrampolines.c: Skip on darwin20 and later.
2023-10-30 14:45:47 +01:00
Francois-Xavier Coudert
5c7bbb0fcd Testsuite, i386: Fix test by passing -march
The test currently fails on Darwin, where the default arch is core2.

gcc/testsuite/ChangeLog:

	PR target/112287
	* gcc.target/i386/pr111698.c: Pass -march=sandybridge.
2023-10-30 12:50:01 +01:00
Francois-Xavier Coudert
a0c557690c Testsuite, Darwin: skip PIE test
gcc/testsuite/ChangeLog:

	* gcc.dg/pie-2.c: Skip test on darwin.
2023-10-30 12:41:17 +01:00
Jeevitha
36a52cdc23 rs6000: Change bitwise xor to an equality operator [PR106907]
PR106907 has a few warnings spotted by cppcheck. These warnings
ask for precedence clarification. The bitwise xor has been replaced
with an equality check, which achieves the same result.
Additionally, comment indentation has been fixed.
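
For boolean operands the rewrite is a plain logical identity, as this C
sketch shows. The operand names swapped and big_endian are hypothetical
and are not claimed to match the expression in rs6000.cc.

```c
#include <assert.h>
#include <stdbool.h>

/* Xor form: needs the reader to know that ! binds tighter than ^.  */
static bool
with_xor (bool swapped, bool big_endian)
{
  return swapped ^ !big_endian;
}

/* Equality form: same truth table, no precedence question.  */
static bool
with_equality (bool swapped, bool big_endian)
{
  return swapped == big_endian;
}

/* Exhaustively compare both forms over all four input combinations.  */
static void
check_all_inputs (void)
{
  for (int s = 0; s <= 1; s++)
    for (int b = 0; b <= 1; b++)
      assert (with_xor (s != 0, b != 0) == with_equality (s != 0, b != 0));
}
```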

2023-10-11  Jeevitha Palanisamy  <jeevitha@linux.ibm.com>

gcc/
	PR target/106907
	* config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Change bitwise
	xor to an equality and fix comment indentation.
2023-10-30 05:38:19 -05:00
Richard Biener
ff4cea05a6 PR testsuite/111462 - add powerpc64le to list of ssa-sink-18.c XFAIL
PR testsuite/111462
gcc/testsuite/
	* gcc.dg/tree-ssa/ssa-sink-18.c: XFAIL also powerpc64le.
2023-10-30 11:03:03 +01:00
Juzhe-Zhong
eb1cdb3e43 RISC-V: Fix bugs of handling scalar of SEW64 vx instruction in RV32
sew64_scalar_helper is handling SEW64 vx instruction pattern on RV32 system.
According to RVV ISA, we can directly use vx instruction of SEW64 on RV32 system
since RV32 GR reg is 32bit.

Consider the following case:

vsetvl e64m1
vadd.vx v,v,x

will be transformed by sew64_scalar_helper into:

vsetvl e64m1
sw
sw
vlse v
vadd.vv

This bug was reported by Robin.
(insn 143 179 230 9 (set (reg:SI 15 a5 [234])
        (unspec:SI [
                (const_int 64 [0x40])
            ] UNSPEC_VLMAX)) 751 {vlmax_avlsi}
     (expr_list:REG_EQUIV (unspec:SI [
                (const_int 64 [0x40])
            ] UNSPEC_VLMAX)
        (nil)))
(insn 230 143 78 9 (parallel [
            (set (reg:SI 66 vl)
                (unspec:SI [
                        (reg:SI 15 a5 [234])
                        (const_int 64 [0x40])
                        (const_int 0 [0])
                    ] UNSPEC_VSETVL))
            (set (reg:SI 67 vtype)
                (unspec:SI [
                        (const_int 64 [0x40])
                        (const_int 0 [0])
                        (const_int 1 [0x1]) repeated x2
                    ] UNSPEC_VSETVL))
        ]) "bug.c":14:14 discrim 1 1469 {vsetvl_discard_resultsi}
     (nil))
(insn 78 230 84 9 (set (reg:RVVM1DI 102 v6 [203])
        (if_then_else:RVVM1DI (unspec:RVVMF64BI [
                    (const_vector:RVVMF64BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (const_int 0 [0])
                    (const_int 2 [0x2]) repeated x2
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (vec_duplicate:RVVM1DI (mem/u/c:DI (reg/f:SI 29 t4 [230]) [0  S8 A64]))
            (unspec:RVVM1DI [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))) "bug.c":14:14 discrim 1 1872 {*pred_broadcastrvvm1di}
     (expr_list:REG_DEAD (reg/f:SI 29 t4 [230])
        (nil)))

The root cause is that we missed VLMAX handling, since the code was written
a long time ago (callers were always intrinsics code, with no VLMAX situation).

Now, all the following failures are fixed by this patch:

FAIL: gcc.target/riscv/rvv/autovec/unop/popcount-run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/unop/popcount-run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/unop/popcount-run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/unop/popcount-run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/unop/popcount-run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/unop/popcount-run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/unop/popcount-run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/unop/popcount-run-1.c execution test

gcc/ChangeLog:

	* config/riscv/riscv-protos.h (sew64_scalar_helper): Fix bug.
	* config/riscv/riscv-v.cc (sew64_scalar_helper): Ditto.
	* config/riscv/vector.md: Ditto.
2023-10-30 15:48:29 +08:00
Paul Thomas
f3e44d0797 Fortran: Fix a problem with SELECT TYPE selectors [PR104555].
2023-10-30  Paul Thomas  <pault@gcc.gnu.org>

gcc/fortran
	PR fortran/104555
	* resolve.cc (resolve_select_type): If the selector expression
	has no class component references and the expression is a
	derived type, copy the typespec of the symbol to that of the
	expression.

gcc/testsuite/
	PR fortran/104555
	* gfortran.dg/pr104555.f90: New test.
2023-10-30 07:12:40 +00:00