Add testsuite infrastructure for the A extension and use it to require the A
extension for dg-do run tests and to add the A extension for non-A dg-do
compile tests.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/amo-table-a-6-amo-add-1.c: Add A extension to
dg-options for dg-do compile.
* gcc.target/riscv/amo-table-a-6-amo-add-2.c: Ditto.
* gcc.target/riscv/amo-table-a-6-amo-add-3.c: Ditto.
* gcc.target/riscv/amo-table-a-6-amo-add-4.c: Ditto.
* gcc.target/riscv/amo-table-a-6-amo-add-5.c: Ditto.
* gcc.target/riscv/amo-table-a-6-compare-exchange-1.c: Ditto.
* gcc.target/riscv/amo-table-a-6-compare-exchange-2.c: Ditto.
* gcc.target/riscv/amo-table-a-6-compare-exchange-3.c: Ditto.
* gcc.target/riscv/amo-table-a-6-compare-exchange-4.c: Ditto.
* gcc.target/riscv/amo-table-a-6-compare-exchange-5.c: Ditto.
* gcc.target/riscv/amo-table-a-6-compare-exchange-6.c: Ditto.
* gcc.target/riscv/amo-table-a-6-compare-exchange-7.c: Ditto.
* gcc.target/riscv/amo-table-a-6-subword-amo-add-1.c: Ditto.
* gcc.target/riscv/amo-table-a-6-subword-amo-add-2.c: Ditto.
* gcc.target/riscv/amo-table-a-6-subword-amo-add-3.c: Ditto.
* gcc.target/riscv/amo-table-a-6-subword-amo-add-4.c: Ditto.
* gcc.target/riscv/amo-table-a-6-subword-amo-add-5.c: Ditto.
* gcc.target/riscv/inline-atomics-2.c: Ditto.
* gcc.target/riscv/inline-atomics-3.c: Require A extension for dg-do
run.
* gcc.target/riscv/inline-atomics-4.c: Ditto.
* gcc.target/riscv/inline-atomics-5.c: Ditto.
* gcc.target/riscv/inline-atomics-6.c: Ditto.
* gcc.target/riscv/inline-atomics-7.c: Ditto.
* gcc.target/riscv/inline-atomics-8.c: Ditto.
* lib/target-supports.exp: Add testing infrastructure to require the A
extension or add it to an existing -march.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
Non-atomic targets are currently prevented from using the optimized fencing for
seq_cst load/seq_cst store. This patch removes that constraint.
gcc/ChangeLog:
* config/riscv/sync-rvwmo.md (atomic_load_rvwmo<mode>): Remove
TARGET_ATOMIC constraint.
(atomic_store_rvwmo<mode>): Ditto.
* config/riscv/sync-ztso.md (atomic_load_ztso<mode>): Ditto.
(atomic_store_ztso<mode>): Ditto.
* config/riscv/sync.md (atomic_load<mode>): Ditto.
(atomic_store<mode>): Ditto.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
The XTheadFMemIdx ISA extension provides additional load and store
instructions for floating-point registers with new addressing modes.
The following memory access types are supported:
* load/store: [w,d] (single-precision FP, double-precision FP)
The following addressing modes are supported:
* register offset with additional immediate offset (4 instructions):
flr<type>, fsr<type>
* zero-extended register offset with additional immediate offset
(4 instructions): flur<type>, fsur<type>
These addressing modes are also part of the similar XTheadMemIdx
ISA extension support, whose code is reused and extended to support
floating-point registers.
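A minimal C sketch (hypothetical, not one of the patch's test cases) of the access pattern these addressing modes target; with XTheadFMemIdx the indexed load below can become a single flrw instead of a shift/add plus flw:

```c
#include <assert.h>

/* base[idx] is a register-offset FP load: with XTheadFMemIdx it can map
   to "flrw fa0, a0, a1, 2", i.e. base + (idx << 2), in one instruction
   instead of a separate shift/add followed by a plain flw.  */
float load_indexed (const float *base, long idx)
{
  return base[idx];
}
```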
One challenge that this patch needs to solve is GP registers in FP-mode
(e.g. "(reg:DF a2)"), which cannot be handled by the XTheadFMemIdx
instructions. Such registers are the result of independent
optimizations, which can happen after register allocation.
This patch uses a simple but efficient method to address this:
add a dependency for XTheadMemIdx to XTheadFMemIdx optimizations.
This allows the instructions from XTheadMemIdx to be used in case
of such registers.
The added tests ensure that this feature won't regress without notice.
Testing: GCC regression test suite and SPEC CPU 2017 intrate (base&peak).
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_index_reg_class):
Return GR_REGS for XTheadFMemIdx.
(riscv_regno_ok_for_index_p): Add support for XTheadFMemIdx.
* config/riscv/riscv.h (HARDFP_REG_P): New macro.
* config/riscv/thead.cc (is_fmemidx_mode): New function.
(th_memidx_classify_address_index): Add support for XTheadFMemIdx.
(th_fmemidx_output_index): New function.
(th_output_move): Add support for XTheadFMemIdx.
* config/riscv/thead.md (TH_M_ANYF): New mode iterator.
(TH_M_NOEXTF): Likewise.
(*th_fmemidx_movsf_hardfloat): New INSN.
(*th_fmemidx_movdf_hardfloat_rv64): Likewise.
(*th_fmemidx_I_a): Likewise.
(*th_fmemidx_I_c): Likewise.
(*th_fmemidx_US_a): Likewise.
(*th_fmemidx_US_c): Likewise.
(*th_fmemidx_UZ_a): Likewise.
(*th_fmemidx_UZ_c): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/xtheadfmemidx-index-update.c: New test.
* gcc.target/riscv/xtheadfmemidx-index-xtheadbb-update.c: New test.
* gcc.target/riscv/xtheadfmemidx-index-xtheadbb.c: New test.
* gcc.target/riscv/xtheadfmemidx-index.c: New test.
* gcc.target/riscv/xtheadfmemidx-uindex-update.c: New test.
* gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb-update.c: New test.
* gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb.c: New test.
* gcc.target/riscv/xtheadfmemidx-uindex.c: New test.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
The XTheadMemIdx ISA extension provides additional load and store
instructions with new addressing modes.
The following memory access types are supported:
* load: b,bu,h,hu,w,wu,d
* store: b,h,w,d
The following addressing modes are supported:
* immediate offset with PRE_MODIFY or POST_MODIFY (22 instructions):
l<ltype>.ia, l<ltype>.ib, s<stype>.ia, s<stype>.ib
* register offset with additional immediate offset (11 instructions):
lr<ltype>, sr<stype>
* zero-extended register offset with additional immediate offset
(11 instructions): lur<ltype>, sur<stype>
The RISC-V base ISA does not support index registers, so the changes
are kept separate from the RISC-V standard support as much as possible.
To combine the shift/multiply instructions into the memory access
instructions, this patch comes with a few insn_and_split optimizations
that allow the combiner to do this task.
Handling the different cases of extensions results in a couple of INSNs
that look redundant at first view, but they are just the equivalent
of what we already have for Zbb as well.  The only difference is that
we have many more load instructions.
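As an illustration (hypothetical C, not from the patch's test suite), the classic pointer-walk loop below is the kind of code where the new PRE_MODIFY/POST_MODIFY support applies; each load-then-advance can map to a single l<ltype>.ia instruction:

```c
#include <assert.h>

/* "*p++" loads and then advances the pointer: with XTheadMemIdx this can
   become a single post-increment load ("lw rd, (rs1), 4, 0") instead of
   a load followed by a separate addi.  */
int sum_ints (const int *p, int n)
{
  int s = 0;
  while (n-- > 0)
    s += *p++;
  return s;
}
```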
We already have a constraint with the name 'th_f_fmv', therefore,
the new constraints follow this pattern and have the same length
as required ('th_m_mia', 'th_m_mib', 'th_m_mir', 'th_m_miu').
The added tests ensure that this feature won't regress without notice.
Testing: GCC regression test suite, GCC bootstrap build, and
SPEC CPU 2017 intrate (base&peak) on C920.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
gcc/ChangeLog:
* config/riscv/constraints.md (th_m_mia): New constraint.
(th_m_mib): Likewise.
(th_m_mir): Likewise.
(th_m_miu): Likewise.
* config/riscv/riscv-protos.h (enum riscv_address_type):
Add new address types ADDRESS_REG_REG, ADDRESS_REG_UREG,
and ADDRESS_REG_WB and their documentation.
(struct riscv_address_info): Add new field 'shift' and
document the field usage for the new address types.
(riscv_valid_base_register_p): New prototype.
(th_memidx_legitimate_modify_p): Likewise.
(th_memidx_legitimate_index_p): Likewise.
(th_classify_address): Likewise.
(th_output_move): Likewise.
(th_print_operand_address): Likewise.
* config/riscv/riscv.cc (riscv_index_reg_class):
Return GR_REGS for XTheadMemIdx.
(riscv_regno_ok_for_index_p): Add support for XTheadMemIdx.
(riscv_classify_address): Call th_classify_address() on top.
(riscv_output_move): Call th_output_move() on top.
(riscv_print_operand_address): Call th_print_operand_address()
on top.
* config/riscv/riscv.h (HAVE_POST_MODIFY_DISP): New macro.
(HAVE_PRE_MODIFY_DISP): Likewise.
* config/riscv/riscv.md (zero_extendqi<SUPERQI:mode>2): Disable
for XTheadMemIdx.
(*zero_extendqi<SUPERQI:mode>2_internal): Convert to expand,
create INSN with same name and disable it for XTheadMemIdx.
(extendsidi2): Likewise.
(*extendsidi2_internal): Disable for XTheadMemIdx.
* config/riscv/thead.cc (valid_signed_immediate): New helper
function.
(th_memidx_classify_address_modify): New function.
(th_memidx_legitimate_modify_p): Likewise.
(th_memidx_output_modify): Likewise.
(is_memidx_mode): Likewise.
(th_memidx_classify_address_index): Likewise.
(th_memidx_legitimate_index_p): Likewise.
(th_memidx_output_index): Likewise.
(th_classify_address): Likewise.
(th_output_move): Likewise.
(th_print_operand_address): Likewise.
* config/riscv/thead.md (*th_memidx_operand): New splitter.
(*th_memidx_zero_extendqi<SUPERQI:mode>2): New INSN.
(*th_memidx_extendsidi2): Likewise.
(*th_memidx_zero_extendsidi2): Likewise.
(*th_memidx_zero_extendhi<GPR:mode>2): Likewise.
(*th_memidx_extend<SHORT:mode><SUPERQI:mode>2): Likewise.
(*th_memidx_bb_zero_extendsidi2): Likewise.
(*th_memidx_bb_zero_extendhi<GPR:mode>2): Likewise.
(*th_memidx_bb_extendhi<GPR:mode>2): Likewise.
(*th_memidx_bb_extendqi<SUPERQI:mode>2): Likewise.
(TH_M_ANYI): New mode iterator.
(TH_M_NOEXTI): Likewise.
(*th_memidx_I_a): New combiner optimization.
(*th_memidx_I_b): Likewise.
(*th_memidx_I_c): Likewise.
(*th_memidx_US_a): Likewise.
(*th_memidx_US_b): Likewise.
(*th_memidx_US_c): Likewise.
(*th_memidx_UZ_a): Likewise.
(*th_memidx_UZ_b): Likewise.
(*th_memidx_UZ_c): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/xtheadmemidx-helpers.h: New test.
* gcc.target/riscv/xtheadmemidx-index-update.c: New test.
* gcc.target/riscv/xtheadmemidx-index-xtheadbb-update.c: New test.
* gcc.target/riscv/xtheadmemidx-index-xtheadbb.c: New test.
* gcc.target/riscv/xtheadmemidx-index.c: New test.
* gcc.target/riscv/xtheadmemidx-modify-xtheadbb.c: New test.
* gcc.target/riscv/xtheadmemidx-modify.c: New test.
* gcc.target/riscv/xtheadmemidx-uindex-update.c: New test.
* gcc.target/riscv/xtheadmemidx-uindex-xtheadbb-update.c: New test.
* gcc.target/riscv/xtheadmemidx-uindex-xtheadbb.c: New test.
* gcc.target/riscv/xtheadmemidx-uindex.c: New test.
Currently we have the documentation for __builtin_vec_bcdsub_{eq,gt,lt} but
not for __builtin_bcdsub_{ge,le}; this patch supplements the descriptions
for them.  Although they are mainly for __builtin_bcdcmp{ge,le}, we already
have some testing coverage for __builtin_vec_bcdsub_{eq,gt,lt}, so this patch
adds the corresponding explicit test cases as well.
gcc/ChangeLog:
* doc/extend.texi (__builtin_bcdsub_le, __builtin_bcdsub_ge): Add
documentation for the built-ins.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/bcd-3.c (do_sub_ge, do_sub_le): Add functions
to test builtins __builtin_bcdsub_ge and __builtin_bcdsub_le.
The MICROBLAZE_VERSION_COMPARE was incorrectly using strcasecmp
instead of strverscmp to check the mcpu version against feature
options. By simply changing the define to use strverscmp,
the new version 10.0 is treated correctly as a higher version
than previous versions.
Fix incorrect warning with -mcpu=10.0:
warning: '-mxl-multiply-high' can be used only with
'-mcpu=v6.00.a' or greater
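The failure mode can be demonstrated directly: strcasecmp compares byte-by-byte, so "10.0" sorts before "6.00" ('1' < '6'), while strverscmp compares the digit runs numerically. A small sketch (version_at_least is a hypothetical helper mirroring how MICROBLAZE_VERSION_COMPARE is used):

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>   /* strverscmp (GNU extension) */
#include <strings.h>  /* strcasecmp */

/* Nonzero iff version string A is at least version B.  With strcasecmp
   the comparison is purely lexicographic, which wrongly reports
   10.0 < 6.00; strverscmp treats digit runs as numbers.  */
static int version_at_least (const char *a, const char *b)
{
  return strverscmp (a, b) >= 0;
}
```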
Signed-off-by: Neal Frager <neal.frager@amd.com>
Signed-off-by: Michael J. Eager <eager@eagercon.com>
The PR111971 test case uses a multi-register variable containing a fixed
register.  Because of this, LRA rejects the multi-register value when
matching the constraint for an asm insn.  The rejection results in LRA
cycling.  This patch fixes the issue.
gcc/ChangeLog:
PR rtl-optimization/111971
* lra-constraints.cc (process_alt_operands): Don't check start
hard regs for regs originating from register variables.
gcc/testsuite/ChangeLog:
PR rtl-optimization/111971
* gcc.target/powerpc/pr111971.c: New test.
On riscv, insn-emit.cc has grown to over 1.2 million lines of code and
compiling it takes considerable time.
Therefore, this patch adjusts genemit to create several partitions
(insn-emit-1.cc to insn-emit-n.cc).  The available patterns are
written to the given files in a sequential fashion.
Similar to match.pd, a configure option --with-insnemit-partitions=num
is introduced that makes the number of partitions configurable.
gcc/ChangeLog:
PR bootstrap/84402
PR target/111600
* Makefile.in: Handle split insn-emit.cc.
* configure: Regenerate.
* configure.ac: Add --with-insnemit-partitions.
* genemit.cc (output_peephole2_scratches): Print to file instead
of stdout.
(print_code): Ditto.
(gen_rtx_scratch): Ditto.
(gen_exp): Ditto.
(gen_emit_seq): Ditto.
(emit_c_code): Ditto.
(gen_insn): Ditto.
(gen_expand): Ditto.
(gen_split): Ditto.
(output_add_clobbers): Ditto.
(output_added_clobbers_hard_reg_p): Ditto.
(print_overload_arguments): Ditto.
(print_overload_test): Ditto.
(handle_overloaded_code_for): Ditto.
(handle_overloaded_gen): Ditto.
(print_header): New function.
(handle_arg): New function.
(main): Split output into 10 files.
* gensupport.cc (count_patterns): New function.
* gensupport.h (count_patterns): Define.
* read-md.cc (md_reader::print_md_ptr_loc): Add file argument.
* read-md.h (class md_reader): Change definition.
Control flow redundancy may choose abnormal edges for early checking,
but that breaks because we can't insert checks on such edges.
Introduce conditional checking on the dest block of abnormal edges,
and leave it to the optimizer to drop the conditional.
for gcc/ChangeLog
PR tree-optimization/111943
* gimple-harden-control-flow.cc: Adjust copyright year.
(rt_bb_visited): Add vfalse and vtrue data members.
Zero-initialize them in the ctor.
(rt_bb_visited::insert_exit_check_on_edge): Upon encountering
abnormal edges, insert initializers for vfalse and vtrue on
entry, and insert the check sequence guarded by a conditional
in the dest block.
for libgcc/ChangeLog
* hardcfr.c: Adjust copyright year.
for gcc/testsuite/ChangeLog
PR tree-optimization/111943
* gcc.dg/harden-cfr-pr111943.c: New.
The following adjusts final value replacement to also rewrite the
replacement to defined overflow behavior if there's conditionally
evaluated stmts (with possibly undefined overflow), not only when
we "folded casts". The patch hooks into expression_expensive for
this.
PR tree-optimization/112305
* tree-scalar-evolution.h (expression_expensive): Adjust.
* tree-scalar-evolution.cc (expression_expensive): Record
when we see a COND_EXPR.
(final_value_replacement_loop): When the replacement contains
a COND_EXPR, rewrite it to defined overflow.
* tree-ssa-loop-ivopts.cc (may_eliminate_iv): Adjust.
* gcc.dg/torture/pr112305.c: New testcase.
The lowering done for invoking `new' on a single dimension array was
moved from the code generator to the front-end semantic pass in
r14-4996. This removes the detritus left behind in the code generator
from that deletion.
gcc/d/ChangeLog:
* expr.cc (ExprVisitor::visit (NewExp *)): Remove unused assignments.
Now that loongarch.md uses HAVE_AS_TLS, we need this to fix the failure of
building a cross compiler when the cross assembler is not installed yet.
gcc/ChangeLog:
PR target/112299
* config/loongarch/loongarch-opts.h (HAVE_AS_TLS): Define to 0
if not defined yet.
gcc/ChangeLog:
* config/i386/avx512cdintrin.h (target): Push evex512 for
avx512cd.
* config/i386/avx512vlintrin.h (target): Split avx512cdvl part
out from avx512vl.
* config/i386/i386-builtin.def (BDESC): Do not check evex512
for builtins that do not need it.
This patch lets the INT64-to-FP16 conversion be split into two smaller
conversions (INT64 -> FP32 and FP32 -> FP16) at expand time instead of
delaying the split to the split1 pass.  This change makes it possible to
combine the FP32-to-FP16 and vcond patterns, so we don't need to add a
combine pattern for INT64-to-FP16 plus vcond.
Consider this code:
void
foo (_Float16 *__restrict r, int64_t *__restrict a, _Float16 *__restrict b,
int64_t *__restrict pred, int n)
{
for (int i = 0; i < n; i += 1)
{
r[i] = pred[i] ? (_Float16) a[i] : b[i];
}
}
Before this patch:
...
vfncvt.f.f.w v2,v2
vmerge.vvm v1,v1,v2,v0
vse16.v v1,0(a0)
...
After this patch:
...
vfncvt.f.f.w v1,v2,v0.t
vse16.v v1,0(a0)
...
gcc/ChangeLog:
* config/riscv/autovec.md (<float_cvt><mode><vnnconvert>2):
Change to define_expand.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv32-1.c:
Add vfncvt.f.f.w assert.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv32-2.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv64-1.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv64-2.c:
Ditto.
This moves a few more value_replacement simplifications to match.pd.
/* a == 1 ? b : a * b -> a * b */
/* a == 1 ? b : b / a -> b / a */
/* a == -1 ? b : a & b -> a & b */
Also adds a testcase to show we can catch these where value_replacement
would not (but other passes would).
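The three folds can be seen directly in C (hypothetical function names); in each case, when the condition holds, the true arm equals what the false arm computes, so the conditional can be dropped:

```c
#include <assert.h>

/* When a == 1 (or -1 for the AND case), both arms compute the same
   value, so the whole conditional folds to the false arm.  */
int f1 (int a, int b) { return a == 1 ? b : a * b; }   /* -> a * b */
int f2 (int a, int b) { return a == 1 ? b : b / a; }   /* -> b / a */
int f3 (int a, int b) { return a == -1 ? b : a & b; }  /* -> a & b */
```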
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/ChangeLog:
* match.pd (`a == 1 ? b : a OP b`): New pattern.
(`a == -1 ? b : a & b`): New pattern.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/phi-opt-value-4.c: New test.
This moves a few simple patterns that are done in value replacement
in phiopt over to match.pd. Just the simple ones which might show up
in other code.
This allows some optimizations to happen even without depending
on sinking happening, and in some cases where phiopt is not
invoked (cond-1.c is an example of that).
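For reference, the two moved patterns in C form (hypothetical function names); since b + 0 == b and b - 0 == b, both arms agree when a == 0 and the conditional folds away:

```c
#include <assert.h>

/* When a == 0 both arms are equal, so each conditional folds to the
   arithmetic arm.  */
int g1 (int a, int b) { return a == 0 ? b : b + a; }  /* -> b + a */
int g2 (int a, int b) { return a == 0 ? b : b - a; }  /* -> b - a */
```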
Changes since v1:
* v2: Add an extra testcase to showcase improvements at -O1.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/ChangeLog:
* match.pd: (`a == 0 ? b : b + a`,
`a == 0 ? b : b - a`): New patterns.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/cond-1.c: New test.
* gcc.dg/tree-ssa/phi-opt-value-1.c: New test.
* gcc.dg/tree-ssa/phi-opt-value-1a.c: New test.
* gcc.dg/tree-ssa/phi-opt-value-2.c: New test.
PR 111157 shows that IPA-modref and IPA-CP (when plugged into value
numbering) can optimize out a store both before a call (because the
call will overwrite it) and in the call (because the store is of the
same value) and by eliminating both create miscompilation.
This patch fixes that by pruning any constants from the list of IPA-CP
aggregate value constants that it knows the contents of the memory can
be "killed." Unfortunately, doing so is tricky. First, IPA-modref
loads override kills and so only stores not loaded are truly not
necessary. Looking stuff up there means doing what most of what
modref_may_alias may do but doing exactly what it does is tricky
because it takes also aliasing into account and has bail-out counters.
To err on the side of caution in order to avoid this miscompilation we
have to prune a constant when in doubt. However, pruning can
interfere with the mechanism of how clone materialization
distinguishes between the cases when a parameter was entirely removed
and when it was both IPA-CPed and IPA-SRAed (in order to make up for
the removal in debug info, which can bump into an assert when
compiling g++.dg/torture/pr103669.C when we are not careful).
Therefore this patch:
1) marks constants that IPA-modref has in its kill list with a new
"killed" flag, and
2) prunes the list from entries with this flag after materialization
and IPA-CP transformation is done using the template introduced in
the previous patch
It does not try to look up anything in the load lists, this will be
done as a follow-up in order to ease review.
gcc/ChangeLog:
2023-10-27 Martin Jambor <mjambor@suse.cz>
PR ipa/111157
* ipa-prop.h (struct ipa_argagg_value): New flag killed.
* ipa-modref.cc (ipcp_argagg_and_kill_overlap_p): New function.
(update_signature): Mark any IPA-CP aggregate constants at
positions known to be killed as killed. Move check that there is
clone_info after this pruning.
* ipa-cp.cc (ipa_argagg_value_list::dump): Dump the killed flag.
(ipa_argagg_value_list::push_adjusted_values): Clear the new flag.
(push_agg_values_from_plats): Likewise.
(ipa_push_agg_values_from_jfunc): Likewise.
(estimate_local_effects): Likewise.
(push_agg_values_for_index_from_edge): Likewise.
* ipa-prop.cc (write_ipcp_transformation_info): Stream the killed
flag.
(read_ipcp_transformation_info): Likewise.
(ipcp_get_aggregate_const): Update comment, assert that encountered
record does not have killed flag set.
(ipcp_transform_function): Prune all aggregate constants with killed
set.
gcc/testsuite/ChangeLog:
2023-09-18 Martin Jambor <mjambor@suse.cz>
PR ipa/111157
* gcc.dg/lto/pr111157_0.c: New test.
* gcc.dg/lto/pr111157_1.c: Second file of the same new test.
PR 111157 points to another place where IPA-CP collected aggregate
compile-time constants need to be filtered, in addition to the one
place that already does this in ipa-sra. In order to re-use code,
this patch turns the common bit into a template.
The functionality is still covered by testcase gcc.dg/ipa/pr108959.c.
gcc/ChangeLog:
2023-09-13 Martin Jambor <mjambor@suse.cz>
PR ipa/111157
* ipa-prop.h (ipcp_transformation): New member function template
remove_argaggs_if.
* ipa-sra.cc (zap_useless_ipcp_results): Use remove_argaggs_if to
filter aggregate constants.
GCC recently changed its register allocator, which causes this
testcase to fail.
This patch updates the regex to be more robust to change by accepting
any s register in the range of 1-9 for cm.push and cm.popret insns.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rv32i_zcmp.c: Accept any register in the
range of 1-9 for cm.push and cm.popret insns.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
This patch optimizes PR middle-end/101955 for the ARC backend. On ARC
CPUs with a barrel shifter, using two shifts is optimal as:
asl_s r0,r0,31
asr_s r0,r0,31
but without a barrel shifter, GCC -O2 -mcpu=em currently generates:
and r2,r0,1
ror r2,r2
add.f 0,r2,r2
sbc r0,r0,r0
with this patch, we now generate the smaller, faster and non-flags
clobbering:
bmsk_s r0,r0,0
neg_s r0,r0
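The split relies on the identity that sign-extracting the least significant bit yields 0 or -1, which is exactly -(x & 1); a quick C check (hypothetical helper name):

```c
#include <assert.h>

/* Sign extract of bit 0: the result is -1 when the bit is set and 0
   otherwise, i.e. AND with 1 followed by negation -- the form the new
   ARC split emits when there is no barrel shifter.  */
int sext_lsb (int x)
{
  return -(x & 1);
}
```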
2023-10-30 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR middle-end/101955
* config/arc/arc.md (*extvsi_1_0): New define_insn_and_split
to convert sign extract of the least significant bit into an
AND $1 then a NEG when !TARGET_BARREL_SHIFTER.
gcc/testsuite/ChangeLog
PR middle-end/101955
* gcc.target/arc/pr101955.c: New test case.
This patch overhauls the ARC backend's insn_cost target hook, and makes
some related improvements to rtx_costs, BRANCH_COST, etc. The primary
goal is to allow the backend to indicate that shifts and rotates are
slow (discouraged) when the CPU doesn't have a barrel shifter. I should
also acknowledge Richard Sandiford for inspiring the use of set_cost
in this rewrite of arc_insn_cost; this implementation borrows heavily
from the target hooks for AArch64 and ARM.
The motivating example is derived from PR rtl-optimization/110717.
struct S { int a : 5; };
unsigned int foo (struct S *p) {
return p->a;
}
With a barrel shifter, GCC -O2 generates the reasonable:
foo: ldb_s r0,[r0]
asl_s r0,r0,27
j_s.d [blink]
asr_s r0,r0,27
What's interesting is that during combine, the middle-end actually
has two shifts by three bits, and a sign-extension from QI to SI.
Trying 8, 9 -> 11:
8: r158:SI=r157:QI#0<<0x3
REG_DEAD r157:QI
9: r159:SI=sign_extend(r158:SI#0)
REG_DEAD r158:SI
11: r155:SI=r159:SI>>0x3
REG_DEAD r159:SI
Whilst it's reasonable to simplify this to two shifts by 27 bits when
the CPU has a barrel shifter, it's actually a significant pessimization
when these shifts are implemented by loops. This combination can be
prevented if the backend provides accurate-ish estimates for insn_cost.
Previously, without a barrel shifter, GCC -O2 -mcpu=em generates:
foo: ldb_s r0,[r0]
mov lp_count,27
lp 2f
add r0,r0,r0
nop
2: # end single insn loop
mov lp_count,27
lp 2f
asr r0,r0
nop
2: # end single insn loop
j_s [blink]
which contains two loops and requires ~113 cycles to execute.
With this patch to rtx_cost/insn_cost, GCC -O2 -mcpu=em generates:
foo: ldb_s r0,[r0]
mov_s r2,0 ;3
add3 r0,r2,r0
sexb_s r0,r0
asr_s r0,r0
asr_s r0,r0
j_s.d [blink]
asr_s r0,r0
which requires only ~6 cycles, for the shorter shifts by 3 and sign
extension.
2023-10-30 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/arc/arc.cc (arc_rtx_costs): Improve cost estimates.
Provide reasonable values for SHIFTS and ROTATES by constant
bit counts depending upon TARGET_BARREL_SHIFTER.
(arc_insn_cost): Use insn attributes if the instruction is
recognized. Avoid calling get_attr_length for type "multi",
i.e. define_insn_and_split patterns without explicit type.
Fall-back to set_rtx_cost for single_set and pattern_cost
otherwise.
* config/arc/arc.h (COSTS_N_BYTES): Define helper macro.
(BRANCH_COST): Improve/correct definition.
(LOGICAL_OP_NON_SHORT_CIRCUIT): Preserve previous behavior.
This patch improves the code generated by the ARC back-end for CPUs
without a barrel shifter but with -mswap. The -mswap option provides
a SWAP instruction that implements SImode rotations by 16, but also
logical shift instructions (left and right) by 16 bits. Clearly these
are also useful building blocks for implementing shifts by 17, 18, etc.
which would otherwise require a loop.
As a representative example:
int shl20 (int x) { return x << 20; }
GCC with -O2 -mcpu=em -mswap would previously generate:
shl20: mov lp_count,10
lp 2f
add r0,r0,r0
add r0,r0,r0
2: # end single insn loop
j_s [blink]
with this patch we now generate:
shl20: mov_s r2,0 ;3
lsl16 r0,r0
add3 r0,r2,r0
j_s.d [blink]
asl_s r0,r0
Although both are four instructions (excluding the j_s), the original
takes ~22 cycles and the replacement ~4 cycles.
2023-10-30 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/arc/arc.cc (arc_split_ashl): Use lsl16 on TARGET_SWAP.
(arc_split_ashr): Use swap and sign-extend on TARGET_SWAP.
(arc_split_lshr): Use lsr16 on TARGET_SWAP.
(arc_split_rotl): Use swap on TARGET_SWAP.
(arc_split_rotr): Likewise.
* config/arc/arc.md (ANY_ROTATE): New code iterator.
(<ANY_ROTATE>si2_cnt16): New define_insn for alternate form of
swap instruction on TARGET_SWAP.
(ashlsi2_cnt16): Rename from *ashlsi16_cnt16 and move earlier.
(lshrsi2_cnt16): New define_insn for LSR16 instruction.
(*ashlsi2_cnt16): See above.
gcc/testsuite/ChangeLog
* gcc.target/arc/lsl16-1.c: New test case.
* gcc.target/arc/lsr16-1.c: Likewise.
* gcc.target/arc/swap-1.c: Likewise.
* gcc.target/arc/swap-2.c: Likewise.
Follow-up patch to "arm: Use deltas for Arm switch tables".
This patch moves the switch tables for Arm from the .text section
into the .rodata section.
gcc/ChangeLog:
* config/arm/aout.h: Change to use the Lrtx label.
* config/arm/arm.h (CASE_VECTOR_PC_RELATIVE): Remove arm targets
from (!target_pure_code) condition.
(ADDR_VEC_ALIGN): Add align for tables in rodata section.
* config/arm/arm.cc (arm_output_casesi): Alter the function to include
.Lrtx label and remove adr instructions.
* config/arm/arm.md
(arm_casesi_internal): Use force_reg to generate ldr instructions that
would otherwise be out of range, and change rtl to accommodate force reg.
Additionally remove unnecessary register temp.
(casesi): Remove pure code check for Arm.
* config/arm/elf.h (JUMP_TABLES_IN_TEXT_SECTION): Remove arm
targets from JUMP_TABLES_IN_TEXT_SECTION definition.
gcc/testsuite/ChangeLog:
* gcc.target/arm/arm-switchstatement.c: Alter the tests to
change adr instruction to ldr.
Heap-based trampolines are enabled on darwin20 and later,
meaning that no warning is emitted.
gcc/testsuite/ChangeLog:
* gcc.dg/Wtrampolines.c: Skip on darwin20 and later.
The test currently fails on Darwin, where the default arch is core2.
gcc/testsuite/ChangeLog:
PR target/112287
* gcc.target/i386/pr111698.c: Pass -march=sandybridge.
PR106907 has a few warnings spotted by cppcheck.  These warnings
relate to the need for precedence clarification.  Instead of using xor,
the code has been changed to an equality check, which achieves the same
result.  Additionally, comment indentation has been fixed.
2023-10-11 Jeevitha Palanisamy <jeevitha@linux.ibm.com>
gcc/
PR target/106907
* config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Change bitwise
xor to an equality and fix comment indentation.
2023-10-30 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/104555
* resolve.cc (resolve_select_type): If the selector expression
has no class component references and the expression is a
derived type, copy the typespec of the symbol to that of the
expression.
gcc/testsuite/
PR fortran/104555
* gfortran.dg/pr104555.f90: New test.
When the two vectors are equal, the kmask is all-ones and kortest sets CF;
otherwise CF is cleared.  So the CF bit can be used to check the result of
the comparison.
Before:
vmovdqu (%rsi), %ymm0
vpxorq (%rdi), %ymm0, %ymm0
vptest %ymm0, %ymm0
jne .L2
vmovdqu 32(%rsi), %ymm0
vpxorq 32(%rdi), %ymm0, %ymm0
vptest %ymm0, %ymm0
je .L5
.L2:
movl $1, %eax
xorl $1, %eax
vzeroupper
ret
After:
vmovdqu64 (%rsi), %zmm0
xorl %eax, %eax
vpcmpeqd (%rdi), %zmm0, %k0
kortestw %k0, %k0
setc %al
vzeroupper
ret
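The kind of source that exercises this path is presumably a whole-block equality test such as the sketch below (hypothetical, not the PR's exact test case); with AVX-512 the 64-byte compare can become one vpcmpeqd into a mask register plus a kortest:

```c
#include <assert.h>
#include <string.h>

/* 64-byte equality: a single 512-bit vpcmpeqd produces an all-ones mask
   iff every lane matches, and kortest then sets CF for the setcc/branch. */
int eq64 (const void *a, const void *b)
{
  return memcmp (a, b, 64) == 0;
}
```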
gcc/ChangeLog:
PR target/104610
* config/i386/i386-expand.cc (ix86_expand_branch): Handle
512-bit vector with vpcmpeq + kortest.
* config/i386/i386.md (cbranchxi4): New expander.
* config/i386/sse.md: (cbranch<mode>4): Extend to V16SImode
and V8DImode.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr104610-2.c: New test.
The former patch (f08ca5903c) examines the scalar modes using the target
hook scalar_mode_supported_p.  It causes some i386 regressions because
XImode and OImode are not enabled in the i386 target function.  This
patch examines the scalar mode by checking whether the corresponding
optabs are available for the mode.
gcc/
PR target/111449
* expr.cc (qi_vector_mode_supported_p): Rename to...
(by_pieces_mode_supported_p): ...this, and extend it to do
the checking for both scalar and vector modes.
(widest_fixed_size_mode_for_size): Call
by_pieces_mode_supported_p to examine the mode.
(op_by_pieces_d::smallest_fixed_size_mode_for_size): Likewise.
libstdc++-v3/ChangeLog:
* src/experimental/contract.cc
[_GLIBCXX_INLINE_VERSION](handle_contract_violation): Provide symbol
without version namespace decoration for gcc.
Static arrays in D are passed around by value rather than decaying to a
pointer.  On x86_64, __builtin_va_list is an exception to this rule, but
semantically it is still treated as a static array.
This makes certain assignment operations fail due to a mismatch in types.
As all examples in the test program are rejected by the C/C++ front ends,
these are now errors in D too, for consistency.
PR d/110712
gcc/d/ChangeLog:
* d-codegen.cc (d_build_call): Update call to convert_for_argument.
* d-convert.cc (is_valist_parameter_type): New function.
(check_valist_conversion): New function.
(convert_for_assignment): Update signature. Add check whether
assigning va_list is permissible.
(convert_for_argument): Likewise.
* d-tree.h (convert_for_assignment): Update signature.
(convert_for_argument): Likewise.
* expr.cc (ExprVisitor::visit (AssignExp *)): Update call to
convert_for_assignment.
gcc/testsuite/ChangeLog:
* gdc.dg/pr110712.d: New test.
The large model is not implemented so far for Darwin (and the
codegen will be different when it is).
gcc/testsuite/ChangeLog:
* gcc.target/i386/large-data.c: Skip for Darwin.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
Darwin platforms do not currently emit .cfi_xxx directives, so these
tests do not work there.
gcc/testsuite/ChangeLog:
* gcc.target/i386/apx-interrupt-1.c: Skip for Darwin.
* gcc.target/i386/apx-push2pop2-1.c: Likewise.
* gcc.target/i386/apx-push2pop2_force_drap-1.c: Likewise.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>