Currently there is an unofficial mirror of GCC on GitHub that people
sometimes submit pull requests to:
https://github.com/gcc-mirror/gcc
However, this is not the proper way to contribute to GCC, so that means
that someone (usually Jonathan Wakely) has to go through the PRs and
manually tell people that they're sending their PRs to the wrong place.
One thing that would help mitigate this problem would be files in a
special .github directory that GitHub would automatically open when
contributors attempt to open a PR, that would then tell them the proper
way to contribute instead. This patch attempts to add two such files.
They are written in Markdown, which I'm realizing might require some
special handling in this repository, since the ".md" extension is also
used for GCC's "Machine Description" files here, but I'm not quite sure
how to go about handling that. Also note that I adapted these files from
equivalent files in the git repository for Git itself:
https://github.com/git/git/blob/master/.github/CONTRIBUTING.md
https://github.com/git/git/blob/master/.github/PULL_REQUEST_TEMPLATE.md
What do people think?
ChangeLog:
* .github/CONTRIBUTING.md: New file.
* .github/PULL_REQUEST_TEMPLATE.md: New file.
This patch is a follow-up to my previous PR target/110551 patch, this
time to address the additional move after mulx, seen on TARGET_BMI2
architectures (such as -march=haswell). The complication here is
that the flexible multiple-set mulx instruction is introduced into
RTL after reload, by split2, and therefore can't benefit from register
preferencing. This results in RTL like the following:
(insn 32 31 17 2 (parallel [
            (set (reg:DI 4 si [orig:101 r ] [101])
                (mult:DI (reg:DI 1 dx [109])
                    (reg:DI 5 di [109])))
            (set (reg:DI 5 di [ r+8 ])
                (umul_highpart:DI (reg:DI 1 dx [109])
                    (reg:DI 5 di [109])))
        ]) "pr110551-2.c":8:17 -1
     (nil))
(insn 17 32 9 2 (set (reg:DI 0 ax [107])
        (reg:DI 5 di [ r+8 ])) "pr110551-2.c":9:40 90 {*movdi_internal}
     (expr_list:REG_DEAD (reg:DI 5 di [ r+8 ])
        (nil)))
Here insn 32, the mulx instruction, places its results in si and di,
and then immediately after decides to move di to ax, with di now dead.
This can be trivially cleaned up by a peephole2. I've added an
additional constraint that the two SET_DESTs can't be the same
register, to avoid confusing the middle-end; that case does have
well-defined behaviour on x86_64/BMI2 (it encodes a umul_highpart),
but is best avoided here.
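For illustration, a function of the following shape (a sketch reconstructed
from the assembly below, not necessarily the exact testcase) exhibits the
redundant move; the constant 0x9E3779B97F4A7C15 is the movabsq immediate
-7046029254386353131 shown below, viewed as unsigned:

typedef unsigned long long u64;
typedef unsigned __int128 u128;

u64
mulx64 (u64 x)
{
  /* 64x64->128 bit multiply; xor of the high and low halves.  */
  u128 r = (u128) x * 0x9E3779B97F4A7C15ull;
  return (u64) r ^ (u64) (r >> 64);
}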
For the new test case, compiled on x86_64 with -O2 -march=haswell:
Before:
mulx64: movabsq $-7046029254386353131, %rdx
mulx %rdi, %rsi, %rdi
movq %rdi, %rax
xorq %rsi, %rax
ret
After:
mulx64: movabsq $-7046029254386353131, %rdx
mulx %rdi, %rsi, %rax
xorq %rsi, %rax
ret
2023-11-01 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR target/110551
* config/i386/i386.md (*bmi2_umul<mode><dwi>3_1): Tidy condition
as operands[2] with predicate register_operand must be !MEM_P.
(peephole2): Optimize a mulx followed by a register-to-register
move, to place result in the correct destination if possible.
gcc/testsuite/ChangeLog
PR target/110551
* gcc.target/i386/pr110551-2.c: New test case.
Other subword atomic patterns use riscv_subword_address to calculate
the aligned address, shift amount, mask and !mask. atomic_test_and_set
was implemented before the common function was added. After this patch
all subword atomic patterns use riscv_subword_address.
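For reference, riscv_subword_address computes, for a subword (1- or 2-byte)
access, the containing aligned 32-bit word address plus the shift, mask and
inverted mask of the subword within it. An illustrative C sketch of that
arithmetic (not the GCC code itself), assuming a little-endian target:

#include <stdint.h>

static void
subword_address (uintptr_t addr, unsigned nbytes /* 1 or 2 */,
                 uintptr_t *aligned, unsigned *shift,
                 uint32_t *mask, uint32_t *not_mask)
{
  *aligned = addr & ~(uintptr_t) 3;          /* aligned 32-bit word */
  *shift = (addr & 3) * 8;                   /* bit position of the subword */
  *mask = ((1u << (nbytes * 8)) - 1) << *shift;
  *not_mask = ~*mask;
}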
gcc/ChangeLog:
* config/riscv/sync.md: Use riscv_subword_address function to
calculate the address and shift in atomic_test_and_set.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
Fixes: 3496ca4e65 ("RISC-V: Add runtime invariant support")
riscv_promote_function_mode doesn't promote an SI to DI in the libcall
case. It intends to do that, but the code is broken (a regression).
The fix is what the generic promote_mode () in explow.cc does. I really
don't understand why the old code didn't work, but stepping through the
debugger shows that the old code didn't and the fixed version does.
This showed up when testing Ajit's REE ABI extension series which probes
the ABI (using a NULL tree type) and ends up hitting the libcall code path.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_promote_function_mode): Fix mode
returned for libcall case.
Tested-by: Patrick O'Neill <patrick@rivosinc.com> # pre-commit-CI #526
Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
Add the option -Walloc-size, which warns about allocations that have
insufficient storage for the target type of the pointer the storage is
assigned to. The warning is enabled by -Wextra.
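A hypothetical example of code the new warning is meant to flag (not taken
from the new tests): the allocation is smaller than the type the pointer
points to.

#include <stdlib.h>

struct s { int i, j; };

struct s *
make (void)
{
  /* With -Walloc-size (or -Wextra) this allocation would be diagnosed:
     sizeof (int) < sizeof (struct s).  */
  return malloc (sizeof (int));
}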
PR c/71219
gcc:
* doc/invoke.texi: Document -Walloc-size option.
gcc/c-family:
* c.opt (Walloc-size): New option.
gcc/c:
* c-typeck.cc (convert_for_assignment): Add warning.
gcc/testsuite:
* gcc.dg/Walloc-size-1.c: New test.
* gcc.dg/Walloc-size-2.c: New test.
genautomata was writing the insn_has_dfa_reservation_p function
inside of the CPU_UNITS_QUERY conditional when it shouldn't have.
Move insn_has_dfa_reservation_p outside of the conditional group.
gcc/ChangeLog:
* genautomata.cc (write_automata): Move endif.
Signed-off-by: Edwin Lu <ewlu@rivosinc.com>
This patch moves the call to TARGET_SIMD_CLONE_ADJUST until after the arguments
and return types have been transformed into vector types. It also constructs
the adjustments and retval modifications after this call, allowing targets to
alter the types of the arguments and return value of the clone prior to the
modifications to the function definition.
gcc/ChangeLog:
* omp-simd-clone.cc (simd_clone_adjust_return_type): Hoist out code to
create return array and don't return new type.
(simd_clone_adjust_argument_types): Hoist out code that creates
ipa_param_body_adjustments and don't return them.
(simd_clone_adjust): Call TARGET_SIMD_CLONE_ADJUST after return and
argument types have been vectorized, create adjustments and return array
after the hook.
(expand_simd_clones): Call TARGET_SIMD_CLONE_ADJUST after return and
argument types have been vectorized.
Improve stack protector patterns and peephole2s to substitute stack
protector scratch register clear with unrelated subsequent register
initialization in several ways:
a. Explicitly generate scratch register as named pseudo. This allows
optimizers to eventually reuse the zero value in the register.
b. Allow scratch register in different mode (SWI48) than PTR mode:
d000: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax
d007: 00 00
d009: 48 89 44 24 08 mov %rax,0x8(%rsp)
d00e: 8b 87 e0 01 00 00 mov 0x1e0(%rdi),%eax
SImode moves on x86 zero-extend to the whole DImode register,
so stack protector paranoia is not compromised.
c. Relax peephole2 constraint that stack protector scratch register
must match new initialized register. This relaxation substantially
improves peephole2 opportunities, and generates sequences like:
a310: 65 4c 8b 34 25 28 00 mov %gs:0x28,%r14
a317: 00 00
a319: 4c 89 74 24 08 mov %r14,0x8(%rsp)
a31e: 4c 8b b7 98 00 00 00 mov 0x98(%rdi),%r14
We have to ensure the new scratch is dead in front of the sequence.
The patch also fixes the omission of earlyclobbers for all alternatives of
the newly initialized register in *stack_protect_set_3, avoiding the need
for a reg_overlap_mentioned_p constraint.  Earlyclobbers are per alternative,
not per operand.
Also, instructions are already valid in peephole2 pass, so we don't
have to explicitly re-check their operands for validity.
gcc/ChangeLog:
* config/i386/i386.md (stack_protect_set): Explicitly
generate scratch register in word mode.
(@stack_protect_set_1_<mode>): Rename to ...
(@stack_protect_set_1_<PTR:mode>_<SWI48:mode>): ... this.
Use SWI48 mode iterator to match scratch register.
(stack_protect_set_1 peephole2): Use PTR, W and SWI48 mode
iterators to match peephole sequence. Use general_operand
predicate for operand 4. Allow different operand 2 and operand 3
registers and use peep2_reg_dead_p to ensure new scratch
register is dead before peephole sequence. Use peep2_reg_dead_p
to ensure old scratch register is dead after peephole sequence.
(*stack_protect_set_2_<mode>): Rename to ...
(*stack_protect_set_2_<mode>_si): ... this.
(*stack_protect_set_3): Rename to ...
(*stack_protect_set_2_<mode>_di): ... this.
Use PTR mode iterator to match stack protector memory move.
Use earlyclobber for all alternatives of operand 1.
(stack_protect_set_2 peephole2): Use PTR, W and SWI48 mode
iterators to match peephole sequence. Use general_operand
predicate for operand 4. Allow different operand 2 and operand 3
registers and use peep2_reg_dead_p to ensure new scratch
register is dead before peephole sequence. Use peep2_reg_dead_p
to ensure old scratch register is dead after peephole sequence.
The ZTYPE in ISO Modula-2 is used to denote intermediate ordinal type constant
expressions, and these are always converted into the
appropriate language or user ordinal type prior to code generation.
The increase in the number of bits supported by _BitInt causes the modula2
largeconst.mod regression-failure test to pass. The constant in largeconst.mod
has been increased so that the test fails again; however, the char-at-a-time
overflow check is now too slow to detect the failure. The overflow detection
for the ZTYPE has therefore been rewritten to check against exceeding
WIDE_INT_MAX_PRECISION (many orders of magnitude faster).
gcc/m2/ChangeLog:
PR modula2/102989
* gm2-compiler/SymbolTable.mod (OverflowZType): Import from m2expr.
(ConstantStringExceedsZType): Remove import.
(GetConstLitType): Replace ConstantStringExceedsZType with OverflowZType.
* gm2-gcc/m2decl.cc (m2decl_ConstantStringExceedsZType): Remove.
(m2decl_BuildConstLiteralNumber): Re-write.
* gm2-gcc/m2decl.def (ConstantStringExceedsZType): Remove.
* gm2-gcc/m2decl.h (m2decl_ConstantStringExceedsZType): Remove.
* gm2-gcc/m2expr.cc (m2expr_StrToWideInt): Rewrite to check overflow.
(m2expr_OverflowZType): New function.
(ToWideInt): New function.
* gm2-gcc/m2expr.def (OverflowZType): New procedure function declaration.
* gm2-gcc/m2expr.h (m2expr_OverflowZType): New prototype.
gcc/testsuite/ChangeLog:
PR modula2/102989
* gm2/pim/fail/largeconst.mod: Updated foo to an outrageous value.
* gm2/pim/fail/largeconst2.mod: Duplicate test removed.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
No functional change intended.
gcc/ChangeLog:
* Makefile.in (ANALYZER_OBJS): Add analyzer/record-layout.o.
gcc/analyzer/ChangeLog:
* record-layout.cc: New file, based on material in region-model.cc.
* record-layout.h: Likewise.
* region-model.cc: Include "analyzer/record-layout.h".
(class record_layout): Move to record-layout.cc and .h.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
This patch eliminates the function "MACRO_MAP_EXPANSION_POINT_LOCATION"
(which hasn't been a macro since r6-739-g0501dbd932a7e9) in favor of
a new line_map_macro::get_expansion_point_location accessor.
No functional change intended.
gcc/c-family/ChangeLog:
* c-warn.cc (warn_for_multistatement_macros): Update for removal
of MACRO_MAP_EXPANSION_POINT_LOCATION.
gcc/cp/ChangeLog:
* module.cc (ordinary_loc_of): Update for removal of
MACRO_MAP_EXPANSION_POINT_LOCATION.
(module_state::note_location): Update for renaming of field.
(module_state::write_macro_maps): Likewise.
gcc/ChangeLog:
* input.cc (dump_location_info): Update for removal of
MACRO_MAP_EXPANSION_POINT_LOCATION.
* tree-diagnostic.cc (maybe_unwind_expanded_macro_loc):
Likewise.
libcpp/ChangeLog:
* include/line-map.h
(line_map_macro::get_expansion_point_location): New accessor.
(line_map_macro::expansion): Rename field to...
(line_map_macro::mexpansion): ... this.
(MACRO_MAP_EXPANSION_POINT_LOCATION): Delete this function.
* line-map.cc (linemap_enter_macro): Update for renaming of field.
(linemap_macro_map_loc_to_exp_point): Update for removal of
MACRO_MAP_EXPANSION_POINT_LOCATION.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
gcc/ChangeLog:
* opts.cc (get_option_url): Update comment; the requirement to
pass DOCUMENTATION_ROOT_URL's value via -D was removed in
r10-8065-ge33a1eae25b8a8.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
This patch leverages the current MASK_GATHER_LOAD support to handle SLP MASK_LEN_GATHER_LOAD with a conditional mask.
Unconditional MASK_LEN_GATHER_LOAD (base, offset, scale, zero, -1) SLP is not included in this patch
since it seems that we can't support it in the middle-end:
FAIL: gcc.dg/tree-ssa/pr44306.c (internal compiler error: in vectorizable_load, at tree-vect-stmts.cc:9885)
Maybe we should support GATHER_LOAD explicitly in the RISC-V backend to work around this issue;
I am going to add that explicit workaround in the RISC-V backend.
This patch also adds a conditional gather load test, since there is none yet; a sketch of the kind of loop it covers is shown below.
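For illustration (an assumed shape, not the exact new testcase), a
conditional gather load looks like:

void
cond_gather (float *restrict out, float *restrict in,
             int *restrict idx, int *restrict cond, int n)
{
  for (int i = 0; i < n; i++)
    if (cond[i])                /* conditional mask */
      out[i] = in[idx[i]];      /* gather through the idx vector */
}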
Ok for trunk ?
gcc/ChangeLog:
* tree-vect-slp.cc (vect_get_operand_map): Add MASK_LEN_GATHER_LOAD.
(vect_build_slp_tree_1): Ditto.
(vect_build_slp_tree_2): Ditto.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/vect-gather-6.c: New test.
This patch moves the processing of the attribute preserve_access_index to
its own independent gimple lowering pass.
This approach is more consistent with the implementation of the CO-RE
builtins when used explicitly in the code. The attributed type accesses
are now converted early to the __builtin_core_reloc builtin instead of being
kept as an expression in the code throughout the middle-end.
This prevents the compiler from optimizing out or manipulating the expression
based on the locally defined type; instead, nothing is assumed to be known
about the expression, as should be the case for all of the CO-RE
relocations.
In the process, __builtin_preserve_access_index has also been
improved to generate code for more complex expressions that
require more than one CO-RE relocation.
This turned out to be a requirement, since bpf-next selftests rely on
loop unrolling in order to convert an undefined index array access into a
defined one. Expecting that unrolling to happen seemed too much to ask, so
GCC now generates correct code in such scenarios even when the index
access is never predictable or unrolling does not occur.
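For illustration, a hypothetical use of the lowered facility: accesses to a
type carrying the preserve_access_index attribute (or wrapped in
__builtin_preserve_access_index) are now converted early into
__builtin_core_reloc-based accesses.

struct S
{
  int a;
  int b;
} __attribute__((preserve_access_index));

int
read_b (struct S *s)
{
  /* The offset of 'b' is emitted as a CO-RE relocation rather than
     being hard-coded from the local definition of struct S.  */
  return s->b;
}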
gcc/ChangeLog:
* config/bpf/bpf-passes.def (pass_lower_bpf_core): Added pass.
* config/bpf/bpf-protos.h: Added prototype for new pass.
* config/bpf/bpf.cc (bpf_delegitimize_address): New function.
* config/bpf/bpf.md (mov_reloc_core<MM:mode>): Prefixed
name with '*'.
* config/bpf/core-builtins.cc (cr_builtins): Added access_node to
struct.
(is_attr_preserve_access): Improved check.
(core_field_info): Make use of root_for_core_field_info
function.
(process_field_expr): Adapted to new functions.
(pack_type): Small improvement.
(bpf_handle_plugin_finish_type): Adapted to GTY(()).
(bpf_init_core_builtins): Changed to new function names.
(construct_builtin_core_reloc): Improved implementation.
(bpf_resolve_overloaded_core_builtin): Changed how
__builtin_preserve_access_index is converted.
(compute_field_expr): Corrected implementation. Added
access_node argument.
(bpf_core_get_index): Added valid argument.
(root_for_core_field_info, pack_field_expr)
(core_expr_with_field_expr_plus_base, make_core_safe_access_index)
(replace_core_access_index_comp_expr, maybe_get_base_for_field_expr)
(core_access_clean, core_is_access_index, core_mark_as_access_index)
(make_gimple_core_safe_access_index, execute_lower_bpf_core)
(make_pass_lower_bpf_core): Added functions.
(pass_data_lower_bpf_core): New pass struct.
(pass_lower_bpf_core): New gimple_opt_pass class.
(pack_field_expr_for_preserve_field)
(bpf_replace_core_move_operands): Removed function.
(bpf_enum_value_kind): Added GTY(()).
* config/bpf/core-builtins.h (bpf_field_info_kind, bpf_type_id_kind)
(bpf_type_info_kind, bpf_enum_value_kind): New enum.
* config/bpf/t-bpf: Added pass bpf-passes.def to PASSES_EXTRA.
gcc/testsuite/ChangeLog:
* gcc.target/bpf/core-attr-5.c: New test.
* gcc.target/bpf/core-attr-6.c: New test.
* gcc.target/bpf/core-builtin-1.c: Corrected
* gcc.target/bpf/core-builtin-enumvalue-opt.c: Corrected regular
expression.
* gcc.target/bpf/core-builtin-enumvalue.c: Corrected regular
expression.
* gcc.target/bpf/core-builtin-exprlist-1.c: New test.
* gcc.target/bpf/core-builtin-exprlist-2.c: New test.
* gcc.target/bpf/core-builtin-exprlist-3.c: New test.
* gcc.target/bpf/core-builtin-exprlist-4.c: New test.
* gcc.target/bpf/core-builtin-fieldinfo-offset-1.c: Extra tests
The MICROBLAZE_VERSION_COMPARE was incorrectly using strcasecmp
instead of strverscmp to check the mcpu version against feature
options. By simply changing the define to use strverscmp,
the new version 10.0 is treated correctly as a higher version
than previous versions.
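The difference matters for two-digit versions; a small standalone
illustration (not GCC code) of the two comparison functions:

#define _GNU_SOURCE
#include <string.h>
#include <strings.h>
#include <stdio.h>

int
main (void)
{
  /* strverscmp compares digit runs numerically: 10 > 9.  */
  printf ("%d\n", strverscmp ("v10.0", "v9.6") > 0);   /* 1: v10.0 is newer */
  /* strcasecmp compares byte-wise: '1' < '9', so v10.0 sorts as older.  */
  printf ("%d\n", strcasecmp ("v10.0", "v9.6") < 0);    /* 1: wrongly older */
  return 0;
}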
gcc/ChangeLog:
* config/microblaze/microblaze.cc: Fix mcpu version check.
gcc/testsuite/ChangeLog:
* gcc.target/microblaze/isa/bshift.c: Bump to mcpu=v10.0.
* gcc.target/microblaze/isa/div.c: Ditto.
* gcc.target/microblaze/isa/fcmp1.c: Ditto.
* gcc.target/microblaze/isa/fcmp2.c: Ditto.
* gcc.target/microblaze/isa/fcmp3.c: Ditto.
* gcc.target/microblaze/isa/fcmp4.c: Ditto.
* gcc.target/microblaze/isa/fcvt.c: Ditto.
* gcc.target/microblaze/isa/float.c: Ditto.
* gcc.target/microblaze/isa/fsqrt.c: Ditto.
* gcc.target/microblaze/isa/mul-bshift-pcmp.c: Ditto.
* gcc.target/microblaze/isa/mul-bshift.c: Ditto.
* gcc.target/microblaze/isa/mul.c: Ditto.
* gcc.target/microblaze/isa/mulh-bshift-pcmp.c: Ditto.
* gcc.target/microblaze/isa/mulh.c: Ditto.
* gcc.target/microblaze/isa/nofcmp.c: Ditto.
* gcc.target/microblaze/isa/nofloat.c: Ditto.
* gcc.target/microblaze/isa/pcmp.c: Ditto.
* gcc.target/microblaze/isa/vanilla.c: Ditto.
* gcc.target/microblaze/microblaze.exp: Ditto.
Signed-off-by: Neal Frager <neal.frager@amd.com>
Signed-off-by: Michael J. Eager <eager@eagercon.com>
Add testsuite infrastructure for the A extension and use it to require the A
extension for dg-do run tests and to add the A extension to dg-options for
non-A dg-do compile tests.
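For illustration, a run test using the new infrastructure might look as
follows (the effective-target name riscv_a is an assumption here; the exact
name is defined in lib/target-supports.exp):

/* { dg-do run } */
/* { dg-require-effective-target riscv_a } */

int
main (void)
{
  int x = 0;
  __atomic_fetch_add (&x, 1, __ATOMIC_SEQ_CST);
  return x != 1;
}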
gcc/testsuite/ChangeLog:
* gcc.target/riscv/amo-table-a-6-amo-add-1.c: Add A extension to
dg-options for dg-do compile.
* gcc.target/riscv/amo-table-a-6-amo-add-2.c: Ditto.
* gcc.target/riscv/amo-table-a-6-amo-add-3.c: Ditto.
* gcc.target/riscv/amo-table-a-6-amo-add-4.c: Ditto.
* gcc.target/riscv/amo-table-a-6-amo-add-5.c: Ditto.
* gcc.target/riscv/amo-table-a-6-compare-exchange-1.c: Ditto.
* gcc.target/riscv/amo-table-a-6-compare-exchange-2.c: Ditto.
* gcc.target/riscv/amo-table-a-6-compare-exchange-3.c: Ditto.
* gcc.target/riscv/amo-table-a-6-compare-exchange-4.c: Ditto.
* gcc.target/riscv/amo-table-a-6-compare-exchange-5.c: Ditto.
* gcc.target/riscv/amo-table-a-6-compare-exchange-6.c: Ditto.
* gcc.target/riscv/amo-table-a-6-compare-exchange-7.c: Ditto.
* gcc.target/riscv/amo-table-a-6-subword-amo-add-1.c: Ditto.
* gcc.target/riscv/amo-table-a-6-subword-amo-add-2.c: Ditto.
* gcc.target/riscv/amo-table-a-6-subword-amo-add-3.c: Ditto.
* gcc.target/riscv/amo-table-a-6-subword-amo-add-4.c: Ditto.
* gcc.target/riscv/amo-table-a-6-subword-amo-add-5.c: Ditto.
* gcc.target/riscv/inline-atomics-2.c: Ditto.
* gcc.target/riscv/inline-atomics-3.c: Require A extension for dg-do
run.
* gcc.target/riscv/inline-atomics-4.c: Ditto.
* gcc.target/riscv/inline-atomics-5.c: Ditto.
* gcc.target/riscv/inline-atomics-6.c: Ditto.
* gcc.target/riscv/inline-atomics-7.c: Ditto.
* gcc.target/riscv/inline-atomics-8.c: Ditto.
* lib/target-supports.exp: Add testing infrastructure to require the A
extension or add it to an existing -march.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
Non-atomic targets are currently prevented from using the optimized fencing for
seq_cst load/seq_cst store. This patch removes that constraint.
gcc/ChangeLog:
* config/riscv/sync-rvwmo.md (atomic_load_rvwmo<mode>): Remove
TARGET_ATOMIC constraint.
(atomic_store_rvwmo<mode>): Ditto.
* config/riscv/sync-ztso.md (atomic_load_ztso<mode>): Ditto.
(atomic_store_ztso<mode>): Ditto.
* config/riscv/sync.md (atomic_load<mode>): Ditto.
(atomic_store<mode>): Ditto.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
The XTheadFMemIdx ISA extension provides additional load and store
instructions for floating-point registers with new addressing modes.
The following memory accesses types are supported:
* load/store: [w,d] (single-precision FP, double-precision FP)
The following addressing modes are supported:
* register offset with additional immediate offset (4 instructions):
flr<type>, fsr<type>
* zero-extended register offset with additional immediate offset
(4 instructions): flur<type>, fsur<type>
These addressing modes are also part of the similar XTheadMemIdx
ISA extension support, whose code is reused and extended to support
floating-point registers.
One challenge that this patch needs to solve is GP registers in FP mode
(e.g. "(reg:DF a2)"), which cannot be handled by the XTheadFMemIdx
instructions. Such registers are the result of independent
optimizations, which can happen after register allocation.
This patch uses a simple but efficient method to address this:
make the XTheadFMemIdx optimizations depend on XTheadMemIdx.
This allows the instructions from XTheadMemIdx to be used in case
of such registers.
The added tests ensure that this feature won't regress without notice.
Testing: GCC regression test suite and SPEC CPU 2017 intrate (base&peak).
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_index_reg_class):
Return GR_REGS for XTheadFMemIdx.
(riscv_regno_ok_for_index_p): Add support for XTheadFMemIdx.
* config/riscv/riscv.h (HARDFP_REG_P): New macro.
* config/riscv/thead.cc (is_fmemidx_mode): New function.
(th_memidx_classify_address_index): Add support for XTheadFMemIdx.
(th_fmemidx_output_index): New function.
(th_output_move): Add support for XTheadFMemIdx.
* config/riscv/thead.md (TH_M_ANYF): New mode iterator.
(TH_M_NOEXTF): Likewise.
(*th_fmemidx_movsf_hardfloat): New INSN.
(*th_fmemidx_movdf_hardfloat_rv64): Likewise.
(*th_fmemidx_I_a): Likewise.
(*th_fmemidx_I_c): Likewise.
(*th_fmemidx_US_a): Likewise.
(*th_fmemidx_US_c): Likewise.
(*th_fmemidx_UZ_a): Likewise.
(*th_fmemidx_UZ_c): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/xtheadfmemidx-index-update.c: New test.
* gcc.target/riscv/xtheadfmemidx-index-xtheadbb-update.c: New test.
* gcc.target/riscv/xtheadfmemidx-index-xtheadbb.c: New test.
* gcc.target/riscv/xtheadfmemidx-index.c: New test.
* gcc.target/riscv/xtheadfmemidx-uindex-update.c: New test.
* gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb-update.c: New test.
* gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb.c: New test.
* gcc.target/riscv/xtheadfmemidx-uindex.c: New test.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
The XTheadMemIdx ISA extension provides additional load and store
instructions with new addressing modes.
The following memory accesses types are supported:
* load: b,bu,h,hu,w,wu,d
* store: b,h,w,d
The following addressing modes are supported:
* immediate offset with PRE_MODIFY or POST_MODIFY (22 instructions):
l<ltype>.ia, l<ltype>.ib, s<stype>.ia, s<stype>.ib
* register offset with additional immediate offset (11 instructions):
lr<ltype>, sr<stype>
* zero-extended register offset with additional immediate offset
(11 instructions): lur<ltype>, sur<stype>
The RISC-V base ISA does not support index registers, so the changes
are kept separate from the RISC-V standard support as much as possible.
To combine the shift/multiply instructions into the memory access
instructions, this patch comes with a few insn_and_split optimizations
that allow the combiner to do this task.
Handling the different cases of extensions results in a couple of INSNs
that look redundant at first view, but they are just the equivalent of
what we already have for Zbb as well. The only difference is that we
have many more load instructions.
We already have a constraint with the name 'th_f_fmv', therefore,
the new constraints follow this pattern and have the same length
as required ('th_m_mia', 'th_m_mib', 'th_m_mir', 'th_m_miu').
The added tests ensure that this feature won't regress without notice.
Testing: GCC regression test suite, GCC bootstrap build, and
SPEC CPU 2017 intrate (base&peak) on C920.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
gcc/ChangeLog:
* config/riscv/constraints.md (th_m_mia): New constraint.
(th_m_mib): Likewise.
(th_m_mir): Likewise.
(th_m_miu): Likewise.
* config/riscv/riscv-protos.h (enum riscv_address_type):
Add new address types ADDRESS_REG_REG, ADDRESS_REG_UREG,
and ADDRESS_REG_WB and their documentation.
(struct riscv_address_info): Add new field 'shift' and
document the field usage for the new address types.
(riscv_valid_base_register_p): New prototype.
(th_memidx_legitimate_modify_p): Likewise.
(th_memidx_legitimate_index_p): Likewise.
(th_classify_address): Likewise.
(th_output_move): Likewise.
(th_print_operand_address): Likewise.
* config/riscv/riscv.cc (riscv_index_reg_class):
Return GR_REGS for XTheadMemIdx.
(riscv_regno_ok_for_index_p): Add support for XTheadMemIdx.
(riscv_classify_address): Call th_classify_address() on top.
(riscv_output_move): Call th_output_move() on top.
(riscv_print_operand_address): Call th_print_operand_address()
on top.
* config/riscv/riscv.h (HAVE_POST_MODIFY_DISP): New macro.
(HAVE_PRE_MODIFY_DISP): Likewise.
* config/riscv/riscv.md (zero_extendqi<SUPERQI:mode>2): Disable
for XTheadMemIdx.
(*zero_extendqi<SUPERQI:mode>2_internal): Convert to expand,
create INSN with same name and disable it for XTheadMemIdx.
(extendsidi2): Likewise.
(*extendsidi2_internal): Disable for XTheadMemIdx.
* config/riscv/thead.cc (valid_signed_immediate): New helper
function.
(th_memidx_classify_address_modify): New function.
(th_memidx_legitimate_modify_p): Likewise.
(th_memidx_output_modify): Likewise.
(is_memidx_mode): Likewise.
(th_memidx_classify_address_index): Likewise.
(th_memidx_legitimate_index_p): Likewise.
(th_memidx_output_index): Likewise.
(th_classify_address): Likewise.
(th_output_move): Likewise.
(th_print_operand_address): Likewise.
* config/riscv/thead.md (*th_memidx_operand): New splitter.
(*th_memidx_zero_extendqi<SUPERQI:mode>2): New INSN.
(*th_memidx_extendsidi2): Likewise.
(*th_memidx_zero_extendsidi2): Likewise.
(*th_memidx_zero_extendhi<GPR:mode>2): Likewise.
(*th_memidx_extend<SHORT:mode><SUPERQI:mode>2): Likewise.
(*th_memidx_bb_zero_extendsidi2): Likewise.
(*th_memidx_bb_zero_extendhi<GPR:mode>2): Likewise.
(*th_memidx_bb_extendhi<GPR:mode>2): Likewise.
(*th_memidx_bb_extendqi<SUPERQI:mode>2): Likewise.
(TH_M_ANYI): New mode iterator.
(TH_M_NOEXTI): Likewise.
(*th_memidx_I_a): New combiner optimization.
(*th_memidx_I_b): Likewise.
(*th_memidx_I_c): Likewise.
(*th_memidx_US_a): Likewise.
(*th_memidx_US_b): Likewise.
(*th_memidx_US_c): Likewise.
(*th_memidx_UZ_a): Likewise.
(*th_memidx_UZ_b): Likewise.
(*th_memidx_UZ_c): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/xtheadmemidx-helpers.h: New test.
* gcc.target/riscv/xtheadmemidx-index-update.c: New test.
* gcc.target/riscv/xtheadmemidx-index-xtheadbb-update.c: New test.
* gcc.target/riscv/xtheadmemidx-index-xtheadbb.c: New test.
* gcc.target/riscv/xtheadmemidx-index.c: New test.
* gcc.target/riscv/xtheadmemidx-modify-xtheadbb.c: New test.
* gcc.target/riscv/xtheadmemidx-modify.c: New test.
* gcc.target/riscv/xtheadmemidx-uindex-update.c: New test.
* gcc.target/riscv/xtheadmemidx-uindex-xtheadbb-update.c: New test.
* gcc.target/riscv/xtheadmemidx-uindex-xtheadbb.c: New test.
* gcc.target/riscv/xtheadmemidx-uindex.c: New test.
Currently we have the documentation for __builtin_vec_bcdsub_{eq,gt,lt} but
not for __builtin_bcdsub_{ge,le}; this patch supplements the descriptions
for them. Although they are mainly for __builtin_bcdcmp{ge,le}, and we already
have some testing coverage for __builtin_vec_bcdsub_{eq,gt,lt}, this patch
adds the corresponding explicit test cases as well.
gcc/ChangeLog:
* doc/extend.texi (__builtin_bcdsub_le, __builtin_bcdsub_ge): Add
documentation for the built-ins.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/bcd-3.c (do_sub_ge, do_suble): Add functions
to test builtins __builtin_bcdsub_ge and __builtin_bcdsub_le.
The MICROBLAZE_VERSION_COMPARE was incorrectly using strcasecmp
instead of strverscmp to check the mcpu version against feature
options. By simply changing the define to use strverscmp,
the new version 10.0 is treated correctly as a higher version
than previous versions.
Fix incorrect warning with -mcpu=10.0:
warning: '-mxl-multiply-high' can be used only with
'-mcpu=v6.00.a' or greater
Signed-off-by: Neal Frager <neal.frager@amd.com>
Signed-off-by: Michael J. Eager <eager@eagercon.com>
The PR111971 test case uses a multi-register variable containing a fixed
register. LRA rejects such a multi-reg because of this when matching the
constraint for an asm insn. The rejection results in LRA cycling. The patch
fixes this issue.
gcc/ChangeLog:
PR rtl-optimization/111971
* lra-constraints.cc (process_alt_operands): Don't check start
hard regs for regs originated from register variables.
gcc/testsuite/ChangeLog:
PR rtl-optimization/111971
* gcc.target/powerpc/pr111971.c: New test.
On riscv, insn-emit.cc has grown to over 1.2 million lines of code and
compiling it takes considerable time.
Therefore, this patch adjusts genemit to create several partitions
(insn-emit-1.cc to insn-emit-n.cc). The available patterns are
written to the given files in a sequential fashion.
Similar to match.pd, a configure option --with-insnemit-partitions=num
is introduced that makes the number of partitions configurable.
gcc/ChangeLog:
PR bootstrap/84402
PR target/111600
* Makefile.in: Handle split insn-emit.cc.
* configure: Regenerate.
* configure.ac: Add --with-insnemit-partitions.
* genemit.cc (output_peephole2_scratches): Print to file instead
of stdout.
(print_code): Ditto.
(gen_rtx_scratch): Ditto.
(gen_exp): Ditto.
(gen_emit_seq): Ditto.
(emit_c_code): Ditto.
(gen_insn): Ditto.
(gen_expand): Ditto.
(gen_split): Ditto.
(output_add_clobbers): Ditto.
(output_added_clobbers_hard_reg_p): Ditto.
(print_overload_arguments): Ditto.
(print_overload_test): Ditto.
(handle_overloaded_code_for): Ditto.
(handle_overloaded_gen): Ditto.
(print_header): New function.
(handle_arg): New function.
(main): Split output into 10 files.
* gensupport.cc (count_patterns): New function.
* gensupport.h (count_patterns): Define.
* read-md.cc (md_reader::print_md_ptr_loc): Add file argument.
* read-md.h (class md_reader): Change definition.
Control flow redundancy may choose abnormal edges for early checking,
but that breaks because we can't insert checks on such edges.
Introduce conditional checking on the dest block of abnormal edges,
and leave it for the optimizer to drop the conditional.
for gcc/ChangeLog
PR tree-optimization/111943
* gimple-harden-control-flow.cc: Adjust copyright year.
(rt_bb_visited): Add vfalse and vtrue data members.
Zero-initialize them in the ctor.
(rt_bb_visited::insert_exit_check_on_edge): Upon encountering
abnormal edges, insert initializers for vfalse and vtrue on
entry, and insert the check sequence guarded by a conditional
in the dest block.
for libgcc/ChangeLog
* hardcfr.c: Adjust copyright year.
for gcc/testsuite/ChangeLog
PR tree-optimization/111943
* gcc.dg/harden-cfr-pr111943.c: New.
The following adjusts final value replacement to also rewrite the
replacement to defined overflow behavior if there are conditionally
evaluated stmts (with possibly undefined overflow), not only when
we "folded casts". The patch hooks into expression_expensive for
this.
PR tree-optimization/112305
* tree-scalar-evolution.h (expression_expensive): Adjust.
* tree-scalar-evolution.cc (expression_expensive): Record
when we see a COND_EXPR.
(final_value_replacement_loop): When the replacement contains
a COND_EXPR, rewrite it to defined overflow.
* tree-ssa-loop-ivopts.cc (may_eliminate_iv): Adjust.
* gcc.dg/torture/pr112305.c: New testcase.
The lowering done for invoking `new' on a single dimension array was
moved from the code generator to the front-end semantic pass in
r14-4996. This removes the detritus left behind in the code generator
from that deletion.
gcc/d/ChangeLog:
* expr.cc (ExprVisitor::visit (NewExp *)): Remove unused assignments.
Now that loongarch.md uses HAVE_AS_TLS, we need this to fix the failure when
building a cross compiler if the cross assembler is not installed yet.
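The fallback is presumably of the usual form (a sketch of what the ChangeLog
entry below describes, not the literal patch):

/* In loongarch-opts.h: configure only defines HAVE_AS_TLS after probing
   the cross assembler, so default it to 0 when it is not defined yet.  */
#ifndef HAVE_AS_TLS
#define HAVE_AS_TLS 0
#endif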
gcc/ChangeLog:
PR target/112299
* config/loongarch/loongarch-opts.h (HAVE_AS_TLS): Define to 0
if not defined yet.
gcc/ChangeLog:
* config/i386/avx512cdintrin.h (target): Push evex512 for
avx512cd.
* config/i386/avx512vlintrin.h (target): Split avx512cdvl part
out from avx512vl.
* config/i386/i386-builtin.def (BDESC): Do not check evex512
for builtins not needed.
Hi,
This patch lets the INT64 to FP16 conversion split into two smaller
conversions (INT64 -> FP32 and FP32 -> FP16) at expand time instead of
delaying the split to the split1 pass. This change makes it possible to
combine the FP32 to FP16 and vcond patterns, so we don't need to add a
combine pattern for INT64 to FP16 plus vcond.
Consider this code:
#include <stdint.h>

void
foo (_Float16 *__restrict r, int64_t *__restrict a, _Float16 *__restrict b,
     int64_t *__restrict pred, int n)
{
  for (int i = 0; i < n; i += 1)
    {
      r[i] = pred[i] ? (_Float16) a[i] : b[i];
    }
}
Before this patch:
...
vfncvt.f.f.w v2,v2
vmerge.vvm v1,v1,v2,v0
vse16.v v1,0(a0)
...
After this patch:
...
vfncvt.f.f.w v1,v2,v0.t
vse16.v v1,0(a0)
...
gcc/ChangeLog:
* config/riscv/autovec.md (<float_cvt><mode><vnnconvert>2):
Change to define_expand.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv32-1.c:
Add vfncvt.f.f.w assert.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv32-2.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv64-1.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv64-2.c:
Ditto.
This moves a few more value_replacement simplifications to match.pd:
/* a == 1 ? b : a * b -> a * b */
/* a == 1 ? b : b / a -> b / a */
/* a == -1 ? b : a & b -> a & b */
Also adds a testcase to show we can catch these where value_replacement
would not (but other passes would).
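For illustration (a hypothetical example, not the new testcase), the first
pattern lets a conditional like this fold to a plain multiply:

int
f (int a, int b)
{
  /* When a == 1, a * b == b, so the whole expression is just a * b.  */
  return a == 1 ? b : a * b;
}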
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/ChangeLog:
* match.pd (`a == 1 ? b : a OP b`): New pattern.
(`a == -1 ? b : a & b`): New pattern.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/phi-opt-value-4.c: New test.
This moves a few simple patterns that are done in value replacement
in phiopt over to match.pd. Just the simple ones which might show up
in other code.
This allows some optimizations to happen even without depending
on sinking from happening and in some cases where phiopt is not
invoked (cond-1.c is an example there).
Changes since v1:
* v2: Add an extra testcase to showcase improvements at -O1.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/ChangeLog:
* match.pd (`a == 0 ? b : b + a`,
`a == 0 ? b : b - a`): New patterns.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/cond-1.c: New test.
* gcc.dg/tree-ssa/phi-opt-value-1.c: New test.
* gcc.dg/tree-ssa/phi-opt-value-1a.c: New test.
* gcc.dg/tree-ssa/phi-opt-value-2.c: New test.
PR 111157 shows that IPA-modref and IPA-CP (when plugged into value
numbering) can optimize out a store both before a call (because the
call will overwrite it) and in the call (because the store is of the
same value) and by eliminating both create miscompilation.
This patch fixes that by pruning from the list of IPA-CP aggregate value
constants any constants for which it knows the contents of the memory can
be "killed."  Unfortunately, doing so is tricky.  First, IPA-modref
loads override kills, and so only stores that are not also loaded are truly
unnecessary.  Looking things up there would mean doing most of what
modref_may_alias does, but doing exactly what it does is tricky
because it also takes aliasing into account and has bail-out counters.
To err on the side of caution in order to avoid this miscompilation we
have to prune a constant when in doubt. However, pruning can
interfere with the mechanism of how clone materialization
distinguishes between the cases when a parameter was entirely removed
and when it was both IPA-CPed and IPA-SRAed (in order to make up for
the removal in debug info, which can bump into an assert when
compiling g++.dg/torture/pr103669.C when we are not careful).
Therefore this patch:
1) marks constants that IPA-modref has in its kill list with a new
"killed" flag, and
2) prunes the list from entries with this flag after materialization
and IPA-CP transformation is done using the template introduced in
the previous patch
It does not try to look up anything in the load lists, this will be
done as a follow-up in order to ease review.
gcc/ChangeLog:
2023-10-27 Martin Jambor <mjambor@suse.cz>
PR ipa/111157
* ipa-prop.h (struct ipa_argagg_value): New flag killed.
* ipa-modref.cc (ipcp_argagg_and_kill_overlap_p): New function.
(update_signature): Mark any IPA-CP aggregate constants at
positions known to be killed as killed. Move check that there is
clone_info after this pruning.
* ipa-cp.cc (ipa_argagg_value_list::dump): Dump the killed flag.
(ipa_argagg_value_list::push_adjusted_values): Clear the new flag.
(push_agg_values_from_plats): Likewise.
(ipa_push_agg_values_from_jfunc): Likewise.
(estimate_local_effects): Likewise.
(push_agg_values_for_index_from_edge): Likewise.
* ipa-prop.cc (write_ipcp_transformation_info): Stream the killed
flag.
(read_ipcp_transformation_info): Likewise.
(ipcp_get_aggregate_const): Update comment, assert that encountered
record does not have killed flag set.
(ipcp_transform_function): Prune all aggregate constants with killed
set.
gcc/testsuite/ChangeLog:
2023-09-18 Martin Jambor <mjambor@suse.cz>
PR ipa/111157
* gcc.dg/lto/pr111157_0.c: New test.
* gcc.dg/lto/pr111157_1.c: Second file of the same new test.
PR 111157 points to another place where IPA-CP collected aggregate
compile-time constants need to be filtered, in addition to the one
place that already does this in ipa-sra. In order to re-use code,
this patch turns the common bit into a template.
The functionality is still covered by testcase gcc.dg/ipa/pr108959.c.
gcc/ChangeLog:
2023-09-13 Martin Jambor <mjambor@suse.cz>
PR ipa/111157
* ipa-prop.h (ipcp_transformation): New member function template
remove_argaggs_if.
* ipa-sra.cc (zap_useless_ipcp_results): Use remove_argaggs_if to
filter aggregate constants.