FreeChainXenon/gcc - Aiden Isik's Forgejo Server

Author	SHA1	Message	Date
Jakub Jelinek	01dfc5b4ad	bitint: Use gsi_insert_on_edge rather than gsi_insert_on_edge_immediate [PR115887] The following testcase ICEs on x86_64-linux, because we try to gsi_insert_on_edge_immediate a statement on an edge which already has statements queued with gsi_insert_on_edge, and the deferral has been intentional so that we don't need to deal with cfg changes in between. The following patch uses the delayed insertion as well. 2024-07-17 Jakub Jelinek <jakub@redhat.com> PR middle-end/115887 * gimple-lower-bitint.cc (gimple_lower_bitint): Use gsi_insert_on_edge instead of gsi_insert_on_edge_immediate and set edge_insertions to true. * gcc.dg/bitint-108.c: New test. (cherry picked from commit 5104fe4c7808a66ed3041a8da8e4720585cc8a1f)	2024-07-17 17:43:04 +02:00
Jakub Jelinek	d668f87598	gimple-fold: Fix up __builtin_clear_padding lowering [PR115527] The builtin-clear-padding-6.c testcase fails as clear_padding_type doesn't correctly recompute the buf->size and buf->off members after expanding clearing of an array using a runtime loop. buf->size should be in that case the offset after which it should continue with next members or padding before them modulo UNITS_PER_WORD and buf->off that offset minus buf->size. That is what the code was doing, but with off being the start of the loop cleared array, not its end. So, the last hunk in gimple-fold.cc fixes that. When adding the testcase, I've noticed that the c-c++-common/torture/builtin-clear-padding-* tests, although clearly written as runtime tests to test the builtins at runtime, didn't have { dg-do run } directive and were just compile tests because of that. When adding that to the tests, builtin-clear-padding-1.c was already failing without that clear_padding_type hunk too, but builtin-clear-padding-5.c was still failing even after the change. That is due to a bug in clear_padding_flush which the patch fixes as well - when clear_padding_flush is called with full=true (that happens at the end of the whole __builtin_clear_padding or on those array padding clears done by a runtime loop), it wants to flush all the pending padding clearings rather than just some. If it is at the end of the whole object, it decreases wordsize when needed to make sure the code never writes including RMW cycles to something outside of the object: if ((unsigned HOST_WIDE_INT) (buf->off + i + wordsize) > (unsigned HOST_WIDE_INT) buf->sz) { gcc_assert (wordsize > 1); wordsize /= 2; i -= wordsize; continue; } but if it is full==true flush in the middle, this doesn't happen, but we still process just the buffer bytes before the current end. If that end is not on a wordsize boundary, e.g. on the builtin-clear-padding-5.c test the last chunk is 2 bytes, '\0', '\xff', i is 16 and end is 18, nonzero_last might be equal to the end - i, i.e. 2 here, but still all_ones might be true, so in some spots we just didn't emit any clearing in that last chunk. 2024-07-17 Jakub Jelinek <jakub@redhat.com> PR middle-end/115527 * gimple-fold.cc (clear_padding_flush): Introduce endsize variable and use it instead of wordsize when comparing it against nonzero_last. (clear_padding_type): Increment off by sz. * c-c++-common/torture/builtin-clear-padding-1.c: Add dg-do run directive. * c-c++-common/torture/builtin-clear-padding-2.c: Likewise. * c-c++-common/torture/builtin-clear-padding-3.c: Likewise. * c-c++-common/torture/builtin-clear-padding-4.c: Likewise. * c-c++-common/torture/builtin-clear-padding-5.c: Likewise. * c-c++-common/torture/builtin-clear-padding-6.c: New test. (cherry picked from commit 8b5919bae11754f4b65a17e63663d3143f9615ac)	2024-07-17 17:40:47 +02:00
Jakub Jelinek	297ea7e5bb	c++: Fix ICE on constexpr placement new [PR115754] C++26 is making in P2747R2 paper placement new constexpr. While working on a patch for that, I've noticed we ICE starting with GCC 14 on the following testcase. The problem is that e.g. for the void * to sometype * casts checks, we really assume the casts have their operand constant evaluated as prvalue, but on the testcase the cast itself is evaluated with vc_discard and that means op can end up e.g. a VAR_DECL which the later code doesn't like and asserts on. If the result type is void, we don't really need the cast operand for anything, so can use vc_discard for the recursive call, VIEW_CONVERT_EXPR can appear on the lhs, so we need to honor the lval but otherwise the patch uses vc_prvalue. I'd like to get this patch in before the rest of P2747R2 implementation, so that it can be backported to 14.2 later on. 2024-07-02 Jakub Jelinek <jakub@redhat.com> Jason Merrill <jason@redhat.com> PR c++/115754 * constexpr.cc (cxx_eval_constant_expression) <case CONVERT_EXPR>: For conversions to void, pass vc_discard to the recursive call and otherwise for tcode other than VIEW_CONVERT_EXPR pass vc_prvalue. * g++.dg/cpp26/pr115754.C: New test. (cherry picked from commit 1250540a98e0a1dfa4d7834672d88d8543ea70b1)	2024-07-17 17:38:04 +02:00
Robin Dapp	bf64404280	vect: Merge loop mask and cond_op mask in fold-left reduction [PR115382]. Currently we discard the cond-op mask when the loop is fully masked which causes wrong code in gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c when compiled with -O3 -march=cascadelake --param vect-partial-vector-usage=2. This patch ANDs both masks. gcc/ChangeLog: PR tree-optimization/115382 * tree-vect-loop.cc (vectorize_fold_left_reduction): Use prepare_vec_mask. * tree-vect-stmts.cc (check_load_store_for_partial_vectors): Remove static of prepare_vec_mask. * tree-vectorizer.h (prepare_vec_mask): Export. (cherry picked from commit 2b438a0d2aa80f051a09b245a58f643540d4004b)	2024-07-17 08:18:21 +02:00
Richard Biener	c58bede01c	tree-optimization/115868 - ICE with .MASK_CALL in simdclone The following adjusts mask recording which didn't take into account that we can merge call arguments from two vectors like _50 = {vect_d_1.253_41, vect_d_1.254_43}; _51 = VIEW_CONVERT_EXPR<unsigned char>(mask__19.257_49); _52 = (unsigned int) _51; _53 = _Z3bazd.simdclone.7 (_50, _52); _54 = BIT_FIELD_REF <_53, 256, 0>; _55 = BIT_FIELD_REF <_53, 256, 256>; The testcase g++.dg/vect/pr68762-2.cc exercises this on x86_64 with partial vector usage enabled and AVX512 support. PR tree-optimization/115868 * tree-vect-stmts.cc (vectorizable_simd_clone_call): Correctly compute the number of mask copies required for vect_record_loop_mask. (cherry picked from commit abf3964711f05b6858d9775c3595ec2b45483e14)	2024-07-17 08:14:27 +02:00
Nathaniel Shead	5fad0b552c	c++/modules: Propagate BINDING_VECTOR_*_DUPS_P on realloc [PR99242] When importing modules, when a binding vector for a name runs out of slots it gets reallocated with a larger size, and existing bindings are copied across. However, the flags to indicate whether deduping needs to occur did not: this causes ICEs, as it allows a duplicate binding to be added which then violates assumptions later on. PR c++/99242 gcc/cp/ChangeLog: * name-lookup.cc (append_imported_binding_slot): Propagate dups flags. gcc/testsuite/ChangeLog: * g++.dg/modules/pr99242_a.H: New test. * g++.dg/modules/pr99242_b.H: New test. * g++.dg/modules/pr99242_c.H: New test. * g++.dg/modules/pr99242_d.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com> (cherry picked from commit 1aa0f1627857c3e2d90982bdb07ca78ca10b26f3)	2024-07-17 11:23:23 +10:00
GCC Administrator	4039c7473a	Daily bump.	2024-07-17 00:24:45 +00:00
Richard Biener	59ed01d5e3	tree-optimization/115841 - reduction epilogue placement issue When emitting the compensation to the vectorized main loop for a vector reduction value to be re-used in the vectorized epilogue we fail to place it in the correct block when the main loop is known to be entered (no loop_vinfo->main_loop_edge) but the epilogue is not (a loop_vinfo->skip_this_loop_edge). The code currently disregards this situation. With the recent znver4 cost fix I couldn't trigger this situation with the testcase but I adjusted it so it could eventually trigger on other targets. PR tree-optimization/115841 * tree-vect-loop.cc (vect_transform_cycle_phi): Correctly place the partial vector reduction for the accumulator re-use when the main loop cannot be skipped but the epilogue can. * gcc.dg/vect/pr115841.c: New testcase. (cherry picked from commit 016c947b02e79a5c0c0c2d4ad5cb71aa04db3efd)	2024-07-16 16:22:35 +02:00
Richard Biener	06829e593d	tree-optimization/115843 - fix wrong-code with fully-masked loop and peeling When AVX512 uses a fully masked loop and peeling we fail to create the correct initial loop mask when the mask is composed of multiple components in some cases. The following fixes this by properly applying the bias for the component to the shift amount. PR tree-optimization/115843 * tree-vect-loop-manip.cc (vect_set_loop_condition_partial_vectors_avx512): Properly bias the shift of the initial mask for alignment peeling. * gcc.dg/vect/pr115843.c: New testcase. (cherry picked from commit a177be05f6952c3f7e62186d2e138d96c475b81a)	2024-07-16 16:22:24 +02:00
Richard Biener	e01012c459	tree-optimization/115701 - fix maybe_duplicate_ssa_info_at_copy The following restricts copying of points-to info from defs that might be in regions invoking UB and are never executed. PR tree-optimization/115701 * tree-ssanames.cc (maybe_duplicate_ssa_info_at_copy): Only copy info from within the same BB. * gcc.dg/torture/pr115701.c: New testcase. (cherry picked from commit b77f17c5feec9614568bf2dee7f7d811465ee4a5)	2024-07-16 16:22:05 +02:00
Richard Biener	6f74a5f5dc	tree-optimization/115701 - factor out maybe_duplicate_ssa_info_at_copy The following factors out the code that preserves SSA info of the LHS of a SSA copy LHS = RHS when LHS is about to be eliminated to RHS. PR tree-optimization/115701 * tree-ssanames.h (maybe_duplicate_ssa_info_at_copy): Declare. * tree-ssanames.cc (maybe_duplicate_ssa_info_at_copy): New function, split out from ... * tree-ssa-copy.cc (fini_copy_prop): ... here. * tree-ssa-sccvn.cc (eliminate_dom_walker::eliminate_stmt): ... and here. (cherry picked from commit b5c64b413fd5bc03a1a8ef86d005892071e42cbe)	2024-07-16 16:22:05 +02:00
Richard Biener	ca275b68ef	tree-optimization/115867 - ICE with simdcall vectorization in masked loop When only a loop mask is to be supplied for the inbranch arg to a simd function we fail to handle integer mode masks correctly. We need to guess the number of elements represented by it. This assumes that excess arguments are all for masks, I wasn't able to create a simdclone with more than one integer mode mask argument. The gcc.dg/vect/vect-simd-clone-20.c exercises this with -mavx512vl PR tree-optimization/115867 * tree-vect-stmts.cc (vectorizable_simd_clone_call): Properly guess the number of mask elements for integer mode masks. (cherry picked from commit 4f4478f0f31263997bfdc4159f90e58dd79b38f9)	2024-07-16 16:22:05 +02:00
Richard Biener	4a04110ec8	Fixup unaligned load/store cost for znver5 Currently unaligned YMM and ZMM load and store costs are cheaper than aligned which causes the vectorizer to purposely mis-align accesses by adding an alignment prologue. It looks like the unaligned costs were simply copied from the bogus znver4 costs. The following makes the unaligned costs equal to the aligned costs like in the fixed znver4 version. * config/i386/x86-tune-costs.h (znver5_cost): Update unaligned load and store cost from the aligned costs. (cherry picked from commit 896393791ee34ffc176c87d232dfee735db3aaab)	2024-07-16 16:22:05 +02:00
Richard Biener	d702a95775	Fixup unaligned load/store cost for znver4 Currently unaligned YMM and ZMM load and store costs are cheaper than aligned which causes the vectorizer to purposely mis-align accesses by adding an alignment prologue. It looks like the unaligned costs were simply left untouched from znver3 where they equate the aligned costs when tweaking aligned costs for znver4. The following makes the unaligned costs equal to the aligned costs. This avoids the miscompile seen in PR115843 but it's of course not a real fix for the issue uncovered there. But it makes it qualify as a regression fix. PR tree-optimization/115843 * config/i386/x86-tune-costs.h (znver4_cost): Update unaligned load and store cost from the aligned costs. (cherry picked from commit 1e3aa9c9278db69d4bdb661a750a7268789188d6)	2024-07-16 16:22:05 +02:00
Alexandre Oliva	c8fdef7fc2	[alpha] adjust MEM alignment for block move [PR115459] Before issuing loads or stores for a block move, adjust the MEM alignments if analysis of the addresses enabled the inference of stricter alignment. This ensures that the MEMs are sufficiently aligned for the corresponding insns, which avoids trouble in case of e.g. substitutions into SUBREGs. for gcc/ChangeLog PR target/115459 * config/alpha/alpha.cc (alpha_expand_block_move): Adjust MEMs to match inferred alignment. (cherry picked from commit ccfe7151803956d178947d0afda0bd66ce097275)	2024-07-16 08:54:20 -03:00
Christoph Müllner	b3cff8357e	RISC-V: Allow adding enabled extension via target arch attributes The set of enabled extensions can be extended via target arch function attributes by listing each extension with a '+' prefix and a comma as list separator. E.g.: __attribute__((target("arch=+zba,+zbb"))) void foo(); The programmer intends to ensure that one or more extensions are enabled when building the code. This is independent of the arch string that is passed at build time via the -march= option. Therefore, it is reasonable to allow enabling extensions via target arch attributes, which have already been enabled via the -march= string. The subset list code already supports such duplication for implied extensions. This patch adds an interface so the subset list parser can be switched into a mode where duplication is allowed. This commit fixes the following regressed test cases: * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-39.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-42.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-43.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-44.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-45.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-46.c gcc/ChangeLog: * common/config/riscv/riscv-common.cc (riscv_subset_list::add): Allow adding enabled extension if m_allow_adding_dup is set. * config/riscv/riscv-subset.h: Add m_allow_adding_dup and setter. * config/riscv/riscv-target-attr.cc (riscv_target_attr_parser::parse_arch): Allow adding enabled extensions. gcc/testsuite/ChangeLog: * gcc.target/riscv/pr115554.c: Change expected fail to expected pass. * gcc.target/riscv/target-attr-16.c: New test. (cherry picked from commit 61c21a719e205f70bd046c6a0275d1a3fd6341a4) Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>	2024-07-16 13:02:16 +02:00
Christoph Müllner	0e1f599d63	RISC-V: Rewrite target attribute handling The target-arch attribute handling in RISC-V is only a few months old, but already saw a rewrite (`9941f0295a`), which addressed an important issue. This rewrite introduced a hash table in the backend, which is used to keep track of target-arch attributes of all functions. The index of this hash table is the pointer to the function declaration object (fndecl). However, objects like these don't have the lifetime that is assumed here, which resulted in observing two fndecl objects with the same address for different objects (triggering the assertion in riscv_func_target_put() -- see also PR115562). This patch removes the hash table approach in favor of storing target specific options using the DECL_FUNCTION_SPECIFIC_TARGET() macro, which is also used by other backends and is specifically designed for this purpose (https://gcc.gnu.org/onlinedocs/gccint/Function-Properties.html). To have an accessible field in the target options, we need to adjust riscv.opt and introduce the field riscv_arch_string (for the already existing option '-march='). Using this macro allows to remove much code from riscv-common.cc, which controls access to the objects 'func_target_table' and 'current_subset_list'. One thing to mention is, that we had two subset lists: current_subset_list and cmdline_subset_list, with the latter being introduced recently for target attribute handling. This patch reduces them back to one (cmdline_subset_list) which contains the list of extensions that have been enabled by the command line arguments. Note that the patch keeps the existing behavior of rejecting duplications of extensions when added via the '+' operator in a function target attribute. E.g. "-march=rv64gc_zbb" and "arch=+zbb" will trigger an error (see pr115554.c). However, at the same time this patch breaks the acceptance of adding implied extensions, which causes the following six regressions (with the error "extension 'EXT' appear more than one time"): * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-39.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-42.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-43.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-44.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-45.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-46.c New tests were added to document the behavior and to ensure it won't regress. This patch did not show any regressions for rv32/rv64 and fixes the ICEs from PR115554 and PR115562. PR target/115554 PR target/115562 gcc/ChangeLog: * common/config/riscv/riscv-common.cc (struct riscv_func_target_info): Remove. (struct riscv_func_target_hasher): Likewise. (riscv_func_decl_hash): Likewise. (riscv_func_target_hasher::hash): Likewise. (riscv_func_target_hasher::equal): Likewise. (riscv_current_subset_list): Likewise. (riscv_cmdline_subset_list): Remove obsolete space. (riscv_func_target_table_lazy_init): Remove. (riscv_func_target_get): Likewise. (riscv_func_target_put): Likewise. (riscv_func_target_remove_and_destory): Likewise. (riscv_arch_str): Generate from cmdline_subset_list. (riscv_set_arch_by_subset_list): Don't set current_subset_list. (riscv_parse_arch_string): Remove current_subset_list. * config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Get subset list via riscv_cmdline_subset_list(). * config/riscv/riscv-subset.h (riscv_current_subset_list): Remove prototype. (riscv_func_target_get): Likewise. (riscv_func_target_put): Likewise. (riscv_func_target_remove_and_destory): Likewise. * config/riscv/riscv-target-attr.cc (riscv_target_attr_parser::parse_arch): Build base arch string from existing target options, if any. (riscv_target_attr_parser::update_settings): Store new arch string in target options. (riscv_process_one_target_attr): Whitespace fix. (riscv_process_target_attr): Drop opts argument. (riscv_option_valid_attribute_p): Properly save, change and restore target options. * config/riscv/riscv.cc (get_arch_str): New function. (riscv_declare_function_name): Get arch string for option-arch directive from function's target options. * config/riscv/riscv.opt: Add riscv_arch_string variable to march option. gcc/testsuite/ChangeLog: * gcc.target/riscv/target-attr-01.c: Add test for option-arch directive. * gcc.target/riscv/target-attr-02.c: Likewise. * gcc.target/riscv/target-attr-03.c: Likewise. * gcc.target/riscv/target-attr-04.c: Likewise. * gcc.target/riscv/target-attr-05.c: Fix formatting. * gcc.target/riscv/target-attr-06.c: Likewise. * gcc.target/riscv/target-attr-07.c: Likewise. * gcc.target/riscv/pr115554.c: New test. * gcc.target/riscv/pr115562.c: New test. * gcc.target/riscv/target-attr-08.c: New test. * gcc.target/riscv/target-attr-09.c: New test. * gcc.target/riscv/target-attr-10.c: New test. * gcc.target/riscv/target-attr-11.c: New test. * gcc.target/riscv/target-attr-12.c: New test. * gcc.target/riscv/target-attr-13.c: New test. * gcc.target/riscv/target-attr-14.c: New test. * gcc.target/riscv/target-attr-15.c: New test. (cherry picked from commit aa8e2de78cae4dca7f9b0efe0685f3382f9ecb9a) Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>	2024-07-16 13:02:16 +02:00
Christoph Müllner	b604d59b23	RISC-V: Fix comment/naming in attribute parsing code Function target attributes have to be separated by semi-colons. Let's fix the comment and variable naming to better explain what the code does. gcc/ChangeLog: * config/riscv/riscv-target-attr.cc (riscv_process_target_attr): Fix comments and variable names. (cherry picked from commit 5ef0b7d2048a7142174ee3e8e021fc1a9c3e3334) Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>	2024-07-16 13:02:16 +02:00
Christoph Müllner	20fb450d17	RISC-V: Deduplicate arch subset list processing We have a code duplication in riscv_set_arch_by_subset_list() and riscv_parse_arch_string(), where the latter function parses an ISA string into a subset_list before doing the same as the former function. riscv_parse_arch_string() is used to process command line options and riscv_set_arch_by_subset_list() processes target attributes. So, it is obvious that both functions should do the same. Let's deduplicate the code to enforce this. gcc/ChangeLog: * common/config/riscv/riscv-common.cc (riscv_set_arch_by_subset_list): Fix overlong line. (riscv_parse_arch_string): Replace duplicated code by a call to riscv_set_arch_by_subset_list. (cherry picked from commit 85fa334fbcaa8e4b98ab197a8c9410dde87f0ae3) Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>	2024-07-16 13:02:16 +02:00
Christoph Müllner	ea5907d6d4	RISC-V: testsuite: Properly gate LTO tests There are two test cases with the following skip directive: dg-skip-if "" { *-*-* } { "-flto -fno-fat-lto-objects" } This reads as: skip if both '-flto' and '-fno-fat-lto-objects' are present. This is not the case if only '-flto' is present. Since both tests depend on instruction sequences (one does check-function-bodies the other tests for an assembler error message), they won't work reliably with fat LTO objects. Let's change the skip line to gate the test on '-flto' to avoid failing tests like this: FAIL: gcc.target/riscv/interrupt-misaligned.c -O2 -flto check-function-bodies interrupt FAIL: gcc.target/riscv/interrupt-misaligned.c -O2 -flto -flto-partition=none check-function-bodies interrupt FAIL: gcc.target/riscv/pr93202.c -O2 -flto (test for errors, line 10) FAIL: gcc.target/riscv/pr93202.c -O2 -flto (test for errors, line 9) FAIL: gcc.target/riscv/pr93202.c -O2 -flto -flto-partition=none (test for errors, line 10) FAIL: gcc.target/riscv/pr93202.c -O2 -flto -flto-partition=none (test for errors, line 9) gcc/testsuite/ChangeLog: * gcc.target/riscv/interrupt-misaligned.c: Remove "-fno-fat-lto-objects" from skip condition. * gcc.target/riscv/pr93202.c: Likewise. (cherry picked from commit 0717d50fc4ff983b79093bdef43b04e4584cc3cd) Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>	2024-07-16 13:02:16 +02:00
Alexandre Oliva	7bc63f1c70	[i386] adjust flag_omit_frame_pointer in a single function [PR113719] The first two patches for PR113719 have each regressed gcc.dg/ipa/iinline-attr.c on a different target. The reason for this instability is that there are competing flag_omit_frame_pointer overriders on x86: - ix86_recompute_optlev_based_flags computes and sets a -f[no-]omit-frame-pointer default depending on USE_IX86_FRAME_POINTER and, in 32-bit mode, optimize_size - ix86_option_override_internal enables flag_omit_frame_pointer for -momit-leaf-frame-pointer to take effect ix86_option_override[_internal] calls ix86_recompute_optlev_based_flags before setting flag_omit_frame_pointer. It is called during global process_options. But ix86_recompute_optlev_based_flags is also called by parse_optimize_options, during attribute processing, and at that point, ix86_option_override is not called, so the final overrider for global options is not applied to the optimize attributes. If they differ, the testcase fails. In order to fix this, we need to process all overriders of this option whenever we process any of them. Since this setting is affected by optimization options, it makes sense to compute it in parse_optimize_options, rather than in process_options. for gcc/ChangeLog PR target/113719 * config/i386/i386-options.cc (ix86_option_override_internal): Move flag_omit_frame_pointer final overrider... (ix86_recompute_optlev_based_flags): ... here. (cherry picked from commit bf8e80f9d164f8778d86a3dc50e501cf19a9eff1)	2024-07-16 06:37:13 -03:00
Alexandre Oliva	102bcf1478	[i386] restore recompute to override opts after change [PR113719] The first patch for PR113719 regressed gcc.dg/ipa/iinline-attr.c on toolchains configured to --enable-frame-pointer, because the optimization node created within handle_optimize_attribute had flag_omit_frame_pointer incorrectly set, whereas default_optimization_node didn't. With this difference, can_inline_edge_by_limits_p flagged an optimization mismatch and we refused to inline the function that had a redundant optimization flag into one that didn't, which is exactly what is tested for there. This patch restores the calls to ix86_default_align and ix86_recompute_optlev_based_flags that used to be, and ought to be, issued during TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE, but preserves the intent of the original change, of having those functions called at different spots within ix86_option_override_internal. To that end, the remaining bits were refactored into a separate function, that was in turn adjusted to operate on explicitly-passed opts and opts_set, rather than going for their global counterparts. for gcc/ChangeLog PR target/113719 * config/i386/i386-options.cc (ix86_override_options_after_change_1): Add opts and opts_set parms, operate on them, after factoring out of... (ix86_override_options_after_change): ... this. Restore calls of ix86_default_align and ix86_recompute_optlev_based_flags. (ix86_option_override_internal): Call the factored-out bits. (cherry picked from commit bf2fc0a27b35de039c3d45e6d7ea9ad0a8a305ba)	2024-07-16 06:27:06 -03:00
H.J. Lu	1fff665a51	x86: Update branch hint for Redwood Cove. According to Intel® 64 and IA-32 Architectures Optimization Reference Manual[1], Branch Hint is updated for Redwood Cove. --------cut from [1]------------------------- Starting with the Redwood Cove microarchitecture, if the predictor has no stored information about a branch, the branch has the Intel® SSE2 branch taken hint (i.e., instruction prefix 3EH), When the codec decodes the branch, it flips the branch’s prediction from not-taken to taken. It then flushes the pipeline in front of it and steers this pipeline to fetch the taken path of the branch. --------cut end ----------------------------- Split tune branch_prediction_hints into branch_prediction_hints_taken and branch_prediction_hints_not_taken, always generate branch hint for conditional branches, both tunes are disabled by default. [1] https://www.intel.com/content/www/us/en/content-details/821612/intel-64-and-ia-32-architectures-optimization-reference-manual-volume-1.html gcc/ * config/i386/i386.cc (ix86_print_operand): Always generate branch hint for conditional branches. * config/i386/i386.h (TARGET_BRANCH_PREDICTION_HINTS): Split into .. (TARGET_BRANCH_PREDICTION_HINTS_TAKEN): .. this, and .. (TARGET_BRANCH_PREDICTION_HINTS_NOT_TAKEN): .. this. * config/i386/x86-tune.def (X86_TUNE_BRANCH_PREDICTION_HINTS): Split into .. (X86_TUNE_BRANCH_PREDICTION_HINTS_TAKEN): .. this, and .. (X86_TUNE_BRANCH_PREDICTION_HINTS_NOT_TAKEN): .. this. (cherry picked from commit a910c30c7c27cd0f6d2d2694544a09fb11d611b9)	2024-07-16 09:28:08 +08:00
GCC Administrator	0fcadb3d51	Daily bump.	2024-07-16 00:26:23 +00:00
Harald Anlauf	71ec9ed7a7	Fortran: improve attribute conflict checking [PR93635] gcc/fortran/ChangeLog: PR fortran/93635 * symbol.cc (conflict_std): Helper function for reporting attribute conflicts depending on the Fortran standard version. (conf_std): Helper macro for checking standard-dependent conflicts. (gfc_check_conflict): Use it. gcc/testsuite/ChangeLog: PR fortran/93635 * gfortran.dg/c-interop/c1255-2.f90: Adjust pattern. * gfortran.dg/pr87907.f90: Likewise. * gfortran.dg/pr93635.f90: New test. Co-authored-by: Steven G. Kargl <kargl@gcc.gnu.org> (cherry picked from commit 9561cf550a66a89e7c8d31202a03c4fddf82a3f2)	2024-07-15 20:41:43 +02:00
liuhongt	13bfc385b0	Fix SSA_NAME leak due to def_stmt is removed before use_stmt. - _5 = __atomic_fetch_or_8 (&set_work_pending_p, 1, 0); - # DEBUG old => (long int) _5 + _6 = .ATOMIC_BIT_TEST_AND_SET (&set_work_pending_p, 0, 1, 0, __atomic_fetch_or_8); + # DEBUG old => NULL # DEBUG BEGIN_STMT - # DEBUG D#2 => _5 & 1 + # DEBUG D#2 => NULL ... - _10 = ~_5; - _8 = (_Bool) _10; - # DEBUG ret => _8 + _8 = _6 == 0; + # DEBUG ret => (_Bool) _10 confirmed. convert_atomic_bit_not does this, it checks for single_use and removes the def, failing to release the name (which would fix this up IIRC). Note the function removes stmts in "wrong" order (before uses of LHS are removed), so it requires larger surgery. And it leaks SSA names. gcc/ChangeLog: PR target/115872 * tree-ssa-ccp.cc (convert_atomic_bit_not): Remove use_stmt after use_nop_stmt is removed. (optimize_atomic_bit_test_and): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/pr115872.c: New test. (cherry picked from commit a8209237dc46dc4db7d9d8e3807e6c93734c64b5)	2024-07-15 14:19:57 +08:00
GCC Administrator	53dd1ced62	Daily bump.	2024-07-15 00:24:06 +00:00
Mikael Morin	c80a746023	fortran: Assume there is no cyclic reference with submodule symbols [PR99798] This prevents a premature release of memory with procedure symbols from submodules, causing random compiler crashes. The problem is a fragile detection of cyclic references, which can match with procedures host-associated from a module in submodules, in cases where it shouldn't. The formal namespace is released, and with it the dummy arguments symbols of the procedure. But there is no cyclic reference, so the procedure symbol itself is not released and remains, with pointers to its dummy arguments now dangling. The fix adds a condition to avoid the case, and refactors to a new predicate by the way. Part of the original condition is also removed, for lack of a reason to keep it. PR fortran/99798 gcc/fortran/ChangeLog: * symbol.cc (gfc_release_symbol): Move the condition guarding the handling cyclic references... (cyclic_reference_break_needed): ... here as a new predicate. Remove superfluous parts. Add a condition preventing any premature release with submodule symbols. gcc/testsuite/ChangeLog: * gfortran.dg/submodule_33.f08: New test. (cherry picked from commit 38d1761c0c94b77a081ccc180d6e039f7a670468)	2024-07-14 20:31:20 +02:00
Mikael Morin	55988c48ea	fortran: Correctly evaluate scalar MASK arguments of MINLOC/MAXLOC Add the preliminary code that the generated expression for MASK may depend on when generating the inline code to evaluate MINLOC or MAXLOC with a scalar MASK. The generated code was only keeping the generated expression but not the preliminary code, which was sufficient for simple cases such as data references or simple (scalar) function calls, but was bogus with more complicated ones. gcc/fortran/ChangeLog: * trans-intrinsic.cc (gfc_conv_intrinsic_minmaxloc): Add the preliminary code generated for MASK to the preliminary code of MINLOC/MAXLOC. gcc/testsuite/ChangeLog: * gfortran.dg/minmaxloc_17.f90: New test. (cherry picked from commit d211100903d4d532d989451243ea00d7fa2e9d5e)	2024-07-14 19:30:33 +02:00
GCC Administrator	81972649bd	Daily bump.	2024-07-14 00:24:37 +00:00
Lulu Cheng	89f9342980	LoongArch: TFmode is not allowed to be stored in the float register. PR target/115752 gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_hard_regno_mode_ok_uncached): Replace UNITS_PER_FPVALUE with UNITS_PER_HWFPVALUE. * config/loongarch/loongarch.h (UNITS_PER_FPVALUE): Delete. gcc/testsuite/ChangeLog: * gcc.target/loongarch/pr115752.c: New test. (cherry picked from commit abeb6c8a62758faa0719e818e6e8a7db15a6793b)	2024-07-13 14:06:13 +08:00
Stefan Schulze Frielinghaus	5ade7afdef	s390: Fix output template for movv1qi Although for instructions MVI and MVIY it does not make a difference whether the immediate is interpreted as signed or unsigned, GAS expects unsigned immediates for instruction format SI_URD. gcc/ChangeLog: * config/s390/vector.md (mov<mode>): Fix output template for movv1qi. (cherry picked from commit e6680d3f392f7f7cc2a1515276213e21e9eeab1c)	2024-07-13 08:01:59 +02:00
Stefan Schulze Frielinghaus	cd11413ff7	s390: Align cjump_64 and icjump_64 During machine reorg we optimize backward jumps and transform insns as e.g. (jump_insn 118 117 119 (set (pc) (if_then_else (ne (reg:CCRAW 33 %cc) (const_int 8 [0x8])) (label_ref 134) (pc))) "dec_math_1.f90":204:8 discrim 1 2161 {cjump_64} (expr_list:REG_DEAD (reg:CCRAW 33 %cc) (int_list:REG_BR_PROB 719407028 (nil))) -> 134) into (jump_insn 118 117 432 (set (pc) (if_then_else (ne (reg:CCRAW 33 %cc) (const_int 8 [0x8])) (pc) (label_ref 433))) "dec_math_1.f90":204:8 discrim 1 -1 (expr_list:REG_DEAD (reg:CCRAW 33 %cc) (int_list:REG_BR_PROB 719407028 (nil))) -> 433) The latter is not recognized anymore since icjump_64 only matches CC_REGNUM against zero. Fixed by aligning cjump_64 and icjump_64. gcc/ChangeLog: * config/s390/s390.md (*icjump_64): Allow raw CC comparisons, i.e., any constant integer between 0 and 15 for CC comparisons. (cherry picked from commit 56de68aba6cb9cf3022d9e303eec6c6cdb49ad4d)	2024-07-13 08:01:51 +02:00
GCC Administrator	3cba6fb80e	Daily bump.	2024-07-13 00:23:54 +00:00
Jonathan Wakely	d920658cbb	libstdc++: Fix unwanted #pragma messages from PSTL headers [PR113376] When we rebased the PSTL on upstream, in r14-2109-g3162ca09dbdc2e, a change to how _PSTL_USAGE_WARNINGS is set was missed out, but the change to how it's tested was included. This means that the macro is always defined, so testing it with #ifdef (instead of using #if to test its value) doesn't work as intended. Revert the test to use #if again, since that part of the upstream change was unnecessary in the first place (the macro is always defined, so there's no need to use #ifdef to avoid -Wundef warnings). libstdc++-v3/ChangeLog: PR libstdc++/113376 * include/pstl/pstl_config.h: Use #if instead of #ifdef to test the _PSTL_USAGE_WARNINGS macro. (cherry picked from commit 99a1fe6c12c733fe4923a75a79d09a66ff8abcec)	2024-07-12 11:12:27 +01:00
Jonathan Wakely	21c8708ba6	libstdc++: Fix std::to_array for trivial-ish types [PR115522] Due to PR c++/85723 the std::is_trivial trait is true for types with a deleted default constructor, so the use of std::is_trivial in std::to_array is not sufficient to ensure the type can be trivially default constructed then filled using memcpy. I also forgot that a type with a deleted assignment operator can still be trivial, so we also need to check that it's assignable because the is_constant_evaluated() path can't use memcpy. Replace the uses of std::is_trivial with std::is_trivially_copyable (needed for memcpy), std::is_trivially_default_constructible (needed so that the default construction is valid and does no work) and std::is_copy_assignable (needed for the constant evaluation case). libstdc++-v3/ChangeLog: PR libstdc++/115522 * include/std/array (to_array): Workaround the fact that std::is_trivial is not sufficient to check that a type is trivially default constructible and assignable. * testsuite/23_containers/array/creation/115522.cc: New test. (cherry picked from commit 510ce5eed69ee1bea9c2c696fe3b2301e16d1486)	2024-07-12 11:12:27 +01:00
YunQiang Su	cff270707f	RISC-V: NO_WARNING preferred else value for RVV PR target/115840. In riscv_preferred_else_value, we create an uninitialized tmp var for else value, instead of the 0 (as default_preferred_else_value) or the pre-exists VAR (as aarch64 does), so that we can use agnostic policy. The problem is that `warn_uninit` will emit a warning: '({anonymous})' may be used uninitialized Let's mark this tmp var as NO_WARNING. This problem is found when I try to build glibc with V extension. gcc PR target/115840 * config/riscv/riscv.cc(riscv_preferred_else_value): Mark tmp_var as NO_WARNING. gcc/testsuite * gcc.dg/vect/pr115840.c: New testcase. (cherry picked from commit c6f38e5e6d900b8ed6a4f5c126d3197946cad4dd)	2024-07-12 17:31:18 +08:00
Paul Thomas	29b2e1cdb6	Fortran: Fix ICEs due to comp calls in initialization exprs [PR103312] 2024-05-23 Paul Thomas <pault@gcc.gnu.org> gcc/fortran PR fortran/103312 * dependency.cc (gfc_dep_compare_expr): Handle component call expressions. Return -2 as default and return 0 if compared with a function expression that is from an interface body and has the same name. * expr.cc (gfc_reduce_init_expr): If the expression is a comp call do not attempt to reduce, defer to resolution and return false. * trans-types.cc (gfc_get_dtype_rank_type, gfc_get_nodesc_array_type): Fix whitespace. gcc/testsuite/ PR fortran/103312 * gfortran.dg/pr103312.f90: New test. (cherry picked from commit 2ce90517ed75c4af9fc0616f2670cf6dfcfa8a91)	2024-07-12 07:34:16 +01:00
GCC Administrator	d096ff3715	Daily bump.	2024-07-12 00:26:24 +00:00
Andre Vieira	b7a16ad1df	mve: Fix vsetq_lane for 64-bit elements with lane 1 [PR 115611] This patch fixes the backend pattern that was printing the wrong input scalar register pair when inserting into lane 1. Added a new test to force float-abi=hard so we can use scan-assembler to check correct codegen. gcc/ChangeLog: PR target/115611 * config/arm/mve.md (mve_vec_setv2di_internal): Fix printing of input scalar register pair when lane = 1. gcc/testsuite/ChangeLog: * gcc.target/arm/mve/intrinsics/vsetq_lane_su64.c: New test. (cherry picked from commit 7c11fdd2cc11a7058e9643b6abf27831970ad2c9)	2024-07-11 18:02:30 +01:00
Nathaniel Shead	08c2abffe0	c++/modules: Keep entity mapping info across duplicate_decls [PR99241] When duplicate_decls finds a match with an existing imported declaration, it clears DECL_LANG_SPECIFIC of the olddecl and replaces it with the contents of newdecl; this clears DECL_MODULE_ENTITY_P causing an ICE if the same declaration is imported again later. This fixes the issue by ensuring that the flag is transferred to newdecl before clearing so that it ends up on olddecl again. For future-proofing we also do the same with DECL_MODULE_KEYED_DECLS_P, though because we don't yet support textual redefinition merging we can't yet test this works as intended. I don't expect it's possible for a new declaration already to have extra keyed decls mismatching that of the old declaration though, so I don't do anything with 'keyed_map' at this time. PR c++/99241 gcc/cp/ChangeLog: * decl.cc (duplicate_decls): Merge module entity information. gcc/testsuite/ChangeLog: * g++.dg/modules/pr99241_a.H: New test. * g++.dg/modules/pr99241_b.H: New test. * g++.dg/modules/pr99241_c.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com> (cherry picked from commit f04f9714fca40315360af109b9e5ca2305fd75db)	2024-07-11 21:11:30 +10:00
GCC Administrator	ddea107322	Daily bump.	2024-07-11 00:23:02 +00:00
Torbjörn SVENSSON	e7d81cf551	testsuite: Align testcase with implementation [PR105090] Since r13-1006-g2005b9b888eeac, the test case copysign_softfloat_1.c no longer contains any lsr instruction, so drop the check as per comment 9 in PR105090. gcc/testsuite/ChangeLog: PR target/105090 * gcc.target/arm/copysign_softfloat_1.c: Drop check for lsr. Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com> (cherry picked from commit 4865a92b35054fdfaa1318a4c1f56d95d44012a2)	2024-07-10 18:56:23 +02:00
Uros Bizjak	47a8b464d2	middle-end: Fix stale swapped condition code value [PR115836] emit_store_flag_1 calculates scode (swapped condition code) at the beginning of the function from the value of code variable. However, code variable may change before scode usage site, resulting in invalid stale scode value. Move calculation of scode value just before its only usage site to avoid stale scode value. PR middle-end/115836 gcc/ChangeLog: * expmed.cc (emit_store_flag_1): Move calculation of scode just before its only usage site. (cherry picked from commit 44933fdeb338e00c972e42224b9a83d3f8f6a757)	2024-07-10 15:13:26 +02:00
Fei Gao	efa30f6193	RISC-V: backport fix zcmp popretz [PR113715] Root cause: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=b27d323a368033f0b37e93c57a57a35fd9997864 Commit above tries in targetm.gen_epilogue () to detect if there's li a0,0 insn at the end of insn chain, if so, cm.popret is replaced by cm.popretz and li a0,0 insn is deleted. Insertion of the generated epilogue sequence into the insn chain doesn't happen at this moment. If later shrink-wrap decides NOT to insert the epilogue sequence at the end of insn chain, then the li a0,0 insn has already been mistakenly removed. Fix this issue by removing generation of cm.popretz in epilogue, leaving the assignment to a0 and use insn with cm.popret. That's likely going to result in some kind of code size regression, but not a correctness regression. Optimization can be done in future. Signed-off-by: Fei Gao <gaofei@eswincomputing.com> gcc/ChangeLog: PR target/113715 * config/riscv/riscv.cc (riscv_zcmp_can_use_popretz): Removed. (riscv_gen_multi_pop_insn): Remove generation of cm.popretz. gcc/testsuite/ChangeLog: * gcc.target/riscv/rv32e_zcmp.c: Adapt TC. * gcc.target/riscv/rv32i_zcmp.c: Likewise.	2024-07-10 02:08:54 +00:00
GCC Administrator	76b4721734	Daily bump.	2024-07-10 00:23:33 +00:00
Jonathan Wakely	c94c8ff5f5	libstdc++: Fix _Atomic(T) macro in <stdatomic.h> [PR115807] The definition of the _Atomic(T) macro needs to refer to ::std::atomic, not some other std::atomic relative to the current namespace. libstdc++-v3/ChangeLog: PR libstdc++/115807 * include/c_compatibility/stdatomic.h (_Atomic): Ensure it refers to std::atomic in the global namespace. * testsuite/29_atomics/headers/stdatomic.h/115807.cc: New test. (cherry picked from commit 40d234dd6439e8c8cfbf3f375a61906aed35c80d)	2024-07-09 19:59:21 +01:00
Jonathan Wakely	85a39a8aaf	libstdc++: Define __glibcxx_assert_fail for non-verbose build [PR115585] When the library is configured with --disable-libstdcxx-verbose the assertions just abort instead of calling __glibcxx_assert_fail, and so I didn't export that function for the non-verbose build. However, that option is documented to not change the library ABI, so we still need to export the symbol from the library. It could be needed by programs compiled against the headers from a verbose build. The non-verbose definition can just call abort so that it doesn't pull in I/O symbols, which are unwanted in a non-verbose build. libstdc++-v3/ChangeLog: PR libstdc++/115585 * src/c++11/assert_fail.cc (__glibcxx_assert_fail): Add definition for non-verbose builds. (cherry picked from commit 52370c839edd04df86d3ff2b71fcdca0c7376a7f)	2024-07-09 19:59:21 +01:00
Alfie Richards	72753ec820	Aarch64, bugfix: Fix NEON bigendian addp intrinsic [PR114890] This change removes code that switches the operands in bigendian mode erroneously. This fixes the related test also. gcc/ChangeLog: PR target/114890 * config/aarch64/aarch64-simd.md: Remove bigendian operand swap. gcc/testsuite/ChangeLog: PR target/114890 * gcc.target/aarch64/vector_intrinsics_asm.c: Remove xfail. (cherry picked from commit 11049cdf204bc96bc407e5dd44ed3b8a492f405a)	2024-07-09 12:39:39 +01:00
Wilco Dijkstra	83332e3f80	Arm: Fix ldrd offset range [PR115153] The valid offset range of LDRD in arm_legitimate_index_p is increased to -1024..1020 if NEON is enabled since VALID_NEON_DREG_MODE includes DImode. Fix this by moving the LDRD check earlier. gcc: PR target/115153 * config/arm/arm.cc (arm_legitimate_index_p): Move LDRD case before NEON. (thumb2_legitimate_index_p): Update comments. (output_move_neon): Use DFmode for vldr/vstr and non-checking adjust_address. gcc/testsuite: PR target/115153 * gcc.target/arm/pr115153.c: Add new test. * lib/target-supports.exp: Add arm_arch_v7ve_neon target support. (cherry picked from commit 44e5ecfd261afe72aa04eba4bf1a9ec782579cab)	2024-07-09 12:39:02 +01:00
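
The PSTL entry in this log (d920658cbb) turns on the difference between testing a macro with #ifdef and testing it with #if. When a macro is always defined and its value acts as the on/off switch, #ifdef is true even when the value is 0. The sketch below illustrates only that preprocessor behaviour. It is not the code from pstl_config.h, and DEMO_USAGE_WARNINGS and both function names are hypothetical stand-ins invented for this example.

```cpp
#include <cassert>

// Hypothetical stand-in for a configuration macro that is ALWAYS defined
// and uses its value (0 = off, nonzero = on) as the switch.
#define DEMO_USAGE_WARNINGS 0

// Tests only whether the macro is defined, ignoring its value.
constexpr bool enabled_by_ifdef()
{
#ifdef DEMO_USAGE_WARNINGS
  return true;   // taken: the macro is defined, even though it is 0
#else
  return false;
#endif
}

// Tests the macro's value, which is what an always-defined switch needs.
constexpr bool enabled_by_if()
{
#if DEMO_USAGE_WARNINGS
  return true;
#else
  return false;  // taken: the macro expands to 0
#endif
}

// Compile-time checks of the two behaviours.
static_assert(enabled_by_ifdef(), "#ifdef sees the macro as defined");
static_assert(!enabled_by_if(), "#if sees the value 0 and disables the branch");

// Runtime form of the same checks, callable from a test driver.
inline void check_demo_usage_warnings()
{
  assert(enabled_by_ifdef());
  assert(!enabled_by_if());
}
```

With the macro defined as 0, the #ifdef variant reports the feature as enabled and the #if variant reports it as disabled. This matches the commit message's reasoning that #ifdef "doesn't work as intended" for a macro that is always defined.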

210376 commits