FreeChainXenon/gcc - Aiden Isik's Forgejo Server

Author	SHA1	Message	Date
Tamar Christina	dfa17fd3b1	AArch64: Fix expansion of Advanced SIMD div and mul using SVE [PR109636] As suggested in the ticket this replaces the expansion by converting the Advanced SIMD types to SVE types by simply printing out an SVE register for these instructions. This fixes the subreg issues since there are no subregs involved anymore. gcc/ChangeLog: PR target/109636 * config/aarch64/aarch64-simd.md (<su_optab>div<mode>3, mulv2di3): Remove. * config/aarch64/iterators.md (VQDIV): Remove. (SVE_FULL_SDI_SIMD, SVE_FULL_HSDI_SIMD_DI, SVE_I_SIMD_DI): New. (VPRED, sve_lane_con): Add V4SI and V2DI. * config/aarch64/aarch64-sve.md (<optab><mode>3, @aarch64_pred_<optab><mode>): Support Advanced SIMD types. (mul<mode>3): New, split from <optab><mode>3. (@aarch64_pred_<optab><mode>, post_ra_<optab><mode>3): New. * config/aarch64/aarch64-sve2.md (@aarch64_mul_lane_<mode>, aarch64_mul_unpredicated_<mode>): Change SVE_FULL_HSDI to SVE_FULL_HSDI_SIMD_DI. gcc/testsuite/ChangeLog: PR target/109636 * gcc.target/aarch64/sve/pr109636_1.c: New test. * gcc.target/aarch64/sve/pr109636_2.c: New test. * gcc.target/aarch64/sve2/pr109636_1.c: New test.	2024-01-24 15:58:34 +00:00
Tamar Christina	306713c953	AArch64: Do not allow SIMD clones with simdlen 1 [PR113552] The AArch64 vector PCS does not allow simd calls with simdlen 1, however due to a bug we currently do allow it for num == 0. This causes us to emit a symbol that doesn't exist and we fail to link. gcc/ChangeLog: PR tree-optimization/113552 * config/aarch64/aarch64.cc (aarch64_simd_clone_compute_vecsize_and_simdlen): Block simdlen 1. gcc/testsuite/ChangeLog: PR tree-optimization/113552 * gcc.target/aarch64/pr113552.c: New test. * gcc.target/aarch64/simd_pcs_attribute-3.c: Remove bogus check.	2024-01-24 15:56:50 +00:00
Martin Jambor	bc4a20bc57	ipa-cp: Fix check for exceeding param_ipa_cp_value_list_size (PR 113490) When the check for exceeding the param_ipa_cp_value_list_size limit was modified to be ignored for generating values from self-recursive calls, it should have been changed from "equal to" to "equal to or greater than". This omission manifests itself as PR 113490. When I examined the condition I also noticed that the parameter should come from the callee rather than the caller, since the value list is associated with the former and not the latter. In practice the limit is of course very likely to be the same, but I fixed this aspect of the condition too. I briefly audited all other uses of opt_for_fn in ipa-cp.cc and all the others looked OK. gcc/ChangeLog: 2024-01-19 Martin Jambor <mjambor@suse.cz> PR ipa/113490 * ipa-cp.cc (ipcp_lattice<valtype>::add_value): Bail out if value count is equal or greater than the limit. Use the limit from the callee. gcc/testsuite/ChangeLog: 2024-01-22 Martin Jambor <mjambor@suse.cz> PR ipa/113490 * gcc.dg/ipa/pr113490.c: New test.	2024-01-24 16:20:18 +01:00
David Malcolm	e503f9aca9	analyzer: fix taint false +ve due to overzealous state purging [PR112977] gcc/analyzer/ChangeLog: PR analyzer/112977 * engine.cc (impl_region_model_context::on_liveness_change): Pass m_ext_state to sm_state_map::on_liveness_change. * program-state.cc (sm_state_map::on_svalue_leak): Guard removal of map entry based on can_purge_p. (sm_state_map::on_liveness_change): Add ext_state param. Add workaround for bad interaction between state purging and alt-inherited sm-state. * program-state.h (sm_state_map::on_liveness_change): Add ext_state param. * sm-taint.cc (taint_state_machine::has_alt_get_inherited_state_p): New. (taint_state_machine::can_purge_p): Return false for "has_lb" and "has_ub". * sm.h (state_machine::has_alt_get_inherited_state_p): New vfunc. gcc/testsuite/ChangeLog: PR analyzer/112977 * gcc.dg/plugin/plugin.exp: Add taint-pr112977.c. * gcc.dg/plugin/taint-pr112977.c: New test. Signed-off-by: David Malcolm <dmalcolm@redhat.com>	2024-01-24 10:11:35 -05:00
David Malcolm	b6e537571c	analyzer kernel plugin: implement __check_object_size [PR112927] PR analyzer/112927 reports a false positive from -Wanalyzer-tainted-size seen on the Linux kernel's drivers/char/ipmi/ipmi_devintf.c with the analyzer kernel plugin. The issue is that in: (A): if (msg->data_len > 272) { return -90; } (B): n = msg->data_len; __check_object_size(to, n); n = copy_from_user(to, from, n); the analyzer is treating __check_object_size as having arbitrary side effects, and, in particular could modify msg->data_len. Hence the sanitization that occurs at (A) above is treated as being for a different value than the size obtained at (B), hence the bogus warning at the call to copy_from_user. Fixed by extending the analyzer kernel plugin to "teach" it that __check_object_size has no side effects. gcc/testsuite/ChangeLog: PR analyzer/112927 * gcc.dg/plugin/analyzer_kernel_plugin.c (class known_function___check_object_size): New. (kernel_analyzer_init_cb): Register it. * gcc.dg/plugin/plugin.exp: Add taint-pr112927.c. * gcc.dg/plugin/taint-pr112927.c: New test. Signed-off-by: David Malcolm <dmalcolm@redhat.com>	2024-01-24 10:11:09 -05:00
Gaius Mulley	3de031c96f	PR modula2/113559 FIO.mod lseek requires cssize_t rather than longint This patch fixes a bug in gcc/m2/gm2-libs/FIO.mod which failed to cast the whence parameter into the correct type. The patch casts the whence parameter for lseek to SYSTEM.CSSIZE_T. gcc/m2/ChangeLog: PR modula2/113559 * gm2-libs/FIO.mod (SetPositionFromBeginning): Convert pos into CSSIZE_T during call to lseek. (SetPositionFromEnd): Convert pos into CSSIZE_T during call to lseek. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>	2024-01-24 13:11:46 +00:00
Rainer Orth	b8f54195ed	testsuite: i386: Don't restrict gcc.dg/vect/vect-simd-clone-16c.c etc. to i686 [PR113556] A couple of gcc.dg/vect/vect-simd-clone-1.c tests FAIL on 32-bit Solaris/x86 since 20230222: FAIL: gcc.dg/vect/vect-simd-clone-16c.c scan-tree-dump-times vect "[\\\\n\\\\r] [^\\\\n]* = foo\\\\.simdclone" 2 FAIL: gcc.dg/vect/vect-simd-clone-16d.c scan-tree-dump-times vect "[\\\\n\\\\r] [^\\\\n]* = foo\\\\.simdclone" 2 FAIL: gcc.dg/vect/vect-simd-clone-17c.c scan-tree-dump-times vect "[\\\\n\\\\r] [^\\\\n]* = foo\\\\.simdclone" 2 FAIL: gcc.dg/vect/vect-simd-clone-17d.c scan-tree-dump-times vect "[\\\\n\\\\r] [^\\\\n]* = foo\\\\.simdclone" 2 FAIL: gcc.dg/vect/vect-simd-clone-18c.c scan-tree-dump-times vect "[\\\\n\\\\r] [^\\\\n]* = foo\\\\.simdclone" 2 FAIL: gcc.dg/vect/vect-simd-clone-18d.c scan-tree-dump-times vect "[\\\\n\\\\r] [^\\\\n]* = foo\\\\.simdclone" 2 The problem is that the 32-bit Solaris/x86 triple still uses i386, although gcc defaults to -mpentium4. However, the tests only handle x86_64* and i686, although the tests don't seem to require some specific ISA extension not covered by vect_simd_clones. To fix this, the tests now allow generic i?86. At the same time, I've removed the wildcards from x86_64 and i686* since DejaGnu uses the canonical forms. Tested on i386-pc-solaris2.11 and i686-pc-linux-gnu. 2024-01-24 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> gcc/testsuite: PR target/113556 * gcc.dg/vect/vect-simd-clone-16c.c: Don't wildcard x86_64 in target specs. Allow any i?86 target instead of i686 only. * gcc.dg/vect/vect-simd-clone-16d.c: Likewise. * gcc.dg/vect/vect-simd-clone-17c.c: Likewise. * gcc.dg/vect/vect-simd-clone-17d.c: Likewise. * gcc.dg/vect/vect-simd-clone-18c.c: Likewise. * gcc.dg/vect/vect-simd-clone-18d.c: Likewise.	2024-01-24 13:56:23 +01:00
Rainer Orth	f4a2478f17	testsuite: i386: Fix gcc.target/i386/pr80833-1.c on 32-bit Solaris/x86 gcc.target/i386/pr80833-1.c FAILs on 32-bit Solaris/x86 since 20220609: FAIL: gcc.target/i386/pr80833-1.c scan-assembler pextrd Unlike e.g. Linux/i686, 32-bit Solaris/x86 defaults to -mstackrealign, so this patch overrides that to match. Tested on i386-pc-solaris2.11 and i686-pc-linux-gnu. 2024-01-23 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> gcc/testsuite: * gcc.target/i386/pr80833-1.c: Add -mno-stackrealign to dg-options.	2024-01-24 13:52:54 +01:00
YunQiang Su	58af788d1d	MIPS: Accept arguments for -mexplicit-relocs GAS has supported explicit relocs since 2001, and %pcrel_hi/low were introduced in 2014. In future, we may introduce more. Let's convert the -mexplicit-relocs option to accept the arguments none, base, and pcrel. We also update gcc/configure.ac to set the option's default value according to the GAS support detected when GCC itself is built. gcc * configure.ac: Detect the explicit relocs support for mips, and define C macro MIPS_EXPLICIT_RELOCS. * config.in: Regenerated. * configure: Regenerated. * doc/invoke.texi(MIPS Options): Add -mexplicit-relocs. * config/mips/mips-opts.h: Define enum mips_explicit_relocs. * config/mips/mips.cc(mips_set_compression_mode): Sorry if !TARGET_EXPLICIT_RELOCS instead of just set it. * config/mips/mips.h: Define TARGET_EXPLICIT_RELOCS and TARGET_EXPLICIT_RELOCS_PCREL with mips_opt_explicit_relocs. * config/mips/mips.opt: Introduce -mexplicit-relocs= option and define -m(no-)explicit-relocs as aliases.	2024-01-24 20:33:42 +08:00
Thomas Schwinge	7fcdb50136	MAINTAINERS: Update my work email address * MAINTAINERS: Update my work email address.	2024-01-24 12:03:03 +01:00
Alex Coplan	da9647e98a	aarch64: Re-enable ldp/stp fusion pass Since, to the best of my knowledge, all reported regressions related to the ldp/stp fusion pass have now been fixed, and PGO+LTO bootstrap with --enable-languages=all is working again with the passes enabled, this patch turns the passes back on by default, as agreed with Jakub here: https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642478.html gcc/ChangeLog: * config/aarch64/aarch64.opt (-mearly-ldp-fusion): Set default to 1. (-mlate-ldp-fusion): Likewise.	2024-01-24 09:22:19 +00:00
Tamar Christina	5900471166	middle-end: rename main_exit_p in reduction code. This renames main_exit_p to last_val_reduc_p to more accurately reflect what the value is calculating. gcc/ChangeLog: * tree-vect-loop.cc (vect_get_vect_def, vect_create_epilog_for_reduction): Rename main_exit_p to last_val_reduc_p.	2024-01-24 07:38:18 +00:00
Tamar Christina	72429448fd	middle-end: fix epilog reductions when vector iters peeled [PR113364] This fixes a bug where vect_create_epilog_for_reduction does not handle the case where all exits are early exits. In this case we should do as the induction handling code does and not have a main exit. This shows that some new miscompiles are happening (stage3 is likely miscompiled) but that's unrelated to this patch and I'll look at it next. gcc/ChangeLog: PR tree-optimization/113364 * tree-vect-loop.cc (vect_create_epilog_for_reduction): If all exits are early exits then we must reduce from the first offset for all of them. gcc/testsuite/ChangeLog: PR tree-optimization/113364 * gcc.dg/vect/vect-early-break_107-pr113364.c: New test.	2024-01-24 07:37:17 +00:00
Tobias Burnus	d89537a141	libgomp.texi: Document omp_pause_resource{,_all} and omp_target_memcpy* libgomp/ChangeLog: * libgomp.texi (Runtime Library Routines): Document omp_pause_resource, omp_pause_resource_all and omp_target_memcpy{,_rect}{,_async}. Co-authored-by: Sandra Loosemore <sandra@codesourcery.com> Signed-off-by: Tobias Burnus <tburnus@baylibre.com>	2024-01-24 08:06:28 +01:00
Huanghui Nie	ec0a68b9ee	libstdc++: [_Hashtable] Remove useless check for _M_before_begin node When removing the first node of a bucket it is useless to check if this bucket is the one containing the _M_before_begin node. The bucket before-begin node is already transferred to the next pointed-to bucket regardless of whether it is the container before-begin node. libstdc++-v3/ChangeLog: * include/bits/hashtable.h (_Hashtable<>::_M_remove_bucket_begin): Remove _M_before_begin check and cleanup implementation. Co-authored-by: Théo Papadopoulo <papadopoulo@gmail.com>	2024-01-24 06:36:04 +01:00
Patrick O'Neill	7f7d9c525c	RISC-V: Add regression test for vsetvl bug pr113429 The reduced testcase for pr113429 (cam4 failure) needed additional modules so it wasn't committed. The fuzzer found a c testcase that was also fixed with pr113429's fix. Adding it as a regression test. PR target/113429 gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/vsetvl/pr113429.c: New test. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>	2024-01-23 17:19:21 -08:00
Juzhe-Zhong	3132d2d36b	RISC-V: Fix large memory usage of VSETVL PASS [PR113495] The SPEC 2017 wrf benchmark exposes unreasonable memory usage in the VSETVL pass: the pass consumes over 33 GB of memory, which makes it impossible to compile SPEC 2017 wrf on a laptop. The root cause is memory-wasting variables: unsigned num_exprs = num_bbs * num_regs; sbitmap avl_def_loc = sbitmap_vector_alloc (num_bbs, num_exprs); sbitmap m_kill = sbitmap_vector_alloc (num_bbs, num_exprs); m_avl_def_in = sbitmap_vector_alloc (num_bbs, num_exprs); m_avl_def_out = sbitmap_vector_alloc (num_bbs, num_exprs); I find that compute_avl_def_data can be achieved by the RTL_SSA framework. Replace the implementation with one based on the RTL_SSA framework. After this patch, the memory-hog issue is fixed. simple vsetvl memory usage (valgrind --tool=massif --pages-as-heap=yes --massif-out-file=massif.out) is 1.673 GB. lazy vsetvl memory usage (valgrind --tool=massif --pages-as-heap=yes --massif-out-file=massif.out) is 2.441 GB. Tested on both RV32 and RV64, no regression. gcc/ChangeLog: PR target/113495 * config/riscv/riscv-vsetvl.cc (get_expr_id): Remove. (get_regno): Ditto. (get_bb_index): Ditto. (pre_vsetvl::compute_avl_def_data): Ditto. (pre_vsetvl::earliest_fuse_vsetvl_info): Fix large memory usage. (pre_vsetvl::pre_global_vsetvl_info): Ditto. gcc/testsuite/ChangeLog: PR target/113495 * gcc.target/riscv/rvv/vsetvl/avl_single-107.c: Adapt test.	2024-01-24 08:29:42 +08:00
GCC Administrator	3128786c7e	Daily bump.	2024-01-24 00:18:36 +00:00
Nathaniel Shead	bf358eaa16	testsuite: Disable new test for PR113292 on targets without TLS support This disables the new test added by r14-8168 on machines that don't have TLS support, such as bare-metal ARM. gcc/testsuite/ChangeLog: * g++.dg/modules/pr113292_c.C: Require TLS. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>	2024-01-24 10:42:03 +11:00
Marek Polacek	9010fdba68	c++: -Wdangling-reference and lambda false warning [PR109640] -Wdangling-reference checks if a function receives a temporary as its argument, and only warns if any of the arguments was a temporary. But we should not warn when the temporary represents a lambda or we generate false positives as in the attached testcases. PR c++/113256 PR c++/111607 PR c++/109640 gcc/cp/ChangeLog: * call.cc (do_warn_dangling_reference): Don't warn if the temporary is of lambda type. gcc/testsuite/ChangeLog: * g++.dg/warn/Wdangling-reference14.C: New test. * g++.dg/warn/Wdangling-reference15.C: New test. * g++.dg/warn/Wdangling-reference16.C: New test.	2024-01-23 16:35:31 -05:00
Tobias Burnus	ed4c7893de	MAINTAINERS: Update my email address ChangeLog: * MAINTAINERS: Update my email address. Signed-off-by: Tobias Burnus <tburnus@baylibre.com>	2024-01-23 22:18:57 +01:00
Jakub Jelinek	dbc5f1f523	c: Call c_fully_fold on __atomic_* operands in atomic_bitint_fetch_using_cas_loop [PR113518] As the following testcase shows, I forgot to call c_fully_fold on the __atomic_/__sync_ operands called on _BitInt address, the expressions are then used inside of TARGET_EXPR initializers etc. and are never fully folded later, which means we can ICE e.g. on C_MAYBE_CONST_EXPR trees inside of those. The following patch fixes it, while the function currently is only called in the C FE because C++ doesn't support BITINT_TYPE, I think guarding the calls on !c_dialect_cxx () is safer. 2024-01-23 Jakub Jelinek <jakub@redhat.com> PR c/113518 * c-common.cc (atomic_bitint_fetch_using_cas_loop): Call c_fully_fold on lhs_addr, val and model for C. * gcc.dg/bitint-77.c: New test.	2024-01-23 19:59:00 +01:00
Andrew Pinski	06ee648e9b	aarch64/expr: Use ccmp when the outer expression is used twice [PR100942] Ccmp is not used if the result of the and/ior is used by both a GIMPLE_COND and a GIMPLE_ASSIGN. This improves the code generation here by using ccmp in this case. Two changes are required: first, we need to allow the outer statement's result to be used more than once. The second change is that during the expansion of the gimple, we need to try using ccmp. This is needed because we don't expand the ssa name of the lhs but rather expand directly from the gimple. A small note on the ccmp_4.c testcase, we should be able to get slightly better than with this patch but it is one extra instruction compared to before. PR target/100942 gcc/ChangeLog: * ccmp.cc (ccmp_candidate_p): Add outer argument. Allow if the outer is true and the lhs is used more than once. (expand_ccmp_expr): Update call to ccmp_candidate_p. * expr.h (expand_expr_real_gassign): Declare. * expr.cc (expand_expr_real_gassign): New function, split out from... (expand_expr_real_1): ...here. * cfgexpand.cc (expand_gimple_stmt_1): Use expand_expr_real_gassign. gcc/testsuite/ChangeLog: * gcc.target/aarch64/ccmp_3.c: New test. * gcc.target/aarch64/ccmp_4.c: New test. * gcc.target/aarch64/ccmp_5.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com> Co-Authored-By: Richard Sandiford <richard.sandiford@arm.com>	2024-01-23 17:42:51 +00:00
Andrew Stubbs	cc082cf97e	Update my email in MAINTAINERS ChangeLog: * MAINTAINERS: Update Signed-off-by: Andrew Stubbs <ams@baylibre.com>	2024-01-23 17:23:22 +00:00
Alex Coplan	3d82ebb696	aarch64: Fix up debug uses in ldp/stp pass [PR113089] As the PR shows, we were missing code to update debug uses in the load/store pair fusion pass. This patch fixes that. The patch tries to give a complete treatment of the debug uses that will be affected by the changes we make, and in particular makes an effort to preserve debug info where possible, e.g. when re-ordering an update of a base register by a constant over a debug use of that register. When re-ordering loads over a debug use of a transfer register, we reset the debug insn. Likewise when re-ordering stores over debug uses of mem. While doing this I noticed that try_promote_writeback used a strange choice of move_range for the pair insn, in that it chose the previous nondebug insn instead of the insn itself. Since the insn is being changed, these move ranges are equivalent (at least in terms of nondebug insn placement as far as RTL-SSA is concerned), but I think it is more natural to choose the pair insn itself. This is needed to avoid incorrectly updating some debug uses. gcc/ChangeLog: PR target/113089 * config/aarch64/aarch64-ldp-fusion.cc (reset_debug_use): New. (fixup_debug_use): New. (fixup_debug_uses_trailing_add): New. (fixup_debug_uses): New. Use it ... (ldp_bb_info::fuse_pair): ... here. (try_promote_writeback): Call fixup_debug_uses_trailing_add to fix up debug uses of the base register that are affected by folding in the trailing add insn. gcc/testsuite/ChangeLog: PR target/113089 * gcc.c-torture/compile/pr113089.c: New test.	2024-01-23 16:49:13 +00:00
Alex Coplan	49bfda6017	aarch64: Re-parent trailing nondebug base reg uses [PR113089] While working on PR113089, I realised we were missing code to re-parent trailing nondebug uses of the base register in the case of cancelling writeback in the load/store pair pass. This patch fixes that. gcc/ChangeLog: PR target/113089 * config/aarch64/aarch64-ldp-fusion.cc (ldp_bb_info::fuse_pair): Update trailing nondebug uses of the base register in the case of cancelling writeback.	2024-01-23 16:49:13 +00:00
Alex Coplan	cef6031694	rtl-ssa: Provide easier access to debug uses [PR113089] This patch adds some accessors to set_info and use_info to make it easier to get at and iterate through uses in debug insns. It is used by the aarch64 load/store pair fusion pass in a subsequent patch to fix PR113089, i.e. to update debug uses in the pass. gcc/ChangeLog: PR target/113089 * rtl-ssa/accesses.h (use_info::next_debug_insn_use): New. (debug_insn_use_iterator): New. (set_info::first_debug_insn_use): New. (set_info::debug_insn_uses): New. * rtl-ssa/member-fns.inl (use_info::next_debug_insn_use): New. (set_info::first_debug_insn_use): New. (set_info::debug_insn_uses): New.	2024-01-23 16:49:13 +00:00
Alex Coplan	639ae54344	aarch64: Don't record hazards against paired insns [PR113356] For the testcase in the PR, we try to pair insns where the first has writeback and the second uses the updated base register. This causes us to record a hazard against the second insn, thus narrowing the move range away from the end of the BB. However, it isn't meaningful to record hazards against the other insn in the pair, as this doesn't change which pairs can be formed, and also doesn't change where the pair is formed (from the perspective of nondebug insns). To see why this is the case, consider the two cases: - Suppose we are finding hazards for insns[0]. If we record a hazard against insns[1], then range.last becomes insns[1]->prev_nondebug_insn (), but note that this is equivalent to inserting after insns[1] (since insns[1] is being changed). - Now consider finding hazards for insns[1]. Suppose we record insns[0] as a hazard. Then we set range.first = insns[0], which is a no-op. As such, it seems better to never record hazards against the other insn in the pair, as we check whether the insns themselves are suitable for combination separately (e.g. for ldp checking that they use distinct transfer registers). Avoiding unnecessarily narrowing the move range avoids unnecessarily re-ordering over debug insns. This should also mean that we can only narrow the move range away from the end of the BB in the case that we record a hazard for insns[0] against insns[1]->prev_nondebug_insn () or earlier. This means that for the non-call-exceptions case, either the move range includes insns[1], or we reject the pair (thus the assert tripped in the PR should always hold). gcc/ChangeLog: PR target/113356 * config/aarch64/aarch64-ldp-fusion.cc (ldp_bb_info::try_fuse_pair): Don't record hazards against the opposite insn in the pair. gcc/testsuite/ChangeLog: PR target/113356 * gcc.target/aarch64/pr113356.C: New test.	2024-01-23 16:49:13 +00:00
Ronan Desplanques	0ad6908f6a	Update year in Gnatvsn gcc/ada/ * gnatvsn.ads: Update year.	2024-01-23 17:22:48 +01:00
Xi Ruoyao	46f3ba56c4	LoongArch: testsuite: Disable stack protector for got-load.C When building GCC with --enable-default-ssp, the stack protector is enabled for got-load.C, causing additional GOT loads for __stack_chk_guard. So mem/u will be matched more than 2 times and the test will fail. Disable stack protector to fix this issue. gcc/testsuite: * g++.target/loongarch/got-load.C (dg-options): Add -fno-stack-protector.	2024-01-23 23:44:26 +08:00
Zac Walker	c608ada288	Ifdef `.hidden`, `.type`, and `.size` pseudo-ops for `aarch64-w64-mingw32` target A recent change (https://gcc.gnu.org/pipermail/gcc-cvs/2023-December/394915.html) added generic SME support using `.hidden`, `.type`, and `.size` pseudo-ops in the assembly sources, but `aarch64-w64-mingw32` does not support those pseudo-ops. This patch wraps usage of those pseudo-ops in macros and ifdefs them on the `__ELF__` define. libgcc/ * config/aarch64/aarch64-asm.h (HIDDEN, SYMBOL_SIZE, SYMBOL_TYPE) (ENTRY_ALIGN, GNU_PROPERTY): New macros. * config/aarch64/__arm_sme_state.S: Use them. * config/aarch64/__arm_tpidr2_save.S: Likewise. * config/aarch64/__arm_za_disable.S: Likewise. * config/aarch64/crti.S: Likewise. * config/aarch64/lse.S: Likewise.	2024-01-23 15:32:30 +00:00
H.J. Lu	3936c8709c	gcc.dg/torture/pr113255.c: Fix ia32 test failure Fix ia32 test failure: FAIL: gcc.dg/torture/pr113255.c -O1 (test for excess errors) Excess errors: cc1: error: '-mstringop-strategy=rep_8byte' not supported for 32-bit code PR rtl-optimization/113255 * gcc.dg/torture/pr113255.c (dg-additional-options): Add only if not ia32.	2024-01-23 06:34:43 -08:00
H.J. Lu	2bdf138a0d	m2: Use time_t in time and don't redefine alloca Fix the m2 build warning and error: [...] ../../src/gcc/m2/mc/mc.flex:32:9: warning: "alloca" redefined 32 | #define alloca __builtin_alloca | ^~~~~~ In file included from /usr/include/stdlib.h:587, from <stdout>:22: /usr/include/alloca.h:35:10: note: this is the location of the previous definition 35 | # define alloca(size) __builtin_alloca (size) | ^~~~~~ ../../src/gcc/m2/mc/mc.flex: In function 'handleDate': ../../src/gcc/m2/mc/mc.flex:333:25: error: passing argument 1 of 'time' from incompatible pointer type [-Wincompatible-pointer-types] 333 | time_t clock = time ((long *)0); | ^~~~~~~~~ | | | long int * In file included from ../../src/gcc/m2/mc/mc.flex:28: /usr/include/time.h:76:29: note: expected 'time_t *' {aka 'long long int *'} but argument is of type 'long int *' 76 | extern time_t time (time_t *__timer) __THROW; PR bootstrap/113554 * mc/mc.flex (alloca): Don't redefine. (handleDate): Replace (long *)0 with (time_t *)0 when calling time.	2024-01-23 06:09:45 -08:00
Alex Coplan	ef86659da9	aarch64: Fix up uses of mem following stp insert [PR113070] As the PR shows (specifically #c7) we are missing updating uses of mem when inserting an stp in the aarch64 load/store pair fusion pass. This patch fixes that. RTL-SSA has a simple view of memory and by default doesn't allow stores to be re-ordered w.r.t. other stores. In the ldp fusion pass, we do our own alias analysis and so can re-order stores over other accesses when we deem this is safe. If neither store can be re-purposed (moved into the required position to form the stp while respecting the RTL-SSA constraints), then we turn both the candidate stores into "tombstone" insns (logically delete them) and insert a new stp insn. As it stands, we implement the insert case separately (after dealing with the candidate stores) in fuse_pair by inserting into the middle of the vector of changes. This is OK when we only have to insert one change, but with this fix we would need to insert the change for the new stp plus multiple changes to fix up uses of mem (note the number of fix-ups is naturally bounded by the alias limit param to prevent quadratic behaviour). If we kept the code structured as is and inserted into the middle of the vector, that would lead to repeated moving of elements in the vector which seems inefficient. The structure of the code would also be a little unwieldy. To improve on that situation, this patch introduces a helper class, stp_change_builder, which implements a state machine that helps to build the required changes directly in program order. That state machine is responsible for deciding what changes need to be made in what order, and the code in fuse_pair then simply follows those steps. Together with the fix in the previous patch for installing new defs correctly in RTL-SSA, this fixes PR113070.
We take the opportunity to rename the function decide_stp_strategy to try_repurpose_store, as that seems more descriptive of what it actually does, since stp_change_builder is now responsible for the overall change strategy. gcc/ChangeLog: PR target/113070 * config/aarch64/aarch64-ldp-fusion.cc (struct stp_change_builder): New. (decide_stp_strategy): Rename to ... (try_repurpose_store): ... this. (ldp_bb_info::fuse_pair): Refactor to use stp_change_builder to construct stp changes. Fix up uses when inserting new stp insns.	2024-01-23 13:22:11 +00:00
Alex Coplan	6dd613df59	rtl-ssa: Ensure new defs get inserted [PR113070] In r14-5820-ga49befbd2c783e751dc2110b544fe540eb7e33eb I added support to RTL-SSA for inserting new insns, which included support for users creating new defs. However, I missed that apply_changes_to_insn needed updating to ensure that the new defs actually got inserted into the main def chain. This meant that when the aarch64 ldp/stp pass inserted a new stp insn, the stp would just get skipped over during subsequent alias analysis, as its def never got inserted into the memory def chain. This (unsurprisingly) led to wrong code. This patch fixes the issue by ensuring new user-created defs get inserted. I would have preferred to have used a flag internal to the defs instead of a separate data structure to keep track of them, but since machine_mode increased to 16 bits we're already at 64 bits in access_info, and we can't really reuse m_is_temp as the logic in finalize_new_accesses requires it to get cleared. gcc/ChangeLog: PR target/113070 * rtl-ssa.h: Include hash-set.h. * rtl-ssa/changes.cc (function_info::finalize_new_accesses): Add new_sets parameter and use it to keep track of new user-created sets. (function_info::apply_changes_to_insn): Also call add_def on new sets. (function_info::change_insns): Add hash_set to keep track of new user-created defs. Plumb it through. * rtl-ssa/functions.h: Add hash_set parameter to finalize_new_accesses and apply_changes_to_insn.	2024-01-23 13:22:11 +00:00
Alex Coplan	fce3994d04	rtl-ssa: Support for creating new uses [PR113070] This exposes an interface for users to create new uses in RTL-SSA. This is needed for updating uses after inserting a new store pair insn in the aarch64 load/store pair fusion pass. gcc/ChangeLog: PR target/113070 * rtl-ssa/accesses.cc (function_info::create_use): New. * rtl-ssa/changes.cc (function_info::finalize_new_accesses): Ensure new uses end up referring to permanent defs. * rtl-ssa/functions.h (function_info::create_use): Declare.	2024-01-23 13:22:11 +00:00
Alex Coplan	e0374b028a	rtl-ssa: Run finalize_new_accesses forwards [PR113070] The next patch in this series exposes an interface for creating new uses in RTL-SSA. The intent is that new user-created uses can consume new user-created defs in the same change group. This is so that we can correctly update uses of memory when inserting a new store pair insn in the aarch64 load/store pair fusion pass (the affected uses need to consume the new store pair insn). As it stands, finalize_new_accesses is called as part of the backwards insn placement loop within change_insns, but if we want new uses to be able to depend on new defs in the same change group, we need finalize_new_accesses to be called on earlier insns first. This is so that when we process temporary uses and turn them into permanent uses, we can follow the last_def link on the temporary def to ensure we end up with a permanent use consuming a permanent def. gcc/ChangeLog: PR target/113070 * rtl-ssa/changes.cc (function_info::change_insns): Split out the call to finalize_new_accesses from the backwards placement loop, run it forwards in a separate loop.	2024-01-23 13:22:11 +00:00
Richard Biener	d5d43dc399	tree-optimization/113552 - fix num_call accounting in simd clone vectorization The following avoids using exact_log2 on the number of SIMD clone calls to be emitted when vectorizing calls since that can easily be not a power of two in which case it will return -1. For different simd clones the number of calls will differ by a multiply with a power of two only so using floor_log2 is good enough here. PR tree-optimization/113552 * tree-vect-stmts.cc (vectorizable_simd_clone_call): Use floor_log2 instead of exact_log2 on the number of calls.	2024-01-23 14:09:30 +01:00
Jakub Jelinek	ac98aa7828	ia64: Fix up -Wunused-parameter warning Since r14-6945-gc659dd8bfb55e02a1b97407c1c28f7a0e8f7f09b there is a warning ../../gcc/config/ia64/ia64.cc: In function ‘void ia64_start_function(FILE*, const char*, tree)’: ../../gcc/config/ia64/ia64.cc:3889:59: warning: unused parameter ‘decl’ [-Wunused-parameter] 3889 | ia64_start_function (FILE *file, const char *fnname, tree decl) | ~~~~~^~~~ which presumably for bootstraps breaks the bootstrap. While the decl parameter is passed to the ASM_OUTPUT_FUNCTION_LABEL macro, that macro actually doesn't use that argument, so the removal of ATTRIBUTE_UNUSED was incorrect. This patch reverts the first ia64.cc hunk from r14-6945. 2024-01-23 Jeff Law <jlaw@ventanamicro.com> Jakub Jelinek <jakub@redhat.com> * config/ia64/ia64.cc (ia64_start_function): Add ATTRIBUTE_UNUSED to decl.	2024-01-23 13:57:41 +01:00
Richard Biener	02e6838949	Refactor exit PHI handling in vectorizer epilogue peeling This refactors the handling of PHIs in between the main and the epilogue loop. Instead of trying to handle the multiple exit and original single exit case together, the following separates these cases, resulting in code that is much easier to understand. * tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg): Separate single and multi-exit case when creating PHIs between the main and epilogue.	2024-01-23 13:08:12 +01:00
Richard Sandiford	659a5a908e	aarch64: Avoid registering duplicate C++ overloads [PR112989] In the original fix for this PR, I'd made sure that including <arm_sme.h> didn't reach the final return in simulate_builtin_function_decl (which would indicate duplicate function definitions). But it seems I forgot to do the same thing for C++, which defines all of its overloads directly. This patch fixes a case where we still recorded duplicate functions for C++. Thanks to Iain for reporting the resulting GC ICE and for help with reproducing it. gcc/ PR target/112989 * config/aarch64/aarch64-sve-builtins-shapes.cc (build_one): Skip MODE_single variants of functions that don't take tuple arguments.	2024-01-23 11:10:41 +00:00
Alex Coplan	20e18106fa	aarch64: Don't assert recog success in ldp/stp pass [PR113114] The PR shows two different cases where try_promote_writeback produces an RTL pattern which isn't recognized. Currently this leads to an ICE, as we assert recog success, but I think it's better just to back out of the changes gracefully if recog fails (as we do in the main fuse_pair case). In theory since we check the ranges here recog shouldn't fail (which is why I had the assert in the first place), but the PR shows an edge case in the patterns where if we form a pre-writeback pair where the writeback offset is exactly -S, where S is the size in bytes of one transfer register, we fail to match the expected pattern as the patterns look explicitly for plus operands in the mems. I think fixing this would require adding at least four new special-case patterns to aarch64.md for what doesn't seem to be a particularly useful variant of the insns. Even if we were to do that, I think it would be GCC 15 material, and it's better to just punt for GCC 14. The ILP32 case in the PR is a bit different, as that shows us trying to combine a pair with DImode base register operands in the mems together with an SImode trailing update of the base register. This leads to us forming an RTL pattern which references the base register in both SImode and DImode, which also fails to recog. Again, I think it's best just to take the missed optimization for now. If we really want to make this (try_promote_writeback) work for ILP32, we can try to do it for GCC 15. gcc/ChangeLog: PR target/113114 * config/aarch64/aarch64-ldp-fusion.cc (try_promote_writeback): Don't assert recog success, just punt if the writeback pair isn't recognized. gcc/testsuite/ChangeLog: PR target/113114 * gcc.c-torture/compile/pr113114.c: New test. * gcc.target/aarch64/pr113114.c: New test.	2024-01-23 10:57:33 +00:00
Jakub Jelinek	bd1703b79c	gcn: Fix a warning I see ../../gcc/config/gcn/gcn.cc: In function ‘void gcn_hsa_declare_function_name(FILE*, const char*, tree)’: ../../gcc/config/gcn/gcn.cc:6568:67: warning: unused parameter ‘decl’ [-Wunused-parameter] 6568 | gcn_hsa_declare_function_name (FILE *file, const char *name, tree decl) | ~~~~~^~~~ warning presumably since r14-6945-gc659dd8bfb55e02a1b97407c1c28f7a0e8f7f09b Previously, the argument was anonymous, but now it is passed to a macro which ignores it, so I think we should go with ATTRIBUTE_UNUSED. 2024-01-23 Jakub Jelinek <jakub@redhat.com> * config/gcn/gcn.cc (gcn_hsa_declare_function_name): Add ATTRIBUTE_UNUSED to decl.	2024-01-23 11:21:00 +01:00
Richard Biener	e2f3057fc9	debug/107058 - gracefully handle unexpected DIE contexts While the bug is persisting that LTO streaming picks up a CONST_DECL from an attribute argument on a VAR_DECL which with -fdebug-type-section refers to a DIE in a type unit we can handle this gracefully, at least with -fno-checking. Do so. The C++ frontend nevertheless should resolve the CONST_DECL attribute argument to a constant. PR debug/107058 * dwarf2out.cc (dwarf2out_die_ref_for_decl): Gracefully handle unexpected but bogus DIE contexts when checking is not enabled. * c-c++-common/pr107058.c: New testcase.	2024-01-23 11:15:29 +01:00
Nathaniel Shead	affef534b0	c++: Fix handling of extern templates in modules [PR112820] Currently, extern templates are detected by looking for the DECL_EXTERNAL flag on a TYPE_DECL. However, this is incorrect: TYPE_DECLs don't actually set this flag, and it happens to work by coincidence due to TYPE_DECL_SUPPRESS_DEBUG happening to use the same underlying bit. This however causes issues with other TYPE_DECLs that also happen to have suppressed debug information. Instead, this patch reworks the logic so CLASSTYPE_INTERFACE_ONLY is always emitted into the module BMI and can then be used to check for an extern template correctly. Otherwise, for other declarations we always want to redetermine this: even for declarations from the GMF, we may change our mind on whether to import or export depending on decisions made later in the TU after importing so we shouldn't decide this now, or necessarily reuse what the module we'd imported had decided. Some of this may need to change in the future to account for https://github.com/itanium-cxx-abi/cxx-abi/issues/170. PR c++/112820 PR c++/102607 gcc/cp/ChangeLog: * module.cc (trees_out::lang_type_bools): Write interface_only and interface_unknown. (trees_in::lang_type_bools): Read the above flags. (trees_in::decl_value): Reset CLASSTYPE_INTERFACE_* except for extern templates. (trees_in::read_class_def): Remove buggy extern template handling. gcc/testsuite/ChangeLog: * g++.dg/modules/debug-2_a.C: New test. * g++.dg/modules/debug-2_b.C: New test. * g++.dg/modules/debug-2_c.C: New test. * g++.dg/modules/debug-3_a.C: New test. * g++.dg/modules/debug-3_b.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>	2024-01-23 20:46:01 +11:00
Jakub Jelinek	5015015ae6	fold-const: Fold larger VIEW_CONVERT_EXPRs [PR113462] On Mon, Jan 22, 2024 at 11:27:52AM +0100, Richard Biener wrote: > We run into > > static tree > native_interpret_int (tree type, const unsigned char *ptr, int len) > { > ... > if (total_bytes > len > || total_bytes * BITS_PER_UNIT > HOST_BITS_PER_DOUBLE_INT) > return NULL_TREE; > > OTOH using a V_C_E to "truncate" a _BitInt looks wrong? OTOH the > check doesn't really handle native_encode_expr using the "proper" > wide_int encoding however that's exactly handled. So it might be > a pre-existing issue that's only uncovered by large _BitInts > (__int128 might show similar issues?) I guess the || total_bytes * BITS_PER_UNIT > HOST_BITS_PER_DOUBLE_INT conditions make no sense, all we care is whether it fits in the buffer or not. But then there is fold_view_convert_expr (and other spots) which use /* We support up to 1024-bit values (for GCN/RISC-V V128QImode). */ unsigned char buffer[128]; or something similar. This patch fixes even that by using a XALLOCAVEC allocated buffer if the type size is 129 .. 8192 bytes. 2024-01-22 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/113462 * fold-const.cc (native_interpret_int): Don't punt if total_bytes is larger than HOST_BITS_PER_DOUBLE_INT / BITS_PER_UNIT. (fold_view_convert_expr): Use XALLOCAVEC buffers for types with sizes between 129 and 8192 bytes.	2024-01-23 09:05:05 +01:00
Xi Ruoyao	f12317306e	LoongArch: Disable explicit reloc for TLS LD/GD with -mexplicit-relocs=auto Binutils 2.42 supports TLS LD/GD relaxation which requires the assembler macro. gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_explicit_relocs_p): If la_opt_explicit_relocs is EXPLICIT_RELOCS_AUTO, return false for SYMBOL_TLS_LDM and SYMBOL_TLS_GD. (loongarch_call_tls_get_addr): Do not split symbols of SYMBOL_TLS_LDM or SYMBOL_TLS_GD if la_opt_explicit_relocs is EXPLICIT_RELOCS_AUTO. gcc/testsuite/ChangeLog: * gcc.target/loongarch/explicit-relocs-auto-tls-ld-gd.c: Check for la.tls.ld and la.tls.gd.	2024-01-23 16:01:59 +08:00
Richard Biener	23ebb09ed2	find_base_value part The following adjusts find_base_value similarly to how find_base_term was adjusted for PR113255. * alias.cc (known_base_value_p): Remove. (find_base_value): Remove PLUS/MINUS handling when both operands are not CONST_INT_P.	2024-01-23 08:08:27 +01:00
Richard Biener	a98d5130a6	rtl-optimization/113255 - base_alias_check vs. pointer difference When the x86 backend generates code for cpymem with the rep_8byte strategy for the 8 byte aligned main rep movq it needs to compute an adjusted pointer to the source after doing a prologue aligning the destination. It computes that via src_ptr + (dest_ptr - orig_dest_ptr) which is perfectly fine. On RTL this is then 8: r134:DI=const(`g'+0x44) 9: {r133:DI=frame:DI-0x4c;clobber flags:CC;} REG_UNUSED flags:CC 56: r129:DI=const(`g'+0x4c) 57: {r129:DI=r129:DI&0xfffffffffffffff8;clobber flags:CC;} REG_UNUSED flags:CC REG_EQUAL const(`g'+0x4c)&0xfffffffffffffff8 58: {r118:DI=r134:DI-r129:DI;clobber flags:CC;} REG_DEAD r134:DI REG_UNUSED flags:CC REG_EQUAL const(`g'+0x44)-r129:DI 59: {r119:DI=r133:DI-r118:DI;clobber flags:CC;} REG_DEAD r133:DI REG_UNUSED flags:CC but as written find_base_term happily picks the first candidate it finds for the MINUS which means it picks const(`g') rather than the correct frame:DI. The way find_base_term (but also the unfixed find_base_value used by init_alias_analysis to initialize REG_BASE_VALUE) performs pointer analysis isn't sound. The following restricts the handling of multi-operand operations to the case we know only one can be a pointer. This for example causes gcc.dg/tree-ssa/pr94969.c to miss some RTL PRE (I've opened PR113395 for this). A more drastic patch, removing base_alias_check results in only gcc.dg/guality/pr41447-1.c regressing (so testsuite coverage is bad). I've looked at gcc.dg/tree-ssa tests and mostly scheduling changes are present, the cc1plus .text size is only 230 bytes worse. With this less drastic patch below most scheduling changes are gone. x86_64 might not be the very best target to test for impact, but test coverage on other targets is unlikely to be very much better. PR rtl-optimization/113255 * alias.cc (find_base_term): Remove PLUS/MINUS handling when both operands are not CONST_INT_P. * gcc.dg/torture/pr113255.c: New testcase.	2024-01-23 08:08:27 +01:00
Richard Biener	7218f5050c	debug/112718 - reset all type units with -ffat-lto-objects When mixing -flto, -ffat-lto-objects and -fdebug-type-section we fail to reset all type units after early output resulting in an ICE when attempting to add then duplicate sibling attributes. PR debug/112718 * dwarf2out.cc (dwarf2out_finish): Reset all type units for the fat part of an LTO compile. * gcc.dg/debug/pr112718.c: New testcase.	2024-01-23 08:05:08 +01:00
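
Commit d5d43dc399 above swaps exact_log2 for floor_log2 because the number of SIMD clone calls need not be a power of two. The following is a minimal sketch of that difference in behavior. The names sketch_exact_log2 and sketch_floor_log2 are hypothetical stand-ins written for illustration; they are not GCC's implementations and only mirror the behavior the commit message describes (an exact log2 that yields -1 for non-powers of two).

```c
/* Hypothetical illustration only: these helpers are NOT taken from GCC.
   They mimic the behavior described in the commit message.  */

/* Return log2 (x) when x is a power of two, otherwise -1.  */
int
sketch_exact_log2 (unsigned long x)
{
  /* Zero and values with more than one bit set are not powers of two.  */
  if (x == 0 || (x & (x - 1)) != 0)
    return -1;

  int n = 0;
  while (x >>= 1)
    n++;
  return n;
}

/* Return the index of the highest set bit of x, or -1 when x is zero.  */
int
sketch_floor_log2 (unsigned long x)
{
  int n = -1;
  while (x != 0)
    {
      x >>= 1;
      n++;
    }
  return n;
}
```

With these definitions, a call count of 6 gives -1 from sketch_exact_log2 but 2 from sketch_floor_log2, while a count of 8 gives 3 from both. Per the commit message, call counts for different clones differ only by a power-of-two factor, so the rounded-down value is good enough for that accounting.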

208323 commits