This patch is my proposed solution to PR rtl-optimization/91865.
Normally RTX simplification canonicalizes a ZERO_EXTEND of a ZERO_EXTEND
to a single ZERO_EXTEND, but as shown in this PR it is possible for
combine's make_compound_operation to unintentionally generate a
non-canonical ZERO_EXTEND of a ZERO_EXTEND, which is unlikely to be
matched by the backend.
For the new test case:
const int table[2] = {1, 2};
int foo (char i) { return table[i]; }
compiling with -O2 -mlarge on msp430 we currently see:
Trying 2 -> 7:
2: r25:HI=zero_extend(R12:QI)
REG_DEAD R12:QI
7: r28:PSI=sign_extend(r25:HI)#0
REG_DEAD r25:HI
Failed to match this instruction:
(set (reg:PSI 28 [ iD.1772 ])
(zero_extend:PSI (zero_extend:HI (reg:QI 12 R12 [ iD.1772 ]))))
which results in the following code:
foo: AND #0xff, R12
RLAM.A #4, R12 { RRAM.A #4, R12
RLAM.A #1, R12
MOVX.W table(R12), R12
RETA
With this patch, we now see:
Trying 2 -> 7:
2: r25:HI=zero_extend(R12:QI)
REG_DEAD R12:QI
7: r28:PSI=sign_extend(r25:HI)#0
REG_DEAD r25:HI
Successfully matched this instruction:
(set (reg:PSI 28 [ iD.1772 ])
(zero_extend:PSI (reg:QI 12 R12 [ iD.1772 ])))
allowing combination of insns 2 and 7
original costs 4 + 8 = 12
replacement cost 8
foo: MOV.B R12, R12
RLAM.A #1, R12
MOVX.W table(R12), R12
RETA
2023-10-26 Roger Sayle <roger@nextmovesoftware.com>
Richard Biener <rguenther@suse.de>
gcc/ChangeLog
PR rtl-optimization/91865
* combine.cc (make_compound_operation): Avoid creating a
ZERO_EXTEND of a ZERO_EXTEND.
gcc/testsuite/ChangeLog
PR rtl-optimization/91865
* gcc.target/msp430/pr91865.c: New test case.
gcc/c/ChangeLog:
* c-typeck.cc (build_vec_cmp): Pass type of arg0 to
truth_type_for.
gcc/cp/ChangeLog:
* typeck.cc (build_vec_cmp): Pass type of arg0 to
truth_type_for.
If the vcond_mask patterns don't support fp modes, the vector
FP comparison instructions will not be generated.
gcc/ChangeLog:
* config/loongarch/lasx.md (vcond_mask_<ILASX:mode><ILASX:mode>): Change to
(vcond_mask_<mode><mode256_i>): this.
* config/loongarch/lsx.md (vcond_mask_<ILSX:mode><ILSX:mode>): Change to
(vcond_mask_<mode><mode_i>): this.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/vector/lasx/lasx-vcond-1.c: New test.
* gcc.target/loongarch/vector/lasx/lasx-vcond-2.c: New test.
* gcc.target/loongarch/vector/lsx/lsx-vcond-1.c: New test.
* gcc.target/loongarch/vector/lsx/lsx-vcond-2.c: New test.
Currently _BitInt is only supported on x86_64 which means that for other
targets all tests fail with e.g.
gcc.misc-tests/godump-1.c:237:1: sorry, unimplemented: '_BitInt(32)' is not supported on this target
237 | _BitInt(32) b32_v;
| ^~~~~~~
Instead of requiring _BitInt support for godump-1.c, move _BitInt tests
into godump-2.c such that all other tests in godump-1.c are still
executed in case of missing _BitInt support.
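For illustration, the new test can then gate itself on _BitInt support
along these lines (a sketch, assuming the existing 'bitint'
effective-target keyword):
/* { dg-do compile { target bitint } } */
_BitInt(32) b32_v;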
gcc/testsuite/ChangeLog:
* gcc.misc-tests/godump-1.c: Move _BitInt tests into godump-2.c.
* gcc.misc-tests/godump-2.c: New test.
Per commit a8b522b483 (Subversion r251048)
"Introduce TARGET_SUPPORTS_ALIASES", there is the idea that a back end may or
may not provide symbol aliasing support ('TARGET_SUPPORTS_ALIASES') independent
of '#ifdef ASM_OUTPUT_DEF', and in particular, depending not just on static but
instead on dynamic (run-time) configuration. A few instances remained
where we still assumed that 'TARGET_SUPPORTS_ALIASES' follows from
'#ifdef ASM_OUTPUT_DEF'. Change these to 'if (TARGET_SUPPORTS_ALIASES)',
similarly, or 'gcc_checking_assert (TARGET_SUPPORTS_ALIASES);'.
gcc/
* ipa-icf.cc (sem_item::target_supports_symbol_aliases_p):
'gcc_checking_assert (TARGET_SUPPORTS_ALIASES);' before
'return true;'.
* ipa-visibility.cc (function_and_variable_visibility): Change
'#ifdef ASM_OUTPUT_DEF' to 'if (TARGET_SUPPORTS_ALIASES)'.
* varasm.cc (output_constant_pool_contents)
[#ifdef ASM_OUTPUT_DEF]:
'gcc_checking_assert (TARGET_SUPPORTS_ALIASES);'.
(do_assemble_alias) [#ifdef ASM_OUTPUT_DEF]:
'if (!TARGET_SUPPORTS_ALIASES)',
'gcc_checking_assert (seen_error ());'.
(assemble_alias): Change '#if !defined (ASM_OUTPUT_DEF)' to
'if (!TARGET_SUPPORTS_ALIASES)'.
(default_asm_output_anchor):
'gcc_checking_assert (TARGET_SUPPORTS_ALIASES);'.
Set the execution count of EH blocks and the probability of EH edges.
for gcc/ChangeLog
PR tree-optimization/111520
* gimple-harden-conditionals.cc
(pass_harden_compares::execute): Set EH edge probability and
EH block execution count.
for gcc/testsuite/ChangeLog
PR tree-optimization/111520
* g++.dg/torture/harden-comp-pr111520.cc: New.
For Darwin, PIE requires PIC codegen, but otherwise is only a link-time
change. For almost all Darwin, we do not report __PIE__; the exception is
32-bit X86, and only from Darwin12 to Darwin17 (32-bit is no longer
supported after Darwin17).
gcc/ChangeLog:
* config/darwin.cc (darwin_override_options): Handle -fPIE.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
Currently, the sed commands used to parse --with-{cpu,tune,arch} rely on
a GNU-specific extension (automatically recognising extended regexes).
This fails on Darwin, which defaults to Posix behaviour.
However, '-E' is accepted to indicate an extended RE. Strictly, even
this is not really sufficient, since we should only require a Posix
sed.
gcc/ChangeLog:
* config.gcc: Pass -E to sed to indicate that we are using
extended REs.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
Mention front-end uses of the address_space bit-field, and remove the
inaccurate "only".
gcc/ChangeLog:
* tree-core.h (struct tree_base): Update address_space comment.
Further improve immediate generation by adding support for 2-instruction
MOV/EOR bitmask immediates. This reduces the number of 3/4-instruction
immediates in SPECCPU2017 by ~2%.
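For example (an illustrative constant, not one taken from the patch),
0xaa55aa55aa55aa55 is not itself a valid bitmask immediate, but it is
the EOR of two constants that are:
mov x0, #0x5555555555555555
eor x0, x0, #0xff00ff00ff00ff00
which replaces the previous 4-instruction MOV/MOVK sequence.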
Reviewed-by: Richard Earnshaw <Richard.Earnshaw@arm.com>
gcc/ChangeLog:
* config/aarch64/aarch64.cc (aarch64_internal_mov_immediate):
Add support for immediates using MOV/EOR bitmask.
gcc/testsuite:
* gcc.target/aarch64/imm_choice_comparison.c: Change tests.
* gcc.target/aarch64/moveor_imm.c: Add new test.
* gcc.target/aarch64/pr106583.c: Change tests.
It's incorrect to say that the address of an OFFSET_REF is always a
pointer-to-member; if it represents an overload set with both static and
non-static member functions that ends up resolving to a static one, the
address is a normal pointer. And let's go ahead and mention explicit object
member functions even though the patch hasn't landed yet.
gcc/cp/ChangeLog:
* cp-tree.def: Improve OFFSET_REF comment.
* cp-gimplify.cc (cp_fold_immediate): Add to comment.
Narrow test instructions with immediate operand that test memory location
for zero. E.g. testl $0x00aa0000, mem can be converted to testb $0xaa, mem+2.
Reject targets where reading (possibly unaligned) part of memory location
after a large write to the same address causes store-to-load forwarding stall.
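For example, given a hypothetical global:
unsigned int flags;
int has_bits (void) { return (flags & 0x00aa0000) != 0; }
the wide testl $0x00aa0000, flags(%rip) can be narrowed to
testb $0xaa, flags+2(%rip), since only byte 2 of the little-endian
word participates in the test.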
PR target/111698
gcc/ChangeLog:
* config/i386/x86-tune.def (X86_TUNE_PARTIAL_MEMORY_READ_STALL):
New tune.
* config/i386/i386.h (TARGET_PARTIAL_MEMORY_READ_STALL): New macro.
* config/i386/i386.md: New peephole pattern to narrow test
instructions with immediate operands that test memory locations
for zero.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr111698.c: New test.
A common pattern is to append a range to an existing range via union.
This optimizes that process.
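A minimal sketch of the idea (invented names, not the actual GCC code):
when the incoming range lies entirely above the last sub-range and a
pair slot is free, it can be appended directly instead of merged
pair-by-pair.
/* Sub-ranges are stored as a sorted array of [lo, hi] pairs;
   *n is the current number of pairs.  Return 1 if [lo, hi] was
   appended, 0 if the caller must fall back to the full union.  */
static int
try_union_append (int *pairs, int *n, int max_pairs, int lo, int hi)
{
  /* Fall back if [lo, hi] overlaps or touches the last sub-range.  */
  if (*n > 0
      && (lo <= pairs[2 * *n - 1] || lo - 1 == pairs[2 * *n - 1]))
    return 0;
  /* Fall back if there is no free pair slot.  */
  if (*n >= max_pairs)
    return 0;
  pairs[2 * *n] = lo;
  pairs[2 * *n + 1] = hi;
  ++*n;
  return 1;
}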
* value-range.cc (irange::union_append): New.
(irange::union_): Call union_append when appropriate.
* value-range.h (irange::union_append): New prototype.
gcc/ChangeLog:
* config/loongarch/loongarch.md (get_thread_pointer<mode>): Add the
instruction template corresponding to the __builtin_thread_pointer
function.
* doc/extend.texi: Add a description of the __builtin_thread_pointer
function to the documentation.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/builtin_thread_pointer.c: New test.
We have accepted the non-dependent call f(e) here ever since the
NON_DEPENDENT_EXPR removal patch r14-4793-gdad311874ac3b3.
I haven't looked closely into why but I suspect wrapping 'e'
in a NON_DEPENDENT_EXPR was causing the argument conversion
to misbehave.
PR c++/99804
gcc/testsuite/ChangeLog:
* g++.dg/template/enum9.C: New test.
In order for std::stacktrace to be used in a shared library, the
libbacktrace symbols need to be built with -fPIC. Add the libtool
-prefer-pic flag to the commands in src/libbacktrace/Makefile so that
the archive contains PIC objects.
libstdc++-v3/ChangeLog:
PR libstdc++/111936
* src/libbacktrace/Makefile.am: Add -prefer-pic to libtool
compile commands.
* src/libbacktrace/Makefile.in: Regenerate.
This patch introduces isnan, isnanf and isnanl to Builtins.def.
It requires fallback functions isnan, isnanf, isnanl to be implemented in
libgm2/libm2pim/wrapc.cc and gm2-libs-ch/wrapc.c.
Access to the GCC builtin isnan tree is provided by adding
an isnan definition and support functions to gm2-gcc/m2builtins.cc.
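The fallback wrappers are thin shims over the C library's type-generic
isnan macro (a sketch of the shape, not the exact code):
#include <math.h>
/* Return 1 if the argument is a NaN, and 0 otherwise.  */
int wrapc_isnan (double x) { return isnan (x) ? 1 : 0; }
int wrapc_isnanf (float x) { return isnan (x) ? 1 : 0; }
int wrapc_isnanl (long double x) { return isnan (x) ? 1 : 0; }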
gcc/m2/ChangeLog:
PR modula2/111955
* gm2-gcc/m2builtins.cc (gm2_isnan_node): New tree.
(DoBuiltinIsnan): New function.
(m2builtins_BuiltInIsnan): New function.
(m2builtins_init): Initialize gm2_isnan_node.
(list_of_builtins): Add define for __builtin_isnan.
* gm2-libs-ch/wrapc.c (wrapc_isnan): New function.
(wrapc_isnanf): New function.
(wrapc_isnanl): New function.
* gm2-libs/Builtins.def (isnanf): New procedure function.
(isnan): New procedure function.
(isnanl): New procedure function.
* gm2-libs/Builtins.mod:
* gm2-libs/wrapc.def (isnan): New function.
(isnanf): New function.
(isnanl): New function.
libgm2/ChangeLog:
PR modula2/111955
* libm2pim/wrapc.cc (isnan): Export new function.
(isnanf): Export new function.
(isnanl): Export new function.
gcc/testsuite/ChangeLog:
PR modula2/111955
* gm2/pimlib/run/pass/testnan.mod: New test.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
This patch adds some RTL-SSA helper functions. They will be
used by the upcoming late-combine pass.
The patch contains the first non-template out-of-line function declared
in movement.h, so it adds a movement.cc. I realise it seems a bit
over-the-top to have a file with just one function, but it might grow
in future. :)
gcc/
* Makefile.in (OBJS): Add rtl-ssa/movement.o.
* rtl-ssa/access-utils.h (accesses_include_nonfixed_hard_registers)
(single_set_info): New functions.
(remove_uses_of_def, accesses_reference_same_resource): Declare.
(insn_clobbers_resources): Likewise.
* rtl-ssa/accesses.cc (rtl_ssa::remove_uses_of_def): New function.
(rtl_ssa::accesses_reference_same_resource): Likewise.
(rtl_ssa::insn_clobbers_resources): Likewise.
* rtl-ssa/movement.h (can_move_insn_p): Declare.
* rtl-ssa/movement.cc: New file.
The first in-tree use of RTL-SSA was fwprop, and one of the goals
was to make the fwprop rewrite preserve the old behaviour as far
as possible. The switch to RTL-SSA was supposed to be a pure
infrastructure change. So RTL-SSA has various FIXMEs for things
that were artificially limited to facilitate the old-fwprop vs.
new-fwprop comparison.
One of the things that fwprop wants to do is extend live ranges, and
function_info::make_use_available tried to keep within the cases that
old fwprop could handle.
Since the information is built in extended basic blocks, it's easy
to handle intra-EBB queries directly. This patch does that, and
removes the associated FIXME.
To get a flavour for how much difference this makes, I tried compiling
the testsuite at -Os for at least one target per supported CPU and OS.
For most targets, only a handful of tests changed, but the vast majority
of changes were positive. The only target that seemed to benefit
significantly was i686-apple-darwin.
The main point of the patch is to remove the FIXME and to enable
the upcoming post-RA late-combine pass to handle more cases.
gcc/
* rtl-ssa/functions.h (function_info::remains_available_at_insn):
New member function.
* rtl-ssa/accesses.cc (function_info::remains_available_at_insn):
Likewise.
(function_info::make_use_available): Avoid false negatives for
queries within an EBB.
rtl_ssa::changes_are_worthwhile used the standard approach
of summing up the individual costs of the old and new sequences
to see which one is better overall. But when optimising for
speed and changing instructions in multiple blocks, it seems
better to weight the cost of each instruction by its execution
frequency. (We already do something similar for SLP layouts.)
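For example, if a change replaces a cost-4 instruction in a block with
execution frequency 100 by a cost-6 instruction in a block with
frequency 1, the weighted comparison (400 versus 6) accepts the change
even though the raw sum (4 versus 6) would reject it.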
gcc/
* rtl-ssa/changes.cc: Include sreal.h.
(rtl_ssa::changes_are_worthwhile): When optimizing for speed,
scale the cost of each instruction by its execution frequency.
In order to save (a lot of) memory, RTL-SSA avoids creating
individual clobber records for every call-clobbered register.
It instead maintains a list & splay tree of calls in an EBB,
grouped by ABI.
This patch takes these call clobbers into account in a couple
more routines. I don't think this will have any effect on
existing users, since it's only necessary for hard registers.
gcc/
* rtl-ssa/access-utils.h (next_call_clobbers): New function.
(is_single_dominating_def, remains_available_on_exit): Replace with...
* rtl-ssa/functions.h (function_info::is_single_dominating_def)
(function_info::remains_available_on_exit): ...these new member
functions.
(function_info::m_clobbered_by_calls): New member variable.
* rtl-ssa/functions.cc (function_info::function_info): Explicitly
initialize m_clobbered_by_calls.
* rtl-ssa/insns.cc (function_info::record_call_clobbers): Update
m_clobbered_by_calls for each call-clobber note.
* rtl-ssa/member-fns.inl (function_info::is_single_dominating_def):
New function. Check for call clobbers.
* rtl-ssa/accesses.cc (function_info::remains_available_on_exit):
Likewise.
The exit block can have multiple predecessors, for example if the
function calls __builtin_eh_return. We might then need PHI nodes
for values that are live on exit.
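For example (a minimal sketch):
void f (int cond, long offset, void *handler)
{
  if (cond)
    __builtin_eh_return (offset, handler);
}
has one exit-block predecessor for the normal return and another for
the eh-return path.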
RTL-SSA uses the normal dominance frontiers approach for calculating
where PHI nodes are needed. However, dominance.cc only calculates
dominators for normal blocks, not the exit block.
calculate_dominance_frontiers likewise only calculates dominance
frontiers for normal blocks.
This patch fills in the “missing” frontiers manually.
gcc/
* rtl-ssa/internals.h (build_info::exit_block_dominator): New
member variable.
* rtl-ssa/blocks.cc (build_info::build_info): Initialize it.
(bb_walker::bb_walker): Use it, moving the computation of the
dominator to...
(function_info::process_all_blocks): ...here.
(function_info::place_phis): Add dominance frontiers for the
exit block.
If an optimisation removes the last real use of a definition,
there can still be artificial uses left. This patch removes
those uses too.
These artificial uses exist because RTL-SSA is only an SSA-like
view of the existing RTL IL, rather than a native SSA representation.
It effectively treats RTL registers like gimple vops, but with the
addition of an RPO view of the register's lifetime(s). Things are
structured to allow most operations to update this RPO view in
amortised sublinear time.
gcc/
* rtl-ssa/functions.h (function_info::process_uses_of_deleted_def):
New member function.
* rtl-ssa/changes.cc (function_info::process_uses_of_deleted_def):
Likewise.
(function_info::change_insns): Use it.
Sometimes an optimisation can remove a clobber of scratch registers
or scratch memory. We then need to update the DU chains to reflect
the removed clobber.
For registers this isn't a problem. Clobbers of registers are just
momentary blips in the register's lifetime. They act as a barrier for
moving uses later or defs earlier, but otherwise they have no effect on
the semantics of other instructions. Removing a clobber is therefore a
cheap, local operation.
In contrast, clobbers of memory are modelled as full sets.
This is because (a) a clobber of memory does not invalidate
*all* memory and (b) it's a common idiom to use (clobber (mem ...))
in stack barriers. But removing a set and redirecting all uses
to a different set is a linear operation. Doing it for potentially
every optimisation could lead to quadratic behaviour.
This patch therefore refrains from removing sets of memory that appear
to be redundant. There's an opportunity to clean this up in linear time
at the end of the pass, but as things stand, nothing would benefit from
that.
This is also a very rare event. Usually we should try to optimise the
insn before the scratch memory has been allocated.
gcc/
* rtl-ssa/changes.cc (function_info::finalize_new_accesses):
If a change describes a set of memory, ensure that that set
is kept, regardless of the insn pattern.
Unlike REG_DEAD notes, REG_UNUSED notes need to be kept free of
false positives by all passes. function_info::change_insns
does this by removing all REG_UNUSED notes, and then using
add_reg_unused_notes to add notes back (or create new ones)
where appropriate.
The problem was that it called add_reg_unused_notes on the fly
while updating each instruction, which meant that the information
for later instructions in the change set wasn't up to date.
This patch does it in a separate loop instead.
gcc/
* rtl-ssa/changes.cc (function_info::apply_changes_to_insn): Remove
call to add_reg_unused_notes and instead...
(function_info::change_insns): ...use a separate loop here.
RTL-SSA mostly relies on DF for block-level register liveness
information, including artificial uses and defs at the beginning
and end of blocks. But one case was missing. DF does not add
artificial uses of global registers to the beginning or end
of a block. Instead it marks them as used within every block
when computing LR and LIVE problems.
For RTL-SSA, global registers behave like memory, which in
turn behaves like gimple vops. We need to ensure that they
are live on exit so that final definitions do not appear
to be unused.
Also, the previous live-on-exit handling only considered the exit
block itself. It needs to consider non-local gotos as well, since
they jump directly to some code in a parent function and so do
not have a path to the exit block.
gcc/
* rtl-ssa/blocks.cc (function_info::add_artificial_accesses): Force
global registers to be live on exit. Handle any block with zero
successors like an exit block.
As noted in recent commit 3a3596389c
"OpenACC 2.7: Implement self clause for compute constructs", the OpenACC 'self'
clause very much relates to the 'if' clause, and therefore copies a lot of the
latter's handling. Therefore it makes sense to also place this handling in
proximity to that of the 'if' clause, which was done in a lot but not all
instances.
gcc/
* tree-core.h (omp_clause_code): Move 'OMP_CLAUSE_SELF' after
'OMP_CLAUSE_IF'.
* tree-pretty-print.cc (dump_omp_clause): Adjust.
* tree.cc (omp_clause_num_ops, omp_clause_code_name): Likewise.
* tree.h: Likewise.
'gcc/c-family/c-pragma.h:pragma_omp_clause' already defines
'PRAGMA_OACC_CLAUSE_SELF', but it has no longer been used for the 'update'
directive's 'self' clause as of 2018
commit 829c6349e9 (Subversion r261813)
"Update OpenACC data clause semantics to the 2.5 behavior". That one instead
mapped the 'self' pragma token to the 'host' one (same semantics). That means
that we're later not able to tell whether originally we had seen 'self' or
'host', which was OK as long as only the 'update' directive had a 'self'
clause. However, as of recent commit 3a3596389c
"OpenACC 2.7: Implement self clause for compute constructs", also OpenACC
compute constructs may have a 'self' clause -- with different semantics. That
means, we need to know which OpenACC directive we're parsing clauses for, which
can be done in a simpler way than in that commit, similar to how the OpenMP
'to' clause is handled.
While at that, clarify that (already in OpenACC 2.0a)
"The 'host' clause is a synonym for the 'self' clause." -- not the other way
round.
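For example, on the 'update' directive these two lines request the same
data movement:
#pragma acc update self(a)
#pragma acc update host(a)
whereas 'self' on a compute construct instead takes an optional
condition and means something different.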
gcc/c/
* c-parser.cc (c_parser_omp_clause_name): Return
'PRAGMA_OACC_CLAUSE_SELF' for "self".
(c_parser_oacc_data_clause, OACC_UPDATE_CLAUSE_MASK): Adjust.
(c_parser_oacc_all_clauses): Remove 'bool compute_p' formal
parameter, and instead locally determine whether we're called for
an OpenACC compute construct or OpenACC 'update' directive.
(c_parser_oacc_compute): Adjust.
gcc/cp/
* parser.cc (cp_parser_omp_clause_name): Return
'PRAGMA_OACC_CLAUSE_SELF' for "self".
(cp_parser_oacc_data_clause, OACC_UPDATE_CLAUSE_MASK): Adjust.
(cp_parser_oacc_all_clauses): Remove 'bool compute_p' formal
parameter, and instead locally determine whether we're called for
an OpenACC compute construct or OpenACC 'update' directive.
(cp_parser_oacc_compute): Adjust.
gcc/fortran/
* openmp.cc (omp_mask2): Split 'OMP_CLAUSE_HOST_SELF' into
'OMP_CLAUSE_SELF', 'OMP_CLAUSE_HOST'.
(gfc_match_omp_clauses, OACC_UPDATE_CLAUSES): Adjust.
As discovered via recent
commit 3a3596389c
"OpenACC 2.7: Implement self clause for compute constructs",
'c-c++-common/goacc/if-clause-1.c', which the new
'c-c++-common/goacc/self-clause-1.c' was copied from, was not enabled for C++.
gcc/testsuite/
* c-c++-common/goacc/if-clause-1.c: Enable for C++.
* c-c++-common/goacc/self-clause-1.c: Likewise.
This patch implements the 'self' clause for compute constructs: parallel,
kernels, and serial. This clause conditionally uses the local device
(the host multi-core CPU) as the executing device of the compute region.
The actual implementation of the "local device" device type inside libgomp
(presumably using pthreads) is still not yet completed, so the libgomp
side is still implemented exactly the same as host-fallback mode. (So,
as of now, it essentially behaves like the 'if' clause with the
condition inverted.)
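A minimal usage sketch (hypothetical variable names):
int v = 0;
#pragma acc parallel self(use_local_device) copy(v)
v = 42;
executes the region on the local device whenever 'use_local_device'
evaluates to nonzero.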
gcc/c/ChangeLog:
* c-parser.cc (c_parser_oacc_compute_clause_self): New function.
(c_parser_oacc_all_clauses): Add new 'bool compute_p = false'
parameter, add parsing of self clause when compute_p is true.
(OACC_KERNELS_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_SELF.
(OACC_PARALLEL_CLAUSE_MASK): Likewise.
(OACC_SERIAL_CLAUSE_MASK): Likewise.
(c_parser_oacc_compute): Adjust call to c_parser_oacc_all_clauses to
set compute_p argument to true.
* c-typeck.cc (c_finish_omp_clauses): Add OMP_CLAUSE_SELF case.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_oacc_compute_clause_self): New function.
(cp_parser_oacc_all_clauses): Add new 'bool compute_p = false'
parameter, add parsing of self clause when compute_p is true.
(OACC_KERNELS_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_SELF.
(OACC_PARALLEL_CLAUSE_MASK): Likewise.
(OACC_SERIAL_CLAUSE_MASK): Likewise.
(cp_parser_oacc_compute): Adjust call to c_parser_oacc_all_clauses to
set compute_p argument to true.
* pt.cc (tsubst_omp_clauses): Add OMP_CLAUSE_SELF case.
* semantics.cc (c_finish_omp_clauses): Add OMP_CLAUSE_SELF case, merged
with OMP_CLAUSE_IF case.
gcc/fortran/ChangeLog:
* gfortran.h (typedef struct gfc_omp_clauses): Add self_expr field.
* openmp.cc (enum omp_mask2): Add OMP_CLAUSE_SELF.
(gfc_match_omp_clauses): Add handling for OMP_CLAUSE_SELF.
(OACC_PARALLEL_CLAUSES): Add OMP_CLAUSE_SELF.
(OACC_KERNELS_CLAUSES): Likewise.
(OACC_SERIAL_CLAUSES): Likewise.
(resolve_omp_clauses): Add handling for omp_clauses->self_expr.
* trans-openmp.cc (gfc_trans_omp_clauses): Add handling of
clauses->self_expr and building of OMP_CLAUSE_SELF tree clause.
(gfc_split_omp_clauses): Add handling of self_expr field copy.
gcc/ChangeLog:
* gimplify.cc (gimplify_scan_omp_clauses): Add OMP_CLAUSE_SELF case.
(gimplify_adjust_omp_clauses): Likewise.
* omp-expand.cc (expand_omp_target): Add OMP_CLAUSE_SELF expansion code,
* omp-low.cc (scan_sharing_clauses): Add OMP_CLAUSE_SELF case.
* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_SELF enum.
* tree-nested.cc (convert_nonlocal_omp_clauses): Add OMP_CLAUSE_SELF
case.
(convert_local_omp_clauses): Likewise.
* tree-pretty-print.cc (dump_omp_clause): Add OMP_CLAUSE_SELF case.
* tree.cc (omp_clause_num_ops): Add OMP_CLAUSE_SELF entry.
(omp_clause_code_name): Likewise.
* tree.h (OMP_CLAUSE_SELF_EXPR): New macro.
gcc/testsuite/ChangeLog:
* c-c++-common/goacc/self-clause-1.c: New test.
* c-c++-common/goacc/self-clause-2.c: New test.
* gfortran.dg/goacc/self.f95: New test.
include/ChangeLog:
* gomp-constants.h (GOACC_FLAG_LOCAL_DEVICE): New flag bit value.
libgomp/ChangeLog:
* oacc-parallel.c (GOACC_parallel_keyed): Add code to handle
GOACC_FLAG_LOCAL_DEVICE case.
* testsuite/libgomp.oacc-c-c++-common/self-1.c: New test.
The 'if' clause with modifier was introduced in
commit b4c3a85be9 (Subversion r242037)
"Partial OpenMP 4.5 fortran support", but -- in some instances -- didn't place
it next to the existing handling of 'if' clause without modifier. Unify that;
no change in behavior.
gcc/fortran/
* dump-parse-tree.cc (show_omp_clauses): Group handling of 'if'
clause without and with modifier.
* frontend-passes.cc (gfc_code_walker): Likewise.
* gfortran.h (gfc_omp_clauses): Likewise.
* openmp.cc (gfc_free_omp_clauses): Likewise.
Here we issue a bogus error: invalid operands of types 'unsigned char:2'
and 'int' to binary 'operator!=' when casting a bit-field of scoped enum
type to bool.
In build_static_cast_1, perform_direct_initialization_if_possible returns
NULL_TREE, because the invented declaration T t(e) fails, which is
correct. So we go down to ocp_convert, which has code to deal with this
case:
/* We can't implicitly convert a scoped enum to bool, so convert
to the underlying type first. */
if (SCOPED_ENUM_P (intype) && (convtype & CONV_STATIC))
e = build_nop (ENUM_UNDERLYING_TYPE (intype), e);
but the SCOPED_ENUM_P is false since intype is <unnamed-unsigned:2>.
This could be fixed by using unlowered_expr_type. But then
c_common_truthvalue_conversion/CASE_CONVERT has a similar problem, and
unlowered_expr_type is a C++-only function.
Rather than adding a dummy unlowered_expr_type to C, I think we should
follow [expr.static.cast]p3: "the lvalue-to-rvalue conversion is applied
to the bit-field and the resulting prvalue is used as the operand of the
static_cast." There are no prvalue bit-fields, so the l-to-r conversion
performed in decay_conversion will get us an expression whose type is the
enum.
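A reduced example along the lines of the new test:
enum class E : unsigned { A, B };
struct S { E e : 2; };
bool f (S s) { return static_cast<bool> (s.e); }  // was wrongly rejected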
PR c++/111895
gcc/cp/ChangeLog:
* typeck.cc (build_static_cast_1): Call decay_conversion.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/scoped_enum12.C: New test.
This patch tidies up M2Dependent.mod by introducing a new procedure
to initialize all fields of DependencyList.
gcc/m2/ChangeLog:
* gm2-libs/M2Dependent.mod (InitDependencyList): New
procedure.
(CreateModule): Call InitDependencyList to initialize
all fields of DependencyList.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
This PR is another instance of NON_DEPENDENT_EXPR having acted as an
"analysis barrier" for middle-end routines, and now that it's gone we're
more prone to passing weird templated trees (that have a generic tree
code) to middle-end routines which end up ICEing on such trees.
In the testcase below the non-dependent array new-expr size 'x + 42' is
expressed as an ordinary PLUS_EXPR, but whose operands have different
types (since templated trees encode just the syntactic form of an
expression devoid of e.g. implicit conversions). This type incoherency
triggers an ICE from size_binop in build_new_1 due to a wide_int assert
that expects the operand types to have the same precision.
This patch fixes this by replacing our piecemeal folding of 'size' in
build_new_1 with a single call to cp_fully_fold (which is a no-op in a
template context) once 'size' is built up.
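A hypothetical reduction of the problem (the new test is the
authoritative one):
template<typename T>
void f (unsigned x)
{
  new int[x + 42];  // templated size 'x + 42': a PLUS_EXPR with an
                    // 'unsigned' and an 'int' operand, no conversions
}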
PR c++/111929
gcc/cp/ChangeLog:
* init.cc (build_new_1): Use convert, build2, build3 and
cp_fully_fold instead of fold_convert, size_binop and
fold_build3 when building up 'size'.
gcc/testsuite/ChangeLog:
* g++.dg/template/non-dependent28.C: New test.
After the removal of NON_DEPENDENT_EXPR, cp_stabilize_reference (which
used to just exit early for NON_DEPENDENT_EXPR) is now more prone to
passing a weird templated tree to middle-end routines, which for the
testcase below leads to a crash from contains_placeholder_p. It seems
the best fix is to just exit early when in a template context, like we
do in the closely related function cp_save_expr.
PR c++/111919
gcc/cp/ChangeLog:
* tree.cc (cp_stabilize_reference): Do nothing when
processing_template_decl.
gcc/testsuite/ChangeLog:
* g++.dg/template/non-dependent27.C: New test.
P1642 includes cstdarg among the headers to be provided in full for
freestanding. This commit adds it, along with cstdalign and cstdbool,
which were left out when updating in an earlier commit.
libstdc++-v3/ChangeLog:
* include/Makefile.am: Move cstdarg, cstdalign and cstdbool to
freestanding.
* include/Makefile.in: Regenerate.
Signed-off-by: Paul M. Bendixen <paulbendixen@gmail.com>
Initialize all subfields within mptr. Valgrind detected
uninitialized fields in M2Dependent.mod. CreateModule must ensure all
subfields are initialized.
gcc/m2/ChangeLog:
* gm2-libs/M2Dependent.mod (CreateModule): Initialize all
dependency fields for DependencyList.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
reload and constrain_operands had some old code to look through unary
operators. E.g. an operand could be (sign_extend (reg X)), and the
constraints would match the reg rather than the sign_extend.
This was previously used by the MIPS port. But relying on it was a
recurring source of problems, so Eric and I removed it in the MIPS
rewrite from ~20 years back. I don't know of any other port that used it.
Also, the constraints processing in LRA and IRA do not have direct
support for these embedded operators, so I think it was only ever a
reload-specific feature (and probably only a global/local+reload-specific
feature, rather than IRA+reload).
Keeping the checks caused problems for special memory constraints,
leading to:
/* A unary operator may be accepted by the predicate, but it
is irrelevant for matching constraints. */
/* For special_memory_operand, there could be a memory operand inside,
and it would cause a mismatch for constraint_satisfied_p. */
if (UNARY_P (op) && op == extract_mem_from_operand (op))
op = XEXP (op, 0);
But inline asms are another source of problems. Asms don't have
predicates, and so we can't use recog to decide whether a given change
to an asm gives a valid match. We instead rely on constrain_operands as
something of a recog stand-in. For an example like:
void
foo (int *ptr)
{
asm volatile ("%0" :: "r" (-*ptr));
}
any attempt to propagate the negation into the asm would be allowed,
because it's the negated register that would be checked against the
"r" constraint. This would later lead to:
error: invalid 'asm': invalid operand
The same thing happened in gcc.target/aarch64/vneg_s.c with the
upcoming late-combine pass.
Rather than add more workarounds, it seemed better just to delete
this code.
gcc/
* recog.cc (constrain_operands): Remove UNARY_P handling.
* reload.cc (find_reloads): Likewise.
The eagle-eyed may have spotted that my recent testcases for DImode shifts
on x86_64 included -mno-stv in the dg-options. This is because the
Scalar-To-Vector (STV) pass currently transforms these shifts to use
SSE vector operations, producing larger code even with -Os. The issue
is that the compute_convert_gain currently underestimates the size of
instructions required for interunit moves, which is corrected with the
patch below.
For the simple test case:
unsigned long long shl1(unsigned long long x) { return x << 1; }
without this patch, GCC -m32 -Os -mavx2 currently generates:
shl1: push %ebp // 1 byte
mov %esp,%ebp // 2 bytes
vmovq 0x8(%ebp),%xmm0 // 5 bytes
pop %ebp // 1 byte
vpaddq %xmm0,%xmm0,%xmm0 // 4 bytes
vmovd %xmm0,%eax // 4 bytes
vpextrd $0x1,%xmm0,%edx // 6 bytes
ret // 1 byte = 24 bytes total
with this patch, we now generate the shorter
shl1: push %ebp // 1 byte
mov %esp,%ebp // 2 bytes
mov 0x8(%ebp),%eax // 3 bytes
mov 0xc(%ebp),%edx // 3 bytes
pop %ebp // 1 byte
add %eax,%eax // 2 bytes
adc %edx,%edx // 2 bytes
ret // 1 byte = 15 bytes total
Benchmarking using CSiBE, shows that this patch saves 1361 bytes
when compiling with -m32 -Os, and saves 172 bytes when compiling
with -Os.
2023-10-24 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/i386-features.cc (compute_convert_gain): Provide
more accurate values (sizes) for inter-unit moves with -Os.
This patch completes the ARC back-end's transition to using pre-reload
splitters for SImode shifts and rotates on targets without a barrel
shifter. The core part is that the shift_si3 define_insn is no longer
needed, as shifts and rotates that don't require a loop are split
before reload, and then because shift_si3_loop is the only caller
of output_shift, both can be significantly cleaned up and simplified.
The output_shift function (Claudiu's "the elephant in the room") is
renamed output_shift_loop, which handles just the four instruction
zero-overhead loop implementations.
Aside from the clean-ups, the user visible changes are much improved
implementations of SImode shifts and rotates on affected targets.
For the function:
unsigned int rotr_1 (unsigned int x) { return (x >> 1) | (x << 31); }
GCC with -O2 -mcpu=em would previously generate:
rotr_1: lsr_s r2,r0
bmsk_s r0,r0,0
ror r0,r0
j_s.d [blink]
or_s r0,r0,r2
with this patch, we now generate:
j_s.d [blink]
ror r0,r0
For the function:
unsigned int rotr_31 (unsigned int x) { return (x >> 31) | (x << 1); }
GCC with -O2 -mcpu=em would previously generate:
rotr_31:
mov_s r2,r0 ;4
asl_s r0,r0
add.f 0,r2,r2
rlc r2,0
j_s.d [blink]
or_s r0,r0,r2
with this patch we now generate an add.f followed by an adc:
rotr_31:
add.f r0,r0,r0
j_s.d [blink]
add.cs r0,r0,1
Shifts by constants requiring a loop have been improved for even counts
by performing two operations in each iteration:
int shl10(int x) { return x >> 10; }
Previously looked like:
shl10: mov.f lp_count, 10
lpnz 2f
asr r0,r0
nop
2: # end single insn loop
j_s [blink]
And now becomes:
shl10:
mov lp_count,5
lp 2f
asr r0,r0
asr r0,r0
2: # end single insn loop
j_s [blink]
So emulating ARC's SWAP on architectures that don't have it:
unsigned int rotr_16 (unsigned int x) { return (x >> 16) | (x << 16); }
previously required 10 instructions and ~70 cycles:
rotr_16:
mov_s r2,r0 ;4
mov.f lp_count, 16
lpnz 2f
add r0,r0,r0
nop
2: # end single insn loop
mov.f lp_count, 16
lpnz 2f
lsr r2,r2
nop
2: # end single insn loop
j_s.d [blink]
or_s r0,r0,r2
now becomes just 4 instructions and ~18 cycles:
rotr_16:
mov lp_count,8
lp 2f
ror r0,r0
ror r0,r0
2: # end single insn loop
j_s [blink]
2023-10-24 Roger Sayle <roger@nextmovesoftware.com>
Claudiu Zissulescu <claziss@gmail.com>
gcc/ChangeLog
* config/arc/arc-protos.h (output_shift): Rename to...
(output_shift_loop): Tweak API to take an explicit rtx_code.
(arc_split_ashl): Prototype new function here.
(arc_split_ashr): Likewise.
(arc_split_lshr): Likewise.
(arc_split_rotl): Likewise.
(arc_split_rotr): Likewise.
* config/arc/arc.cc (output_shift): Delete local prototype. Rename.
(output_shift_loop): New function replacing output_shift to output
a zero-overhead loop for SImode shifts and rotates on ARC targets
without barrel shifter (i.e. no hardware support for these insns).
(arc_split_ashl): New helper function to split *ashlsi3_nobs.
(arc_split_ashr): New helper function to split *ashrsi3_nobs.
(arc_split_lshr): New helper function to split *lshrsi3_nobs.
(arc_split_rotl): New helper function to split *rotlsi3_nobs.
(arc_split_rotr): New helper function to split *rotrsi3_nobs.
(arc_print_operand): Correct whitespace.
(arc_rtx_costs): Likewise.
(hwloop_optimize): Likewise.
* config/arc/arc.md (ANY_SHIFT_ROTATE): New define_code_iterator.
(define_code_attr insn): New code attribute to map to pattern name.
(<ANY_SHIFT_ROTATE>si3): New expander unifying previous ashlsi3,
ashrsi3 and lshrsi3 define_expands. Adds rotlsi3 and rotrsi3.
(*<ANY_SHIFT_ROTATE>si3_nobs): New define_insn_and_split that
unifies the previous *ashlsi3_nobs, *ashrsi3_nobs and *lshrsi3_nobs.
We now call arc_split_<insn> in arc.cc to implement each split.
(shift_si3): Delete define_insn, all shifts/rotates are now split.
(shift_si3_loop): Rename to...
(<insn>si3_loop): define_insn to handle loop implementations of
SImode shifts and rotates, calling output_shift_loop for the template.
(rotrsi3): Rename to...
(*rotrsi3_insn): define_insn for TARGET_BARREL_SHIFTER's ror.
(*rotlsi3): New define_insn_and_split to transform left rotates
into right rotates before reload.
(rotlsi3_cnt1): New define_insn_and_split to implement a left
rotate by one bit using an add.f followed by an adc.
* config/arc/predicates.md (shiftr4_operator): Delete.