Update STORE_MAX_PIECES to allow 16/32/64 bytes only if inter-unit moves
to vector registers are enabled, since the vec_duplicate broadcast used to
implement store_by_pieces of 16/32/64 bytes requires such a move.
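For illustration, a hypothetical example (not the new tests): a 32-byte
constant fill like the one below may be expanded by store_by_pieces using a
register broadcast (vec_duplicate), which needs a GPR-to-vector move.

  /* Hypothetical example, not one of the committed testcases.  */
  void
  fill32 (char *p)
  {
    __builtin_memset (p, 0x41, 32);
  }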
gcc/
PR target/101742
* config/i386/i386.h (STORE_MAX_PIECES): Allow 16/32/64 bytes
only if TARGET_INTER_UNIT_MOVES_TO_VEC is true.
gcc/testsuite/
PR target/101742
* gcc.target/i386/pr101742a.c: New test.
* gcc.target/i386/pr101742b.c: Likewise.
To avoid stack realignment, call ix86_gen_scratch_sse_rtx to get a
scratch SSE register to copy data with an SSE register from one
memory location to another.
gcc/
PR target/101772
* config/i386/i386-expand.c (ix86_expand_vector_move): Call
ix86_gen_scratch_sse_rtx to get a scratch SSE register to copy
data with an SSE register from one memory location to another.
gcc/testsuite/
PR target/101772
* gcc.target/i386/eh_return-2.c: New test.
This patch makes use of the vector permute double immediate
instruction for constant permute vectors.
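For illustration, a hypothetical example (not the new test): a constant
permutation selecting one doubleword from each operand, as below, should now
be expandable with a single vpdi.

  /* Hypothetical example, not the committed testcase.  */
  typedef unsigned long long v2di __attribute__ ((vector_size (16)));

  v2di
  permute (v2di a, v2di b)
  {
    return __builtin_shuffle (a, b, (v2di) { 1, 2 });
  }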
gcc/ChangeLog:
* config/s390/s390.c (expand_perm_with_vpdi): New function.
(vectorize_vec_perm_const_1): Call expand_perm_with_vpdi.
* config/s390/vector.md (*vpdi1<mode>, @vpdi1<mode>): Enable a
parameterized expander.
(*vpdi4<mode>, @vpdi4<mode>): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/s390/vector/perm-vpdi.c: New test.
This patch implements the TARGET_VECTORIZE_VEC_PERM_CONST in the IBM Z
backend. The initial implementation only exploits the vector merge
instruction but there is more to come.
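For illustration, a hypothetical example (not the new tests): an interleaving
permutation like the one below corresponds to a vector merge (high) operation.

  /* Hypothetical example, not one of the committed testcases.  */
  typedef unsigned int v4si __attribute__ ((vector_size (16)));

  v4si
  merge_hi (v4si a, v4si b)
  {
    return __builtin_shuffle (a, b, (v4si) { 0, 4, 1, 5 });
  }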
gcc/ChangeLog:
* config/s390/s390.c (MAX_VECT_LEN): Define macro.
(struct expand_vec_perm_d): Define struct.
(expand_perm_with_merge): New function.
(vectorize_vec_perm_const_1): New function.
(s390_vectorize_vec_perm_const): New function.
(TARGET_VECTORIZE_VEC_PERM_CONST): Define target macro.
gcc/testsuite/ChangeLog:
* gcc.target/s390/vector/perm-merge.c: New test.
* gcc.target/s390/vector/vec-types.h: New test.
The patch gets rid of the unspec used for the vector permute double
immediate instruction and replaces it with generic rtx.
gcc/ChangeLog:
* config/s390/s390.md (UNSPEC_VEC_PERMI): Remove constant
definition.
* config/s390/vector.md (*vpdi1<mode>, *vpdi4<mode>): New pattern
definitions.
* config/s390/vx-builtins.md (*vec_permi<mode>): Emit generic rtx
instead of an unspec.
gcc/testsuite/ChangeLog:
* gcc.target/s390/zvector/vec-permi.c: Removed.
* gcc.target/s390/zvector/vec_permi.c: New test.
This patch gets rid of the unspecs we were using for the vector merge
instruction and replaces them with generic rtx.
gcc/ChangeLog:
* config/s390/s390-modes.def: Add more vector modes to support
concatenation of two vectors.
* config/s390/s390-protos.h (s390_expand_merge_perm_const): Add
prototype.
(s390_expand_merge): Likewise.
* config/s390/s390.c (s390_expand_merge_perm_const): New function.
(s390_expand_merge): New function.
* config/s390/s390.md (UNSPEC_VEC_MERGEH, UNSPEC_VEC_MERGEL):
Remove constant definitions.
* config/s390/vector.md (V_HW_2): Add mode iterators.
(VI_HW_4, V_HW_4): Rename VI_HW_4 to V_HW_4.
(vec_2x_nelts, vec_2x_wide): New mode attributes.
(*vmrhb, *vmrlb, *vmrhh, *vmrlh, *vmrhf, *vmrlf, *vmrhg, *vmrlg):
New pattern definitions.
(vec_widen_umult_lo_<mode>, vec_widen_umult_hi_<mode>)
(vec_widen_smult_lo_<mode>, vec_widen_smult_hi_<mode>)
(vec_unpacks_lo_v4sf, vec_unpacks_hi_v4sf, vec_unpacks_lo_v2df)
(vec_unpacks_hi_v2df): Adjust expanders to emit non-unspec RTX for
vec merge.
* config/s390/vx-builtins.md (V_HW_4): Remove mode iterator. Now
in vector.md.
(vec_mergeh<mode>, vec_mergel<mode>): Use s390_expand_merge to
emit vec merge pattern.
gcc/testsuite/ChangeLog:
* gcc.target/s390/vector/long-double-asm-in-out-hard-fp-reg.c:
vmrlg and vmrhg are now used instead of vpdi with immediates 0 and 5.
* gcc.target/s390/vector/long-double-asm-inout-hard-fp-reg.c: Likewise.
* gcc.target/s390/zvector/vec-types.h: New test.
* gcc.target/s390/zvector/vec_merge.c: New test.
The Neon multiply/multiply-accumulate/multiply-subtract instructions
can select the top or bottom half of the operand registers. This
selection does not change the cost of the underlying instruction and
this should be reflected by the RTL cost function.
This patch adds RTL tree traversal in the Neon multiply cost function
to match vec_select high-half of its operands. This traversal
prevents the cost of the vec_select from being added into the cost of
the multiply - meaning that these instructions can now be emitted in
the combine pass as they are no longer deemed prohibitively
expensive.
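As a hypothetical example (not the new test), a widening multiply of the high
halves written with explicit high-half extraction, as below, should now be
cheap enough for combine to fuse into a single multiply-high instruction.

  /* Hypothetical example, not the committed testcase.  */
  #include <arm_neon.h>

  int32x4_t
  mull_high (int16x8_t a, int16x8_t b)
  {
    return vmull_s16 (vget_high_s16 (a), vget_high_s16 (b));
  }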
gcc/ChangeLog:
2021-07-19 Jonathan Wright <jonathan.wright@arm.com>
* config/aarch64/aarch64.c (aarch64_strip_extend_vec_half):
Define.
(aarch64_rtx_mult_cost): Traverse RTL tree to prevent cost of
vec_select high-half from being added into Neon multiply
cost.
* rtlanal.c (vec_series_highpart_p): Define.
* rtlanal.h (vec_series_highpart_p): Declare.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/vmul_high_cost.c: New test.
The Neon multiply/multiply-accumulate/multiply-subtract instructions
can take various forms - multiplying full vector registers of values
or multiplying one vector by a single element of another. Regardless
of the form used, these instructions have the same cost, and this
should be reflected by the RTL cost function.
This patch adds RTL tree traversal in the Neon multiply cost function
to match the vec_select used by the lane-referencing forms of the
instructions already mentioned. This traversal prevents the cost of
the vec_select from being added into the cost of the multiply -
meaning that these instructions can now be emitted in the combine
pass as they are no longer deemed prohibitively expensive.
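As a hypothetical example (not the new test), a multiply by a duplicated lane
of another vector, as below, should now be cheap enough for combine to fuse
into the lane-referencing form of the instruction.

  /* Hypothetical example, not the committed testcase.  */
  #include <arm_neon.h>

  int32x4_t
  mull_lane (int16x4_t a, int16x8_t b)
  {
    return vmull_s16 (a, vdup_laneq_s16 (b, 3));
  }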
gcc/ChangeLog:
2021-07-19 Jonathan Wright <jonathan.wright@arm.com>
* config/aarch64/aarch64.c (aarch64_strip_duplicate_vec_elt):
Define.
(aarch64_rtx_mult_cost): Traverse RTL tree to prevent
vec_select cost from being added into Neon multiply cost.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/vmul_element_cost.c: New test.
This patch uses a more accurate scalar iteration estimate when
comparing the epilogue of a constant-iteration loop with a candidate
replacement epilogue.
In the testcase, the patch prevents a 1-to-3-element SVE epilogue
from seeming better than a 64-bit Advanced SIMD epilogue.
gcc/
* tree-vect-loop.c (vect_better_loop_vinfo_p): Detect cases in
which old_loop_vinfo is an epilogue loop that handles a constant
number of iterations.
gcc/testsuite/
* gcc.target/aarch64/sve/cost_model_12.c: New test.
After vect_analyze_loop has successfully analysed a loop for
one base vector mode B1, it considers using following base vector
modes to vectorise an epilogue. However, for VECT_COMPARE_COSTS,
a later mode B2 might turn out to be better than B1 was. Initially
this comparison will be between an epilogue loop (for B2) and a main
loop (for B1). However, in r11-6458 I'd added code to reanalyse the
B2 epilogue loop as a main loop, partly for correctness and partly
for better costing.
This can lead to a situation in which we think that the B2 epilogue
loop was better than the B1 main loop, but that the B2 main loop is
not better than the B1 main loop. There was no dump message to say
that this had happened, which made it look like B2 had still won.
gcc/
* tree-vect-loop.c (vect_analyze_loop): Print a dump message
when a reanalyzed loop fails to be cheaper than the current
main loop.
Ignored function decls that are compiled at the start of
the assembly have bogus line numbers until the first .file
directive, as reported in PR101575.
The corresponding binutils bug report is
https://sourceware.org/bugzilla/show_bug.cgi?id=28149
The workaround for this issue is to emit a dummy .file
directive before the first function is compiled, unless
another .file directive was already emitted previously.
2021-08-04 Bernd Edlinger <bernd.edlinger@hotmail.de>
PR ada/101575
* dwarf2out.c (dwarf2out_assembly_start): Emit a dummy
.file statement when needed.
I believe PR101750 to be a testism. Fix it by giving the class a name.
gcc/testsuite/ChangeLog:
PR tree-optimization/101750
* g++.dg/vect/pr99149.cc: Name class.
This adds a gather vectorization capability to the vectorizer
without target support by decomposing the offset vector, doing
scalar loads and then building a vector from the result. This
is aimed mainly at cases where vectorizing the rest of the loop
offsets the cost of vectorizing the gather.
Note it's difficult to avoid vectorizing the offset load, but in
some cases later passes can turn the vector load + extract into
scalar loads, see the followup patch.
On SPEC CPU 2017 510.parest_r this improves runtime from 250s
to 219s on a Zen2 CPU which has its native gather instructions
disabled (using those the runtime instead increases to 254s)
using -Ofast -march=znver2 [-flto]. It turns out the critical
loops in this benchmark all perform gather operations.
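A hypothetical kernel of the kind this targets (not the new testcase):

  /* Hypothetical example, not the committed testcase.  */
  double
  gather_sum (const double *data, const int *idx, int n)
  {
    double sum = 0.;
    for (int i = 0; i < n; ++i)
      sum += data[idx[i]];
    return sum;
  }

Without native gather support the indexed load can now still be vectorized by
extracting the offsets, doing scalar loads and building the result vector.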
2021-07-30 Richard Biener <rguenther@suse.de>
* tree-vect-data-refs.c (vect_check_gather_scatter):
Include widening conversions only when the result is
still handled by a native gather or the current offset
size does not already match the data size.
Also succeed analysis in case there's no native support,
noted by an IFN_LAST ifn and a NULL decl.
(vect_analyze_data_refs): Always consider gathers.
* tree-vect-patterns.c (vect_recog_gather_scatter_pattern):
Test for no IFN gather rather than decl gather.
* tree-vect-stmts.c (vect_model_load_cost): Pass in the
gather-scatter info and cost emulated gathers accordingly.
(vect_truncate_gather_scatter_offset): Properly test for
no IFN gather.
(vect_use_strided_gather_scatters_p): Likewise.
(get_load_store_type): Handle emulated gathers and their
restrictions.
(vectorizable_load): Likewise. Emulate them by extracting
scalar offsets, doing scalar loads and a vector construct.
* gcc.target/i386/vect-gather-1.c: New testcase.
* gfortran.dg/vect/vect-8.f90: Adjust.
Pass MAX_PIECES to op_by_pieces_d::op_by_pieces_d for move, store and
compare.
PR target/101742
* expr.c (op_by_pieces_d::op_by_pieces_d): Add a max_pieces
argument to set m_max_size.
(move_by_pieces_d): Pass MOVE_MAX_PIECES to op_by_pieces_d.
(store_by_pieces_d): Pass STORE_MAX_PIECES to op_by_pieces_d.
(compare_by_pieces_d): Pass COMPARE_MAX_PIECES to op_by_pieces_d.
The easiest way to motivate these additions to match.pd is with the
following example:
unsigned int foo(unsigned char i) {
return i | (i<<8) | (i<<16) | (i<<24);
}
which mainline with -O2 on x86_64 currently generates:
foo: movzbl %dil, %edi
movl %edi, %eax
movl %edi, %edx
sall $8, %eax
sall $16, %edx
orl %edx, %eax
orl %edi, %eax
sall $24, %edi
orl %edi, %eax
ret
but with this patch now becomes:
foo: movzbl %dil, %eax
imull $16843009, %eax, %eax
ret
Interestingly, this transformation is already applied when using
addition, allowing synth_mult to select an optimal sequence, but
not when using the equivalent bit-wise ior or xor operators.
The solution is to use tree_nonzero_bits to check that the
potentially non-zero bits of each operand don't overlap, which
ensures that BIT_IOR_EXPR and BIT_XOR_EXPR produce the same
results as PLUS_EXPR, which effectively generalizes the old
fold_plusminus_mult_expr. Technically, the transformation
is to canonicalize (X*C1)|(X*C2) and (X*C1)^(X*C2) to
X*(C1+C2) where X and X<<C are considered special cases.
2021-08-04 Roger Sayle <roger@nextmovesoftware.com>
Marc Glisse <marc.glisse@inria.fr>
gcc/ChangeLog
* match.pd (bit_ior, bit_xor): Canonicalize (X*C1)|(X*C2) and
(X*C1)^(X*C2) as X*(C1+C2), and related variants, using
tree_nonzero_bits to ensure that operands are bit-wise disjoint.
gcc/testsuite/ChangeLog
* gcc.dg/fold-ior-4.c: New test.
This teaches forwprop to rewrite more vector loads that are only
used in BIT_FIELD_REFs as scalar loads. This provides the
remaining uplift to SPEC CPU 2017 510.parest_r on Zen 2 which
has CPU gathers disabled.
In particular vector load + vec_unpack + bit-field-ref is turned
into (extending) scalar loads which avoids costly XMM/GPR
transitions. To avoid conflicting with the matching of vector load
+ bit-field-ref + vector constructor to vector load + shuffle, the
extended transform is only done after vector lowering.
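A hypothetical illustration of the simplest newly handled case (a load with a
supported vector mode used only in BIT_FIELD_REFs):

  /* Hypothetical example, not taken from the benchmark.  */
  typedef int v4si __attribute__ ((vector_size (16)));

  long
  sum_two (const v4si *p)
  {
    v4si v = *p;
    return (long) v[0] + (long) v[1];
  }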
2021-07-30 Richard Biener <rguenther@suse.de>
* tree-ssa-forwprop.c (pass_forwprop::execute): Split
out code to decompose vector loads ...
(optimize_vector_load): ... here. Generalize it to
handle intermediate widening and TARGET_MEM_REF loads
and apply it to loads with a supported vector mode as well.
The following avoids vectorizing MIN/MAX reductions on bools which,
when ending up as vector(2) <signed-boolean:64>, would need to be
adjusted because of the sign change. The fix instead avoids any
reduction vectorization where the result isn't compatible with
the original scalar type, since we don't compensate for that
either.
2021-08-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/101756
* tree-vect-slp.c (vectorizable_bb_reduc_epilogue): Make sure
the result of the reduction epilogue is compatible to the original
scalar result.
* gcc.dg/vect/bb-slp-pr101756.c: New testcase.
When parsing default arguments, we need to temporarily clear parser->omp_declare_simd
and parser->oacc_routine, otherwise they can clash with further declarations
inside of e.g. lambdas inside of those default arguments.
2021-08-04 Jakub Jelinek <jakub@redhat.com>
PR c++/101759
* parser.c (cp_parser_default_argument): Temporarily override
parser->omp_declare_simd and parser->oacc_routine to NULL.
* g++.dg/gomp/pr101759.C: New test.
* g++.dg/goacc/pr101759.C: New test.
The file has two identical halves; it looks like a patch was applied twice.
2021-08-04 Jakub Jelinek <jakub@redhat.com>
* gcc.c-torture/execute/ieee/pr29302-1.x: Undo doubly applied patch.
The define_peephole2 added by r12-2640-gf7bf03cf69ccb7dc should only
operate on general registers. Since x86 also supports mov instructions
between GPRs, SSE registers and mask registers, limit the peephole2
predicate to general_reg_operand.
gcc/ChangeLog:
PR target/101743
* config/i386/i386.md (peephole2): Refine predicate from
register_operand to general_reg_operand.
The file has two identical halves; it looks like a patch was applied twice.
2021-08-04 Jakub Jelinek <jakub@redhat.com>
* config/t-slibgcc-fuchsia: Undo doubly applied patch.
This makes tail recursion optimization produce a loop structure
manually rather than relying on loop fixup. That also allows the
loop to be marked as finite (it would eventually blow the stack
if it were not).
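A hypothetical example of the kind of function affected (not the new testcase):

  /* Hypothetical example, not the committed testcase.  */
  int
  sum_to (int n, int acc)
  {
    if (n == 0)
      return acc;
    return sum_to (n - 1, acc + n);
  }

The self tail call becomes a loop; that loop is now created and registered in
the loop tree explicitly and marked finite instead of relying on loop fixup.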
2021-08-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/101769
* tree-tailcall.c (eliminate_tail_call): Add the created loop
for the first recursion and return it via the new output parameter.
(optimize_tail_call): Pass through new output param.
(tree_optimize_tail_calls_1): After creating all latches,
add the created loop to the loop tree. Do not mark loops for fixup.
* g++.dg/tree-ssa/pr101769.C: New testcase.
Previous CLs add new language constructs in Go 1.17, specifically,
unsafe.Add, unsafe.Slice, and conversion from a slice to a pointer
to an array. This CL handles them in the escape analysis.
At the point of the escape analysis, unsafe.Add and unsafe.Slice
are still builtin calls, so just handle them in data flow.
Conversion from a slice to a pointer to an array has already been
lowered to a combination of compound expression, conditional
expression and slice info expressions, so handle them in the
escape analysis.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/339671
The only difference between selectnbrecv and selectnbrecv2 is that the latter
sets the input pointer value from the second return value of chanrecv.
So by making selectnbrecv return two values from chanrecv, we can get
rid of selectnbrecv2; the compiler can now call only selectnbrecv and
generate simpler code.
This is the gofrontend version of https://golang.org/cl/292890.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/339529
The "e" constraint character is the prefix of the "es" and "eI" constraints.
2021-08-03 Segher Boessenkool <segher@kernel.crashing.org>
* config/rs6000/constraints.md: Remove "e" from the list of available
constraint characters.
The histogram value for indirect calls was incorrectly set up.
That is fixed now.
With this change the tree-prof tests checking indirect call inlining with AutoFDO
in gcc.dg and g++.dg are passing.
Resolves:
PR gcov-profile/71672 - inlining indirect calls does not work with autofdo
gcc/ChangeLog:
PR gcov-profile/71672
* auto-profile.c (afdo_indirect_call): Fix setup of the histogram value for indirect calls.
* The create_gcov tool doesn't currently support DWARF 5, so I made a change in profopt.exp
to pass -gdwarf-4 when compiling the binary to profile.
* I updated the invocation of create_gcov in profopt.exp to pass -gcov_version=2.
I recently made a change to create_gcov to support version 2:
https://github.com/google/autofdo/pull/117 .
* I removed useless -o perf.data from the invocation of gcc-auto-profile in
target-supports.exp.
These changes contribute to fixing PR gcov-profile/71672.
gcc/testsuite/ChangeLog:
* lib/profopt.exp: Pass -gdwarf-4 when compiling the test to profile; pass -gcov_version=2.
* lib/target-supports.exp: Remove unnecessary -o perf.data passed to gcc-auto-profile.
indir-call-prof-2.c has -fno-early-inlining but AutoFDO can't work without
early inlining (it needs to match the inlining of the profiled binary).
I changed profopt.exp to always pass -fearly-inlining for AutoFDO.
With that change the indirect call inlining in indir-call-prof-2.c happens in the early inliner
so I changed the dg-final-use-autofdo.
Contributes to fixing PR gcov-profile/71672
gcc/testsuite/ChangeLog:
* gcc.dg/tree-prof/indir-call-prof-2.c: Fix dg-final-use-autofdo.
* lib/profopt.exp: Pass -fearly-inlining when compiling with AutoFDO.
* Changed several tests to use -fdump-ipa-afdo-optimized instead of -fdump-ipa-afdo
in dg-options so that the expected output can be found.
* Increased the number of iterations in several tests so that perf can have
enough sampling events.
Contributes to fixing PR gcov-profile/71672.
gcc/testsuite/ChangeLog:
* g++.dg/tree-prof/indir-call-prof.C: Fix options, increase the number of iterations.
* g++.dg/tree-prof/morefunc.C: Fix options, increase the number of iterations.
* g++.dg/tree-prof/reorder.C: Fix options, increase the number of iterations.
* gcc.dg/tree-prof/indir-call-prof-2.c: Fix options, increase the number of iterations.
* gcc.dg/tree-prof/indir-call-prof.c: Fix options.
Resolves:
PR testsuite/101688 - g++.dg/warn/Wstringop-overflow-4.C fails on 32-bit archs with new jump threader
gcc/testsuite:
PR testsuite/101688
* g++.dg/warn/Wstringop-overflow-4.C: Disable a test case in ILP32.
Copy the test for _mm_minpos_epu16 from
gcc/testsuite/gcc.target/i386/sse4_1-phminposuw.c, with
a few adjustments:
- Adjust the dejagnu directives for powerpc platform.
- Make the data not be monotonically increasing,
such that some of the returned values are not
always the first value (index 0).
- Create a list of input data testing various scenarios
including more than one minimum value and different
orders and indices of the minimum value.
- Fix a masking issue where the index was being truncated
to 2 bits instead of 3 bits, which wasn't found because
all of the returned indices were 0 with the original
generated data.
- Support big-endian.
2021-08-03 Paul A. Clarke <pc@us.ibm.com>
gcc/testsuite
* gcc.target/powerpc/sse4_1-phminposuw.c: Copy from
gcc/testsuite/gcc.target/i386, adjust dg directives to suit,
make more robust.
Add a naive implementation of the subject x86 intrinsic to
ease porting.
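A hypothetical scalar sketch of the intrinsic's semantics (the actual header
addition is a vector implementation): find the minimum of the eight unsigned
16-bit elements and the index of its first occurrence.

  /* Hypothetical scalar model, not the code added to smmintrin.h.  */
  #include <stdint.h>

  static void
  minpos_epu16_model (const uint16_t v[8], uint16_t *min, uint16_t *idx)
  {
    *min = v[0];
    *idx = 0;
    for (uint16_t i = 1; i < 8; i++)
      if (v[i] < *min)
        {
          *min = v[i];
          *idx = i;
        }
  }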
2021-08-03 Paul A. Clarke <pc@us.ibm.com>
gcc
* config/rs6000/smmintrin.h (_mm_minpos_epu16): New.
In C++17 the out-of-class definitions for static constexpr variables are
redundant, because they are implicitly inline. This change avoids
"redundant redeclaration" warnings from -Wsystem-headers -Wdeprecated.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/bits/random.tcc (linear_congruential_engine): Do not
define static constexpr members when they are implicitly inline.
* include/std/ratio (ratio, __ratio_multiply, __ratio_divide)
(__ratio_add, __ratio_subtract): Likewise.
* include/std/type_traits (integral_constant): Likewise.
* testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error
line number.
This adds a partial specialization of allocator_traits, similar to what
was already done for std::allocator. This means that most uses of
polymorphic_allocator via the traits can avoid the metaprogramming
overhead needed to deduce the properties from polymorphic_allocator.
In addition, I'm changing polymorphic_allocator::delete_object to invoke
the destructor (or pseudo-destructor) directly, rather than calling
allocator_traits::destroy, which calls polymorphic_allocator::destroy
(which is deprecated). This is observable if a user has specialized
allocator_traits<polymorphic_allocator<Foo>> and expects to see its
destroy member function called. I consider explicit specializations of
allocator_traits to be wrong-headed, and this use case seems unnecessary
to support. So delete_object just invokes the destructor directly.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/std/memory_resource (polymorphic_allocator::delete_object):
Call destructor directly instead of using destroy.
(allocator_traits<polymorphic_allocator<T>>): Define partial
specialization.