Like the previous two patches, this moves the iterators
that are in sync.md to iterators.md.
OK? Built and tested for riscv64-linux-gnu.
gcc/ChangeLog:
* config/riscv/sync.md (any_atomic, atomic_optab): Move to ...
* config/riscv/iterators.md: Here.
Just like the previous patch, this moves all of the iterators
from bitmanip.md to iterators.md. All modern backends put the
iterators in iterators.md for easier access.
OK? Built and tested for riscv32-linux-gnu with --with-arch=rv32imafdc_zba_zbb_zbc_zbs.
Thanks,
Andrew Pinski
gcc/ChangeLog:
* config/riscv/bitmanip.md
(bitmanip_bitwise, bitmanip_minmax, clz_ctz_pcnt,
bitmanip_optab, bitmanip_insn, shiftm1): Move to ...
* config/riscv/iterators.md: Here.
This moves the iterators out from riscv.md to iterators.md
like most modern backends.
I have not moved the iterators from the other .md files yet.
OK? Built and tested on riscv64-linux-gnu and riscv32-linux-gnu.
Thanks,
Andrew Pinski
gcc/ChangeLog:
* config/riscv/riscv.md (GPR): Move to new file.
(P, X, BR): Likewise.
(MOVE32, MOVE64, SHORT): Likewise.
(HISI, SUPERQI, SUBX): Likewise.
(ANYI, ANYF, SOFTF): Likewise.
(size, load, default_load): Likewise.
(softload, store, softstore): Likewise.
(reg, fmt, ifmt, amo): Likewise.
(UNITMODE, HALFMODE): Likewise.
(RINT, rint_pattern, rint_rm): Likewise.
(QUIET_COMPARISON, quiet_pattern, QUIET_PATTERN): Likewise.
(any_extend, any_shiftrt, any_shift): Likewise.
(any_bitwise): Likewise.
(any_div, any_mod): Likewise.
(any_gt, any_ge, any_lt, any_le): Likewise.
(u, su): Likewise.
(optab, insn): Likewise.
* config/riscv/iterators.md: New file.
While looking for testcases to quickly test, I noticed that
check_effective_target_bswap was not enabled for riscv when
ZBB is enabled. This patch checks if ZBB is enabled when
targeting RISCV* for bswap.
OK? Ran the testsuite for riscv32-linux-gnu both with and without ZBB enabled.
PR testsuite/106690
gcc/testsuite/ChangeLog:
* lib/target-supports.exp (check_effective_target_bswap):
Return true if riscv and ZBB ISA extension is enabled.
The default expansion for bswap16 is two extractions (shift/and)
followed by an insertion (ior) and then a zero extend. This can be improved
with ZBB enabled to just a full byteswap followed by a (logical) shift right.
This patch adds a new pattern for this which does that.
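As a rough illustration of the idea (not the machine description pattern itself), the same sequence can be written in C++. This sketch assumes a 32-bit register width, as on rv32, and uses the GCC/Clang built-in __builtin_bswap32; the function name is hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: swap the two bytes of a 16-bit value by doing a
// full 32-bit byte swap and then a logical shift right by 16, so the two
// swapped bytes end up in the low half of the result.
inline std::uint16_t bswap16_via_full_swap(std::uint16_t x) {
    std::uint32_t wide = x;                            // zero-extended input
    std::uint32_t swapped = __builtin_bswap32(wide);   // full byte swap
    return static_cast<std::uint16_t>(swapped >> 16);  // logical shift right
}
```

On a 64-bit register the shift amount would be 48 instead of 16.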
OK? Built and tested on riscv32-linux-gnu and riscv64-linux-gnu.
gcc/ChangeLog:
PR target/106601
* config/riscv/bitmanip.md (bswaphi2): New pattern.
gcc/testsuite/ChangeLog:
PR target/106601
* gcc.target/riscv/zbb_32_bswap-2.c: New test.
* gcc.target/riscv/zbb_bswap-2.c: New test.
The problem here is the bswap<mode>2 pattern had a check for TARGET_64BIT
but then used the X iterator. Since the X iterator is either SI or DI depending
on the setting of TARGET_64BIT, there is no reason for the TARGET_64BIT check.
OK? Built and tested on both riscv32-linux-gnu and riscv64-linux-gnu.
Thanks,
Andrew Pinski
gcc/ChangeLog:
PR target/106600
* config/riscv/bitmanip.md (bswap<mode>2): Remove
condition on TARGET_64BIT as X is already conditional there.
gcc/testsuite/ChangeLog:
PR target/106600
* gcc.target/riscv/zbb_32_bswap-1.c: New test.
* gcc.target/riscv/zbb_bswap-1.c: New test.
gcc/fortran/ChangeLog:
PR fortran/103694
* simplify.cc (simplify_size): The size expression of an array cannot
be simplified if an error occurs while resolving the array spec.
gcc/testsuite/ChangeLog:
PR fortran/103694
* gfortran.dg/pr103694.f90: New test.
The recent change to split out the cold path of std::stable_sort caused
a regression for some Qt code. The problem is that the library now adds
a value of type ptrdiff_t to the iterator, which is ambiguous with
-pedantic. The addition could either convert the iterator to a built-in
pointer and add the ptrdiff_t to that, or it could convert the ptrdiff_t
to the iterator's difference_type and use the iterator's own operator+.
The fix is to cast the ptrdiff_t value to the difference type first.
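A minimal sketch of the ambiguity, using a hypothetical iterator-like type (Iter and read_at are illustrative names, not the Qt or libstdc++ code). The type converts to a raw pointer and also has its own operator+ taking a narrower difference type, so adding a ptrdiff_t directly has two equally good interpretations.

```cpp
#include <cstddef>

// Hypothetical iterator-like type for illustration only.
struct Iter {
    using difference_type = short;
    int* p;
    operator int*() const { return p; }  // converts to a built-in pointer
    Iter operator+(difference_type n) const { return Iter{p + n}; }
};

int read_at(Iter first, std::ptrdiff_t n) {
    // Writing `first + n` here is ambiguous: either convert `first` to int*
    // and use pointer arithmetic, or convert `n` to short and call
    // Iter::operator+. Casting to the iterator's own difference type first
    // makes Iter::operator+ an exact match.
    Iter it = first + static_cast<Iter::difference_type>(n);
    return *it.p;
}
```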
libstdc++-v3/ChangeLog:
* include/bits/stl_algo.h (__stable_sort): Cast size to
iterator's difference type.
* testsuite/25_algorithms/stable_sort/4.cc: New test.
Until now operator+(char*, const string&) and operator+(const string&,
char*) had different performance characteristics. The former required a
single memory allocation and the latter required two. This patch makes
the performance equal.
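The single-allocation approach can be sketched as follows. This is an illustration of the technique using only the public std::string interface, not the code added to basic_string.tcc; the function name is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: concatenate with one allocation by reserving the
// final size up front, then appending both pieces into that storage.
std::string concat_single_alloc(const std::string& lhs, const char* rhs) {
    const std::size_t rhs_len = std::char_traits<char>::length(rhs);
    std::string result;
    result.reserve(lhs.size() + rhs_len);  // the only allocation (if any)
    result.append(lhs);
    result.append(rhs, rhs_len);
    return result;
}
```

A naive version that copies lhs and then appends rhs may allocate once for the copy and again when the append outgrows the copied capacity.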
libstdc++-v3/ChangeLog:
* include/bits/basic_string.h (operator+(const string&, const char*)):
Remove naive implementation.
* include/bits/basic_string.tcc (operator+(const string&, const char*)):
Add single-allocation implementation.
Signed-off-by: Will Hawkins <whh8b@obs.cr>
When an object of decimal floating-point type is default-initialized,
GCC is inconsistent about whether it is given the all-zero-bits
representation (zero with the least quantum exponent) or whether it
acts like a conversion of integer 0 to the DFP type (zero with quantum
exponent 0). In particular, the representation stored in memory can
have all zero bits, but optimization of access to the same object
based on its known constant value can then produce zero with quantum
exponent 0 instead.
C2x leaves the quantum exponent for default initialization
implementation-defined, but that doesn't allow such inconsistency in
the interpretation of a single object. All zero bits seems most
appropriate; change build_real to special-case dconst0 the same way
other constants are special-cased and ensure that the correct zero for
the type is generated.
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
gcc/
* tree.cc (build_real): Give DFP dconst0 the minimum quantum
exponent for the type.
gcc/testsuite/
* gcc.dg/torture/dfp-default-init-1.c,
gcc.dg/torture/dfp-default-init-2.c,
gcc.dg/torture/dfp-default-init-3.c: New tests.
eBPF effectively supports two kinds of call instructions:
- The so called pseudo-calls ("bpf to bpf").
- External calls ("bpf to kernel").
The BPF call instruction always gets an immediate argument, whose
interpretation varies depending on the purpose of the instruction:
- For pseudo-calls, the immediate argument is interpreted as a
32-bit PC-relative displacement measured in number of 64-bit words
minus one.
- For external calls, the immediate argument is interpreted as the
identification of a kernel helper.
In order to differentiate both flavors of CALL instructions the SRC
field of the instruction (otherwise unused) is abused as an opcode;
if the field holds 0 the instruction is an external call, if it holds
BPF_PSEUDO_CALL the instruction is a pseudo-call.
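The two encodings described above can be sketched like this. The struct layout and function names are hypothetical and simplified; the opcode value 0x85 (BPF_JMP | BPF_CALL) and BPF_PSEUDO_CALL == 1 are taken from the Linux UAPI headers, and instruction positions are assumed to be counted in 64-bit words.

```cpp
#include <cstdint>

// Simplified, illustrative view of a CALL instruction (not the real layout).
struct CallInsn {
    std::uint8_t opcode;
    std::uint8_t src;   // abused as a sub-opcode for CALL
    std::int32_t imm;
};

constexpr std::uint8_t kCallOpcode = 0x85;  // BPF_JMP | BPF_CALL
constexpr std::uint8_t kPseudoCall = 1;     // BPF_PSEUDO_CALL

// External call: SRC is 0 and the immediate identifies a kernel helper.
constexpr CallInsn external_call(std::int32_t helper_id) {
    return CallInsn{kCallOpcode, 0, helper_id};
}

// Pseudo-call: SRC is BPF_PSEUDO_CALL and the immediate is the PC-relative
// displacement in 64-bit words, minus one.
constexpr CallInsn pseudo_call(std::int32_t call_index,
                               std::int32_t target_index) {
    return CallInsn{kCallOpcode, kPseudoCall, target_index - call_index - 1};
}
```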
C-to-BPF toolchains, including the GNU toolchain, use the following
practical heuristic at assembly time in order to determine what kind
of CALL instruction to generate: call instructions requiring a fixup
at assembly time are interpreted as pseudo-calls. This means that in
practice a call instruction involving symbols at assembly time (such
as `call foo') is assembled into a pseudo-call instruction, whereas
something like `call 12' is assembled into an external call
instruction.
In both cases, the argument of CALL is an immediate: at the time of
writing eBPF lacks support for indirect calls, i.e. there is no
call-to-register instruction.
This is the reason why BPF programs, in practice, rely on certain
optimizations to happen in order to generate calls to immediates.
This is a typical example involving a kernel helper:
  static void * (*bpf_map_lookup_elem)(void *map, const void *key)
    = (void *) 1;
  int foo (...)
  {
    char *ret;
    ret = bpf_map_lookup_elem (args...);
    if (ret)
      return 1;
    return 0;
  }
Note how the code above relies on the compiler to do constant
propagation so the call to bpf_map_lookup_elem can be compiled to a
`call 1' instruction.
While GCC provides a kernel_helper function declaration attribute that
can be used in a robust way to tell GCC to generate an external call
regardless of optimization level and any other consideration, the Linux
kernel bpf_helpers.h file relies on tricks like the above.
This patch modifies the BPF backend to prevent SSA sparse constant
propagation from being "undone" by the expander loading the function
address into a register. A new test is also added.
Tested in bpf-unknown-linux-gnu.
No regressions.
gcc/ChangeLog:
PR target/106733
* config/bpf/bpf.cc (bpf_legitimate_address_p): Recognize integer
constants as legitimate addresses for functions.
(bpf_small_register_classes_for_mode_p): Define target hook.
gcc/testsuite/ChangeLog:
PR target/106733
* gcc.target/bpf/constant-calls.c: Rename to ...
* gcc.target/bpf/constant-calls-1.c: and modify to not expect
failure anymore.
* gcc.target/bpf/constant-calls-2.c: New test.
This LWG issue was closed as NAD, as it was just a bug in an
implementation, not a defect in the standard. Libstdc++ never had that
bug and always worked for the problem case. Add a test to ensure we
don't regress.
The problem occurs when abs is implemented using a ternary expression:
return d >= d.zero() ? d : -d;
If decltype(-d) is not the same as decltype(d) then this is ambiguous,
because each type can be converted to the other, so there is no common
type.
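One way to avoid the ternary is to return from separate branches, so each expression converts to the declared return type independently. The sketch below is an illustration with hypothetical names, not the libstdc++ implementation; it assumes C++14 or later and uses a duration whose period ratio<60, 100> is not in lowest terms.

```cpp
#include <chrono>
#include <ratio>

// A duration whose period is deliberately not reduced.
using Unreduced = std::chrono::duration<int, std::ratio<60, 100>>;

// Hypothetical abs: no conditional operator, so no common type of
// decltype(d) and decltype(-d) is ever needed.
template <class Rep, class Period>
constexpr std::chrono::duration<Rep, Period>
abs_sketch(std::chrono::duration<Rep, Period> d) {
    if (d >= d.zero())
        return d;
    return -d;  // converted to the declared return type
}
```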
libstdc++-v3/ChangeLog:
* testsuite/20_util/duration_cast/rounding.cc: Check abs with
non-reduced duration.
This moves a few functions, notably normalization after a big comment
documenting it. I've left the rest unorganized for now.
* gimple-predicate-analysis.cc: Move predicate normalization
after the comment documenting it.
This splits the API collected in gimple-predicate-analysis.h into
what I'd call a predicate and assorted functionality plus utility
used by the uninit pass that happens to use that. I've tried to
be minimalistic with refactoring; there's still recursive
instantiation of uninit_analysis, the new class encapsulating a
series of uninit analysis queries from the uninit pass. But it
at least should make the predicate part actually reusable and
make it a little clearer which predicate is dealt with in the
uninit_analysis part.
I will follow up with moving the predicate implementation bits
together in the gimple-predicate-analysis.cc file.
* gimple-predicate-analysis.h (predicate): Split out
non-predicate related functionality into ..
(uninit_analysis): .. this new class.
* gimple-predicate-analysis.cc: Refactor into two classes.
* tree-ssa-uninit.cc (find_uninit_use): Use uninit_analysis.
This limits the simple control dep also to the cd_root plus avoids
filling the lazily computed PHI def predicate in the early out path
which would leave it not simplified and normalized if it were
re-used. It also avoids computing the use predicates when the
post-dominance early out doesn't need it. It also syncs
predicate::use_cannot_happen with init_from_phi_def, adding the
missing PHI edge to the computed chains (the simple control dep
code already adds it).
* gimple-predicate-analysis.cc (predicate::use_cannot_happen):
Do simple_control_dep_chain only up to cd_root, add the PHI
operand edge to the chains like init_from_phi_def does.
(predicate::is_use_guarded): Speedup early out, avoid half-way
initializing the PHI def predicate.
Currently, when the md file reader sees <something> and something is a valid mode
(or code) attribute that doesn't include a case for the current mode
(or code), it just leaves the <something> untouched.
I went through all cases matching <[a-zA-Z] in tmp-mddump.md after make mddump.
Most of the cases were related to the recent V*BF mode additions, some
to V*HF mode too, and there was one typo.
2022-08-24 Jakub Jelinek <jakub@redhat.com>
PR target/106721
* config/i386/sse.md (shuffletype): Add V32BF, V16BF and V8BF entries.
Change V32HF, V16HF and V8HF entries from "f" to "i".
(iptr): Add V32BF, V16BF, V8BF and BF entries.
(i128vldq): Add V16HF and V16BF entries.
(avx512er_vmrcp28<mode><mask_name><round_saeonly_name>): Fix typo,
mask_opernad3 -> mask_operand3.
* gcc.target/i386/avx512vl-pr106721.c: New test.
On Thu, Aug 18, 2022 at 11:02:44PM +0000, Joseph Myers wrote:
> ISO C2x standardizes the existing #warning extension. Arrange
> accordingly for it not to be diagnosed with -std=c2x -pedantic, but to
> be diagnosed with -Wc11-c2x-compat.
And here is the corresponding C++ version.
Don't pedwarn about this for C++23/GNU++23 and tweak the diagnostics
for C++ otherwise, + testsuite coverage.
The diagnostic wording is similar e.g. to the #elifdef diagnostics.
2022-08-24 Jakub Jelinek <jakub@redhat.com>
PR c++/106646
* init.cc: Implement C++23 P2437R1 - Support for #warning.
(lang_defaults): Set warning_directive for GNUCXX23 and CXX23.
* directives.cc (directive_diagnostics): Use different wording of
#warning pedwarn for C++.
* g++.dg/cpp/warning-1.C: New test.
* g++.dg/cpp/warning-2.C: New test.
* g++.dg/cpp/warning-3.C: New test.
In 'normal' mode a function jump uses the 'bl' instruction,
so the range of a function jump is +-128MB.
This patch adds support for 'medium' mode, which
completes the function jump with two instructions:
pcalau12i + jirl
In this mode the function jump range increases to +-2GB.
Compared with 'normal' mode, 'medium' mode only affects the
jump range of functions.
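The two ranges follow from the immediate widths. The sketch below assumes, per the LoongArch ISA manual rather than this patch, that 'bl' carries a 26-bit signed offset scaled by 4 and that 'pcalau12i' carries a 20-bit signed immediate scaled by 4096; the function name is hypothetical.

```cpp
// Reach in bytes (in each direction) of a signed immediate with
// `imm_bits` bits that is shifted left by `shift` bits.
constexpr long long reach_bytes(int imm_bits, int shift) {
    return 1LL << (imm_bits - 1 + shift);
}

// bl: 26-bit offset, scaled by 4 -> 128MB either way (assumed widths).
static_assert(reach_bytes(26, 2) == 128LL * 1024 * 1024, "bl reach");
// pcalau12i: 20-bit immediate, scaled by 4096 -> 2GB either way.
static_assert(reach_bytes(20, 12) == 2LL * 1024 * 1024 * 1024,
              "pcalau12i reach");
```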
gcc/ChangeLog:
* config/loongarch/genopts/loongarch-strings: Support code model medium.
* config/loongarch/genopts/loongarch.opt.in: Likewise.
* config/loongarch/loongarch-def.c: Likewise.
* config/loongarch/loongarch-def.h (CMODEL_LARGE): Likewise.
(CMODEL_EXTREME): Likewise.
(N_CMODEL_TYPES): Likewise.
(CMODEL_MEDIUM): Likewise.
* config/loongarch/loongarch-opts.cc: Likewise.
* config/loongarch/loongarch-opts.h (TARGET_CMODEL_MEDIUM): Likewise.
* config/loongarch/loongarch-str.h (STR_CMODEL_MEDIUM): Likewise.
* config/loongarch/loongarch.cc (loongarch_call_tls_get_addr):
TLS symbol loading supports medium mode.
(loongarch_legitimize_call_address): When medium mode, make a symbolic
jump with two instructions.
(loongarch_option_override_internal): Support medium.
* config/loongarch/loongarch.md (@pcalau12i<mode>): New template.
(@sibcall_internal_1<mode>): New function call templates added to support
medium mode.
(@sibcall_value_internal_1<mode>): Likewise.
(@sibcall_value_multiple_internal_1<mode>): Likewise.
(@call_internal_1<mode>): Likewise.
(@call_value_internal_1<mode>): Likewise.
(@call_value_multiple_internal_1<mode>): Likewise.
* config/loongarch/loongarch.opt: Support medium.
* config/loongarch/predicates.md: Add processing about medium mode.
* doc/invoke.texi: Document for '-mcmodel=medium'.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/func-call-medium-1.c: New test.
* gcc.target/loongarch/func-call-medium-2.c: New test.
* gcc.target/loongarch/func-call-medium-3.c: New test.
* gcc.target/loongarch/func-call-medium-4.c: New test.
* gcc.target/loongarch/func-call-medium-5.c: New test.
* gcc.target/loongarch/func-call-medium-6.c: New test.
* gcc.target/loongarch/func-call-medium-7.c: New test.
* gcc.target/loongarch/func-call-medium-8.c: New test.
* gcc.target/loongarch/tls-gd-noplt.c: Add compile parameter '-mexplicit-relocs'.
The following reverts a hunk from r8-5789-g11ef0b22d68cd1 that
made compute_control_dep_chain start from function entry rather
than the immediate dominator of the source block of the edge with
the undefined value on the PHI node. Reverting at that point
does not reveal any testsuite FAIL, in particular the added
testcase still passes. The following adjusts this to the other
function that computes predicates that hold on the PHI incoming
edges with undefined values, predicate::init_from_phi_def, which
starts at the immediate dominator of the PHI. That's much less
likely to run into the CFG walking limit.
* gimple-predicate-analysis.cc (predicate::use_cannot_happen):
Start the compute_control_dep_chain walk from the immediate
dominator of the PHI.
This patch fixes a pretty stoopid thinko. When I added code to warn
about pessimizing std::move in initializations like
T t{std::move(T())};
I also added code to unwrap the expression from { }. But when we have
return {std::move(t)};
we cannot warn about a redundant std::move because the implicit move
wouldn't happen for "return {t};" because the expression isn't just
a name. However, we still want to warn about
return {std::move(T())};
so let's not disable the -Wpessimizing-move warning. Tests added for
both cases.
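The difference between the braced cases can be observed with a small copy-counting type. This is a sketch with hypothetical names (T here is a stand-in, Box is an aggregate wrapper); the pessimizing form is mentioned only in a comment so the example itself compiles without warnings.

```cpp
#include <utility>

// Stand-in type that records how many times it was copied.
struct T {
    int copies = 0;
    T() = default;
    T(const T& other) : copies(other.copies + 1) {}
    T(T&& other) noexcept : copies(other.copies) {}
};

struct Box { T t; };  // aggregate, so `return {expr};` initializes t

// No implicit move here: `{t}` is not just a name, so t is copied.
Box box_copy(T t) { return {t}; }

// The std::move is therefore not redundant: it turns the copy into a move.
Box box_move(T t) { return {std::move(t)}; }

// By contrast, `return {std::move(T())};` would only prevent the temporary
// from being constructed in place, which is the pessimizing case.
```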
gcc/cp/ChangeLog:
* typeck.cc (maybe_warn_pessimizing_move): Don't warn about
redundant std::move when the expression was wrapped in { }.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/Wpessimizing-move10.C: New test.
* g++.dg/cpp0x/Wredundant-move12.C: New test.
Since XMM BF16 tests only require SSE2, replace vmovdqu with movdqu in
BF16 XMM ABI tests to support SSE2 machines without AVX.
Tested on x86-64 machines with and without AVX.
* gcc.target/x86_64/abi/bf16/asm-support.S: Replace vmovdqu with
movdqu.
This implements the non-<ranges> changes from P2321R2, which primarily
consist of additional converting constructors, assignment operator and
swap overloads for std::pair and std::tuple.
libstdc++-v3/ChangeLog:
* include/bits/stl_bvector.h (_Bit_reference::operator=): Define
const overload for C++23 as per P2321R2.
* include/bits/stl_pair.h (pair::swap): Likewise.
(pair::pair): Define additional converting constructors for
C++23 as per P2321R2.
(pair::operator=): Define const overloads for C++23 as per
P2321R2.
(swap): Define overload taking const pair& for C++23 as per
P2321R2.
(basic_common_reference): Define partial specialization for
pair for C++23 as per P2321R2.
(common_type): Likewise.
* include/bits/uses_allocator_args.h
(uses_allocator_construction_args): Define additional pair
overloads for C++23 as per P2321R2.
* include/std/tuple (_Tuple_impl::_Tuple_impl): Define
additional converting constructors for C++23 as per P2321R2.
(_Tuple_impl::_M_assign): Define const overloads for C++23
as per P2321R2.
(_Tuple_impl::_M_swap): Likewise.
(tuple::__constructible): Define as a convenient renaming of
_TCC<true>::__constructible.
(tuple::__convertible): As above but for _TCC<true>::__convertible.
(tuple::tuple): Define additional converting constructors for
C++23 as per P2321R2.
(tuple::operator=): Define const overloads for C++23 as per
P2321R2.
(tuple::swap): Likewise.
(basic_common_reference): Define partial specialization for
tuple for C++23 as per P2321R2.
(common_type): Likewise.
* testsuite/20_util/pair/p2321r2.cc: New test.
* testsuite/20_util/tuple/p2321r2.cc: New test.
* testsuite/23_containers/vector/bool/element_access/1.cc: New test.
P2321R2 adds additional conditionally explicit constructors to std::tuple
which we'll concisely implement in a subsequent patch using explicit(bool),
like in our C++20 std::pair implementation. This prerequisite patch
adds member typedefs to _TupleConstraints for testing element-wise
constructibility and convertibility separately; we'll use the first in
the new constructors' constraints, and the second in their explicit
specifier.
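The distinction between the two element-wise checks can be sketched with standard type traits. This is an illustration assuming C++17 fold expressions, not the _TupleConstraints code; ElementwiseSketch is a hypothetical name, and both packs are assumed to have the same length.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for the two member predicates.
template <class... Ts>
struct ElementwiseSketch {
    // True if every Ts can be constructed from the matching Us
    // (this would go in a constructor's constraints).
    template <class... Us>
    static constexpr bool constructible =
        (std::is_constructible_v<Ts, Us> && ...);

    // True if every Us converts implicitly to the matching Ts
    // (its negation would go in the explicit specifier).
    template <class... Us>
    static constexpr bool convertible =
        (std::is_convertible_v<Us, Ts> && ...);
};
```

For example, a std::vector<int> element is constructible from std::size_t but not implicitly convertible from it, so a constructor taking that argument would exist but be explicit.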
In passing, this patch also redefines the existing member predicates
__is_ex/implicitly_constructible in terms of these new members. This
seems to reduce compile time and memory usage by about 10% for large
tuples when using the converting constructors that're constrained by
_Explicit/_ImplicitCtor.
libstdc++-v3/ChangeLog:
* include/std/tuple (_TupleConstraints::__convertible): Define.
(_TupleConstraints::__constructible): Define.
(_TupleConstraints::__is_explicitly_constructible): Redefine this
in terms of __convertible and __constructible.
(_TupleConstraints::__is_implicitly_constructible): Likewise.
The optimization for the common case of std::visit forgot to handle the
edge case of passing zero variants to a non-void visitor and converting
the result to void.
libstdc++-v3/ChangeLog:
PR libstdc++/106589
* include/std/variant (__do_visit): Handle is_void<R> for zero
argument case.
* testsuite/20_util/variant/visit_r.cc: Check std::visit<void>(v).
On 64-bit Windows, long is 32 bits and can't be used as stride in memory
operand when base is a pointer which is 64 bits. Cast stride to
__PTRDIFF_TYPE__, instead of long.
PR target/106714
* config/i386/amxtileintrin.h (_tile_loadd_internal): Cast to
__PTRDIFF_TYPE__.
(_tile_stream_loadd_internal): Likewise.
(_tile_stored_internal): Likewise.
The following applies similar measures as r13-2133-ge66cf626c72d58
to the computation of the use predicate when the path from PHI def
to use is too long and we run into compute_control_dep_chain limits.
It also moves the preprocessor define limits internal.
This resolves the reduced testcase but not the original one.
PR tree-optimization/106722
* gimple-predicate-analysis.h (MAX_NUM_CHAINS, MAX_CHAIN_LEN,
MAX_POSTDOM_CHECK, MAX_SWITCH_CASES): Move ...
* gimple-predicate-analysis.cc: ... here and document.
(simple_control_dep_chain): New function, factored from
predicate::use_cannot_happen.
(predicate::use_cannot_happen): Adjust.
(predicate::predicate): Use simple_control_dep_chain as fallback.
* g++.dg/uninit-pr106722-1.C: New testcase.
r11-4123 came without a test but I happened upon a nice test case that
got fixed by that revision. So I think it'd be good to add it. The
ICE was:
phi-1.C: In constructor 'ElementManager::ElementManager()':
phi-1.C:28:1: error: missing definition
28 | ElementManager::ElementManager() : array_(makeArray()) {}
| ^~~~~~~~~~~~~~
for SSA_NAME: _12 in statement:
_10 = PHI <_12(3), _11(5)>
PHI argument
_12
for PHI node
_10 = PHI <_12(3), _11(5)>
during GIMPLE pass: fixup_cfg
phi-1.C:28:1: internal compiler error: verify_ssa failed
gcc/testsuite/ChangeLog:
* g++.dg/torture/phi-1.C: New test.
Exactly the same as previous commit for depend-4.f90, r13-2151.
gcc/testsuite/
* gfortran.dg/gomp/depend-6.f90: Fix array index use for
depobj var + update scan-tree-dump-times.
Like the integer version, when op1 == op2 is known to be true the
ranges are also equal.
gcc/ChangeLog:
* range-op-float.cc (foperator_equal::op1_range): Set range to
range of op2.
That's a weird function in predicate analysis that currently looks like
  /* Return true if BB1 is postdominating BB2 and BB1 is not a loop exit
     bb. The loop exit bb check is simple and does not cover all cases. */
  static bool
  is_non_loop_exit_postdominating (basic_block bb1, basic_block bb2)
  {
    if (!dominated_by_p (CDI_POST_DOMINATORS, bb2, bb1))
      return false;
    if (single_pred_p (bb1) && !single_succ_p (bb2))
      return false;
    return true;
  }
One can refactor this to
  return (dominated_by_p (CDI_POST_DOMINATORS, bb2, bb1)
          && !(single_pred_p (bb1) && !single_succ_p (bb2)));
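That the two forms agree can be checked by brute force over the three boolean inputs. In this sketch the GCC queries are replaced by plain bool parameters with hypothetical names, so it only verifies the boolean algebra, not anything about the CFG.

```cpp
// Original shape: two early returns.
constexpr bool original_form(bool postdominates, bool bb1_single_pred,
                             bool bb2_single_succ) {
    if (!postdominates)
        return false;
    if (bb1_single_pred && !bb2_single_succ)
        return false;
    return true;
}

// Refactored shape: a single expression.
constexpr bool refactored_form(bool postdominates, bool bb1_single_pred,
                               bool bb2_single_succ) {
    return postdominates && !(bb1_single_pred && !bb2_single_succ);
}

// Compare both forms on all eight input combinations.
constexpr bool forms_agree() {
    for (int bits = 0; bits < 8; ++bits) {
        const bool p = (bits & 1) != 0;
        const bool a = (bits & 2) != 0;
        const bool b = (bits & 4) != 0;
        if (original_form(p, a, b) != refactored_form(p, a, b))
            return false;
    }
    return true;
}
```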
Notable is that the comment refers to BB1 with respect to a loop
exit but the test seems to be written with an exit edge bb1 -> bb2
in mind. None of the three callers are guaranteed to have bb1 and
bb2 connected directly with an edge.
The patch now introduces a is_loop_exit function and inlines
the post-dominance check which makes the find_control_equiv_block
case simpler because the post-dominance check can be elided.
It also avoids the double negation in compute_control_dep_chain
and makes it obvious this is the case where we do look at an edge.
For the main is_use_guarded API I chose to elide the loop exit
test, if the use block post-dominates the definition block of the
PHI node the use is always unconditional. I don't quite understand
the loop exit special-casing of the remaining two uses though.
* gimple-predicate-analysis.cc (is_loop_exit): Split out
from ...
(is_non_loop_exit_postdominating): ... here. Remove after
inlining ...
(find_control_equiv_block): ... here.
(compute_control_dep_chain): ... and here.
(predicate::is_use_guarded): Do not exempt loop exits
from short-cutting the case of the use post-dominating the
PHI definition.
Fix the ABI test failure caused by a missing type.
gcc/testsuite/ChangeLog:
* gcc.target/x86_64/abi/bf16/bf16-helper.h:
Add _m128bf16/m256bf16/_m512bf16.
* gcc.target/x86_64/abi/bf16/m512bf16/bf16-zmm-check.h:
Include bf16-helper.h.
With an input condition of op1 > op2, and evaluating the unsigned expression:
LHS = op1 - op2
range-ops was returning LHS < op1, which is incorrect as op2 could be
zero. This patch adjusts it to return LHS <= op1.
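A concrete instance of the counterexample, as a minimal sketch with a hypothetical function name: with op1 > op2 and op2 equal to zero, the result equals op1, so only LHS <= op1 holds in general.

```cpp
// Unsigned subtraction as in the expression LHS = op1 - op2.
constexpr unsigned lhs_of(unsigned op1, unsigned op2) {
    return op1 - op2;
}

// op1 = 5 > op2 = 0: LHS == op1, so "LHS < op1" would be wrong.
static_assert(lhs_of(5u, 0u) == 5u, "LHS can equal op1");
// op1 = 5 > op2 = 3: LHS is strictly smaller, consistent with LHS <= op1.
static_assert(lhs_of(5u, 3u) < 5u, "LHS smaller when op2 is nonzero");
```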
PR tree-optimization/106687
gcc/
* range-op.cc (operator_minus::lhs_op1_relation): Return VREL_LE
for the VREL_GT case as well.
gcc/testsuite/
* g++.dg/pr106687.C: New.
libstdc++-v3/ChangeLog:
PR libstdc++/105678
* doc/xml/manual/using.xml: Document -lstdc++_libbacktrace
requirement for using std::stacktrace. Also adjust -frtti and
-fexceptions to document non-default (i.e. negative) forms.
* doc/html/*: Regenerate.
When I changed std::thread and std::async to avoid unnecessary move
construction of temporaries, I introduced a regression where types with
an explicit copy constructor could not be passed to std::thread or
std::async. The fix is to add a constructor instead of using aggregate
initialization of an unnamed temporary.
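The underlying language rule can be shown without any threads. In this sketch (Arg and Holder are hypothetical names, not the libstdc++ types), a wrapper with a constructor direct-initializes its member, which is allowed to use an explicit copy constructor, whereas aggregate initialization would copy-initialize the member and be rejected.

```cpp
// A type whose copy constructor is explicit.
struct Arg {
    int id = 0;
    Arg() = default;
    explicit Arg(const Arg&) = default;
};

// Wrapper with a constructor: `value(v)` is direct-initialization, so the
// explicit copy constructor is usable.
template <class T>
struct Holder {
    T value;
    explicit Holder(const T& v) : value(v) {}
};

// An aggregate `template <class T> struct Agg { T value; };` written as
// `Agg<Arg>{a}` would copy-initialize `value` and fail to compile.
```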
libstdc++-v3/ChangeLog:
PR libstdc++/106695
* include/bits/std_thread.h (thread::_State_impl): Forward
individual arguments to _Invoker constructor.
(thread::_Invoker): Add constructor. Delete copies.
* include/std/future (__future_base::_Deferred_state): Forward
individual arguments to _Invoker constructor.
(__future_base::_Async_state_impl): Likewise.
* testsuite/30_threads/async/106695.cc: New test.
* testsuite/30_threads/thread/106695.cc: New test.
Currently we fail to notice integer overflow when parsing a
back-reference expression, or when converting the parsed result from
long to int. This changes the result to be int, so no conversion is
needed, and uses the overflow-checking built-ins to detect an
out-of-range back-reference.
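A sketch of the overflow-checked accumulation, using the GCC/Clang built-ins __builtin_mul_overflow and __builtin_add_overflow. The function name is hypothetical and only decimal digits are handled; this is not the code in regex_compiler.tcc.

```cpp
#include <string>

// Hypothetical helper: parse a string of decimal digits into an int.
// Returns false if the value does not fit, instead of silently wrapping.
bool parse_decimal_checked(const std::string& digits, int& out) {
    int value = 0;
    for (char c : digits) {
        const int digit = c - '0';  // assumes c is in '0'..'9'
        if (__builtin_mul_overflow(value, 10, &value)
            || __builtin_add_overflow(value, digit, &value))
            return false;  // out-of-range number
    }
    out = value;
    return true;
}
```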
libstdc++-v3/ChangeLog:
PR libstdc++/106607
* include/bits/regex_compiler.tcc (_Compiler::_M_cur_int_value):
Use built-ins to check for integer overflow in back-reference
number.
* testsuite/28_regex/basic_regex/106607.cc: New test.
The earlyclobber in the pattern yields inefficient code due to
unnecessarily generated moves. Optimize by removing the earlyclobber
for two special alternatives:
- If OP2 is a small constant integer.
- If the logical bit operation has only two operands.
gcc/ChangeLog:
* config/pru/pru.md (pru_<code>di3): New alternative for
two operands but without earlyclobber.
gcc/testsuite/ChangeLog:
* gcc.target/pru/bitop-di.c: New test.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
Use the FILL instruction to efficiently load -1 constants.
gcc/ChangeLog:
* config/pru/pru.md (prumov<mode>, mov<mode>): Add
variants for loading -1 consts.
gcc/testsuite/ChangeLog:
* gcc.target/pru/mov-m1.c: New test.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
Add new patterns to optimize 64-bit sign- and zero-extend operations for
the PRU target.
The new 64-bit zero-extend patterns are straightforward define_insns.
The old 16/32-bit sign-extend pattern has been rewritten from scratch
in order to add 64-bit support. The new pattern expands into several
optimized insns for filling bytes with zeros or ones, and for
conditional branching on bit-test. The bulk of this patch is to
implement the patterns for those new optimized insns.
PR target/106564
gcc/ChangeLog:
* config/pru/constraints.md (Um): New constraint for -1.
(Uf): New constraint for IOR fill-bytes constants.
(Uz): New constraint for AND zero-bytes constants.
* config/pru/predicates.md (const_fillbytes_operand): New
predicate for IOR fill-bytes constants.
(const_zerobytes_operand): New predicate for AND zero-bytes
constants.
* config/pru/pru-protos.h (pru_output_sign_extend): Remove.
(struct pru_byterange): New struct to describe a byte range.
(pru_calc_byterange): New declaration.
* config/pru/pru.cc (pru_rtx_costs): Add penalty for
64-bit zero-extend.
(pru_output_sign_extend): Remove.
(pru_calc_byterange): New helper function to extract byte
range info from a constant.
(pru_print_operand): Remove 'y' and 'z' print modifiers.
* config/pru/pru.md (zero_extendqidi2): New pattern.
(zero_extendhidi2): New pattern.
(zero_extendsidi2): New pattern.
(extend<EQS0:mode><EQD:mode>2): Rewrite as an expand.
(@pru_ior_fillbytes<mode>): New pattern.
(@pru_and_zerobytes<mode>): New pattern.
(<code>di3): Rewrite as an expand and handle ZERO and FILL
special cases.
(pru_<code>di3): New name for <code>di3.
(@cbranch_qbbx_const_<BIT_TEST:code><HIDI:mode>): New pattern to
handle bit-test for 64-bit registers.
gcc/testsuite/ChangeLog:
* gcc.target/pru/pr106564-1.c: New test.
* gcc.target/pru/pr106564-2.c: New test.
* gcc.target/pru/pr106564-3.c: New test.
* gcc.target/pru/pr106564-4.c: New test.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
gcc/fortran/ChangeLog:
PR fortran/106557
* simplify.cc (gfc_simplify_ibclr): Ensure consistent results of
the simplification by dropping a redundant memory representation
of argument x.
(gfc_simplify_ibset): Likewise.
gcc/testsuite/ChangeLog:
PR fortran/106557
* gfortran.dg/pr106557.f90: New test.
The following removes the unused def_expr, use_expr and expr APIs
from the predicate class including the unconditional build of the
GENERIC use_expr on each uninit analysis run.
* gimple-predicate-analysis.h (predicate::m_use_expr): Remove.
(predicate::def_expr): Likewise.
(predicate::use_expr): Likewise.
(predicate::expr): Likewise.
* gimple-predicate-analysis.cc (predicate::def_expr): Remove.
(predicate::use_expr): Likewise.
(predicate::expr): Likewise.
(predicate::is_use_guarded): Do not build m_use_expr.
PR lto/106700
gcc/ChangeLog:
* configure.ac: Detect O_NONBLOCK flag for open.
* config.in: Regenerate.
* configure: Regenerate.
* opts-common.cc (jobserver_info::connect): Set is_connected
properly based on O_NONBLOCK.
* opts-jobserver.h (struct jobserver_info): Add is_connected
member variable.
gcc/lto/ChangeLog:
* lto.cc (wait_for_child): Ask if we are connected to jobserver.
(stream_out_partitions): Likewise.
This patch fixes an issue with poly_uint16 (1, 1) in the machine mode self-test.
gcc/ChangeLog:
* simplify-rtx.cc (test_vector_subregs_fore_back): Make first value
and repeat value different.
Usually, the caller takes care of the .o files for the offload compilers
(suffix: ".target.o"). However, if an error occurs during processing
(e.g. fatal error by lto1), they were not deleted.
gcc/ChangeLog:
PR lto/106686
* lto-wrapper.cc (free_array_of_ptrs): Move before tool_cleanup.
(tool_cleanup): Unlink offload_names.
(compile_offload_image): Take filename argument to set it early.
(compile_images_for_offload_targets): Update call; set
offload_names to NULL after freeing the array.
The following avoids adding PHIs to the worklist for uninit processing
if we reach them following backedges. That confuses predicate analysis
because it assumes the use is happening in the same iteration as the
definition. For the testcase in the PR the situation is like
  void foo (int val)
  {
    int uninit;
    # val = PHI <..> (B)
    for (..)
      {
        if (..)
          {
            .. = val; (C)
            val = uninit;
          }
        # val = PHI <..> (A)
      }
  }
and starting from (A) with 'uninit' as argument we arrive at (B)
and from there at (C). Predicate analysis then tries to prove
that the predicate of (B) (not the backedge) makes the
path from (B) to (C) unreachable, which isn't really what is
necessary - that's what we'd need to do if the preheader
edge of the loop were the edge with the uninitialized def.
So the following makes those cases intentionally false negatives.
PR tree-optimization/105937
* tree-ssa-uninit.cc (find_uninit_use): Do not queue PHIs
on backedges.
(execute_late_warn_uninitialized): Mark backedges.
* g++.dg/uninit-pr105937.C: New testcase.