gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/vmsgt_vv-1.c: New test.
* gcc.target/riscv/rvv/base/vmsgt_vv-2.c: New test.
* gcc.target/riscv/rvv/base/vmsgt_vv-3.c: New test.
* gcc.target/riscv/rvv/base/vmsgt_vv_m-1.c: New test.
* gcc.target/riscv/rvv/base/vmsgt_vv_m-2.c: New test.
* gcc.target/riscv/rvv/base/vmsgt_vv_m-3.c: New test.
* gcc.target/riscv/rvv/base/vmsgt_vv_mu-1.c: New test.
* gcc.target/riscv/rvv/base/vmsgt_vv_mu-2.c: New test.
* gcc.target/riscv/rvv/base/vmsgt_vv_mu-3.c: New test.
* gcc.target/riscv/rvv/base/vmsgtu_vv-1.c: New test.
* gcc.target/riscv/rvv/base/vmsgtu_vv-2.c: New test.
* gcc.target/riscv/rvv/base/vmsgtu_vv-3.c: New test.
* gcc.target/riscv/rvv/base/vmsgtu_vv_m-1.c: New test.
* gcc.target/riscv/rvv/base/vmsgtu_vv_m-2.c: New test.
* gcc.target/riscv/rvv/base/vmsgtu_vv_m-3.c: New test.
* gcc.target/riscv/rvv/base/vmsgtu_vv_mu-1.c: New test.
* gcc.target/riscv/rvv/base/vmsgtu_vv_mu-2.c: New test.
* gcc.target/riscv/rvv/base/vmsgtu_vv_mu-3.c: New test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/vmsgt_vx_m_rv32-1.c: New test.
* gcc.target/riscv/rvv/base/vmsgt_vx_m_rv32-2.c: New test.
* gcc.target/riscv/rvv/base/vmsgt_vx_m_rv32-3.c: New test.
* gcc.target/riscv/rvv/base/vmsgt_vx_m_rv64-1.c: New test.
* gcc.target/riscv/rvv/base/vmsgt_vx_m_rv64-2.c: New test.
* gcc.target/riscv/rvv/base/vmsgt_vx_m_rv64-3.c: New test.
* gcc.target/riscv/rvv/base/vmsgt_vx_mu_rv32-1.c: New test.
* gcc.target/riscv/rvv/base/vmsgt_vx_mu_rv32-2.c: New test.
* gcc.target/riscv/rvv/base/vmsgt_vx_mu_rv32-3.c: New test.
* gcc.target/riscv/rvv/base/vmsgt_vx_mu_rv64-1.c: New test.
* gcc.target/riscv/rvv/base/vmsgt_vx_mu_rv64-2.c: New test.
* gcc.target/riscv/rvv/base/vmsgt_vx_mu_rv64-3.c: New test.
* gcc.target/riscv/rvv/base/vmsgt_vx_rv32-1.c: New test.
* gcc.target/riscv/rvv/base/vmsgt_vx_rv32-2.c: New test.
* gcc.target/riscv/rvv/base/vmsgt_vx_rv32-3.c: New test.
* gcc.target/riscv/rvv/base/vmsgt_vx_rv64-1.c: New test.
* gcc.target/riscv/rvv/base/vmsgt_vx_rv64-2.c: New test.
* gcc.target/riscv/rvv/base/vmsgt_vx_rv64-3.c: New test.
* gcc.target/riscv/rvv/base/vmsgtu_vx_m_rv32-1.c: New test.
* gcc.target/riscv/rvv/base/vmsgtu_vx_m_rv32-2.c: New test.
* gcc.target/riscv/rvv/base/vmsgtu_vx_m_rv32-3.c: New test.
* gcc.target/riscv/rvv/base/vmsgtu_vx_m_rv64-1.c: New test.
* gcc.target/riscv/rvv/base/vmsgtu_vx_m_rv64-2.c: New test.
* gcc.target/riscv/rvv/base/vmsgtu_vx_m_rv64-3.c: New test.
* gcc.target/riscv/rvv/base/vmsgtu_vx_mu_rv32-1.c: New test.
* gcc.target/riscv/rvv/base/vmsgtu_vx_mu_rv32-2.c: New test.
* gcc.target/riscv/rvv/base/vmsgtu_vx_mu_rv32-3.c: New test.
* gcc.target/riscv/rvv/base/vmsgtu_vx_mu_rv64-1.c: New test.
* gcc.target/riscv/rvv/base/vmsgtu_vx_mu_rv64-2.c: New test.
* gcc.target/riscv/rvv/base/vmsgtu_vx_mu_rv64-3.c: New test.
* gcc.target/riscv/rvv/base/vmsgtu_vx_rv32-1.c: New test.
* gcc.target/riscv/rvv/base/vmsgtu_vx_rv32-2.c: New test.
* gcc.target/riscv/rvv/base/vmsgtu_vx_rv32-3.c: New test.
* gcc.target/riscv/rvv/base/vmsgtu_vx_rv64-1.c: New test.
* gcc.target/riscv/rvv/base/vmsgtu_vx_rv64-2.c: New test.
* gcc.target/riscv/rvv/base/vmsgtu_vx_rv64-3.c: New test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/vmsle_vv-1.c: New test.
* gcc.target/riscv/rvv/base/vmsle_vv-2.c: New test.
* gcc.target/riscv/rvv/base/vmsle_vv-3.c: New test.
* gcc.target/riscv/rvv/base/vmsle_vv_m-1.c: New test.
* gcc.target/riscv/rvv/base/vmsle_vv_m-2.c: New test.
* gcc.target/riscv/rvv/base/vmsle_vv_m-3.c: New test.
* gcc.target/riscv/rvv/base/vmsle_vv_mu-1.c: New test.
* gcc.target/riscv/rvv/base/vmsle_vv_mu-2.c: New test.
* gcc.target/riscv/rvv/base/vmsle_vv_mu-3.c: New test.
* gcc.target/riscv/rvv/base/vmsleu_vv-1.c: New test.
* gcc.target/riscv/rvv/base/vmsleu_vv-2.c: New test.
* gcc.target/riscv/rvv/base/vmsleu_vv-3.c: New test.
* gcc.target/riscv/rvv/base/vmsleu_vv_m-1.c: New test.
* gcc.target/riscv/rvv/base/vmsleu_vv_m-2.c: New test.
* gcc.target/riscv/rvv/base/vmsleu_vv_m-3.c: New test.
* gcc.target/riscv/rvv/base/vmsleu_vv_mu-1.c: New test.
* gcc.target/riscv/rvv/base/vmsleu_vv_mu-2.c: New test.
* gcc.target/riscv/rvv/base/vmsleu_vv_mu-3.c: New test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/vmsle_vx_m_rv32-1.c: New test.
* gcc.target/riscv/rvv/base/vmsle_vx_m_rv32-2.c: New test.
* gcc.target/riscv/rvv/base/vmsle_vx_m_rv32-3.c: New test.
* gcc.target/riscv/rvv/base/vmsle_vx_m_rv64-1.c: New test.
* gcc.target/riscv/rvv/base/vmsle_vx_m_rv64-2.c: New test.
* gcc.target/riscv/rvv/base/vmsle_vx_m_rv64-3.c: New test.
* gcc.target/riscv/rvv/base/vmsle_vx_mu_rv32-1.c: New test.
* gcc.target/riscv/rvv/base/vmsle_vx_mu_rv32-2.c: New test.
* gcc.target/riscv/rvv/base/vmsle_vx_mu_rv32-3.c: New test.
* gcc.target/riscv/rvv/base/vmsle_vx_mu_rv64-1.c: New test.
* gcc.target/riscv/rvv/base/vmsle_vx_mu_rv64-2.c: New test.
* gcc.target/riscv/rvv/base/vmsle_vx_mu_rv64-3.c: New test.
* gcc.target/riscv/rvv/base/vmsle_vx_rv32-1.c: New test.
* gcc.target/riscv/rvv/base/vmsle_vx_rv32-2.c: New test.
* gcc.target/riscv/rvv/base/vmsle_vx_rv32-3.c: New test.
* gcc.target/riscv/rvv/base/vmsle_vx_rv64-1.c: New test.
* gcc.target/riscv/rvv/base/vmsle_vx_rv64-2.c: New test.
* gcc.target/riscv/rvv/base/vmsle_vx_rv64-3.c: New test.
* gcc.target/riscv/rvv/base/vmsleu_vx_m_rv32-1.c: New test.
* gcc.target/riscv/rvv/base/vmsleu_vx_m_rv32-2.c: New test.
* gcc.target/riscv/rvv/base/vmsleu_vx_m_rv32-3.c: New test.
* gcc.target/riscv/rvv/base/vmsleu_vx_m_rv64-1.c: New test.
* gcc.target/riscv/rvv/base/vmsleu_vx_m_rv64-2.c: New test.
* gcc.target/riscv/rvv/base/vmsleu_vx_m_rv64-3.c: New test.
* gcc.target/riscv/rvv/base/vmsleu_vx_mu_rv32-1.c: New test.
* gcc.target/riscv/rvv/base/vmsleu_vx_mu_rv32-2.c: New test.
* gcc.target/riscv/rvv/base/vmsleu_vx_mu_rv32-3.c: New test.
* gcc.target/riscv/rvv/base/vmsleu_vx_mu_rv64-1.c: New test.
* gcc.target/riscv/rvv/base/vmsleu_vx_mu_rv64-2.c: New test.
* gcc.target/riscv/rvv/base/vmsleu_vx_mu_rv64-3.c: New test.
* gcc.target/riscv/rvv/base/vmsleu_vx_rv32-1.c: New test.
* gcc.target/riscv/rvv/base/vmsleu_vx_rv32-2.c: New test.
* gcc.target/riscv/rvv/base/vmsleu_vx_rv32-3.c: New test.
* gcc.target/riscv/rvv/base/vmsleu_vx_rv64-1.c: New test.
* gcc.target/riscv/rvv/base/vmsleu_vx_rv64-2.c: New test.
* gcc.target/riscv/rvv/base/vmsleu_vx_rv64-3.c: New test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/vmslt_vv-1.c: New test.
* gcc.target/riscv/rvv/base/vmslt_vv-2.c: New test.
* gcc.target/riscv/rvv/base/vmslt_vv-3.c: New test.
* gcc.target/riscv/rvv/base/vmslt_vv_m-1.c: New test.
* gcc.target/riscv/rvv/base/vmslt_vv_m-2.c: New test.
* gcc.target/riscv/rvv/base/vmslt_vv_m-3.c: New test.
* gcc.target/riscv/rvv/base/vmslt_vv_mu-1.c: New test.
* gcc.target/riscv/rvv/base/vmslt_vv_mu-2.c: New test.
* gcc.target/riscv/rvv/base/vmslt_vv_mu-3.c: New test.
* gcc.target/riscv/rvv/base/vmsltu_vv-1.c: New test.
* gcc.target/riscv/rvv/base/vmsltu_vv-2.c: New test.
* gcc.target/riscv/rvv/base/vmsltu_vv-3.c: New test.
* gcc.target/riscv/rvv/base/vmsltu_vv_m-1.c: New test.
* gcc.target/riscv/rvv/base/vmsltu_vv_m-2.c: New test.
* gcc.target/riscv/rvv/base/vmsltu_vv_m-3.c: New test.
* gcc.target/riscv/rvv/base/vmsltu_vv_mu-1.c: New test.
* gcc.target/riscv/rvv/base/vmsltu_vv_mu-2.c: New test.
* gcc.target/riscv/rvv/base/vmsltu_vv_mu-3.c: New test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/vmslt_vx_m_rv32-1.c: New test.
* gcc.target/riscv/rvv/base/vmslt_vx_m_rv32-2.c: New test.
* gcc.target/riscv/rvv/base/vmslt_vx_m_rv32-3.c: New test.
* gcc.target/riscv/rvv/base/vmslt_vx_m_rv64-1.c: New test.
* gcc.target/riscv/rvv/base/vmslt_vx_m_rv64-2.c: New test.
* gcc.target/riscv/rvv/base/vmslt_vx_m_rv64-3.c: New test.
* gcc.target/riscv/rvv/base/vmslt_vx_mu_rv32-1.c: New test.
* gcc.target/riscv/rvv/base/vmslt_vx_mu_rv32-2.c: New test.
* gcc.target/riscv/rvv/base/vmslt_vx_mu_rv32-3.c: New test.
* gcc.target/riscv/rvv/base/vmslt_vx_mu_rv64-1.c: New test.
* gcc.target/riscv/rvv/base/vmslt_vx_mu_rv64-2.c: New test.
* gcc.target/riscv/rvv/base/vmslt_vx_mu_rv64-3.c: New test.
* gcc.target/riscv/rvv/base/vmslt_vx_rv32-1.c: New test.
* gcc.target/riscv/rvv/base/vmslt_vx_rv32-2.c: New test.
* gcc.target/riscv/rvv/base/vmslt_vx_rv32-3.c: New test.
* gcc.target/riscv/rvv/base/vmslt_vx_rv64-1.c: New test.
* gcc.target/riscv/rvv/base/vmslt_vx_rv64-2.c: New test.
* gcc.target/riscv/rvv/base/vmslt_vx_rv64-3.c: New test.
* gcc.target/riscv/rvv/base/vmsltu_vx_m_rv32-1.c: New test.
* gcc.target/riscv/rvv/base/vmsltu_vx_m_rv32-2.c: New test.
* gcc.target/riscv/rvv/base/vmsltu_vx_m_rv32-3.c: New test.
* gcc.target/riscv/rvv/base/vmsltu_vx_m_rv64-1.c: New test.
* gcc.target/riscv/rvv/base/vmsltu_vx_m_rv64-2.c: New test.
* gcc.target/riscv/rvv/base/vmsltu_vx_m_rv64-3.c: New test.
* gcc.target/riscv/rvv/base/vmsltu_vx_mu_rv32-1.c: New test.
* gcc.target/riscv/rvv/base/vmsltu_vx_mu_rv32-2.c: New test.
* gcc.target/riscv/rvv/base/vmsltu_vx_mu_rv32-3.c: New test.
* gcc.target/riscv/rvv/base/vmsltu_vx_mu_rv64-1.c: New test.
* gcc.target/riscv/rvv/base/vmsltu_vx_mu_rv64-2.c: New test.
* gcc.target/riscv/rvv/base/vmsltu_vx_mu_rv64-3.c: New test.
* gcc.target/riscv/rvv/base/vmsltu_vx_rv32-1.c: New test.
* gcc.target/riscv/rvv/base/vmsltu_vx_rv32-2.c: New test.
* gcc.target/riscv/rvv/base/vmsltu_vx_rv32-3.c: New test.
* gcc.target/riscv/rvv/base/vmsltu_vx_rv64-1.c: New test.
* gcc.target/riscv/rvv/base/vmsltu_vx_rv64-2.c: New test.
* gcc.target/riscv/rvv/base/vmsltu_vx_rv64-3.c: New test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/vmsne_vv-1.c: New test.
* gcc.target/riscv/rvv/base/vmsne_vv-2.c: New test.
* gcc.target/riscv/rvv/base/vmsne_vv-3.c: New test.
* gcc.target/riscv/rvv/base/vmsne_vv_m-1.c: New test.
* gcc.target/riscv/rvv/base/vmsne_vv_m-2.c: New test.
* gcc.target/riscv/rvv/base/vmsne_vv_m-3.c: New test.
* gcc.target/riscv/rvv/base/vmsne_vv_mu-1.c: New test.
* gcc.target/riscv/rvv/base/vmsne_vv_mu-2.c: New test.
* gcc.target/riscv/rvv/base/vmsne_vv_mu-3.c: New test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/vmsne_vx_m_rv32-1.c: New test.
* gcc.target/riscv/rvv/base/vmsne_vx_m_rv32-2.c: New test.
* gcc.target/riscv/rvv/base/vmsne_vx_m_rv32-3.c: New test.
* gcc.target/riscv/rvv/base/vmsne_vx_m_rv64-1.c: New test.
* gcc.target/riscv/rvv/base/vmsne_vx_m_rv64-2.c: New test.
* gcc.target/riscv/rvv/base/vmsne_vx_m_rv64-3.c: New test.
* gcc.target/riscv/rvv/base/vmsne_vx_mu_rv32-1.c: New test.
* gcc.target/riscv/rvv/base/vmsne_vx_mu_rv32-2.c: New test.
* gcc.target/riscv/rvv/base/vmsne_vx_mu_rv32-3.c: New test.
* gcc.target/riscv/rvv/base/vmsne_vx_mu_rv64-1.c: New test.
* gcc.target/riscv/rvv/base/vmsne_vx_mu_rv64-2.c: New test.
* gcc.target/riscv/rvv/base/vmsne_vx_mu_rv64-3.c: New test.
* gcc.target/riscv/rvv/base/vmsne_vx_rv32-1.c: New test.
* gcc.target/riscv/rvv/base/vmsne_vx_rv32-2.c: New test.
* gcc.target/riscv/rvv/base/vmsne_vx_rv32-3.c: New test.
* gcc.target/riscv/rvv/base/vmsne_vx_rv64-1.c: New test.
* gcc.target/riscv/rvv/base/vmsne_vx_rv64-2.c: New test.
* gcc.target/riscv/rvv/base/vmsne_vx_rv64-3.c: New test.
Windows needs to use uintptr_t to represent an integral pointer type (long
is not the right type there).
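A minimal sketch of the point (not the testcase itself): on LLP64 targets
such as 64-bit Windows, long is only 32 bits, so an integer carrying a
pointer value has to be uintptr_t.
#include <stdint.h>
/* Round-trip a pointer through an integer; uintptr_t is wide enough on
   LLP64 targets, whereas long would truncate the value.  */
void *
round_trip (void *p)
{
  uintptr_t bits = (uintptr_t) p;
  return (void *) bits;
}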
Patch from 'nightstike'.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/testsuite/ChangeLog:
* obj-c++.dg/proto-lossage-4.mm: Use uintptr_t for integral pointer
representations.
The PR 108679 testcase shows a situation in which IPA-CP is able to track a
scalar constant in a single-field structure that is part of a bigger
structure. This smaller structure is however also passed in a few
calls to other functions, but the two same-but-different entities,
originally placed at the same offset and with the same size, confuse
the mechanism that takes care of handling call statements after
IPA-SRA.
I think that in stage 4 it is best to revert to GCC 12 behavior in this
particular case (when IPA-CP detects a constant in a single-field
structure or a single element array that is part of a bigger aggregate)
and the patch below does that. If accepted, I plan to file a
missed-optimization bug to track that we could use the IPA-CP propagated
value to re-construct the small aggregate arguments.
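For illustration only (this is not the committed pr108679.c testcase), a
hedged sketch of the shape described above: a single-field struct nested in
a bigger one, where IPA-CP can track the constant while the small struct is
also passed on its own to other calls.
struct inner { int val; };
struct outer { struct inner in; int other; };
static int __attribute__ ((noinline))
use_inner (struct inner i)
{
  return i.val;
}
static int __attribute__ ((noinline))
use_outer (struct outer o)
{
  /* IPA-CP can see that o.in.val is always 42 here, while o.in is also
     passed on as a separate small aggregate.  */
  return use_inner (o.in) + o.other;
}
int
entry (int x)
{
  struct outer o = { { 42 }, x };
  return use_outer (o);
}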
gcc/ChangeLog:
2023-02-13 Martin Jambor <mjambor@suse.cz>
PR ipa/108679
* ipa-sra.cc (push_param_adjustments_for_index): Do not omit
creation of non-scalar replacements even if IPA-CP knows their
contents.
gcc/testsuite/ChangeLog:
2023-02-13 Martin Jambor <mjambor@suse.cz>
PR ipa/108679
* gcc.dg/ipa/pr108679.c: New test.
For 'parallel', loop-iteration variables are marked as 'private', unless
they either appear in an omp do/simd loop or a data-sharing clause already
exists for them on 'parallel'. 'omp loop' wasn't handled, leading to
(potentially) multiple data-sharing clauses in gfc_resolve_do_iterator, as
omp_current_ctx pointed to the 'parallel' directive, ignoring the
in-between 'loop' directive.
The latter led to a bogus diagnostic - or rather an ICE, as the source
location var contained only '\0'.
Additionally, several 'case EXEC_OMP...LOOP' have been added to call the
right resolution function and likewise for '{masked,master} taskloop'.
gcc/fortran/ChangeLog:
PR fortran/108512
* openmp.cc (gfc_resolve_omp_parallel_blocks): Handle combined 'loop'
directives.
(gfc_resolve_do_iterator): Set a source location for added
'private'-clause arguments.
* resolve.cc (gfc_resolve_code): Call gfc_resolve_omp_do_blocks
also for EXEC_OMP_LOOP and gfc_resolve_omp_parallel_blocks for
combined directives with loop + '{masked,master} taskloop (simd)'.
gcc/testsuite/ChangeLog:
PR fortran/108512
* gfortran.dg/gomp/loop-5.f90: New test.
* gfortran.dg/gomp/loop-2.f90: Update dg-error.
* gfortran.dg/gomp/taskloop-2.f90: Update dg-error.
libgomp/
* target.c (gomp_target_rev): Dereference ptr
to get device address.
* testsuite/libgomp.fortran/reverse-offload-5.f90: Add test
for unallocated allocatable.
As GOMP_MAP_ALWAYS_POINTER operates on the previous map item, ensure that
with 'target enter data' both are passed together to gomp_map_vars_internal.
libgomp/ChangeLog:
* target.c (gomp_map_vars_internal): Add 'i > 0' before doing a
kind check.
(GOMP_target_enter_exit_data): If the next map item is
GOMP_MAP_ALWAYS_POINTER, map it together with the current item.
* testsuite/libgomp.fortran/target-enter-data-3.f90: New test.
WIDEN_MULT_PLUS_EXPR is documented to have the factor operands with the
same precision and the addend and result with another precision at least
twice as wide.
Similarly, {,u}maddMN4 is documented as
'maddMN4'
Multiply operands 1 and 2, sign-extend them to mode N, add operand
3, and store the result in operand 0. Operands 1 and 2 have mode M
and operands 0 and 3 have mode N. Both modes must be integer or
fixed-point modes and N must be twice the size of M.
In other words, 'maddMN4' is like 'mulMN3' except that it also adds
operand 3.
These instructions are not allowed to 'FAIL'.
'umaddMN4'
Like 'maddMN4', but zero-extend the multiplication operands instead
of sign-extending them.
The PR103109 addition of these expanders to rs6000 didn't handle this
correctly, though: it treated the last argument as also having mode M,
sign- or zero-extended into N. Unfortunately this means incorrect code
generation whenever the last operand isn't really sign- or zero-extended
from DImode to TImode.
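A hedged sketch of the kind of source that maps to umaddditi4 (this is not
the committed testcase); the high 64 bits of the addend must not be dropped:
/* Widening unsigned multiply-add: DImode factors, TImode addend and
   result.  A correct umaddditi4 expansion has to add the high half of c
   too, rather than assume c is zero-extended from DImode.  */
unsigned __int128
umadd (unsigned long long a, unsigned long long b, unsigned __int128 c)
{
  return (unsigned __int128) a * b + c;
}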
The following patch removes maddditi4 expander altogether from rs6000.md,
because we'd need
maddhd 9,3,4,5
sradi 10,5,63
maddld 3,3,4,5
sub 9,9,10
add 4,9,6
which is longer than
mulld 9,3,4
mulhd 4,3,4
addc 3,9,5
adde 4,4,6
and nothing would be able to optimize the case of last operand already
sign-extended from DImode to TImode into just
mr 9,3
maddld 3,3,4,5
maddhd 4,9,4,5
or so. The patch also fixes umaddditi4 so that it emits an add at the end
to add the high half of the last operand; fortunately, if the high half of
the last operand is known to be zero (i.e. the last operand is zero-extended
from DImode to TImode), combine will drop the useless add.
If we wanted to get back the signed op1 * op2 + op3 (all DImode) into a
TImode op0, we'd need to introduce a new tree code next to
WIDEN_MULT_PLUS_EXPR and a new expander next to maddMN4, because I'm afraid
it can't be done at expansion time in the maddMN4 expander: detecting
whether the operand is sign-extended is awkward, especially because of
SUBREGs and having to look at earlier emitted instructions, and combine
would need a 5-instruction combination.
2023-02-15 Jakub Jelinek <jakub@redhat.com>
PR target/108787
PR target/103109
* config/rs6000/rs6000.md (<u>maddditi4): Change into umaddditi4 only
expander, change operand 3 to be TImode, emit maddlddi4 and
umadddi4_highpart{,_le} with its low half and finally add the high
half to the result.
* gcc.dg/pr108787.c: New test.
* gcc.target/powerpc/pr108787.c: New test.
* gcc.target/powerpc/pr103109-1.c: Adjust expected instruction counts.
The following patch adds testcases for 5 DRs. For DR2475, DR2530 and
CWG2691 my understanding is that we already implement the desired behavior;
for DR2478 only partially (I've added 2 dg-bogus there; I think we inherit
rather than overwrite DECL_DECLARED_CONSTINIT_P for explicit specializations
somewhere, still far better than clang++); and for DR2673, on the other
hand, the DR was to codify the clang++ behavior rather than GCC's.
Not 100% sure whether it is better to commit the 2 with dg-bogus or just
wait until the actual fixes are implemented. BTW, I've noticed
register_specialization does:
FOR_EACH_CLONE (clone, fn)
{
DECL_DECLARED_INLINE_P (clone)
= DECL_DECLARED_INLINE_P (fn);
DECL_SOURCE_LOCATION (clone)
= DECL_SOURCE_LOCATION (fn);
DECL_DELETED_FN (clone)
= DECL_DELETED_FN (fn);
}
but not e.g. constexpr/consteval, have tried to cover that in a testcase
but haven't managed to do so.
2023-02-15 Jakub Jelinek <jakub@redhat.com>
* g++.dg/DRs/dr2475.C: New test.
* g++.dg/DRs/dr2478.C: New test.
* g++.dg/DRs/dr2530.C: New test.
* g++.dg/DRs/dr2673.C: New test.
* c-c++-common/cpp/delimited-escape-seq-8.c: New test.
While working on bitmap operations I noticed that sanopt.cc uses
an sbitmap worklist, iterating over it using bitmap_first_set_bit.
That's quadratic, since bitmap_first_set_bit for sbitmap is O(n).
The fix is to use regular bitmaps for the worklist and the bitmap
feeding it and to avoid a useless copy.
* sanopt.cc (sanitize_asan_mark_unpoison): Use bitmap
for with_poison and alias worklist to it.
(sanitize_asan_mark_poison): Likewise.
The following does low-hanging optimizations, combining bitmap
test and set and removing redundant operations.
PR target/108738
* config/i386/i386-features.cc (scalar_chain::add_to_queue):
Combine bitmap test and set.
(scalar_chain::add_insn): Likewise.
(scalar_chain::analyze_register_chain): Remove redundant
attempt to add to queue and instead strengthen assert.
Sink common attempts to mark the def dual-mode.
(scalar_chain::add_to_queue): Remove redundant insn bitmap
check.
When the set of candidates becomes very large, repeated bit checks on it
while building an actual chain can become slow because of the O(n) nature
of bitmap tests. The following
switches the candidates bitmaps to the tree representation before
building the chains to get O(log n) amortized behavior.
For the testcase at hand this improves STV time by 50%.
PR target/108738
* config/i386/i386-features.cc (convert_scalars_to_vector):
Switch candidates bitmaps to tree view before building the chains.
Observed when disabling LEGITIMIZE_RELOAD_ADDRESS for
cris-elf: the current code doesn't handle the post-cc0
parallel-with-clobber-of-cc0 sets, dropping down into the
fatal_insn call. Following the code, it's obvious that the
variable "set" is always NULL at the call. The intended
parameter is "in".
* reload1.cc (gen_reload): Correct rtx parameter for fatal_insn
"failure trying to reload" call.
The debug-function in sel-sched-dump.cc that would be
suitable for a hookup to a command in gdb is guarded by
#ifdef INSN_SCHEDULING, thus can't be used for all targets.
Better move the function marked DEBUG_FUNCTION elsewhere,
here to a file with a suitable static function to call.
There are multiple sets of similar functions dumping
HARD_REG_SETs, but cleaning that up is better left to a
separate commit.
gcc:
* gdbinit.in (phrs): New command.
* sel-sched-dump.cc (debug_hard_reg_set): Remove debug-function.
* ira-color.cc (debug_hard_reg_set): New, calling print_hard_reg_set.
joust_maybe_elide_copy checks that the last conversion in the ICS for
the first argument is ck_ref_bind, which is reasonable, because we've
checked that we're dealing with a copy/move constructor. But it can
also happen that we couldn't figure out which conversion function is
better to convert the argument, as in this testcase: joust couldn't
decide if we should go with
operator foo &()
or
operator foo const &()
so we get a ck_ambig, which then upsets joust_maybe_elide_copy. Since
a ck_ambig can validly occur, I think we should just return early, as
in the patch below.
PR c++/106675
gcc/cp/ChangeLog:
* call.cc (joust_maybe_elide_copy): Return false for ck_ambig.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/overload-conv-5.C: New test.
In some cases where the target memory address for an ldx or stx
instruction could be reduced to a constant, GCC could emit a malformed
instruction like:
ldxdw %r0,0
Rather than the expected form:
ldxdw %rX, [%rY + OFFSET]
This is due to the constraint allowing a const_int operand, which the
output templates do not handle.
Fix it by introducing a new memory constraint for the appropriate
operands of these instructions, which is identical to 'm' except that
it does not accept const_int.
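As a hedged illustration (not necessarily the committed ldxdw.c test), a
load whose address folds to a constant and which could previously be
emitted in the malformed form shown above:
/* The address of the load is a compile-time constant, so the memory
   operand could previously match as a bare const_int.  */
unsigned long long
read_fixed (void)
{
  return *(volatile unsigned long long *) 0x1000;
}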
gcc/
PR target/108790
* config/bpf/constraints.md (q): New memory constraint.
* config/bpf/bpf.md (zero_extendhidi2): Use it here.
(zero_extendqidi2): Likewise.
(zero_extendsidi2): Likewise.
(*mov<MM:mode>): Likewise.
gcc/testsuite/
PR target/108790
* gcc.target/bpf/ldxdw.c: New test.
For bool values, it is easier to deal with xor 1 than with 1 - a,
because we are more likely to be able to simplify the xor further.
This is a special case of (MASK - b) where MASK is a power of 2 minus 1
and b <= MASK, but it is only done for bool ranges ([0,1]), as that is
the main case where the difference comes into play.
Note this is enabled for gimple folding only, as the ranges are only
known while doing gimple folding and cfun is not always set when fold
is called.
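A minimal sketch of the transformation (the committed tests may differ):
/* With the new pattern, 1 - a for a value with range [0,1] is folded to
   a ^ 1 during gimple folding.  */
int
not_bool (_Bool a)
{
  return 1 - a;
}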
OK? Bootstrapped and tested on x86_64-linux-gnu with no
regressions.
gcc/ChangeLog:
PR tree-optimization/108355
PR tree-optimization/96921
* match.pd: Add pattern for "1 - bool_val".
gcc/testsuite/ChangeLog:
PR tree-optimization/108355
PR tree-optimization/96921
* gcc.dg/tree-ssa/bool-minus-1.c: New test.
* gcc.dg/tree-ssa/bool-minus-2.c: New test.
* gcc.dg/tree-ssa/pr108354-1.c: New test.
libstdc++-v3/ChangeLog:
* doc/xml/manual/status_cxx2017.xml: Update an open-std.org link
to www.open-std.org and https.
* doc/html/manual/status.html: Regenerate.
The hash function of PHIs is weak since we want to be able to CSE
them even across basic-blocks in some cases. The following avoids
weakening the hash for cases we are never going to CSE, reducing
the number of collisions and avoiding redundant work in the
hash and equality functions.
* tree-ssa-sccvn.cc (vn_phi_compute_hash): Key skipping
basic block index hashing on the availability of ->cclhs.
(vn_phi_eq): Avoid re-doing sanity checks for CSE but
rely on ->cclhs availability.
(vn_phi_lookup): Set ->cclhs only when we are eventually
going to CSE the PHI.
(vn_phi_insert): Likewise.
The commit "ada: Add PIE support to backtraces on Linux" uses
_r_debug under Linux unconditionally. This is incorrect, since musl
libc does not define _r_debug the way glibc does.
gcc/ada/
* adaint.c [Linux]: Include <features.h>.
(__gnat_get_executable_load_address) [Linux]: Enable only for
glibc and uClibc.
First order recurrence vectorization isn't possible for nested
loops.
PR tree-optimization/108782
* tree-vect-loop.cc (vect_phi_first_order_recurrence_p):
Make sure we're not vectorizing an inner loop.
* gcc.dg/torture/pr108782.c: New testcase.
While in the -fsanitize=address case libasan overloads memcpy, memset,
memmove and many other builtins, such that they are always instrumented,
the Linux kernel for -fsanitize=kernel-address has recently changed or is
changing, such that memcpy, memset and memmove actually aren't instrumented,
because they are often used also from no_sanitize ("kernel-address")
functions, and wants __{,hw}asan_{memcpy,memset,memmove} to be used instead
for the instrumented calls. See e.g. the https://lkml.org/lkml/2023/2/9/1182
thread. Without appropriate support on the compiler side, that will mean
any time a kernel-address instrumented function (most of them) calls
memcpy/memset/memmove, they will not be instrumented and thus won't catch
kernel bugs. Apparently clang 15 has a param for this.
The following patch implements the same (except it is a usual GCC --param,
not -mllvm argument) on the GCC side. I know this isn't a regression
bugfix, but given that -fsanitize=kernel-address has a single project that
uses it which badly wants this I think it would be worthwhile to make an
exception and get this into GCC 13 rather than waiting another year, it
won't affect non-kernel code, nor even the kernel unless the new parameter
is used.
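A hedged sketch of the intended effect (not one of the committed
pr108777-*.c tests): compiled with -fsanitize=kernel-address
--param asan-kernel-mem-intrinsic-prefix=1, the memcpy below is expected
to be expanded as a call to __asan_memcpy instead of plain memcpy.
#include <string.h>
/* Expected to become a __asan_memcpy call when the new param is enabled,
   so the kernel's instrumented wrapper is used.  */
void
copy_buf (void *dst, const void *src, unsigned long n)
{
  memcpy (dst, src, n);
}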
2023-02-14 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/108777
* params.opt (-param=asan-kernel-mem-intrinsic-prefix=): New param.
* asan.h (asan_memfn_rtl): Declare.
* asan.cc (asan_memfn_rtls): New variable.
(asan_memfn_rtl): New function.
* builtins.cc (expand_builtin): If
param_asan_kernel_mem_intrinsic_prefix and function is
kernel-{,hw}address sanitized, emit calls to
__{,hw}asan_{memcpy,memmove,memset} rather than
{memcpy,memmove,memset}. Use sanitize_flags_p (SANITIZE_ADDRESS)
instead of flag_sanitize & SANITIZE_ADDRESS to check if
asan_intercepted_p functions shouldn't be expanded inline.
* gcc.dg/asan/pr108777-1.c: New test.
* gcc.dg/asan/pr108777-2.c: New test.
* gcc.dg/asan/pr108777-3.c: New test.
* gcc.dg/asan/pr108777-4.c: New test.
* gcc.dg/asan/pr108777-5.c: New test.
* gcc.dg/asan/pr108777-6.c: New test.
* gcc.dg/completion-3.c: Adjust expected multiline output.
PR96373 points out that a predicated SVE loop currently converts
trapping unconditional ops into unpredicated vector ops. Doing
the operation on inactive lanes can then raise an exception.
As discussed in the PR trail, we aren't 100% consistent about
whether we preserve traps or not. But the direction of travel
is clearly to improve that rather than live with it. This patch
tries to do that for the SVE case.
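A hedged sketch of the kind of loop affected (not one of the listed
testcases): with the default -ftrapping-math, the FP subtraction below may
trap, so when SVE vectorizes it with partial vectors the operation should
be predicated on the loop mask instead of running on inactive lanes.
void
sub_loop (double *restrict x, double *restrict y, int n)
{
  for (int i = 0; i < n; i++)
    x[i] = x[i] - y[i];   /* potentially trapping FP op */
}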
Doing this regresses gcc.target/aarch64/sve/fabd_1.c. I've added
-fno-trapping-math for now and filed PR108571 to track it.
A similar problem applies to fsubr_1.c.
I think this is likely to regress Power 10, since conditional
operations are only available for masked loops. I think we'll
need to add -fno-trapping-math to any affected testcases,
but I don't have a Power 10 system to test on.
gcc/
PR tree-optimization/96373
* tree-vect-stmts.cc (vectorizable_operation): Predicate trapping
operations on the loop mask. Reject partial vectors if this isn't
possible.
gcc/testsuite/
PR tree-optimization/96373
PR tree-optimization/108571
* gcc.target/aarch64/sve/fabd_1.c: Add -fno-trapping-math.
* gcc.target/aarch64/sve/fsubr_1.c: Likewise.
* gcc.target/aarch64/sve/fmul_1.c: Expect predicate ops.
* gcc.target/aarch64/sve/fp_arith_1.c: Likewise.
As Richard pointed out in [1] and the testing on Power10, the
proposed fix for PR96373 requires some updates on a few rs6000
test cases which adopt partial vector. This patch is to fix
all of them with one extra option "-fno-trapping-math" as
Richard suggested.
Besides, the original test case also failed on Power10 without
Richard's proposed fix, this patch adds it together for a bit
better testing coverage.
[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-January/610728.html
PR target/96373
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/p9-vec-length-epil-1.c: Add -fno-trapping-math.
* gcc.target/powerpc/p9-vec-length-epil-2.c: Likewise.
* gcc.target/powerpc/p9-vec-length-epil-3.c: Likewise.
* gcc.target/powerpc/p9-vec-length-epil-4.c: Likewise.
* gcc.target/powerpc/p9-vec-length-epil-5.c: Likewise.
* gcc.target/powerpc/p9-vec-length-epil-6.c: Likewise.
* gcc.target/powerpc/p9-vec-length-epil-8.c: Likewise.
* gcc.target/powerpc/p9-vec-length-full-1.c: Likewise.
* gcc.target/powerpc/p9-vec-length-full-2.c: Likewise.
* gcc.target/powerpc/p9-vec-length-full-3.c: Likewise.
* gcc.target/powerpc/p9-vec-length-full-4.c: Likewise.
* gcc.target/powerpc/p9-vec-length-full-5.c: Likewise.
* gcc.target/powerpc/p9-vec-length-full-6.c: Likewise.
* gcc.target/powerpc/p9-vec-length-full-8.c: Likewise.
* gcc.target/powerpc/pr96373.c: New test.
This patch adds
  atomic_flag_test
  atomic_flag_test_explicit
which were missed when commit 491ba6 introduced the C++20 atomic flag
test.
libstdc++-v3/ChangeLog:
PR libstdc++/103934
* include/std/atomic (atomic_flag_test): Add.
(atomic_flag_test_explicit): Add.
* testsuite/29_atomics/atomic_flag/test/explicit.cc: Add
test case to cover missing atomic_flag free functions.
* testsuite/29_atomics/atomic_flag/test/implicit.cc:
Likewise.
Update __FreeBSD_version values for the latest supported FreeBSD
versions. In particular, add __FreeBSD_version for FreeBSD 14, which
is necessary to compile libphobos successfully on FreeBSD 14.
libphobos/ChangeLog:
PR d/107469
* libdruntime/core/sys/freebsd/config.d: Update __FreeBSD_version.
In this PR we had a write to one vector of a 4-vector tuple.
The vector had mode V1DI, and the target doesn't provide V1DI
moves, so this was converted into:
(clobber (subreg:V1DI (reg/v:V4x1DI 92 [ b ]) 24))
followed by a DImode move. (The clobber isn't really necessary
or helpful for a single word, but would be for wider moves.)
The subreg in the clobber survived until after RA:
(clobber (subreg:V1DI (reg/v:V4x1DI 34 v2 [orig:92 b ] [92]) 24))
IMO this isn't well-formed. If a subreg of a hard register simplifies
to a hard register, it should be replaced by the hard register. If the
subreg doesn't simplify, then target-independent code can't be sure
which parts of the register are affected and which aren't. A clobber
of such a subreg isn't useful and (again IMO) should just be removed.
Conversely, a use of such a subreg is effectively a use of the whole
inner register.
LRA has code to simplify subregs of hard registers, but it didn't
handle bare uses and clobbers. The patch extends it to do that.
One question was whether the final_p argument to alter_subregs
should be true or false. True is IMO dangerous, since it forces
replacements that might not be valid from a dataflow perspective,
and uses and clobbers only exist for dataflow. As said above,
I think the correct way of handling a failed simplification would
be to delete clobbers and replace uses of subregs with uses of
the inner register. But I didn't want to write untested code
to do that.
In the PR, the clobber caused an infinite loop in DCE, because
of a disagreement about what effect the clobber had. But for
the reasons above, I think that was GIGO rather than a bug in
DF or DCE.
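For reference, a hedged guess at the kind of source involved (not the
committed pr108681.c test): writing a single V1DI element of an AArch64
int64x1x4_t tuple, which gives the V1DI write into the V4x1DI register
described above.
#include <arm_neon.h>
/* Writing one V1DI element of the 4-vector tuple; without V1DI moves this
   becomes a clobber of a subreg of the tuple register plus a DImode move.  */
void
set_last (int64x1x4_t b, int64x1_t v, int64x1x4_t *out)
{
  b.val[3] = v;
  *out = b;
}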
gcc/
PR rtl-optimization/108681
* lra-spills.cc (lra_final_code_change): Extend subreg replacement
code to handle bare uses and clobbers.
gcc/testsuite/
PR rtl-optimization/108681
* gcc.target/aarch64/pr108681.c: New test.
IRA can invalidate an initially set up equivalence in setup_reg_equiv.
The flag caller_saved was not cleared during invalidation although
init_insns were cleared. That resulted in a segmentation fault in
get_equiv. Clearing the flag solves the problem. As an extra
precaution I added clearing the flag in other places too, although it
might not be necessary.
PR rtl-optimization/108774
gcc/ChangeLog:
* ira.cc (ira_update_equiv_info_by_shuffle_insn): Clear equiv
caller_save_p flag when clearing defined_p flag.
(setup_reg_equiv): Ditto.
* lra-constraints.cc (lra_constraints): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr108774.c: New.
gcc/fortran/ChangeLog:
PR fortran/103475
* primary.cc (gfc_expr_attr): Avoid NULL pointer dereference for
invalid use of CLASS variable.
gcc/testsuite/ChangeLog:
PR fortran/103475
* gfortran.dg/pr103475.f90: New test.
The combine pass simplifies a zero-extend of a zero-extract to:
Trying 16 -> 6:
16: r86:QI#0=zero_extract(r87:HI,0x8,0x8)
REG_DEAD r87:HI
6: r84:SI=zero_extend(r86:QI)
REG_DEAD r86:QI
Failed to match this instruction:
(set (reg:SI 84 [ s.e2 ])
(zero_extract:SI (reg:HI 87)
(const_int 8 [0x8])
(const_int 8 [0x8])))
which fails instruction recognition. The pattern is valid, since there
is no requirement on the mode of the location operand.
The patch relaxes location operand mode requirements of *extzv and *extv
insn patterns to allow all supported integer modes. The patch also
adds support for a related sign-extend from zero-extracted operand.
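A hedged sketch of source that can produce the combine attempt shown above
(the committed tests may differ); the dump's "s.e2" suggests reading the
second byte of a small struct:
struct S
{
  unsigned char e1;
  unsigned char e2;
};
/* Loading s as one HImode value and extracting the upper byte gives the
   zero_extend of a zero_extract that combine tries to match.  */
unsigned int
get_e2 (struct S s)
{
  return s.e2;
}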
2023-02-13 Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog:
PR target/108516
* config/i386/predicates.md (extr_register_operand):
New special predicate.
* config/i386/i386.md (*extv<mode>): Use extr_register_operand
as operand 1 predicate.
(*extzv<mode>): Ditto.
(*extendqi<SWI24:mode>_ext_1): New insn pattern.
gcc/testsuite/ChangeLog:
PR target/108516
* gcc.target/i386/pr108516-1.c: New test.
* gcc.target/i386/pr108516-2.c: Ditto.
This patch removes the tprintf macro sizeof no-op hack and replaces
it with tprintf (...).
libgm2/ChangeLog:
* libm2iso/RTco.cc (tprintf): Replace definition.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
I noticed that for gcc.c-torture/compile/20001226-1.c even -O1 has
around 50% of the compile-time accounted to FRE. That's because
we have blocks with a high incoming edge count and
can_track_predicate_on_edge visits all of them even though it could
stop after the second. The function is also called repeatedly for
the same edge. The following fixes this and reduces the FRE time
to 1% on the testcase.
PR tree-optimization/28614
* tree-ssa-sccvn.cc (can_track_predicate_on_edge): Avoid
walking all edges in most cases.
(vn_nary_op_insert_pieces_predicated): Avoid repeated
calls to can_track_predicate_on_edge unless checking is
enabled.
(process_bb): Instead call it once here for each edge
we register possibly multiple predicates on.
DCE now chokes on indirect setjmp calls becoming direct because
that exposes them too late to be subject to abnormal edge creation.
The following patch honors gimple_call_ctrl_altering for those and
does _not_ treat formerly indirect calls to setjmp as calls to setjmp
in notice_special_calls.
Unfortunately there's no way to have an indirect call to setjmp
properly annotated (the returns_twice attribute is ignored on types).
RTL expansion late discovers returns-twice for the purpose of
adding REG_SETJMP notes and also sets ->calls_setjmp
(instead of asserting it is set). There's no good way to
transfer proper knowledge around here so I'm using ->calls_setjmp
as a flag to indicate whether gimple_call_ctrl_altering_p was set.
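A hedged sketch of the problematic shape (not the committed pr108691.c
test): my_setjmp stands in for setjmp; the returns_twice attribute is lost
when the call goes through a function pointer, and later propagation can
turn the indirect call into a direct one.
__attribute__ ((returns_twice)) extern int my_setjmp (void *);
int
call_it (void *env)
{
  /* returns_twice is ignored on the pointer's type, so the CFG for the
     indirect call has no abnormal edges; if the call later becomes
     direct, DCE must still honor gimple_call_ctrl_altering_p.  */
  int (*fn) (void *) = my_setjmp;
  return fn (env);
}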
PR tree-optimization/108691
* tree-cfg.cc (notice_special_calls): When the CFG is built
honor gimple_call_ctrl_altering_p.
* cfgexpand.cc (expand_call_stmt): Clear cfun->calls_setjmp
temporarily if the call is not control-altering.
* calls.cc (emit_call_1): Do not add REG_SETJMP if
cfun->calls_setjmp is not set. Do not alter cfun->calls_setjmp.
* gcc.dg/pr108691.c: New testcase.
So far we propagate scheduler state across basic blocks within EBBs and
reset the state otherwise. In certain circumstances the entry block of
an EBB might be empty, i.e., no_real_insns_p is true. In those cases
scheduler state is not reset and subsequently wrong state is propagated
to following blocks of the same EBB.
Since the performance benefit of tracking state across basic blocks is
questionable on modern hardware, simply reset the state for each basic
block.
Also fix resetting of f{p,x}d_longrunning.
gcc/ChangeLog:
PR target/108102
* config/s390/s390.cc (s390_bb_fallthru_entry_likely): Remove.
(struct s390_sched_state): Initialise to zero.
(s390_sched_variable_issue): For better debuggability also emit
the current side.
(s390_sched_init): Unconditionally reset scheduler state.