Reacting to a long-standing XPASS for CRIS. This one is
slightly brown-paper-bag level: the xfailed scan removed here was
never the one that failed, and I didn't notice the XPASS when
reporting success on the commit as a whole. It's not logical to
re-read what was just written, even with overlap issues, and I'm
sure that edit was originally a copy-pasto. I also checked
historical m68k-linux and pru-elf test results to verify
that I got that part right.
PR testsuite/91419
* gcc.dg/tree-ssa/pr91091-2.c:15 Remove xfail for RHS.
Reacting to a long-standing XPASS for CRIS. Maybe better do
as https://gcc.gnu.org/PR79356#c11 suggests: xfail it for
x86 only ...except I see m68k also does not xpass.
testsuite:
PR testsuite/79356
* gcc.dg/attr-alloc_size-11.c: Add CRIS to the list
of targets excluding xfail on lines 50 and 51.
For cris-elf before this patch, ever since it was added,
this test gets:
Running /x/gcc/testsuite/gcc.dg/dg.exp ...
FAIL: gcc.dg/Wuse-after-free-2.c (test for warnings, line 115)
FAIL: gcc.dg/Wuse-after-free-2.c (test for warnings, line 116)
and comparing tree dumps with a native x86_64-pc-linux-gnu
run shows a suspicious difference in the "180t.ivopts" dump.
Indeed -fno-ivopts makes the warning appear for cris-elf
too. It was suggested to simply add -fno-ivopts to the
test-flags, like before -fno-tree-loop-distribute-patterns
was added; thus.
PR tree-optimization/108828
* gcc.dg/Wuse-after-free-2.c: Add -fno-ivopts.
gcc/fortran/ChangeLog:
PR fortran/108937
* trans-intrinsic.cc (gfc_conv_intrinsic_ibits): Handle corner case
LEN argument of IBITS equal to BITSIZE(I).
gcc/testsuite/ChangeLog:
PR fortran/108937
* gfortran.dg/ibits_2.f90: New test.
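The corner case can be illustrated outside the compiler. The following is a minimal C sketch, not the code in trans-intrinsic.cc: ibits32 is a hypothetical helper that models IBITS for a 32-bit integer and shows why LEN equal to the bit size needs separate handling (a shift by the full width is undefined in C).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of Fortran's IBITS (I, POS, LEN) for a 32-bit
   integer: extract LEN bits of I starting at bit POS, right-justified.  */
uint32_t
ibits32 (uint32_t i, unsigned pos, unsigned len)
{
  /* Fortran requires POS + LEN <= BIT_SIZE (I).  */
  assert (pos <= 32 && len <= 32 - pos);

  /* No bits requested; this also covers POS == 32.  */
  if (len == 0)
    return 0;

  /* Corner case: LEN == 32 forces POS == 0, so the result is I itself.
     Building the mask with a shift by 32 would be undefined.  */
  if (len == 32)
    return i;

  uint32_t mask = (UINT32_C (1) << len) - 1;
  return (i >> pos) & mask;
}
```

With this sketch, ibits32 (x, 0, 32) simply returns x instead of relying on an out-of-range shift.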
According to the Intel ISA manual, fprem and fprem1 return NaN when an invalid
arithmetic exception is generated. This is documented in Table 8-10 of the
ISA manual and makes these two instructions fully IEEE compatible.
The reverted patch was based on the data from Tables 3-30 and 3-31 of the
Intel ISA manual, where the results in case of st(0) being infinity or
st(1) being 0 are not specified.
2023-02-27 Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog:
PR target/108922
Revert:
* config/i386/i386.md (fmodxf3): Enable for flag_finite_math_only only.
(fmod<mode>3): Ditto.
(fpremxf4_i387): Ditto.
(remainderxf3): Ditto.
(remainder<mode>3): Ditto.
(fprem1xf4_i387): Ditto.
In 2011, the rtl.texi documentation was updated to reflect that the
modes of the RTX unary operations FFS, POPCOUNT and PARITY should
match those of their operands. Unfortunately, some of the transformations
in simplify-rtx.cc predate this tightening of RTL semantics, and have
not (until now) been updated/fixed. That is, the POPCOUNT and PARITY
optimizations were "correct" when I added them back in 2007.
2023-02-27 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* simplify-rtx.cc (simplify_unary_operation_1) <case FFS>: Avoid
generating FFS with mismatched operand and result modes, by using
an explicit SIGN_EXTEND/ZERO_EXTEND.
<case POPCOUNT>: Likewise, for POPCOUNT of ZERO_EXTEND.
<case PARITY>: Likewise, for PARITY of {ZERO,SIGN}_EXTEND.
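The value-level identities behind these simplifications can be checked in plain C. This is only a sketch of the arithmetic, not of the RTL code: popcount8, ffs8, extend_identities_hold and check_all_bytes are hypothetical helpers, and the __builtin_* functions assume GCC or Clang. Extending an 8-bit value to 32 bits stands in for ZERO_EXTEND and SIGN_EXTEND between modes.

```c
#include <assert.h>
#include <stdint.h>

/* Count set bits in the narrow (8-bit) domain.  */
static int
popcount8 (uint8_t x)
{
  int n = 0;
  for (; x != 0; x = (uint8_t) (x & (x - 1)))
    n++;
  return n;
}

/* One-based index of the lowest set bit in the narrow domain, 0 if none.  */
static int
ffs8 (uint8_t x)
{
  for (int i = 0; i < 8; i++)
    if (x & (1u << i))
      return i + 1;
  return 0;
}

/* Return 1 if the wide-mode results equal the narrow-mode results for X.  */
int
extend_identities_hold (uint8_t x)
{
  unsigned zext = x;      /* zero extension to 32 bits */
  int sext = (int8_t) x;  /* sign extension to 32 bits (GCC wraps here) */

  return __builtin_popcount (zext) == popcount8 (x)
	 /* Zero extension adds no set bits.  */
	 && __builtin_parity (zext) == (popcount8 (x) & 1)
	 /* Sign extension adds 0 or 24 set bits, an even number.  */
	 && __builtin_parity ((unsigned) sext) == (popcount8 (x) & 1)
	 /* The lowest set bit, if any, lies in the low 8 bits.  */
	 && __builtin_ffs (sext) == ffs8 (x);
}

/* Exhaustively check every 8-bit value.  */
void
check_all_bytes (void)
{
  for (int v = 0; v < 256; v++)
    assert (extend_identities_hold ((uint8_t) v));
}
```

Since the values agree, the operation can be done in the narrow mode and the result extended explicitly, which keeps operand and result modes matched.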
The pattern parameter to memset is second. Correct an obvious mistake
in libm2pim/sckt.cc.
libgm2/ChangeLog:
PR modula2/108944
* libm2pim/sckt.cc (getLocalIP): Correct parameter order.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
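For reference, a small C sketch of the argument order in question. The struct and function names are hypothetical and are not taken from sckt.cc; only the memset signature (destination, fill byte, byte count) is standard.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical record, used only for illustration.  */
struct endpoint
{
  unsigned char addr[4];
  unsigned short port;
};

void
clear_endpoint (struct endpoint *ep)
{
  assert (ep != NULL);

  /* memset (dest, fill_byte, count): the fill byte comes second.
     Writing memset (ep, sizeof *ep, 0) instead would set zero bytes
     and leave *ep untouched.  */
  memset (ep, 0, sizeof *ep);
}
```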
This fixes some header-defined functions that are undesirably declared
static and weren't caught by the "^static inline" pattern used for the
main patch r13-6096-gcb3e0eac262e55.
gcc/ChangeLog:
* hash-table.h (gt_pch_nx(hash_table<D>)): Remove static.
* lra-int.h (lra_change_class): Likewise.
* recog.h (which_op_alt): Likewise.
* sel-sched-ir.h (sel_bb_empty_or_nop_p): Declare inline
instead of static.
This is a complicated API that should be clearly documented.
Also improve the comment on basic_ios::_M_setstate.
libstdc++-v3/ChangeLog:
* include/bits/basic_ios.h (basic_ios::_M_setstate): Add
caveat to comment.
* include/bits/basic_string.h (resize_and_overwrite): Add
doxygen comment.
This patch makes use of the CLAMPS instruction when it is
configured.
/* example */
int test(int a) {
if (a < -512)
return -512;
if (a > 511)
return 511;
return a;
}
;; prereq: TARGET_CLAMPS
test:
clamps a2, a2, 9
ret.n
gcc/ChangeLog:
* config/xtensa/xtensa-protos.h (xtensa_match_CLAMPS_imms_p):
New prototype.
* config/xtensa/xtensa.cc (xtensa_match_CLAMPS_imms_p):
New function.
* config/xtensa/xtensa.h (TARGET_CLAMPS): New macro definition.
* config/xtensa/xtensa.md (*xtensa_clamps): New insn pattern.
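As a rough illustration of what matching the two bounds involves, here is a C sketch. It is not the body of xtensa_match_CLAMPS_imms_p; clamps_imm_for_bounds and clamp_signed are hypothetical helpers, and the immediate range 7..22 is an assumption taken from the Xtensa ISA description of CLAMPS.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return N if [lo, hi] is exactly
   [-2^N, 2^N - 1] with 7 <= N <= 22 (assumed CLAMPS range), else -1.  */
int
clamps_imm_for_bounds (int32_t lo, int32_t hi)
{
  for (int n = 7; n <= 22; n++)
    if (hi == (INT32_C (1) << n) - 1 && lo == -(INT32_C (1) << n))
      return n;
  return -1;
}

/* Reference semantics: clamp A to the signed range [-2^N, 2^N - 1].  */
int32_t
clamp_signed (int32_t a, int n)
{
  assert (n >= 7 && n <= 22);
  int32_t hi = (INT32_C (1) << n) - 1;
  int32_t lo = -hi - 1;
  return a < lo ? lo : a > hi ? hi : a;
}
```

For the example above, the bounds -512 and 511 give N = 9, which matches the immediate in "clamps a2, a2, 9".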
The PCH mechanism first tries to map the .gch file to the virtual memory
space pointed to by TRY_EMPTY_VM_SPACE during the compilation process.
The original value of the TRY_EMPTY_VM_SPACE macro is 0x8000000000,
but a core like la464 has only 40 bits of virtual address space, so
this value just exceeds the address range.
If we want to support chips with fewer than 40 bits of virtual address,
then the value of this macro needs to be made smaller. I think setting
this value too small will increase the probability of virtual address
mapping failure. And the purpose of PCH is to make compilation faster,
but I think we rarely compile on embedded systems, so this situation
may not need to be considered.
So change the value of this macro to 0x1000000000.
gcc/ChangeLog:
* config/host-linux.cc (TRY_EMPTY_VM_SPACE): Modify the value of
the macro to 0x1000000000.
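The arithmetic can be sketched in C. Note the assumption, which is not stated in the commit message: user space is taken to be the lower half of the virtual address space, so with 40 address bits the limit is 2^39. fits_user_space is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if ADDR lies below 2^(VA_BITS - 1), i.e. inside the lower
   half of a VA_BITS-wide address space (assumed user-space split).  */
int
fits_user_space (uint64_t addr, unsigned va_bits)
{
  assert (va_bits >= 2 && va_bits <= 64);
  uint64_t limit = UINT64_C (1) << (va_bits - 1);
  return addr < limit;
}
```

Under that assumption, 0x8000000000 (2^39) is exactly at the limit for 40 bits and so does not fit, while 0x1000000000 (2^36) does.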
The projects-pim-run-pass-tower.exp test blocks indefinitely
on some platforms. This patch disables it for now - it should
be enabled once a cross platform fix for RTint.mod is found.
Even disable the trivial execution test.
gcc/testsuite/ChangeLog:
* gm2/projects/pim/run/pass/tower/projects-pim-run-pass-tower.exp:
Also add conditional to gm2-simple-execute.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
The projects-pim-run-pass-tower.exp test blocks indefinitely
on some platforms. This patch disables it for now - it should
be enabled once a cross platform fix for RTint.mod is found.
gcc/testsuite/ChangeLog:
* gm2/projects/pim/run/pass/tower/projects-pim-run-pass-tower.exp
(gm2_run_tower_test): New global variable. Add conditional
before invoking gm2-local-exec.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
This avoids making the associated_dummy field point to a new memory chunk
if it's already pointing somewhere, in which case doing so would leak the
previously allocated chunk.
PR fortran/108923
gcc/fortran/ChangeLog:
* intrinsic.cc (get_intrinsic_dummy_arg,
set_intrinsic_dummy_arg): Rename the former to the latter.
Remove the return value, add a reference to the lhs as argument,
and do the pointer assignment inside the function. Don't do
it if the pointer is already non-NULL.
(sort_actual): Update caller.
I see overlong lines in the output when a test fails, for
example for a bug exposed for cris-elf and pru-elf in
gcc.dg/analyzer/allocation-size-multiline-3.c:
Running /x/gcc/testsuite/gcc.dg/analyzer/analyzer.exp ...
FAIL: gcc.dg/analyzer/allocation-size-multiline-3.c expected multiline pattern lines 16-25 not found: "\s*int32_t \*ptr = alloca \(99\);[^\n\r]*\n \^~~~~~\n 'test_constant_99': events 1-2[^\n\r]*\n \|[^\n\r]*\n \| int32_t \*ptr = alloca \(99\);[^\n\r]*\n \| \^~~~~~\n \| \|[^\n\r]*\n \| \(1\) allocated 99 bytes here[^\n\r]*\n \| \(2\) assigned to 'int32_t \*' \{aka 'int \*'\} here; 'sizeof \(int32_t \{aka int\}\)' is '4'[^\n\r]*\n \|[^\n\r]*\n"
FAIL: gcc.dg/analyzer/allocation-size-multiline-3.c expected multiline pattern lines 34-43 not found: " int32_t \*ptr = alloca \(n \* 2\);[^\n\r]*\n \^~~~~~\n 'test_symbolic': events 1-2[^\n\r]*\n \|[^\n\r]*\n \| int32_t \*ptr = alloca \(n \* 2\);[^\n\r]*\n \| \^~~~~~\n \| \|[^\n\r]*\n \| \(1\) allocated 'n \* 2' bytes here[^\n\r]*\n \| \(2\) assigned to 'int32_t \*' \{aka 'int \*'\} here; 'sizeof \(int32_t \{aka int\}\)' is '4'[^\n\r]*\n \|[^\n\r]*\n"
FAIL: gcc.dg/analyzer/allocation-size-multiline-3.c (test for excess errors)
That multiline-pattern-quoted-on-a-single-line is redundant
when also outputting "lines 16-25" and "lines 34-43". It's
also so noisy that it can be mistaken for a testsuite error.
If there's a need to inspect it, it can be seen at
verbose-level 4, i.e. persons interested in seeing it
without editing sources can just add "-v -v -v -v".
Let's "prune" the pattern from regular output, instead producing:
Running /x/gcc/testsuite/gcc.dg/analyzer/analyzer.exp ...
FAIL: gcc.dg/analyzer/allocation-size-multiline-3.c expected multiline pattern lines 16-25 not found
FAIL: gcc.dg/analyzer/allocation-size-multiline-3.c expected multiline pattern lines 34-43 not found
FAIL: gcc.dg/analyzer/allocation-size-multiline-3.c (test for excess errors)
* lib/multiline.exp (handle-multiline-outputs): Don't include the
quoted multiline pattern in the pass/fail output.
gcc/
PR target/108919
* config/xtensa/xtensa-protos.h
(xtensa_prepare_expand_call): Rename to xtensa_expand_call.
* config/xtensa/xtensa.cc (xtensa_prepare_expand_call): Rename
to xtensa_expand_call.
(xtensa_expand_call): Emit the call and add a clobber expression
for the static chain to it in case of windowed ABI.
* config/xtensa/xtensa.md (call, call_value, sibcall)
(sibcall_value): Call xtensa_expand_call and complete expansion
right after that call.
gcc/testsuite/
* gcc.target/xtensa/pr108919.c: New test.
When the dummy argument of the bind(C) proc is 'pointer, intent(out)', the conversion
of the GFC to the CFI bounds can be skipped: it is not needed and avoids issues with
noninit memory.
Note that the 'cfi->base_addr = gfc->addr' assignment is kept as the C code of a user
might assume that a nullified pointer arrives as NULL (or even a specific value).
For instance, gfortran.dg/c-interop/section-{1,2}.f90 assumes the value NULL.
Note 2: The PR is about a may-be-uninitialized warning with intent(out). In the PR's
testcase, the pointer was nullified and should not have produced that warning.
That is a diagnostic issue, now tracked as PR middle-end/108906 as the issue in principle
still exists (e.g. with 'intent(inout)'). [But no longer for intent(out).]
Note 3: With undefined pointers and no 'intent', accessing uninit memory is unavoidable
on the caller side as the compiler cannot know what the C function does (but this usage
determines whether the pointer is permitted to be undefined or whether the bounds must be
gfc-to-cfi converted).
gcc/fortran/ChangeLog:
PR fortran/108621
* trans-expr.cc (gfc_conv_gfc_desc_to_cfi_desc): Skip setting of
bounds of CFI desc for 'pointer,intent(out)'.
gcc/testsuite/ChangeLog:
PR fortran/108621
* gfortran.dg/c-interop/fc-descriptor-pr108621.f90: New test.
Add the rest of the weak-*.f90 testcases.
gcc/fortran/ChangeLog:
* trans-decl.cc (gfc_finish_var_decl): Apply attribute.
(generate_local_decl): Add diagnostic for dummy and local variables.
gcc/testsuite/ChangeLog:
* gfortran.dg/weak-2.f90: New test.
* gfortran.dg/weak-3.f90: New test.
Signed-off-by: Rimvydas Jasinskas <rimvydas.jas@gmail.com>
This fixes a memory leak by accompanying the release of
gfc_actual_arglist elements' memory with a release of the
associated_dummy field memory (if allocated).
Actual argument copy is adjusted as well so that each copy can free
its field independently.
PR fortran/108923
gcc/fortran/ChangeLog:
* expr.cc (gfc_free_actual_arglist): Free associated_dummy
memory.
(gfc_copy_actual_arglist): Make a copy of the associated_dummy
field if it is set in the original element.
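The ownership rule described here can be sketched in C. The types and functions below are hypothetical stand-ins, not gfc_actual_arglist or the code in expr.cc: each list element owns its associated_dummy, the copy duplicates it, and the free routine releases it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the real structures.  */
struct dummy_info { int kind; };
struct arg
{
  struct dummy_info *associated_dummy;	/* owned by this element */
  struct arg *next;
};

static void *
xmalloc_or_abort (size_t n)
{
  void *p = malloc (n);
  if (p == NULL)
    abort ();
  return p;
}

/* Copy the list; every copy gets its own associated_dummy.  */
struct arg *
copy_args (const struct arg *src)
{
  struct arg *head = NULL, **tail = &head;
  for (; src != NULL; src = src->next)
    {
      struct arg *a = xmalloc_or_abort (sizeof *a);
      a->next = NULL;
      a->associated_dummy = NULL;
      if (src->associated_dummy != NULL)
	{
	  a->associated_dummy = xmalloc_or_abort (sizeof *a->associated_dummy);
	  *a->associated_dummy = *src->associated_dummy;
	  /* The copy must not alias the original's field.  */
	  assert (a->associated_dummy != src->associated_dummy);
	}
      *tail = a;
      tail = &a->next;
    }
  return head;
}

/* Free the list together with each element's associated_dummy.  */
void
free_args (struct arg *head)
{
  while (head != NULL)
    {
      struct arg *next = head->next;
      free (head->associated_dummy);	/* free (NULL) is a no-op */
      free (head);
      head = next;
    }
}
```

Because the copy does not share the field with the original, either list can be freed first without leaving a dangling pointer or freeing twice.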
Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:
PR libstdc++/108856
* include/experimental/bits/simd_builtin.h
(_SimdImplBuiltin::_S_masked_unary): More efficient
implementation of masked inc-/decrement for integers and floats
without AVX2.
* include/experimental/bits/simd_x86.h
(_SimdImplX86::_S_masked_unary): New. Use AVX512 masked subtract
builtins for masked inc-/decrement.
The following avoids default-initializing auto_vec storage for
non-POD T since that's not what the allocated storage fallback
will do and it's also not expected for existing cases like
auto_vec<std::pair<unsigned, unsigned>, 64> elts;
which exist to optimize the allocation.
It also fixes the array accesses done by vec<vl_embed> to not
use its own m_vecdata member but instead access the container
provided storage via pointer arithmetic.
* vec.h (vec<T, A, vl_embed>::m_vecdata): Remove.
(vec<T, A, vl_embed>::m_vecpfx): Align as T to avoid
changing alignment of vec<T, A, vl_embed> and simplifying
address.
(vec<T, A, vl_embed>::address): Compute as this + 1.
(vec<T, A, vl_embed>::embedded_size): Use sizeof the
vector instead of the offset of the m_vecdata member.
(auto_vec<T, N>::m_data): Turn storage into
uninitialized unsigned char.
(auto_vec<T, N>::auto_vec): Allow allocation of one
stack member. Initialize m_vec in a special way to
avoid later stringop overflow diagnostics.
* vec.cc (test_auto_alias): New.
(vec_cc_tests): Call it.
As preparation to remove m_vecdata in the vl_embed vector this
changes references to it into calls to address ().
As I was here it also fixes ::contains to avoid repeated bounds
checking and the same issue in ::lower_bound which also suffers
from unnecessary copying around values.
* vec.h (vec<T, A, vl_embed>::lower_bound): Adjust to
take a const reference to the object, use address to
access data.
(vec<T, A, vl_embed>::contains): Use address to access data.
(vec<T, A, vl_embed>::operator[]): Use address instead of
m_vecdata to access data.
(vec<T, A, vl_embed>::iterate): Likewise.
(vec<T, A, vl_embed>::copy): Likewise.
(vec<T, A, vl_embed>::quick_push): Likewise.
(vec<T, A, vl_embed>::pop): Likewise.
(vec<T, A, vl_embed>::quick_insert): Likewise.
(vec<T, A, vl_embed>::ordered_remove): Likewise.
(vec<T, A, vl_embed>::unordered_remove): Likewise.
(vec<T, A, vl_embed>::block_remove): Likewise.
(vec<T, A, vl_heap>::address): Likewise.
As mentioned in the PR, when we use LTO, we wrongly use ltrans output
file name as a module name of a global variable. That leads to a
non-reproducible output.
After the suggested change, we emit the context name of normal global
variables. And for artificial variables (like .Lubsan_data3), we use
aux_base_name (e.g. "./a.ltrans0.ltrans").
PR sanitizer/108834
gcc/ChangeLog:
* asan.cc (asan_add_global): Use proper TU name for normal
global variables (and aux_base_name for the artificial one).
gcc/testsuite/ChangeLog:
* c-c++-common/asan/global-overflow-1.c: Test line and column
info for a global variable.
g++.dg/modules/virt-2_a.C fails on arm-eabi and many other arm targets
that use the AAPCS variant. ARM is the only target that overrides
TARGET_CXX_KEY_METHOD_MAY_BE_INLINE. It's not clear to me which way
the clash between AAPCS and C++ Modules design should be resolved, but
currently it favors AAPCS and thus the test fails, so skip it on
arm_eabi.
for gcc/testsuite/ChangeLog
PR c++/105224
* g++.dg/modules/virt-2_a.C: Skip on arm_eabi.
The TS says the arguments to these constructors shall meet the Executor
requirements, so it's undefined if they don't. Constraining on a subset
of those requirements won't affect valid cases, but prevents the
majority of invalid cases from trying to instantiate the constructor.
This prevents the non-explicit executor(Executor) constructor being a
candidate anywhere that a net::executor could be constructed e.g.
comparing ip::tcp::v4() == ip::udp::v4() would try to convert both
operands to executor using that constructor, then compare them using
operator==(const executor&, const executor&).
libstdc++-v3/ChangeLog:
* include/experimental/executor (executor): Constrain template
constructors.
I messed up the endianness of the address_v4::bytes_type array, which
should always be in network byte order. We can just use bit_cast to
convert the _M_addr member to/from bytes_type.
libstdc++-v3/ChangeLog:
* include/experimental/internet (address_v4(const bytes_type&)):
Use __builtin_bit_cast if available, otherwise convert to
network byte order.
(address_v4::to_bytes()): Likewise, but convert from network
byte order.
* testsuite/experimental/net/internet/address/v4/cons.cc: Fix
incorrect tests. Check for constexpr too.
* testsuite/experimental/net/internet/address/v4/creation.cc:
Likewise.
* testsuite/experimental/net/internet/address/v4/members.cc:
Check that bytes_type is a standard-layout type.
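What "network byte order" means for the four bytes can be shown with a portable C sketch. This uses shifts rather than bit_cast and is not the libstdc++ code; ipv4_to_bytes and ipv4_from_bytes are hypothetical helpers working on a host-order 32-bit address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a host-order IPv4 address as four bytes in network byte
   order, i.e. most significant byte first.  */
void
ipv4_to_bytes (uint32_t host_addr, unsigned char out[4])
{
  assert (out != NULL);
  out[0] = (unsigned char) (host_addr >> 24);
  out[1] = (unsigned char) (host_addr >> 16);
  out[2] = (unsigned char) (host_addr >> 8);
  out[3] = (unsigned char) host_addr;
}

/* Inverse: rebuild the host-order address from network-order bytes.  */
uint32_t
ipv4_from_bytes (const unsigned char in[4])
{
  assert (in != NULL);
  return ((uint32_t) in[0] << 24) | ((uint32_t) in[1] << 16)
	 | ((uint32_t) in[2] << 8) | (uint32_t) in[3];
}
```

The result is the same on little-endian and big-endian hosts: 127.0.0.1 always yields the bytes 127, 0, 0, 1 in that order.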
This is an order of magnitude faster than calling inet_ntop (and not
only because we now avoid allocating a string that is one byte larger
than the SSO buffer).
libstdc++-v3/ChangeLog:
* include/experimental/internet (address_v4::to_string):
Optimize.
* testsuite/experimental/net/internet/address/v4/members.cc:
Check more addresses.
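To give an idea of why a hand-written conversion can be cheap, here is a C sketch of dotted-quad formatting without inet_ntop. It is an illustration under stated assumptions, not the libstdc++ implementation: ipv4_to_string is a hypothetical helper taking a host-order address and a caller-provided 16-byte buffer (the longest result, "255.255.255.255", is 15 characters plus the terminator).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write the dotted-quad form of HOST_ADDR into BUF and return the
   number of characters written, excluding the terminating NUL.  */
size_t
ipv4_to_string (uint32_t host_addr, char buf[16])
{
  assert (buf != NULL);
  char *p = buf;
  for (int shift = 24; shift >= 0; shift -= 8)
    {
      unsigned octet = (host_addr >> shift) & 0xFF;
      /* Emit one to three decimal digits without leading zeros.  */
      if (octet >= 100)
	*p++ = (char) ('0' + octet / 100);
      if (octet >= 10)
	*p++ = (char) ('0' + octet / 10 % 10);
      *p++ = (char) ('0' + octet % 10);
      if (shift != 0)
	*p++ = '.';
    }
  *p = '\0';
  return (size_t) (p - buf);
}
```

No allocation or library call is involved, and the output never exceeds 15 characters.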
The options need to be set first, so that -std=gnu++20 is used when
checking the c++20 effective target.
libstdc++-v3/ChangeLog:
* testsuite/std/format/arguments/lwg3810.cc: Move dg-options
before dg-do.
I've noticed the description of these wasn't updated when the mask2
argument was added in 2019.
2023-02-24 Jakub Jelinek <jakub@redhat.com>
* config/i386/i386-builtin.def: Update description of BDESC
and BDESC_FIRST in file comment to include mask2.
With the cleanup of the arch features in GCC 13 the comment on the FLAGS field in aarch64-cores.def
is now outdated. It's now a comma-separated list rather than a bitwise or.
Spotted while reviewing an aarch64-cores.def patch.
Update the comment.
gcc/ChangeLog:
* config/aarch64/aarch64-cores.def (FLAGS): Update comment.
The following testcase ICEs on x86_64-linux with -m32. The problem is
that we create an artificial thunk and, because of -fPIC, ia32 and a thunk
destination which doesn't bind locally, can't use a mi thunk.
The ICE is because during expansion to RTL we see SSA_NAME for a PARM_DECL,
but the PARM_DECL doesn't have DECL_CONTEXT of the current function.
This is because duplicate_thunk_for_node creates a new DECL_ARGUMENTS chain
only if some arguments need modification.
The following patch fixes it by copying the DECL_ARGUMENTS list even if
the arguments can stay as is, to update DECL_CONTEXT on them. While for
mi thunks it doesn't really matter because we don't use those arguments
in any way, for other thunks it is important.
2023-02-23 Jakub Jelinek <jakub@redhat.com>
PR middle-end/108854
* cgraphclones.cc (duplicate_thunk_for_node): If no parameter
changes are needed, copy at least DECL_ARGUMENTS PARM_DECL
nodes and adjust their DECL_CONTEXT.
* g++.dg/opt/pr108854.C: New test.
The builtins used in avx512bf16vlintrin.h implementation need both
avx512bf16 and avx512vl ISAs, which the header ensures for them, but
the builtins weren't actually requiring avx512vl, so when used by hand
with just -mavx512bf16 -mno-avx512vl it resulted in ICEs.
Fixed by adding OPTION_MASK_ISA_AVX512VL to their BDESC.
2023-02-24 Jakub Jelinek <jakub@redhat.com>
PR target/108881
* config/i386/i386-builtin.def (__builtin_ia32_cvtne2ps2bf16_v16bf,
__builtin_ia32_cvtne2ps2bf16_v16bf_mask,
__builtin_ia32_cvtne2ps2bf16_v16bf_maskz,
__builtin_ia32_cvtne2ps2bf16_v8bf,
__builtin_ia32_cvtne2ps2bf16_v8bf_mask,
__builtin_ia32_cvtne2ps2bf16_v8bf_maskz,
__builtin_ia32_cvtneps2bf16_v8sf_mask,
__builtin_ia32_cvtneps2bf16_v8sf_maskz,
__builtin_ia32_cvtneps2bf16_v4sf_mask,
__builtin_ia32_cvtneps2bf16_v4sf_maskz,
__builtin_ia32_dpbf16ps_v8sf, __builtin_ia32_dpbf16ps_v8sf_mask,
__builtin_ia32_dpbf16ps_v8sf_maskz, __builtin_ia32_dpbf16ps_v4sf,
__builtin_ia32_dpbf16ps_v4sf_mask,
__builtin_ia32_dpbf16ps_v4sf_maskz): Require also
OPTION_MASK_ISA_AVX512VL.
* gcc.target/i386/avx512bf16-pr108881.c: New test.