In r14-409, we started handling empty bases first in cxx_fold_indirect_ref_1
so that we don't need to recurse and waste time.
This caused a bogus "modifying a const object" error. I'm appending my
analysis from the PR, but basically, cxx_fold_indirect_ref now returns
a different object than before, and we mark the wrong thing as const,
but since we're initializing an empty object, we should avoid setting
the object constness.
~~
Pre-r14-409: we're evaluating the call to C::C(), which is in the body of
B::B(), which is the body of D::D(&d):
C::C ((struct C *) this, NON_LVALUE_EXPR <0>)
It's a ctor so we get here:
3118 /* Remember the object we are constructing or destructing. */
3119 tree new_obj = NULL_TREE;
3120 if (DECL_CONSTRUCTOR_P (fun) || DECL_DESTRUCTOR_P (fun))
3121 {
3122 /* In a cdtor, it should be the first `this' argument.
3123 At this point it has already been evaluated in the call
3124 to cxx_bind_parameters_in_call. */
3125 new_obj = TREE_VEC_ELT (new_call.bindings, 0);
new_obj=(struct C *) &d.D.2656
3126 new_obj = cxx_fold_indirect_ref (ctx, loc, DECL_CONTEXT (fun), new_obj);
new_obj=d.D.2656.D.2597
We proceed to evaluate the call, then we get here:
3317 /* At this point, the object's constructor will have run, so
3318 the object is no longer under construction, and its possible
3319 'const' semantics now apply. Make a note of this fact by
3320 marking the CONSTRUCTOR TREE_READONLY. */
3321 if (new_obj && DECL_CONSTRUCTOR_P (fun))
3322 cxx_set_object_constness (ctx, new_obj, /*readonly_p=*/true,
3323 non_constant_p, overflow_p);
new_obj is still d.D.2656.D.2597, its type is "C", cxx_set_object_constness
doesn't set anything as const. This is fine.
After r14-409: on line 3125, new_obj is (struct C *) &d.D.2656 as before,
but we go to cxx_fold_indirect_ref_1:
5739 if (is_empty_class (type)
5740 && CLASS_TYPE_P (optype)
5741 && lookup_base (optype, type, ba_any, NULL, tf_none, off))
5742 {
5743 if (empty_base)
5744 *empty_base = true;
5745 return op;
type is C, which is an empty class; optype is "const D", and C is a base of D.
So we return the VAR_DECL 'd'. Then we get to cxx_set_object_constness with
object=d, which is const, so we mark the constructor READONLY.
Then we're evaluating A::A() which has
((A*)this)->data = 0;
we evaluate the LHS to d.D.2656.a, for which the initializer is
{.D.2656={.a={.data=}}} which is TREE_READONLY and 'd' is const, so we think
we're modifying a const object and fail the constexpr evaluation.
PR c++/115900
gcc/cp/ChangeLog:
* constexpr.cc (cxx_eval_call_expression): Set new_obj to NULL_TREE
if cxx_fold_indirect_ref set empty_base to true.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/constexpr-init23.C: New test.
(cherry picked from commit d890b04197fb0ddba4fbfb32f88e266fa27e02f3)
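The class layout walked through in the PR c++/115900 analysis above can be sketched roughly as follows. This is a reconstruction from the description, not the committed test (g++.dg/cpp2a/constexpr-init23.C): only the names A, B, C, D, `a` and `data` come from the analysis, and the default member initializer is an assumption added so the sketch does not depend on C++20.

```cpp
// Hypothetical reconstruction of the hierarchy described in the analysis.
struct A {
  int data = 1;                 // assumed initializer, not from the PR
  constexpr A() { data = 0; }   // writes to a subobject of `d` during construction
};

struct C {                      // empty class, so the empty-base path applies
  constexpr C() {}
};

struct B : C {                  // C is an empty base of B
  A a;
  constexpr B() : C(), a() {}   // C::C() is evaluated before A::A()
};

struct D : B {
  constexpr D() : B() {}
};

// `d` has type const D. While D::D() is still running, A::A() must be
// allowed to store to d.a.data; marking `d` read-only right after C::C()
// is what produced the bogus "modifying a const object" error.
constexpr D d;

static_assert(d.a.data == 0, "A::A() must be able to write data");
```

With the fix, new_obj is cleared when the folded object is an empty base, so nothing is marked read-only after C::C() returns.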
The following fixes an issue with CCPs likely_value when faced with
a vector CTOR containing undef SSA names and constants. This should
be classified as CONSTANT and not UNDEFINED.
PR tree-optimization/116057
* tree-ssa-ccp.cc (likely_value): Also walk CTORs in stmt
operands to look for constants.
* gcc.dg/torture/pr116057.c: New testcase.
(cherry picked from commit 1ea551514b9c285d801ac5ab8d78b22483ff65af)
The test fails on 32-bit targets (which don't support __int128 type).
Using unsigned long long instead still ICEs before the fix and passes
after it on those targets.
2024-07-29 Jakub Jelinek <jakub@redhat.com>
PR c++/115986
* g++.dg/cpp2a/consteval-prop21.C (operator "" _c): Use
unsigned long long rather than __uint128_t for return type if int128
is unsupported.
(cherry picked from commit 331f23540eec39fc1e665f573c4aac258bba6043)
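The return-type fallback described above can be sketched as below. This is only an illustration under assumptions: the alias `wide_t` and the helper `parse_digits` are made up for this sketch, the real test uses consteval rather than constexpr, and __SIZEOF_INT128__ is used here as the availability check, which may differ from what the test does.

```cpp
// Pick a 128-bit return type where the target has one, else fall back.
#ifdef __SIZEOF_INT128__
typedef __uint128_t wide_t;          // targets with __int128 support
#else
typedef unsigned long long wide_t;   // e.g. 32-bit targets
#endif

// Hypothetical helper: accumulate decimal digits (C++11-style recursion).
constexpr wide_t parse_digits(const char* s, wide_t acc) {
  return *s == '\0'
           ? acc
           : parse_digits(s + 1, acc * 10 + static_cast<wide_t>(*s - '0'));
}

// Raw literal operator: 42_c is passed to it as the string "42".
constexpr wide_t operator""_c(const char* s) {
  return parse_digits(s, 0);
}

static_assert(0_c == 0, "literal must evaluate at compile time");
```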
During speculative constant folding of an if consteval, we take the false
branch, but the true branch is an immediate function context, so we don't
want to cp_fold_immediate it. So we could check IF_STMT_CONSTEVAL_P
here. But beyond that, we don't want to do this inside a call, only when
first parsing a function.
PR c++/115583
gcc/cp/ChangeLog:
* constexpr.cc (cxx_eval_conditional_expression): Don't
cp_fold_immediate for if consteval.
gcc/testsuite/ChangeLog:
* g++.dg/cpp23/consteval-if13.C: New test.
(cherry picked from commit d5f1948640815a554d106542c2e91e4e117aa3bc)
Here the call to e() makes us decide to check d() for escalation at EOF, but
while checking it we try to fold_immediate 0_c, and get confused by the
template trees. Let's not mess with escalation for function templates.
PR c++/115986
gcc/cp/ChangeLog:
* cp-gimplify.cc (remember_escalating_expr): Skip function
templates.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/consteval-prop21.C: New test.
(cherry picked from commit a9e9f772c7488ac0c09dd92f28890bdab939771a)
Here when we want to synthesize methods for foo()::B maybe_push_to_top_level
calls push_function_context, which sets cfun to a dummy value; later
finish_call_expr tries to set something in
cp_function_chain (i.e. cfun->language), which isn't set. Many places in
the compiler check cfun && cp_function_chain to avoid this problem; here we
also want to check !cp_unevaluated_operand, like set_flags_from_callee does.
PR c++/115561
gcc/cp/ChangeLog:
* semantics.cc (finish_call_expr): Check cp_unevaluated_operand.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-lambda21.C: New test.
(cherry picked from commit 3129a2ed6a764c0687efaca9eba53dcf12d1d8a0)
There are several typos in the AVX512 intrinsic macro definitions. Correct them
to fix errors when compiling with -O0.
gcc/ChangeLog:
* config/i386/avx512dqintrin.h
(_mm_mask_fpclass_ss_mask): Correct operand order.
(_mm_mask_fpclass_sd_mask): Ditto.
(_mm256_maskz_reduce_round_ss): Use __builtin_ia32_reducess_mask_round
instead of __builtin_ia32_reducesd_mask_round.
(_mm_reduce_round_sd): Use -1 as mask since it is non-mask.
(_mm_reduce_round_ss): Ditto.
* config/i386/avx512vlbwintrin.h
(_mm256_mask_alignr_epi8): Correct operand usage.
(_mm_mask_alignr_epi8): Ditto.
* config/i386/avx512vlintrin.h (_mm_mask_alignr_epi64): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512bw-vpalignr-1b.c: New test.
* gcc.target/i386/avx512dq-vfpclasssd-1b.c: Ditto.
* gcc.target/i386/avx512dq-vfpclassss-1b.c: Ditto.
* gcc.target/i386/avx512dq-vreducesd-1b.c: Ditto.
* gcc.target/i386/avx512dq-vreducess-1b.c: Ditto.
* gcc.target/i386/avx512vl-valignq-1b.c: Ditto.
Didn't notice the memmove is into an int variable, so the test
was still failing on big endian.
2024-07-24 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/116034
PR testsuite/116061
* gcc.dg/pr116034.c (g): Change type from int to unsigned short.
(foo): Guard memmove call on __SIZEOF_SHORT__ == 2.
(cherry picked from commit 69e69847e21a8d951ab5f09fd3421449564dba31)
It is possible that the Zba optimization pattern zero_extendsidi2_bitmanip
matches for a XTheadMemIdx INSN with the effect of emitting an invalid
instruction as reported in PR116035.
The pattern above is used to emit a zext.w instruction to zero-extend
SI mode registers to DI mode. A similar functionality can be achieved
by XTheadBb's th.extu instruction. And indeed, we have the equivalent
pattern in thead.md (zero_extendsidi2_th_extu). However, that pattern
depends on !TARGET_XTHEADMEMIDX. To compensate for that, there are
specific patterns that ensure that zero-extension instruction can still
be emitted (th_memidx_bb_zero_extendsidi2 and friends).
While we could implement something similar (th_memidx_zba_zero_extendsidi2)
it would only make sense, if there existed real HW that does implement Zba
and XTheadMemIdx, but not XTheadBb. Unless such a machine exists, let's
simply disable zero_extendsidi2_bitmanip if XTheadMemIdx is available.
PR target/116035
gcc/ChangeLog:
* config/riscv/bitmanip.md: Disable zero_extendsidi2_bitmanip
for XTheadMemIdx.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/pr116035-1.c: New test.
* gcc.target/riscv/pr116035-2.c: New test.
(cherry picked from commit 9817d29cd66762893782a52b2c304c5083bc0023)
Reported-by: Patrick O'Neill <patrick@rivosinc.com>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
As the test case requires +-Inf and NaN to work and -ffast-math is added
by default for arm-none-eabi, re-enable non-finite math.
gcc/testsuite/ChangeLog:
PR testsuite/115826
* gcc.dg/vect/tsvc/vect-tsvc-s1281.c: Use -fno-finite-math-only.
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
(cherry picked from commit 7793f5b4194253acaac0b53d8a1c95d9b5c8f4bb)
This avoids some warnings when the preprocessor conditions are not met.
libstdc++-v3/ChangeLog:
* src/c++23/print.cc (__open_terminal): Use [[maybe_unused]] on
parameter.
(cherry picked from commit b40156d69153364315e071dc968227ce1c3bd2a8)
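The warning being avoided is the usual unused-parameter one: when the preprocessor disables the only code that uses a parameter, the parameter is left unreferenced. A minimal sketch of the pattern follows; the function name and the HYPOTHETICAL_HAVE_ISATTY macro are invented for illustration and are not the conditions used in print.cc.

```cpp
// [[maybe_unused]] keeps -Wunused-parameter quiet in the branch where
// `fd` is never referenced.
constexpr bool is_terminal_model([[maybe_unused]] int fd) {
#ifdef HYPOTHETICAL_HAVE_ISATTY
  return fd >= 0;   // stand-in for a real isatty-style query
#else
  return false;     // `fd` is unused on this path
#endif
}

static_assert(!is_terminal_model(1), "macro is not defined in this sketch");
```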
avrlibc has an incomplete unistd.h that doesn't have isatty.
So building libstdc++ fails when compiling c++23/print.cc.
As a workaround I added a check for AVR.
libstdc++-v3/ChangeLog:
PR libstdc++/115482
* src/c++23/print.cc (__open_terminal) [__AVR__]: Do not use
isatty.
(cherry picked from commit 8439405e38c56b774cf3c65bdafae5f9e11d470a)
The folding into REALPART_EXPR is correct, used only when the mem_offset
is zero, but for IMAGPART_EXPR it didn't check the exact offset value (just
that it is not 0).
The following patch fixes that by using IMAGPART_EXPR only if the offset
is right and using BIT_FIELD_REF or whatever else otherwise.
2024-07-23 Jakub Jelinek <jakub@redhat.com>
Andrew Pinski <quic_apinski@quicinc.com>
PR tree-optimization/116034
* tree-ssa.cc (maybe_rewrite_mem_ref_base): Only use IMAGPART_EXPR
if MEM_REF offset is equal to element type size.
* gcc.dg/pr116034.c: New test.
(cherry picked from commit b9cefd67a2a464a3c9413e6b3f28e7dc7a9ef162)
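The offset rule described above can be modelled in a few lines. This is a simplified stand-in, not the code in maybe_rewrite_mem_ref_base: the enum and function names are hypothetical, and only the offset check is modelled.

```cpp
#include <cstddef>

enum class PartKind { Real, Imag, Other };

// Classify an access into a complex value by its byte offset.
// Offset 0 is the real part, offset == element size is the imaginary
// part, and anything else must be handled some other way.
constexpr PartKind classify_complex_access(std::size_t offset,
                                           std::size_t elem_size) {
  return offset == 0           ? PartKind::Real
         : offset == elem_size ? PartKind::Imag
                               : PartKind::Other;
}

// For a complex value with 2-byte elements, an access at offset 1
// straddles both parts. Treating every nonzero offset as the imaginary
// part was the bug.
static_assert(classify_complex_access(1, 2) == PartKind::Other, "");
```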
For prefetchi instructions, the operand is explicitly required to be a
RIP-relative address, and the assembler enforces that rule strictly. This makes
an instruction like:
prefetchit0 bar
illegal for the assembler, even though it should be a common usage of prefetchi.
Change to %a to explicitly add (%rip) after the function label so that the
assembler accepts it and passes it on to the linker to get the real address.
gcc/ChangeLog:
* config/i386/i386.md (prefetchi): Change to %a.
gcc/testsuite/ChangeLog:
* gcc.target/i386/prefetchi-1.c: Check (%rip).
The dg-do directive appears after dg-require-effective-target in
g++.target/powerpc/pr106069.C. That doesn't work the way that was
presumably intended. Both of these directives set dg-do-what, but
dg-do does so fully and unconditionally, overriding any decisions
recorded there by earlier directives. Reorder the directives more
canonically, so that both take effect.
for gcc/testsuite/ChangeLog
PR target/106069
* g++.target/powerpc/pr106069.C: Reorder dg directives.
(cherry picked from commit ad65caa332bc7600caff6b9b5b29175b40d91e67)
When passing *this to the promise type ctor (or to its operator new)
(as per [dcl.fct.def.coroutine]/4), we add an explicit cast to lvalue
reference. But this is unnecessary since *this is already always an
lvalue. And doing so means we need to call convert_from_reference
afterward to lower the reference expression to an implicit dereference,
which we're currently neglecting to do and which causes overload
resolution to get confused when computing argument conversions.
So this patch removes this unneeded reference cast when passing *this
to the promise ctor, and removes both the cast and implicit deref when
passing *this to operator new, for consistency. While we're here, use
cp_build_fold_indirect_ref instead of directly building INDIRECT_REF.
PR c++/104981
PR c++/115550
gcc/cp/ChangeLog:
* coroutines.cc (morph_fn_to_coro): Remove unneeded calls
to convert_to_reference and convert_from_reference when
passing *this. Use cp_build_fold_indirect_ref instead
of directly building INDIRECT_REF.
gcc/testsuite/ChangeLog:
* g++.dg/coroutines/pr104981-preview-this.C: New test.
* g++.dg/coroutines/pr115550-preview-this.C: New test.
Reviewed-by: Iain Sandoe <iain@sandoe.co.uk>
Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit 7c5a9bf1d206fe20cb050200d4a30f11c76b1b19)
The code path for rejecting an object-less call to a non-static member
function should also consider xobj member functions (so that we correctly
reject the below calls with a "cannot call member function without object"
diagnostic).
PR c++/115783
gcc/cp/ChangeLog:
* call.cc (build_new_method_call): Generalize METHOD_TYPE
check to DECL_OBJECT_MEMBER_FUNCTION_P.
gcc/testsuite/ChangeLog:
* g++.dg/cpp23/explicit-obj-diagnostics11.C: New test.
Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit 2ee70c9f83a1033f2897a35bff9e9ffdd03cc651)
Hi,
this patch fixes wrong code in the case where store-merging introduces a load of
a function parameter that was previously write-only (which happens for bitfields).
Without this, the whole store-merged area is considered to be killed.
PR ipa/111613
gcc/ChangeLog:
* ipa-modref.cc (analyze_parms): Do not preserve EAF_NO_DIRECT_READ and
EAF_NO_INDIRECT_READ from past flags.
gcc/testsuite/ChangeLog:
* gcc.c-torture/pr111613.c: New test.
(cherry picked from commit 14074773350ffed7efdebbc553adf0f23b572e87)
We currently silently ignore the -mrop-protect option for old CPUs we don't
support with the ROP hash insns, but we throw an error for unsupported ABIs.
This patch treats unsupported CPUs and ABIs similarly by throwing an error
for both. This matches clang behavior and allows us to simplify our tests
in the code that generates our prologue and epilogue code.
2024-06-26 Peter Bergner <bergner@linux.ibm.com>
gcc/
PR target/114759
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Disallow
CPUs and ABIs that do not support the ROP protection insns.
* config/rs6000/rs6000-logue.cc (rs6000_stack_info): Remove now
unneeded tests.
(rs6000_emit_prologue): Likewise.
Remove unneeded gcc_assert.
(rs6000_emit_epilogue): Likewise.
* config/rs6000/rs6000.md: Likewise.
gcc/testsuite/
PR target/114759
* gcc.target/powerpc/pr114759-3.c: New test.
(cherry picked from commit 6f2bab9b5d1ce1914c748b7dcd8638dafaa98df7)
We currently only emit the ROP-protect hash* insns for Power10, where the
insns were added to the architecture. We want to emit them for earlier
cpus (where they operate as NOPs), so that if those older binaries are
ever executed on a Power10, then they'll be protected from ROP attacks.
Binutils accepts hashst and hashchk back to Power8, so change GCC to emit
them for Power8 and later. This matches clang's behavior.
2024-06-19 Peter Bergner <bergner@linux.ibm.com>
gcc/
PR target/114759
* config/rs6000/rs6000-logue.cc (rs6000_stack_info): Use TARGET_POWER8.
(rs6000_emit_prologue): Likewise.
* config/rs6000/rs6000.md (hashchk): Likewise.
(hashst): Likewise.
Fix whitespace.
gcc/testsuite/
PR target/114759
* gcc.target/powerpc/pr114759-2.c: New test.
* lib/target-supports.exp (rop_ok): Use
check_effective_target_has_arch_pwr8.
(cherry picked from commit a05c3d23d1e1c8d2971b123804fc7a61a3561adb)
We currently only compute the offset for the ROP hash save location in
the stack frame for Altivec compiles. For non-Altivec compiles when we
emit ROP mitigation instructions, we use a default offset of zero which
corresponds to the backchain save location which will get clobbered on
any call. The fix is to compute the ROP hash save location for all
compiles.
2024-06-14 Peter Bergner <bergner@linux.ibm.com>
gcc/
PR target/115389
* config/rs6000/rs6000-logue.cc (rs6000_stack_info): Compute
rop_hash_save_offset for non-Altivec compiles.
gcc/testsuite
PR target/115389
* gcc.target/powerpc/pr115389.c: New test.
(cherry picked from commit c70eea0dba5f223d49c80cfb3e80e87b74330aac)
The ELFv2 stack frame layout comment in rs6000-logue.cc shows the ROP
hash save slot in the wrong location. Update the comment to show the
correct ROP hash save location in the frame.
2024-06-07 Peter Bergner <bergner@linux.ibm.com>
gcc/
* config/rs6000/rs6000-logue.cc (rs6000_stack_info): Update comment.
(cherry picked from commit e91cf26a954a5c1bf431e36f3a1e69f94e9fa4fe)
modref_eaf_analysis::analyze_ssa_name misinterprets EAF flags. If a dereferenced
parameter is passed (to map_iterator in the testcase), it can be returned
indirectly, which in turn makes it escape into the next function call.
PR ipa/115033
gcc/ChangeLog:
* ipa-modref.cc (modref_eaf_analysis::analyze_ssa_name): Fix checking of
EAF flags when analysing values dereferenced as function parameters.
gcc/testsuite/ChangeLog:
* gcc.c-torture/execute/pr115033.c: New test.
(cherry picked from commit cf8ffc58aad3127031c229a75cc4b99c8ace25e0)
unadjusted_ptr_and_unit_offset accidentally throws away the offset computed by
get_addr_base_and_unit_offset. Instead of passing extra_offset it passes offset.
PR ipa/114207
gcc/ChangeLog:
* ipa-prop.cc (unadjusted_ptr_and_unit_offset): Fix accounting of offsets in ADDR_EXPR.
gcc/testsuite/ChangeLog:
* gcc.c-torture/execute/pr114207.c: New test.
(cherry picked from commit 391f46f10b0586c074014de82efe76787739bb0c)
Hi,
this testcase shows another problem with missing comparators for metadata
in ICF. With value ranges available to loop optimizations during early
opts, we can estimate the number of iterations based on a guarding condition
that can be split away by the fnsplit pass. This patch disables ICF when
the number of iterations does not match.
Bootstrapped/regtested x86_64-linux, will commit it shortly
gcc/ChangeLog:
PR ipa/115277
* ipa-icf-gimple.cc (func_checker::compare_loops): compare loop
bounds.
gcc/testsuite/ChangeLog:
* gcc.c-torture/compile/pr115277.c: New test.
(cherry picked from commit 0d19fbc7b0760ce665fa6a88cd40cfa0311358d7)
This patch tames down the inliner on (multiply) self-recursive always_inline
functions. While we already have caps on recursive inlining, the testcase
combines the early inliner and the late inliner to get a very wide recursive
inlining tree. The basic idea is to ignore DISREGARD_INLINE_LIMITS when deciding
on inlining self-recursive functions (so we cut on the function being large) and
clear the flag once it is detected.
I did not include the testcase since it still produces a lot of code and would
slow down testing. It also outputs many "inlining failed" messages, which is not
very nice, but it is hard to detect self-recursion cycles in full generality
when indirect calls and other tricks may happen.
gcc/ChangeLog:
PR ipa/113291
* ipa-inline.cc (enum can_inline_edge_by_limits_flags): New enum.
(can_inline_edge_by_limits_p): Take flags instead of multiple bools; add flag
for forcing inline limits.
(can_early_inline_edge_p): Update.
(want_inline_self_recursive_call_p): Update; use FORCE_LIMITS mode.
(check_callers): Update.
(update_caller_keys): Update.
(update_callee_keys): Update.
(recursive_inlining): Update.
(add_new_edges_to_heap): Update.
(speculation_useful_p): Update.
(inline_small_functions): Clear DECL_DISREGARD_INLINE_LIMITS on self recursion.
(flatten_function): Update.
(inline_to_all_callers_1): Update.
(cherry picked from commit 1ec49897253e093e1ef6261eb104ac0c111bac83)
Use INT_MIN rather than -1 in `comparison_qty' where a comparison is not
with a register, because the value of -1 is actually a valid reference
to register 0 in the case where it has not been assigned a quantity.
Using -1 makes `REG_QTY (REGNO (folded_arg1)) == ent->comparison_qty'
comparison in `fold_rtx' to incorrectly trigger in rare circumstances
and return true for a memory reference, making CSE consider a comparison
operation to evaluate to a constant expression and consequently make the
resulting code incorrectly execute or fail to execute conditional
blocks.
This has caused a miscompilation of rwlock.c from LinuxThreads for the
`alpha-linux-gnu' target, where `rwlock->__rw_writer != thread_self ()'
expression (where `thread_self' returns the thread pointer via a PALcode
call) has been decided to be always true (with `ent->comparison_qty'
using -1 for a reference to `rwlock->__rw_writer', while register 0
holding the thread pointer retrieved by `thread_self') and code for the
false case has been optimized away where it mustn't have, causing
program lockups.
The issue has been observed as a regression from commit 08a692679f
("Undefined cse.c behaviour causes 3.4 regression on HPUX"),
<https://gcc.gnu.org/ml/gcc-patches/2004-10/msg02027.html>, and up to
commit 932ad4d9b5 ("Make CSE path following use the CFG"),
<https://gcc.gnu.org/ml/gcc-patches/2006-12/msg00431.html>, where CSE
has been restructured sufficiently for the issue not to trigger with the
original reproducer anymore. However the original bug remains and can
trigger, because `comparison_qty' will still be assigned -1 for a memory
reference and the `reg_qty' member of a `cse_reg_info_table' entry will
still be assigned -1 for register 0 where the entry has not been
assigned a quantity, e.g. at initialization.
Use INT_MIN then as noted above, so that the value remains negative, for
consistency with the REGNO_QTY_VALID_P macro (even though not used on
`comparison_qty'), and then so that it should not ever match a valid
negated register number, fixing the regression with commit 08a692679f.
gcc/
PR rtl-optimization/115565
* cse.cc (record_jump_cond): Use INT_MIN rather than -1 for
`comparison_qty' if !REG_P.
(cherry picked from commit 69bc5fb97dc3fada81869e00fa65d39f7def6acf)
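The collision described above can be shown with a small model. It assumes, consistently with the text, that a register N with no quantity assigned is encoded as -N - 1, so that register 0 maps to -1. The function and constant names below are hypothetical and are not the identifiers used in cse.cc.

```cpp
#include <climits>

// Assumed encoding for "register REGNO has no quantity yet".
constexpr int unassigned_reg_qty(int regno) { return -regno - 1; }

// Marker stored in comparison_qty when the comparison is not with a register.
constexpr int kOldNotARegister = -1;       // before the fix
constexpr int kNewNotARegister = INT_MIN;  // after the fix

// True if the marker is indistinguishable from an unassigned register.
constexpr bool collides(int marker, int regno) {
  return unassigned_reg_qty(regno) == marker;
}

// -1 matches register 0 with no quantity, so a memory reference could
// compare equal to it. INT_MIN does not match register 0; it would only
// match register number INT_MAX, which is not a realistic register number.
static_assert(collides(kOldNotARegister, 0), "old marker aliases register 0");
static_assert(!collides(kNewNotARegister, 0), "new marker does not");
```

INT_MIN is still negative, so a validity check of the form "quantity >= 0" continues to reject it.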
Code attribute bhfgq is missing a mapping for TF. This results in
unresolved iterators in assembler templates for *bswaptf.
With the TF mapping added the base mnemonics vlbr and vstbr are not
"used" anymore but only the extended mnemonics (vlbr<bhfgq> was
interpreted as vlbr; likewise for vstbr). Therefore, remove the base
mnemonics from the scheduling description, otherwise, genattrtab would
error about unknown mnemonics.
Likewise, for movtf_vr only the extended mnemonics for vrepi are used,
now, which means the base mnemonic is "unused" and has to be removed
from the scheduling description.
Similarly, we end up with unresolved iterators in assembler templates
for mulfprx23 since code attribute xdee is missing a mapping for FPRX2.
Note, this is basically a cherry pick of commit r15-2060-ga4abda934aa426
with the addition that vrepi is removed from the scheduling description,
too.
gcc/ChangeLog:
* config/s390/3931.md (vlbr, vstbr, vrepi): Remove.
* config/s390/s390.md (xdee): Add FPRX2 mapping.
* config/s390/vector.md (bhfgq): Add TF mapping.
The inner loop in build_option_suggestions uses OPTION to take the
address of OPTB and use it across iterations, which is undefined
behaviour since OPTB is defined within the loop. Pull it outside the
loop to make this defined.
gcc/ChangeLog:
* opt-suggestions.cc
(option_proposer::build_option_suggestions): Pull OPTB
definition out of the innermost loop.
(cherry picked from commit e0d997e913f811ecf4b3e10891e6a4aab5b38a31)
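The shape of the fix can be illustrated with a generic example. This is not the code from opt-suggestions.cc: the function and variable names are invented, with `fallback` playing the role of OPTB and `chosen` the role of OPTION.

```cpp
#include <cassert>
#include <vector>

// Sums, for each element, the value `chosen` currently points to.
int sum_chosen(const std::vector<int>& values) {
  int fallback = 0;             // hoisted out of the loop (the fix)
  const int* chosen = nullptr;  // may keep pointing at `fallback`
  int total = 0;
  for (int v : values) {
    // Buggy shape: declaring `int fallback` here while `chosen` keeps its
    // address would leave `chosen` dangling in the next iteration, because
    // the loop-local object's lifetime ends at the end of each iteration.
    if (v < 0) {
      fallback = -v;
      chosen = &fallback;
    }
    if (chosen)
      total += *chosen;         // valid: `fallback` outlives the loop body
  }
  return total;
}

// Small usage check for the sketch.
inline void demo() { assert(sum_chosen({1, -2, 3}) == 4); }
```

Hoisting the definition changes nothing about which values are used; it only extends the object's lifetime to cover every iteration that may read through the pointer.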