10000 / 78 is strictly greater than 128 so we will
actually do 128+1 strides in foo() for s == 78 and p[]
needs to be dimensioned accordingly.
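The stride count can be checked with a small sketch. The loop below is only a hypothetical stand-in for the strided access in foo() (the test's actual loop is not reproduced here), and count_strides is an illustrative name:

```c
#include <stddef.h>

/* Hypothetical model: count the iterations of a loop of the form
   "for (i = 0; i < n; i += s)", i.e. the number of strides taken.  */
size_t
count_strides (size_t n, size_t s)
{
  size_t count = 0;
  for (size_t i = 0; i < n; i += s)
    count++;
  return count;
}
```

Under that assumption, n == 10000 and s == 78 give 129 iterations, because 128 * 78 == 9984 is still below 10000.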
2021-12-20 Olivier Hainque <hainque@adacore.com>
gcc/testsuite/
* gcc.dg/vect/vect-simd-20.c: Fix size of p[]
to accommodate the number of strides performed
by foo() for s == 78.
While working on a middle-end patch to more aggressively use highpart
multiplications on targets that support them, I noticed that the RTL
expanded by the x86 backend interacts poorly with register allocation
leading to suboptimal code.
For the testcase,
typedef int __attribute ((mode(TI))) ti_t;
long foo(long x)
{
return ((ti_t)x * 19065) >> 64;
}
we'd like to avoid:
foo: movq %rdi, %rax
movl $19065, %edx
imulq %rdx
movq %rdx, %rax
ret
and would prefer:
foo: movl $19065, %eax
imulq %rdi
movq %rdx, %rax
ret
This patch provides a pair of peephole2 transformations to tweak the
spills generated by reload, and at the same time replaces the current
define_expand with a define_insn pattern using the new [su]mul_highpart
RTX codes.
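For reference, the operation behind the [su]mul_highpart codes can be sketched in C as follows. This is only an illustration, not part of the patch: it assumes a 64-bit long and a compiler that provides __int128 (such as GCC on x86_64), and the function names are made up for the example.

```c
/* Assumes a 64-bit 'long' and __int128 support.  */

/* Signed highpart multiply: the upper 64 bits of the 128-bit product.
   Relies on arithmetic right shift of negative values, which GCC
   provides.  */
long
smul_highpart (long x, long y)
{
  return (long) (((__int128) x * y) >> 64);
}

/* Unsigned counterpart.  */
unsigned long
umul_highpart (unsigned long x, unsigned long y)
{
  return (unsigned long) (((unsigned __int128) x * y) >> 64);
}
```

With these definitions, foo (x) from the testcase corresponds to smul_highpart (x, 19065).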
2021-12-20 Roger Sayle <roger@nextmovesoftware.com>
Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog
* config/i386/i386.md (any_mul_highpart): New code iterator.
(sgnprefix, s): Add attribute support for [su]mul_highpart.
(<s>mul<mode>3_highpart): Delete expander.
(<s>mul<mode>3_highpart, <s>mulsi32_highpart_zext):
New define_insn patterns.
(define_peephole2): Tweak the register allocation for the above
instructions after reload.
gcc/testsuite/ChangeLog
* gcc.target/i386/smuldi3_highpart.c: New test case.
This patch makes us remember the function selected by overload resolution
during ahead of time processing of a non-dependent call expression, so
that at instantiation time we avoid repeating some of the work of overload
resolution for the call. Note that we already do this for non-dependent
operator expressions via build_min_non_dep_op_overload.
Some caveats:
* When processing ahead of time a non-dependent call to a member
function template of a currently open class template (as in
g++.dg/template/deduce4.C), we end up generating an "inside-out"
partial instantiation such as S<T>::foo<int, int>(), the likes of
which we're apparently not prepared to fully instantiate. So in this
situation, we prune to the selected template rather than the
specialization.
* This change triggered a latent FUNCTION_DECL pretty printing issue
in cpp0x/error2.C -- since we now resolve the call to foo<0> ahead
of time, the error now looks like:
error: expansion pattern ‘foo()()=0’ contains no parameter pack
where the FUNCTION_DECL for foo<0> is clearly misprinted. But this
pretty-printing issue could be reproduced without this patch if
we define foo as a non-template function. Since this testcase was
added to verify pretty printing of TEMPLATE_ID_EXPR, I work around
this test failure by making the call to foo type-dependent and thus
immune to this ahead of time pruning.
* We now reject parts of cpp0x/fntmp-equiv1.C because we notice that
the non-dependent call d(f, b) in
int d(int, int);
template <unsigned long f, unsigned b, typename> e<d(f, b)> d();
is non-constexpr. Since this testcase is about equivalency of
dependent names in the context of declaration matching, it seems the
best fix here is to make the calls to d, d2 and d3 within the
function signatures dependent.
gcc/cp/ChangeLog:
* call.c (build_new_method_call): For a non-dependent call
expression inside a template, return a templated tree
whose overload set contains just the selected function.
* semantics.c (finish_call_expr): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/error2.C: Make the call to foo type-dependent in
order to avoid latent pretty-printing issue for FUNCTION_DECL
inside MODOP_EXPR.
* g++.dg/cpp0x/fntmp-equiv1.C: Make the calls to d, d2 and d3
within the function signatures dependent.
* g++.dg/template/non-dependent16.C: New test.
* g++.dg/template/non-dependent16a.C: New test.
* g++.dg/template/non-dependent17.C: New test.
gcc/jit/libgccjit.c:3957:8: warning: type 'struct version_info' violates the C++ One Definition Rule [-Wodr]
../../gcc/jit/libgccjit.c:3957:8: warning: type 'struct version_info' violates the C++ One Definition Rule [-Wodr]
3957 | struct version_info
../../gcc/tree-ssa-loop-ivopts.c:181: note: a different type is defined in another translation unit
181 | struct version_info
gcc/jit/ChangeLog:
* libgccjit.c (struct version_info): Rename to jit_version_info.
(struct jit_version_info): Likewise.
(gcc_jit_version_major): Likewise.
(gcc_jit_version_minor): Likewise.
(gcc_jit_version_patchlevel): Likewise.
In the testcase we fail to analyze an SSA name because the do_dataflow flag
is set, which triggers an early exit in analyze_ssa_name. Fixed by disabling
early exits when handling deferred names.
gcc/ChangeLog:
2021-12-20 Jan Hubicka <hubicka@ucw.cz>
PR ipa/103669
* ipa-modref.c (modref_eaf_analysis::analyze_ssa_name): Add deferred
parameter.
(modref_eaf_analysis::propagate): Use it.
gcc/testsuite/ChangeLog:
2021-12-20 Jan Hubicka <hubicka@ucw.cz>
PR ipa/103669
* g++.dg/torture/pr103669.C: New test.
This enables IEEE support on the upcoming aarch64-apple-darwin target,
and has been tested for some time in an external port.
libgfortran/ChangeLog:
* configure.host: Add aarch64-apple-darwin support.
* config/fpu-aarch64.h: New file.
With the recent PHI-OPT patch for line numbers, I had missed that this
testcase was now failing. The uninitialized warning was there
before my recent patch; it was just on the wrong line. The testcase
had added an xfail in r12-4698-gf6d012338 (though a bug report was
not filed to record it).
This patch changes the dg-bogus messages around to catch both locations
and xfail both of them.
At least there is now a patch for the correct line numbers for the
phi-opt.
Committed after testing the testcase.
gcc/testsuite/ChangeLog:
* gcc.dg/uninit-pr89230-1.c: Change the dg-bogus messages
around and xfail both of them.
When adding support for static chain and return slot flags I forgot to update
the early exit condition in modref_merge_call_site_flags. This leads to wrong
code, as demonstrated by the Fortran testcase attached to the PR (which I hope
someone will help me turn into a testsuite one).
gcc/ChangeLog:
2021-12-19 Jan Hubicka <hubicka@ucw.cz>
PR ipa/103766
* ipa-modref.c (modref_merge_call_site_flags): Fix early exit condition.
Code like
void swap() {
namespace __variant = __detail::__variant;
...
}
creates a NAMESPACE_DECL where the CP_DECL_CONTEXT is a FUNCTION_DECL.
DECL_TEMPLATE_INFO fails on NAMESPACE_DECL and therefore must be handled
first in the assertion.
Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
gcc/cp/ChangeLog:
* module.cc (trees_out::get_merge_kind): NAMESPACE_DECLs also
cannot have a DECL_TEMPLATE_INFO.
The r12-5403 fix apparently doesn't handle the case where the inner
lambda explicitly rather than implicitly captures the capture proxy from
the outer lambda, which causes us to reject the first example in the
testcase below.
This is because compared to an implicit capture, the effective initializer
for an explicit capture is wrapped in a location wrapper (pointing to within
the capture list), and this wrapper foils the is_capture_proxy check added
in r12-5403.
The simplest fix appears to be to strip location wrappers accordingly
before checking is_capture_proxy. And to help guard against this kind
of bug, this patch also makes is_capture_proxy assert it doesn't see a
location wrapper.
PR c++/94376
gcc/cp/ChangeLog:
* lambda.c (lambda_capture_field_type): Strip location wrappers
before checking for a capture proxy.
(is_capture_proxy): Assert that we don't see a location wrapper.
(mark_const_cap_r): Don't call is_constant_capture_proxy on a
location wrapper.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/lambda/lambda-nested9a.C: New test.
Here during constraint checking for the inner call to A<0>::f<0>,
substitution into the PARM_DECL d in the atomic constraint yields the
wrong local specialization because local_specializations at this point
is nonempty, and contains specializations for the caller A<0>::f<1>.
This patch makes us call push_to_top_level during satisfaction, which'll
temporarily clear local_specializations for us.
PR c++/103714
gcc/cp/ChangeLog:
* constraint.cc (satisfy_declaration_constraints): Do
push_to_top_level and pop_from_top_level around the call to
satisfy_normalized_constraints.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-uneval5.C: New test.
Before match-and-simplify was used in phiopt, the locations of the
new statements were all that of the conditional. This adds that
back, as I did not realize gimple_simplify didn't do that for you.
OK? Bootstrapped and tested on x86_64 with no regressions.
gcc/ChangeLog:
* tree-ssa-phiopt.c (gimple_simplify_phiopt): Annotate the
new sequence with the location of the conditional statement.
A common idiom is to create a DImode value from the "concat" of two SImode
values, using "(long long)hi << 32 | (long long)lo", where the operation
may be ior, xor or plus. On x86, with -m32, the high and low parts of
a DImode register are actually different SImode registers (typically %edx
and %eax) so ideally this idiom should reduce to two move instructions
(or optimally, just clever register allocation).
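The three spellings of the idiom can be sketched as below. This is an illustration only: the function names are invented for the example, and unsigned types are used (rather than long long as in the text) so that the shift is well defined for every input.

```c
#include <stdint.h>

/* The shifted high half and the zero-extended low half have no set
   bits in common, so ior, xor and plus all build the same 64-bit
   value.  */
uint64_t
concat_ior (uint32_t hi, uint32_t lo)
{
  return ((uint64_t) hi << 32) | (uint64_t) lo;
}

uint64_t
concat_xor (uint32_t hi, uint32_t lo)
{
  return ((uint64_t) hi << 32) ^ (uint64_t) lo;
}

uint64_t
concat_plus (uint32_t hi, uint32_t lo)
{
  return ((uint64_t) hi << 32) + (uint64_t) lo;
}
```

Since no carries or overlapping bits can occur, each variant is equivalent to writing hi into the upper 32 bits and lo into the lower 32 bits of the result.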
Unfortunately, GCC currently performs the IOR operation above on -m32,
and worse allocates DImode registers (split to SImode register pairs)
for both the zero extended HI and LO values.
Hence, for test1 from the new test case below:
typedef int __v4si __attribute__ ((__vector_size__ (16)));
long long test1(__v4si v) {
unsigned int loVal = (unsigned int)v[0];
unsigned int hiVal = (unsigned int)v[1];
return (long long)(loVal) | ((long long)(hiVal) << 32);
}
we currently generate (with -m32 -O2 -msse4.1):
test1: subl $28, %esp
pextrd $1, %xmm0, %eax
pmovzxdq %xmm0, %xmm1
movq %xmm1, 8(%esp)
movl %eax, %edx
movl 8(%esp), %eax
orl 12(%esp), %edx
addl $28, %esp
orb $0, %ah
ret
with this patch we now generate:
test1: pextrd $1, %xmm0, %edx
movd %xmm0, %eax
ret
The fix is to recognize and split the idiom (hi<<32)|zext(lo) prior
to register allocation on !TARGET_64BIT, simplifying this sequence to
"highpart(dst) = hi; lowpart(dst) = lo".
The one minor complication is that sse.md's define_insn for
*vec_extractv4si_0_zext_sse4 can sometimes interfere with this
optimization. It turns out that on !TARGET_64BIT, the zero_extend:DI
following vec_select:SI isn't free, and this insn gets split back
into multiple instructions during later passes, but too late to
be optimized away by this patch/reload. Hence the last hunk of
this patch is to restrict *vec_extractv4si_0_zext_sse4 to TARGET_64BIT.
Checking PR target/80286, where *vec_extractv4si_0_zext_sse4 was
first added, this seems reasonable.
2021-12-18 Roger Sayle <roger@nextmovesoftware.com>
Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog
PR target/103611
* config/i386/i386.md (any_or_plus): New code iterator.
(define_split): Split (HI<<32)|zext(LO) into piece-wise
move instructions on !TARGET_64BIT.
* config/i386/sse.md (*vec_extractv4si_0_zext_sse4):
Restrict to TARGET_64BIT.
gcc/testsuite/ChangeLog
PR target/103611
* gcc.target/i386/pr103611-2.c: New test case.
This patch adds support for an -Oz command line option, aggressively
optimizing for size at the expense of performance. GCC's current -Os
provides a reasonable balance of size and performance, whereas -Oz is
probably only useful for code size benchmarks such as CSiBE. Or so I
thought until I read in https://news.ycombinator.com/item?id=25408853
that clang's -Oz sometimes outperforms -O[23s]; I suspect modern instruction
decode stages can treat "pushq $1; popq %rax" as a short uop encoding.
Instead of introducing a new global variable, this patch simply abuses
the existing optimize_size by setting its value to 2. The only change
in behaviour is the tweak to the i386 backend implementing the suggestion
in PR target/32803 to use a short push/pop sequence for loading small
immediate values (-128..127) on x86, matching the behaviour of LLVM.
On x86_64, the simple function:
int foo() { return 25; }
currently generates with -Os:
foo: movl $25, %eax // 5 bytes
ret
With the proposed -Oz, it generates:
foo: pushq $25 // 2 bytes
popq %rax // 1 byte
ret
On CSiBE, this results in a 0.94% improvement (3703513 bytes total
down to 3668516 bytes).
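The size trade-off can be modelled roughly as follows. This is a simplified sketch, not the condition used in i386.md: the function names are hypothetical, and the byte counts are just the ones quoted in the example above (5 for movl with a 32-bit immediate, 2 + 1 for the push/pop pair).

```c
#include <stdbool.h>

/* True if VALUE fits in a sign-extended 8-bit immediate (-128..127).  */
bool
fits_simm8 (int value)
{
  return value >= -128 && value <= 127;
}

/* Simplified model of the bytes needed to load VALUE into a register.
   OPTIMIZE_SIZE mirrors the text: 1 stands for -Os, 2 for -Oz.  */
int
load_imm_size (int value, int optimize_size)
{
  if (optimize_size > 1 && fits_simm8 (value))
    return 2 + 1;	/* pushq $imm8; popq %reg  */
  return 5;		/* movl $imm32, %reg  */
}
```

In this model the push/pop form saves two bytes per small immediate load, but only when optimize_size is 2.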
2021-12-18 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR target/32803
* common.opt (Oz): New command line option.
* doc/invoke.texi: Document the new -Oz option.
* lto-wrapper.c (merge_and_complain, append_compiler_options):
Treat OPT_Oz as synonymous with OPT_Os.
* optc-save-gen.awk: Increase maximum value of optimize_size to 2.
* opts.c (default_options_optimization) [OPT_Oz]: Handle OPT_Oz
just like OPT_Os, except set opt->x_optimize_size to 2.
(common_handle_option): Skip OPT_Oz just like OPT_Os.
* config/i386/i386.md (*movdi_internal): Use a push/pop sequence
for suitable SImode TYPE_IMOV moves when optimize_size > 1.
(*movsi_internal): Likewise.
gcc/testsuite/ChangeLog
PR target/32803
* gcc.target/i386/pr32803.c: New test case.
Since all computations in tree-object-size are now done in sizetype and
not HOST_WIDE_INT, comparisons with HOST_WIDE_INT based unknown and
initval would be incorrect. Instead, use the sizetype trees directly to
generate and evaluate initval and unknown size values.
gcc/ChangeLog:
PR tree-optimization/103759
* tree-object-size.c (unknown, initval): Remove functions.
(size_unknown, size_initval, size_unknown_p): Operate directly
on trees.
Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
Functions from <ctype.h> should only be called on values that can be
represented by unsigned char. On targets where char is a signed type,
some of libgfortran calls have undefined behaviour.
The solution is to cast the argument to unsigned char type. I’ve defined
macros in libgfortran.h to do so, to retain legibility of the library
code.
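A minimal sketch of such wrappers is shown below. The macro and function names are illustrative assumptions; the ones actually added to libgfortran.h may be spelled differently.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical wrappers: convert to unsigned char first, so that a
   negative 'char' value is never passed to a <ctype.h> function.  */
#define safe_isdigit(c) isdigit ((unsigned char) (c))
#define safe_toupper(c) toupper ((unsigned char) (c))

/* Count the decimal digits in the NUL-terminated string S.  This is
   safe even if S contains bytes above 0x7f on a signed-char target.  */
size_t
count_digits (const char *s)
{
  size_t n = 0;
  for (; *s != '\0'; s++)
    if (safe_isdigit (*s))
      n++;
  return n;
}
```

Call sites keep their shape: a use of isdigit (c) simply becomes safe_isdigit (c).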
PR libfortran/95177
libgfortran/ChangeLog
* libgfortran.h: Include ctype.h, provide safe macros.
* io/format.c: Use safe macros.
* io/list_read.c: Use safe macros.
* io/read.c: Use safe macros.
* io/write.c: Use safe macros.
* runtime/environ.c: Use safe macros.
The current GCC branch will become 12.1.0, which will be the stable
version of GCC when the next macOS version is released. There are some
places in GCC that don’t handle darwin22 as a version, so we need to
future-proof it (gcc/config.gcc and gcc/config/darwin-driver.c). We
align that code with what Apple clang does, i.e. accept all potential
major macOS versions until 99.
This patch also homogenises the handling of darwin version numbers,
where the majority of places use darwin2*, but some used darwin2[0-9]*.
Since there never was a darwin2.x version, the two are equivalent, and
we prefer the simpler darwin2*.
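The equivalence of the two patterns can be illustrated with POSIX fnmatch, on the assumption that it matches these patterns the same way as the shell "case" statements in config.gcc. The helper name is made up for the example.

```c
#include <fnmatch.h>
#include <stdbool.h>

/* Return true if the glob PATTERN matches the OS part of a target
   triplet, e.g. "darwin21".  */
bool
os_matches (const char *pattern, const char *os)
{
  return fnmatch (pattern, os, 0) == 0;
}
```

Both "darwin2*" and "darwin2[0-9]*" accept names such as darwin20, darwin21 and darwin22. They differ only for a name like "darwin2.1", which never existed.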
gcc/ChangeLog:
* config/darwin-driver.c: Make version code more future-proof.
* config.gcc: Homogenize darwin versions.
* configure.ac: Homogenize darwin versions.
* configure: Regenerate.
gcc/testsuite/ChangeLog:
* gcc.dg/darwin-minversion-link.c: Test darwin21.
* obj-c++.dg/cxx-ivars-3.mm: Homogenize darwin versions.
* obj-c++.dg/objc-gc-3.mm: Homogenize darwin versions.
* objc.dg/objc-gc-4.m: Homogenize darwin versions.
My patch to implement -Wno-attributes=A::b caused a bogus error when
parsing
[[foo::bar(1, 2)]];
when -Wno-attributes=foo::bar was specified on the command line, because
when we create a fake foo::bar attribute and insert it into our attribute
table, it is created with max_length == 0 which doesn't allow any args.
That is wrong -- we know nothing about the attribute, so we shouldn't
require any specific number of arguments. And since unknown attributes
can be rather complex (see for example omp::{directive,sequence}), we
must skip parsing their arguments. To that end, I'm using max_length
with value -2.
Also let's not warn about things like
[[vendor::assume(true)]];
because they may have some meaning (this is reminiscent of C++ Portable
Assumptions).
PR c/103649
gcc/ChangeLog:
* attribs.c (handle_ignored_attributes_option): Create the fake
attribute with max_length == -2.
(attribute_ignored_p): New overloads.
* attribs.h (attribute_ignored_p): Declare them.
* tree-core.h (struct attribute_spec): Document that max_length
can be -2.
gcc/c/ChangeLog:
* c-decl.c (c_warn_unused_attributes): Don't warn for
attribute_ignored_p.
* c-parser.c (c_parser_std_attribute): Skip parsing of the attribute
arguments when the attribute is ignored.
gcc/cp/ChangeLog:
* parser.c (cp_parser_declaration): Don't warn for attribute_ignored_p.
(cp_parser_std_attribute): Skip parsing of the attribute
arguments when the attribute is ignored.
gcc/testsuite/ChangeLog:
* c-c++-common/Wno-attributes-6.c: New test.
To match the test's expectations for toolchains
configured to default to less capable CPUs.
2021-12-17 Olivier Hainque <hainque@adacore.com>
gcc/testsuite/
* gcc.target/powerpc/pr97142.c: Add -mdejagnu-cpu=power7
to the dg-options.
For code like
template<typename>
struct bar;
struct bar {
int baz;
};
bar var;
we emit a fairly misleading and unwieldy diagnostic:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
$ g++ -c u.cc
u.cc:6:8: error: template argument required for 'struct bar'
6 | struct bar {
| ^~~
u.cc:10:5: error: class template argument deduction failed:
10 | bar var;
| ^~~
u.cc:10:5: error: no matching function for call to 'bar()'
u.cc:3:17: note: candidate: 'template<class> bar()-> bar< <template-parameter-1-1> >'
3 | friend struct bar;
| ^~~
u.cc:3:17: note: template argument deduction/substitution failed:
u.cc:10:5: note: couldn't deduce template parameter '<template-parameter-1-1>'
10 | bar var;
| ^~~
u.cc:3:17: note: candidate: 'template<class> bar(bar< <template-parameter-1-1> >)-> bar< <template-parameter-1-1> >'
3 | friend struct bar;
| ^~~
u.cc:3:17: note: template argument deduction/substitution failed:
u.cc:10:5: note: candidate expects 1 argument, 0 provided
10 | bar var;
| ^~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
but with this patch we get:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
z.C:4:10: error: class template 'bar' redeclared as non-template
4 | struct bar {
| ^~~
z.C:2:10: note: previous declaration here
2 | struct bar;
| ^~~
z.C:8:7: error: 'bar<...auto...> var' has incomplete type
8 | bar var;
| ^~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
which is clearer about what the problem is.
I thought it'd be nice to avoid printing the messages about failed CTAD,
too. To that end, I'm using CLASSTYPE_ERRONEOUS to suppress CTAD. Not
sure if that's entirely kosher.
The other direction (first a non-template class declaration followed by
a class template definition) we handle quite well:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
z.C:11:8: error: 'bar' is not a template
11 | struct bar {};
| ^~~
z.C:8:8: note: previous declaration here
8 | struct bar;
| ^~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
PR c++/103749
gcc/cp/ChangeLog:
* decl.c (lookup_and_check_tag): Give an error when a class was
declared as template but no template header has been provided.
* pt.c (do_class_deduction): Don't deduce CLASSTYPE_ERRONEOUS
types.
gcc/testsuite/ChangeLog:
* g++.dg/template/redecl4.C: Adjust dg-error.
* g++.dg/diagnostic/redeclaration-2.C: New test.
Make the darn testcases work (and be tested) in 32-bit mode as well.
They used to ICE, but they no longer do.
2021-12-17 Segher Boessenkool <segher@kernel.crashing.org>
gcc/testsuite/
PR target/103624
* gcc.target/powerpc/darn-0.c: Remove target clause.
* gcc.target/powerpc/darn-1.c: Remove target clause. Remove lp64
requirement. Change return type to long.
* gcc.target/powerpc/darn-2.c: Ditto.
* gcc.target/powerpc/darn-3.c: Remove target clause.
The builtins now all return "long". The patterns have :GPR as the
output mode, so they can be 32-bit as well (the instruction makes sense
in 32 bit just fine). The builtins expand to the DImode version
normally, but to the SImode version if {32bit} is true.
2021-12-17 Segher Boessenkool <segher@kernel.crashing.org>
PR target/103624
* config/rs6000/rs6000-builtins.def (__builtin_darn): Expand to
darn_64_di. Add {32bit} attribute. Return long.
(__builtin_darn_32): Expand to darn_32_di. Add {32bit} attribute.
Return long.
(__builtin_darn_raw): Expand to darn_raw_di. Add {32bit} attribute.
Return long.
* config/rs6000/rs6000-call.c (rs6000_expand_builtin): Expand the darn
builtins to the _si variants for -m32.
* config/rs6000/rs6000.md (UNSPECV_DARN_32, UNSPECV_DARN_RAW): Delete.
(UNSPECV_DARN): Update comment.
(darn_32, darn_raw, darn): Delete.
(darn_32_<mode>, darn_64_<mode>, darn_raw_<mode> for GPR): New.
(@darn<mode> for GPR): New.
The way in which a C++20 coroutine is specified discards any value
that might be returned from the initial or final await expressions.
This ICE was caused by an initial await expression with an
await_resume () returning a reference; the function rewrite code
was not set up to expect this.
Fixed by looking through any indirection present and by explicitly
discarding the value, if any, returned by await_resume().
It does not seem useful to make a diagnostic for this, since
the user could define a generic awaiter that usefully returns
values when used in a different position from the initial (or
final) await expressions.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
PR c++/100127
gcc/cp/ChangeLog:
* coroutines.cc (coro_rewrite_function_body): Handle initial
await expressions that try to produce a reference value.
gcc/testsuite/ChangeLog:
* g++.dg/coroutines/pr100127.C: New test.
The wording of the standard has been clarified to be explicit that
the parameters to any user-defined operator-new in the promise
class should be lvalues.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
PR c++/100772
gcc/cp/ChangeLog:
* coroutines.cc (morph_fn_to_coro): Convert function parms
from reference before constructing any operator-new args
list.
gcc/testsuite/ChangeLog:
* g++.dg/coroutines/pr100772-a.C: New test.
* g++.dg/coroutines/pr100772-b.C: New test.
This PR was fixed by r12-5255-gdaa9c6b015; this adds
the testcase.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/testsuite/ChangeLog:
PR c++/96517
* g++.dg/coroutines/pr96517.C: New test.
rs6000-overload.def defines one instance of vec_promote so that it can be
registered with the front end. Actual expansion of the vec_promote overload
is done with special-case code in rs6000-c.c. During another cleanup, I
observed that the fake instance has the wrong number of arguments. Fix that.
2021-12-17 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
* config/rs6000/rs6000-overload.def (__builtin_vec_promote): Add second
argument.
This PR shows that I didn't properly test the multi-vector case when
adding support for SLP gather loads. The patch fixes that case using
the same approach as we do for non-SLP cases: keep the scalar base
the same, but iterate through the (also multi-vector) vector offsets.
“vec_num * j + i” is already used elsewhere as a way of handling both
the multi-vector SLP case and the multi-vector non-SLP case.
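The indexing scheme can be sketched as follows. This is an illustration only and not code from tree-vect-stmts.c; flatten_indices and its parameters are names chosen for the example.

```c
#include <stddef.h>

/* Visit NCOPIES * VEC_NUM vector slots and store the flattened index
   vec_num * j + i of each slot in OUT, in visiting order.  Returns the
   number of slots written.  */
size_t
flatten_indices (size_t ncopies, size_t vec_num, size_t *out)
{
  size_t n = 0;
  for (size_t j = 0; j < ncopies; j++)
    for (size_t i = 0; i < vec_num; i++)
      out[n++] = vec_num * j + i;
  return n;
}
```

With vec_num == 1 the index reduces to j and with ncopies == 1 it reduces to i, so a single formula serves both cases.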
gcc/
PR tree-optimization/103744
* tree-vect-stmts.c (vectorizable_load): Handle multi-vector
SLP gather loads.
gcc/testsuite/
PR tree-optimization/103744
* gcc.dg/vect/pr103744-1.c: New test.
* gcc.dg/vect/pr103744-2.c: Likewise.
It seems I forgot to check that the operations we're combining when masking the
predicates together are actually predicate types.
Without it we end up accidentally trying to combine a value and a mask.
gcc/ChangeLog:
PR tree-optimization/103741
* tree-vect-stmts.c (vectorizable_operation): Check for boolean.
gcc/testsuite/ChangeLog:
PR tree-optimization/103741
* gcc.target/aarch64/pr103741.c: New test.
There was a race condition where the link for the new shared EH library
(only used on earlier Darwin) could fail because the new crts had not been
copied to the gcc directory. This can cause a build failure (although
currently only seen on powerpc-darwin).
Fixed by adding specific dependency on the crts and on the multi target.
We also add the declaration header for the Darwin10 unwinder shim to the
powerpc cases, since we build that there for Rosetta use.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
libgcc/ChangeLog:
* config.host: Add shim declaration header to powerpc*-darwin builds.
* config/rs6000/t-darwin-ehs: Remove dependency on the powerpc end
file.
* config/t-darwin-ehs: Add dependencies to the shared unwinder
objects.
* config/t-slibgcc-darwin: Add extra_parts to the dependencies for
the shared EH lib. Add all-multi to the dependencies for the
libgcc_s.1.dylib redirections.
We were pushing a spec value for weak_reference_mismatches unconditionally
which is not needed (the value was the default) and the side-effect of
this was that we appeared to need to drive a link command; leading to
unexpected diagnostics for cases where gcc was invoked with an empty
command line.
Also we were pushing flags for sysroot, os minimum version and controls
even if the command line was empty.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/ChangeLog:
* config/darwin-driver.c (darwin_driver_init): Exit from the
option handling early if the command line is definitely empty.
* config/darwin.h (SUBTARGET_DRIVER_SELF_SPECS): Remove
setting for the default content of weak_reference_mismatches.
This adds a missed change from r12-5974-g926d64906af.
The builtin_decls array has been renamed to drop the trailing
_x that was used during the main changes to the builtins.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/ChangeLog:
* config/rs6000/darwin.h: Drop trailing _x from the
builtin_decls array name.
Recognize the __builtin_dynamic_object_size builtin and add paths in the
object size path to deal with it, but treat it like
__builtin_object_size for now. Also add tests to provide the same
testing coverage for the new builtin name.
gcc/ChangeLog:
* builtins.def (BUILT_IN_DYNAMIC_OBJECT_SIZE): New builtin.
* tree-object-size.h: Move object size type bits enum from
tree-object-size.c and add new value OST_DYNAMIC.
* builtins.c (expand_builtin, fold_builtin_2): Handle it.
(fold_builtin_object_size): Handle new builtin and adjust for
change to compute_builtin_object_size.
* tree-object-size.c: Include builtins.h.
(compute_builtin_object_size): Adjust.
(early_object_sizes_execute_one,
dynamic_object_sizes_execute_one): New functions.
(object_sizes_execute): Rename insert_min_max_p argument to
early. Handle BUILT_IN_DYNAMIC_OBJECT_SIZE and call the new
functions.
* doc/extend.texi (__builtin_dynamic_object_size): Document new
builtin.
gcc/testsuite/ChangeLog:
* g++.dg/ext/builtin-dynamic-object-size1.C: New test.
* g++.dg/ext/builtin-dynamic-object-size2.C: Likewise.
* gcc.dg/builtin-dynamic-alloc-size.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-1.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-10.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-11.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-12.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-13.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-14.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-15.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-16.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-17.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-18.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-19.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-2.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-3.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-4.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-5.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-6.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-7.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-8.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-9.c: Likewise.
* gcc.dg/builtin-object-size-16.c: Adjust to allow inclusion
from builtin-dynamic-object-size-16.c.
* gcc.dg/builtin-object-size-17.c: Likewise.
Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
Transform tree-object-size to operate on tree objects instead of host
wide integers. This makes it easier to extend to dynamic expressions
for object sizes.
The compute_builtin_object_size interface also now returns a tree
expression instead of HOST_WIDE_INT, so callers have been adjusted to
account for that.
The trees in object_sizes are each an object_size object with members
size (the bytes from the pointer to the end of the object) and wholesize
(the size of the whole object). This allows analysis of negative
offsets, which can now be allowed to the extent of the object bounds.
Tests have been added to verify that it actually works.
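The role of the two members can be illustrated with a simplified model using plain integers instead of trees. This is a sketch under assumptions, not the code in tree-object-size.c: the struct layout, the helper name and the choice to return 0 for out-of-bounds offsets are all illustrative.

```c
#include <stddef.h>

/* Simplified model of one object_sizes entry (size <= wholesize).  */
struct object_size
{
  size_t size;		/* Bytes from the pointer to the end of the object.  */
  size_t wholesize;	/* Size of the whole object.  */
};

/* Bytes remaining after moving the pointer by OFFSET, or 0 if the
   result would fall outside the bounds of the object.  */
size_t
size_after_offset (struct object_size os, long offset)
{
  /* Distance of the pointer from the start of the object.  */
  size_t pos = os.wholesize - os.size;

  if (offset < 0)
    {
      /* Magnitude of OFFSET, computed without overflowing.  */
      size_t back = (size_t) -(offset + 1) + 1;
      return back > pos ? 0 : os.size + back;
    }

  return (size_t) offset > os.size ? 0 : os.size - (size_t) offset;
}
```

For a pointer 8 bytes into a 32-byte object (size 24, wholesize 32), an offset of -8 is within bounds and yields the full 32 bytes, while -9 is not.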
gcc/ChangeLog:
* tree-object-size.h (compute_builtin_object_size): Return tree
instead of HOST_WIDE_INT.
* builtins.c (fold_builtin_object_size): Adjust.
* gimple-fold.c (gimple_fold_builtin_strncat): Likewise.
* ubsan.c (instrument_object_size): Likewise.
* tree-object-size.c (object_size): New structure.
(object_sizes): Change type to vec<object_size>.
(initval): New function.
(unknown): Use it.
(size_unknown_p, size_initval, size_unknown): New functions.
(object_sizes_unknown_p): Use it.
(object_sizes_get): Return tree.
(object_sizes_initialize): Rename from object_sizes_set_force
and set VAL parameter type as tree. Add new parameter WHOLEVAL.
(object_sizes_set): Set VAL parameter type as tree and adjust
implementation. Add new parameter WHOLEVAL.
(size_for_offset): New function.
(decl_init_size): Adjust comment.
(addr_object_size): Change PSIZE parameter to tree and adjust
implementation. Add new parameter PWHOLESIZE.
(alloc_object_size): Return tree.
(compute_builtin_object_size): Return tree in PSIZE.
(expr_object_size, call_object_size, unknown_object_size):
Adjust for object_sizes_set change.
(merge_object_sizes): Drop OFFSET parameter and adjust
implementation for tree change.
(plus_stmt_object_size): Call collect_object_sizes_for directly
instead of merge_object_size and call size_for_offset to get net
size.
(cond_expr_object_size, collect_object_sizes_for,
object_sizes_execute): Adjust for change of type from
HOST_WIDE_INT to tree.
(check_for_plus_in_loops_1): Likewise and skip non-positive
offsets.
gcc/testsuite/ChangeLog:
* gcc.dg/builtin-object-size-1.c (test9): New test.
(main): Call it.
* gcc.dg/builtin-object-size-2.c (test8): New test.
(main): Call it.
* gcc.dg/builtin-object-size-3.c (test9): New test.
(main): Call it.
* gcc.dg/builtin-object-size-4.c (test8): New test.
(main): Call it.
* gcc.dg/builtin-object-size-5.c (test5, test6, test7): New
tests.
Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>