This patch marks the nios2*-*-* targets obsolete in GCC 14. Intel has
EOL'ed this architecture and the maintainers no longer have access to
hardware for testing. While the port is still in reasonably good
shape at this time, no further testing or updates are planned.
gcc/
* config.gcc: Add nios2*-*-* to the list of obsoleted targets.
contrib/
* config-list.mk (LIST): --enable-obsolete for nios2*-*-*.
2024-04-18 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/114739
* primary.cc (gfc_match_varspec): Check for default type before
checking for derived types with the right component name.
gcc/testsuite/
PR fortran/114739
* gfortran.dg/pr114739.f90: New test.
* gfortran.dg/derived_comp_array_ref_8.f90: Add 'implicit none'
for consistency with expected error message.
* gfortran.dg/nullify_4.f90: Ditto.
* gfortran.dg/pointer_init_6.f90: Ditto.
* gfortran.dg/pr107397.f90: Ditto.
* gfortran.dg/pr88138.f90: Ditto.
Without -msse2, an i586-targeting toolchain fails bf16_short_warn.c
because neither the type __m128bh nor the intrinsic _mm_cvtneps_pbh gets
declared.
for gcc/testsuite/ChangeLog
* gcc.target/i386/bf16_short_warn.c: Add -msse2.
A few x86 tests get unexpected insn counts if the toolchain is
configured with --enable-frame-pointer. Add explicit
-fomit-frame-pointer so that the expected insn sequences are output.
for gcc/testsuite/ChangeLog
* gcc.target/i386/pr107261.c: Add -fomit-frame-pointer.
* gcc.target/i386/pr69482-1.c: Likewise.
* gcc.target/i386/pr69482-2.c: Likewise.
Complete r13-2205, adjusting an arm-specific test that expects a
no-longer-issued error at an empty initializer.
for gcc/testsuite/ChangeLog
* gcc.target/arm/bfloat16_scalar_typecheck.c: Accept C23
empty initializers.
The test expected the address of a literal string, converted to long
long, to yield a positive value. That expectation doesn't necessarily
hold, and the test fails where it doesn't.
Adjust the test to use a pointer that will compare as expected.
for gcc/testsuite/ChangeLog
* g++.dg/contracts/contracts9.C: Don't assume string literals
have non-negative addresses.
pr103798-2.c fails in C++ on targets that provide an ISO C++-compliant
declaration of memchr, because it mismatches the C-compatible builtin,
as per PR113706. Expect the C++ test to fail on vxworks as well.
for gcc/testsuite/ChangeLog
PR testsuite/113706
* c-c++-common/pr103798-2.c: XFAIL in C++ on vxworks too.
Test that calls select fails on vxworks because select is only
declared in sys/select.h. Include that header if it's present.
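A conditional include of this kind can be sketched as follows. This is a minimal illustration, not the actual change made to the test; the helper name fd_is_readable is hypothetical, and the sketch assumes a preprocessor that supports __has_include.

```c
#include <stddef.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Pull in sys/select.h only when the preprocessor can find it.  */
#if defined __has_include
# if __has_include (<sys/select.h>)
#  include <sys/select.h>
# endif
#endif

/* Hypothetical helper: poll FD for readability without blocking.
   Returns select's result: 1 if readable, 0 if not, -1 on error.  */
int
fd_is_readable (int fd)
{
  fd_set readfds;
  struct timeval tv = { 0, 0 };

  FD_ZERO (&readfds);
  FD_SET (fd, &readfds);
  return select (fd + 1, &readfds, NULL, NULL, &tv);
}
```

On systems where select is declared elsewhere, the guarded include is simply skipped and the rest of the file is unchanged.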
for gcc/testsuite/ChangeLog
* gcc.dg/analyzer/fd-glibc-byte-stream-connection-server.c:
Include sys/select.h if present.
Mark tests that fail due to the lack of fork, as in vxworks kernel
mode, as requiring fork.
for gcc/testsuite/ChangeLog
* gcc.dg/analyzer/pipe-glibc.c: Require fork.
* gcc.dg/analyzer/pipe-manpages.c: Likewise.
O_ACCMODE is not defined on vxworks, and the test is meaningless and
failing without it, so skip it.
for gcc/testsuite/ChangeLog
* gcc.dg/analyzer/fd-access-mode-target-headers.c: Skip on
vxworks as well.
Define macro that prevents mode_t from being defined by vxworks'
headers as well.
for gcc/testsuite/ChangeLog
* gcc.dg/analyzer/fd-4.c: Define macro to avoid mode_t on
vxworks.
A number of tests that call strndup fail on vxworks, where there's no
strndup. Some of them already had workarounds to skip the strndup
parts of the tests on platforms that don't offer it. I've changed
them to rely on a strndup effective target instead, and extended the
logic to other tests that were otherwise skipped entirely.
for gcc/ChangeLog
* doc/sourcebuild.texi (strndup): Add effective target.
for gcc/testsuite/ChangeLog
* lib/target-supports.exp (check_effective_target_strndup): New.
* gcc.dg/builtin-dynamic-object-size-0.c: Skip strndup tests
when the function is not available.
* gcc.dg/builtin-dynamic-object-size-1.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-2.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-3.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-4.c: Likewise.
* gcc.dg/builtin-object-size-1.c: Likewise.
* gcc.dg/builtin-object-size-2.c: Likewise.
* gcc.dg/builtin-object-size-3.c: Likewise.
* gcc.dg/builtin-object-size-4.c: Likewise.
On arm-vx7r2, the uses of as.load() as initializer get SRAed, so the
padding bits in the tests are not what we might expect from full-word
struct copies.
I tried adding a function to perform bitwise copying, but even taking
the as.load() argument by const&, we'd still construct a temporary
with SRAed field-wise copying. Unable to find another way to ensure
we wouldn't get a temporary, I went for disabling SRA.
for libstdc++-v3/ChangeLog
* testsuite/29_atomics/atomic/compare_exchange_padding.cc:
Disable SRA.
Tests 20_util/from_chars/4.cc and 20_util/to_chars/long_double.cc were
adjusted about a year ago to skip long double on some targets, because
the fastfloat library was limited to 64-bit doubles.
The same problem comes up in similar float128_t tests on
aarch64-vxworks. This patch adjusts them similarly.
Unlike the earlier tests, which got similar treatment for
x86_64-vxworks, these haven't failed there.
for libstdc++-v3/ChangeLog
* testsuite/20_util/from_chars/8.cc: Skip float128_t testing
on aarch64-vxworks.
* testsuite/20_util/to_chars/float128_c++23.cc: Xfail run on
aarch64-vxworks.
VxWorks fails to load kernel-mode modules with weak undefined symbols.
In RTP-mode modules, which undergo final linking, weak undefined
symbols are not a problem.
This patch adds kernel-mode VxWorks multilibs to the set of targets
that don't support weak undefined symbols without special flags, in
which tzdb's zoneinfo_dir_override is given a weak definition.
for libstdc++-v3/ChangeLog
* src/c++20/tzdb.cc (__gnu_cxx::zoneinfo_dir_override): Define
on VxWorks non-RTP.
In PR114741 we see that we have a regression in codegen when SVE is enabled where
the simple testcase:

void foo(unsigned v, unsigned *p)
{
    *p = v & 1;
}

generates

foo:
        fmov    s31, w0
        and     z31.s, z31.s, #1
        str     s31, [x1]
        ret

instead of:

foo:
        and     w0, w0, 1
        str     w0, [x1]
        ret

This impacts not just code size but also performance. It is caused
by the use of the ^ constraint modifier in the pattern <optab><mode>3.
The documentation states that this modifier should only have an effect on the
alternative costing in that a particular alternative is to be preferred unless
a non-pseudo reload is needed.
The pattern was trying to convey that whenever both r and w are required, that
it should prefer r unless a reload is needed. This is because if a reload is
needed then we can construct the constants more flexibly on the SIMD side.
We were using this to simplify the implementation and to get generic cases such
as:

double negabs (double x)
{
   unsigned long long y;
   memcpy (&y, &x, sizeof(double));
   y = y | (1UL << 63);
   memcpy (&x, &y, sizeof(double));
   return x;
}

which don't go through an expander.
However the implementation of ^ in the register allocator is not according to
the documentation in that it also has an effect during coloring. During initial
register class selection it applies a penalty to a class, similar to how ? does.
In this example the penalty makes the use of GP regs expensive enough that it no
longer considers them:
r106: preferred FP_REGS, alternative NO_REGS, allocno FP_REGS
;; 3--> b 0: i 9 r106=r105&0x1
:cortex_a53_slot_any:GENERAL_REGS+0(-1)FP_REGS+1(1)PR_LO_REGS+0(0)
PR_HI_REGS+0(0):model 4
which is not the expected behavior. For GCC 14 this is a conservative fix.
1. we remove the ^ modifier from the logical optabs.
2. In order not to regress copysign we then move the copysign expansion to
directly use the SIMD variant. Since copysign only supports floating point
modes this is fine and no longer relies on the register allocator to select
the right alternative.
It once again regresses the general case, but this case wasn't optimized in
earlier GCCs either so it's not a regression in GCC 14. This change gives
strictly better codegen than earlier GCCs and still optimizes the important cases.
gcc/ChangeLog:
PR target/114741
* config/aarch64/aarch64.md (<optab><mode>3): Remove ^ from alt 2.
(copysign<GPF:mode>3): Use SIMD version of IOR directly.
gcc/testsuite/ChangeLog:
PR target/114741
* gcc.target/aarch64/fneg-abs_2.c: Update codegen.
* gcc.target/aarch64/fneg-abs_4.c: xfail for now.
* gcc.target/aarch64/pr114741.c: New test.
The following testcase aborts on aarch64-linux but does not on x86_64-linux.
In both cases there is UB in the __divmodbitint4 implementation.
When the divisor is negative with most significant limb (even when partial)
all ones, has at least 2 limbs and the second most significant limb has the
most significant bit clear, when this number is negated, it will have 0
in the most significant limb.
Already in the PR114397 r14-9592 fix I was dealing with such divisors, but
thought the problem arises only when, because of that, un < vn doesn't imply
that the quotient is 0 and the remainder is u.
But as this testcase shows, the problem is with such divisors always.
What happens is that we use __builtin_clz* on the most significant limb,
and assume it will not be 0 because that is UB for the builtins.
Normally the most significant limb of the divisor shouldn't be 0, as
guaranteed by the bitint_reduce_prec e.g. for the positive numbers, unless
the divisor is just 0 (but for vn == 1 we have special cases).
The following patch moves the handling of this corner case a few lines
earlier before the un < vn check, because adjusting the vn later is harder.
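The corner case can be reproduced outside libgcc with a small sketch. It assumes 64-bit limbs stored least significant limb first; negate_limbs and clz_limb are hypothetical helpers for illustration and are not the code in libgcc2.c.

```c
#include <stdint.h>
#include <stddef.h>

/* Negate an N-limb two's complement number stored least significant
   limb first (an assumption of this sketch): invert every limb and
   add one, propagating the carry upward.  */
void
negate_limbs (uint64_t *dst, const uint64_t *src, size_t n)
{
  uint64_t carry = 1;
  for (size_t i = 0; i < n; i++)
    {
      dst[i] = ~src[i] + carry;
      /* The carry survives only if the addition wrapped to zero.  */
      carry = carry && dst[i] == 0;
    }
}

/* Hypothetical guard: count leading zeros of a limb without ever
   passing 0 to __builtin_clzll, which is undefined for that input.  */
int
clz_limb (uint64_t limb)
{
  return limb ? __builtin_clzll (limb) : 64;
}
```

For the limbs { 5, 0x7fffffffffffffff, all ones }, negation leaves 0 in the most significant limb, which is exactly the value that must not reach an unguarded clz builtin.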
2024-04-18 Jakub Jelinek <jakub@redhat.com>
PR libgcc/114755
* libgcc2.c (__divmodbitint4): Perform the decrement on negative
v with most significant limb all ones and the second most
significant limb with most significant bit clear always, regardless of
un < vn.
* gcc.dg/torture/bitint-69.c: New test.
__builtin_{add,sub,mul}_overflow{,_p} builtins are well defined
for all inputs even for -ftrapv, and the -fsanitize=signed-integer-overflow
ifns shouldn't abort in libgcc but emit the desired ubsan diagnostics
or abort depending on -fsanitize* setting regardless of -ftrapv.
The expansion of these internal functions uses expand_expr* in various
places (e.g. MULT_EXPR at least in 2 spots), so temporarily disabling
flag_trapv in all those spots would be hard.
The following patch disables it around the bodies of 3 functions
which can do the expand_expr calls.
If it was in the C++ FE, I'd use some RAII sentinel, but I don't think
we have one in the middle-end.
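As a reminder of the semantics these builtins are expected to keep under -ftrapv, here is a minimal sketch. The wrapper names checked_add and checked_mul are hypothetical; the builtins themselves are the documented GCC ones.

```c
#include <limits.h>
#include <stdbool.h>

/* Add A and B, storing the wrapped two's complement result in *RES
   and returning true if the mathematical sum did not fit in int.  */
bool
checked_add (int a, int b, int *res)
{
  return __builtin_add_overflow (a, b, res);
}

/* Likewise for multiplication.  */
bool
checked_mul (int a, int b, int *res)
{
  return __builtin_mul_overflow (a, b, res);
}
```

A call such as checked_add (INT_MAX, 1, &r) should report overflow through its return value rather than trap, whether or not -ftrapv is in effect.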
2024-04-18 Jakub Jelinek <jakub@redhat.com>
PR middle-end/114753
* internal-fn.cc (expand_mul_overflow): Save flag_trapv and
temporarily clear it for the duration of the function, then
restore previous value.
(expand_vector_ubsan_overflow): Likewise.
(expand_arith_overflow): Likewise.
* gcc.dg/pr114753.c: New test.
Test case builtins-6-p9-runnable.c doesn't work well on BE
due to two problems:
- When applying vec_xl_len onto data_128 and data_u128
with length 8, it expects to load 1280000[01] from
the memory, but unfortunately, when 1280000[01] is assigned to
a {vector} {u,}int128 type variable, the value isn't
guaranteed to be at the beginning of storage (in the
low part of memory), which means the loaded value can
be unexpected (as shown on BE). So this patch
introduces getU128, which ensures the given value
shows up as expected, and also updates some dumping code
for debugging.
- When applying vec_xl_len_r with length 16, on BE it's
just like the normal vector load, so the expected data
should not be reversed from the original.
PR testsuite/114744
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/builtins-6-p9-runnable.c: Adjust for BE by fixing
data_{u,}128, their uses and vec_uc_expected1, also adjust some formats.
gcc/testsuite/
* gcc.target/powerpc/bcd-4.c: Enable the case to be tested on P9.
Enable the case to be run on big endian. Fix function maxbcd and
other misc. problems.
This was recently approved for C++26 at the Tokyo meeting. As suggested
by Stephan T. Lavavej, I'm defining it as an extension for C++23 mode
(when std::print and std::println were first added) rather than as a new
C++26 feature. Both MSVC and libc++ have agreed to do this too.
libstdc++-v3/ChangeLog:
* include/std/ostream (println(ostream&)): Define new overload.
* include/std/print (println(FILE*), println()): Likewise.
* testsuite/27_io/basic_ostream/print/2.cc: New test.
* testsuite/27_io/print/1.cc: Remove unused header.
* testsuite/27_io/print/3.cc: New test.
Starting with GCC 14 we have the nice URLification of the options printed
in diagnostics; for example, in
test.c:4:23: warning: format ‘%d’ expects argument of type ‘int’, but argument 2 has type ‘long int’ [-Wformat=]
the -Wformat= is underlined in some terminals and hovering on it shows
https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wformat
link.
This works nicely on the GCC trunk, where the online documentation is
regenerated every day from a cron job and more importantly, people rarely
use the trunk snapshots for too long, so it is unlikely that further changes
in the documentation will make too many links stale, because users will
simply regularly update to newer snapshots.
I think it doesn't work properly on release branches though.
Some users only use the released versions (i.e. MAJOR.MINOR.0) from tarballs
but can use them for a couple of years, others use snapshots from the
release branches, but again they could be in use for months or years and
the above mentioned online docs which represent just the GCC trunk might
diverge significantly.
Now, for the releases we always publish also online docs for the release,
which unlike the trunk online docs will not change further, under
e.g.
https://gcc.gnu.org/onlinedocs/gcc-14.1.0/gcc/Warning-Options.html#index-Wformat
or
https://gcc.gnu.org/onlinedocs/gcc-14.2.0/gcc/Warning-Options.html#index-Wformat
etc.
So, I think at least for the MAJOR.MINOR.0 releases we want to use
URLs like above rather than the trunk ones and we can use the same process
of updating *.opt.urls as well for that.
For the snapshots from release branches, we don't have such docs.
One option (implemented in the patch below for the URL printing side) is
to point to the MAJOR.MINOR.0 docs even for MAJOR.MINOR.1 snapshots.
Most of the links will work fine; options newly added on the release
branches (a rare thing, but it still happens) will have no URLs until the
next point release and get them then.
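The URL printing side of that option can be sketched as below. This is an illustration only, not the code in opts.cc or gcc-urlifier.cc: the function name make_doc_url_sketch, its parameters, and the explicit on_release_branch flag are assumptions made for the example.

```c
#include <stdio.h>
#include <string.h>

/* Build the documentation URL for URL_SUFFIX into BUF.  On a release
   branch, insert a gcc-MAJOR.MINOR.0/ infix after ROOT so the link
   targets the frozen docs of the release; on trunk, link to ROOT
   directly.  ON_RELEASE_BRANCH is an assumption of this sketch; how
   the real code detects a release branch is not shown here.
   Returns the snprintf result.  */
int
make_doc_url_sketch (char *buf, size_t size, const char *root,
                     int major, int minor, int on_release_branch,
                     const char *url_suffix)
{
  /* Make sure exactly one '/' separates the root from what follows.  */
  size_t len = strlen (root);
  const char *sep = (len > 0 && root[len - 1] == '/') ? "" : "/";

  if (on_release_branch)
    return snprintf (buf, size, "%s%sgcc-%d.%d.0/%s",
                     root, sep, major, minor, url_suffix);
  return snprintf (buf, size, "%s%s%s", root, sep, url_suffix);
}
```

With root https://gcc.gnu.org/onlinedocs/, version 14.1 and the suffix gcc/Warning-Options.html#index-Wformat, the release-branch form matches the gcc-14.1.0 URL quoted above, and the trunk form matches the unversioned one.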
The question is what to do about make regenerate-opt-urls for the release
branch snapshots. One possibility is to document that users shouldn't run
make regenerate-opt-urls on release branches (and should filter out
*.opt.urls changes from their commits), make running it the RM's
responsibility before the first release candidate is made from a branch,
and adjust the autoregen CI to know about that. Another is to add a
separate goal which, instead of relying on the files created by make html,
would download a copy of the html files from the last release from the web
(kind of mirroring the https://gcc.gnu.org/onlinedocs/gcc-14.1.0/ subtree
locally) and run regenerate-opt-urls on top of that. But how do we catch
the point when the first release candidate is made and we want to update
to what will be the URLs once the release is made (but which will be stale
URLs for a week or so)?
Another option would be to add to cron daily regeneration of the online
docs for the release branches. I don't think that is a good idea though,
because as I wrote earlier, not all users update to the latest snapshot
frequently, so there can be users that use gcc 13.1.1 20230525 for months
or years, and other users which use gcc 13.1.1 20230615 for years etc.
Another question is what is most sensible for users who want to override
the default root and use the --with-documentation-root-url= configure
option. Do we expect them to grab the whole onlinedocs tree or for release
branches at least include gcc-14.1.0/ subdirectory under the root?
If so, the patch below deals with that. Or should we just change the
default documentation root url, so if user doesn't specify
--with-documentation-root-url= and we are on a release branch, default that
to https://gcc.gnu.org/onlinedocs/gcc-14.1.0/ or
https://gcc.gnu.org/onlinedocs/gcc-14.2.0/ etc. and don't add any infix in
get_option_url/make_doc_url, but when people supply their own, let them
point to the root of the tree which contains the right docs?
Then such changes would go into gcc/configure.ac, some case based on
"$gcc_version", from that decide if it is a release branch or trunk.
2024-04-17 Jakub Jelinek <jakub@redhat.com>
PR other/114738
* opts.cc (get_option_url): On release branches append
gcc-MAJOR.MINOR.0/ after DOCUMENTATION_ROOT_URL.
* gcc-urlifier.cc (gcc_urlifier::make_doc_url): Likewise.
As discussed in the PR, aclocal.m4 and configure were incorrectly
regenerated at some point.
2024-04-17 Christophe Lyon <christophe.lyon@linaro.org>
PR preprocessor/114748
libcpp/
* aclocal.m4: Regenerate.
* configure: Regenerate.
The following makes sure to reset LOOP_VINFO_USING_PARTIAL_VECTORS_P
to its default of false when re-trying without SLP as otherwise
analysis may run into bogus asserts.
PR tree-optimization/114749
* tree-vect-loop.cc (vect_analyze_loop_2): Reset
LOOP_VINFO_USING_PARTIAL_VECTORS_P when re-trying without SLP.
... as made apparent by a number of unexpectedly UNSUPPORTED test cases, which
now all turn into PASS, with just one exception:
PASS: gcc.dg/vect/vect-early-break_124-pr114403.c (test for excess errors)
PASS: gcc.dg/vect/vect-early-break_124-pr114403.c execution test
FAIL: gcc.dg/vect/vect-early-break_124-pr114403.c scan-tree-dump vect "LOOP VECTORIZED"
..., which needs to be looked into, separately.
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_vect_long_long):
Enable for GCN.
This resolves failing tests in check-simd.
Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:
PR libstdc++/114750
* include/experimental/bits/simd_builtin.h
(_SimdImplBuiltin::_S_load, _S_store): Fall back to copying
scalars if the memory type cannot be vectorized for the target.
.ABNORMAL_DISPATCHER is currently the only internal function with
ECF_NORETURN, and asan likes to instrument ECF_NORETURN calls by adding
some builtin call before them, which breaks the .ABNORMAL_DISPATCHER
discovery added in gsi_safe_*.
The following patch fixes asan not to instrument .ABNORMAL_DISPATCHER
calls, like it doesn't instrument a couple of specific builtin calls
as well.
2024-04-17 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/114743
* asan.cc (maybe_instrument_call): Don't instrument calls to
.ABNORMAL_DISPATCHER.
* gcc.dg/asan/pr112709-2.c (freddy): New function from
gcc.dg/ubsan/pr112709-2.c version of the test.
The testcase had the wrong indices in the buffer check loop.
gcc/testsuite/ChangeLog:
PR tree-optimization/114403
* gcc.dg/vect/vect-early-break_124-pr114403.c: Fix check loop.
F2008 requires for ALLOCATE with SOURCE= or MOLD= specifier that the kind
type parameters of allocate-object and source-expr have the same values.
Add compile-time diagnostics for different character length and a runtime
check (under -fcheck=bounds). Use length from allocate-object to prevent
heap corruption and to allow string padding or truncation on assignment.
gcc/fortran/ChangeLog:
PR fortran/113793
* resolve.cc (resolve_allocate_expr): Reject ALLOCATE with SOURCE=
or MOLD= specifier for unequal length.
* trans-stmt.cc (gfc_trans_allocate): If an allocatable character
variable has fixed length, use it and do not use the source length.
With bounds-checking enabled, add a runtime check for same length.
gcc/testsuite/ChangeLog:
PR fortran/113793
* gfortran.dg/allocate_with_source_29.f90: New test.
* gfortran.dg/allocate_with_source_30.f90: New test.
* gfortran.dg/allocate_with_source_31.f90: New test.
This just adds a clause to make it more obvious that the vector_size
attribute extension works with typedefs.
Note this whole section needs a rewrite to match the format of the other
extensions. But that is for another day.
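A short sketch of what the added clause describes, assuming a target where int is 4 bytes; the names myint, v4si and add_v4si are chosen for illustration only.

```c
/* A typedef of a scalar type is acceptable as the base type.  */
typedef int myint;
typedef myint v4si __attribute__ ((vector_size (16)));

/* Element-wise addition of two 4 x int vectors.  */
v4si
add_v4si (v4si a, v4si b)
{
  return a + b;
}
```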
gcc/ChangeLog:
PR c/92880
* doc/extend.texi (Using Vector Instructions): Add that
the base_types could be a typedef of them.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
The following fixes a DFS walk issue when identifying latch edges to be
ignored. We have (bogus) SLP_TREE_REPRESENTATIVEs for VEC_PERM
nodes so those have to be explicitly ignored as possibly being PHIs.
PR tree-optimization/114736
* tree-vect-slp.cc (vect_optimize_slp_pass::is_cfg_latch_edge):
Do not consider VEC_PERM_EXPRs as PHI use.
* gfortran.dg/vect/pr114736.f90: New testcase.
The neg induction vectorization code isn't prepared to deal with
single element vectors.
PR tree-optimization/114733
* tree-vect-loop.cc (vectorizable_nonlinear_induction): Reject
neg induction vectorization of single element vectors.
* gcc.dg/vect/pr114733.c: New testcase.
This patch adjusts the implementation of the acc_map_data/acc_unmap_data API library
routines to better fit the description in the OpenACC 2.7 specification.
Instead of using REFCOUNT_INFINITY, we now define a REFCOUNT_ACC_MAP_DATA
special value to mark acc_map_data-created mappings. Adjustments to
mapping-related code to respect OpenACC semantics are also added.
libgomp/ChangeLog:
* libgomp.h (REFCOUNT_ACC_MAP_DATA): Define as (REFCOUNT_SPECIAL | 2).
* oacc-mem.c (acc_map_data): Adjust to use REFCOUNT_ACC_MAP_DATA,
initialize dynamic_refcount as 1.
(acc_unmap_data): Adjust to use REFCOUNT_ACC_MAP_DATA.
(goacc_map_var_existing): Add REFCOUNT_ACC_MAP_DATA case.
(goacc_exit_datum_1): Add REFCOUNT_ACC_MAP_DATA case, respect
REFCOUNT_ACC_MAP_DATA when decrementing/finalizing. Force lowest
dynamic_refcount to be 1 for REFCOUNT_ACC_MAP_DATA.
(goacc_enter_data_internal): Add REFCOUNT_ACC_MAP_DATA case.
* target.c (gomp_increment_refcount): Return early for
REFCOUNT_ACC_MAP_DATA case.
(gomp_decrement_refcount): Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-96.c: New testcase.
* testsuite/libgomp.oacc-c-c++-common/unmap-infinity-1.c: Adjust
testcase error output scan test.
While studying the TYPE_CANONICAL/TYPE_STRUCTURAL_EQUALITY_P stuff,
I've noticed some nits in comments, the following patch fixes them.
2024-04-16 Jakub Jelinek <jakub@redhat.com>
* tree.cc (array_type_nelts): Ensure 2 spaces after . in comment
instead of just one.
(build_variant_type_copy): Likewise.
(tree_check_failed): Likewise.
(build_atomic_base): Likewise.
* ipa-free-lang-data.cc (fld_incomplete_type_of): Use an indefinite
article rather than a.
..., until <https://github.com/Rust-GCC/gccrs/issues/2898>
"'cargo' should build for the host system" is resolved.
Follow-up to commit 3e1e73fc99
"build: Check for cargo when building rust language".
* configure.ac (have_cargo): Force to "no" in Canadian cross
configurations.
* configure: Regenerate.
Follow-up to commit 3e1e73fc99
"build: Check for cargo when building rust language":
On 2024-04-15T13:14:42+0200, I wrote:
> I now wonder: instead of 'AC_CHECK_TOOL', shouldn't this use
> 'AC_CHECK_PROG'? (We always want plain 'cargo', not host-prefixed
> 'aarch64-linux-gnu-cargo' etc., right?) I'll look into changing this.
* configure: Regenerate.
config/
* acx.m4 (ACX_PROG_CARGO): Use 'AC_CHECK_PROGS'.
https://eel.is/c++draft/bit.cast#3 says that std::bit_cast isn't constexpr
if To, From and the types of all subobjects have certain properties which the
check_bit_cast_type checks (such as it isn't a pointer, reference, union,
member pointer, volatile). The function doesn't cp_walk_tree though, so
I've missed one important case, for ARRAY_TYPEs we need to recurse on the
element type. I think we don't need to handle VECTOR_TYPEs/COMPLEX_TYPEs,
because those will not have a pointer/reference/union/member pointer in
the element type and if the element type is volatile, I think the whole
derived type is volatile as well.
2024-04-16 Jakub Jelinek <jakub@redhat.com>
PR c++/114706
* constexpr.cc (check_bit_cast_type): Handle ARRAY_TYPE.
* g++.dg/cpp2a/bit-cast17.C: New test.
When one of the two input operands is 0, ADD and IOR are functionally
equivalent.
ADD is slightly preferred over IOR because ADD has a higher likelihood
of being implemented as a compressed instruction when compared to IOR.
C.ADD uses the CR format with any of the 32 RVI registers available,
while C.OR uses the CA format, which is limited to just 8 of them.
Conditional select, if zero case:
rd = (rc == 0) ? rs1 : rs2
before patch:
czero.nez rd, rs1, rc
czero.eqz rtmp, rs2, rc
or rd, rd, rtmp
after patch:
czero.nez rd, rs1, rc
czero.eqz rtmp, rs2, rc
add rd, rd, rtmp
Same trick applies for the conditional select, if non-zero case:
rd = (rc != 0) ? rs1 : rs2
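The equivalence can be checked with a small C model of the two Zicond instructions. This is a sketch for illustration, not the expansion code in riscv.cc; the function names are hypothetical, and the czero semantics follow the Zicond specification (the result is zero when the condition holds, otherwise the source operand).

```c
#include <stdint.h>

/* Models of the Zicond instructions: the result is zero when the
   condition register matches the test, otherwise the source value.  */
static uint64_t
czero_eqz (uint64_t rs1, uint64_t rc)
{
  return rc == 0 ? 0 : rs1;
}

static uint64_t
czero_nez (uint64_t rs1, uint64_t rc)
{
  return rc != 0 ? 0 : rs1;
}

/* rd = (rc == 0) ? rs1 : rs2, combined with OR.  */
uint64_t
select_eqz_or (uint64_t rc, uint64_t rs1, uint64_t rs2)
{
  return czero_nez (rs1, rc) | czero_eqz (rs2, rc);
}

/* Same select combined with ADD: one addend is always zero.  */
uint64_t
select_eqz_add (uint64_t rc, uint64_t rs1, uint64_t rs2)
{
  return czero_nez (rs1, rc) + czero_eqz (rs2, rc);
}
```

Because exactly one of the two czero results is zero for any rc, the sum can never carry, so ADD and OR give the same value.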
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_expand_conditional_move):
Replace IOR with ADD when expanding zicond if possible.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zicond-prefer-add-to-or.c: New test.
The earlier patch for PR112938 arranged for volatile parms to be made
indirect in internal strub wrapped bodies.
The first problem that remained, more evident, was that the indirected
parameter remained volatile, despite the indirection, but it wasn't
regimplified, so indirecting it was malformed gimple.
Regimplifying turned out not to be needed. The best course of action
was to drop the volatility from the by-reference parm, that was being
unexpectedly inherited from the original volatile parm.
That exposed another problem: the dereferences would then lose their
volatile status, so we had to bring volatile back to them.
for gcc/ChangeLog
PR middle-end/112938
* ipa-strub.cc (pass_ipa_strub::execute): Drop volatility from
indirected parm.
(maybe_make_indirect): Restore volatility in dereferences.
for gcc/testsuite/ChangeLog
PR middle-end/112938
* g++.dg/strub-internal-pr112938.cc: New.