This removes the #error from <bits/semaphore_base.h> for the case where
neither __atomic_semaphore nor __platform_semaphore is defined.
Also rename the _GLIBCXX_REQUIRE_POSIX_SEMAPHORE macro to
_GLIBCXX_USE_POSIX_SEMAPHORE for consistency with the similar
_GLIBCXX_USE_CXX11_ABI macro that can be used to request an alternative
(ABI-changing) implementation.
libstdc++-v3/ChangeLog:
PR libstdc++/100179
* include/bits/semaphore_base.h: Remove #error.
* include/std/semaphore: Do not define anything unless one of
the implementations is available.
This is a substantial rewrite of the atomic wait/notify (and timed wait
counterparts) implementation.
The previous __platform_wait looped on EINTR however this behavior is
not required by the standard. A new _GLIBCXX_HAVE_PLATFORM_WAIT macro
now controls whether wait/notify are implemented using a platform
specific primitive or with a platform agnostic mutex/condvar. This
patch only supplies a definition for linux futexes. A future update
could add support __ulock_wait/wake on Darwin, for instance.
The members of __waiters were lifted to a new base class. The members
are now arranged such that overall sizeof(__waiter_pool_base) fits in
two cache lines (on platforms with at least 64 byte cache lines). The
definition will also use destructive_interference_size for this if it is
available.
The __waiters type is now specific to untimed waits, and is renamed to
__waiter_pool. Timed waits have a corresponding __timed_waiter_pool
type. Much of the code has been moved from the previous __atomic_wait()
free function to the __waiter_base template and a __waiter derived type
is provided to implement the un-timed wait operations. A similar change
has been made to the timed wait implementation.
The __atomic_spin code has been extended to take a spin policy which is
invoked after the initial busy wait loop. The default policy is to
return from the spin. The timed wait code adds a timed backoff spinning
policy. The code from <thread> which implements this_thread::sleep_for,
sleep_until has been moved to a new <bits/std_thread_sleep.h> header
which allows the thread sleep code to be consumed without pulling in the
whole of <thread>.
The entry points into the wait/notify code have been restructured to
support either -
* Testing the current value of the atomic stored at the given address
and waiting on a notification.
* Applying a predicate to determine if the wait was satisfied.
The entry points were renamed to make it clear that the wait and wake
operations operate on addresses. The first variant takes the expected
value and a function which returns the current value that should be used
in comparison operations, these operations are named with a _v suffix
(e.g. 'value'). All atomic<_Tp> wait/notify operations use the first
variant. Barriers, latches and semaphores use the predicate variant.
This change also centralizes what it means to compare values for the
purposes of atomic<T>::wait rather than scattering through individual
predicates.
This change also centralizes the repetitive code which adjusts for
different user supplied clocks (this should be moved elsewhere
and all such adjustments should use a common implementation).
This change also removes the hashing of the pointer and uses
the pointer value directly for indexing into the waiters table.
libstdc++-v3/ChangeLog:
* include/Makefile.am: Add new <bits/this_thread_sleep.h> header.
* include/Makefile.in: Regenerate.
* include/bits/this_thread_sleep.h: New file.
* include/bits/atomic_base.h: Adjust all calls
to __atomic_wait/__atomic_notify for new call signatures.
* include/bits/atomic_timed_wait.h: Extensive rewrite.
* include/bits/atomic_wait.h: Likewise.
* include/bits/semaphore_base.h: Adjust all calls
to __atomic_wait/__atomic_notify for new call signatures.
* include/std/atomic: Likewise.
* include/std/barrier: Likewise.
* include/std/latch: Likewise.
* include/std/semaphore: Likewise.
* include/std/thread (this_thread::sleep_for)
(this_thread::sleep_until): Move to new header.
* testsuite/29_atomics/atomic/wait_notify/bool.cc: Simplify
test.
* testsuite/29_atomics/atomic/wait_notify/generic.cc: Likewise.
* testsuite/29_atomics/atomic/wait_notify/pointers.cc: Likewise.
* testsuite/29_atomics/atomic_flag/wait_notify/1.cc: Likewise.
* testsuite/29_atomics/atomic_float/wait_notify.cc: Likewise.
* testsuite/29_atomics/atomic_integral/wait_notify.cc: Likewise.
* testsuite/29_atomics/atomic_ref/wait_notify.cc: Likewise.
This implements the wording changes of P2259R1 "Repairing input range
adaptors and counted_iterator", which resolves LWG 3283, 3289 and 3408.
The wording changes are relatively straightforward, but they require
some boilerplate to implement: the changes to make a type alias
"conditionally present" in some iterator class are realized by defining
a base class template and an appropriately constrained partial
specialization thereof that contains the type alias, and having the
iterator class derive from this base class. Sometimes the relevant
condition depend on members from the iterator class, but because a
class is incomplete when instantiating its bases, the constraints on
the partial specialization can't use anything from the iterator class.
This patch works around this by hoisting these members out to the
enclosing scope (e.g. transform_view::_Iterator::_Base is hoisted out
to transform_view::_Base so that transform_view::__iter_cat can use it).
This patch also implements the proposed resolution of LWG 3291 to rename
iota_view::iterator_category to iota_view::iterator_concept, which was
previously problematic due to LWG 3408.
libstdc++-v3/ChangeLog:
PR libstdc++/95983
* include/bits/stl_iterator.h (__detail::__move_iter_cat):
Define.
(move_iterator): Derive from the above in C++20 in order to
conditionally define iterator_category as per P2259.
(move_iterator::__base_cat): No longer used, so remove.
(move_iterator::iterator_category): Remove in C++20.
(__detail::__common_iter_use_postfix_proxy): Define.
(common_iterator::_Proxy): Rename to ...
(common_iterator:__arrow_proxy): ... this.
(common_iterator::__postfix_proxy): Define as per P2259.
(common_iterator::operator->): Adjust.
(common_iterator::operator++): Adjust as per P2259.
(iterator_traits<common_iterator>::_S_iter_cat): Define.
(iterator_traits<common_iterator>::iterator_category): Change as
per P2259.
(__detail::__counted_iter_value_type): Define.
(__detail::__counted_iter_concept): Define.
(__detail::__counted_iter_cat): Define.
(counted_iterator): Derive from the above three classes in order
to conditionally define value_type, iterator_concept and
iterator category respectively as per P2259.
(counted_iterator::operator->): Define as per P2259.
(incrementable_traits<counted_iterator>): Remove as per P2259.
(iterator_traits<counted_iterator>): Adjust as per P2259.
* include/std/ranges (__detail::__iota_view_iter_cat): Define.
(iota_view::_Iterator): Derive from the above in order to
conditionally define iterator_category as per P2259.
(iota_view::_S_iter_cat): Rename to ...
(iota_view::_S_iter_concept): ... this.
(iota_view::iterator_concept): Use it to apply LWG 3291 changes.
(iota_view::iterator_category): Remove.
(__detail::__filter_view_iter_cat): Define.
(filter_view::_Iterator): Derive from the above in order to
conditionally define iterator_category as per P2259.
(filter_view::_Iterator): Move to struct __filter_view_iter_cat.
(filter_view::_Iterator::iterator_category): Remove.
(transform_view::_Base): Define.
(transform_view::__iter_cat): Define.
(transform_view::_Iterator): Derive from the above in order to
conditionally define iterator_category as per P2259.
(transform_view::_Iterator::_Base): Just alias
transform_view::_Base.
(transform_view::_Iterator::_S_iter_cat): Move to struct
transform_view::__iter_cat.
(transform_view::_Iterator::iterator_category): Remove.
(transform_view::_Sentinel::_Base): Just alias
transform_view::_Base.
(join_view::_Base): Define.
(join_view::_Outer_iter): Define.
(join_view::_Inner_iter): Define.
(join_view::_S_ref_is_glvalue): Define.
(join_view::__iter_cat): Define.
(join_view::_Iterator): Derive from it in order to conditionally
define iterator_category as per P2259.
(join_view::_Iterator::_Base): Just alias join_view::_Base.
(join_view::_Iterator::_S_ref_is_glvalue): Just alias
join_view::_S_ref_is_glvalue.
(join_view::_Iterator::_S_iter_cat): Move to struct
transform_view::__iter_cat.
(join_view::_Iterator::_Outer_iter): Just alias
join_view::_Outer_iter.
(join_view::_Iterator::_Inner_iter): Just alias
join_view::_Inner_iter.
(join_view::_Iterator::iterator_category): Remove.
(join_view::_Sentinel::_Base): Just alias join_view::_Base.
(__detail::__split_view_outer_iter_cat): Define.
(__detail::__split_view_inner_iter_cat): Define.
(split_view::_Base): Define.
(split_view::_Outer_iter): Derive from __split_view_outer_iter_cat
in order to conditionally define iterator_category as per P2259.
(split_view::_Outer_iter::iterator_category): Remove.
(split_view::_Inner_iter): Derive from __split_view_inner_iter_cat
in order to conditionally define iterator_category as per P2259.
(split_view::_Inner_iter::_S_iter_cat): Move to
__split_view_inner_iter_cat.
(split_view::_Inner_iter::iterator_category): Remove.
(elements_view::_Base): Define.
(elements_view::__iter_cat): Define.
(elements_view::_Iterator): Derive from the above in order to
conditionall define iterator_category as per P2259.
(elements_view::_Iterator::_Base): Just alias
elements_view::_Base.
(elements_view::_Iterator::_S_iter_concept)
(elements_view::_Iterator::iterator_concept): Define as per
P2259.
(elements_view::_Iterator::iterator_category): Remove.
(elements_view::_Sentinel::_Base): Just alias
elements_view::_Base.
* testsuite/24_iterators/headers/iterator/synopsis_c++20.cc:
Adjust constraints on iterator_traits<counted_iterator>.
* testsuite/std/ranges/p2259.cc: New test.
This defines the feature test macro when uselocale is available, because
the floating-point std::from_chars support currently depends on that.
Co-authored-by: Jakub Jelinek <jakub@redhat.com>
libstdc++-v3/ChangeLog:
PR libstdc++/100146
* include/std/charconv (__cpp_lib_to_chars): Define
conditionally.
* include/std/version (__cpp_lib_to_chars): Likewise..
* testsuite/20_util/from_chars/4.cc: Only check feature test
macro, not _GLIBCXX_HAVE_USELOCALE.
* testsuite/20_util/from_chars/5.cc: Likewise.
* testsuite/20_util/from_chars/6.cc: Likewise.
* testsuite/20_util/to_chars/long_double.cc: Likewise.
The <ranges> header currently copies some simple algorithms from
<bits/ranges_algo.h>, which was originally done in order to avoid a
circular dependency with the header. This is no longer an issue since
the latter header now includes <bits/ranges_util.h> instead of all of
<ranges>.
This means we could now just include <bits/ranges_algo.h> and remove the
copied algorithms, but that'd increase the size of <ranges> by ~10%.
And we can't use the corresponding STL-style algorithms here because
they assume input iterators are copyable. So this patch instead
simplifies these copied algorithms, removing their constraints and
unused parameters, and keeps them around. In a subsequent patch we're
going to copy (a simplified version of) ranges::find into <ranges> as
well.
libstdc++-v3/ChangeLog:
* include/std/ranges (__detail::find_if): Simplify.
(__detail::find_if_not): Likewise.
(__detail::min): Remove.
(__detail::mismatch): Simplify.
(take_view::size): Use std::min instead of __detail::min.
While we're modifying elements_view, this also implements the one-line
resolution of LWG 3492.
libstdc++-v3/ChangeLog:
* include/std/ranges (__detail::__returnable_element): New
concept.
(elements_view): Use this concept in its constraints. Add
missing private access specifier.
(elements_view::_S_get_element): Define as per LWG 3502.
(elements_view::operator*, elements_view::operator[]): Use
_S_get_element.
(elements_view::operator++): Remove unnecessary constraint
as per LWG 3492.
* testsuite/std/ranges/adaptors/elements.cc (test05): New test.
This rewrites our range adaptor implementation for more comprehensible
error messages, improved SFINAE behavior and conformance to P2281.
The diagnostic improvements mostly come from using appropriately named
functors instead of lambdas in the generic implementation of partial
application and composition of range adaptors, and in the definition of
each of the standard range adaptors. This makes their pretty printed
types much shorter and more self-descriptive.
The improved SFINAE behavior comes from constraining the range adaptors'
member functions appropriately. This improvement fixes PR99433, and is
also necessary in order to implement the wording changes of P2281.
Finally, P2281 clarified that partial application and composition of
range adaptors behaves like a perfect forwarding call wrapper. This
patch implements this, except that we don't bother adding overloads for
forwarding captured state entities as non-const lvalues, since it seems
sufficient to handle the const lvalue and non-const rvalue cases for now,
given the current set of standard range adaptors. But such overloads
can be easily added if they turn out to be needed.
libstdc++-v3/ChangeLog:
PR libstdc++/99433
* include/std/ranges (__adaptor::__maybe_refwrap): Remove.
(__adaptor::__adaptor_invocable): New concept.
(__adaptor::__adaptor_partial_app_viable): New concept.
(__adaptor::_RangeAdaptorClosure): Rewrite, turning it into a
non-template base class.
(__adaptor::_RangeAdaptor): Rewrite, turning it into a CRTP base
class template.
(__adaptor::_Partial): New class template that represents
partial application of a range adaptor non-closure.
(__adaptor::__pipe_invocable): New concept.
(__adaptor::_Pipe): New class template.
(__detail::__can_ref_view): New concept.
(__detail::__can_subrange): New concept.
(all): Replace the lambda here with ...
(_All): ... this functor. Add appropriate constraints.
(__detail::__can_filter_view): New concept.
(filter, _Filter): As in all/_All.
(__detail::__can_transform): New concept.
(transform, _Transform): As in all/_All.
(__detail::__can_take_view): New concept.
(take, _Take): As in all/_All.
(__detail::__can_take_while_view): New concept.
(take_while, _TakeWhile): As in all/_All.
(__detail::__can_drop_view): New concept.
(drop, _Drop): As in all/_All.
(__detail::__can_drop_while_view): New concept.
(drop_while, _DropWhile): As in all/_All.
(__detail::__can_join_view): New concept.
(join, _Join): As in all/_All.
(__detail::__can_split_view): New concept.
(split, _Split): As in all/_All. Rename template parameter
_Fp to _Pattern.
(__detail::__already_common): New concept.
(__detail::__can_common_view): New concept.
(common, _Common): As in all/_All.
(__detail::__can_reverse_view): New concept.
(reverse, _Reverse): As in all/_All.
(__detail::__can_elements_view): New concept.
(elements, _Elements): As in all/_All.
(keys, values): Adjust.
* testsuite/std/ranges/adaptors/99433.cc: New test.
* testsuite/std/ranges/adaptors/all.cc: No longer expect that
adding empty range adaptor closure objects to a pipeline doesn't
increase the size of the pipeline.
(test05): New test.
* testsuite/std/ranges/adaptors/common.cc (test03): New test.
* testsuite/std/ranges/adaptors/drop.cc (test09): New test.
* testsuite/std/ranges/adaptors/drop_while.cc (test04): New test.
* testsuite/std/ranges/adaptors/elements.cc (test04): New test.
* testsuite/std/ranges/adaptors/filter.cc (test06): New test.
* testsuite/std/ranges/adaptors/join.cc (test09): New test.
* testsuite/std/ranges/adaptors/p2281.cc: New test.
* testsuite/std/ranges/adaptors/reverse.cc (test07): New test.
* testsuite/std/ranges/adaptors/split.cc (test01, test04):
Adjust.
(test09): New test.
* testsuite/std/ranges/adaptors/split_neg.cc (test01): Adjust
expected error message.
(test02): Likewise. Extend test.
* testsuite/std/ranges/adaptors/take.cc (test06): New test.
* testsuite/std/ranges/adaptors/take_while.cc (test05): New test.
* testsuite/std/ranges/adaptors/transform.cc (test07, test08):
New test.
Tim Song pointed out that using __underlying_type is ill-formed for
incomplete enumeration types, and is_scoped_enum doesn't require a
complete type. This changes the trait to check for conversion to int
instead of to the underlying type.
In order to give the correct result when the trait is used in the
enumerator-list of an incomplete type the partial specialization for
enums has an additional check that fails for incomplete types. This
assumes that an incompelte enumeration type must be an unscoped
enumeration, and so the primary template (with a std::false_type base
characteristic) can be used. This isn't necessarily true, but it is not
currently possible to refer to a scoped enumeration type before its type
is complete (PR c++/89025).
It should be possible to use requires(remove_cv_t<_Tp> __t) in the
partial specialization's assignablility check, but that currently gives
an ICE (PR c++/99968) so there is an extra partial specialization of
is_scoped_enum<const _Tp> to handle const types.
libstdc++-v3/ChangeLog:
* include/std/type_traits (is_scoped_enum<T>): Constrain partial
specialization to not match incomplete enum types. Use a
requires-expression instead of instantiating is_convertible.
(is_scoped_enum<const T>): Add as workaround for PR c++/99968.
* testsuite/20_util/is_scoped_enum/value.cc: Check with
incomplete types and opaque-enum-declarations.
Add [[nodiscard]] to functions that are effectively just a static_cast,
as per P2351. Also add it to std::addressof.
libstdc++-v3/ChangeLog:
* include/bits/move.h (forward, move, move_if_noexcept)
(addressof): Add _GLIBCXX_NODISCARD.
* include/bits/ranges_cmp.h (identity::operator()): Add
nodiscard attribute.
* include/c_global/cstddef (to_integer): Likewise.
* include/std/bit (bit_cast): Likewise.
* include/std/utility (as_const, to_underlying): Likewise.
std::ranges::reverse_view uses make_reverse_iterator in its
implementation as described in [range.reverse.view]. This accidentally
allows ADL as an unqualified name is used in the call. According to
[contents], however, this should be treated as a qualified lookup into
the std namespace.
This leads to errors due to ambiguous name lookups when another
make_reverse_iterator function is found via ADL.
libstdc++-v3/Changelog:
* include/std/ranges (reverse_view::begin, reverse_view::end):
Qualify make_reverse_iterator calls to avoid ADL.
* testsuite/std/ranges/adaptors/reverse.cc: Test that
views::reverse works when make_reverse_iterator is defined
in an associated namespace.
This implements the new string_view constructor proposed by P1989R2.
This hasn't been voted into the C++23 draft yet, but it's been reviewed
by LWG and is expected to be approved at the next WG21 meeting.
libstdc++-v3/ChangeLog:
* include/std/string_view (basic_string_view(Range&&)): Define new
constructor and deduction guide.
* testsuite/21_strings/basic_string_view/cons/char/range_c++20.cc: New test.
* testsuite/21_strings/basic_string_view/cons/wchar_t/range_c++20.cc: New test.
Implement this C++23 feature, as proposed by P1048R1.
This implementation assumes that a C++23 compiler supports concepts
already. I don't see any point in using preprocessor hacks to detect
compilers which define __cplusplus to a post-C++20 value but don't
support concepts yet.
libstdc++-v3/ChangeLog:
* include/std/type_traits (is_scoped_enum): Define.
* include/std/version (__cpp_lib_is_scoped_enum): Define.
* testsuite/20_util/is_scoped_enum/value.cc: New test.
* testsuite/20_util/is_scoped_enum/version.cc: New test.
The new std::call_once implementation is not backwards compatible,
contrary to my intention. Because std::once_flag::_M_active() doesn't
write glibc's "fork generation" into the pthread_once_t object, it's
possible for glibc and libstdc++ to run two active executions
concurrently. This violates the primary invariant of the feature!
This patch reverts std::once_flag and std::call_once to the old
implementation that uses pthread_once. This means PR 66146 is a problem
again, but glibc has been changed to solve that. A new API similar to
pthread_once but supporting failure and resetting the pthread_once_t
will be proposed for inclusion in glibc and other C libraries.
This change doesn't simply revert r11-4691 because I want to retain the
new implementation for non-ghtreads targets (which didn't previously
support std::call_once at all, so there's no backwards compatibility
concern). This also leaves the new std::call_once::_M_activate() and
std::call_once::_M_finish(bool) symbols present in libstdc++.so.6 so
that code already compiled against GCC 11 can still use them. Those
symbols will be removed in a subsequent commit (which distros can choose
to temporarily revert if needed).
libstdc++-v3/ChangeLog:
PR libstdc++/99341
* include/std/mutex [_GLIBCXX_HAVE_LINUX_FUTEX] (once_flag):
Revert to pthread_once_t implementation.
[_GLIBCXX_HAVE_LINUX_FUTEX] (call_once): Likewise.
* src/c++11/mutex.cc [_GLIBCXX_HAVE_LINUX_FUTEX]
(struct __once_flag_compat): New type matching the reverted
implementation of once_flag using futexes.
(once_flag::_M_activate): Remove, replace with ...
(_ZNSt9once_flag11_M_activateEv): ... alias symbol.
(once_flag::_M_finish): Remove, replace with ...
(_ZNSt9once_flag9_M_finishEb): ... alias symbol.
* testsuite/30_threads/call_once/66146.cc: Removed.
The standard only specifies that barrier::arrival_token is a move
constructible and move assignable type. We originally used a scoped enum
type, but that means we do not diagnose non-portable code that makes
copies of arrival tokens (or compares them for equality, or uses them as
keys in map!) This wraps the enum in a move-only class type, so that
users are forced to pass it correctly.
The move constructor and move assignment operator of the new class do
not zero out the moved-from token, as that would add additional
instructions. That means that passing a moved-from token will work with
our implementation, despite being a bug in the user code. We could
consider doing that zeroing out in debug mode.
libstdc++-v3/ChangeLog:
* include/std/barrier (barrier::arrival_token): New move-only
class that encapsulates the underlying token value.
As Lewis Baker wrote in the PR:
> The 'fetch_sub()' operation in _M_release_ownership() should be using
> memory_order::acq_rel instead of memory_order::release. The use of
> 'release' only is insufficient as it does not synchronise with any
> corresponding 'acquire' operation.
> With the current implementation, it's possible that a prior write to
> one of the _M_value or _M_head data-members by a thread releasing the
> second-to-last reference might not be visible to another thread that
> releases the last reference and frees the memory, resulting in
> potential write to freed memory.
This simply changes the memory order to acq_rel as suggested.
libstdc++-v3/ChangeLog:
PR libstdc++/99537
* include/std/stop_token (_Stop_state_t::_M_release_ownership):
Use acq_rel memory ordering.
As can be seen on:
unsigned char f1 (unsigned char x, int y) { return std::rotl (x, y); }
unsigned char f2 (unsigned char x, int y) { return std::rotr (x, y); }
unsigned short f3 (unsigned short x, int y) { return std::rotl (x, y); }
unsigned short f4 (unsigned short x, int y) { return std::rotr (x, y); }
unsigned int f5 (unsigned int x, int y) { return std::rotl (x, y); }
unsigned int f6 (unsigned int x, int y) { return std::rotr (x, y); }
unsigned long int f7 (unsigned long int x, int y) { return std::rotl (x, y); }
unsigned long int f8 (unsigned long int x, int y) { return std::rotr (x, y); }
unsigned long long int f9 (unsigned long long int x, int y) { return std::rotl (x, y); }
unsigned long long int f10 (unsigned long long int x, int y) { return std::rotr (x, y); }
//unsigned __int128 f11 (unsigned __int128 x, int y) { return std::rotl (x, y); }
//unsigned __int128 f12 (unsigned __int128 x, int y) { return std::rotr (x, y); }
constexpr auto a = std::rotl (1234U, 0);
constexpr auto b = std::rotl (1234U, 5);
constexpr auto c = std::rotl (1234U, -5);
constexpr auto d = std::rotl (1234U, -__INT_MAX__ - 1);
the current <bit> definitions of std::__rot[lr] aren't pattern recognized
as rotates, they are too long/complex for that, starting with signed modulo,
special case for 0 and different cases for positive and negative.
For types with power of two bits the following patch adds definitions that
the compiler can pattern recognize and turn e.g. on x86_64 into ro[lr][bwlq]
instructions. For weirdo types like unsigned __int20 etc. it keeps the
current definitions.
2021-03-06 Jakub Jelinek <jakub@redhat.com>
PR libstdc++/99396
* include/std/bit (__rotl, __rotr): Add optimized variants for power of
two _Nd which the compiler can pattern match the rotates.
The conversions to integer types are explicit, so need to use the
correct type. Converting to uint32_t only works if that is the same type
as unsigned.
libstdc++-v3/ChangeLog:
PR libstdc++/99301
* include/std/chrono (year_month_day::_M_days_since_epoch()):
Convert chrono::month and chrono::day to unsigned before
converting to uint32_t.
Implement P1682R2 as just approved for C++23.
libstdc++-v3/ChangeLog:
* include/std/utility (to_underlying): Define.
* include/std/version (__cpp_lib_to_underlying): Define.
* testsuite/20_util/to_underlying/1.cc: New test.
* testsuite/20_util/to_underlying/version.cc: New test.
This patch reimplements std::chrono::year_month_day_last:day() which yields the
last day of a particular month. The current implementation uses a look-up table
implemented as an unsigned[12] array. The new implementation instead
is based on
the fact that a month m in [1, 12], except for m == 2 (February), is
either 31 or
30 days long and m's length depends on two things: m's parity and whether m >= 8
or not. These two conditions are determined by the 0th and 3th bit of m and,
therefore, cheap and straightforward bit-twiddling can provide the right result.
Measurements in x86_64 [1] suggest a 10% performance boost. Although this does
not seem to be huge, notice that measurements are done in hot L1 cache
conditions which might not be very representative of production runs. Also
freeing L1 cache from holding the look-up table might allow performance
improvements elsewhere.
References:
[1] https://github.com/cassioneri/calendar
libstdc++-v3/ChangeLog:
* include/std/chrono (year_month_day_last:day): New
implementation.
This patch reimplements std::chrono::year::is_leap(). Leap year check is
ubiquitously implemented (including here) as:
y % 4 == 0 && (y % 100 != 0 || y % 400 == 0).
The rationale being that testing divisibility by 4 first implies an earlier
return for 75% of the cases, therefore, avoiding the needless calculations of
y % 100 and y % 400. Although this fact is true, it does not take into account
the cost of branching. This patch, instead, tests divisibility by 100 first:
(y % 100 != 0 || y % 400 == 0) && y % 4 == 0.
It is certainly counterintuitive that this could be more efficient since among
the three divisibility tests (4, 100 and 400) the one by 100 is the only one
that can never provide a definitive answer and a second divisibility test (by 4
or 400) is always required. However, measurements [1] in x86_64 suggest this is
3x more efficient! A possible explanation is that checking divisibility by 100
first implies a split in the execution path with probabilities of (1%, 99%)
rather than (25%, 75%) when divisibility by 4 is checked first. This decreases
the entropy of the branching distribution which seems to help prediction.
Given that y belongs to [-32767, 32767] [time.cal.year.members], a more
efficient algorithm [2] to check divisibility by 100 is used (instead of
y % 100 != 0). Measurements suggest that this optimization improves performance
by 20%.
The patch adds a test that exhaustively compares the result of this
implementation with the ubiquitous one for all y in [-32767, 32767]. Although
its completeness, the test completes in a matter of seconds.
References:
[1] https://stackoverflow.com/a/60646967/1137388
[2] https://accu.org/journals/overload/28/155/overload155.pdf#page=16
libstdc++-v3/ChangeLog:
* include/std/chrono (year::is_leap): New implementation.
* testsuite/std/time/year/2.cc: New test.
This patch reimplements std::chrono::year_month_day::_M_days_since_epoch()
which calculates the number of elapsed days since 1970/01/01. The new
implementation is based on Proposition 6.2 of Neri and Schneider, "Euclidean
Affine Functions and Applications to Calendar Algorithms" available at
https://arxiv.org/abs/2102.06959.
The aforementioned paper benchmarks the implementation against several
counterparts, including libc++'s (which is identical to the current
implementation). The results, shown in Figure 3, indicate the new algorithm is
1.7 times faster than the current one.
The patch adds a test which loops through all dates in [-32767/01/01,
32767/12/31], and for each of them, gets the number of days and compares the
result against its expected value. The latter is calculated using a much
simpler and easy to understand algorithm but which is also much slower.
The dates used in the test covers the full range of possible values
[time.cal.year.members]. Despite its completeness the test runs in matter of
seconds.
libstdc++-v3/ChangeLog:
* include/std/chrono (year_month_day::_M_days_since_epoch):
New implementation.
* testsuite/std/time/year_month_day/4.cc: New test.
This patch reimplements std::chrono::year_month_day::_S_from_days() which
retrieves a date from the number of elapsed days since 1970/01/01. The new
implementation is based on Proposition 6.3 of Neri and Schneider, "Euclidean
Affine Functions and Applications to Calendar Algorithms" available at
https://arxiv.org/abs/2102.06959.
The aforementioned paper benchmarks the implementation against several
counterparts, including libc++'s (which is identical to the current
implementation). The results, shown in Figure 4, indicate the new algorithm is
2.2 times faster than the current one.
The patch adds a test which loops through all integers in [-12687428, 11248737],
and for each of them, gets the corresponding date and compares the result
against its expected value. The latter is calculated using a much simpler and
easy to understand algorithm but which is also much slower.
The interval used in the test covers the full range of values for which a
roundtrip must work [time.cal.ymd.members]. Despite its completeness the test
runs in a matter of seconds.
libstdc++-v3/ChangeLog:
* include/std/chrono (year_month_day::_S_from_days): New
implementation.
* testsuite/std/time/year_month_day/3.cc: New test.
The once_flag::_M_activate() function is only ever called immediately
after a call to once_flag::_M_passive(), and so in the non-gthreads case
it is impossible for _M_passive() to be true in the body of
_M_activate(). Add a check for it anyway, to avoid warnings about
missing return.
Also replace a non-reserved name with a reserved one.
libstdc++-v3/ChangeLog:
* include/std/mutex (once_flag::_M_activate()): Add explicit
return statement for passive case.
(once_flag::_M_finish(bool)): Use reserved name for parameter.
The coroutine_handle<void>::from_address(void*) version is already
noexcept, and they do the same thing. Make them consistent.
libstdc++-v3/ChangeLog:
PR libstdc++/99021
* include/std/coroutine (coroutine_handle<P>::from_address): Add
noexcept.
This implements WG21 P1679R3, adding contains member functions to
basic_string_view and basic_string.
libstdc++-v3/ChangeLog:
* include/bits/basic_string.h (basic_string::contains): New
member functions.
* include/std/string_view (basic_string_view::contains):
Likewise.
* include/std/version (__cpp_lib_string_contains): Define.
* testsuite/21_strings/basic_string/operations/starts_with/char/1.cc:
Remove trailing whitespace.
* testsuite/21_strings/basic_string/operations/starts_with/wchar_t/1.cc:
Likewise.
* testsuite/21_strings/basic_string/operations/contains/char/1.cc: New test.
* testsuite/21_strings/basic_string/operations/contains/wchar_t/1.cc: New test.
* testsuite/21_strings/basic_string_view/operations/contains/char/1.cc: New test.
* testsuite/21_strings/basic_string_view/operations/contains/char/2.cc: New test.
* testsuite/21_strings/basic_string_view/operations/contains/wchar_t/1.cc: New test.
The patch adding these files was approved in 2020 but it wasn't
committed until 2021, so the copyright years were not updated along with
the years in all the existing files.
libstdc++-v3/ChangeLog:
* include/std/barrier: Update copyright years. Fix whitespace.
* include/std/version: Fix whitespace.
* testsuite/30_threads/barrier/1.cc: Update copyright years.
* testsuite/30_threads/barrier/2.cc: Likewise.
* testsuite/30_threads/barrier/arrive.cc: Likewise.
* testsuite/30_threads/barrier/arrive_and_drop.cc: Likewise.
* testsuite/30_threads/barrier/arrive_and_wait.cc: Likewise.
* testsuite/30_threads/barrier/completion.cc: Likewise.
This patch conditionally disables the floating-point std::to_chars
implementation on targets whose float and double aren't IEEE binary32
and binary64, until a proper fallback can be added for such targets.
This fixes a bootstrap failure on non-IEEE-754 FP targets such as
vax-netbsdelf.
The new preprocessor tests in c++config that detect the binary32 and
binary64 formats were copied from gcc/testsuite/gcc.dg/float-exact-1.c.
libstdc++-v3/ChangeLog:
* include/bits/c++config (_GLIBCXX_FLOAT_IS_IEEE_BINARY_32):
Define this macro.
(_GLIBCXX_DOUBLE_IS_IEEE_BINARY_64): Likewise.
* include/std/charconv (to_chars): Use these macros to
conditionally hide the overloads for floating-point types.
* src/c++17/floating_to_chars.cc: Use the macros to
conditionally disable this file.
(floating_type_traits<float>): Remove redundant static assert.
(floating_type_traits<double>): Likewise.
* testsuite/20_util/to_chars/double.cc: Run this test only on
ieee-floats effective targets.
* testsuite/20_util/to_chars/float.cc: Likewise.
* testsuite/20_util/to_chars/long_double.cc: Likewise.
* testsuite/lib/libstdc++.exp
(check_effective_target_ieee-floats): Define new proc for
detecting whether float and double have the IEEE binary32 and
binary64 formats.
This implements the floating-point std::to_chars overloads for float,
double and long double. We use the Ryu library to compute the shortest
round-trippable fixed and scientific forms for float, double and long
double. We also use Ryu for performing explicit-precision fixed and
scientific formatting for float and double. For explicit-precision
formatting for long double we fall back to using printf. Hexadecimal
formatting for float, double and long double is implemented from
scratch.
The supported long double binary formats are binary64, binary80 (x86
80-bit extended precision), binary128 and ibm128.
Much of the complexity of the implementation is in computing the exact
output length before handing it off to Ryu (which doesn't do bounds
checking). In some cases it's hard to compute the output length
beforehand, so in these cases we instead compute an upper bound on the
output length and use a sufficiently-sized intermediate buffer only if
necessary.
Another source of complexity is in the general-with-precision formatting
mode, where we need to do zero-trimming of the string returned by Ryu,
and where we also take care to avoid having to format the number through
Ryu a second time when the general formatting mode resolves to fixed
(which we determine by doing a scientific formatting first and
inspecting the scientific exponent). We avoid going through Ryu twice
by instead transforming the scientific form to the corresponding fixed
form via in-place string manipulation.
This implementation is non-conforming in a couple of ways:
1. For the shortest hexadecimal formatting, we currently follow the
Microsoft implementation's decision to be consistent with the
output of printf's '%a' specifier at the expense of sometimes not
printing the shortest representation. For example, the shortest hex
form for the number 1.08p+0 is 2.1p-1, but we output the former
instead of the latter, as does printf.
2. The Ryu routine generic_binary_to_decimal that we use for performing
shortest formatting for large floating point types is implemented
using the __int128 type, but some targets with a large long double
type lack __int128 (e.g. i686), so we can't perform shortest
formatting of long double on such targets through Ryu. As a
temporary stopgap this patch makes the long double to_chars overloads
just dispatch to the double overloads on these targets, which means
we lose precision in the output. (We could potentially fix this by
writing a specialized version of Ryu's generic_binary_to_decimal
routine that uses uint64_t instead of __int128.) [Though I wonder if
there's a better way to work around the lack of __int128 on i686
specifically?]
3. Our shortest formatting for __ibm128 doesn't guarantee the round-trip
property if the difference between the high- and low-order exponent
is large. This is because we treat __ibm128 as if it has a
contiguous 105-bit mantissa by merging the mantissas of the high-
and low-order parts (using code extracted from glibc), so we
potentially lose precision from the low-order part. This seems to be
consistent with how glibc printf formats __ibm128.
libstdc++-v3/ChangeLog:
* config/abi/pre/gnu.ver: Add new exports.
* include/std/charconv (to_chars): Declare the floating-point
overloads for float, double and long double.
* src/c++17/Makefile.am (sources): Add floating_to_chars.cc.
* src/c++17/Makefile.in: Regenerate.
* src/c++17/floating_to_chars.cc: New file.
(to_chars): Define for float, double and long double.
* testsuite/20_util/to_chars/long_double.cc: New test.
This makes the hash function available without including the whole of
<thread>, which is needed for <barrier>.
libstdc++-v3/ChangeLog:
* include/bits/std_thread.h (hash<thread::id>): Move here,
from ...
* include/std/thread (hash<thread::id>): ... here.
Now that GCC supports __has_builtin there is no need to test whether
it's defined, we can just use it unconditionally.
libstdc++-v3/ChangeLog:
* include/std/utility: Use __has_builtin without checking if
it's defined.
This causes the global objects that run the <iostream> initialization
code to be constructed earlier, which avoids some bugs in user code due
to incorrectly relying on static initialization order.
libstdc++-v3/ChangeLog:
PR libstdc++/98108
* include/std/iostream (__ioinit): Add init_priority attribute.
There's no need to explicitly check for the maximum value, because the
function we call handles it correctly anyway.
libstdc++-v3/ChangeLog:
PR libstdc++/98226
* include/std/bit (__countl_one, __countr_one): Remove redundant
branches.
In previous releases the std::this_thread::sleep_for function was only
declared if the target supports multiple threads. I changed that
recently in r11-2649-g5bbb1f3000c57fd4d95969b30fa0e35be6d54ffb so that
sleep_for could be used single-threaded. But that means that targets
using --disable-threads are now required to provide some way to sleep.
This breaks the build for (at least) AVR when trying to build a hosted
library.
This patch adds a new autoconf macro that is defined when no way to
sleep is available, and uses that to suppress the sleeping functions in
std::this_thread.
The #error in src/c++11/thread.cc is retained for the case where there
is no sleep function available but multiple threads are supported. This
is consistent with previous releases, but that #error could probably be
removed without any consequences.
libstdc++-v3/ChangeLog:
* acinclude.m4 (GLIBCXX_ENABLE_LIBSTDCXX_TIME): Define NO_SLEEP
if none of nanosleep, sleep and Sleep is available.
* config.h.in: Regenerate.
* configure: Regenerate.
* include/std/thread [_GLIBCXX_NO_SLEEP] (__sleep_for): Do
not declare.
[_GLIBCXX_NO_SLEEP] (sleep_for, sleep_until): Do not
define.
* src/c++11/thread.cc [_GLIBCXX_NO_SLEEP] (__sleep_for): Do
not define.
This doesn't define a new _GLIBCXX_HAVE_BUILTIN_SOURCE_LOCATION macro.
because using __has_builtin(__builtin_source_location) is sufficient.
Currently only GCC supports it, but if/when Clang and Intel add it the
__has_builtin check should for them too.
Co-authored-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* doc/doxygen/user.cfg.in (INPUT): Add <source_location>.
* include/Makefile.am: Add <source_location>.
* include/Makefile.in: Regenerate.
* include/std/version (__cpp_lib_source_location): Define.
* include/std/source_location: New file.
* testsuite/18_support/source_location/1.cc: New test.
* testsuite/18_support/source_location/consteval.cc: New test.
* testsuite/18_support/source_location/srcloc.h: New test.
* testsuite/18_support/source_location/version.cc: New test.
Thanks to Jakub's addition of the built-in, we can add this to the
library now. The compiler tests for the built-in are quite extensive,
including verifying the constraints, so this only adds minimal tests to
the library testsuite.
This doesn't add a new _GLIBCXX_HAVE_BUILTIN_BIT_CAST because using
__has_builtin(__builtin_bit_cast) works for GCC and versions of Clang
that provide the built-in.
libstdc++-v3/ChangeLog:
PR libstdc++/93121
* include/std/bit (__cpp_lib_bit_cast, bit_cast): Define.
* include/std/version (__cpp_lib_bit_cast): Define.
* testsuite/26_numerics/bit/bit.cast/bit_cast.cc: New test.
* testsuite/26_numerics/bit/bit.cast/version.cc: New test.
The recent changes to add assertions to std::array broke the functions
that need to be constexpr in C++11, because of the restrictive rules for
constexpr functions in C++11.
This simply disables the assertions for C++11 mode, so the functions can
be constexpr again.
libstdc++-v3/ChangeLog:
* include/std/array (array::operator[](size_t) const, array::front() const)
(array::back() const) [__cplusplus == 201103]: Disable
assertions.
* testsuite/23_containers/array/element_access/constexpr_element_access.cc:
Check for correct values.
* testsuite/23_containers/array/tuple_interface/get_neg.cc:
Adjust dg-error line numbers.
* testsuite/23_containers/array/debug/constexpr_c++11.cc: New test.
This changes some #ifdef checks to use #if instead.
libstdc++-v3/ChangeLog:
* include/bits/atomic_timed_wait.h: Use #if instead of #ifdef.
* include/bits/semaphore_base.h: Likewise.
* include/std/version: Remove trailing whitespace.