Like in r12-7519-g027e30414492d50feb2854aff38227b14300dc4b, I've done
git grep -v 'long long\|optab optab\|template template\|double double' | grep ' \([a-zA-Z]\+\) \1 '
This is just part of the changes, mostly for non-gcc directories.
I'll try to get to the rest soon. Obviously, the above command also
finds cases which are correct as is and shouldn't be changed, so one
needs to manually inspect everything.
I'd hope most of it is pretty obvious, but the config/ and libstdc++-v3/
hunks include a tweak in a license wording, though other copies of the
similar license have the wording right.
2024-04-02 Jakub Jelinek <jakub@redhat.com>
* Makefile.tpl: Fix duplicated words; returns returns ->
returns.
config/
* lcmessage.m4: Fix duplicated words; can can -> can,
package package -> package.
libdecnumber/
* decCommon.c (decFinalize): Fix duplicated words in
comment; the the -> the.
libgcc/
* unwind-dw2-fde.c (struct fde_accumulator): Fix duplicated
words in comment; is is -> is.
libgfortran/
* configure.host: Fix duplicated words; the the -> the.
libgm2/
* configure.host: Fix duplicated words; the the -> the.
libgomp/
* libgomp.texi (OpenMP 5.2): Fix duplicated words; with with ->
with.
(omp_target_associate_ptr): Fix duplicated words; either either ->
either.
(omp_init_allocator): Fix duplicated words; be be -> be.
(omp_realloc): Fix duplicated words; is is -> is.
(OMP_ALLOCATOR): Fix duplicated words; other other -> other.
* priority_queue.h (priority_queue_multi_p): Fix duplicated words;
to to -> to.
libiberty/
* regex.c (byte_re_match_2_internal): Fix duplicated words in comment;
next next -> next.
* dyn-string.c (dyn_string_init): Fix duplicated words in comment;
of of -> of.
libitm/
* beginend.cc (GTM::gtm_thread::begin_transaction): Fix duplicated
words in comment; not not -> not to.
libobjc/
* init.c (duplicate_classes): Fix duplicated words in comment; in in
-> in.
* sendmsg.c (__objc_prepare_dtable_for_class): Fix duplicated words
in comment; the the -> the.
* encoding.c (objc_layout_structure): Likewise.
libstdc++-v3/
* acinclude.m4: Fix duplicated words; file file -> file can.
* configure.host: Fix duplicated words; the the -> the.
libvtv/
* vtv_rts.cc (vtv_fail): Fix duplicated words; to to -> to.
* vtv_fail.cc (vtv_fail): Likewise.
While texinfo 7.0.3 does not warn, an older texinfo did complain about:
libgomp.texi:1964: warning: node next `omp_target_memcpy' in menu
`omp_target_memcpy_rect' and in sectioning `omp_target_memcpy_async' differ
libgomp/
* libgomp.texi (Device Memory Routines): Swap item order to match
the order of the '@node's of the '@subsection's.
These routines map simply to the C counterpart and are meanwhile
defined in OpenACC 3.3. (There are additional routine changes,
including the Fortran addition of acc_attach/acc_detach, that
require more work than a simple addition of an interface and
are therefore excluded.)
libgomp/ChangeLog:
* libgomp.texi (OpenACC Runtime Library Routines): Document new 3.3
routines that simply map to their C counterpart.
* openacc.f90 (openacc): Add them.
* openacc_lib.h: Likewise.
* testsuite/libgomp.oacc-fortran/acc_host_device_ptr.f90: New test.
* testsuite/libgomp.oacc-fortran/acc-memcpy.f90: New test.
* testsuite/libgomp.oacc-fortran/acc-memcpy-2.f90: New test.
* testsuite/libgomp.oacc-c-c++-common/lib-59.c: Crossref to f90 test.
* testsuite/libgomp.oacc-c-c++-common/lib-60.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-95.c: Likewise.
The main 'arch' context selector for nvptx is, well, 'nvptx';
however, as 'nvptx64' is used as by LLVM, it makes sense
to support it as well.
Note that LLVM has: "The triple architecture can be one of
``nvptx`` (32-bit PTX) or ``nvptx64`` (64-bit PTX)."
GCC effectively only supports the 64bit variant (at least for
offloading). Thus, GCC's 'nvptx' is not quite the same as LLVM's.
The device-compiler part (nvptx_omp_device_kind_arch_isa) uses
TARGET_ABI64 such that nvptx64 is only defined with -m64.
gcc/ChangeLog:
* config/nvptx/gen-omp-device-properties.sh: Add 'nvptx64' to arch.
* config/nvptx/nvptx.cc (nvptx_omp_device_kind_arch_isa): Likewise.
libgomp/ChangeLog:
* libgomp.texi (OpenMP Context Selectors): Add 'nvptx64' as additional
'arch' value for nvptx.
Support for indirect calls to procedures/functions in offloaded target
regions is now available for C, C++ and Fortran.
2024-02-15 Kwok Cheung Yeung <kcyeung@baylibre.com>
libgomp/
* libgomp.texi (OpenMP 5.1): Mark indirect call support as fully
implemented.
This patch adds support for parsing general lvalues ("locator list item
types") for OpenMP "map", "to" and "from" clauses to the C front-end,
similar to the previously-posted patch for C++. Such syntax is permitted
for OpenMP 5.0 and above. It was previously posted for mainline here:
https://gcc.gnu.org/pipermail/gcc-patches/2022-December/609038.html
and for the og13 branch here:
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/623355.html
2024-01-11 Julian Brown <julian@codesourcery.com>
gcc/c-family/
* c-pretty-print.cc (c_pretty_printer::postfix_expression,
c_pretty_printer::expression): Add OMP_ARRAY_SECTION support.
gcc/c/
* c-parser.cc (c_parser_braced_init, c_parser_conditional_expression):
Don't allow OpenMP array section.
(c_parser_postfix_expression): Don't allow array section in statement
expression.
(c_parser_postfix_expression_after_primary): Add support for OpenMP
array section parsing.
(c_parser_expr_list): Don't allow OpenMP array section here.
(c_parser_omp_variable_list): Change ALLOW_DEREF parameter to
MAP_LVALUE. Support parsing of general lvalues in "map", "to" and
"from" clauses.
(c_parser_omp_var_list_parens): Change ALLOW_DEREF parameter to
MAP_LVALUE. Update call to c_parser_omp_variable_list.
(c_parser_oacc_data_clause): Update calls to
c_parser_omp_var_list_parens.
(c_parser_omp_clause_reduction): Use OMP_ARRAY_SECTION tree node
instead of TREE_LIST for array sections.
(c_parser_omp_target): Allow GOMP_MAP_ATTACH.
* c-tree.h (c_omp_array_section_p): Add extern declaration.
(build_omp_array_section): Add prototype.
* c-typeck.cc (c_omp_array_section_p): Add flag.
(mark_exp_read): Support OMP_ARRAY_SECTION.
(build_omp_array_section): Add function.
(build_external_ref): Tweak error path for OpenMP array sections.
(handle_omp_array_sections_1): Use OMP_ARRAY_SECTION tree code instead
of TREE_LIST. Handle more kinds of expressions.
(c_oacc_check_attachments): Use OMP_ARRAY_SECTION instead of TREE_LIST
for array sections.
(c_finish_omp_clauses): Use OMP_ARRAY_SECTION instead of TREE_LIST.
Check for supported expression types.
gcc/testsuite/
* gcc.dg/gomp/bad-array-section-c-1.c: New test.
* gcc.dg/gomp/bad-array-section-c-2.c: New test.
* gcc.dg/gomp/bad-array-section-c-3.c: New test.
* gcc.dg/gomp/bad-array-section-c-4.c: New test.
* gcc.dg/gomp/bad-array-section-c-5.c: New test.
* gcc.dg/gomp/bad-array-section-c-6.c: New test.
* gcc.dg/gomp/bad-array-section-c-7.c: New test.
* gcc.dg/gomp/bad-array-section-c-8.c: New test.
libgomp/
* libgomp.texi: C/C++ lvalues are supported now for map/to/from.
* testsuite/libgomp.c-c++-common/ind-base-4.c: New test.
* testsuite/libgomp.c-c++-common/unary-ptr-1.c: New test.
Implement the OpenMP pinned memory trait on Linux hosts using the mlock
syscall. Pinned allocations are performed using mmap, not malloc, to ensure
that they can be unpinned safely when freed.
This implementation will work OK for page-scale allocations, and finer-grained
allocations will be implemented in a future patch.
libgomp/ChangeLog:
* allocator.c (MEMSPACE_ALLOC): Add PIN.
(MEMSPACE_CALLOC): Add PIN.
(MEMSPACE_REALLOC): Add PIN.
(MEMSPACE_FREE): Add PIN.
(MEMSPACE_VALIDATE): Add PIN.
(omp_init_allocator): Use MEMSPACE_VALIDATE to check pinning.
(omp_aligned_alloc): Add pinning to all MEMSPACE_* calls.
(omp_aligned_calloc): Likewise.
(omp_realloc): Likewise.
(omp_free): Likewise.
* config/linux/allocator.c: New file.
* config/nvptx/allocator.c (MEMSPACE_ALLOC): Add PIN.
(MEMSPACE_CALLOC): Add PIN.
(MEMSPACE_REALLOC): Add PIN.
(MEMSPACE_FREE): Add PIN.
(MEMSPACE_VALIDATE): Add PIN.
* config/gcn/allocator.c (MEMSPACE_ALLOC): Add PIN.
(MEMSPACE_CALLOC): Add PIN.
(MEMSPACE_REALLOC): Add PIN.
(MEMSPACE_FREE): Add PIN.
* libgomp.texi: Switch pinned trait to supported.
(MEMSPACE_VALIDATE): Add PIN.
* testsuite/libgomp.c/alloc-pinned-1.c: New test.
* testsuite/libgomp.c/alloc-pinned-2.c: New test.
* testsuite/libgomp.c/alloc-pinned-3.c: New test.
* testsuite/libgomp.c/alloc-pinned-4.c: New test.
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
This commit adds -fopenmp-allocators which enables support for
'omp allocators' and 'omp allocate' that are associated with a Fortran
allocate-stmt. If such a construct is encountered, an error is shown,
unless the -fopenmp-allocators flag is present.
With -fopenmp -fopenmp-allocators, those constructs get turned into
GOMP_alloc allocations, while -fopenmp-allocators (also without -fopenmp)
ensures deallocation and reallocation (via intrinsic assignments) are
properly directed to GOMP_free/omp_realloc - while normal Fortran
allocations are processed by free/realloc.
In order to distinguish a 'malloc'ed from a 'GOMP_alloc'ed memory, the
version field of the Fortran array discriptor is (mis)used: 0 indicates
the normal Fortran allocation while 1 denotes GOMP_alloc. For scalars,
there is record keeping in libgomp: GOMP_add_alloc(ptr) will add the
pointer address to a splay_tree while GOMP_is_alloc(ptr) will return
true it was previously added but also removes it from the list.
Besides Fortran FE work, BUILT_IN_GOMP_REALLOC is no part of
omp-builtins.def and libgomp gains the mentioned two new function.
gcc/ChangeLog:
* builtin-types.def (BT_FN_PTR_PTR_SIZE_PTRMODE_PTRMODE): New.
* omp-builtins.def (BUILT_IN_GOMP_REALLOC): New.
* builtins.cc (builtin_fnspec): Handle it.
* gimple-ssa-warn-access.cc (fndecl_alloc_p,
matching_alloc_calls_p): Likewise.
* gimple.cc (nonfreeing_call_p): Likewise.
* predict.cc (expr_expected_value_1): Likewise.
* tree-ssa-ccp.cc (evaluate_stmt): Likewise.
* tree.cc (fndecl_dealloc_argno): Likewise.
gcc/fortran/ChangeLog:
* dump-parse-tree.cc (show_omp_node): Handle EXEC_OMP_ALLOCATE
and EXEC_OMP_ALLOCATORS.
* f95-lang.cc (ATTR_ALLOC_WARN_UNUSED_RESULT_SIZE_2_NOTHROW_LIST):
Add 'ECF_LEAF | ECF_MALLOC' to existing 'ECF_NOTHROW'.
(ATTR_ALLOC_WARN_UNUSED_RESULT_SIZE_2_NOTHROW_LEAF_LIST): Define.
* gfortran.h (gfc_omp_clauses): Add contained_in_target_construct.
* invoke.texi (-fopenacc, -fopenmp): Update based on C version.
(-fopenmp-simd): New, based on C version.
(-fopenmp-allocators): New.
* lang.opt (fopenmp-allocators): Add.
* openmp.cc (resolve_omp_clauses): For allocators/allocate directive,
add target and no dynamic_allocators diagnostic and more invalid
diagnostic.
* parse.cc (decode_omp_directive): Set contains_teams_construct.
* trans-array.h (gfc_array_allocate): Update prototype.
(gfc_conv_descriptor_version): New prototype.
* trans-decl.cc (gfc_init_default_dt): Fix comment.
* trans-array.cc (gfc_conv_descriptor_version): New.
(gfc_array_allocate): Support GOMP_alloc allocation.
(gfc_alloc_allocatable_for_assignment, structure_alloc_comps):
Handle GOMP_free/omp_realloc as needed.
* trans-expr.cc (gfc_conv_procedure_call): Likewise.
(alloc_scalar_allocatable_for_assignment): Likewise.
* trans-intrinsic.cc (conv_intrinsic_move_alloc): Likewise.
* trans-openmp.cc (gfc_trans_omp_allocators,
gfc_trans_omp_directive): Handle allocators/allocate directive.
(gfc_omp_call_add_alloc, gfc_omp_call_is_alloc): New.
* trans-stmt.h (gfc_trans_allocate): Update prototype.
* trans-stmt.cc (gfc_trans_allocate): Support GOMP_alloc.
* trans-types.cc (gfc_get_dtype_rank_type): Set version field.
* trans.cc (gfc_allocate_using_malloc, gfc_allocate_allocatable):
Update to handle GOMP_alloc.
(gfc_deallocate_with_status, gfc_deallocate_scalar_with_status):
Handle GOMP_free.
(trans_code): Update call.
* trans.h (gfc_allocate_allocatable, gfc_allocate_using_malloc):
Update prototype.
(gfc_omp_call_add_alloc, gfc_omp_call_is_alloc): New prototype.
* types.def (BT_FN_PTR_PTR_SIZE_PTRMODE_PTRMODE): New.
libgomp/ChangeLog:
* allocator.c (struct fort_alloc_splay_tree_key_s,
fort_alloc_splay_compare, GOMP_add_alloc, GOMP_is_alloc): New.
* libgomp.h: Define splay_tree_static for 'reverse' splay tree.
* libgomp.map (GOMP_5.1.2): New; add GOMP_add_alloc and
GOMP_is_alloc; move GOMP_target_map_indirect_ptr from ...
(GOMP_5.1.1): ... here.
* libgomp.texi (Impl. Status, Memory management): Update for
allocators/allocate directives.
* splay-tree.c: Handle splay_tree_static define to declare all
functions as static.
(splay_tree_lookup_node): New.
* splay-tree.h: Handle splay_tree_decl_only define.
(splay_tree_lookup_node): New prototype.
* target.c: Define splay_tree_static for 'reverse'.
* testsuite/libgomp.fortran/allocators-1.f90: New test.
* testsuite/libgomp.fortran/allocators-2.f90: New test.
* testsuite/libgomp.fortran/allocators-3.f90: New test.
* testsuite/libgomp.fortran/allocators-4.f90: New test.
* testsuite/libgomp.fortran/allocators-5.f90: New test.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/allocate-14.f90: Add coarray and
not-listed tests.
* gfortran.dg/gomp/allocate-5.f90: Remove sorry dg-message.
* gfortran.dg/bind_c_array_params_2.f90: Update expected
dump for dtype '.version=0'.
* gfortran.dg/gomp/allocate-16.f90: New test.
* gfortran.dg/gomp/allocators-3.f90: New test.
* gfortran.dg/gomp/allocators-4.f90: New test.
This implements the OpenMP low-latency memory allocator for AMD GCN using the
small per-team LDS memory (Local Data Store).
Since addresses can now refer to LDS space, the "Global" address space is
no-longer compatible. This patch therefore switches the backend to use
entirely "Flat" addressing (which supports both memories). A future patch
will re-enable "global" instructions for cases where it is known to be safe
to do so.
gcc/ChangeLog:
* config/gcn/gcn-builtins.def (DISPATCH_PTR): New built-in.
* config/gcn/gcn.cc (gcn_init_machine_status): Disable global
addressing.
(gcn_expand_builtin_1): Implement GCN_BUILTIN_DISPATCH_PTR.
libgomp/ChangeLog:
* config/gcn/libgomp-gcn.h (TEAM_ARENA_START): Move to here.
(TEAM_ARENA_FREE): Likewise.
(TEAM_ARENA_END): Likewise.
(GCN_LOWLAT_HEAP): New.
* config/gcn/team.c (LITTLEENDIAN_CPU): New, and import hsa.h.
(__gcn_lowlat_init): New prototype.
(gomp_gcn_enter_kernel): Initialize the low-latency heap.
* libgomp.h (TEAM_ARENA_START): Move to libgomp.h.
(TEAM_ARENA_FREE): Likewise.
(TEAM_ARENA_END): Likewise.
* plugin/plugin-gcn.c (lowlat_size): New variable.
(print_kernel_dispatch): Label the group_segment_size purpose.
(init_environment_variables): Read GOMP_GCN_LOWLAT_POOL.
(create_kernel_dispatch): Pass low-latency head allocation to kernel.
(run_kernel): Use shadow; don't assume values.
* testsuite/libgomp.c/omp_alloc-traits.c: Enable for amdgcn.
* config/gcn/allocator.c: New file.
* libgomp.texi: Document low-latency implementation details.
The NVPTX low latency memory is not accessible outside the team that allocates
it, and therefore should be unavailable for allocators with the access trait
"all". This change means that the omp_low_lat_mem_alloc predefined
allocator no longer works (but omp_cgroup_mem_alloc still does).
libgomp/ChangeLog:
* allocator.c (MEMSPACE_VALIDATE): New macro.
(omp_init_allocator): Use MEMSPACE_VALIDATE.
(omp_aligned_alloc): Use OMP_LOW_LAT_MEM_ALLOC_INVALID.
(omp_aligned_calloc): Likewise.
(omp_realloc): Likewise.
* config/nvptx/allocator.c (nvptx_memspace_validate): New function.
(MEMSPACE_VALIDATE): New macro.
(OMP_LOW_LAT_MEM_ALLOC_INVALID): New define.
* libgomp.texi: Document low-latency implementation details.
* testsuite/libgomp.c/omp_alloc-1.c (main): Add gnu_lowlat.
* testsuite/libgomp.c/omp_alloc-2.c (main): Add gnu_lowlat.
* testsuite/libgomp.c/omp_alloc-3.c (main): Add gnu_lowlat.
* testsuite/libgomp.c/omp_alloc-4.c (main): Add access trait.
* testsuite/libgomp.c/omp_alloc-5.c (main): Add gnu_lowlat.
* testsuite/libgomp.c/omp_alloc-6.c (main): Add access trait.
* testsuite/libgomp.c/omp_alloc-traits.c: New test.
This patch adds support for allocating low-latency ".shared" memory on
NVPTX GPU device, via the omp_low_lat_mem_space and omp_alloc. The memory
can be allocated, reallocated, and freed using a basic but fast algorithm,
is thread safe and the size of the low-latency heap can be configured using
the GOMP_NVPTX_LOWLAT_POOL environment variable.
The use of the PTX dynamic_smem_size feature means that low-latency allocator
will not work with the PTX 3.1 multilib.
For now, the omp_low_lat_mem_alloc allocator also works, but that will change
when I implement the access traits.
libgomp/ChangeLog:
* allocator.c (MEMSPACE_ALLOC): New macro.
(MEMSPACE_CALLOC): New macro.
(MEMSPACE_REALLOC): New macro.
(MEMSPACE_FREE): New macro.
(predefined_alloc_mapping): New array. Add _Static_assert to match.
(ARRAY_SIZE): New macro.
(omp_aligned_alloc): Use MEMSPACE_ALLOC.
Implement fall-backs for predefined allocators. Simplify existing
fall-backs.
(omp_free): Use MEMSPACE_FREE.
(omp_calloc): Use MEMSPACE_CALLOC. Implement fall-backs for
predefined allocators. Simplify existing fall-backs.
(omp_realloc): Use MEMSPACE_REALLOC, MEMSPACE_ALLOC, and MEMSPACE_FREE.
Implement fall-backs for predefined allocators. Simplify existing
fall-backs.
* config/nvptx/team.c (__nvptx_lowlat_pool): New asm variable.
(__nvptx_lowlat_init): New prototype.
(gomp_nvptx_main): Call __nvptx_lowlat_init.
* libgomp.texi: Update memory space table.
* plugin/plugin-nvptx.c (lowlat_pool_size): New variable.
(GOMP_OFFLOAD_init_device): Read the GOMP_NVPTX_LOWLAT_POOL envvar.
(GOMP_OFFLOAD_run): Apply lowlat_pool_size.
* basic-allocator.c: New file.
* config/nvptx/allocator.c: New file.
* testsuite/libgomp.c/omp_alloc-1.c: New test.
* testsuite/libgomp.c/omp_alloc-2.c: New test.
* testsuite/libgomp.c/omp_alloc-3.c: New test.
* testsuite/libgomp.c/omp_alloc-4.c: New test.
* testsuite/libgomp.c/omp_alloc-5.c: New test.
* testsuite/libgomp.c/omp_alloc-6.c: New test.
Co-authored-by: Kwok Cheung Yeung <kcy@codesourcery.com>
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
Since OpenMP 5.2, the destroy clause takes an depend argument as argument;
for the depobj directive, it the new argument is optional but, if present,
it must be identical to the directive's argument.
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_depobj): Accept optionally an argument
to the destroy clause.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_omp_depobj): Accept optionally an argument
to the destroy clause.
gcc/fortran/ChangeLog:
* openmp.cc (gfc_match_omp_depobj): Accept optionally an argument
to the destroy clause.
libgomp/ChangeLog:
* libgomp.texi (5.2 Impl. Status): An argument to the destroy clause
is now supported.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/depobj-3.c: New test.
* gfortran.dg/gomp/depobj-3.f90: New test.
This adds support for the 'indirect' clause in the 'declare target'
directive. Functions declared as indirect may be called via function
pointers passed from the host in offloaded code.
Virtual calls to member functions via the object pointer in C++ are
currently not supported in target regions.
2023-11-07 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/c-family/
* c-attribs.cc (c_common_attribute_table): Add attribute for
indirect functions.
* c-pragma.h (enum parma_omp_clause): Add entry for indirect clause.
gcc/c/
* c-decl.cc (c_decl_attributes): Add attribute for indirect
functions.
* c-lang.h (c_omp_declare_target_attr): Add indirect field.
* c-parser.cc (c_parser_omp_clause_name): Handle indirect clause.
(c_parser_omp_clause_indirect): New.
(c_parser_omp_all_clauses): Handle indirect clause.
(OMP_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask.
(c_parser_omp_declare_target): Handle indirect clause. Emit error
message if device_type or indirect clauses used alone. Emit error
if indirect clause used with device_type that is not 'any'.
(OMP_BEGIN_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask.
(c_parser_omp_begin): Handle indirect clause.
* c-typeck.cc (c_finish_omp_clauses): Handle indirect clause.
gcc/cp/
* cp-tree.h (cp_omp_declare_target_attr): Add indirect field.
* decl2.cc (cplus_decl_attributes): Add attribute for indirect
functions.
* parser.cc (cp_parser_omp_clause_name): Handle indirect clause.
(cp_parser_omp_clause_indirect): New.
(cp_parser_omp_all_clauses): Handle indirect clause.
(handle_omp_declare_target_clause): Add extra parameter. Add
indirect attribute for indirect functions.
(OMP_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask.
(cp_parser_omp_declare_target): Handle indirect clause. Emit error
message if device_type or indirect clauses used alone. Emit error
if indirect clause used with device_type that is not 'any'.
(OMP_BEGIN_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask.
(cp_parser_omp_begin): Handle indirect clause.
* semantics.cc (finish_omp_clauses): Handle indirect clause.
gcc/
* lto-cgraph.cc (enum LTO_symtab_tags): Add tag for indirect
functions.
(output_offload_tables): Write indirect functions.
(input_offload_tables): read indirect functions.
* lto-section-names.h (OFFLOAD_IND_FUNC_TABLE_SECTION_NAME): New.
* omp-builtins.def (BUILT_IN_GOMP_TARGET_MAP_INDIRECT_PTR): New.
* omp-offload.cc (offload_ind_funcs): New.
(omp_discover_implicit_declare_target): Add functions marked with
'omp declare target indirect' to indirect functions list.
(omp_finish_file): Add indirect functions to section for offload
indirect functions.
(execute_omp_device_lower): Redirect indirect calls on target by
passing function pointer to BUILT_IN_GOMP_TARGET_MAP_INDIRECT_PTR.
(pass_omp_device_lower::gate): Run pass_omp_device_lower if
indirect functions are present on an accelerator device.
* omp-offload.h (offload_ind_funcs): New.
* tree-core.h (omp_clause_code): Add OMP_CLAUSE_INDIRECT.
* tree.cc (omp_clause_num_ops): Add entry for OMP_CLAUSE_INDIRECT.
(omp_clause_code_name): Likewise.
* tree.h (OMP_CLAUSE_INDIRECT_EXPR): New.
* config/gcn/mkoffload.cc (process_asm): Process offload_ind_funcs
section. Count number of indirect functions.
(process_obj): Emit number of indirect functions.
* config/nvptx/mkoffload.cc (ind_func_ids, ind_funcs_tail): New.
(process): Emit offload_ind_func_table in PTX code. Emit indirect
function names and count in image.
* config/nvptx/nvptx.cc (nvptx_record_offload_symbol): Mark
indirect functions in PTX code with IND_FUNC_MAP.
gcc/testsuite/
* c-c++-common/gomp/declare-target-7.c: Update expected error message.
* c-c++-common/gomp/declare-target-indirect-1.c: New.
* c-c++-common/gomp/declare-target-indirect-2.c: New.
* g++.dg/gomp/attrs-21.C (v12): Update expected error message.
* g++.dg/gomp/declare-target-indirect-1.C: New.
* gcc.dg/gomp/attrs-21.c (v12): Update expected error message.
include/
* gomp-constants.h (GOMP_VERSION): Increment to 3.
(GOMP_VERSION_SUPPORTS_INDIRECT_FUNCS): New.
libgcc/
* offloadstuff.c (OFFLOAD_IND_FUNC_TABLE_SECTION_NAME): New.
(__offload_ind_func_table): New.
(__offload_ind_funcs_end): New.
(__OFFLOAD_TABLE__): Add entries for indirect functions.
libgomp/
* Makefile.am (libgomp_la_SOURCES): Add target-indirect.c.
* Makefile.in: Regenerate.
* libgomp-plugin.h (GOMP_INDIRECT_ADDR_MAP): New define.
(GOMP_OFFLOAD_load_image): Add extra argument.
* libgomp.h (struct indirect_splay_tree_key_s): New.
(indirect_splay_tree_node, indirect_splay_tree,
indirect_splay_tree_key): New.
(indirect_splay_compare): New.
* libgomp.map (GOMP_5.1.1): Add GOMP_target_map_indirect_ptr.
* libgomp.texi (OpenMP 5.1): Update documentation on indirect
calls in target region and on indirect clause.
(Other new OpenMP 5.2 features): Add entry for virtual function calls.
* libgomp_g.h (GOMP_target_map_indirect_ptr): Add prototype.
* oacc-host.c (host_load_image): Add extra argument.
* target.c (gomp_load_image_to_device): If the GOMP_VERSION is high
enough, read host indirect functions table and pass to
load_image_func.
* config/accel/target-indirect.c: New.
* config/linux/target-indirect.c: New.
* config/gcn/team.c (build_indirect_map): Add prototype.
(gomp_gcn_enter_kernel): Initialize support for indirect
function calls on GCN target.
* config/nvptx/team.c (build_indirect_map): Add prototype.
(gomp_nvptx_main): Initialize support for indirect function
calls on NVPTX target.
* plugin/plugin-gcn.c (struct gcn_image_desc): Add field for
indirect functions count.
(GOMP_OFFLOAD_load_image): Add extra argument. If the GOMP_VERSION
is high enough, build address translation table and copy it to target
memory.
* plugin/plugin-nvptx.c (nvptx_tdata): Add field for indirect
functions count.
(GOMP_OFFLOAD_load_image): Add extra argument. If the GOMP_VERSION
is high enough, Build address translation table and copy it to target
memory.
* testsuite/libgomp.c-c++-common/declare-target-indirect-1.c: New.
* testsuite/libgomp.c-c++-common/declare-target-indirect-2.c: New.
* testsuite/libgomp.c++/declare-target-indirect-1.C: New.
This patch mentions the C attribute syntax support in the libgomp documentation.
2023-11-05 Jakub Jelinek <jakub@redhat.com>
* libgomp.texi (Enabling OpenMP): Adjust wording for attribute syntax
supported also in C.
The OpenACC specification does not mention the '!$ ' sentinel for conditional
compilation and the feature was removed in r11-5572-g1d6f6ac693a860
for PR fortran/98011; update libgomp.texi for this and update a leftover
comment. - Additionally, some other updates are done as well.
libgomp/
* libgomp.texi (Enabling OpenMP): Update for C/C++ attributes;
improve wording especially for Fortran; mention -fopenmp-simd.
(Enabling OpenACC): Minor cleanup; remove conditional compilation
sentinel.
gcc/
* doc/invoke.texi (-fopenacc, -fopenmp, -fopenmp-simd): Use @samp not
@code; document more completely the supported Fortran sentinels.
gcc/fortran
* scanner.cc (skip_free_comments, skip_fixed_comments): Remove
leftover 'OpenACC' from comments about OpenMP's conditional
compilation sentinel.
None of the ACC_* env vars was documented; in particular, the valid valids
for ACC_DEVICE_TYPE found to be lacking as those are not document in the
OpenACC spec.
GCC_ACC_NOTIFY was removed as I failed to find any traces of it but the
addition to the documentation in commit r6-6185-gcdf6119dad04dd
("libgomp.texi: Updates for OpenACC."). It seems to be planned as GCC
version of the ACC_NOTIFY env var used by another compiler for offloading
debugging.
libgomp/
* libgomp.texi (ACC_DEVICE_TYPE, ACC_DEVICE_NUM, ACC_PROFLIB):
Actually document what the function does.
(GCC_ACC_NOTIFY): Remove unused env var.
This commit completes the documentation of the OpenMP memory-management
routines, except for the unimplemented TR11 additions. It also makes clear
in the 'Memory allocation' section of the 'OpenMP-Implementation Specifics'
chapter under which condition OpenMP managed memory/allocators are used.
libgomp/ChangeLog:
* libgomp.texi: Fix some typos.
(Memory Management Routines): Document remaining 5.x routines.
(Memory allocation): Make clear when the section applies.
gcc/fortran/ChangeLog:
* gfortran.h (ext_attr_t): Add omp_allocate flag.
* match.cc (gfc_free_omp_namelist): Void deleting same
u2.allocator multiple times now that a sequence can use
the same one.
* openmp.cc (gfc_match_omp_clauses, gfc_match_omp_allocate): Use
same allocator expr multiple times.
(is_predefined_allocator): Make static.
(gfc_resolve_omp_allocate): Update/extend restriction checks;
remove sorry message.
(resolve_omp_clauses): Reject corarrays in allocate/allocators
directive.
* parse.cc (check_omp_allocate_stmt): Permit procedure pointers
here (rejected later) for less misleading diagnostic.
* trans-array.cc (gfc_trans_auto_array_allocation): Propagate
size for GOMP_alloc and location to which it should be added to.
* trans-decl.cc (gfc_trans_deferred_vars): Handle 'omp allocate'
for stack variables; sorry for static variables/common blocks.
* trans-openmp.cc (gfc_trans_omp_clauses): Evaluate 'allocate'
clause's allocator only once; fix adding expressions to the
block.
(gfc_trans_omp_single): Pass a block to gfc_trans_omp_clauses.
gcc/ChangeLog:
* gimplify.cc (gimplify_bind_expr): Handle Fortran's
'omp allocate' for stack variables.
libgomp/ChangeLog:
* libgomp.texi (OpenMP Impl. Status): Mention that Fortran now
supports the allocate directive for stack variables.
* testsuite/libgomp.fortran/allocate-5.f90: New test.
* testsuite/libgomp.fortran/allocate-6.f90: New test.
* testsuite/libgomp.fortran/allocate-7.f90: New test.
* testsuite/libgomp.fortran/allocate-8.f90: New test.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/allocate-14.c: Fix directive name.
* c-c++-common/gomp/allocate-15.c: Likewise.
* c-c++-common/gomp/allocate-9.c: Fix comment typo.
* gfortran.dg/gomp/allocate-4.f90: Remove sorry dg-error.
* gfortran.dg/gomp/allocate-7.f90: Likewise.
* gfortran.dg/gomp/allocate-10.f90: New test.
* gfortran.dg/gomp/allocate-11.f90: New test.
* gfortran.dg/gomp/allocate-12.f90: New test.
* gfortran.dg/gomp/allocate-13.f90: New test.
* gfortran.dg/gomp/allocate-14.f90: New test.
* gfortran.dg/gomp/allocate-15.f90: New test.
* gfortran.dg/gomp/allocate-8.f90: New test.
* gfortran.dg/gomp/allocate-9.f90: New test.
libgomp/ChangeLog:
* libgomp.texi (OpenMP Context Selectors): Clarify 'kind' trait
and that other target archs have no 'arch'/'isa' traits implemented.
Call GOMP_alloc/free for 'omp allocate' allocated variables. This is
for C only as C++ and Fortran show a sorry already in the FE. Note that
this only applies to stack variables as the C FE shows a sorry for
static variables.
gcc/ChangeLog:
* gimplify.cc (gimplify_bind_expr): Call GOMP_alloc/free for
'omp allocate' variables; move stack cleanup after other
cleanup.
(omp_notice_variable): Process original decl when decl
of the value-expression for a 'omp allocate' variable is passed.
* omp-low.cc (scan_omp_1_op): Handle 'omp allocate' variables
libgomp/ChangeLog:
* libgomp.texi (OpenMP 5.1 Impl.): Mark 'omp allocate' as
implemented for C only.
* testsuite/libgomp.c/allocate-4.c: New test.
* testsuite/libgomp.c/allocate-5.c: New test.
* testsuite/libgomp.c/allocate-6.c: New test.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/allocate-11.c: Remove C-only dg-message
for 'sorry, unimplemented'.
* c-c++-common/gomp/allocate-12.c: Likewise.
* c-c++-common/gomp/allocate-15.c: Likewise.
* c-c++-common/gomp/allocate-9.c: Likewise.
* c-c++-common/gomp/allocate-10.c: New test.
* c-c++-common/gomp/allocate-17.c: New test.
This patch adds support for (so far C++) omp::decl attribute. For
declare simd and declare variant directives it is essentially another
spelling of omp::decl, except per discussions it is not allowed inside
of omp::sequence attribute. For threadprivate, declare target, allocate
and later groupprivate directives it should appertain to variable (or for
declare target also function definitions and) declarations and where in
normal syntax one specifies a list of variables (or variables and functions),
either as argument of the directive or clause argument, such argument is
not specified and implied to be the variable it applies to.
2023-09-20 Jakub Jelinek <jakub@redhat.com>
PR c++/111392
gcc/
* attribs.cc (decl_attributes): Don't warn on omp::directive attribute
on vars or function decls if -fopenmp or -fopenmp-simd.
gcc/c-family/
* c-omp.cc (c_omp_directives): Add commented out groupprivate
directive entry.
gcc/cp/
* parser.h (struct cp_lexer): Add in_omp_decl_attribute member.
* cp-tree.h (cp_maybe_parse_omp_decl): Declare.
* parser.cc (cp_parser_handle_statement_omp_attributes): Diagnose
omp::decl attribute on statements. Adjust diagnostic wording for
omp::decl.
(cp_parser_omp_directive_args): Add DECL_P argument, set TREE_PUBLIC
to it on the DEFERRED_PARSE tree.
(cp_parser_omp_sequence_args): Adjust caller.
(cp_parser_std_attribute): Handle omp::decl attribute.
(cp_parser_omp_var_list): If parser->lexer->in_omp_decl_attribute
don't expect any arguments, instead create clause or TREE_LIST for
that decl.
(cp_parser_late_parsing_omp_declare_simd): Adjust diagnostic wording
for omp::decl.
(cp_maybe_parse_omp_decl): New function.
(cp_parser_omp_declare_target): If
parser->lexer->in_omp_decl_attribute and first token isn't name or
comma invoke cp_parser_omp_var_list.
* decl2.cc (cplus_decl_attributes): Adjust diagnostic wording for
omp::decl. Handle omp::decl on declarations.
* name-lookup.cc (finish_using_directive): Adjust diagnostic wording
for omp::decl.
gcc/testsuite/
* g++.dg/gomp/attrs-19.C: New test.
* g++.dg/gomp/attrs-20.C: New test.
* g++.dg/gomp/attrs-21.C: New test.
libgomp/
* libgomp.texi: Mark decl attribute was added to the C++ attribute
syntax as implemented.
When copying a 2D or 3D rectangular memmory block, the performance is
better when using CUDA's cuMemcpy2D/cuMemcpy3D instead of copying the
data one by one. That's what this commit does.
Additionally, it permits device-to-device copies, if neccessary using a
temporary variable on the host.
include/ChangeLog:
* cuda/cuda.h (CUlimit): Add CUDA_ERROR_NOT_INITIALIZED,
CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_INVALID_HANDLE.
(CUarray, CUmemorytype, CUDA_MEMCPY2D, CUDA_MEMCPY3D,
CUDA_MEMCPY3D_PEER): New typdefs.
(cuMemcpy2D, cuMemcpy2DAsync, cuMemcpy2DUnaligned,
cuMemcpy3D, cuMemcpy3DAsync, cuMemcpy3DPeer,
cuMemcpy3DPeerAsync): New prototypes.
libgomp/ChangeLog:
* libgomp-plugin.h (GOMP_OFFLOAD_memcpy2d,
GOMP_OFFLOAD_memcpy3d): New prototypes.
* libgomp.h (struct gomp_device_descr): Add memcpy2d_func
and memcpy3d_func.
* libgomp.texi (nvtpx): Document when cuMemcpy2D/cuMemcpy3D is used.
* oacc-host.c (memcpy2d_func, .memcpy3d_func): Init with NULL.
* plugin/cuda-lib.def (cuMemcpy2D, cuMemcpy2DUnaligned,
cuMemcpy3D): Invoke via CUDA_ONE_CALL.
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_memcpy2d,
GOMP_OFFLOAD_memcpy3d): New.
* target.c (omp_target_memcpy_rect_worker):
(omp_target_memcpy_rect_check, omp_target_memcpy_rect_copy):
Permit all device-to-device copyies; invoke new plugins for
2D and 3D copying when available.
(gomp_load_plugin_for_device): DLSYM the new plugin functions.
* testsuite/libgomp.c/target-12.c: Fix dimension bug.
* testsuite/libgomp.fortran/target-12.f90: Likewise.
* testsuite/libgomp.fortran/target-memcpy-rect-1.f90: New test.
The previous list of OpenMP routines was rather lengthy and the order seemed
to be rather random - especially for outputs which did not have @menu as then
the sectioning was not visible.
The OpenMP specification split in 5.1 the lengthy list by adding
sections to the chapter and grouping the routines under them.
This patch follow suite and uses the same sections and order. The commit also
prepares for adding not-yet-documented routines by listening those in the
@menu (@c commented - both for just undocumented and for also unimplemented
routines). See also PR 110364.
libgomp/ChangeLog:
* libgomp.texi (OpenMP Runtime Library Routines):
Split long list by adding sections and moving routines there.
(OMP_ALLOCATORS): Fix typo.
Before this commit, gfortran produced with OpenMP for 'do i = 1,10,2'
the code
for (count.0 = 0; count.0 < 5; count.0 = count.0 + 1)
i = count.0 * 2 + 1;
While such an inner loop can be collapsed, a non-rectangular could not.
With this commit and for all constant loop steps, a simple loop such
as 'for (i = 1; i <= 10; i = i + 2)' is created. (Before only for the
constant steps of 1 and -1.)
The constant step permits to know the direction (increasing/decreasing)
that is required for the loop condition.
The new code is only valid if one assumes no overflow of the loop variable.
However, the Fortran standard can be read that this must be ensured by
the user. Namely, the Fortran standard requires (F2023, 10.1.5.2.4):
"The execution of any numeric operation whose result is not defined by
the arithmetic used by the processor is prohibited."
And, for DO loops, F2023's "11.1.7.4.3 The execution cycle" has the
following: The number of loop iterations handled by an iteration count,
which would permit code like 'do i = huge(i)-5, huge(i),4'. However,
in step (3), this count is not only decremented by one but also:
"... The DO variable, if any, is incremented by the value of the
incrementation parameter m3."
And for the example above, 'i' would be 'huge(i)+3' in the last
execution cycle, which exceeds the largest model number and should
render the example as invalid.
PR fortran/107424
gcc/fortran/ChangeLog:
* trans-openmp.cc (gfc_nonrect_loop_expr): Accept all
constant loop steps.
(gfc_trans_omp_do): Likewise; use sign to determine
loop direction.
libgomp/ChangeLog:
* libgomp.texi (Impl. Status 5.0): Add link to new PR110735.
* testsuite/libgomp.fortran/non-rectangular-loop-1.f90: Enable
commented tests.
* testsuite/libgomp.fortran/non-rectangular-loop-1a.f90: Remove
test file; tests are in non-rectangular-loop-1.f90.
* testsuite/libgomp.fortran/non-rectangular-loop-5.f90: Change
testcase to use a non-constant step to retain the 'sorry' test.
* testsuite/libgomp.fortran/non-rectangular-loop-6.f90: New test.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/linear-2.f90: Update dump to remove
the additional count variable.
libgomp/
* libgomp.texi (OMP_ALLOCATOR): Document the default values for
the traits. Add crossref to 'Memory allocation'.
(Memory allocation): Refer to OMP_ALLOCATOR for the available
traits and allocators/mem spaces; document the default value
for the pool_size trait.
libgomp/
* libgomp.texi (OpenMP 5.0): Replace '... stub' by @ref to
'Memory allocation' section which contains the full status.
(TR11): Remove differently worded duplicated entry.
As with the memkind library, it is only used when found at runtime;
it does not need to be present when building GCC.
The included testcase does not check whether the memory has been placed
on the nearest node as the Linux kernel memory handling too often ignores
that hint, using a different node for the allocation. However, when
running with 'numactl --preferred=<node> ./executable', it is clearly
visible that the feature works by comparing malloc/default vs. nearest
placement (using get_mempolicy to obtain the node for a mem addr).
libgomp/ChangeLog:
* allocator.c: Add ifdef for LIBGOMP_USE_LIBNUMA.
(enum gomp_numa_memkind_kind): Renamed from gomp_memkind_kind;
add GOMP_MEMKIND_LIBNUMA.
(struct gomp_libnuma_data, gomp_init_libnuma, gomp_get_libnuma): New.
(omp_init_allocator): Handle partition=nearest with libnuma if avail.
(omp_aligned_alloc, omp_free, omp_aligned_calloc, omp_realloc): Add
numa_alloc_local (+ memset), numa_free, and numa_realloc calls as
needed.
* config/linux/allocator.c (LIBGOMP_USE_LIBNUMA): Define
* libgomp.texi: Fix a typo; use 'fi' instead of its ligature char.
(Memory allocation): Renamed from 'Memory allocation with libmemkind';
updated for libnuma usage.
* testsuite/libgomp.c-c++-common/alloc-11.c: New test.
* testsuite/libgomp.c-c++-common/alloc-12.c: New test.
libgomp/
* allocator.c (omp_init_allocator): Use malloc for
omp_high_bw_mem_space when the memkind lib is unavailable
instead of returning omp_null_allocator.
* libgomp.texi (OpenMP 5.0): Fix typo.
(Memory allocation with libmemkind): Document implementation
in more detail.
Use @var{} instead of @emph{} - for semantic texinfo formatting; the result
is similar: slanted instead of italic in PDF, still italic in HTML, albeit
in info is is now uppercase instead of '_' as pre/suffix.
The patch also documents the newer _ALL/_DEV/_DEV_<no> env var suffixes
and as it refers to the ICV vars and their scope, those were added to the
OMP_ env vars for reference. For OMP_NESTING, a note that those were
deprecated was added plus a bunch of cross references. For OMP_ALLOCATOR,
add note about the lack of per-device env vars support.
A new section, consisting mostly of cross references was added to document
the implementation-defined ICV initialization, especially as OpenMP demands
that implementations document what they do for 'implementation defined'.
For nvptx, the implementation-defined used stack size was documented
libgomp/
* libgomp.texi: Use @var for ICV vars.
(OpenMP Environment Variables): Mention _ALL/_DEV/_DEV_<no> variants,
document which ICV is set and which scope the ICV has; extend/cleanup
some @ref.
(Implementation-defined ICV Initialization): New.
(nvptx): Document the implementation-defined used per-warp stack size.
For C/C++ pointers, default implicit mapping firstprivatizes the pointer
but if the memory it points to is mapped, the it is updated to point to
the device memory (by attaching a zero sized array section of the pointed-to
storage).
However, if the pointed-to storage wasn't mapped, the pointer was set to
NULL on the device side (OpenMP 5.0/5.1 semantic). With this commit, the
pointer retains the on-host address in that case (OpenMP 5.2 semantic).
The new semantic avoids an explicit map/firstprivate/is_device_ptr in the
following sensible cases: Special values (e.g. pointer or 0x1, 0x2 etc.),
explicitly device allocated memory (e.g. omp_target_alloc), and with
(unified) shared memory.
(Note: With (U)SM, mappings still must be tracked, at least when
omp_target_associate_ptr does not fail when passing in two destinct pointers.)
libgomp/
PR middle-end/110270
* target.c (gomp_map_vars_internal): Copy host value instead of NULL
for GOMP_MAP_ZERO_LEN_ARRAY_SECTION if not mapped.
* libgomp.texi (OpenMP 5.2 Impl.): Mark as 'Y'.
* testsuite/libgomp.c/target-19.c: Update expected value.
* testsuite/libgomp.c++/target-18.C: Likewise.
* testsuite/libgomp.c++/target-19.C: Likewise.
* testsuite/libgomp.c-c++-common/requires-unified-addr-2.c: New test.
* testsuite/libgomp.c-c++-common/target-implicit-map-3.c: New test.
* testsuite/libgomp.c-c++-common/target-implicit-map-4.c: New test.
Support OpenMP 5.1's syntax for OMP_ALLOCATOR as well,
which permits besides predefined allocators also
predefined memspaces optionally followed by traits.
Additionally, this commit adds the previously lacking
documentation for OMP_ALLOCATOR, OMP_AFFINITY_FORMAT
and OMP_DISPLAY_AFFINITY.
libgomp/ChangeLog:
* env.c (gomp_def_allocator_envvar): New var.
(parse_allocator): Handle OpenMP 5.1 syntax.
(cleanup_env): New.
(omp_display_env): Output gomp_def_allocator_envvar
for an allocator with traits.
* libgomp.texi (OMP_ALLOCATOR, OMP_AFFINITY_FORMAT,
OMP_DISPLAY_AFFINITY): New.
* testsuite/libgomp.c/allocator-1.c: New test.
* testsuite/libgomp.c/allocator-2.c: New test.
* testsuite/libgomp.c/allocator-3.c: New test.
* testsuite/libgomp.c/allocator-4.c: New test.
* testsuite/libgomp.c/allocator-5.c: New test.
* testsuite/libgomp.c/allocator-6.c: New test.
OMP_TARGET_OFFLOAD=mandatory handling was before inconsistent. Hence, in
OpenMP 5.2 it was clarified/extended by having implications on the
default-device-var; additionally, omp_initial_device and omp_invalid_device
enum values/PARAMETERs were added; support for it was added
in r13-1066-g1158fe43407568 including aborting for omp_invalid_device and
non-conforming device numbers. Only the mandatory handling was missing.
Namely, while the default-device-var is usually initialized to value 0,
with 'mandatory' it must have the value 'omp_invalid_device' if and only if
zero non-host devices are available. (The OMP_DEFAULT_DEVICE env var
overrides this as it comes semantically after the initialization.)
To achieve this, default-device-var is now initialized to MIN_INT. If
there is no 'mandatory', it is set to 0 directly after env var parsing.
Otherwise, it is updated in gomp_target_init to either 0 or
omp_invalid_device. To ensure INT_MIN is never seen by the user, both
the omp_get_default_device API routine and omp_display_env (user call
and OMP_DISPLAY_ENV env var) call gomp_init_targets_once() in that case.
libgomp/ChangeLog:
* env.c (gomp_default_icv_values): Init default_device_var to
an nonconforming value - INT_MIN.
(initialize_env): After env-var parsing, set default_device_var to
device 0 unless OMP_TARGET_OFFLOAD=mandatory.
(omp_display_env): If default_device_var is INT_MIN, call
gomp_init_targets_once.
* icv-device.c (omp_get_default_device): Likewise.
* libgomp.texi (OMP_DEFAULT_DEVICE): Update init description.
(OpenMP 5.2 Impl. Status): Mark OMP_TARGET_OFFLOAD=mandatory as 'Y'.
* target.c (resolve_device): Improve error message device-num < 0
with 'mandatory' and no no-host devices available.
(gomp_target_init): Set default-device-var if INT_MIN.
* testsuite/libgomp.c/target-48.c: New test.
* testsuite/libgomp.c/target-49.c: New test.
* testsuite/libgomp.c/target-50.c: New test.
* testsuite/libgomp.c/target-50a.c: New test.
* testsuite/libgomp.c/target-51.c: New test.
* testsuite/libgomp.c/target-52.c: New test.
* testsuite/libgomp.c/target-53.c: New test.
* testsuite/libgomp.c/target-54.c: New test.
Effectively, for GCN (as for nvptx) there is a common address space between
host and device, whether being accessible or not. Thus, this commit
permits to use 'omp requires unified_address' with GCN devices.
(nvptx accepts this requirement since r13-3460-g131d18e928a3ea.)
libgomp/
* plugin/plugin-gcn.c (GOMP_OFFLOAD_get_num_devices): Regard
unified_address requirement as supported.
* libgomp.texi (OpenMP 5.0, AMD Radeon, nvptx): Remove
'unified_address' from the not-supported requirements.
This implements support for the OpenMP 5.1 'present' modifier, which can be
used in map clauses in the 'target', 'target data', 'target data enter' and
'target data exit' constructs, and in the 'to' and 'from' clauses of the
'target update' construct. It is also supported in defaultmap.
The modifier triggers a fatal runtime error if the data specified by the
clause is not already present on the target device. It can also be combined
with 'always' in map clauses.
2023-06-06 Kwok Cheung Yeung <kcy@codesourcery.com>
Tobias Burnus <tobias@codesourcery.com>
gcc/c/
* c-parser.cc (c_parser_omp_clause_defaultmap,
c_parser_omp_clause_map): Parse 'present'.
(c_parser_omp_clause_to, c_parser_omp_clause_from): Remove.
(c_parser_omp_clause_from_to): New; parse to/from clauses with
optional present modifer.
(c_parser_omp_all_clauses): Update call.
(c_parser_omp_target_data, c_parser_omp_target_enter_data,
c_parser_omp_target_exit_data): Handle new map enum values
for 'present' mapping.
gcc/cp/
* parser.cc (cp_parser_omp_clause_defaultmap,
cp_parser_omp_clause_map): Parse 'present'.
(cp_parser_omp_clause_from_to): New; parse to/from
clauses with optional 'present' modifier.
(cp_parser_omp_all_clauses): Update call.
(cp_parser_omp_target_data, cp_parser_omp_target_enter_data,
cp_parser_omp_target_exit_data): Handle new enum value for
'present' mapping.
* semantics.cc (finish_omp_target): Likewise.
gcc/fortran/
* dump-parse-tree.cc (show_omp_namelist): Display 'present' map
modifier.
(show_omp_clauses): Display 'present' motion modifier for 'to'
and 'from' clauses.
* gfortran.h (enum gfc_omp_map_op): Add entries with 'present'
modifiers.
(struct gfc_omp_namelist): Add 'present_modifer'.
* openmp.cc (gfc_match_motion_var_list): New, handles optional
'present' modifier for to/from clauses.
(gfc_match_omp_clauses): Call it for to/from clauses; parse 'present'
in defaultmap and map clauses.
(resolve_omp_clauses): Allow 'present' modifiers on 'target',
'target data', 'target enter' and 'target exit' directives.
* trans-openmp.cc (gfc_trans_omp_clauses): Apply 'present' modifiers
to tree node for 'map', 'to' and 'from' clauses. Apply 'present' for
defaultmap.
gcc/
* gimplify.cc (omp_notice_variable): Apply GOVD_MAP_ALLOC_ONLY flag
and defaultmap flags if the defaultmap has GOVD_MAP_FORCE_PRESENT flag
set.
(omp_get_attachment): Handle map clauses with 'present' modifier.
(omp_group_base): Likewise.
(gimplify_scan_omp_clauses): Reorder present maps to come first.
Set GOVD flags for present defaultmaps.
(gimplify_adjust_omp_clauses_1): Set map kind for present defaultmaps.
* omp-low.cc (scan_sharing_clauses): Handle 'always, present' map
clauses.
(lower_omp_target): Handle map clauses with 'present' modifier.
Handle 'to' and 'from' clauses with 'present'.
* tree-core.h (enum omp_clause_defaultmap_kind): Add
OMP_CLAUSE_DEFAULTMAP_PRESENT defaultmap kind.
* tree-pretty-print.cc (dump_omp_clause): Handle 'map', 'to' and
'from' clauses with 'present' modifier. Handle present defaultmap.
* tree.h (OMP_CLAUSE_MOTION_PRESENT): New #define.
include/
* gomp-constants.h (GOMP_MAP_FLAG_SPECIAL_5): New.
(GOMP_MAP_FLAG_FORCE): Redefine.
(GOMP_MAP_FLAG_PRESENT, GOMP_MAP_FLAG_ALWAYS_PRESENT): New.
(enum gomp_map_kind): Add map kinds with 'present' modifiers.
(GOMP_MAP_COPY_TO_P, GOMP_MAP_COPY_FROM_P): Evaluate to true for
map variants with 'present'
(GOMP_MAP_ALWAYS_TO_P, GOMP_MAP_ALWAYS_FROM_P): Evaluate to true
for map variants with 'always, present' modifiers.
(GOMP_MAP_ALWAYS): Redefine.
(GOMP_MAP_FORCE_P, GOMP_MAP_PRESENT_P): New.
libgomp/
* libgomp.texi (OpenMP 5.1 Impl. status): Set 'present' support for
defaultmap to 'Y', add 'Y' entry for 'present' on to/from/map clauses.
* target.c (gomp_to_device_kind_p): Add map kinds with 'present'
modifier.
(gomp_map_vars_existing): Use new GOMP_MAP_FORCE_P macro.
(gomp_map_vars_internal, gomp_update, gomp_target_rev):
Emit runtime error if memory region not present.
* testsuite/libgomp.c-c++-common/target-present-1.c: New test.
* testsuite/libgomp.c-c++-common/target-present-2.c: New test.
* testsuite/libgomp.c-c++-common/target-present-3.c: New test.
* testsuite/libgomp.fortran/target-present-1.f90: New test.
* testsuite/libgomp.fortran/target-present-2.f90: New test.
* testsuite/libgomp.fortran/target-present-3.f90: New test.
gcc/testsuite/
* c-c++-common/gomp/map-6.c: Update dg-error, extend to test for
duplicated 'present' and extend scan-dump tests for 'present'.
* gfortran.dg/gomp/defaultmap-1.f90: Update dg-error.
* gfortran.dg/gomp/map-7.f90: Extend parse and dump test for
'present'.
* gfortran.dg/gomp/map-8.f90: Extend for duplicate 'present'
modifier checking.
* c-c++-common/gomp/defaultmap-4.c: New test.
* c-c++-common/gomp/map-9.c: New test.
* c-c++-common/gomp/target-update-1.c: New test.
* gfortran.dg/gomp/defaultmap-8.f90: New test.
* gfortran.dg/gomp/map-11.f90: New test.
* gfortran.dg/gomp/map-12.f90: New test.
* gfortran.dg/gomp/target-update-1.f90: New test.
Update permitted directives for directives marked in OpenMP's 5.2 as pure.
To ensure that list is updated, unimplemented directives are placed into
pure-2.f90 such the test FAILs once a known to be pure directive is
implemented without handling its pureness.
gcc/fortran/ChangeLog:
* parse.cc (decode_omp_directive): Accept all pure directives
inside a PURE procedures; handle 'error at(execution).
libgomp/ChangeLog:
* libgomp.texi (OpenMP 5.2): Mark pure-directive handling as 'Y'.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/nothing-2.f90: Remove one dg-error.
* gfortran.dg/gomp/pr79154-2.f90: Update expected dg-error wording.
* gfortran.dg/gomp/pr79154-simd.f90: Likewise.
* gfortran.dg/gomp/pure-1.f90: New test.
* gfortran.dg/gomp/pure-2.f90: New test.
* gfortran.dg/gomp/pure-3.f90: New test.
* gfortran.dg/gomp/pure-4.f90: New test.
I decided to check for repeated the the in libgomp and noticed
there are several occurrences of a typo theads rather than threads
in libgomp.texi.
2023-02-16 Jakub Jelinek <jakub@redhat.com>
* libgomp.texi: Fix typos - theads -> threads.