These test cases being 'dg-skip-if [...] { ! amdgcn-*-* }' meant they were
only ever considered for 'amdgcn-*-*' target -- that is, GCN *target*, not
GCN *offload target*, what is intended to be tested here, but for which they
thus always were UNSUPPORTED. Use the same style of 'dg-[...]' directives
as for the nvptx offloading test cases, 'libgomp.c/declare-variant-3*.c'.
Fix-up for commit 1fd508744e
"amdgcn: Support AMD-specific 'isa' traits in OpenMP context selectors".
libgomp/
* testsuite/libgomp.c/declare-variant-4-fiji.c: Adjust.
* testsuite/libgomp.c/declare-variant-4-gfx803.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx900.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx906.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx908.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx90a.c: Likewise.
* testsuite/libgomp.c/declare-variant-4.h: Likewise.
* testsuite/libgomp.c/declare-variant-4.c: New.
Having nvptx offloading configured doesn't imply being able to run nvptx
offloading test cases on the test host.
Also, make 'libgomp.c/declare-variant-3.c' work for all non-offloading and
offloading cases.
Fix-up for commit 59b8ade887
"[libgomp, testsuite, nvptx] Add libgomp.c/declare-variant-3-sm*.c".
libgomp/
* testsuite/libgomp.c/declare-variant-3-sm30.c: Turn 'dg-do run'
into 'dg-do link'.
* testsuite/libgomp.c/declare-variant-3.c: New.
* testsuite/libgomp.c/declare-variant-3.h: Extend.
... to care for the case where not just one but both of GCN and nvptx
offloading are enabled. In that case, we currently get:
UNRESOLVED: libgomp.c/declare-variant-3-sm30.c scan-amdgcn-amdhsa-offload-tree-dump optimized "= f30 \\(\\);"
... in addition to:
PASS: libgomp.c/declare-variant-3-sm30.c scan-nvptx-none-offload-tree-dump optimized "= f30 \\(\\);"
Etc.
Fix-up for commit 59b8ade887
"[libgomp, testsuite, nvptx] Add libgomp.c/declare-variant-3-sm*.c",
and commit 1fd508744e
"amdgcn: Support AMD-specific 'isa' traits in OpenMP context selectors".
libgomp/
* testsuite/libgomp.c/declare-variant-3-sm30.c: Restrict
'scan-offload-tree-dump' to 'only_for_offload_target nvptx-none'.
* testsuite/libgomp.c/declare-variant-3-sm35.c: Likewise.
* testsuite/libgomp.c/declare-variant-3-sm53.c: Likewise.
* testsuite/libgomp.c/declare-variant-3-sm70.c: Likewise.
* testsuite/libgomp.c/declare-variant-3-sm75.c: Likewise.
* testsuite/libgomp.c/declare-variant-3-sm80.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-fiji.c: Restrict
'scan-offload-tree-dump' to
'only_for_offload_target amdgcn-amdhsa'.
* testsuite/libgomp.c/declare-variant-4-gfx803.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx900.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx906.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx908.c: Likewise.
* testsuite/libgomp.c/declare-variant-4-gfx90a.c: Likewise.
This adds support for the 'indirect' clause in the 'declare target'
directive. Functions declared as indirect may be called via function
pointers passed from the host in offloaded code.
Virtual calls to member functions via the object pointer in C++ are
currently not supported in target regions.
2023-11-07 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/c-family/
* c-attribs.cc (c_common_attribute_table): Add attribute for
indirect functions.
* c-pragma.h (enum parma_omp_clause): Add entry for indirect clause.
gcc/c/
* c-decl.cc (c_decl_attributes): Add attribute for indirect
functions.
* c-lang.h (c_omp_declare_target_attr): Add indirect field.
* c-parser.cc (c_parser_omp_clause_name): Handle indirect clause.
(c_parser_omp_clause_indirect): New.
(c_parser_omp_all_clauses): Handle indirect clause.
(OMP_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask.
(c_parser_omp_declare_target): Handle indirect clause. Emit error
message if device_type or indirect clauses used alone. Emit error
if indirect clause used with device_type that is not 'any'.
(OMP_BEGIN_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask.
(c_parser_omp_begin): Handle indirect clause.
* c-typeck.cc (c_finish_omp_clauses): Handle indirect clause.
gcc/cp/
* cp-tree.h (cp_omp_declare_target_attr): Add indirect field.
* decl2.cc (cplus_decl_attributes): Add attribute for indirect
functions.
* parser.cc (cp_parser_omp_clause_name): Handle indirect clause.
(cp_parser_omp_clause_indirect): New.
(cp_parser_omp_all_clauses): Handle indirect clause.
(handle_omp_declare_target_clause): Add extra parameter. Add
indirect attribute for indirect functions.
(OMP_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask.
(cp_parser_omp_declare_target): Handle indirect clause. Emit error
message if device_type or indirect clauses used alone. Emit error
if indirect clause used with device_type that is not 'any'.
(OMP_BEGIN_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask.
(cp_parser_omp_begin): Handle indirect clause.
* semantics.cc (finish_omp_clauses): Handle indirect clause.
gcc/
* lto-cgraph.cc (enum LTO_symtab_tags): Add tag for indirect
functions.
(output_offload_tables): Write indirect functions.
(input_offload_tables): read indirect functions.
* lto-section-names.h (OFFLOAD_IND_FUNC_TABLE_SECTION_NAME): New.
* omp-builtins.def (BUILT_IN_GOMP_TARGET_MAP_INDIRECT_PTR): New.
* omp-offload.cc (offload_ind_funcs): New.
(omp_discover_implicit_declare_target): Add functions marked with
'omp declare target indirect' to indirect functions list.
(omp_finish_file): Add indirect functions to section for offload
indirect functions.
(execute_omp_device_lower): Redirect indirect calls on target by
passing function pointer to BUILT_IN_GOMP_TARGET_MAP_INDIRECT_PTR.
(pass_omp_device_lower::gate): Run pass_omp_device_lower if
indirect functions are present on an accelerator device.
* omp-offload.h (offload_ind_funcs): New.
* tree-core.h (omp_clause_code): Add OMP_CLAUSE_INDIRECT.
* tree.cc (omp_clause_num_ops): Add entry for OMP_CLAUSE_INDIRECT.
(omp_clause_code_name): Likewise.
* tree.h (OMP_CLAUSE_INDIRECT_EXPR): New.
* config/gcn/mkoffload.cc (process_asm): Process offload_ind_funcs
section. Count number of indirect functions.
(process_obj): Emit number of indirect functions.
* config/nvptx/mkoffload.cc (ind_func_ids, ind_funcs_tail): New.
(process): Emit offload_ind_func_table in PTX code. Emit indirect
function names and count in image.
* config/nvptx/nvptx.cc (nvptx_record_offload_symbol): Mark
indirect functions in PTX code with IND_FUNC_MAP.
gcc/testsuite/
* c-c++-common/gomp/declare-target-7.c: Update expected error message.
* c-c++-common/gomp/declare-target-indirect-1.c: New.
* c-c++-common/gomp/declare-target-indirect-2.c: New.
* g++.dg/gomp/attrs-21.C (v12): Update expected error message.
* g++.dg/gomp/declare-target-indirect-1.C: New.
* gcc.dg/gomp/attrs-21.c (v12): Update expected error message.
include/
* gomp-constants.h (GOMP_VERSION): Increment to 3.
(GOMP_VERSION_SUPPORTS_INDIRECT_FUNCS): New.
libgcc/
* offloadstuff.c (OFFLOAD_IND_FUNC_TABLE_SECTION_NAME): New.
(__offload_ind_func_table): New.
(__offload_ind_funcs_end): New.
(__OFFLOAD_TABLE__): Add entries for indirect functions.
libgomp/
* Makefile.am (libgomp_la_SOURCES): Add target-indirect.c.
* Makefile.in: Regenerate.
* libgomp-plugin.h (GOMP_INDIRECT_ADDR_MAP): New define.
(GOMP_OFFLOAD_load_image): Add extra argument.
* libgomp.h (struct indirect_splay_tree_key_s): New.
(indirect_splay_tree_node, indirect_splay_tree,
indirect_splay_tree_key): New.
(indirect_splay_compare): New.
* libgomp.map (GOMP_5.1.1): Add GOMP_target_map_indirect_ptr.
* libgomp.texi (OpenMP 5.1): Update documentation on indirect
calls in target region and on indirect clause.
(Other new OpenMP 5.2 features): Add entry for virtual function calls.
* libgomp_g.h (GOMP_target_map_indirect_ptr): Add prototype.
* oacc-host.c (host_load_image): Add extra argument.
* target.c (gomp_load_image_to_device): If the GOMP_VERSION is high
enough, read host indirect functions table and pass to
load_image_func.
* config/accel/target-indirect.c: New.
* config/linux/target-indirect.c: New.
* config/gcn/team.c (build_indirect_map): Add prototype.
(gomp_gcn_enter_kernel): Initialize support for indirect
function calls on GCN target.
* config/nvptx/team.c (build_indirect_map): Add prototype.
(gomp_nvptx_main): Initialize support for indirect function
calls on NVPTX target.
* plugin/plugin-gcn.c (struct gcn_image_desc): Add field for
indirect functions count.
(GOMP_OFFLOAD_load_image): Add extra argument. If the GOMP_VERSION
is high enough, build address translation table and copy it to target
memory.
* plugin/plugin-nvptx.c (nvptx_tdata): Add field for indirect
functions count.
(GOMP_OFFLOAD_load_image): Add extra argument. If the GOMP_VERSION
is high enough, Build address translation table and copy it to target
memory.
* testsuite/libgomp.c-c++-common/declare-target-indirect-1.c: New.
* testsuite/libgomp.c-c++-common/declare-target-indirect-2.c: New.
* testsuite/libgomp.c++/declare-target-indirect-1.C: New.
This patch implements the 'self' clause for compute constructs: parallel,
kernels, and serial. This clause conditionally uses the local device
(the host mult-core CPU) as the executing device of the compute region.
The actual implementation of the "local device" device type inside libgomp
(presumably using pthreads) is still not yet completed, so the libgomp
side is still implemented the exact same as host-fallback mode. (so as of now,
it essentially behaves like the 'if' clause with the condition inverted)
gcc/c/ChangeLog:
* c-parser.cc (c_parser_oacc_compute_clause_self): New function.
(c_parser_oacc_all_clauses): Add new 'bool compute_p = false'
parameter, add parsing of self clause when compute_p is true.
(OACC_KERNELS_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_SELF.
(OACC_PARALLEL_CLAUSE_MASK): Likewise,
(OACC_SERIAL_CLAUSE_MASK): Likewise.
(c_parser_oacc_compute): Adjust call to c_parser_oacc_all_clauses to
set compute_p argument to true.
* c-typeck.cc (c_finish_omp_clauses): Add OMP_CLAUSE_SELF case.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_oacc_compute_clause_self): New function.
(cp_parser_oacc_all_clauses): Add new 'bool compute_p = false'
parameter, add parsing of self clause when compute_p is true.
(OACC_KERNELS_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_SELF.
(OACC_PARALLEL_CLAUSE_MASK): Likewise,
(OACC_SERIAL_CLAUSE_MASK): Likewise.
(cp_parser_oacc_compute): Adjust call to c_parser_oacc_all_clauses to
set compute_p argument to true.
* pt.cc (tsubst_omp_clauses): Add OMP_CLAUSE_SELF case.
* semantics.cc (c_finish_omp_clauses): Add OMP_CLAUSE_SELF case, merged
with OMP_CLAUSE_IF case.
gcc/fortran/ChangeLog:
* gfortran.h (typedef struct gfc_omp_clauses): Add self_expr field.
* openmp.cc (enum omp_mask2): Add OMP_CLAUSE_SELF.
(gfc_match_omp_clauses): Add handling for OMP_CLAUSE_SELF.
(OACC_PARALLEL_CLAUSES): Add OMP_CLAUSE_SELF.
(OACC_KERNELS_CLAUSES): Likewise.
(OACC_SERIAL_CLAUSES): Likewise.
(resolve_omp_clauses): Add handling for omp_clauses->self_expr.
* trans-openmp.cc (gfc_trans_omp_clauses): Add handling of
clauses->self_expr and building of OMP_CLAUSE_SELF tree clause.
(gfc_split_omp_clauses): Add handling of self_expr field copy.
gcc/ChangeLog:
* gimplify.cc (gimplify_scan_omp_clauses): Add OMP_CLAUSE_SELF case.
(gimplify_adjust_omp_clauses): Likewise.
* omp-expand.cc (expand_omp_target): Add OMP_CLAUSE_SELF expansion code,
* omp-low.cc (scan_sharing_clauses): Add OMP_CLAUSE_SELF case.
* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_SELF enum.
* tree-nested.cc (convert_nonlocal_omp_clauses): Add OMP_CLAUSE_SELF
case.
(convert_local_omp_clauses): Likewise.
* tree-pretty-print.cc (dump_omp_clause): Add OMP_CLAUSE_SELF case.
* tree.cc (omp_clause_num_ops): Add OMP_CLAUSE_SELF entry.
(omp_clause_code_name): Likewise.
* tree.h (OMP_CLAUSE_SELF_EXPR): New macro.
gcc/testsuite/ChangeLog:
* c-c++-common/goacc/self-clause-1.c: New test.
* c-c++-common/goacc/self-clause-2.c: New test.
* gfortran.dg/goacc/self.f95: New test.
include/ChangeLog:
* gomp-constants.h (GOACC_FLAG_LOCAL_DEVICE): New flag bit value.
libgomp/ChangeLog:
* oacc-parallel.c (GOACC_parallel_keyed): Add code to handle
GOACC_FLAG_LOCAL_DEVICE case.
* testsuite/libgomp.oacc-c-c++-common/self-1.c: New test.
gcc/fortran/ChangeLog:
* gfortran.h (ext_attr_t): Add omp_allocate flag.
* match.cc (gfc_free_omp_namelist): Void deleting same
u2.allocator multiple times now that a sequence can use
the same one.
* openmp.cc (gfc_match_omp_clauses, gfc_match_omp_allocate): Use
same allocator expr multiple times.
(is_predefined_allocator): Make static.
(gfc_resolve_omp_allocate): Update/extend restriction checks;
remove sorry message.
(resolve_omp_clauses): Reject corarrays in allocate/allocators
directive.
* parse.cc (check_omp_allocate_stmt): Permit procedure pointers
here (rejected later) for less misleading diagnostic.
* trans-array.cc (gfc_trans_auto_array_allocation): Propagate
size for GOMP_alloc and location to which it should be added to.
* trans-decl.cc (gfc_trans_deferred_vars): Handle 'omp allocate'
for stack variables; sorry for static variables/common blocks.
* trans-openmp.cc (gfc_trans_omp_clauses): Evaluate 'allocate'
clause's allocator only once; fix adding expressions to the
block.
(gfc_trans_omp_single): Pass a block to gfc_trans_omp_clauses.
gcc/ChangeLog:
* gimplify.cc (gimplify_bind_expr): Handle Fortran's
'omp allocate' for stack variables.
libgomp/ChangeLog:
* libgomp.texi (OpenMP Impl. Status): Mention that Fortran now
supports the allocate directive for stack variables.
* testsuite/libgomp.fortran/allocate-5.f90: New test.
* testsuite/libgomp.fortran/allocate-6.f90: New test.
* testsuite/libgomp.fortran/allocate-7.f90: New test.
* testsuite/libgomp.fortran/allocate-8.f90: New test.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/allocate-14.c: Fix directive name.
* c-c++-common/gomp/allocate-15.c: Likewise.
* c-c++-common/gomp/allocate-9.c: Fix comment typo.
* gfortran.dg/gomp/allocate-4.f90: Remove sorry dg-error.
* gfortran.dg/gomp/allocate-7.f90: Likewise.
* gfortran.dg/gomp/allocate-10.f90: New test.
* gfortran.dg/gomp/allocate-11.f90: New test.
* gfortran.dg/gomp/allocate-12.f90: New test.
* gfortran.dg/gomp/allocate-13.f90: New test.
* gfortran.dg/gomp/allocate-14.f90: New test.
* gfortran.dg/gomp/allocate-15.f90: New test.
* gfortran.dg/gomp/allocate-8.f90: New test.
* gfortran.dg/gomp/allocate-9.f90: New test.
For strictly structured blocks, a BLOCK was created but the code
was placed after the block the outer structured block. Additionally,
labelled blocks were mishandled. As the code is now properly in a
BLOCK, it solves additional issues.
gcc/fortran/ChangeLog:
* parse.cc (parse_omp_structured_block): Make the user code end
up inside of BLOCK construct for strictly structured blocks;
fix fallout for 'section' and 'teams'.
* openmp.cc (resolve_omp_target): Fix changed BLOCK handling
for teams in target checking.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/strictly-structured-block-1.f90: New test.
gcc/testsuite/ChangeLog:
* gfortran.dg/block_17.f90: New test.
* gfortran.dg/gomp/strictly-structured-block-5.f90: New test.
Call GOMP_alloc/free for 'omp allocate' allocated variables. This is
for C only as C++ and Fortran show a sorry already in the FE. Note that
this only applies to stack variables as the C FE shows a sorry for
static variables.
gcc/ChangeLog:
* gimplify.cc (gimplify_bind_expr): Call GOMP_alloc/free for
'omp allocate' variables; move stack cleanup after other
cleanup.
(omp_notice_variable): Process original decl when decl
of the value-expression for a 'omp allocate' variable is passed.
* omp-low.cc (scan_omp_1_op): Handle 'omp allocate' variables
libgomp/ChangeLog:
* libgomp.texi (OpenMP 5.1 Impl.): Mark 'omp allocate' as
implemented for C only.
* testsuite/libgomp.c/allocate-4.c: New test.
* testsuite/libgomp.c/allocate-5.c: New test.
* testsuite/libgomp.c/allocate-6.c: New test.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/allocate-11.c: Remove C-only dg-message
for 'sorry, unimplemented'.
* c-c++-common/gomp/allocate-12.c: Likewise.
* c-c++-common/gomp/allocate-15.c: Likewise.
* c-c++-common/gomp/allocate-9.c: Likewise.
* c-c++-common/gomp/allocate-10.c: New test.
* c-c++-common/gomp/allocate-17.c: New test.
For nvptx offloading, it'll FAIL its execution test until nvptx-tools updated
to include commit 1b5946d78ef5dcfb640e9f545a7c791b7f623911
"Merge commit '26095fd01232061de9f79decb3e8222ef7b46191' into HEAD [#29]",
<1b5946d78e>.
libgomp/
* testsuite/libgomp.c-c++-common/pr100059-1.c: New.
Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
OpenMP 5.0 removed the restriction that multiple collapsed loops must
be perfectly nested, allowing "intervening code" (including nested
BLOCKs) before or after each nested loop. In GCC this code is moved
into the inner loop body by the respective front ends.
In the Fortran front end, most of the semantic processing happens during
the translation phase, so the parse phase just collects the intervening
statements, checks them for errors, and splices them around the loop body.
gcc/fortran/ChangeLog
* gfortran.h (struct gfc_namespace): Add omp_structured_block bit.
* openmp.cc: Include omp-api.h.
(resolve_omp_clauses): Consolidate inscan reduction clause conflict
checking here.
(find_nested_loop_in_chain): New.
(find_nested_loop_in_block): New.
(gfc_resolve_omp_do_blocks): Set omp_current_do_collapse properly.
Handle imperfectly-nested loops when looking for nested omp scan.
Refactor to move inscan reduction clause conflict checking to
resolve_omp_clauses.
(gfc_resolve_do_iterator): Handle imperfectly-nested loops.
(struct icode_error_state): New.
(icode_code_error_callback): New.
(icode_expr_error_callback): New.
(diagnose_intervening_code_errors_1): New.
(diagnose_intervening_code_errors): New.
(make_structured_block): New.
(restructure_intervening_code): New.
(is_outer_iteration_variable): Do not assume loops are perfectly
nested.
(check_nested_loop_in_chain): New.
(check_nested_loop_in_block_state): New.
(check_nested_loop_in_block_symbol): New.
(check_nested_loop_in_block): New.
(expr_uses_intervening_var): New.
(is_intervening_var): New.
(expr_is_invariant): Do not assume loops are perfectly nested.
(resolve_omp_do): Handle imperfectly-nested loops.
* trans-stmt.cc (gfc_trans_block_construct): Generate
OMP_STRUCTURED_BLOCK if magic bit is set on block namespace.
gcc/testsuite/ChangeLog
* gfortran.dg/gomp/collapse1.f90: Adjust expected errors.
* gfortran.dg/gomp/collapse2.f90: Likewise.
* gfortran.dg/gomp/imperfect-gotos.f90: New.
* gfortran.dg/gomp/imperfect-invalid-scope.f90: New.
* gfortran.dg/gomp/imperfect1.f90: New.
* gfortran.dg/gomp/imperfect2.f90: New.
* gfortran.dg/gomp/imperfect3.f90: New.
* gfortran.dg/gomp/imperfect4.f90: New.
* gfortran.dg/gomp/imperfect5.f90: New.
libgomp/ChangeLog
* testsuite/libgomp.fortran/imperfect-destructor.f90: New.
* testsuite/libgomp.fortran/imperfect1.f90: New.
* testsuite/libgomp.fortran/imperfect2.f90: New.
* testsuite/libgomp.fortran/imperfect3.f90: New.
* testsuite/libgomp.fortran/imperfect4.f90: New.
* testsuite/libgomp.fortran/target-imperfect1.f90: New.
* testsuite/libgomp.fortran/target-imperfect2.f90: New.
* testsuite/libgomp.fortran/target-imperfect3.f90: New.
* testsuite/libgomp.fortran/target-imperfect4.f90: New.
OpenMP 5.0 removed the restriction that multiple collapsed loops must
be perfectly nested, allowing "intervening code" (including nested
BLOCKs) before or after each nested loop. In GCC this code is moved
into the inner loop body by the respective front ends.
This patch changes the C++ front end to use recursive descent parsing
on nested loops within an "omp for" construct, rather than an
iterative approach, in order to preserve proper nesting of compound
statements. Preserving cleanups (destructors) for class objects
declared in intervening code and loop initializers complicates moving
the former into the body of the loop; this is handled by parsing the
entire construct before reassembling any of it.
gcc/cp/ChangeLog
* cp-tree.h (cp_convert_omp_range_for): Adjust declaration.
* parser.cc (struct omp_for_parse_data): New.
(cp_parser_postfix_expression): Diagnose calls to OpenMP runtime
in intervening code.
(check_omp_intervening_code): New.
(cp_parser_statement_seq_opt): Special-case nested loops, blocks,
and other constructs for OpenMP loops.
(cp_parser_iteration_statement): Reject loops in intervening code.
(cp_parser_omp_for_loop_init): Expand comments and tweak the
interface slightly to better distinguish input/output parameters.
(cp_convert_omp_range_for): Likewise.
(cp_parser_omp_loop_nest): New, split from cp_parser_omp_for_loop
and largely rewritten. Add more comments.
(insert_structured_blocks): New.
(find_structured_blocks): New.
(struct sit_data, substitute_in_tree_walker, substitute_in_tree):
New.
(fixup_blocks_walker): New.
(cp_parser_omp_for_loop): Rewrite to use recursive descent instead
of a loop. Add logic to reshuffle the bits of code collected
during parsing so intervening code gets moved to the loop body.
(cp_parser_omp_loop): Remove call to finish_omp_for_block, which
is now redundant.
(cp_parser_omp_simd): Likewise.
(cp_parser_omp_for): Likewise.
(cp_parser_omp_distribute): Likewise.
(cp_parser_oacc_loop): Likewise.
(cp_parser_omp_taskloop): Likewise.
(cp_parser_pragma): Reject OpenMP pragmas in intervening code.
* parser.h (struct cp_parser): Add omp_for_parse_state field.
* pt.cc (tsubst_omp_for_iterator): Adjust call to
cp_convert_omp_range_for.
* semantics.cc (finish_omp_for): Try harder to preserve location
of loop variable init expression for use in diagnostics.
(struct fofb_data, finish_omp_for_block_walker): New.
(finish_omp_for_block): Allow variables to be bound in a BIND_EXPR
nested inside BIND instead of directly in BIND itself.
gcc/testsuite/ChangeLog
* c-c++-common/goacc/tile-2.c: Adjust expected error patterns.
* g++.dg/gomp/attrs-imperfect1.C: New test.
* g++.dg/gomp/attrs-imperfect2.C: New test.
* g++.dg/gomp/attrs-imperfect3.C: New test.
* g++.dg/gomp/attrs-imperfect4.C: New test.
* g++.dg/gomp/attrs-imperfect5.C: New test.
* g++.dg/gomp/pr41967.C: Adjust expected error patterns.
* g++.dg/gomp/tpl-imperfect-gotos.C: New test.
* g++.dg/gomp/tpl-imperfect-invalid-scope.C: New test.
libgomp/ChangeLog
* testsuite/libgomp.c++/attrs-imperfect1.C: New test.
* testsuite/libgomp.c++/attrs-imperfect2.C: New test.
* testsuite/libgomp.c++/attrs-imperfect3.C: New test.
* testsuite/libgomp.c++/attrs-imperfect4.C: New test.
* testsuite/libgomp.c++/attrs-imperfect5.C: New test.
* testsuite/libgomp.c++/attrs-imperfect6.C: New test.
* testsuite/libgomp.c++/imperfect-class-1.C: New test.
* testsuite/libgomp.c++/imperfect-class-2.C: New test.
* testsuite/libgomp.c++/imperfect-class-3.C: New test.
* testsuite/libgomp.c++/imperfect-destructor.C: New test.
* testsuite/libgomp.c++/imperfect-template-1.C: New test.
* testsuite/libgomp.c++/imperfect-template-2.C: New test.
* testsuite/libgomp.c++/imperfect-template-3.C: New test.
The following functions are not standard, and not always available
(e.g., on darwin). They should not be called unless available: gamma,
gammaf, scalb, scalbf, significand, and significandf.
libgomp/ChangeLog:
* testsuite/lib/libgomp.exp: Add effective target.
* testsuite/libgomp.c/simd-math-1.c: Avoid calling nonstandard
functions.
Before commit r12-5295-g47de0b56ee455e, all gimple_build_cond in
expand_omp_for_* were inserted with
gsi_insert_before (gsi_p, cond_stmt, GSI_SAME_STMT);
except the one dealing with the multiplicative factor that was
gsi_insert_after (gsi, cond_stmt, GSI_CONTINUE_LINKING);
That commit for PR103208 fixed the issue of some missing regimplify of
operands of GIMPLE_CONDs by moving the condition handling to the new function
expand_omp_build_cond. While that function has an 'bool after = false'
argument to switch between the two variants.
However, all callers ommited this argument. This commit reinstates the
prior behavior by passing 'true' for the factor != 0 condition, fixing
the included testcase.
PR middle-end/111017
gcc/
* omp-expand.cc (expand_omp_for_init_vars): Pass after=true
to expand_omp_build_cond for 'factor != 0' condition, resulting
in pre-r12-5295-g47de0b56ee455e code for the gimple insert.
libgomp/
* testsuite/libgomp.c-c++-common/non-rect-loop-1.c: New test.
When copying a 2D or 3D rectangular memmory block, the performance is
better when using CUDA's cuMemcpy2D/cuMemcpy3D instead of copying the
data one by one. That's what this commit does.
Additionally, it permits device-to-device copies, if neccessary using a
temporary variable on the host.
include/ChangeLog:
* cuda/cuda.h (CUlimit): Add CUDA_ERROR_NOT_INITIALIZED,
CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_INVALID_HANDLE.
(CUarray, CUmemorytype, CUDA_MEMCPY2D, CUDA_MEMCPY3D,
CUDA_MEMCPY3D_PEER): New typdefs.
(cuMemcpy2D, cuMemcpy2DAsync, cuMemcpy2DUnaligned,
cuMemcpy3D, cuMemcpy3DAsync, cuMemcpy3DPeer,
cuMemcpy3DPeerAsync): New prototypes.
libgomp/ChangeLog:
* libgomp-plugin.h (GOMP_OFFLOAD_memcpy2d,
GOMP_OFFLOAD_memcpy3d): New prototypes.
* libgomp.h (struct gomp_device_descr): Add memcpy2d_func
and memcpy3d_func.
* libgomp.texi (nvtpx): Document when cuMemcpy2D/cuMemcpy3D is used.
* oacc-host.c (memcpy2d_func, .memcpy3d_func): Init with NULL.
* plugin/cuda-lib.def (cuMemcpy2D, cuMemcpy2DUnaligned,
cuMemcpy3D): Invoke via CUDA_ONE_CALL.
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_memcpy2d,
GOMP_OFFLOAD_memcpy3d): New.
* target.c (omp_target_memcpy_rect_worker):
(omp_target_memcpy_rect_check, omp_target_memcpy_rect_copy):
Permit all device-to-device copyies; invoke new plugins for
2D and 3D copying when available.
(gomp_load_plugin_for_device): DLSYM the new plugin functions.
* testsuite/libgomp.c/target-12.c: Fix dimension bug.
* testsuite/libgomp.fortran/target-12.f90: Likewise.
* testsuite/libgomp.fortran/target-memcpy-rect-1.f90: New test.
Before this commit, gfortran produced with OpenMP for 'do i = 1,10,2'
the code
for (count.0 = 0; count.0 < 5; count.0 = count.0 + 1)
i = count.0 * 2 + 1;
While such an inner loop can be collapsed, a non-rectangular could not.
With this commit and for all constant loop steps, a simple loop such
as 'for (i = 1; i <= 10; i = i + 2)' is created. (Before only for the
constant steps of 1 and -1.)
The constant step permits to know the direction (increasing/decreasing)
that is required for the loop condition.
The new code is only valid if one assumes no overflow of the loop variable.
However, the Fortran standard can be read that this must be ensured by
the user. Namely, the Fortran standard requires (F2023, 10.1.5.2.4):
"The execution of any numeric operation whose result is not defined by
the arithmetic used by the processor is prohibited."
And, for DO loops, F2023's "11.1.7.4.3 The execution cycle" has the
following: The number of loop iterations handled by an iteration count,
which would permit code like 'do i = huge(i)-5, huge(i),4'. However,
in step (3), this count is not only decremented by one but also:
"... The DO variable, if any, is incremented by the value of the
incrementation parameter m3."
And for the example above, 'i' would be 'huge(i)+3' in the last
execution cycle, which exceeds the largest model number and should
render the example as invalid.
PR fortran/107424
gcc/fortran/ChangeLog:
* trans-openmp.cc (gfc_nonrect_loop_expr): Accept all
constant loop steps.
(gfc_trans_omp_do): Likewise; use sign to determine
loop direction.
libgomp/ChangeLog:
* libgomp.texi (Impl. Status 5.0): Add link to new PR110735.
* testsuite/libgomp.fortran/non-rectangular-loop-1.f90: Enable
commented tests.
* testsuite/libgomp.fortran/non-rectangular-loop-1a.f90: Remove
test file; tests are in non-rectangular-loop-1.f90.
* testsuite/libgomp.fortran/non-rectangular-loop-5.f90: Change
testcase to use a non-constant step to retain the 'sorry' test.
* testsuite/libgomp.fortran/non-rectangular-loop-6.f90: New test.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/linear-2.f90: Update dump to remove
the additional count variable.
The 'uses_allocators' clause to the 'target' construct accepts predefined
allocators and can also be used to define a new allocator for a target region.
As predefined allocators in GCC do not require special handling, those can and
are ignored after parsing, such that this feature now works. On the other hand,
defining a new allocator will fail for now with a 'sorry, unimplemented'.
Note that both the OpenMP 5.0/5.1 and 5.2 syntax for uses_allocators
is supported by this commit.
2023-07-17 Tobias Burnus <tobias@codesoucery.com>
Chung-Lin Tang <cltang@codesourcery.com>
gcc/fortran/ChangeLog:
* dump-parse-tree.cc (show_omp_namelist, show_omp_clauses): Dump
uses_allocators clause.
* gfortran.h (gfc_free_omp_namelist): Add memspace_sym to u union
and traits_sym to u2 union.
(OMP_LIST_USES_ALLOCATORS): New enum value.
(gfc_free_omp_namelist): Add 'bool free_mem_traits_space' arg.
* match.cc (gfc_free_omp_namelist): Likewise.
* openmp.cc (gfc_free_omp_clauses, gfc_match_omp_variable_list,
gfc_match_omp_to_link, gfc_match_omp_doacross_sink,
gfc_match_omp_clause_reduction, gfc_match_omp_allocate,
gfc_match_omp_flush): Update call.
(gfc_match_omp_clauses): Likewise. Parse uses_allocators clause.
(gfc_match_omp_clause_uses_allocators): New.
(enum omp_mask2): Add new OMP_CLAUSE_USES_ALLOCATORS.
(OMP_TARGET_CLAUSES): Accept it.
(resolve_omp_clauses): Resolve uses_allocators clause
* st.cc (gfc_free_statement): Update gfc_free_omp_namelist call.
* trans-openmp.cc (gfc_trans_omp_clauses): Handle
OMP_LIST_USES_ALLOCATORS; fail with sorry unless predefined allocator.
(gfc_split_omp_clauses): Handle uses_allocators.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/uses_allocators_1.f90: New test.
* testsuite/libgomp.fortran/uses_allocators_2.f90: New test.
Co-authored-by: Chung-Lin Tang <cltang@codesourcery.com>
Some test cases in libgomp testsuite pass -flto as an option, but
the testcases do not require LTO target support. This patch adds
the necessary DejaGNU requirement for LTO support to the testcases..
libgomp/ChangeLog:
* testsuite/libgomp.c++/target-map-class-2.C: Require LTO.
* testsuite/libgomp.c-c++-common/requires-4.c: Require LTO.
* testsuite/libgomp.c-c++-common/requires-4a.c: Require LTO.
Signed-off-by: David Edelsohn <dje.gcc@gmail.com>
As with the memkind library, it is only used when found at runtime;
it does not need to be present when building GCC.
The included testcase does not check whether the memory has been placed
on the nearest node as the Linux kernel memory handling too often ignores
that hint, using a different node for the allocation. However, when
running with 'numactl --preferred=<node> ./executable', it is clearly
visible that the feature works by comparing malloc/default vs. nearest
placement (using get_mempolicy to obtain the node for a mem addr).
libgomp/ChangeLog:
* allocator.c: Add ifdef for LIBGOMP_USE_LIBNUMA.
(enum gomp_numa_memkind_kind): Renamed from gomp_memkind_kind;
add GOMP_MEMKIND_LIBNUMA.
(struct gomp_libnuma_data, gomp_init_libnuma, gomp_get_libnuma): New.
(omp_init_allocator): Handle partition=nearest with libnuma if avail.
(omp_aligned_alloc, omp_free, omp_aligned_calloc, omp_realloc): Add
numa_alloc_local (+ memset), numa_free, and numa_realloc calls as
needed.
* config/linux/allocator.c (LIBGOMP_USE_LIBNUMA): Define
* libgomp.texi: Fix a typo; use 'fi' instead of its ligature char.
(Memory allocation): Renamed from 'Memory allocation with libmemkind';
updated for libnuma usage.
* testsuite/libgomp.c-c++-common/alloc-11.c: New test.
* testsuite/libgomp.c-c++-common/alloc-12.c: New test.
Depending on the details, the testcase can fail with different but
related messages; all of the following all could be observed for this
testcase:
libgomp: OMP_TARGET_OFFLOAD is set to MANDATORY, but device cannot be used for offloading
libgomp: OMP_TARGET_OFFLOAD is set to MANDATORY, but device not found
libgomp: OMP_TARGET_OFFLOAD is set to MANDATORY, but only the host device is available
Before, the last two were tested for with 'target offload_device' and
'! offload_device', respectively. Now, all three are accepted by matching
'.*' already after 'but' and without distinguishing whether the effective
target is an offload_device or not.
(For completeness, there is a fourth error that follows this pattern:
'OMP_TARGET_OFFLOAD is set to MANDATORY, but device is finalized'.)
libgomp/
* testsuite/libgomp.c/target-51.c: Accept more error msg variants
as expected dg-output.
For C/C++ pointers, default implicit mapping firstprivatizes the pointer
but if the memory it points to is mapped, the it is updated to point to
the device memory (by attaching a zero sized array section of the pointed-to
storage).
However, if the pointed-to storage wasn't mapped, the pointer was set to
NULL on the device side (OpenMP 5.0/5.1 semantic). With this commit, the
pointer retains the on-host address in that case (OpenMP 5.2 semantic).
The new semantic avoids an explicit map/firstprivate/is_device_ptr in the
following sensible cases: Special values (e.g. pointer or 0x1, 0x2 etc.),
explicitly device allocated memory (e.g. omp_target_alloc), and with
(unified) shared memory.
(Note: With (U)SM, mappings still must be tracked, at least when
omp_target_associate_ptr does not fail when passing in two destinct pointers.)
libgomp/
PR middle-end/110270
* target.c (gomp_map_vars_internal): Copy host value instead of NULL
for GOMP_MAP_ZERO_LEN_ARRAY_SECTION if not mapped.
* libgomp.texi (OpenMP 5.2 Impl.): Mark as 'Y'.
* testsuite/libgomp.c/target-19.c: Update expected value.
* testsuite/libgomp.c++/target-18.C: Likewise.
* testsuite/libgomp.c++/target-19.C: Likewise.
* testsuite/libgomp.c-c++-common/requires-unified-addr-2.c: New test.
* testsuite/libgomp.c-c++-common/target-implicit-map-3.c: New test.
* testsuite/libgomp.c-c++-common/target-implicit-map-4.c: New test.
It turned out that gomp_init_targets_once() was not run when directly
calling 'omp target' or 'omp target (enter/exit) data' causing an
abort with OMP_TARGET_OFFLOAD=mandatory wrongly claiming that no
device is available. It was called a tiny bit later but few lines too
late for updating the default-device-var.
libgomp/ChangeLog:
* target.c (resolve_device): Call gomp_get_num_devices early to ensure
gomp_init_targets_once was called before using default-device-var.
* testsuite/libgomp.c/target-55.c: New test.
* testsuite/libgomp.c/target-55a.c: New test.
Support OpenMP 5.1's syntax for OMP_ALLOCATOR as well,
which permits besides predefined allocators also
predefined memspaces optionally followed by traits.
Additionally, this commit adds the previously lacking
documentation for OMP_ALLOCATOR, OMP_AFFINITY_FORMAT
and OMP_DISPLAY_AFFINITY.
libgomp/ChangeLog:
* env.c (gomp_def_allocator_envvar): New var.
(parse_allocator): Handle OpenMP 5.1 syntax.
(cleanup_env): New.
(omp_display_env): Output gomp_def_allocator_envvar
for an allocator with traits.
* libgomp.texi (OMP_ALLOCATOR, OMP_AFFINITY_FORMAT,
OMP_DISPLAY_AFFINITY): New.
* testsuite/libgomp.c/allocator-1.c: New test.
* testsuite/libgomp.c/allocator-2.c: New test.
* testsuite/libgomp.c/allocator-3.c: New test.
* testsuite/libgomp.c/allocator-4.c: New test.
* testsuite/libgomp.c/allocator-5.c: New test.
* testsuite/libgomp.c/allocator-6.c: New test.
On 2023-06-14T11:42:22+0200, Tobias Burnus <tobias@codesourcery.com> wrote:
> On 14.06.23 10:09, Thomas Schwinge wrote:
>> Let me know if I should also adjust the new 'target { ! offload_device }'
>> diagnostic "[...] MANDATORY but only the host device is available" to
>> include a comma before 'but', for consistency with the other existing
>> diagnostics (cited above)?
>
> I think it makes sense to be consistent. Thus: Yes, please add the commas.
Fix-up for recent commit 18c8b56c7d
"OpenMP: Set default-device-var with OMP_TARGET_OFFLOAD=mandatory".
libgomp/
* target.c (resolve_device): Align a
'OMP_TARGET_OFFLOAD=mandatory' diagnostic with others.
* testsuite/libgomp.c/target-51.c: Adjust.
..., and therefore, given 'target offload_device':
PASS: libgomp.c/target-51.c (test for excess errors)
PASS: libgomp.c/target-51.c execution test
[-FAIL:-]{+PASS:+} libgomp.c/target-51.c output pattern test
Fix-up for recent commit 18c8b56c7d
"OpenMP: Set default-device-var with OMP_TARGET_OFFLOAD=mandatory".
libgomp/
* testsuite/libgomp.c/target-51.c: Fix typo.
OMP_TARGET_OFFLOAD=mandatory handling was before inconsistent. Hence, in
OpenMP 5.2 it was clarified/extended by having implications on the
default-device-var; additionally, omp_initial_device and omp_invalid_device
enum values/PARAMETERs were added; support for it was added
in r13-1066-g1158fe43407568 including aborting for omp_invalid_device and
non-conforming device numbers. Only the mandatory handling was missing.
Namely, while the default-device-var is usually initialized to value 0,
with 'mandatory' it must have the value 'omp_invalid_device' if and only if
zero non-host devices are available. (The OMP_DEFAULT_DEVICE env var
overrides this as it comes semantically after the initialization.)
To achieve this, default-device-var is now initialized to MIN_INT. If
there is no 'mandatory', it is set to 0 directly after env var parsing.
Otherwise, it is updated in gomp_target_init to either 0 or
omp_invalid_device. To ensure INT_MIN is never seen by the user, both
the omp_get_default_device API routine and omp_display_env (user call
and OMP_DISPLAY_ENV env var) call gomp_init_targets_once() in that case.
libgomp/ChangeLog:
* env.c (gomp_default_icv_values): Init default_device_var to
an nonconforming value - INT_MIN.
(initialize_env): After env-var parsing, set default_device_var to
device 0 unless OMP_TARGET_OFFLOAD=mandatory.
(omp_display_env): If default_device_var is INT_MIN, call
gomp_init_targets_once.
* icv-device.c (omp_get_default_device): Likewise.
* libgomp.texi (OMP_DEFAULT_DEVICE): Update init description.
(OpenMP 5.2 Impl. Status): Mark OMP_TARGET_OFFLOAD=mandatory as 'Y'.
* target.c (resolve_device): Improve error message device-num < 0
with 'mandatory' and no no-host devices available.
(gomp_target_init): Set default-device-var if INT_MIN.
* testsuite/libgomp.c/target-48.c: New test.
* testsuite/libgomp.c/target-49.c: New test.
* testsuite/libgomp.c/target-50.c: New test.
* testsuite/libgomp.c/target-50a.c: New test.
* testsuite/libgomp.c/target-51.c: New test.
* testsuite/libgomp.c/target-52.c: New test.
* testsuite/libgomp.c/target-53.c: New test.
* testsuite/libgomp.c/target-54.c: New test.
Add a testcase for 'omp requires unified_address' that is currently supported
by all devices but was not tested for.
libgomp/
PR libgomp/109837
* testsuite/libgomp.c-c++-common/requires-unified-addr-1.c: New test.
* testsuite/libgomp.fortran/requires-unified-addr-1.f90: New test.
Reduce number of enum values passed to libgomp as
GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC} have the same semantic as
GOMP_MAP_FORCE_PRESENT (i.e. abort if not present, otherwise ignore);
that's different to GOMP_MAP_ALWAYS_PRESENT_{TO,TOFROM,FROM} which also
abort if not present but copy data when present. This is is a follow-up to
the commit r14-1579-g4ede915d5dde93 done 6 days ago.
Additionally, the commit improves a libgomp run-time and a C/C++ compile-time
error wording and extends testcases a tiny bit.
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_clause_map): Reword error message for
clearness especially with 'omp target (enter/exit) data.'
gcc/cp/ChangeLog:
* parser.cc (cp_parser_omp_clause_map): Reword error message for
clearness especially with 'omp target (enter/exit) data.'
* semantics.cc (handle_omp_array_sections): Handle
GOMP_MAP_{ALWAYS_,}PRESENT_{TO,TOFROM,FROM,ALLOC} enum values.
gcc/ChangeLog:
* gimplify.cc (gimplify_adjust_omp_clauses_1): Use
GOMP_MAP_FORCE_PRESENT for 'present alloc' implicit mapping.
(gimplify_adjust_omp_clauses): Change
GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC} to the equivalent
GOMP_MAP_FORCE_PRESENT.
* omp-low.cc (lower_omp_target): Remove handling of no-longer valid
GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC}; update map kinds used for
to/from clauses with present modifier.
include/ChangeLog:
* gomp-constants.h (enum gomp_map_kind): Change the enum values
GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC} to be compiler only.
(GOMP_MAP_PRESENT_P): Update to include also GOMP_MAP_FORCE_PRESENT.
libgomp/ChangeLog:
* target.c (gomp_to_device_kind_p, gomp_map_vars_internal): Replace
GOMP_MAP_PRESENT_{FROM,TO,TOFROM,ACLLOC} by GOMP_MAP_FORCE_PRESENT.
(gomp_map_vars_internal, gomp_update): Likewise; unify and improve
error message.
* testsuite/libgomp.c-c++-common/target-present-2.c: Update for
changed error message.
* testsuite/libgomp.fortran/target-present-1.f90: Likewise.
* testsuite/libgomp.fortran/target-present-2.f90: Likewise.
* testsuite/libgomp.oacc-c-c++-common/present-1.c: Likewise.
* testsuite/libgomp.c-c++-common/target-present-1.c: Likewise and
extend testcase to check that data is copied when needed.
* testsuite/libgomp.c-c++-common/target-present-3.c: Likewise.
* testsuite/libgomp.fortran/target-present-3.f90: Likewise.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/defaultmap-4.c: Update scan-tree-dump.
* c-c++-common/gomp/map-9.c: Likewise.
* gfortran.dg/gomp/defaultmap-8.f90: Likewise.
* gfortran.dg/gomp/map-11.f90: Likewise.
* gfortran.dg/gomp/target-update-1.f90: Likewise.
* gfortran.dg/gomp/map-12.f90: Likewise; also check original dump.
* c-c++-common/gomp/map-6.c: Update dg-error and also check
clause error with 'target (enter/exit) data'.
One of the testcases lacked variables in a map clause such that
the fail occurred too early. Additionally, it would have failed
for all those non-host devices where 'present' is always true, i.e.
non-host devices which can access all of the host memory
(shared-memory devices). [There are currently none.]
The commit now runs the code on all devices, which should succeed
for host fallback and for shared-memory devices, finding potenial issues
that way. Additionally, a checkpoint (required stdout output) is used
to ensure that the execution won't fail (with the same error) before
reaching the expected fail location.
2023-06-07 Thomas Schwinge <thomas@codesourcery.com>
Tobias Burnus <tobias@codesourcery.com>
libgomp/
* testsuite/libgomp.c-c++-common/target-present-1.c: Run code
also for non-offload_device targets; check that it runs
successfully for those and for all until a checkpoint for all
* testsuite/libgomp.c-c++-common/target-present-2.c: Likewise.
* testsuite/libgomp.c-c++-common/target-present-3.c: Likewise.
* testsuite/libgomp.fortran/target-present-1.f90: Likewise.
* testsuite/libgomp.fortran/target-present-3.f90: Likewise.
* testsuite/libgomp.fortran/target-present-2.f90: Likewise;
add missing vars to map clause.
This implements support for the OpenMP 5.1 'present' modifier, which can be
used in map clauses in the 'target', 'target data', 'target data enter' and
'target data exit' constructs, and in the 'to' and 'from' clauses of the
'target update' construct. It is also supported in defaultmap.
The modifier triggers a fatal runtime error if the data specified by the
clause is not already present on the target device. It can also be combined
with 'always' in map clauses.
2023-06-06 Kwok Cheung Yeung <kcy@codesourcery.com>
Tobias Burnus <tobias@codesourcery.com>
gcc/c/
* c-parser.cc (c_parser_omp_clause_defaultmap,
c_parser_omp_clause_map): Parse 'present'.
(c_parser_omp_clause_to, c_parser_omp_clause_from): Remove.
(c_parser_omp_clause_from_to): New; parse to/from clauses with
optional present modifer.
(c_parser_omp_all_clauses): Update call.
(c_parser_omp_target_data, c_parser_omp_target_enter_data,
c_parser_omp_target_exit_data): Handle new map enum values
for 'present' mapping.
gcc/cp/
* parser.cc (cp_parser_omp_clause_defaultmap,
cp_parser_omp_clause_map): Parse 'present'.
(cp_parser_omp_clause_from_to): New; parse to/from
clauses with optional 'present' modifier.
(cp_parser_omp_all_clauses): Update call.
(cp_parser_omp_target_data, cp_parser_omp_target_enter_data,
cp_parser_omp_target_exit_data): Handle new enum value for
'present' mapping.
* semantics.cc (finish_omp_target): Likewise.
gcc/fortran/
* dump-parse-tree.cc (show_omp_namelist): Display 'present' map
modifier.
(show_omp_clauses): Display 'present' motion modifier for 'to'
and 'from' clauses.
* gfortran.h (enum gfc_omp_map_op): Add entries with 'present'
modifiers.
(struct gfc_omp_namelist): Add 'present_modifer'.
* openmp.cc (gfc_match_motion_var_list): New, handles optional
'present' modifier for to/from clauses.
(gfc_match_omp_clauses): Call it for to/from clauses; parse 'present'
in defaultmap and map clauses.
(resolve_omp_clauses): Allow 'present' modifiers on 'target',
'target data', 'target enter' and 'target exit' directives.
* trans-openmp.cc (gfc_trans_omp_clauses): Apply 'present' modifiers
to tree node for 'map', 'to' and 'from' clauses. Apply 'present' for
defaultmap.
gcc/
* gimplify.cc (omp_notice_variable): Apply GOVD_MAP_ALLOC_ONLY flag
and defaultmap flags if the defaultmap has GOVD_MAP_FORCE_PRESENT flag
set.
(omp_get_attachment): Handle map clauses with 'present' modifier.
(omp_group_base): Likewise.
(gimplify_scan_omp_clauses): Reorder present maps to come first.
Set GOVD flags for present defaultmaps.
(gimplify_adjust_omp_clauses_1): Set map kind for present defaultmaps.
* omp-low.cc (scan_sharing_clauses): Handle 'always, present' map
clauses.
(lower_omp_target): Handle map clauses with 'present' modifier.
Handle 'to' and 'from' clauses with 'present'.
* tree-core.h (enum omp_clause_defaultmap_kind): Add
OMP_CLAUSE_DEFAULTMAP_PRESENT defaultmap kind.
* tree-pretty-print.cc (dump_omp_clause): Handle 'map', 'to' and
'from' clauses with 'present' modifier. Handle present defaultmap.
* tree.h (OMP_CLAUSE_MOTION_PRESENT): New #define.
include/
* gomp-constants.h (GOMP_MAP_FLAG_SPECIAL_5): New.
(GOMP_MAP_FLAG_FORCE): Redefine.
(GOMP_MAP_FLAG_PRESENT, GOMP_MAP_FLAG_ALWAYS_PRESENT): New.
(enum gomp_map_kind): Add map kinds with 'present' modifiers.
(GOMP_MAP_COPY_TO_P, GOMP_MAP_COPY_FROM_P): Evaluate to true for
map variants with 'present'
(GOMP_MAP_ALWAYS_TO_P, GOMP_MAP_ALWAYS_FROM_P): Evaluate to true
for map variants with 'always, present' modifiers.
(GOMP_MAP_ALWAYS): Redefine.
(GOMP_MAP_FORCE_P, GOMP_MAP_PRESENT_P): New.
libgomp/
* libgomp.texi (OpenMP 5.1 Impl. status): Set 'present' support for
defaultmap to 'Y', add 'Y' entry for 'present' on to/from/map clauses.
* target.c (gomp_to_device_kind_p): Add map kinds with 'present'
modifier.
(gomp_map_vars_existing): Use new GOMP_MAP_FORCE_P macro.
(gomp_map_vars_internal, gomp_update, gomp_target_rev):
Emit runtime error if memory region not present.
* testsuite/libgomp.c-c++-common/target-present-1.c: New test.
* testsuite/libgomp.c-c++-common/target-present-2.c: New test.
* testsuite/libgomp.c-c++-common/target-present-3.c: New test.
* testsuite/libgomp.fortran/target-present-1.f90: New test.
* testsuite/libgomp.fortran/target-present-2.f90: New test.
* testsuite/libgomp.fortran/target-present-3.f90: New test.
gcc/testsuite/
* c-c++-common/gomp/map-6.c: Update dg-error, extend to test for
duplicated 'present' and extend scan-dump tests for 'present'.
* gfortran.dg/gomp/defaultmap-1.f90: Update dg-error.
* gfortran.dg/gomp/map-7.f90: Extend parse and dump test for
'present'.
* gfortran.dg/gomp/map-8.f90: Extend for duplicate 'present'
modifier checking.
* c-c++-common/gomp/defaultmap-4.c: New test.
* c-c++-common/gomp/map-9.c: New test.
* c-c++-common/gomp/target-update-1.c: New test.
* gfortran.dg/gomp/defaultmap-8.f90: New test.
* gfortran.dg/gomp/map-11.f90: New test.
* gfortran.dg/gomp/map-12.f90: New test.
* gfortran.dg/gomp/target-update-1.f90: New test.
The nteams-var ICV exists per device and can be set either via the routine
omp_set_num_teams or as environment variable (OMP_NUM_TEAMS with optional
_ALL/_DEV/_DEV_<num> suffix); it is default-initialized to zero. The number
of teams created is described under the num_teams clause. If the clause is
absent, the number of teams is implementation defined but at least
one team must exist and, if nteams-var is positive, at most nteams-var
teams may exist.
The latter condition was not honored in a target region before this
commit, such that too many teams were created.
Already before this commit, both the num_teams([lower:]upper) clause
(on the host and in target regions) and, only on the host, the nteams-var
ICV were honored. And as only one teams is created for host fallback,
unless the clause specifies otherwise, the nteams-var ICV was and is
effectively honored.
libgomp/ChangeLog:
PR libgomp/109875
* config/gcn/target.c (GOMP_teams4): Honor nteams-var ICV.
* config/nvptx/target.c (GOMP_teams4): Likewise.
* testsuite/libgomp.c-c++-common/teams-nteams-icv-1.c: New test.
* testsuite/libgomp.c-c++-common/teams-nteams-icv-2.c: New test.
* testsuite/libgomp.c-c++-common/teams-nteams-icv-3.c: New test.
* testsuite/libgomp.c-c++-common/teams-nteams-icv-4.c: New test.
Previously, array descriptors might have been mapped as 'alloc'
instead of 'to' for 'alloc', not updating the array bounds. The
'alloc' could also appear for 'data exit', failing with a libgomp
assert. In some cases, either array descriptors or deferred-length
string's length variable was not mapped. And, finally, some offset
calculations with array-sections mappings went wrong.
Additionally, the patch now unmaps for scalar allocatables/pointers
the GOMP_MAP_POINTER, avoiding stale mappings.
The testcases contain some comment-out tests which require follow-up
work and for which PR exist. Those mostly relate to deferred-length
strings which have several issues beyong OpenMP support.
gcc/fortran/ChangeLog:
* trans-decl.cc (gfc_get_symbol_decl): Add attributes
such as 'declare target' also to hidden artificial
variable for deferred-length character variables.
* trans-openmp.cc (gfc_trans_omp_array_section,
gfc_trans_omp_clauses, gfc_trans_omp_target_exit_data):
Improve mapping of array descriptors and deferred-length
string variables.
gcc/ChangeLog:
* gimplify.cc (gimplify_scan_omp_clauses): Remove Fortran
special case.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/target-enter-data-3.f90: Uncomment
'target exit data'.
* testsuite/libgomp.fortran/target-enter-data-4.f90: New test.
* testsuite/libgomp.fortran/target-enter-data-5.f90: New test.
* testsuite/libgomp.fortran/target-enter-data-6.f90: New test.
* testsuite/libgomp.fortran/target-enter-data-7.f90: New test.
gcc/testsuite/
* gfortran.dg/goacc/finalize-1.f: Update dg-tree; shows a fix
for 'finalize' as a ptr is now 'delete' instead of 'release'.
* gfortran.dg/gomp/pr78260-2.f90: Likewise as elem-size calc moved
to if (allocated) block
* gfortran.dg/gomp/target-exit-data.f90: Likewise as a var is now a
replaced by a MEM< _25 > expression.
* gfortran.dg/gomp/map-9.f90: Update dg-scan-tree-dump.
* gfortran.dg/gomp/map-10.f90: New test.
..., while still hard-coding the number of parallel slots to one.
PR testsuite/66005
libgomp/
* testsuite/Makefile.am (PWD_COMMAND): New variable.
(%/site.exp): New target.
(check_p_numbers0, check_p_numbers1, check_p_numbers2)
(check_p_numbers3, check_p_numbers4, check_p_numbers5)
(check_p_numbers6, check_p_numbers, gcc_test_parallel_slots)
(check_p_subdirs)
(check_DEJAGNU_libgomp_targets): New variables.
($(check_DEJAGNU_libgomp_targets)): New target.
($(check_DEJAGNU_libgomp_targets)): New dependency.
(check-DEJAGNU $(check_DEJAGNU_libgomp_targets)): New targets.
* testsuite/Makefile.in: Regenerate.
* testsuite/lib/libgomp.exp: For parallel testing,
'load_file ../libgomp-test-support.exp'.
Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
..., that is, 'GCC_UNDER_TEST', 'GXX_UNDER_TEST', 'GFORTRAN_UNDER_TEST' instead
of 'GCC_UNDER_TEST' for all of them. No need anymore for 'gcc -lstdc++ -x c++'
for C++ code, or 'gcc -lgfortran' plus conditional '-lquadmath' for Fortran
code. (Getting rid of explicit '-foffload=-lgfortran' is for another day.)
PR testsuite/91884
libgomp/
* configure.ac: 'AC_SUBST(CXX)'.
* configure: Regenerate.
* Makefile.in: Likewise.
* testsuite/Makefile.in: Likewise.
* testsuite/libgomp-site-extra.exp.in (GXX_UNDER_TEST)
(GFORTRAN_UNDER_TEST): Set.
* testsuite/lib/libgomp.exp (libgomp_init): Adjust.
* testsuite/libgomp.c++/c++.exp: Use 'GXX_UNDER_TEST'.
* testsuite/libgomp.oacc-c++/c++.exp: Likewise.
* testsuite/libgomp.fortran/fortran.exp: Use
'GFORTRAN_UNDER_TEST'.
* testsuite/libgomp.oacc-fortran/fortran.exp: Likewise.