This commit caused failure of update_version_git due to the removal of
liboffloadmic with ChangeLog in it, so I had to blacklist that commit
and here I'm adding ChangeLog entries manually.
OpenMP 5.0 permits to use arrays with derived type components for the list
items to the 'from'/'to' clauses of the 'target update' directive.
gcc/fortran/ChangeLog:
* openmp.cc (gfc_match_omp_clauses): Permit derived types for
the 'to' and 'from' clauses of 'target update'.
* trans-openmp.cc (gfc_trans_omp_clauses): Fixes for
derived-type changes; fix size for scalars.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/target-11.f90: New test.
* testsuite/libgomp.fortran/target-13.f90: New test.
... which is 'libgomp.oacc-fortran/declare-allocatable-1.f90' adjusted
for missing support for OpenACC "Changes from Version 2.0 to 2.5":
"The 'declare create' directive with a Fortran 'allocatable' has new behavior".
Thus, after 'allocate'/before 'deallocate', call 'acc_create'/'acc_delete'
manually.
libgomp/
* testsuite/libgomp.oacc-fortran/declare-allocatable-1-runtime.f90:
New.
This patch prevents compiler-generated artificial variables from being
treated as privatization candidates for OpenACC.
The rationale is that e.g. "gang-private" variables actually must be
shared by each worker and vector spawned within a particular gang, but
that sharing is not necessary for any compiler-generated variable (at
least at present, but no such need is anticipated either). Variables on
the stack (and machine registers) are already private per-"thread"
(gang, worker and/or vector), and that's fine for artificial variables.
We're restricting this to blocks, as we still need to understand what it
means for a 'DECL_ARTIFICIAL' to appear in a 'private' clause.
Several tests need their scan output patterns adjusted to compensate.
2022-10-14 Julian Brown <julian@codesourcery.com>
PR middle-end/90115
gcc/
* omp-low.cc (oacc_privatization_candidate_p): Artificial vars are not
privatization candidates.
libgomp/
* testsuite/libgomp.oacc-fortran/declare-1.f90: Adjust scan output.
* testsuite/libgomp.oacc-fortran/host_data-5.F90: Likewise.
* testsuite/libgomp.oacc-fortran/if-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/print-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Likewise.
Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
This patch adds a stub 'gomp_target_rev' in the host's target.c, which will
later handle the reverse offload.
For nvptx, it adds support for forwarding the offload gomp_target_ext call
to the host by setting values in a struct on the device and querying it on
the host - invoking gomp_target_rev on the result.
include/ChangeLog:
* cuda/cuda.h (enum CUdevice_attribute): Add
CU_DEVICE_ATTRIBUTE_UNIFIED_ADDRESSING.
(CU_MEMHOSTALLOC_DEVICEMAP): Define.
(cuMemHostAlloc): Add prototype.
libgomp/ChangeLog:
* config/nvptx/icv-device.c (GOMP_DEVICE_NUM_VAR): Remove
'static' for this variable.
* config/nvptx/libgomp-nvptx.h: New file.
* config/nvptx/target.c: Include it.
(GOMP_ADDITIONAL_ICVS): Declare extern var.
(GOMP_REV_OFFLOAD_VAR): Declare var.
(GOMP_target_ext): Handle reverse offload.
* libgomp-plugin.h (GOMP_PLUGIN_target_rev): New prototype.
* libgomp-plugin.c (GOMP_PLUGIN_target_rev): New, call ...
* target.c (gomp_target_rev): ... this new stub function.
* libgomp.h (gomp_target_rev): Declare.
* libgomp.map (GOMP_PLUGIN_1.4): New; add GOMP_PLUGIN_target_rev.
* plugin/cuda-lib.def (cuMemHostAlloc): Add.
* plugin/plugin-nvptx.c: Include libgomp-nvptx.h.
(struct ptx_device): Add rev_data member.
(nvptx_open_device): Remove async_engines query, last used in
r10-304-g1f4c5b9b; add unified-address assert check.
(GOMP_OFFLOAD_get_num_devices): Claim unified address
support.
(GOMP_OFFLOAD_load_image): Free rev_fn_table if no
offload functions exist. Make offload var available
on host and device.
(rev_off_dev_to_host_cpy, rev_off_host_to_dev_cpy): New.
(GOMP_OFFLOAD_run): Handle reverse offload.
That is, adjust for optimization introduced with recent
commit r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04
"[PR107195] Set range to zero when nonzero mask is 0", where GCC now
understands that after 'r *= 2;', 'r & 1' will never hold here, and thus
transforms/optimizes/"disturbs" the original code such that GCC/nvptx's later
"Neuter whole SESE regions" optimization no longer is applicable to it:
UNSUPPORTED: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O0
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 (test for excess errors)
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 execution test
[-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 scan-nvptx-none-offload-rtl-dump mach "SESE regions:.* [0-9]+{[0-9]+->[0-9]+(\\.[0-9]+)+}"
Same for C++.
It's unclear to me if this is an actual "problem", which optimization is "more
important", so I've filed PR107344 "GCC/nvptx SESE region optimization" to
capture this question, and here restore what we intend to be testing (to my
understanding) in 'libgomp.oacc-c-c++-common/nvptx-sese-1.c'.
PR tree-optimization/107195
PR target/107344
libgomp/
* testsuite/libgomp.oacc-c-c++-common/nvptx-sese-1.c: Restore SESE
regions checking.
Duplicate libgomp.c-c++-common/requires-4.c (as ...-4a.c) but
with using a heap-allocated instead of static memory for a variable.
This change and the added offload_device_gcn check prepare for
pseudo-USM, where the device hardware cannot access all host
memory but only managed and pinned memory; for those, requires-4.c
will fail and the new check permits to add
target { ! { offload_device_nvptx || offload_device_gcn } }
to requires-4.c; however, it has not been added yet as pseuo-USM
support is not yet on mainline. (Review is pending for the USM
patches.)
include/ChangeLog:
* gomp-constants.h (GOMP_DEVICE_HSA): Comment out unused define.
libgomp/ChangeLog:
* testsuite/lib/libgomp.exp (check_effective_target_offload_device_gcn):
New.
* testsuite/libgomp.c-c++-common/on_device_arch.h (device_arch_gcn,
on_device_arch_gcn): New.
* testsuite/libgomp.c-c++-common/requires-4a.c: New test; copied from
requires-4.c but using heap-allocated memory.
After commit r13-3404-g7c55755d4c760de326809636531478fd7419e1e5
"amdgcn: Use FLAT addressing for all functions with pointer arguments [PR105421]",
"big" private data now works for GCN offloading, too.
PR target/105421
libgomp/
* testsuite/libgomp.oacc-c-c++-common/private-big-1.c: New.
That is, '-mptx=_' is only valid in '-foffload-options=nvptx-none', too.
Fix test case added in recent
commit r13-2625-g6b43f556f392a7165582aca36a19fe7389d995b2 "nvptx/mkoffload.cc:
Warn instead of error when reverse offload is not possible".
libgomp/
* testsuite/libgomp.c/reverse-offload-sm30.c: Fix nvptx-specific
'-foffload-options' syntax.
Fortranized testcases of commits r13-3257-ga58a965eb73
and r13-3258-g0ec4e93fb9f.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/task-7.f90: New test.
* testsuite/libgomp.fortran/task-8.f90: New test.
* testsuite/libgomp.fortran/task-in-explicit-1.f90: New test.
* testsuite/libgomp.fortran/task-in-explicit-2.f90: New test.
* testsuite/libgomp.fortran/task-in-explicit-3.f90: New test.
* testsuite/libgomp.fortran/task-reduction-17.f90: New test.
* testsuite/libgomp.fortran/task-reduction-18.f90: New test.
The previous bullet correctly mentions 5.2 added for Fortran
allocators directive which is a replacement of allocate directive
associated with ALLOCATE statement to differentiate it at parse time
from allocate directive as declarative one not associated with ALLOCATE
statement, but the deprecation bullet talks about non-existing allocator
directive.
2022-10-12 Jakub Jelinek <jakub@redhat.com>
* libgomp.texi (OpenMP 5.2): Fix up allocator -> allocate directive
in deprecation bullet.
This is pretty straightforward, if gomp_thread ()->task is NULL,
it can't be explicit task, otherwise if
gomp_thread ()->task->kind == GOMP_TASK_IMPLICIT, it is an implicit
task, otherwise explicit task.
2022-10-12 Jakub Jelinek <jakub@redhat.com>
* omp.h.in (omp_in_explicit_task): Declare.
* omp_lib.h.in (omp_in_explicit_task): Likewise.
* omp_lib.f90.in (omp_in_explicit_task): New interface.
* libgomp.map (OMP_5.2): New symbol version, export
omp_in_explicit_task and omp_in_explicit_task_.
* task.c (omp_in_explicit_task): New function.
* fortran.c (omp_in_explicit_task): Add ialias_redirect.
(omp_in_explicit_task_): New function.
* libgomp.texi (OpenMP 5.2): Mark omp_in_explicit_task as implemented.
* testsuite/libgomp.c-c++-common/task-in-explicit-1.c: New test.
* testsuite/libgomp.c-c++-common/task-in-explicit-2.c: New test.
* testsuite/libgomp.c-c++-common/task-in-explicit-3.c: New test.
When not in explicit parallel/target/teams construct, we in some cases create
an artificial parallel with a single thread (either to handle target nowait
or for task reduction purposes). In those cases, it handled again artificially
created implicit task (created by gomp_new_icv for cases where we needed to write
to some ICVs), but as the testcases show, didn't take into account possibility
of this being done from explicit task(s). The code would destroy/free the previous
task and replace it with the new implicit task. If task is an explicit task
(when teams is NULL, all explicit tasks behave like if (0)), it is a pointer to
a local stack variable, so freeing it doesn't work, and additionally we shouldn't
lose the explicit tasks - the new implicit task should instead replace the
ancestor task which is the first implicit one.
2022-10-12 Jakub Jelinek <jakub@redhat.com>
* task.c (gomp_create_artificial_team): Fix up handling of invocations
from within explicit task.
* target.c (GOMP_target_ext): Likewise.
* testsuite/libgomp.c/task-7.c: New test.
* testsuite/libgomp.c/task-8.c: New test.
* testsuite/libgomp.c-c++-common/task-reduction-17.c: New test.
* testsuite/libgomp.c-c++-common/task-reduction-18.c: New test.
This change adds the configury bits to activate the build of
shared libs on VxWorks ports configured with --enable-shared,
for libraries variants where this is generally supported (rtp,
code model !large - currently not compatible with -fPIC).
Set lt_cv_deplibs_check_method in libtool.m4, so the build of
libraries know how to establish dependencies. This is useful in
configurations such as aarch64 where proper support of LSE relies
on accurate dependency information between libstdc++ and libgcc_s
to begin with.
Regenerate configure scripts to reflect libtool.m4 change.
2022-10-09 Olivier Hainque <hainque@adacore.com>
* libtool.m4 (*vxworks*): When enable_shared, set dynamic_linker
and friends for rtp !large. Assume the linker has the required
abilities and set lt_cv_deplibs_check_method.
gcc/
* config.gcc (*vxworks*): Add t-slibgcc fragment
if enable_shared.
libgcc/
* config.host (*vxworks*): When enable_shared, add
libgcc and crtstuff "shared" fragments for rtp except
large code model.
(aarch64*-wrs-vxworks7*): Remove t-slibgcc-libgcc from
the list of fragments.
2022-10-09 Olivier Hainque <hainque@adacore.com>
gcc/
* configure: Regenerate.
libatomic/
* configure: Regenerate.
libbacktrace/
* configure: Regenerate.
libcc1/
* configure: Regenerate.
libffi/
* configure: Regenerate.
libgfortran/
* configure: Regenerate.
libgomp/
* configure: Regenerate.
libitm/
* configure: Regenerate.
libobjc/
* configure: Regenerate.
liboffloadmic/
* configure: Regenerate.
liboffloadmic/
* plugin/configure: Regenerate.
libphobos/
* configure: Regenerate.
libquadmath/
* configure: Regenerate.
libsanitizer/
* configure: Regenerate.
libssp/
* configure: Regenerate.
libstdc++-v3/
* configure: Regenerate.
libvtv/
* configure: Regenerate.
lto-plugin/
* configure: Regenerate.
zlib/
* configure: Regenerate.
The following patch adds support for the begin declare target construct,
which is another spelling for declare target construct without clauses
(where it needs paired end declare target), but unlike that one accepts
clauses.
This is an OpenMP 5.1 feature, implemented with 5.2 clarification because
in 5.1 we had a restriction in the declare target chapter shared by
declare target and begin declare target that if there are any clauses
specified at least one of them needs to be to or link. But that
was of course meant just for declare target and not begin declare target,
because begin declare target doesn't even allow to/link/enter clauses.
In addition to that, the patch also makes device_type clause duplication
an error (as stated in 5.1) and similarly makes declare target with
just device_type clause an error rather than warning.
What this patch doesn't do is:
1) OpenMP 5.1 also added an indirect clause, we don't support that
neither on declare target nor begin declare target
and I couldn't find it in our features pages (neither libgomp.texi
nor web)
2) I think device_type(nohost)/device_type(host) support can't work for
variables (in 5.0 it only talked about procedures so this could be
also thought as 5.1 feature that we should just add to the list
and implement)
3) I don't see any use of the "omp declare target nohost" attribute, so
I'm not sure if device_type(nohost) works at all
2022-10-04 Jakub Jelinek <jakub@redhat.com>
gcc/c-family/
* c-omp.cc (c_omp_directives): Uncomment begin declare target
entry.
gcc/c/
* c-lang.h (struct c_omp_declare_target_attr): New type.
(current_omp_declare_target_attribute): Change type from
int to vec<c_omp_declare_target_attr, va_gc> *.
* c-parser.cc (c_parser_translation_unit): Adjust for that change.
If last pushed directive was begin declare target, use different
wording and simplify format strings for easier translations.
(c_parser_omp_clause_device_type): Uncomment
check_no_duplicate_clause call.
(c_parser_omp_declare_target): Adjust for the
current_omp_declare_target_attribute type change, push { -1 }.
Use error_at rather than warning_at for declare target with
only device_type clauses.
(OMP_BEGIN_DECLARE_TARGET_CLAUSE_MASK): Define.
(c_parser_omp_begin): Add begin declare target support.
(c_parser_omp_end): Adjust for the
current_omp_declare_target_attribute type change, adjust
diagnostics wording and simplify format strings for easier
translations.
* c-decl.cc (current_omp_declare_target_attribute): Change type from
int to vec<c_omp_declare_target_attr, va_gc> *.
(c_decl_attributes): Adjust for the
current_omp_declare_target_attribute type change. If device_type
was present on begin declare target, add "omp declare target host"
and/or "omp declare target nohost" attributes.
gcc/cp/
* cp-tree.h (struct omp_declare_target_attr): Rename to ...
(cp_omp_declare_target_attr): ... this. Add device_type member.
(omp_begin_assumes_data): Rename to ...
(cp_omp_begin_assumes_data): ... this.
(struct saved_scope): Change types of omp_declare_target_attribute
and omp_begin_assumes.
* parser.cc (cp_parser_omp_clause_device_type): Uncomment
check_no_duplicate_clause call.
(cp_parser_omp_all_clauses): Fix up pasto, c_name for OMP_CLAUSE_LINK
should be "link" rather than "to".
(cp_parser_omp_declare_target): Adjust for omp_declare_target_attr
to cp_omp_declare_target_attr changes, push -1 as device_type. Use
error_at rather than warning_at for declare target with only
device_type clauses.
(OMP_BEGIN_DECLARE_TARGET_CLAUSE_MASK): Define.
(cp_parser_omp_begin): Add begin declare target support. Adjust
for omp_begin_assumes_data to cp_omp_begin_assumes_data change.
(cp_parser_omp_end): Adjust for the
omp_declare_target_attr to cp_omp_declare_target_attr and
omp_begin_assumes_data to cp_omp_begin_assumes_data type changes,
adjust diagnostics wording and simplify format strings for easier
translations.
* semantics.cc (finish_translation_unit): Likewise.
* decl2.cc (cplus_decl_attributes): If device_type was present on
begin declare target, add "omp declare target host" and/or
"omp declare target nohost" attributes.
gcc/testsuite/
* c-c++-common/gomp/declare-target-4.c: Move tests that are now
rejected into declare-target-7.c.
* c-c++-common/gomp/declare-target-6.c: Adjust expected diagnostics.
* c-c++-common/gomp/declare-target-7.c: New test.
* c-c++-common/gomp/begin-declare-target-1.c: New test.
* c-c++-common/gomp/begin-declare-target-2.c: New test.
* c-c++-common/gomp/begin-declare-target-3.c: New test.
* c-c++-common/gomp/begin-declare-target-4.c: New test.
* g++.dg/gomp/attrs-9.C: Add begin declare target tests.
* g++.dg/gomp/attrs-18.C: New test.
libgomp/
* libgomp.texi (Support begin/end declare target syntax in C/C++):
Mark as implemented.
OpenMP 5.1 added has_device_addr and relaxed the restrictions for
use_device_ptr, including processing non-type(c_ptr) arguments as
if has_device_addr was used. (There is a semantic difference.)
For completeness, the likewise change was done for 'use_device_ptr',
where non-type(c_ptr) arguments now use use_device_addr.
Finally, a warning for 'device(omp_{initial,invalid}_device)' was
silenced on the way as affecting the new testcase.
PR fortran/105318
gcc/fortran/ChangeLog:
* openmp.cc (resolve_omp_clauses): Update is_device_ptr restrictions
for OpenMP 5.1 and map to has_device_addr where applicable; map
use_device_ptr to use_device_addr where applicable.
Silence integer-range warning for device(omp_{initial,invalid}_device).
libgomp/ChangeLog:
* testsuite/libgomp.fortran/is_device_ptr-2.f90: New test.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/is_device_ptr-1.f90: Remove dg-error.
* gfortran.dg/gomp/is_device_ptr-2.f90: Likewise.
* gfortran.dg/gomp/is_device_ptr-3.f90: Update tree-scan-dump.
This patch changes c_tree_equal to work more like cp_tree_equal, be
more strict in what it accepts. The ICE on the first testcase was
due to INTEGER_CST wi::wide (t1) == wi::wide (t2) comparison which
ICEs if the two constants have different precision, but as the second
testcase shows, being too lenient in it can also lead to miscompilation
of valid OpenMP programs where we think certain expression is the same
even when it isn't and can be guaranteed at runtime to represent different
memory location. So, the patch looks through only NON_LVALUE_EXPRs
and for constants as well as casts requires that the types match before
actually comparing the constant values or recursing on the cast operands.
2022-09-24 Jakub Jelinek <jakub@redhat.com>
PR c/106981
gcc/c/
* c-typeck.cc (c_tree_equal): Only strip NON_LVALUE_EXPRs at the
start. For CONSTANT_CLASS_P or CASE_CONVERT: return false if t1 and
t2 have different types.
gcc/testsuite/
* c-c++-common/gomp/pr106981.c: New test.
libgomp/
* testsuite/libgomp.c-c++-common/pr106981.c: New test.
This patch refactors struct sibling-list processing in gimplify.cc, and
adjusts some related mapping-clause processing in the Fortran FE and
omp-low.cc accordingly.
2022-09-13 Julian Brown <julian@codesourcery.com>
gcc/fortran/
* trans-openmp.cc (gfc_trans_omp_clauses): Don't create
GOMP_MAP_TO_PSET mappings for class metadata, nor GOMP_MAP_POINTER
mappings for POINTER_TYPE_P decls.
gcc/
* gimplify.cc (gimplify_omp_var_data): Remove GOVD_MAP_HAS_ATTACHMENTS.
(GOMP_FIRSTPRIVATE_IMPLICIT): Renumber.
(insert_struct_comp_map): Refactor function into...
(build_omp_struct_comp_nodes): This new function. Remove list handling
and improve self-documentation.
(extract_base_bit_offset): Remove BASE_REF, OFFSETP parameters. Move
code to strip outer parts of address out of function, but strip no-op
conversions.
(omp_mapping_group): Add DELETED field for use during reindexing.
(omp_strip_components_and_deref, omp_strip_indirections): New functions.
(omp_group_last, omp_group_base): Add GOMP_MAP_STRUCT handling.
(omp_gather_mapping_groups): Initialise DELETED field for new groups.
(omp_index_mapping_groups): Notice DELETED groups when (re)indexing.
(omp_siblist_insert_node_after, omp_siblist_move_node_after,
omp_siblist_move_nodes_after, omp_siblist_move_concat_nodes_after): New
helper functions.
(omp_accumulate_sibling_list): New function to build up GOMP_MAP_STRUCT
node groups for sibling lists. Outlined from gimplify_scan_omp_clauses.
(omp_build_struct_sibling_lists): New function.
(gimplify_scan_omp_clauses): Remove struct_map_to_clause,
struct_seen_clause, struct_deref_set. Call
omp_build_struct_sibling_lists as pre-pass instead of handling sibling
lists in the function's main processing loop.
(gimplify_adjust_omp_clauses_1): Remove GOVD_MAP_HAS_ATTACHMENTS
handling, unused now.
* omp-low.cc (scan_sharing_clauses): Handle pointer-type indirect
struct references, and references to pointers to structs also.
gcc/testsuite/
* g++.dg/goacc/member-array-acc.C: New test.
* g++.dg/gomp/member-array-omp.C: New test.
* g++.dg/gomp/target-3.C: Update expected output.
* g++.dg/gomp/target-lambda-1.C: Likewise.
* g++.dg/gomp/target-this-2.C: Likewise.
* c-c++-common/goacc/deep-copy-arrayofstruct.c: Move test from here.
* c-c++-common/gomp/target-50.c: New test.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/deep-copy-15.c: New test.
* testsuite/libgomp.oacc-c-c++-common/deep-copy-16.c: New test.
* testsuite/libgomp.oacc-c++/deep-copy-17.C: New test.
* testsuite/libgomp.oacc-c-c++-common/deep-copy-arrayofstruct.c: Move
test to here, make "run" test.
While icv_addr[1] = false; assignments where icv_addr has void *
element type is correct and matches how it is used (in those cases
the void * pointer is then cast to bool and used that way), there is no
reason not to add explicit (void *) casts there which are there already
for (void *) true. And, there is in fact even no point in actually
doing those stores at all because we set that pointer to NULL a few
lines earlier. So, this patch adds the explicit casts and then
comments those out to show intent.
2022-09-13 Jakub Jelinek <jakub@redhat.com>
PR libgomp/106906
* env.c (get_icv_member_addr): Cast false to void * before assigning
it to icv_addr[1], and comment the whole assignment out.
I misplaced one remark into 'gcn' instead of 'nvptx' in
commit r13-2625-g6b43f556f392a7165582aca36a19fe7389d995b2
libgomp/ChangeLog:
* libgomp.texi (gcn): Move misplaced -march=sm_30 remark to ...
(nvptx): ... here.
Reverse offload requests at least -misa=sm_35; with this patch, a warning
instead of an error is shown, still permitting reverse offload for all
other configured device types. This is achieved by not calling
GOMP_offload_register_ver (and stopping generating pointless 'static const char'
variables, once known.)
The tool_name as progname changes adds "nvptx " and "gcn " to the
"mkoffload: warning/error:" diagnostic.
gcc/ChangeLog:
* config/nvptx/mkoffload.cc (process): Replace a fatal_error by
a warning + not enabling offloading if -misa=sm_30 prevents
reverse offload.
(main): Use tool_name as progname for diagnostic.
* config/gcn/mkoffload.cc (main): Likewise.
libgomp/ChangeLog:
* libgomp.texi (Offload-Target Specifics: nvptx): Document
that reverse offload requires >= -march=sm_35.
* testsuite/libgomp.c-c++-common/requires-4.c: Build for nvptx
with -misa=sm_35.
* testsuite/libgomp.c-c++-common/requires-5.c: Likewise.
* testsuite/libgomp.c-c++-common/requires-6.c: Likewise.
* testsuite/libgomp.c-c++-common/reverse-offload-1.c: Likewise.
* testsuite/libgomp.fortran/reverse-offload-1.f90: Likewise.
* testsuite/libgomp.c/reverse-offload-sm30.c: New test.
The thing is,
make check
or
make check RUNTESTFLAGS="c.exp='icv-6.c' c++.exp='icv-6.c'"
in libgomp obj dir work fine, but
make -j32 -k check RUNTESTFLAGS="c.exp='icv-6.c' c++.exp='icv-6.c'"
fails.
The thing is that the testcase as written relies on OMP_NUM_THREADS not being
set in environment (as it takes priority over OMP_NUM_THREADS_ALL for the
host).
So, if either a user has OMP_NUM_THREADS=42 in the environment by himself, or
when doing make check with -jN, we trigger:
if test $$num_cpus -gt 8 && test -z "$$OMP_NUM_THREADS"; then \
OMP_NUM_THREADS=8; export OMP_NUM_THREADS; \
echo @@@ libgomp OMP_NUM_THREADS adjusted to 8 because of parallel
make check and too many CPUs; \
fi; \
in libgomp/testsuite/Makefile.am and so the test fails.
2022-09-12 Jakub Jelinek <jakub@redhat.com>
PR libgomp/106894
* testsuite/libgomp.c-c++-common/icv-6.c: Include string.h.
(main): Avoid tests for which corresponding non-_ALL suffixed variable
is in the environment, or for OMP_NUM_TEAMS on the device
OMP_NUM_TEAMS_DEV_?.