..., by simplifying 'libgomp.oacc-c-c++-common/parallel-dims.c', and updating
the former correspondingly. '__builtin_goacc_parlevel_id' does the right thing
for all 'acc_device_*'.
Follow-up to commit 09e0ad6253 "Update OpenACC
tests for amdgcn".
libgomp/
* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Simplify.
* testsuite/libgomp.oacc-fortran/parallel-dims-aux.c: Update.
When gcc is configured for nvptx offloading with --without-cuda-driver
and full CUDA isn't installed, many libgomp.oacc-*/* tests fail,
some of them because cuda.h header can't be found, others because
the tests can't be linked against -lcuda, -lcudart or -lcublas.
I usually only have akmod-nvidia and xorg-x11-drv-nvidia-cuda rpms
installed, so libcuda.so.1 can be dlopened and the offloading works,
but linking against those libraries isn't possible nor are the
headers around (for the plugin itself there is the fallback
libgomp/plugin/cuda/cuda.h).
The following patch adds 3 new effective targets and uses them in tests that
needs those.
2021-05-27 Jakub Jelinek <jakub@redhat.com>
* testsuite/lib/libgomp.exp (check_effective_target_openacc_cuda,
check_effective_target_openacc_cublas,
check_effective_target_openacc_cudart): New.
* testsuite/libgomp.oacc-fortran/host_data-4.f90: Require effective
target openacc_cublas.
* testsuite/libgomp.oacc-fortran/host_data-2.f90: Likewise.
* testsuite/libgomp.oacc-fortran/host_data-3.f: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-91.c: Require effective
target openacc_cuda.
* testsuite/libgomp.oacc-c-c++-common/lib-70.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-90.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-75.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-69.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-74.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-81.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-72.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-85.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/pr87835.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-82.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-73.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-83.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-78.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-76.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-84.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-79.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/host_data-1.c: Require effective
targets openacc_cublas and openacc_cudart.
* testsuite/libgomp.oacc-c-c++-common/context-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/context-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/context-3.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/context-4.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/acc_get_property-nvptx.c:
Require effective target openacc_cudart.
* testsuite/libgomp.oacc-c-c++-common/asyncwait-1.c: Add -DUSE_CUDA_H
for effective target openacc_cuda and add && defined USE_CUDA_H to
preprocessor conditionals. Guard -lcuda also on openacc_cuda
effective target.
This patch implements a method to track the "private-ness" of
OpenACC variables declared in offload regions in gang-partitioned,
worker-partitioned or vector-partitioned modes. Variables declared
implicitly in scoped blocks and those declared "private" on enclosing
directives (e.g. "acc parallel") are both handled. Variables that are
e.g. gang-private can then be adjusted so they reside in GPU shared
memory.
The reason for doing this is twofold: correct implementation of OpenACC
semantics, and optimisation, since shared memory might be faster than
the main memory on a GPU. Handling of private variables is intimately
tied to the execution model for gangs/workers/vectors implemented by
a particular target: for current targets, we use (or on mainline, will
soon use) a broadcasting/neutering scheme.
That is sufficient for code that e.g. sets a variable in worker-single
mode and expects to use the value in worker-partitioned mode. The
difficulty (semantics-wise) comes when the user wants to do something like
an atomic operation in worker-partitioned mode and expects a worker-single
(gang private) variable to be shared across each partitioned worker.
Forcing use of shared memory for such variables makes that work properly.
In terms of implementation, the parallelism level of a given loop is
not fixed until the oaccdevlow pass in the offload compiler, so the
patch delays fixing the parallelism level of variables declared on or
within such loops until the same point. This is done by adding a new
internal UNIQUE function (OACC_PRIVATE) that lists (the address of) each
private variable as an argument, and other arguments set so as to be able
to determine the correct parallelism level to use for the listed
variables. This new internal function fits into the existing scheme for
demarcating OpenACC loops, as described in comments in the patch.
Two new target hooks are introduced: TARGET_GOACC_ADJUST_PRIVATE_DECL and
TARGET_GOACC_EXPAND_VAR_DECL. The first can tweak a variable declaration
at oaccdevlow time, and the second at expand time. The first or both
of these target hooks can be used by a given offload target, depending
on its strategy for implementing private variables.
This patch updates the TARGET_GOACC_ADJUST_PRIVATE_DECL target hook in
the AMD GCN backend to the current name and prototype. (An earlier
version of the hook was already present, but dormant.)
gcc/
PR middle-end/90115
* doc/tm.texi.in (TARGET_GOACC_EXPAND_VAR_DECL)
(TARGET_GOACC_ADJUST_PRIVATE_DECL): Add documentation hooks.
* doc/tm.texi: Regenerate.
* expr.c (expand_expr_real_1): Expand decls using the
expand_var_decl OpenACC hook if defined.
* internal-fn.c (expand_UNIQUE): Handle IFN_UNIQUE_OACC_PRIVATE.
* internal-fn.h (IFN_UNIQUE_CODES): Add OACC_PRIVATE.
* omp-low.c (omp_context): Add oacc_privatization_candidates
field.
(lower_oacc_reductions): Add PRIVATE_MARKER parameter. Insert
before fork.
(lower_oacc_head_tail): Add PRIVATE_MARKER parameter. Modify
private marker's gimple call arguments, and pass it to
lower_oacc_reductions.
(oacc_privatization_scan_clause_chain)
(oacc_privatization_scan_decl_chain, lower_oacc_private_marker):
New functions.
(lower_omp_for, lower_omp_target, lower_omp_1): Use these.
* omp-offload.c (convert.h): Include.
(oacc_loop_xform_head_tail): Treat private-variable markers like
fork/join when transforming head/tail sequences.
(struct var_decl_rewrite_info): Add struct.
(oacc_rewrite_var_decl, is_sync_builtin_call): New functions.
(execute_oacc_device_lower): Support rewriting gang-private
variables using target hook, and fix up addr_expr and var_decl
nodes afterwards.
* target.def (adjust_private_decl, expand_var_decl): New hooks.
* config/gcn/gcn-protos.h (gcn_goacc_adjust_gangprivate_decl):
Rename to...
(gcn_goacc_adjust_private_decl): ...this.
* config/gcn/gcn-tree.c (gcn_goacc_adjust_gangprivate_decl):
Rename to...
(gcn_goacc_adjust_private_decl): ...this. Add LEVEL parameter.
* config/gcn/gcn.c (TARGET_GOACC_ADJUST_GANGPRIVATE_DECL): Rename
definition using gcn_goacc_adjust_gangprivate_decl...
(TARGET_GOACC_ADJUST_PRIVATE_DECL): ...to this, using
gcn_goacc_adjust_private_decl.
* config/nvptx/nvptx.c (tree-pretty-print.h): Include.
(gang_private_shared_size): New global variable.
(gang_private_shared_align): Likewise.
(gang_private_shared_sym): Likewise.
(gang_private_shared_hmap): Likewise.
(nvptx_option_override): Initialize these.
(nvptx_file_end): Output gang_private_shared_sym.
(nvptx_goacc_adjust_private_decl, nvptx_goacc_expand_var_decl):
New functions.
(nvptx_set_current_function): Clear gang_private_shared_hmap.
(TARGET_GOACC_ADJUST_PRIVATE_DECL): Define hook.
(TARGET_GOACC_EXPAND_VAR_DECL): Likewise.
libgomp/
PR middle-end/90115
* testsuite/libgomp.oacc-c-c++-common/private-atomic-1-gang.c: New
test.
* testsuite/libgomp.oacc-fortran/private-atomic-1-gang.f90:
Likewise.
* testsuite/libgomp.oacc-fortran/private-atomic-1-worker.f90:
Likewise.
Co-Authored-By: Chung-Lin Tang <cltang@codesourcery.com>
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
It turned out that a compiler built without offloading support
and one with can produce slightly different diagnostic.
Offloading support implies ENABLE_OFFLOAD which implies that
g->have_offload is set when offloading is actually needed.
In cgraphunit.c, the latter causes flag_generate_offload = 1,
which in turn affects tree.c's free_lang_data.
The result is that the front-end specific diagnostic gets reset
('tree_diagnostics_defaults (global_dc)'), which affects in this
case 'Warning' vs. 'warning' via the Fortran frontend.
Result: 'Warning:' vs. 'warning:'.
Side note: Other FE also override the diagnostic, leading to
similar differences, e.g. the C++ FE outputs mangled function
names differently, cf. patch thread.
libgomp/ChangeLog:
* testsuite/libgomp.oacc-fortran/par-reduction-2-1.f:
Use [Ww]arning in dg-bogus as FE diagnostic and default
diagnostic differ and the result depends on ENABLE_OFFLOAD.
* testsuite/libgomp.oacc-fortran/par-reduction-2-2.f: Likewise.
* testsuite/libgomp.oacc-fortran/parallel-dims.f90: Likewise.
* testsuite/libgomp.oacc-fortran/parallel-reduction.f90: Likewise.
gcc/testsuite/ChangeLog:
* gfortran.dg/goacc/classify-serial.f95:
Use [Ww]arning in dg-bogus as FE diagnostic and default
diagnostic differ and the result depends on ENABLE_OFFLOAD.
* gfortran.dg/goacc/kernels-decompose-2.f95: Likewise.
* gfortran.dg/goacc/routine-module-mod-1.f90: Likewise.
Fixup for recent commit d28f3da11d "openacc: Fix
lowering for derived-type mappings through array elements". With nvptx
offloading we see the usual:
[...]/libgomp.oacc-fortran/derivedtypes-arrays-1.f90: In function 'MAIN__._omp_fn.0':
[...]/libgomp.oacc-fortran/derivedtypes-arrays-1.f90:90:40: warning: using vector_length (32), ignoring 1
libgomp/
* testsuite/libgomp.oacc-fortran/derivedtypes-arrays-1.f90:
OpenACC 'serial' construct diagnostic for nvptx offloading.
This patch disallows selecting components of array sections in update
directives for OpenACC, as specified in OpenACC 3.0, "2.14.4. Update
Directive":
In Fortran, members of variables of derived type may appear, including
a subarray of a member. Members of subarrays of derived type may
not appear.
The diagnostic for attempting to use the same construct on other
directives has also been improved.
gcc/fortran/
* openmp.c (resolve_omp_clauses): Disallow selecting components
of arrays of derived type.
gcc/testsuite/
* gfortran.dg/goacc/array-with-dt-2.f90: Remove expected errors.
* gfortran.dg/goacc/array-with-dt-6.f90: New test.
* gfortran.dg/goacc/mapping-tests-2.f90: Update expected error.
* gfortran.dg/goacc/ref_inquiry.f90: Update expected errors.
* gfortran.dg/gomp/ref_inquiry.f90: Likewise.
libgomp/
* testsuite/libgomp.oacc-fortran/array-stride-dt-1.f90: Remove
expected errors.
This patch fixes lowering of derived-type mappings which select elements
of arrays of derived types, and similar. These would previously lead
to ICEs.
With this change, OpenACC directives can pass through constructs that
are no longer recognized by the gimplifier, hence alterations are needed
there also.
gcc/fortran/
* trans-openmp.c (gfc_trans_omp_clauses): Handle element selection
for arrays of derived types.
gcc/
* gimplify.c (gimplify_scan_omp_clauses): Handle ATTACH_DETACH
for non-decls.
gcc/testsuite/
* gfortran.dg/goacc/array-with-dt-1.f90: New test.
* gfortran.dg/goacc/array-with-dt-3.f90: Likewise.
* gfortran.dg/goacc/array-with-dt-4.f90: Likewise.
* gfortran.dg/goacc/array-with-dt-5.f90: Likewise.
* gfortran.dg/goacc/derived-chartypes-1.f90: Re-enable test.
* gfortran.dg/goacc/derived-chartypes-2.f90: Likewise.
* gfortran.dg/goacc/derived-classtypes-1.f95: Uncomment
previously-broken directives.
libgomp/
* testsuite/libgomp.oacc-fortran/derivedtypes-arrays-1.f90: New test.
* testsuite/libgomp.oacc-fortran/update-dt-array.f90: Likewise.
This patch adds some XFAILs for PR98979 until the patch to fix them has
been approved. See:
https://gcc.gnu.org/pipermail/gcc-patches/2021-February/564711.html
gcc/testsuite/
PR fortran/98979
* gfortran.dg/goacc/array-with-dt-2.f90: Add expected errors.
* gfortran.dg/goacc/derived-chartypes-1.f90: Skip ICEing test.
* gfortran.dg/goacc/derived-chartypes-2.f90: Likewise.
libgomp/
PR fortran/98979
* testsuite/libgomp.oacc-fortran/array-stride-dt-1.f90: Add expected
errors.
OpenACC 3.0 ("2.14.4. Update Directive") states:
Noncontiguous subarrays may appear. It is implementation-specific
whether noncontiguous regions are updated by using one transfer for
each contiguous subregion, or whether the non-contiguous data is
packed, transferred once, and unpacked, or whether one or more larger
subarrays (no larger than the smallest contiguous region that contains
the specified subarray) are updated.
This patch relaxes some conditions in the Fortran front-end so that
strided accesses are permitted for update directives.
gcc/fortran/
* openmp.c (resolve_omp_clauses): Omit OpenACC update in
contiguity check and stride-specified error.
gcc/testsuite/
* gfortran.dg/goacc/array-with-dt-2.f90: New test.
libgomp/
* testsuite/libgomp.oacc-fortran/array-stride-dt-1.f90: New test.
The testcase had invalid assumptions about which loop iterations would run
first and last.
libgomp/ChangeLog
* testsuite/libgomp.oacc-fortran/atomic_capture-1.f90 (main): Adjust
expected results.
Document status quo re PR94358 "[OMP] Privatize internal array variables
introduced by the Fortran FE".
libgomp/
PR fortran/94358
* testsuite/libgomp.oacc-fortran/pr94358-1.f90: New.
Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
The gangs participating in a gang-partitioned loop are not all guaranteed
to complete before some given gang continues to execute beyond that loop.
This means that two existing test cases contain a race condition,
because a loop that may be gang-partitioned is followed immediately by
another loop. The fix is to place the loops in separate parallel regions.
2020-09-08 Julian Brown <julian@codesourcery.com>
libgomp/
* testsuite/libgomp.oacc-fortran/collapse-1.f90: Fix race condition.
* testsuite/libgomp.oacc-fortran/collapse-2.f90: Likewise.
With the pr96628-part1.f90 source and -ftree-slp-vectorize, we run into an
ICE due to the fact that V2DI mode is not handled in nvptx_gen_shuffle.
Fix this by adding handling of V2DI as well as V2SI mode in
nvptx_gen_shuffle.
Build and reg-tested on x86_64 with nvptx accelerator.
gcc/ChangeLog:
PR target/96428
* config/nvptx/nvptx.c (nvptx_gen_shuffle): Handle V2SI/V2DI.
libgomp/ChangeLog:
PR target/96428
* testsuite/libgomp.oacc-fortran/pr96628-part1.f90: New test.
* testsuite/libgomp.oacc-fortran/pr96628-part2.f90: New test.
Standalone attach and detach clauses should not create present/release
mappings for Fortran array descriptors (e.g. used when we have a pointer
to an array), both because it is unnecessary and because those mappings
will be incorrectly subject to reference counting. Simply omitting the
mappings means we just use GOMP_MAP_TO_PSET and GOMP_MAP_{ATTACH,DETACH}
mappings for array descriptors.
That requires a tweak in gimplify.c, since we may now see GOMP_MAP_TO_PSET
without a preceding data-movement mapping.
2020-08-03 Julian Brown <julian@codesourcery.com>
Thomas Schwinge <thomas@codesourcery.com>
gcc/fortran/
* trans-openmp.c (gfc_trans_omp_clauses): Don't create present/release
mappings for array descriptors.
gcc/
* gimplify.c (gimplify_omp_target_update): Allow GOMP_MAP_TO_PSET
without a preceding data-movement mapping.
gcc/testsuite/
* gfortran.dg/goacc/attach-descriptor.f90: Update pattern output. Add
scanning of gimplify dump.
libgomp/
* testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90: Don't run for
shared-memory devices. Extend with further checking.
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
Attach and detach operations are not supposed to affect structural or
dynamic reference counts for OpenACC. Previously they did so, which led to
subtle problems in some circumstances. We can avoid reference-counting
attach/detach operations by extending and slightly repurposing the
do_detach field in target_var_desc. It is now called is_attach to better
reflect its new role.
2020-07-27 Julian Brown <julian@codesourcery.com>
Thomas Schwinge <thomas@codesourcery.com>
libgomp/
* libgomp.h (struct target_var_desc): Rename do_detach field to
is_attach.
* oacc-mem.c (goacc_exit_datum_1): Add assert. Don't set finalize for
GOMP_MAP_FORCE_DETACH. Update checking to use is_attach field.
(goacc_enter_data_internal): Don't affect reference counts
for attach mappings.
(goacc_exit_data_internal): Don't affect reference counts for detach
mappings.
* target.c (gomp_map_vars_existing): Don't affect reference counts for
attach mappings.
(gomp_map_vars_internal): Set renamed is_attach flag unconditionally to
mark attach mappings.
(gomp_unmap_vars_internal): Use is_attach flag to prevent affecting
reference count for attach mappings.
* testsuite/libgomp.oacc-c-c++-common/mdc-refcount-1.c: New test.
* testsuite/libgomp.oacc-c-c++-common/mdc-refcount-2.c: New test.
* testsuite/libgomp.oacc-c-c++-common/mdc-refcount-2.c: New test.
* testsuite/libgomp.oacc-fortran/deep-copy-6-no_finalize.F90: Mark
test as shouldfail.
* testsuite/libgomp.oacc-fortran/deep-copy-6.f90: Adjust to fail
gracefully in no-finalize mode.
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
..., so that we don't leak this into '*.exp' files running later.
This is relevant after commit efc16503ca "handle
dumpbase in offloading, adjust testsuite" -- I was confused why in a
(simplified) testing sequence as follows:
default 'libgomp.c/c.exp'
default 'libgomp.oacc-c/c.exp'
'-m32' 'libgomp.c/c.exp'
'-m32' 'libgomp.oacc-c/c.exp'
..., the "'-m32' 'libgomp.c/c.exp'" variant would not execute any offloading
dump scanning. The reason is that the "default 'libgomp.oacc-c/c.exp'" variant
ends with 'offload_target=disable' set, so that's what the "'-m32'
'libgomp.c/c.exp'" variant would then see, in particular
'gcc/testsuite/lib/scanoffload.exp:scoff'.
libgomp/
* testsuite/libgomp.oacc-c++/c++.exp: Unset 'offload_target' after
use.
* testsuite/libgomp.oacc-c/c.exp: Likewise.
* testsuite/libgomp.oacc-fortran/fortran.exp: Likewise.
This patch makes it so that an "attach" operation for a Fortran pointer
with an array descriptor copies that array descriptor to the target,
and similarly that detach operations release the array descriptor.
2020-07-16 Julian Brown <julian@codesourcery.com>
Thomas Schwinge <thomas@codesourcery.com>
gcc/fortran/
* trans-openmp.c (gfc_trans_omp_clauses): Rework OpenACC
attach/detach handling for arrays with descriptors.
gcc/testsuite/
* gfortran.dg/goacc/attach-descriptor.f90: New test.
libgomp/
* testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90: New test.
* testsuite/libgomp.oacc-fortran/attach-descriptor-2.f90: New test.
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
OpenACC 2.6 specifies that the array descriptor (when present) must be
copied to the target before attaching pointers in Fortran. This patch
reverses the stripping of GOMP_MAP_TO_PSET and GOMP_MAP_POINTER that
was introduced by the "OpenACC reference count overhaul" patch.
2020-07-10 Julian Brown <julian@codesourcery.com>
Thomas Schwinge <thomas@codesourcery.com>
gcc/
* gimplify.c (gimplify_scan_omp_clauses): Do not strip
GOMP_MAP_TO_PSET/GOMP_MAP_POINTER for OpenACC enter/exit data
directives (see also PR92929).
gcc/testsuite/
* gfortran.dg/goacc/finalize-1.f: Update expected dump output.
libgomp/
* testsuite/libgomp.oacc-fortran/dynamic-pointer-1.f90: New test.
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
This patch adjusts how dynamic reference counts work so that they match
the semantics of the source program more closely, instead of representing
"excess" reference counts beyond those that represent pointers in the
internal libgomp splay-tree data structure. This allows some corner
cases to be handled more gracefully.
2020-07-10 Julian Brown <julian@codesourcery.com>
Thomas Schwinge <thomas@codesourcery.com>
libgomp/
* libgomp.h (struct splay_tree_key_s): Change virtual_refcount to
dynamic_refcount.
(struct gomp_device_descr): Remove GOMP_MAP_VARS_OPENACC_ENTER_DATA.
* oacc-mem.c (acc_map_data): Substitute virtual_refcount for
dynamic_refcount.
(acc_unmap_data): Update comment.
(goacc_map_var_existing, goacc_enter_datum): Adjust for
dynamic_refcount semantics.
(goacc_exit_datum_1, goacc_exit_datum): Re-add some error checking.
Adjust for dynamic_refcount semantics.
(goacc_enter_data_internal): Implement "present" case of dynamic
memory-map handling here. Update "non-present" case for
dynamic_refcount semantics.
(goacc_exit_data_internal): Use goacc_exit_datum_1.
* target.c (gomp_map_vars_internal): Remove
GOMP_MAP_VARS_OPENACC_ENTER_DATA handling. Update for dynamic_refcount
handling.
(gomp_unmap_vars_internal): Remove virtual_refcount handling.
(gomp_load_image_to_device): Substitute dynamic_refcount for
virtual_refcount.
* testsuite/libgomp.oacc-c-c++-common/pr92843-1.c: Remove XFAILs.
* testsuite/libgomp.oacc-c-c++-common/refcounting-1.c: New test.
* testsuite/libgomp.oacc-c-c++-common/refcounting-2.c: New test.
* testsuite/libgomp.oacc-c-c++-common/struct-3-1-1.c: New test.
* testsuite/libgomp.oacc-fortran/deep-copy-6.f90: Remove XFAILs and
trace output.
* testsuite/libgomp.oacc-fortran/deep-copy-6-no_finalize.F90: Remove
trace output.
* testsuite/libgomp.oacc-fortran/dynamic-incr-structural-1.f90: New
test.
* testsuite/libgomp.oacc-c-c++-common/structured-dynamic-lifetimes-4.c:
Remove stale comment.
* testsuite/libgomp.oacc-fortran/mdc-refcount-1-1-1.f90: Remove XFAILs.
* testsuite/libgomp.oacc-fortran/mdc-refcount-1-1-2.F90: Likewise.
* testsuite/libgomp.oacc-fortran/mdc-refcount-1-2-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/mdc-refcount-1-2-2.f90: Likewise.
* testsuite/libgomp.oacc-fortran/mdc-refcount-1-3-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/mdc-refcount-1-4-1.f90: Adjust XFAIL.
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
gcc/fortran/ChangeLog
* parse.c (decode_oacc_directive): Permit 'acc routine' also
inside pure procedures.
* openmp.c (gfc_match_oacc_routine): Inside pure procedures
do not permit gang, worker or vector clauses.
libgomp/ChangeLog:
* testsuite/libgomp.oacc-fortran/routine-10.f90: New test.
gcc/testsuite/ChangeLog:
* gfortran.dg/goacc/pure-elemental-procedures-2.f90: New test.
Reviewed-by: Thomas Schwinge <thomas@codesourcery.com>
If, for example, GCC is configured such that 'libgomp-plugin-nvptx.so.1'
dynamically links against 'libcuda.so.1', but testing is run on a system where
there is no 'libcuda.so.1', this produces output such as:
PASS: libgomp.oacc-fortran/error_stop-1.f -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O0 (test for excess errors)
PASS: libgomp.oacc-fortran/error_stop-1.f -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O0 execution test
FAIL: libgomp.oacc-fortran/error_stop-1.f -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O0 output pattern test, is CheCKpOInT
libgomp: while loading libgomp-plugin-nvptx.so.1: libcuda.so.1: cannot open shared object file: No such file or directory
ERROR STOP
Error termination. Backtrace: [...]
, should match CheCKpOInT(
|
|^M)+ERROR STOP (
|
|^M)+Error termination.*
..., where after 'CheCKpOInT' we got 'libgomp: while loading [...]' injected
before the expected 'ERROR STOP'.
libgomp/
* testsuite/libgomp.oacc-fortran/error_stop-1.f: Initialize before
the checkpoint.
* testsuite/libgomp.oacc-fortran/error_stop-2.f: Likewise.
* testsuite/libgomp.oacc-fortran/error_stop-3.f: Likewise.
* testsuite/libgomp.oacc-fortran/stop-1.f: Likewise.
* testsuite/libgomp.oacc-fortran/stop-2.f: Likewise.
* testsuite/libgomp.oacc-fortran/stop-3.f: Likewise.
2020-03-18 Julian Brown <julian@codesourcery.com>
Tobias Burnus <tobias@codesourcery.com>
* testsuite/libgomp.oacc-fortran/atomic_capture-1.f90: Really make
it work concurrently.
The commit r10-6721-g8d1a1cb1b816381bf60cb1211c93b8eba1fe1472 has changed
the name of the type that is used for the return value of the Fortran
acc_get_property function without adapting the test acc_get_property.f90.
2020-02-21 Frederik Harwath <frederik@codesourcery.com>
* testsuite/libgomp.oacc-fortran/acc_get_property.f90: Adapt to
changes from 2020-02-19, i.e. use integer(c_size_t) instead of
integer(acc_device_property) for the type of the return value of
acc_get_property.
Add full support for the OpenACC 2.6 acc_get_property and
acc_get_property_string functions to the libgomp GCN plugin.
libgomp/
* plugin-gcn.c (struct agent_info): Add fields "name" and
"vendor_name" ...
(GOMP_OFFLOAD_init_device): ... and init from here.
(struct hsa_context_info): Add field "driver_version_s" ...
(init_hsa_contest): ... and init from here.
(GOMP_OFFLOAD_openacc_get_property): Replace stub with a proper
implementation.
* testsuite/libgomp.oacc-c-c++-common/acc_get_property.c:
Enable test execution for amdgcn and host offloading targets.
* testsuite/libgomp.oacc-fortran/acc_get_property.f90: Likewise.
* testsuite/libgomp.oacc-c-c++-common/acc_get_property-aux.c
(expect_device_properties): Split function into ...
(expect_device_string_properties): ... this new function ...
(expect_device_memory): ... and this new function.
* testsuite/libgomp.oacc-c-c++-common/acc_get_property-gcn.c:
Add test.
gcc/fortran/
* gfortran.h (gfc_symbol): Add comp_mark bitfield.
* openmp.c (resolve_omp_clauses): Disallow mixed component and
full-derived-type accesses to the same variable within a single
directive.
libgomp/
* testsuite/libgomp.oacc-fortran/deep-copy-2.f90: Remove test from here.
* testsuite/libgomp.oacc-fortran/deep-copy-3.f90: Don't use mixed
component/non-component variable refs in a single directive.
* testsuite/libgomp.oacc-fortran/classtypes-1.f95: Likewise.
gcc/testsuite/
* gfortran.dg/goacc/deep-copy-2.f90: Move test here (from libgomp
testsuite). Make a compilation test, and expect rejection of mixed
component/non-component accesses.
* gfortran.dg/goacc/mapping-tests-1.f90: New test.
Add generic support for the OpenACC 2.6 `acc_get_property' and
`acc_get_property_string' routines, as well as full handlers for the
host and the NVPTX offload targets and minimal handlers for the HSA,
Intel MIC, and AMD GCN offload targets.
Included are C/C++ and Fortran tests that, in particular, print
the property values for acc_property_vendor, acc_property_memory,
acc_property_free_memory, acc_property_name, and acc_property_driver.
The output looks as follows:
Vendor: GNU
Name: GOMP
Total memory: 0
Free memory: 0
Driver: 1.0
with the host driver (where the memory related properties are not
supported for the host device and yield 0, conforming to the standard)
and output like:
Vendor: Nvidia
Total memory: 12651462656
Free memory: 12202737664
Name: TITAN V
Driver: CUDA Driver 9.1
with the NVPTX driver.
2019-12-22 Maciej W. Rozycki <macro@codesourcery.com>
Frederik Harwath <frederik@codesourcery.com>
Thomas Schwinge <tschwinge@codesourcery.com>
include/
* gomp-constants.h (gomp_device_property): New enum.
libgomp/
* libgomp.h (gomp_device_descr): Add `get_property_func' member.
* libgomp-plugin.h (gomp_device_property_value): New union.
(gomp_device_property_value): New prototype.
* openacc.h (acc_device_t): Add `acc_device_current' enumeration
constant.
(acc_device_property_t): New enum.
(acc_get_property, acc_get_property_string): New prototypes.
* oacc-init.c (acc_get_device_type): Also assert that result
is not `acc_device_current'.
(get_property_any, acc_get_property, acc_get_property_string):
New functions.
* openacc.f90 (openacc_kinds): Add `acc_device_current' and
`acc_property_memory', `acc_property_free_memory',
`acc_property_name', `acc_property_vendor' and
`acc_property_driver' constants. Add `acc_device_property' data
type.
(openacc_internal): Add `acc_get_property' and
`acc_get_property_string' interfaces. Add `acc_get_property_h',
`acc_get_property_string_h', `acc_get_property_l' and
`acc_get_property_string_l'.
* oacc-host.c (host_get_property): New function.
(host_dispatch): Wire it.
* target.c (gomp_load_plugin_for_device): Handle `get_property'.
* libgomp.map (OACC_2.6): Add `acc_get_property', `acc_get_property_h_',
`acc_get_property_string' and `acc_get_property_string_h_' symbols.
* libgomp.texi (OpenACC Runtime Library Routines): Add
`acc_get_property'.
(acc_get_property): New node.
* plugin/plugin-gcn.c (GOMP_OFFLOAD_get_property): New
function (stub).
* plugin/plugin-hsa.c (GOMP_OFFLOAD_get_property): New function.
* plugin/plugin-nvptx.c (CUDA_CALLS): Add `cuDeviceGetName',
`cuDeviceTotalMem', `cuDriverGetVersion' and `cuMemGetInfo'
calls.
(GOMP_OFFLOAD_get_property): New function.
(struct ptx_device): Add new field "name".
(cuda_driver_version_s): Add new static variable ...
(nvptx_init): ... and init from here.
* testsuite/libgomp.oacc-c-c++-common/acc_get_property.c: New test.
* testsuite/libgomp.oacc-c-c++-common/acc_get_property-2.c: New test.
* testsuite/libgomp.oacc-c-c++-common/acc_get_property-3.c: New test.
* testsuite/libgomp.oacc-c-c++-common/acc_get_property-aux.c: New file
with test helper functions.
* testsuite/libgomp.oacc-fortran/acc_get_property.f90: New test.
liboffloadmic/
* plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_get_property):
New function.
Reviewed-by: Thomas Schwinge <thomas@codesourcery.com>
Co-Authored-By: Frederik Harwath <frederik@codesourcery.com>
Co-Authored-By: Thomas Schwinge <tschwinge@codesourcery.com>
From-SVN: r279710