Fixes for commit r14-2792-g25072a477a56a727b369bf9b20f4d18198ff5894
"OpenMP: Call cuMemcpy2D/cuMemcpy3D for nvptx for omp_target_memcpy_rect",
namely:
In that commit, the code was changed to handle shared-memory devices;
however, as pointed out, omp_target_memcpy_check already set the pointer
to NULL in that case. Hence, this commit reverts to the prior version.
In cuda.h, it adds cuMemcpyPeer{,Async} for symmetry for cuMemcpy3DPeer
(all currently unused) and in three structs, fixes reserved-member names
and remove a bogus 'const' in three structs.
And it changes a DLSYM to DLSYM_OPT as not all plugins support the new
functions, yet.
include/ChangeLog:
* cuda/cuda.h (CUDA_MEMCPY2D, CUDA_MEMCPY3D, CUDA_MEMCPY3D_PEER):
Remove bogus 'const' from 'const void *dst' and fix reserved-name
name in those structs.
(cuMemcpyPeer, cuMemcpyPeerAsync): Add.
libgomp/ChangeLog:
* target.c (omp_target_memcpy_rect_worker): Undo dim=1 change for
GOMP_OFFLOAD_CAP_SHARED_MEM.
(omp_target_memcpy_rect_copy): Likewise for lock condition.
(gomp_load_plugin_for_device): Use DLSYM_OPT not DLSYM for
memcpy3d/memcpy2d.
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_memcpy2d,
GOMP_OFFLOAD_memcpy3d): Use memset 0 to nullify reserved and
unused src/dst fields for that mem type; remove '{src,dst}LOD = 0'.