
Reverse offload requires at least -misa=sm_35; with this patch, a warning
instead of an error is shown, still permitting reverse offload for all
other configured device types. This is achieved by not calling
GOMP_offload_register_ver (and by no longer generating pointless
'static const char' variables, once known).

The tool_name as progname change adds "nvptx " and "gcn " to the
"mkoffload: warning/error:" diagnostic.

gcc/ChangeLog:

	* config/nvptx/mkoffload.cc (process): Replace a fatal_error by
	a warning + not enabling offloading if -misa=sm_30 prevents
	reverse offload.
	(main): Use tool_name as progname for diagnostic.
	* config/gcn/mkoffload.cc (main): Likewise.

libgomp/ChangeLog:

	* libgomp.texi (Offload-Target Specifics: nvptx): Document
	that reverse offload requires >= -march=sm_35.
	* testsuite/libgomp.c-c++-common/requires-4.c: Build for nvptx
	with -misa=sm_35.
	* testsuite/libgomp.c-c++-common/requires-5.c: Likewise.
	* testsuite/libgomp.c-c++-common/requires-6.c: Likewise.
	* testsuite/libgomp.c-c++-common/reverse-offload-1.c: Likewise.
	* testsuite/libgomp.fortran/reverse-offload-1.f90: Likewise.
	* testsuite/libgomp.c/reverse-offload-sm30.c: New test.
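The behaviour change described in the commit message (warn and skip registration rather than abort) can be sketched roughly as below. This is only an illustration, not the code in mkoffload.cc: the function name `should_register_image`, the flag `REQ_REVERSE_OFFLOAD` and its value, and the wording of the warning are all hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical flag with an illustrative value; this is not the
   constant GCC actually uses.  */
#define REQ_REVERSE_OFFLOAD 0x1u

/* Return true if the device image should be registered with the
   runtime.  SM_VERSION is the numeric part of -misa=sm_NN.
   If reverse offload was requested but the ISA is older than sm_35,
   print a warning and return false so that the caller skips
   registration instead of stopping with a fatal error.  */
bool
should_register_image (int sm_version, unsigned requires_mask)
{
  if ((requires_mask & REQ_REVERSE_OFFLOAD) != 0 && sm_version < 35)
    {
      /* Illustrative message text only.  */
      fprintf (stderr, "nvptx mkoffload: warning: reverse offload "
                       "needs at least sm_35; image not registered\n");
      return false;
    }
  return true;
}
```

In this sketch, a build with sm_30 and reverse offload requested yields `false`, so only the nvptx image would be left out while other configured device types stay usable.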
/* { dg-additional-options "-foffload-options=nvptx-none=-misa=sm_35" { target { offload_target_nvptx } } } */
/* { dg-additional-sources requires-5-aux.c } */

/* Depending on offload device capabilities, it may print something like the
   following (only) if GOMP_DEBUG=1:
   "devices present but 'omp requires unified_address, unified_shared_memory, reverse_offload' cannot be fulfilled"
   and in that case does host-fallback execution.

   As no offload devices support USM at present, we may verify host-fallback
   execution by absence of separate memory spaces.  */

#pragma omp requires unified_shared_memory, unified_address, reverse_offload

int a[10] = { 0 };
extern void foo (void);

int
main (void)
{
  #pragma omp target map(to: a)
  for (int i = 0; i < 10; i++)
    a[i] = i;

  for (int i = 0; i < 10; i++)
    if (a[i] != i)
      __builtin_abort ();

  foo ();
  return 0;
}