Commit graph

144 commits

Author SHA1 Message Date
Thomas Schwinge
e4cba49413 Remove support for Intel MIC offloading
... after its deprecation in GCC 12.

	* Makefile.def: Remove module 'liboffloadmic'.
	* Makefile.in: Regenerate.
	* configure.ac: Remove 'liboffloadmic' handling.
	* configure: Regenerate.
	contrib/
	* gcc-changelog/git_commit.py (default_changelog_locations):
	Remove 'liboffloadmic'.
	* gcc_update (files_and_dependencies): Remove 'liboffloadmic'
	files.
	* update-copyright.py (GCCCmdLine): Remove 'liboffloadmic'
	comment.
	gcc/
	* config.gcc [target *-intelmic-* | *-intelmicemul-*]: Remove.
	* config/i386/i386-options.cc (ix86_omp_device_kind_arch_isa)
	[ACCEL_COMPILER]: Remove.
	* config/i386/intelmic-mkoffload.cc: Remove.
	* config/i386/intelmic-offload.h: Likewise.
	* config/i386/t-intelmic: Likewise.
	* config/i386/t-omp-device: Likewise.
	* configure.ac [target *-intelmic-* | *-intelmicemul-*]: Remove.
	* configure: Regenerate.
	* doc/install.texi (--enable-offload-targets=[...]): Update.
	* doc/sourcebuild.texi: Remove 'liboffloadmic' documentation.
	include/
	* gomp-constants.h (GOMP_DEVICE_INTEL_MIC): Comment out.
	(GOMP_VERSION_INTEL_MIC): Remove.
	libgomp/
	* libgomp-plugin.h (OFFLOAD_TARGET_TYPE_INTEL_MIC): Remove.
	* libgomp.texi (OpenMP Context Selectors): Remove Intel MIC
	documentation.
	* plugin/configfrag.ac <enable_offload_targets>
	[*-intelmic-* | *-intelmicemul-*]: Remove.
	* configure: Regenerate.
	* testsuite/lib/libgomp.exp (libgomp_init): Remove 'liboffloadmic'
	handling.
	(offload_target_to_openacc_device_type)
	[$offload_target = *-intelmic*]: Remove.
	(check_effective_target_offload_device_intel_mic)
	(check_effective_target_offload_device_any_intel_mic): Remove.
	* testsuite/libgomp.c-c++-common/on_device_arch.h
	(device_arch_intel_mic, on_device_arch_intel_mic, any_device_arch)
	(any_device_arch_intel_mic): Remove.
	* testsuite/libgomp.c-c++-common/target-45.c: Remove
	'offload_device_any_intel_mic' XFAIL.
	* testsuite/libgomp.fortran/target10.f90: Likewise.
	liboffloadmic/
	* ChangeLog: Remove.
	* Makefile.am: Likewise.
	* Makefile.in: Likewise.
	* aclocal.m4: Likewise.
	* configure: Likewise.
	* configure.ac: Likewise.
	* configure.tgt: Likewise.
	* doc/doxygen/config: Likewise.
	* doc/doxygen/header.tex: Likewise.
	* include/coi/common/COIEngine_common.h: Likewise.
	* include/coi/common/COIEvent_common.h: Likewise.
	* include/coi/common/COIMacros_common.h: Likewise.
	* include/coi/common/COIPerf_common.h: Likewise.
	* include/coi/common/COIResult_common.h: Likewise.
	* include/coi/common/COISysInfo_common.h: Likewise.
	* include/coi/common/COITypes_common.h: Likewise.
	* include/coi/sink/COIBuffer_sink.h: Likewise.
	* include/coi/sink/COIPipeline_sink.h: Likewise.
	* include/coi/sink/COIProcess_sink.h: Likewise.
	* include/coi/source/COIBuffer_source.h: Likewise.
	* include/coi/source/COIEngine_source.h: Likewise.
	* include/coi/source/COIEvent_source.h: Likewise.
	* include/coi/source/COIPipeline_source.h: Likewise.
	* include/coi/source/COIProcess_source.h: Likewise.
	* liboffloadmic_host.spec.in: Likewise.
	* liboffloadmic_target.spec.in: Likewise.
	* plugin/Makefile.am: Likewise.
	* plugin/Makefile.in: Likewise.
	* plugin/aclocal.m4: Likewise.
	* plugin/configure: Likewise.
	* plugin/configure.ac: Likewise.
	* plugin/libgomp-plugin-intelmic.cpp: Likewise.
	* plugin/offload_target_main.cpp: Likewise.
	* runtime/cean_util.cpp: Likewise.
	* runtime/cean_util.h: Likewise.
	* runtime/coi/coi_client.cpp: Likewise.
	* runtime/coi/coi_client.h: Likewise.
	* runtime/coi/coi_server.cpp: Likewise.
	* runtime/coi/coi_server.h: Likewise.
	* runtime/compiler_if_host.cpp: Likewise.
	* runtime/compiler_if_host.h: Likewise.
	* runtime/compiler_if_target.cpp: Likewise.
	* runtime/compiler_if_target.h: Likewise.
	* runtime/dv_util.cpp: Likewise.
	* runtime/dv_util.h: Likewise.
	* runtime/emulator/coi_common.h: Likewise.
	* runtime/emulator/coi_device.cpp: Likewise.
	* runtime/emulator/coi_device.h: Likewise.
	* runtime/emulator/coi_host.cpp: Likewise.
	* runtime/emulator/coi_host.h: Likewise.
	* runtime/emulator/coi_version_asm.h: Likewise.
	* runtime/emulator/coi_version_linker_script.map: Likewise.
	* runtime/liboffload_error.c: Likewise.
	* runtime/liboffload_error_codes.h: Likewise.
	* runtime/liboffload_msg.c: Likewise.
	* runtime/liboffload_msg.h: Likewise.
	* runtime/mic_lib.f90: Likewise.
	* runtime/offload.h: Likewise.
	* runtime/offload_common.cpp: Likewise.
	* runtime/offload_common.h: Likewise.
	* runtime/offload_engine.cpp: Likewise.
	* runtime/offload_engine.h: Likewise.
	* runtime/offload_env.cpp: Likewise.
	* runtime/offload_env.h: Likewise.
	* runtime/offload_host.cpp: Likewise.
	* runtime/offload_host.h: Likewise.
	* runtime/offload_iterator.h: Likewise.
	* runtime/offload_omp_host.cpp: Likewise.
	* runtime/offload_omp_target.cpp: Likewise.
	* runtime/offload_orsl.cpp: Likewise.
	* runtime/offload_orsl.h: Likewise.
	* runtime/offload_table.cpp: Likewise.
	* runtime/offload_table.h: Likewise.
	* runtime/offload_target.cpp: Likewise.
	* runtime/offload_target.h: Likewise.
	* runtime/offload_target_main.cpp: Likewise.
	* runtime/offload_timer.h: Likewise.
	* runtime/offload_timer_host.cpp: Likewise.
	* runtime/offload_timer_target.cpp: Likewise.
	* runtime/offload_trace.cpp: Likewise.
	* runtime/offload_trace.h: Likewise.
	* runtime/offload_util.cpp: Likewise.
	* runtime/offload_util.h: Likewise.
	* runtime/ofldbegin.cpp: Likewise.
	* runtime/ofldend.cpp: Likewise.
	* runtime/orsl-lite/include/orsl-lite.h: Likewise.
	* runtime/orsl-lite/lib/orsl-lite.c: Likewise.
	* runtime/orsl-lite/version.txt: Likewise.
2022-11-04 10:51:01 +01:00
Thomas Schwinge
205538832b libgomp/nvptx: Prepare for reverse-offload callback handling, resolve spurious SIGSEGVs
Per commit r13-3460-g131d18e928a3ea1ab2d3bf61aa92d68a8a254609
"libgomp/nvptx: Prepare for reverse-offload callback handling",
I'm seeing a lot of libgomp execution test regressions.  Random
example, 'libgomp.c-c++-common/error-1.c':

    [...]
      GOMP_OFFLOAD_run: kernel main$_omp_fn$0: launch [(teams: 1), 1, 1] [(lanes: 32), (threads: 8), 1]

    Thread 1 "a.out" received signal SIGSEGV, Segmentation fault.
    0x00007ffff793b87d in GOMP_OFFLOAD_run (ord=<optimized out>, tgt_fn=<optimized out>, tgt_vars=<optimized out>, args=<optimized out>) at [...]/source-gcc/libgomp/plugin/plugin-nvptx.c:2127
    2127            if (__atomic_load_n (&ptx_dev->rev_data->fn, __ATOMIC_ACQUIRE) != 0)
    (gdb) print ptx_dev
    $1 = (struct ptx_device *) 0x6a55a0
    (gdb) print ptx_dev->rev_data
    $2 = (struct rev_offload *) 0xffffffff00000000
    (gdb) print ptx_dev->rev_data->fn
    Cannot access memory at address 0xffffffff00000000

	libgomp/
	* plugin/plugin-nvptx.c (nvptx_open_device): Initialize
	'ptx_dev->rev_data'.
2022-10-24 21:47:05 +02:00
Tobias Burnus
131d18e928 libgomp/nvptx: Prepare for reverse-offload callback handling
This patch adds a stub 'gomp_target_rev' in the host's target.c, which will
later handle the reverse offload.
For nvptx, it adds support for forwarding the offload gomp_target_ext call
to the host by setting values in a struct on the device and querying it on
the host - invoking gomp_target_rev on the result.

include/ChangeLog:

	* cuda/cuda.h (enum CUdevice_attribute): Add
	CU_DEVICE_ATTRIBUTE_UNIFIED_ADDRESSING.
	(CU_MEMHOSTALLOC_DEVICEMAP): Define.
	(cuMemHostAlloc): Add prototype.

libgomp/ChangeLog:

	* config/nvptx/icv-device.c (GOMP_DEVICE_NUM_VAR): Remove
	'static' for this variable.
	* config/nvptx/libgomp-nvptx.h: New file.
	* config/nvptx/target.c: Include it.
	(GOMP_ADDITIONAL_ICVS): Declare extern var.
	(GOMP_REV_OFFLOAD_VAR): Declare var.
	(GOMP_target_ext): Handle reverse offload.
	* libgomp-plugin.h (GOMP_PLUGIN_target_rev): New prototype.
	* libgomp-plugin.c (GOMP_PLUGIN_target_rev): New, call ...
	* target.c (gomp_target_rev): ... this new stub function.
	* libgomp.h (gomp_target_rev): Declare.
	* libgomp.map (GOMP_PLUGIN_1.4): New; add GOMP_PLUGIN_target_rev.
	* plugin/cuda-lib.def (cuMemHostAlloc): Add.
	* plugin/plugin-nvptx.c: Include libgomp-nvptx.h.
	(struct ptx_device): Add rev_data member.
	(nvptx_open_device): Remove async_engines query, last used in
	r10-304-g1f4c5b9b; add unified-address assert check.
	(GOMP_OFFLOAD_get_num_devices): Claim unified address
	support.
	(GOMP_OFFLOAD_load_image): Free rev_fn_table if no
	offload functions exist. Make offload var available
	on host and device.
	(rev_off_dev_to_host_cpy, rev_off_host_to_dev_cpy): New.
	(GOMP_OFFLOAD_run): Handle reverse offload.
2022-10-24 17:04:08 +02:00
Tobias Burnus
50be486dff nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup
Add support to nvptx for reverse lookup of function name to prepare for
'omp target device(ancestor:1)'.

gcc/ChangeLog:

	* config/nvptx/mkoffload.cc (struct id_map): Add 'dim' member.
	(record_id): Store func name without quotes, store dim separately.
	(process): For GOMP_REQUIRES_REVERSE_OFFLOAD, check that -march is
	at least sm_35, create '$offload_func_table' global array and init
	with reverse-offload function addresses.
	* config/nvptx/nvptx.cc (write_fn_proto_1, write_fn_proto): New
	force_public attribute to force .visible.
	(nvptx_declare_function_name): For "omp target
	device_ancestor_nohost" attribut, force .visible/TREE_PUBLIC.

libgomp/ChangeLog:

	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Read offload
	function address table '$offload_func_table' if rev_fn_table
	is not NULL.
2022-09-09 17:58:02 +02:00
Tobias Burnus
dfd75bf7e9 GCN: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup
Add support to GCN for reverse lookup of function name to prepare for
'omp target device(ancestor:1)'.

gcc/ChangeLog:

	* config/gcn/mkoffload.cc (process_asm): Create .offload_func_table,
	similar to pre-existing .offload_var_table.

libgomp/ChangeLog:

	* plugin/plugin-gcn.c (GOMP_OFFLOAD_load_image): Read
	.offload_func_table to populate rev_fn_table when requested.
2022-09-09 17:43:12 +02:00
Tobias Burnus
0fcc0cf9dc libgomp: Prepare for reverse offload fn lookup
Prepare for reverse-offloading function-pointer lookup by passing
a rev_fn_table argument to GOMP_OFFLOAD_load_image.

The argument will be NULL, unless GOMP_REQUIRES_REVERSE_OFFLOAD is
requested and devices not supported it, are filtered out.
(Up to and including this commit, no non-host device claims such
support and the caller currently always passes NULL.)

libgomp/ChangeLog:

	* libgomp-plugin.h (GOMP_OFFLOAD_load_image): Add
	'uint64_t **rev_fn_table' argument.
	* oacc-host.c (host_load_image): Likewise.
	* plugin/plugin-gcn.c (GOMP_OFFLOAD_load_image): Likewise;
	currently unused.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Likewise.
	* target.c (gomp_load_image_to_device): Update call but pass
	NULL for now.

liboffloadmic/ChangeLog:

	* plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_load_image):
	Add (unused) uint64_t **rev_fn_table argument.
2022-09-09 17:37:09 +02:00
Marcel Vollweiler
9f2fca5659 OpenMP, libgomp: Environment variable syntax extension
This patch considers the environment variable syntax extension for
device-specific variants of environment variables from OpenMP 5.1 (see
OpenMP 5.1 specification, p. 75 and p. 639).  An environment variable (e.g.
OMP_NUM_TEAMS) can have different suffixes:

_DEV (e.g. OMP_NUM_TEAMS_DEV): affects all devices but not the host.
_DEV_<device> (e.g. OMP_NUM_TEAMS_DEV_42): affects only device with
number <device>.
no suffix (e.g. OMP_NUM_TEAMS): affects only the host.

In future OpenMP versions also suffix _ALL will be introduced (see discussion
https://github.com/OpenMP/spec/issues/3179). This is also considered in this
patch:

_ALL (e.g. OMP_NUM_TEAMS_ALL): affects all devices and the host.

The precedence is as follows (descending). For the host:

	1. no suffix
	2. _ALL

For devices:

	1. _DEV_<device>
	2. _DEV
	3. _ALL

That means, _DEV_<device> is used whenever available. Otherwise _DEV is used if
available, and at last _ALL.  If there is no value for any of the variable
variants, default values are used as already implemented before.

This patch concerns parsing (a), storing (b), output (c) and transmission to the
device (d):

(a) The actual number of devices and the numbering are not known when parsing
the environment variables.  Thus all environment variables are iterated and
searched for device-specific ones.
(b) Only configured device-specific variables are stored.  Thus, a linked list
is used.
(c) The output is done in omp_display_env (see specification p. 468f).  Global
ICVs are tagged with [all], see https://github.com/OpenMP/spec/issues/3179.
ICVs which are not global but aren't handled device-specific yet are tagged
with [host].  omp_display_env outputs the initial values of the ICVs.  That is
why a dedicated data structure is introduced for the inital values only
(gomp_initial_icv_list).
(d) Device-specific ICVs are transmitted to the device via GOMP_ADDITIONAL_ICVS.

libgomp/ChangeLog:

	* config/gcn/icv-device.c (omp_get_default_device): Return device-
	specific ICV.
	(omp_get_max_teams): Added for GCN devices.
	(omp_set_num_teams): Likewise.
	(ialias): Likewise.
	* config/nvptx/icv-device.c (omp_get_default_device): Return device-
	specific ICV.
	(omp_get_max_teams): Added for NVPTX devices.
	(omp_set_num_teams): Likewise.
	(ialias): Likewise.
	* env.c (struct gomp_icv_list): New struct to store entries of initial
	ICV values.
	(struct gomp_offload_icv_list): New struct to store entries of device-
	specific ICV values that are copied to the device and back.
	(struct gomp_default_icv_values): New struct to store default values of
	ICVs according to the OpenMP standard.
	(parse_schedule): Generalized for different variants of OMP_SCHEDULE.
	(print_env_var_error): Function that prints an error for invalid values
	for ICVs.
	(parse_unsigned_long_1): Removed getenv.  Generalized.
	(parse_unsigned_long): Likewise.
	(parse_int_1): Likewise.
	(parse_int): Likewise.
	(parse_int_secure): Likewise.
	(parse_unsigned_long_list): Likewise.
	(parse_target_offload): Likewise.
	(parse_bind_var): Likewise.
	(parse_stacksize): Likewise.
	(parse_boolean): Likewise.
	(parse_wait_policy): Likewise.
	(parse_allocator): Likewise.
	(omp_display_env): Extended to output different variants of environment
	variables.
	(print_schedule): New helper function for omp_display_env which prints
	the values of run_sched_var.
	(print_proc_bind): New helper function for omp_display_env which prints
	the values of proc_bind_var.
	(enum gomp_parse_type): Collection of types used for parsing environment
	variables.
	(ENTRY): Preprocess string lengths of environment variables.
	(OMP_VAR_CNT): Preprocess table size.
	(OMP_HOST_VAR_CNT): Likewise.
	(INT_MAX_STR_LEN): Constant for the maximal number of digits of a device
	number.
	(gomp_get_icv_flag): Returns if a flag for a particular ICV is set.
	(gomp_set_icv_flag): Sets a flag for a particular ICV.
	(print_device_specific_icvs): New helper function for omp_display_env to
	print device specific ICV values.
	(get_device_num): New helper function for parse_device_specific.
	Extracts the device number from an environment variable name.
	(get_icv_member_addr): Gets the memory address for a particular member
	of an ICV struct.
	(gomp_get_initial_icv_item): Get a list item of gomp_initial_icv_list.
	(initialize_icvs): New function to initialize a gomp_initial_icvs
	struct.
	(add_initial_icv_to_list): Adds an ICV struct to gomp_initial_icv_list.
	(startswith): Checks if a string starts with a given prefix.
	(initialize_env): Extended to parse the new syntax of environment
	variables.
	* icv-device.c (omp_get_max_teams): Added.
	(ialias): Likewise.
	(omp_set_num_teams): Likewise.
	* icv.c (omp_set_num_teams): Moved to icv-device.c.
	(omp_get_max_teams): Likewise.
	(ialias): Likewise.
	* libgomp-plugin.h (GOMP_DEVICE_NUM_VAR): Removed.
	(GOMP_ADDITIONAL_ICVS): New target-side struct that
	holds the designated ICVs of the target device.
	* libgomp.h (enum gomp_icvs): Collection of ICVs.
	(enum gomp_device_num): Definition of device numbers for _ALL, _DEV, and
	no suffix.
	(enum gomp_env_suffix): Collection of possible suffixes of environment
	variables.
	(struct gomp_initial_icvs): Contains all ICVs for which we need to store
	initial values.
	(struct gomp_default_icv):New struct to hold ICVs for which we need
	to store initial values.
	(struct gomp_icv_list): Definition of a linked list that is used for
	storing ICVs for the devices and also for _DEV, _ALL, and without
	suffix.
	(struct gomp_offload_icvs): New struct to hold ICVs that are copied to
	a device.
	(struct gomp_offload_icv_list): Definition of a linked list that holds
	device-specific ICVs that are copied to devices.
	(gomp_get_initial_icv_item): Get a list item of gomp_initial_icv_list.
	(gomp_get_icv_flag): Returns if a flag for a particular ICV is set.
	* libgomp.texi: Updated.
	* plugin/plugin-gcn.c (GOMP_OFFLOAD_load_image): Extended to read
	further ICVs from the offload image.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Likewise.
	* target.c (gomp_get_offload_icv_item): Get a list item of
	gomp_offload_icv_list.
	(get_gomp_offload_icvs): New. Returns the ICV values
	depending on the device num and the variable hierarchy.
	(gomp_load_image_to_device): Extended to copy further ICVs to a device.
	* testsuite/libgomp.c-c++-common/icv-5.c: New test.
	* testsuite/libgomp.c-c++-common/icv-6.c: New test.
	* testsuite/libgomp.c-c++-common/icv-7.c: New test.
	* testsuite/libgomp.c-c++-common/icv-8.c: New test.
	* testsuite/libgomp.c-c++-common/omp-display-env-1.c: New test.
	* testsuite/libgomp.c-c++-common/omp-display-env-2.c: New test.
2022-09-08 10:19:37 -07:00
Tobias Burnus
683f118439 OpenMP: Move omp requires checks to libgomp
Handle reverse_offload, unified_address, and unified_shared_memory
requirements in libgomp by saving them alongside the offload table.
When the device lto1 runs, it extracts the data for mkoffload. The
latter than passes the value on to GOMP_offload_register_ver.

lto1 (either the host one, with -flto [+ ENABLE_OFFLOADING], or in the
offload-device lto1) also does the the consistency check is done,
erroring out when the 'omp requires' clause use is inconsistent.

For all in-principle supported devices, if a requirement cannot be fulfilled,
the device is excluded from the (supported) devices list. Currently, none of
those requirements are marked as supported for any of the non-host devices.

gcc/c/ChangeLog:

	* c-parser.cc (c_parser_omp_target_data, c_parser_omp_target_update,
	c_parser_omp_target_enter_data, c_parser_omp_target_exit_data): Set
	OMP_REQUIRES_TARGET_USED.
	(c_parser_omp_requires): Remove sorry.

gcc/ChangeLog:

	* config/gcn/mkoffload.cc (process_asm): Write '#include <stdint.h>'.
	(process_obj): Pass omp_requires_mask to GOMP_offload_register_ver.
	(main): Ask lto1 to obtain omp_requires_mask and pass it on.
	* config/nvptx/mkoffload.cc (process, main): Likewise.
	* lto-cgraph.cc (omp_requires_to_name): New.
	(input_offload_tables): Save omp_requires_mask.
	(output_offload_tables): Read it, check for consistency,
	save value for mkoffload.
	* omp-low.cc (lower_omp_target): Force output_offloadtables
	call for OMP_REQUIRES_TARGET_USED.

gcc/cp/ChangeLog:

	* parser.cc (cp_parser_omp_target_data,
	cp_parser_omp_target_enter_data, cp_parser_omp_target_exit_data,
	cp_parser_omp_target_update): Set OMP_REQUIRES_TARGET_USED.
	(cp_parser_omp_requires): Remove sorry.

gcc/fortran/ChangeLog:

	* openmp.cc (gfc_match_omp_requires): Remove sorry.
	* parse.cc (decode_omp_directive): Don't regard 'declare target'
	as target usage for 'omp requires'; add more flags to
	omp_requires_mask.

include/ChangeLog:

	* gomp-constants.h (GOMP_VERSION): Bump to 2.
	(GOMP_REQUIRES_UNIFIED_ADDRESS, GOMP_REQUIRES_UNIFIED_SHARED_MEMORY,
	GOMP_REQUIRES_REVERSE_OFFLOAD, GOMP_REQUIRES_TARGET_USED):
	New defines.

libgomp/ChangeLog:

	* libgomp-plugin.h (GOMP_OFFLOAD_get_num_devices): Add
	omp_requires_mask arg.
	* plugin/plugin-gcn.c (GOMP_OFFLOAD_get_num_devices): Likewise;
	return -1 when device available but omp_requires_mask != 0.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_num_devices): Likewise.
	* oacc-host.c (host_get_num_devices, host_openacc_get_property):
	Update call.
	* oacc-init.c (resolve_device, acc_init_1, acc_shutdown_1,
	goacc_attach_host_thread_to_device, acc_get_num_devices,
	acc_set_device_num, get_property_any): Likewise.
	* target.c (omp_requires_mask): New global var.
	(gomp_requires_to_name): New.
	(GOMP_offload_register_ver): Handle passed omp_requires_mask.
	(gomp_target_init): Handle omp_requires_mask.
	* libgomp.texi (OpenMP 5.0): Update requires impl. status.
	(OpenMP 5.1): Add a missed item.
	(OpenMP 5.2): Mark linear-clause change as supported in C/C++.
	* testsuite/libgomp.c-c++-common/requires-1-aux.c: New test.
	* testsuite/libgomp.c-c++-common/requires-1.c: New test.
	* testsuite/libgomp.c-c++-common/requires-2-aux.c: New test.
	* testsuite/libgomp.c-c++-common/requires-2.c: New test.
	* testsuite/libgomp.c-c++-common/requires-3-aux.c: New test.
	* testsuite/libgomp.c-c++-common/requires-3.c: New test.
	* testsuite/libgomp.c-c++-common/requires-4-aux.c: New test.
	* testsuite/libgomp.c-c++-common/requires-4.c: New test.
	* testsuite/libgomp.c-c++-common/requires-5-aux.c: New test.
	* testsuite/libgomp.c-c++-common/requires-5.c: New test.
	* testsuite/libgomp.c-c++-common/requires-6.c: New test.
	* testsuite/libgomp.c-c++-common/requires-7-aux.c: New test.
	* testsuite/libgomp.c-c++-common/requires-7.c: New test.
	* testsuite/libgomp.fortran/requires-1-aux.f90: New test.
	* testsuite/libgomp.fortran/requires-1.f90: New test.

liboffloadmic/ChangeLog:

	* plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_get_num_devices):
	Return -1 when device available but omp_requires_mask != 0.

gcc/testsuite/ChangeLog:

	* c-c++-common/gomp/requires-4.c: Update dg-*.
	* c-c++-common/gomp/reverse-offload-1.c: Likewise.
	* c-c++-common/gomp/target-device-ancestor-2.c: Likewise.
	* c-c++-common/gomp/target-device-ancestor-3.c: Likewise.
	* c-c++-common/gomp/target-device-ancestor-4.c: Likewise.
	* c-c++-common/gomp/target-device-ancestor-5.c: Likewise.
	* gfortran.dg/gomp/target-device-ancestor-3.f90: Likewise.
	* gfortran.dg/gomp/target-device-ancestor-4.f90: Likewise.
	* gfortran.dg/gomp/target-device-ancestor-5.f90: Likewise.
	* gfortran.dg/gomp/target-device-ancestor-2.f90: Likewise. Move
	post-FE checks to ...
	* gfortran.dg/gomp/target-device-ancestor-2a.f90: ... this new file.
	* gfortran.dg/gomp/requires-8.f90: Update as we don't regard
	'declare target' for the 'requires' usage requirement.

Co-authored-by: Chung-Lin Tang <cltang@codesourcery.com>
Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
2022-07-04 13:52:02 +02:00
Thomas Schwinge
1459b55d24 libgomp nvptx plugin: Remove '--with-cuda-driver=[...]' etc. configuration option
That means, exposing to the user only the '--without-cuda-driver' behavior:
including the GCC-shipped 'include/cuda/cuda.h' (not system <cuda.h>), and
'dlopen'ing the CUDA Driver library (not linking it).

For development purposes, the libgomp nvptx plugin developer may still manually
override that, to get the previous '--with-cuda-driver' behavior.

	libgomp/
	* plugin/Makefrag.am: Evaluate 'if PLUGIN_NVPTX_DYNAMIC' to true.
	* plugin/configfrag.ac (--with-cuda-driver)
	(--with-cuda-driver-include, --with-cuda-driver-lib)
	(CUDA_DRIVER_INCLUDE, CUDA_DRIVER_LIB, PLUGIN_NVPTX_CPPFLAGS)
	(PLUGIN_NVPTX_LDFLAGS, PLUGIN_NVPTX_LIBS, PLUGIN_NVPTX_DYNAMIC):
	Remove.
	* testsuite/libgomp-test-support.exp.in (cuda_driver_include)
	(cuda_driver_lib): Remove.
	* testsuite/lib/libgomp.exp (libgomp_init): Don't consider these.
	* Makefile.in: Regenerate.
	* configure: Likewise.
	* testsuite/Makefile.in: Likewise.
2022-06-10 17:08:57 +02:00
Andrew Stubbs
cde52d3a2d amdgcn: Add gfx90a support
This adds architecture options and multilibs for the AMD GFX90a GPUs.
It also tidies up some of the ISA selection code, and corrects a few small
mistake in the gfx908 naming.

gcc/ChangeLog:

	* config.gcc (amdgcn): Accept --with-arch=gfx908 and gfx90a.
	* config/gcn/gcn-opts.h (enum gcn_isa): New.
	(TARGET_GCN3): Use enum gcn_isa.
	(TARGET_GCN3_PLUS): Likewise.
	(TARGET_GCN5): Likewise.
	(TARGET_GCN5_PLUS): Likewise.
	(TARGET_CDNA1): New.
	(TARGET_CDNA1_PLUS): New.
	(TARGET_CDNA2): New.
	(TARGET_CDNA2_PLUS): New.
	(TARGET_M0_LDS_LIMIT): New.
	(TARGET_PACKED_WORK_ITEMS): New.
	* config/gcn/gcn.cc (gcn_isa): Change to enum gcn_isa.
	(gcn_option_override): Recognise CDNA ISA variants.
	(gcn_omp_device_kind_arch_isa): Support gfx90a.
	(gcn_expand_prologue): Make m0 init optional.
	Add support for packed work items.
	(output_file_start): Support gfx90a.
	(gcn_hsa_declare_function_name): Support gfx90a metadata.
	* config/gcn/gcn.h (TARGET_CPU_CPP_BUILTINS):Add __CDNA1__ and
	__CDNA2__.
	* config/gcn/gcn.md (<su>mulsi3_highpart): Use TARGET_GCN5_PLUS.
	(<su>mulsi3_highpart_imm): Likewise.
	(<su>mulsidi3): Likewise.
	(<su>mulsidi3_imm): Likewise.
	* config/gcn/gcn.opt (gpu_type): Add gfx90a.
	* config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_GFX90a): New.
	(main): Support gfx90a.
	* config/gcn/t-gcn-hsa: Add gfx90a multilib.
	* config/gcn/t-omp-device: Add gfx90a isa.

libgomp/ChangeLog:

	* plugin/plugin-gcn.c (EF_AMDGPU_MACH): Add
	EF_AMDGPU_MACH_AMDGCN_GFX90a.
	(gcn_gfx90a_s): New.
	(isa_hsa_name): Support gfx90a.
	(isa_code): Likewise.
2022-05-24 16:18:14 +01:00
Thomas Schwinge
1f89e48789 libgomp nvptx plugin: Only consider '--with-cuda-driver=[...]' when applicable
They're not applicable in 'PLUGIN_NVPTX_DYNAMIC' configurations.

	libgomp/
	* plugin/Makefrag.am (libgomp_plugin_nvptx_la_CPPFLAGS)
	[PLUGIN_NVPTX_DYNAMIC]: Don't append '$(PLUGIN_NVPTX_CPPFLAGS)'.
	(libgomp_plugin_nvptx_la_LDFLAGS) [PLUGIN_NVPTX_DYNAMIC]: Don't
	append '$(PLUGIN_NVPTX_LDFLAGS)'.
	* Makefile.in: Regenerate.
2022-05-13 14:01:01 +02:00
Thomas Schwinge
dcc266796a Refactor '-ldl' handling for libgomp proper and plugins
Instead of implicit global 'LIBS="-ldl $LIBS"' via 'AC_CHECK_LIB', make
'-ldl' explicit for libgomp proper, and clean up 'PLUGIN_GCN_LIBS',
'PLUGIN_NVPTX_LIBS' accordingly.

	libgomp/
	* Makefile.am (libgomp_la_LIBADD): Initialize.
	* plugin/configfrag.ac (DL_LIBS): New.
	(PLUGIN_GCN_LIBS): Remove.
	(PLUGIN_NVPTX_LIBS): Don't set in the 'PLUGIN_NVPTX_DYNAMIC' case.
	* plugin/Makefrag.am (libgomp_la_LIBADD)
	(libgomp_plugin_gcn_la_LIBADD): Consider '$(DL_LIBS)'.
	(libgomp_plugin_nvptx_la_LIBADD) <PLUGIN_NVPTX_DYNAMIC>: Likewise.
	* Makefile.in: Regenerate.
	* config.h.in: Likewise.
	* configure: Likewise.
	* testsuite/Makefile.in: Likewise.
2022-05-12 15:11:30 +02:00
Thomas Schwinge
cd644ce8be libgomp nvptx plugin: Split 'PLUGIN_NVPTX_DYNAMIC' into 'PLUGIN_NVPTX_INCLUDE_SYSTEM_CUDA_H' and 'PLUGIN_NVPTX_LINK_LIBCUDA'
Including the GCC-shipped 'include/cuda/cuda.h' vs. system <cuda.h> and
'dlopen'ing the CUDA Driver library vs. linking it are separate concerns.

	libgomp/
	* plugin/Makefrag.am: Handle 'PLUGIN_NVPTX_DYNAMIC'.
	* plugin/configfrag.ac (PLUGIN_NVPTX_DYNAMIC): Change
	'AC_DEFINE_UNQUOTED' into 'AM_CONDITIONAL'.
	* plugin/plugin-nvptx.c: Split 'PLUGIN_NVPTX_DYNAMIC' into
	'PLUGIN_NVPTX_INCLUDE_SYSTEM_CUDA_H' and
	'PLUGIN_NVPTX_LINK_LIBCUDA'.
	* Makefile.in: Regenerate.
	* config.h.in: Likewise.
	* configure: Likewise.
2022-05-12 14:14:13 +02:00
Thomas Schwinge
edbd2b1caa libgomp plugins: Don't 'AC_SUBST' and 'AC_DEFINE_UNQUOTED' for 'PLUGIN_GCN', 'PLUGIN_NVPTX'
Nothing ever used these.

	libgomp/
	* plugin/configfrag.ac: Don't 'AC_SUBST' and 'AC_DEFINE_UNQUOTED'
	for 'PLUGIN_GCN', 'PLUGIN_NVPTX'.
	* Makefile.in: Regenerate.
	* config.h.in: Likewise.
	* configure: Likewise.
	* testsuite/Makefile.in: Likewise.
2022-05-12 13:28:23 +02:00
Thomas Schwinge
876ac21b7e libgomp: Remove unused '--with-hsa-runtime', '--with-hsa-runtime-include', '--with-hsa-runtime-lib'
With recent commit 2e309a4eff "libgomp testsuite:
Don't amend 'LD_LIBRARY_PATH' for system-provided HSA Runtime library",
and commit d6adba3075 "libgomp GCN plugin:
 Clean up unused references to system-provided HSA Runtime library", the last
uses of '--with-hsa-runtime' etc. are gone.

	gcc/
	* doc/install.texi: Don't document '--with-hsa-runtime',
	'--with-hsa-runtime-include', '--with-hsa-runtime-lib'.
	libgomp/
	* plugin/configfrag.ac: Remove '--with-hsa-runtime',
	'--with-hsa-runtime-include', '--with-hsa-runtime-lib' processing.
	* Makefile.in: Regenerate.
	* configure: Likewise.
	* testsuite/Makefile.in: Likewise.
2022-05-11 14:27:42 +02:00
Thomas Schwinge
91a6dcd149 libgomp GCN plugin: Clean up always-empty 'PLUGIN_GCN_CPPFLAGS', 'PLUGIN_GCN_LDFLAGS'
After recent commit d6adba3075
"libgomp GCN plugin: Clean up unused references to system-provided HSA Runtime
library", these aren't set anymore.

	libgomp/
	* plugin/Makefrag.am (libgomp_plugin_gcn_la_CPPFLAGS): Don't
	consider 'PLUGIN_GCN_CPPFLAGS'.
	(libgomp_plugin_gcn_la_LDFLAGS): Don't consider
	'PLUGIN_GCN_LDFLAGS'.
	* plugin/configfrag.ac (PLUGIN_GCN_CPPFLAGS, PLUGIN_GCN_LDFLAGS):
	Remove.
	* Makefile.in: Regenerate.
	* configure: Likewise.
	* testsuite/Makefile.in: Likewise.
2022-05-11 14:25:58 +02:00
Thomas Schwinge
d6adba3075 libgomp GCN plugin: Clean up unused references to system-provided HSA Runtime library
This is only active if GCC is 'configure'd with '--with-hsa-runtime=[...]' or
'--with-hsa-runtime-include=[...]', '--with-hsa-runtime-lib=[...]' -- which
nobody really is doing, as far as I can tell.

Originally changed for the libgomp HSA plugin in
commit b8d89b03db (r242749)
"Remove build dependence on HSA run-time", and later propagated into the GCN
plugin, these are no longer built against system-provided HSA Runtime library.
Instead, unconditionally built against the GCC-shipped 'include/hsa*.h' header
files, and at run time does 'dlopen("libhsa-runtime64.so.1")'.  It thus doesn't
make sense to consider references to system-provided HSA Runtime library during
libgomp GCN plugin build.

	libgomp/
	* plugin/configfrag.ac (HSA_RUNTIME_CPPFLAGS)
	(HSA_RUNTIME_LDFLAGS): Remove.
	* configure: Regenerate.
2022-05-11 14:24:55 +02:00
Tobias Burnus
4a20616107 libgomp/plugin/plugin-gcn.c: Use -foffload-options= in err msg
While -foffload=-<flag> works (never documented legacy feature),
the documented way is to use -foffload-options=.

libgomp/ChangeLog:

	* plugin/plugin-gcn.c (isa_matches_agent): Suggest -foffload-options.
2022-05-04 18:40:03 +02:00
Thomas Schwinge
5e431ae4cc Move 'libgomp/plugin/cuda/cuda.h' to 'include/cuda/cuda.h'
... so that it may be used by other projects that inherit GCC's 'include'
directory.

	include/
	* cuda/cuda.h: New file.
	libgomp/
	* plugin/cuda/cuda.h: Remove file.
	* plugin/plugin-nvptx.c [PLUGIN_NVPTX_DYNAMIC]: Include
	"cuda/cuda.h" instead of <cuda.h>.
	* plugin/configfrag.ac <PLUGIN_NVPTX_DYNAMIC>: Don't set
	'PLUGIN_NVPTX_CPPFLAGS'.
	* configure: Regenerate.
2022-04-06 22:30:14 +02:00
Tom de Vries
52f42dce15 [libgomp, testsuite] Fix hardcoded libexec in plugin/configfrag.ac
When building an nvptx offloading configuration on openSUSE Leap 15.3, the
site script /usr/share/site/x86_64-unknown-linux-gnu is activated, setting
libexecdir to ${exec_prefix}/lib rather than ${exec_prefix}/libexec:
...
| # If user did not specify libexecdir, set the correct target:
| # Nor FHS nor openSUSE allow prefix/libexec. Let's default to prefix/lib.
|
| if test "$libexecdir" = '${exec_prefix}/libexec' ; then
|       libexecdir='${exec_prefix}/lib'
| fi
...

However, in libgomp libgomp/plugin/configfrag.ac we hardcode libexec:
...
    # Configure additional search paths.
    if test x"$tgt_dir" != x; then
      offload_additional_options="$offload_additional_options \
        -B$tgt_dir/libexec/gcc/\$(target_alias)/\$(gcc_version) \
	-B$tgt_dir/bin"
...

Fix this by using /$(libexecdir:\$(exec_prefix)/%=%)/ instead of /libexec/.

Tested on x86_64-linux with nvptx accelerator.

libgomp/ChangeLog:

2022-03-28  Tom de Vries  <tdevries@suse.de>

	* plugin/configfrag.ac: Use /$(libexecdir:\$(exec_prefix)/%=%)/
	instead of /libexec/.
	* configure: Regenerate.
2022-03-28 14:09:02 +02:00
Kwok Cheung Yeung
a78b1ab1df amdgcn: Tune default OpenMP/OpenACC GPU utilization
libgomp/
	* plugin/plugin-gcn.c (parse_target_attributes): Automatically set
	the number of teams and threads if necessary.
	(gcn_exec): Automatically set the number of gangs and workers if
	necessary.

Co-Authored-By: Andrew Stubbs  <ams@codesourcery.com>
2022-01-16 17:25:36 +01:00
Chung-Lin Tang
fbb592407c libgomp: Fix GOMP_DEVICE_NUM_VAR stringification during offload image load
In the patch that implemented omp_get_device_num(), there was an error where
the stringification of GOMP_DEVICE_NUM_VAR, which is the macro expanding to
the actual symbol used, was erroneously using the STRINGX() macro in the
libgomp offload image symbol search, and expansion of the variable name
string through the additional layer of preprocessor symbol was not properly
achieved.

This patch fixes this by changing to properly use XSTRING(), also from
include/symcat.h.

libgomp/ChangeLog:

	* plugin/plugin-gcn.c (GOMP_OFFLOAD_load_image): Change uses of STRINGX
	into XSTRING when looking for GOMP_DEVICE_NUM_VAR in offload image.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Likewise.
2022-01-04 17:26:23 +08:00
Jakub Jelinek
7adcbafe45 Update copyright years. 2022-01-03 10:42:10 +01:00
Andrew Stubbs
4a87a8e4b1 amdgcn: Change offload variable table discovery
Up to now the libgomp GCN plugin has been finding the offload variables
by using a symbol lookup, but the AMD runtime requires that the symbols are
global for that to work. This was ensured by mkoffload as a post-procssing
step, but the LLVM 13 assembler no longer accepts this in the case where the
variable was previously declared differently.

This patch switches to locating the symbols directly from the
offload_var_table, which means that only one symbol needs to be forced
global.

This changes breaks the libgomp image compatibility so GOMP_VERSION_GCN has
also been bumped.

gcc/ChangeLog:

	* config/gcn/mkoffload.c (process_asm): Process the variable table
	completely differently.
	(process_obj): Encode the varaible data differently.

include/ChangeLog:

	* gomp-constants.h (GOMP_VERSION_GCN): Bump.

libgomp/ChangeLog:

	* plugin/plugin-gcn.c (struct gcn_image_desc): Remove global_variables.
	(GOMP_OFFLOAD_load_image): Locate the offload variables via the
	table, not individual symbols.
2021-12-10 10:45:09 +00:00
Julian Brown
c408512e1f amdgcn: Enable OpenACC worker partitioning for AMD GCN
gcc/
	* config/gcn/gcn.c (gcn_init_builtins): Override decls for
	BUILT_IN_GOACC_SINGLE_START, BUILT_IN_GOACC_SINGLE_COPY_START,
	BUILT_IN_GOACC_SINGLE_COPY_END and BUILT_IN_GOACC_BARRIER.
	(gcn_goacc_validate_dims): Turn on worker partitioning unconditionally.
	(gcn_fork_join): Update comment.
	* config/gcn/gcn.opt (flag_worker_partitioning): Remove.
	(macc_experimental_workers): Remove unused option.
	libgomp/
	* plugin/plugin-gcn.c (gcn_exec): Change default number of workers to
	16.
	* testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c
	[acc_device_radeon]: Update.
	* testsuite/libgomp.oacc-c-c++-common/loop-dim-default.c
	[ACC_DEVICE_TYPE_radeon]: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
	[acc_device_radeon]: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/routine-wv-2.c
	[ACC_DEVICE_TYPE_radeon]: Likewise.
	* testsuite/libgomp.oacc-fortran/optional-reduction.f90: XFAIL for
	'openacc_radeon_accel_selected' and '-O0'.
	* testsuite/libgomp.oacc-fortran/reduction-7.f90: Likewise.

Co-Authored-By: Kwok Cheung Yeung <kcy@codesourcery.com>
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
2021-08-09 15:08:44 +02:00
Chung-Lin Tang
0bac793ed6 openmp: Implement omp_get_device_num routine
This patch implements the omp_get_device_num library routine, specified in
OpenMP 5.0.

GOMP_DEVICE_NUM_VAR is a macro symbol which defines name of a "device number"
variable, is defined on the device-side libgomp, has it's address returned to
host-side libgomp during device initialization, and the host libgomp then
sets its value to the designated device number.

libgomp/ChangeLog:

	* icv-device.c (omp_get_device_num): New API function, host side.
	* fortran.c (omp_get_device_num_): New interface function.
	* libgomp-plugin.h (GOMP_DEVICE_NUM_VAR): Define macro symbol.
	* libgomp.map (OMP_5.0.2): New version space with omp_get_device_num,
	omp_get_device_num_.
	* libgomp.texi (omp_get_device_num): Add documentation for new API
	function.
	* omp.h.in (omp_get_device_num): Add declaration.
	* omp_lib.f90.in (omp_get_device_num): Likewise.
	* omp_lib.h.in (omp_get_device_num): Likewise.
	* target.c (gomp_load_image_to_device): If additional entry for device
	number exists at end of returned entries from 'load_image_func' hook,
	copy the assigned device number over to the device variable.

	* config/gcn/icv-device.c (GOMP_DEVICE_NUM_VAR): Define static global.
	(omp_get_device_num): New API function, device side.
	* plugin/plugin-gcn.c ("symcat.h"): Add include.
	(GOMP_OFFLOAD_load_image): Add addresses of device GOMP_DEVICE_NUM_VAR
	at end of returned 'target_table' entries.

	* config/nvptx/icv-device.c (GOMP_DEVICE_NUM_VAR): Define static global.
	(omp_get_device_num): New API function, device side.
	* plugin/plugin-nvptx.c ("symcat.h"): Add include.
	(GOMP_OFFLOAD_load_image): Add addresses of device GOMP_DEVICE_NUM_VAR
	at end of returned 'target_table' entries.

	* testsuite/lib/libgomp.exp
	(check_effective_target_offload_target_intelmic): New function for
	testing for intelmic offloading.
	* testsuite/libgomp.c-c++-common/target-45.c: New test.
	* testsuite/libgomp.fortran/target10.f90: New test.
2021-08-05 23:29:03 +08:00
Julian Brown
9c41f5b9cd Fix OpenACC "ephemeral" asynchronous host-to-device copies
This patch fixes several places in libgomp/target.c where "ephemeral" data
(on the stack or in temporary heap locations) may be used as the source of
an asynchronous host-to-device copy that may not complete before the host
data disappears.

An existing, but flawed, workaround for this problem in the AMD GCN
libgomp offloading plugin is currently present on mainline, and was
posted for the og9 branch here:

  https://gcc.gnu.org/legacy-ml/gcc-patches/2019-08/msg00901.html

and previous versions of this patch were posted here (for mainline/og9):

  https://gcc.gnu.org/legacy-ml/gcc-patches/2019-11/msg01482.html
  https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg01026.html

libgomp/
	* libgomp.h (gomp_copy_host2dev): Update prototype.
	* oacc-mem.c (memcpy_tofrom_device, update_dev_host): Add new
	argument to gomp_copy_host2dev (false).
	* plugin/plugin-gcn.c (struct copy_data): Remove free_src field.
	(copy_data): Don't free src.
	(queue_push_copy): Remove free_src handling.
	(GOMP_OFFLOAD_dev2dev): Update call to queue_push_copy.
	(GOMP_OFFLOAD_openacc_async_host2dev): Remove source-data
	snapshotting.
	(GOMP_OFFLOAD_openacc_async_dev2host): Update call to
	queue_push_copy.
	* target.c (goacc_device_copy_async): Add SRCADDR_ORIG parameter.
	(gomp_copy_host2dev): Add EPHEMERAL parameter.  Snapshot source
	data when true, and set up deferred freeing of temporary buffer.
	(gomp_copy_dev2host): Update call to goacc_device_copy_async.
	(gomp_map_vars_existing, gomp_map_pointer, gomp_attach_pointer)
	(gomp_detach_pointer, gomp_map_vars_internal, gomp_update): Update
	calls to gomp_copy_host2dev with appropriate ephemeral argument.
	* testsuite/libgomp.oacc-c-c++-common/async-data-1-1.c: Remove
	XFAIL.

Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
2021-07-27 11:16:27 +02:00
Thomas Schwinge
30656822b3 [GCN] Fix run-time variable 'num_workers'
... which currently has *not* been forced to 'num_workers (1)'.

In addition to the testcases modified here, this also fixes:

    FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/mode-transitions.c -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa  -O0  execution test
    [Etc.]

    mode-transitions.exe: [...]/libgomp.oacc-c-c++-common/mode-transitions.c:702: t17: Assertion `arr_b[i] == (i ^ 31) * 8' failed.

	libgomp/
	* plugin/plugin-gcn.c (gcn_exec): Force 'num_workers (1)'
	unconditionally.
	* testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c:
	Update.
	* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/routine-wv-2.c: Likewise.
2021-06-08 12:00:15 +02:00
Thomas Schwinge
7c1e856bed libgomp HSA/GCN plugins: don't prepend the 'HSA_RUNTIME_LIB' path to 'libhsa-runtime64.so'
For unknown reasons, this had gotten added for the libgomp HSA plugin in commit
b8d89b03db (r242749) "Remove build dependence on
HSA run-time", and later propagated into the GCN plugin.

	libgomp/
	* plugin/plugin-gcn.c (init_environment_variables): Don't prepend
	the 'HSA_RUNTIME_LIB' path to 'libhsa-runtime64.so'.
	* plugin/configfrag.ac (HSA_RUNTIME_LIB): Clean up.
	* config.h.in: Regenerate.
	* configure: Likewise.
2021-03-25 14:11:50 +01:00
Jakub Jelinek
f65e551f73 libgomp: Use sizeof(void*) based checks instead of looking through $CC $CFLAGS for -m32/-mx32
Some gcc configurations default to -m32 but support -m64 too.  This patch
just makes the ILP32 tests more reliable by following what e.g. libsanitizer
configury does.

2021-03-04  Jakub Jelinek  <jakub@redhat.com>

	* configure.ac: Add AC_CHECK_SIZEOF([void *]).
	* plugin/configfrag.ac: Check $ac_cv_sizeof_void_p value instead of
	checking of -m32 or -mx32 options on the command line.
	* config.h.in: Regenerated.
	* configure: Regenerated.
2021-03-04 09:43:34 +01:00
Andrew Stubbs
3535402e20 amdgcn: Add gfx908 support
gcc/

	* config/gcn/gcn-opts.h (enum processor_type): Add PROCESSOR_GFX908.
	* config/gcn/gcn.c (gcn_omp_device_kind_arch_isa): Add gfx908.
	(output_file_start): Add gfx908.
	* config/gcn/gcn.opt (gpu_type): Add gfx908.
	* config/gcn/t-gcn-hsa (MULTILIB_OPTIONS): Add march=gfx908.
	(MULTILIB_DIRNAMES): Add gfx908.
	* config/gcn/mkoffload.c (EF_AMDGPU_MACH_AMDGCN_GFX908): New define.
	(main): Recognize gfx908.
	* config/gcn/t-omp-device: Add gfx908.

libgomp/

	* plugin/plugin-gcn.c (EF_AMDGPU_MACH): Add
	EF_AMDGPU_MACH_AMDGCN_GFX908.
	(gcn_gfx908_s): New constant string.
	(isa_hsa_name): Add gfx908.
	(isa_code): Add gfx908.
2021-02-03 14:25:33 +00:00
Thomas Schwinge
6106dfb9f7 [nvptx libgomp plugin] Build only in supported configurations
As recently again discussed in <https://gcc.gnu.org/PR97436> "[nvptx] -m32
support", nvptx offloading other than for 64-bit host has never been
implemented, tested, supported.  So we simply should buildn't the nvptx libgomp
plugin in this case.

This avoids build problems if, for example, in a (standard) bi-arch
x86_64-pc-linux-gnu '-m64'/'-m32' build, libcuda is available only in a 64-bit
variant but not in a 32-bit one, which, for example, is the case if you build
GCC against the CUDA toolkit's 'stubs/libcuda.so' (see
<https://stackoverflow.com/a/52784819>).

This amends PR65099 commit a92defdab7 (r225560)
"[nvptx offloading] Only 64-bit configurations are currently supported" to
match the way we're doing this for the HSA/GCN plugins.

	libgomp/
	PR libgomp/65099
	* plugin/configfrag.ac (PLUGIN_NVPTX): Restrict to supported
	configurations.
	* configure: Regenerate.
	* plugin/plugin-nvptx.c (nvptx_get_num_devices): Remove 64-bit
	check.
2021-01-14 18:48:00 +01:00
Julian Brown
6b577a17b2 nvptx: Cache stacks block for OpenMP kernel launch
2021-01-05  Julian Brown  <julian@codesourcery.com>

libgomp/
	* plugin/plugin-nvptx.c (SOFTSTACK_CACHE_LIMIT): New define.
	(struct ptx_device): Add omp_stacks struct.
	(nvptx_open_device): Initialise cached-stacks housekeeping info.
	(nvptx_close_device): Free cached stacks block and mutex.
	(nvptx_stacks_free): New function.
	(nvptx_alloc): Add SUPPRESS_ERRORS parameter.
	(GOMP_OFFLOAD_alloc): Add strategies for freeing soft-stacks block.
	(nvptx_stacks_alloc): Rename to...
	(nvptx_stacks_acquire): This.  Cache stacks block between runs if same
	size or smaller is required.
	(nvptx_stacks_free): Remove.
	(GOMP_OFFLOAD_run): Call nvptx_stacks_acquire and lock stacks block
	during kernel execution.
2021-01-05 09:56:36 -08:00
Jakub Jelinek
99dee82307 Update copyright years. 2021-01-04 10:26:59 +01:00
Andrew Stubbs
85f0a4d982 Import HSA header files from AMD
These are the same header files that exist in the Radeon Open Compute Runtime
project (as of October 2020), but they have been specially relicensed by AMD
for use in GCC.

The header files retain AMD copyright.

include/ChangeLog:

	* hsa.h: Replace whole file.
	* hsa_ext_amd.h: New file.
	* hsa_ext_image.h: New file.

libgomp/ChangeLog:

	* plugin/plugin-gcn.c: Include hsa_ext_amd.h.
	(HSA_AMD_AGENT_INFO_COMPUTE_UNIT_COUNT): Delete redundant definition.
2020-12-09 11:10:40 +00:00
Andrew Stubbs
97981e13b7 Tweak plugin-gcn.c defines
Ensure the code will continue to compile when elf.h gets these definitions.

libgomp/ChangeLog:

	* plugin/plugin-gcn.c: Don't redefine relocations if elf.h has them.
	(reserved): Delete unused define.
2020-11-24 14:56:38 +00:00
Tom de Vries
7345ef6c2a [libgomp, nvptx] Report launch dimensions in GOMP_OFFLOAD_run
Using this patch, when using GOMP_DEBUG=1 and launching a kernel in
GOMP_OFFLOAD_run (used by the omp implementation), we see the kernel launch
dimensions:
...
  GOMP_OFFLOAD_run: kernel main$_omp_fn$0: \
    launch [(teams: 1), 1, 1] [(lanes: 32), (threads: 1), 1]
...

Build on x86_64-linux with nvptx accelerator, tested libgomp.

libgomp/ChangeLog:

2020-10-08  Tom de Vries  <tdevries@suse.de>

	PR libgomp/81802
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_run): Report launch
	dimensions.
2020-10-08 11:03:29 +02:00
Tom de Vries
c0e9cee285 [libgomp, nvptx] Print error log for link error
By running libgomp test-case libgomp.c/target-28.c with GOMP_NVPTX_PTXRW=w
(using a maintenance patch that adds support for this env var), we dump the
ptx in target-28.exe to file.  By editing one ptx file to rename
gomp_nvptx_main to gomp_nvptx_main2 in both declaration and call, and
running with GOMP_NVPTX_PTXRW=r, we trigger a link error:
...
$ GOMP_NVPTX_PTXRW=r ./target-28.exe
libgomp: cuLinkComplete error: unknown error
...
The error is somewhat uninformative.

Fix this by dumping the error log returned by the failing cuda call, such
that we have instead:
...
$ GOMP_NVPTX_PTXRW=r ./target-28.exe
libgomp: Link error log error   : \
  Undefined reference to 'gomp_nvptx_main2' in ''
libgomp: cuLinkComplete error: unknown error
...

Build on x86_64 with nvptx accelerator, tested libgomp.

libgomp/ChangeLog:

	* plugin/plugin-nvptx.c (link_ptx): Print elog if cuLinkComplete call
	fails.
2020-09-22 13:38:00 +02:00
Chung-Lin Tang
f9b9832837 libgomp: adjust nvptx_free callback context checking
Change test for CUDA callback context in nvptx_free() from using
GOMP_PLUGIN_acc_thread () into checking for CUDA_ERROR_NOT_PERMITTED,
for the former only works for OpenACC, but not OpenMP offloading.

2020-08-20  Chung-Lin Tang  <cltang@codesourcery.com>

	libgomp/
	* plugin/plugin-nvptx.c (nvptx_free):
	Change "GOMP_PLUGIN_acc_thread () == NULL" test into check of
	CUDA_ERROR_NOT_PERMITTED status for cuMemGetAddressRange. Adjust
	comments.
2020-08-20 07:18:51 -07:00
Martin Jambor
c56684fd61 Removal of HSA offloading from gcc and libgomp
This patch removes the generation of HSAIL from the compiler, the HSA
offloading plugin from libgomp and the associated testsuite tests and
infrastructure bits from the respective testsuites.

Apart from removal of the obvious files, I removed bits that I found
by searching for HSA related terms and by re-tracing my steps and
looking at the patches that introduced HSA in the first place.  I did
not remove everything these patches brought in, for example:

  - the mechanism to pass offload-target specific info from the application to
    the offloading plugin - but the same mechanism is also used to
    communicate number of teams and the thread limit to all offload targets.

  - run_func hook in gomp_device_descr stays too, although now it is
    not used.  If some future offload target would like the ability to
    refuse to offload some functions, it can use it.  It is easy to
    remove as a follow-up if it is considered clutter, though.

  - configure options --with-hsa-runtime=PATH, -with-hsa-runtime-include=PATH
    and --with-hsa-runtime-lib=PATH rmeain because GCN uses them too.

  - Surprisingly, GOMP_TARGET_ARG_HSA_KERNEL_ATTRIBUTES (a constant
    from gomp-constants.h) appears in the source of the amdgcn libgomp
    plugin, although I tend to think that code path is not ever used
    and this patch certainly removes it from the compiler.
    Nevertheless, it seems it has potential value beyond HSAIL and so
    I've kept it, it can of course always be easily removed in the
    future of GCN folk abandon it too.

  - I assume constants OFFLOAD_TARGET_TYPE_HSA and GOMP_DEVICE_HSA
    need to stay indefinitely too just so that no future offload
    target picks that number.

  - I have kept dg-require-effective-target
    offload_device_nonshared_as requirement of thests which have it.

It is quite probable I missed some small HSA artifacts but those
should be easy to remove later as we find them.

include/ChangeLog:

2020-07-24  Martin Jambor  <mjambor@suse.cz>

	* gomp-constants.h (GOMP_VERSION_HSA): Remove.

gcc/ChangeLog:

2020-07-24  Martin Jambor  <mjambor@suse.cz>

	* hsa-brig-format.h: Moved to brig/brigfrontend.
	* hsa-brig.c: Removed.
	* hsa-builtins.def: Likewise.
	* hsa-common.c: Likewise.
	* hsa-common.h: Likewise.
	* hsa-dump.c: Likewise.
	* hsa-gen.c: Likewise.
	* hsa-regalloc.c: Likewise.
	* ipa-hsa.c: Likewise.
	* omp-grid.c: Likewise.
	* omp-grid.h: Likewise.
	* Makefile.in (BUILTINS_DEF): Remove hsa-builtins.def.
	(OBJS): Remove hsa-common.o, hsa-gen.o, hsa-regalloc.o, hsa-brig.o,
	hsa-dump.o, ipa-hsa.c and omp-grid.o.
	(GTFILES): Removed hsa-common.c and omp-expand.c.
	* builtins.def: Remove processing of hsa-builtins.def.
	(DEF_HSA_BUILTIN): Remove.
	* common.opt (flag_disable_hsa): Remove.
	(-Whsa): Ignore.
	* config.in (ENABLE_HSA): Removed.
	* configure.ac: Removed handling configuration for hsa offloading.
	(ENABLE_HSA): Removed.
	* configure: Regenerated.
	* doc/install.texi (--enable-offload-targets): Remove hsa from the
	example.
	(--with-hsa-runtime): Reword to reference any HSA run-time, not
	specifically HSA offloading.
	* doc/invoke.texi (Option Summary): Remove -Whsa.
	(Warning Options): Likewise.
	(Optimize Options): Remove hsa-gen-debug-stores.
	* doc/passes.texi (Regular IPA passes): Remove section on IPA HSA
	pass.
	* gimple-low.c (lower_stmt): Remove GIMPLE_OMP_GRID_BODY case.
	* gimple-pretty-print.c (dump_gimple_omp_for): Likewise.
	(dump_gimple_omp_block): Likewise.
	(pp_gimple_stmt_1): Likewise.
	* gimple-walk.c (walk_gimple_stmt): Likewise.
	* gimple.c (gimple_build_omp_grid_body): Removed function.
	(gimple_copy): Remove GIMPLE_OMP_GRID_BODY case.
	* gimple.def (GIMPLE_OMP_GRID_BODY): Removed.
	* gimple.h (gf_mask): Removed GF_OMP_PARALLEL_GRID_PHONY,
	OMP_FOR_KIND_GRID_LOOP, GF_OMP_FOR_GRID_PHONY,
	GF_OMP_FOR_GRID_INTRA_GROUP, GF_OMP_FOR_GRID_GROUP_ITER and
	GF_OMP_TEAMS_GRID_PHONY.  Renumbered GF_OMP_FOR_KIND_SIMD and
	GF_OMP_TEAMS_HOST.
	(gimple_build_omp_grid_body): Removed declaration.
	(gimple_has_substatements): Remove GIMPLE_OMP_GRID_BODY case.
	(gimple_omp_for_grid_phony): Removed.
	(gimple_omp_for_set_grid_phony): Likewise.
	(gimple_omp_for_grid_intra_group): Likewise.
	(gimple_omp_for_grid_intra_group): Likewise.
	(gimple_omp_for_grid_group_iter): Likewise.
	(gimple_omp_for_set_grid_group_iter): Likewise.
	(gimple_omp_parallel_grid_phony): Likewise.
	(gimple_omp_parallel_set_grid_phony): Likewise.
	(gimple_omp_teams_grid_phony): Likewise.
	(gimple_omp_teams_set_grid_phony): Likewise.
	(CASE_GIMPLE_OMP): Remove GIMPLE_OMP_GRID_BODY case.
	* lto-section-in.c (lto_section_name): Removed hsa.
	* lto-streamer.h (lto_section_type): Removed LTO_section_ipa_hsa.
	* lto-wrapper.c (compile_images_for_offload_targets): Remove special
	handling of hsa.
	* omp-expand.c: Do not include hsa-common.h and gt-omp-expand.h.
	(parallel_needs_hsa_kernel_p): Removed.
	(grid_launch_attributes_trees): Likewise.
	(grid_launch_attributes_trees): Likewise.
	(grid_create_kernel_launch_attr_types): Likewise.
	(grid_insert_store_range_dim): Likewise.
	(grid_get_kernel_launch_attributes): Likewise.
	(get_target_arguments): Remove code passing HSA grid sizes.
	(grid_expand_omp_for_loop): Remove.
	(grid_arg_decl_map): Likewise.
	(grid_remap_kernel_arg_accesses): Likewise.
	(grid_expand_target_grid_body): Likewise.
	(expand_omp): Remove call to grid_expand_target_grid_body.
	(omp_make_gimple_edges): Remove GIMPLE_OMP_GRID_BODY case.
	* omp-general.c: Do not include hsa-common.h.
	(omp_maybe_offloaded): Do not check for HSA offloading.
	(omp_context_selector_matches): Likewise.
	* omp-low.c: Do not include hsa-common.h and omp-grid.h.
	(build_outer_var_ref): Remove handling of GIMPLE_OMP_GRID_BODY.
	(scan_sharing_clauses): Remove handling of OMP_CLAUSE__GRIDDIM_.
	(scan_omp_parallel): Remove handling of the phoney variant.
	(check_omp_nesting_restrictions): Remove handling of
	GIMPLE_OMP_GRID_BODY and GF_OMP_FOR_KIND_GRID_LOOP.
	(scan_omp_1_stmt): Remove handling of GIMPLE_OMP_GRID_BODY.
	(lower_omp_for_lastprivate): Remove handling of gridified loops.
	(lower_omp_for): Remove phony loop handling.
	(lower_omp_taskreg): Remove phony construct handling.
	(lower_omp_teams): Likewise.
	(lower_omp_grid_body): Removed.
	(lower_omp_1): Remove GIMPLE_OMP_GRID_BODY case.
	(execute_lower_omp): Do not call omp_grid_gridify_all_targets.
	* opts.c (common_handle_option): Do not handle hsa when processing
	OPT_foffload_.
	* params.opt (hsa-gen-debug-stores): Remove.
	* passes.def: Remove pass_ipa_hsa and pass_gen_hsail.
	* timevar.def: Remove TV_IPA_HSA.
	* toplev.c: Do not include hsa-common.h.
	(compile_file): Do not call hsa_output_brig.
	* tree-core.h (enum omp_clause_code): Remove OMP_CLAUSE__GRIDDIM_.
	(tree_omp_clause): Remove union field dimension.
	* tree-nested.c (convert_nonlocal_omp_clauses): Remove the
	OMP_CLAUSE__GRIDDIM_ case.
	(convert_local_omp_clauses): Likewise.
	* tree-pass.h (make_pass_gen_hsail): Remove declaration.
	(make_pass_ipa_hsa): Likewise.
	* tree-pretty-print.c (dump_omp_clause): Remove GIMPLE_OMP_GRID_BODY
	case.
	* tree.c (omp_clause_num_ops): Remove the element corresponding to
	OMP_CLAUSE__GRIDDIM_.
	(omp_clause_code_name): Likewise.
	(walk_tree_1): Remove GIMPLE_OMP_GRID_BODY case.
	* tree.h (OMP_CLAUSE__GRIDDIM__DIMENSION): Remove.
	(OMP_CLAUSE__GRIDDIM__SIZE): Likewise.
	(OMP_CLAUSE__GRIDDIM__GROUP): Likewise.

gcc/fortran/ChangeLog:

2020-07-24  Martin Jambor  <mjambor@suse.cz>

	* f95-lang.c (gfc_init_builtin_functions): Remove processing of
	hsa-builtins.def.

gcc/brig/ChangeLog:

2020-07-24  Martin Jambor  <mjambor@suse.cz>

	* brigfrontend/brig-util.h (hsa_type_packed_p): Declared.
	* brigfrontend/brig-util.cc (hsa_type_packed_p): Moved here from
	removed gcc/hsa-common.c.

libgomp/ChangeLog:

2020-07-24  Martin Jambor  <mjambor@suse.cz>

	* plugin/Makefrag.am: Remove configuration of HSA plugin.
	* aclocal.m4: Regenerated.
	* Makefile.in: Regenerated.
	* config.h.in: Regenerated.
	* configure: Regenerated.
	* plugin/configfrag.ac: Likewise.
	* plugin/hsa_ext_finalize.h: Removed.
	* plugin/plugin-hsa.c: Likewise.
	* testsuite/Makefile.in: Regenerated.
	* testsuite/lib/libgomp.exp
	(offload_target_to_openacc_device_type): Remove hsa case.
	(check_effective_target_hsa_offloading_selected_nocache): Removed
	(check_effective_target_hsa_offloading_selected): Likewise.
	(libgomp_init): Do not add -Wno-hsa to additional_flags.
	* testsuite/libgomp.hsa.c/alloca-1.c: Removed test.
	* testsuite/libgomp.hsa.c/bitfield-1.c: Likewise.
	* testsuite/libgomp.hsa.c/bits-insns.c: Likewise.
	* testsuite/libgomp.hsa.c/builtins-1.c: Likewise.
	* testsuite/libgomp.hsa.c/c.exp: Likewise.
	* testsuite/libgomp.hsa.c/complex-1.c: Likewise.
	* testsuite/libgomp.hsa.c/complex-align-2.c: Likewise.
	* testsuite/libgomp.hsa.c/formal-actual-args-1.c: Likewise.
	* testsuite/libgomp.hsa.c/function-call-1.c: Likewise.
	* testsuite/libgomp.hsa.c/get-level-1.c: Likewise.
	* testsuite/libgomp.hsa.c/gridify-1.c: Likewise.
	* testsuite/libgomp.hsa.c/gridify-2.c: Likewise.
	* testsuite/libgomp.hsa.c/gridify-3.c: Likewise.
	* testsuite/libgomp.hsa.c/gridify-4.c: Likewise.
	* testsuite/libgomp.hsa.c/memory-operations-1.c: Likewise.
	* testsuite/libgomp.hsa.c/pr69568.c: Likewise.
	* testsuite/libgomp.hsa.c/pr82416.c: Likewise.
	* testsuite/libgomp.hsa.c/rotate-1.c: Likewise.
	* testsuite/libgomp.hsa.c/staticvar.c: Likewise.
	* testsuite/libgomp.hsa.c/switch-1.c: Likewise.
	* testsuite/libgomp.hsa.c/switch-branch-1.c: Likewise.
	* testsuite/libgomp.hsa.c/switch-sbr-2.c: Likewise.
	* testsuite/libgomp.hsa.c/tiling-1.c: Likewise.
	* testsuite/libgomp.hsa.c/tiling-2.c: Likewise.

gcc/testsuite/ChangeLog:

2020-07-24  Martin Jambor  <mjambor@suse.cz>

	* lib/target-supports.exp (check_effective_target_offload_hsa):
	Removed.
	* c-c++-common/gomp/gridify-1.c: Removed test.
	* c-c++-common/gomp/gridify-2.c: Likewise.
	* c-c++-common/gomp/gridify-3.c: Likewise.
	* c-c++-common/gomp/hsa-indirect-call-1.c: Likewise.
	* gfortran.dg/gomp/gridify-1.f90: Likewise.
	* gcc.dg/gomp/gomp.exp: Do not pass -Wno-hsa to tests.
	* g++.dg/gomp/gomp.exp: Likewise.
	* gfortran.dg/gomp/gomp.exp: Likewise.
2020-08-03 18:13:00 +02:00
Andrew Stubbs
f062c3f115 amdgcn: Switch to HSACO v3 binary format
This upgrades the compiler to emit HSA Code Object v3 binaries.  This means
changing the assembler directives, and linker command line options.

The gcn-run and libgomp loaders need corresponding alterations.  The
relocations no longer need to be fixed up manually, and the kernel symbol
names have changed slightly.

This move makes the binaries compatible with the new rocgdb from ROCm 3.5.

2020-06-17  Andrew Stubbs  <ams@codesourcery.com>

	gcc/
	* config/gcn/gcn-hsa.h (TEXT_SECTION_ASM_OP): Use ".text".
	(BSS_SECTION_ASM_OP): Use ".bss".
	(ASM_SPEC): Remove "-mattr=-code-object-v3".
	(LINK_SPEC): Add "--export-dynamic".
	* config/gcn/gcn-opts.h (processor_type): Replace PROCESSOR_VEGA with
	PROCESSOR_VEGA10 and PROCESSOR_VEGA20.
	* config/gcn/gcn-run.c (HSA_RUNTIME_LIB): Use ".so.1" variant.
	(load_image): Remove obsolete relocation handling.
	Add ".kd" suffix to the symbol names.
	* config/gcn/gcn.c (MAX_NORMAL_SGPR_COUNT): Set to 62.
	(gcn_option_override): Update gcn_isa test.
	(gcn_kernel_arg_types): Update all the assembler directives.
	Remove the obsolete options.
	(gcn_conditional_register_usage): Update MAX_NORMAL_SGPR_COUNT usage.
	(gcn_omp_device_kind_arch_isa): Handle PROCESSOR_VEGA10 and
	PROCESSOR_VEGA20.
	(output_file_start): Rework assembler file header.
	(gcn_hsa_declare_function_name): Rework kernel metadata.
	* config/gcn/gcn.h (GCN_KERNEL_ARG_TYPES): Set to 16.
	* config/gcn/gcn.opt (PROCESSOR_VEGA): Remove enum.
	(PROCESSOR_VEGA10): New enum value.
	(PROCESSOR_VEGA20): New enum value.

	libgomp/
	* plugin/plugin-gcn.c (init_environment_variables): Use ".so.1"
	variant for HSA_RUNTIME_LIB name.
	(find_executable_symbol_1): Delete.
	(find_executable_symbol): Delete.
	(init_kernel_properties): Add ".kd" suffix to symbol names.
	(find_load_offset): Delete.
	(create_and_finalize_hsa_program): Remove relocation handling.
2020-06-17 10:06:21 +01:00
Andrew Stubbs
966de09be9 amdgcn: Check HSA return codes [PR94629]
Ensure that the returned status values are not ignored.  The old code was
not broken, but this is both safer and satisfies static analysis.

2020-04-23  Andrew Stubbs  <ams@codesourcery.com>

	PR other/94629

	libgomp/
	* plugin/plugin-gcn.c (init_hsa_context): Check return value from
	hsa_iterate_agents.
	(GOMP_OFFLOAD_init_device): Check return values from both calls to
	hsa_agent_iterate_regions.
2020-04-23 15:05:23 +01:00
Frederik Harwath
001ab12e62 openmp: ignore nowait if async execution is unsupported [PR93481]
An OpenMP "nowait" clause on a target construct currently leads to
a call to GOMP_OFFLOAD_async_run in the plugin that is used for
offloading at execution time. The nvptx plugin contains only a stub
of this function that always produces a fatal error if called.

This commit changes the "nowait" implementation to ignore the clause
if the executing device's plugin does not implement GOMP_OFFLOAD_async_run.
The stub in the nvptx plugin is removed which effectively means that
programs containing "nowait" can now be executed with nvptx offloading
as if the clause had not been used.
This behavior is consistent with the OpenMP specification which says that
"[...] execution of the target task *may* be deferred" (emphasis added),
cf. OpenMP 5.0, page 172.

libgomp/

	* plugin/plugin-nvptx.c: Remove GOMP_OFFLOAD_async_run stub.
	* target.c (gomp_load_plugin_for_device): Make "async_run" loading
	optional.
	(gomp_target_task_fn): Assert "devicep->async_run_func".
	(clear_unsupported_flags): New function to remove unsupported flags
	(right now only GOMP_TARGET_FLAG_NOWAIT) that can be be ignored.
	(GOMP_target_ext): Apply clear_unsupported_flags to flags.
	* testsuite/libgomp.c/target-33.c:
	Remove xfail for offload_target_nvptx.
	* testsuite/libgomp.c/target-34.c: Likewise.
2020-02-13 10:18:31 +01:00
Andrew Stubbs
591f869ad7 Remove gfx801 "carrizo" support
2020-02-03  Andrew Stubbs  <ams@codesourcery.com>

	gcc/
	* config.gcc: Remove "carrizo" support.
	* config/gcn/gcn-opts.h (processor_type): Likewise.
	* config/gcn/gcn.c (gcn_omp_device_kind_arch_isa): Likewise.
	* config/gcn/gcn.opt (gpu_type): Likewise.
	* config/gcn/t-omp-device: Likewise.

	libgomp/
	* plugin/plugin-gcn.c (EF_AMDGPU_MACH_AMDGCN_GFX801): Remove.
	(gcn_gfx801_s): Remove.
	(isa_hsa_name): Remove gfx801.
	(isa_gcc_name): Remove gfx801/carizzo.
	(isa_code): Remove gfx801.
2020-02-03 17:23:18 +00:00
Kwok Cheung Yeung
5a28e2727f [amdgcn] Scale number of threads/workers with VGPR usage
2020-01-31  Kwok Cheung Yeung  <kcy@codesourcery.com>

	gcc/
	* config/gcn/mkoffload.c (process_asm): Add sgpr_count and vgpr_count
	to definition of hsa_kernel_description.  Parse assembly to find SGPR
	and VGPR count of kernel and store in hsa_kernel_description.

	libgomp/
	* plugin/plugin-gcn.c (struct hsa_kernel_description): Add sgpr_count
	and vgpr_count fields.
	(struct kernel_info): Add a field for a hsa_kernel_description.
	(run_kernel): Reduce the number of threads/workers if the requested
	number would require too many VGPRs.
	(init_basic_kernel_info): Initialize description field with
	the hsa_kernel_description entry for the kernel.
2020-01-31 07:13:05 -08:00
Tobias Burnus
5ab5d81b36 Skip plugin-{gcn,hsa} for (-m)x32 (PR bootstrap/93409)
PR bootstrap/93409
        * plugin/configfrag.ac (enable_offload_targets): Skip
        HSA and GCN plugin besides -m32 also for -mx32.
        * configure: Regenerate.
2020-01-30 12:27:17 +01:00
Frederik Harwath
2e5ea57959 Add OpenACC acc_get_property support for AMD GCN
Add full support for the OpenACC 2.6 acc_get_property and
acc_get_property_string functions to the libgomp GCN plugin.

libgomp/
	* plugin-gcn.c (struct agent_info): Add fields "name" and
	"vendor_name" ...
	(GOMP_OFFLOAD_init_device): ... and init from here.
	(struct hsa_context_info): Add field "driver_version_s" ...
	(init_hsa_contest): ... and init from here.
	(GOMP_OFFLOAD_openacc_get_property): Replace stub with a proper
	implementation.
	* testsuite/libgomp.oacc-c-c++-common/acc_get_property.c:
	Enable test execution for amdgcn and host offloading targets.
	* testsuite/libgomp.oacc-fortran/acc_get_property.f90: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/acc_get_property-aux.c
	(expect_device_properties): Split function into ...
	(expect_device_string_properties): ... this new function ...
	(expect_device_memory): ... and this new function.
	* testsuite/libgomp.oacc-c-c++-common/acc_get_property-gcn.c:
	Add test.
2020-01-29 11:54:56 +01:00
Andrew Stubbs
14e5e74698 Fix libgomp plugin-gcn bug
2020-01-23  Andrew Stubbs  <ams@codesourcery.com>

	libgomp/
	* plugin/plugin-gcn.c (parse_target_attributes): Use correct mask for
	the device id.
2020-01-23 12:40:04 +00:00
Frederik Harwath
7d593fd672 Add runtime ISA check for amdgcn offloading
The HSA/ROCm runtime rejects binaries not built for the exact GPU device
present. So far, the libgomp amdgcn plugin does not verify that the GPU ISA
and the ISA specified at compile time match before handing over the binary to
the runtime.  In case of a mismatch, the user is confronted with an unhelpful
runtime error.

This commit implements a runtime ISA check. In case of an ISA mismatch, the
execution is aborted with a clear error message and a hint at the correct
compilation parameters for the GPU on which the execution has been attempted.

libgomp/
	* plugin/plugin-gcn.c (EF_AMDGPU_MACH): New enum.
	* (EF_AMDGPU_MACH_MASK): New constant.
	* (gcn_isa): New typedef.
	* (gcn_gfx801_s): New constant.
	* (gcn_gfx803_s): New constant.
	* (gcn_gfx900_s): New constant.
	* (gcn_gfx906_s): New constant.
	* (gcn_isa_name_len): New constant.
	* (elf_gcn_isa_field): New function.
	* (isa_hsa_name): New function.
	* (isa_gcc_name): New function.
	* (isa_code): New function.
	* (struct agent_info): Add field "device_isa" and remove field
	"gfx900_p".
	* (GOMP_OFFLOAD_init_device): Adapt agent init to "agent_info"
	field changes, fail if device has unknown ISA.
	* (parse_target_attributes): Replace "gfx900_p" by "device_isa".
	* (isa_matches_agent): New function ...
	* (create_and_finalize_hsa_program): ... used from here to check
	that the GPU ISA and the code-object ISA match.
2020-01-21 07:41:45 +01:00
Thomas Schwinge
6fc0385c0c OpenACC 'acc_get_property' cleanup
include/
	* gomp-constants.h (enum gomp_device_property): Remove.
	libgomp/
	* libgomp-plugin.h (enum goacc_property): New.  Adjust all users
	to use this instead of 'enum gomp_device_property'.
	(GOMP_OFFLOAD_get_property): Rename to...
	(GOMP_OFFLOAD_openacc_get_property): ... this.  Adjust all users.
	* libgomp.h (struct gomp_device_descr): Move
	'GOMP_OFFLOAD_openacc_get_property'...
	(struct acc_dispatch_t): ... here.  Adjust all users.
	* plugin/plugin-hsa.c (GOMP_OFFLOAD_get_property): Remove.
	liboffloadmic/
	* plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_get_property):
	Remove.

From-SVN: r280150
2020-01-10 23:24:36 +01:00