gcc/libgomp/plugin
Andrew Stubbs c7ec7bd1c6 amdgcn: add -march=gfx1030 EXPERIMENTAL
Accept the architecture configure option and resolve build failures.  This is
enough to build binaries, but I've not got a device to test it on, so there
are probably runtime issues to fix.  The cache control instructions might be
unsafe (or too conservative), and the kernel metadata might be off.  Vector
reductions will need to be reworked for RDNA2.  In principle, it would be
better to use wavefrontsize32 for this architecture, but that would mean
switching everything to allow SImode masks, so wavefrontsize64 it is.

The multilib is not included in the default configuration so either configure
--with-arch=gfx1030 or include it in --with-multilib-list=gfx1030,....

The majority of this patch has no effect on other devices, but changing from
using scalar writes for the exit value to vector writes means we don't need
the scalar cache write-back instruction anywhere (which doesn't exist in RDNA2).

gcc/ChangeLog:

	* config.gcc: Allow --with-arch=gfx1030.
	* config/gcn/gcn-hsa.h (NO_XNACK): gfx1030 does not support xnack.
	(ASM_SPEC): gfx1030 needs -mattr=+wavefrontsize64 set.
	* config/gcn/gcn-opts.h (enum processor_type): Add PROCESSOR_GFX1030.
	(TARGET_GFX1030): New.
	(TARGET_RDNA2): New.
	* config/gcn/gcn-valu.md (@dpp_move<mode>): Disable for RDNA2.
	(addc<mode>3<exec_vcc>): Add RDNA2 syntax variant.
	(subc<mode>3<exec_vcc>): Likewise.
	(<convop><mode><vndi>2_exec): Add RDNA2 alternatives.
	(vec_cmp<mode>di): Likewise.
	(vec_cmp<u><mode>di): Likewise.
	(vec_cmp<mode>di_exec): Likewise.
	(vec_cmp<u><mode>di_exec): Likewise.
	(vec_cmp<mode>di_dup): Likewise.
	(vec_cmp<mode>di_dup_exec): Likewise.
	(reduc_<reduc_op>_scal_<mode>): Disable for RDNA2.
	(*<reduc_op>_dpp_shr_<mode>): Likewise.
	(*plus_carry_dpp_shr_<mode>): Likewise.
	(*plus_carry_in_dpp_shr_<mode>): Likewise.
	* config/gcn/gcn.cc (gcn_option_override): Recognise gfx1030.
	(gcn_global_address_p): RDNA2 only allows smaller offsets.
	(gcn_addr_space_legitimate_address_p): Likewise.
	(gcn_omp_device_kind_arch_isa): Recognise gfx1030.
	(gcn_expand_epilogue): Use VGPRs instead of SGPRs.
	(output_file_start): Configure gfx1030.
	* config/gcn/gcn.h (TARGET_CPU_CPP_BUILTINS): Add __RDNA2__;
	(ASSEMBLER_DIALECT): New.
	* config/gcn/gcn.md (rdna): New define_attr.
	(enabled): Use "rdna" attribute.
	(gcn_return): Remove s_dcache_wb.
	(addcsi3_scalar): Add RDNA2 syntax variant.
	(addcsi3_scalar_zero): Likewise.
	(addptrdi3): Likewise.
	(mulsi3): v_mul_lo_i32 should be v_mul_lo_u32 on all ISA.
	(*memory_barrier): Add RDNA2 syntax variant.
	(atomic_load<mode>): Add RDNA2 cache control variants, and disable
	scalar atomics for RDNA2.
	(atomic_store<mode>): Likewise.
	(atomic_exchange<mode>): Likewise.
	* config/gcn/gcn.opt (gpu_type): Add gfx1030.
	* config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_GFX1030): New.
	(main): Recognise -march=gfx1030.
	* config/gcn/t-omp-device: Add gfx1030 isa.

libgcc/ChangeLog:

	* config/gcn/amdgcn_veclib.h (CDNA3_PLUS): Set false for __RDNA2__.

libgomp/ChangeLog:

	* plugin/plugin-gcn.c (EF_AMDGPU_MACH_AMDGCN_GFX1030): New.
	(isa_hsa_name): Recognise gfx1030.
	(isa_code): Likewise.
	* team.c (defined): Remove s_endpgm.
2023-10-20 12:40:25 +01:00
..
configfrag.ac Update copyright years. 2023-01-16 11:52:17 +01:00
cuda-lib.def OpenMP: Call cuMemcpy2D/cuMemcpy3D for nvptx for omp_target_memcpy_rect 2023-07-26 16:22:35 +02:00
Makefrag.am Update copyright years. 2023-01-16 11:52:17 +01:00
plugin-gcn.c amdgcn: add -march=gfx1030 EXPERIMENTAL 2023-10-20 12:40:25 +01:00
plugin-nvptx.c libgomp: cuda.h and omp_target_memcpy_rect cleanup 2023-07-29 13:25:03 +02:00