GCC modified for the FreeChainXenon project
Juzhe-Zhong 1a51886a79 RISC-V: VLA preempts VLS on unknown NITERS loop
This patch fixes the known issues on SLP cases. Before this patch:

	ble	a2,zero,.L11
	addiw	t1,a2,-1
	li	a5,15
	bleu	t1,a5,.L9
	srliw	a7,t1,4
	slli	a7,a7,7
	lui	t3,%hi(.LANCHOR0)
	lui	a6,%hi(.LANCHOR0+128)
	addi	t3,t3,%lo(.LANCHOR0)
	li	a4,128
	addi	a6,a6,%lo(.LANCHOR0+128)
	add	a7,a7,a0
	addi	a3,a1,37
	mv	a5,a0
	vsetvli	zero,a4,e8,m8,ta,ma
	vle8.v	v24,0(t3)
	vle8.v	v16,0(a6)
.L4:
	li	a6,128
	vle8.v	v0,0(a3)
	vrgather.vv	v8,v0,v24
	vadd.vv	v8,v8,v16
	vse8.v	v8,0(a5)
	add	a5,a5,a6
	add	a3,a3,a6
	bne	a5,a7,.L4
	andi	a5,t1,-16
	mv	t1,a5
.L3:
	subw	a2,a2,a5
	li	a4,1
	beq	a2,a4,.L5
	slli	a5,a5,32
	srli	a5,a5,32
	addiw	a2,a2,-1
	slli	a5,a5,3
	csrr	a4,vlenb
	slli	a6,a2,32
	addi	t3,a5,37
	srli	a3,a6,29
	slli	a4,a4,2
	add	t3,a1,t3
	add	a5,a0,a5
	mv	t5,a3
	bgtu	a3,a4,.L14
.L6:
	li	a4,50790400
	addi	a4,a4,1541
	li	a6,67633152
	addi	a6,a6,513
	slli	a4,a4,32
	add	a4,a4,a6
	vsetvli	t4,zero,e64,m4,ta,ma
	vmv.v.x	v16,a4
	vsetvli	a6,zero,e16,m8,ta,ma
	vid.v	v8
	vsetvli	zero,t5,e8,m4,ta,ma
	vle8.v	v20,0(t3)
	vsetvli	a6,zero,e16,m8,ta,ma
	csrr	a7,vlenb
	vand.vi	v8,v8,-8
	vsetvli	zero,zero,e8,m4,ta,ma
	slli	a4,a7,2
	vrgatherei16.vv	v4,v20,v8
	vadd.vv	v4,v4,v16
	vsetvli	zero,t5,e8,m4,ta,ma
	vse8.v	v4,0(a5)
	bgtu	a3,a4,.L15
.L7:
	addw	t1,a2,t1
.L5:
	slliw	a5,t1,3
	add	a1,a1,a5
	lui	a4,%hi(.LC2)
	add	a0,a0,a5
	lbu	a3,37(a1)
	addi	a5,a4,%lo(.LC2)
	vsetivli	zero,8,e8,mf2,ta,ma
	vmv.v.x	v1,a3
	vle8.v	v2,0(a5)
	vadd.vv	v1,v1,v2
	vse8.v	v1,0(a0)
.L11:
	ret
.L15:
	sub	a3,a3,a4
	bleu	a3,a4,.L8
	mv	a3,a4
.L8:
	li	a7,50790400
	csrr	a4,vlenb
	slli	a4,a4,2
	addi	a7,a7,1541
	li	t4,67633152
	add	t3,t3,a4
	vsetvli	zero,a3,e8,m4,ta,ma
	slli	a7,a7,32
	addi	t4,t4,513
	vle8.v	v20,0(t3)
	add	a4,a5,a4
	add	a7,a7,t4
	vsetvli	a5,zero,e64,m4,ta,ma
	vmv.v.x	v16,a7
	vsetvli	a6,zero,e16,m8,ta,ma
	vid.v	v8
	vand.vi	v8,v8,-8
	vsetvli	zero,zero,e8,m4,ta,ma
	vrgatherei16.vv	v4,v20,v8
	vadd.vv	v4,v4,v16
	vsetvli	zero,a3,e8,m4,ta,ma
	vse8.v	v4,0(a4)
	j	.L7
.L14:
	mv	t5,a4
	j	.L6
.L9:
	li	a5,0
	li	t1,0
	j	.L3

The vectorization codegen is quite inefficient because we choose a VLS mode to vectorize the loop body
and a VLA mode for the epilogue.

cost.c:6:21: note:  ***** Choosing vector mode V128QI
cost.c:6:21: note:  ***** Choosing epilogue vector mode RVVM4QI

As we know, RVV has both VLA modes and VLS modes. VLA modes support partial vectors, whereas
VLS modes support only full vectors.  The goal of adding VLS modes is to improve the codegen of
known-NITERS loops or SLP code.

If NITERS is unknown (that is, the loop condition is i < n and n is unknown), we will always need
partial-vector vectorization, either in the loop body or in the epilogue. In this case, it is always
more efficient to apply VLA partial vectorization to the loop body, which then needs no epilogue.
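
For orientation, here is a minimal C sketch of the kind of unknown-NITERS SLP loop discussed here. It is
a reconstruction from the assembly in this message, not the text of the testcase: the offset 37 comes
from the `addi a3,a1,37` / `lbu a3,37(a1)` instructions, and the per-lane addends are read off the
64-bit constant the code builds (0x0307060504080201, little-endian bytes 1, 2, 8, 4, 5, 6, 7, 3). The
parameter names `a`, `b` and `n` are assumptions.

```c
#include <stdint.h>

/* Hypothetical reconstruction: each iteration loads one byte and stores
   eight results, so the eight statements form one SLP group.  The trip
   count n is not known at compile time.  */
void
f (int8_t *restrict a, int8_t *restrict b, int n)
{
  for (int i = 0; i < n; ++i)
    {
      a[i * 8 + 0] = b[i * 8 + 37] + 1;
      a[i * 8 + 1] = b[i * 8 + 37] + 2;
      a[i * 8 + 2] = b[i * 8 + 37] + 8;
      a[i * 8 + 3] = b[i * 8 + 37] + 4;
      a[i * 8 + 4] = b[i * 8 + 37] + 5;
      a[i * 8 + 5] = b[i * 8 + 37] + 6;
      a[i * 8 + 6] = b[i * 8 + 37] + 7;
      a[i * 8 + 7] = b[i * 8 + 37] + 3;
    }
}
```

Because n is only known at run time, some part of the vectorized code has to handle a partial vector
no matter which mode is chosen for the main loop.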

After this patch:

f:
	ble	a2,zero,.L7
	li	a5,1
	beq	a2,a5,.L5
	li	a6,50790400
	addi	a6,a6,1541
	li	a4,67633152
	addi	a4,a4,513
	csrr	a5,vlenb
	addiw	a2,a2,-1
	slli	a6,a6,32
	add	a6,a6,a4
	slli	a5,a5,2
	slli	a4,a2,32
	vsetvli	t1,zero,e64,m4,ta,ma
	srli	a3,a4,29
	neg	t4,a5
	addi	a7,a1,37
	mv	a4,a0
	vmv.v.x	v12,a6
	vsetvli	t3,zero,e16,m8,ta,ma
	vid.v	v16
	vand.vi	v16,v16,-8
.L4:
	minu	a6,a3,a5
	vsetvli	zero,a6,e8,m4,ta,ma
	vle8.v	v8,0(a7)
	vsetvli	t3,zero,e8,m4,ta,ma
	mv	t1,a3
	vrgatherei16.vv	v4,v8,v16
	vsetvli	zero,a6,e8,m4,ta,ma
	vadd.vv	v4,v4,v12
	vse8.v	v4,0(a4)
	add	a7,a7,a5
	add	a4,a4,a5
	add	a3,a3,t4
	bgtu	t1,a5,.L4
.L3:
	slliw	a2,a2,3
	add	a1,a1,a2
	lui	a5,%hi(.LC0)
	lbu	a4,37(a1)
	add	a0,a0,a2
	addi	a5,a5,%lo(.LC0)
	vsetivli	zero,8,e8,mf2,ta,ma
	vmv.v.x	v1,a4
	vle8.v	v2,0(a5)
	vadd.vv	v1,v1,v2
	vse8.v	v1,0(a0)
.L7:
	ret
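
The single loop at .L4 above can be read as a length-controlled (strip-mined) loop: `minu a6,a3,a5`
picks the number of bytes for the current trip, so the final short trip is handled by the same loop
body and no vector epilogue loop is needed. A scalar C model of that control flow, as a sketch only:
`VLMAX_BYTES` and `add_bytes_vla` are hypothetical names, and the fixed value 64 stands in for the
run-time quantity the assembly derives from vlenb.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the maximum number of bytes per trip.  On
   real hardware this is a run-time value, not a compile-time constant.  */
enum { VLMAX_BYTES = 64 };

/* Scalar model of a partial-vector loop: every trip processes
   vl = min (remaining, VLMAX_BYTES) bytes, including the last short one.
   Returns the number of trips taken.  */
size_t
add_bytes_vla (uint8_t *dst, const uint8_t *src, uint8_t addend, size_t n)
{
  size_t trips = 0;
  size_t remaining = n;

  while (remaining > 0)
    {
      /* Models "minu" followed by "vsetvli zero,<len>,e8,...".  */
      size_t vl = remaining < VLMAX_BYTES ? remaining : VLMAX_BYTES;

      /* Models one vector load, add and store of vl bytes.  */
      for (size_t j = 0; j < vl; ++j)
        dst[j] = (uint8_t) (src[j] + addend);

      dst += vl;
      src += vl;
      remaining -= vl;
      ++trips;
    }

  return trips;
}
```

With n = 150 the model takes three trips of 64, 64 and 22 bytes; a full-vector (VLS-style) main loop
would instead stop after the second trip and leave the remaining 22 bytes to a separate epilogue.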

Tested on both RV32 and RV64 with no regressions. OK for trunk?

gcc/ChangeLog:

	* config/riscv/riscv-vector-costs.cc (costs::better_main_loop_than_p): VLA
	preempt VLS on unknown NITERS loop.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/partial/slp-1.c: Remove xfail.
	* gcc.target/riscv/rvv/autovec/partial/slp-16.c: Ditto.
	* gcc.target/riscv/rvv/autovec/partial/slp-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/partial/slp-5.c: Ditto.
2024-01-11 14:08:37 +08:00
.github Minor formatting fix for newly-added file from previous commit 2023-11-01 19:28:56 -04:00
c++tools Update copyright years. 2024-01-03 12:19:35 +01:00
config config: delete unused CYG_AC_PATH_LIBERTY macro 2024-01-10 19:51:07 -05:00
contrib Daily bump. 2024-01-10 00:18:30 +00:00
fixincludes Daily bump. 2023-11-23 00:18:14 +00:00
gcc RISC-V: VLA preempts VLS on unknown NITERS loop 2024-01-11 14:08:37 +08:00
gnattools Update Copyright year in ChangeLog files 2024-01-03 11:35:18 +01:00
gotools Daily bump. 2023-11-04 00:16:45 +00:00
include Daily bump. 2024-01-10 00:18:30 +00:00
INSTALL
libada Update copyright years. 2024-01-03 12:19:35 +01:00
libatomic Update copyright years. 2024-01-03 12:19:35 +01:00
libbacktrace Update copyright years. 2024-01-05 08:54:28 +01:00
libcc1 Daily bump. 2024-01-10 00:18:30 +00:00
libcody Update Copyright year in ChangeLog files 2024-01-03 11:35:18 +01:00
libcpp Daily bump. 2024-01-05 00:18:48 +00:00
libdecnumber Update copyright years. 2024-01-03 12:19:35 +01:00
libffi Daily bump. 2023-10-27 00:17:12 +00:00
libgcc Update copyright years. 2024-01-03 12:19:35 +01:00
libgfortran Daily bump. 2024-01-08 00:16:43 +00:00
libgm2 Daily bump. 2024-01-06 00:18:04 +00:00
libgo libgo: update configure.ac to upstream GCC 2023-11-30 13:23:53 -08:00
libgomp Daily bump. 2024-01-11 00:18:50 +00:00
libgrust Daily bump. 2024-01-09 00:17:50 +00:00
libiberty Update copyright years. 2024-01-03 12:19:35 +01:00
libitm Daily bump. 2024-01-04 00:18:45 +00:00
libobjc Update copyright years. 2024-01-03 12:19:35 +01:00
libphobos Update copyright years. 2024-01-05 08:54:28 +01:00
libquadmath Daily bump. 2024-01-04 00:18:45 +00:00
libsanitizer Daily bump. 2024-01-03 00:17:41 +00:00
libssp Update copyright years. 2024-01-03 12:19:35 +01:00
libstdc++-v3 libstdc++: Optimize std::is_compound compilation performance 2024-01-10 20:08:26 -08:00
libvtv Update copyright years. 2024-01-03 12:19:35 +01:00
lto-plugin Update copyright years. 2024-01-03 12:19:35 +01:00
maintainer-scripts Daily bump. 2023-11-14 12:23:39 +00:00
zlib Daily bump. 2023-10-23 00:16:43 +00:00
.dir-locals.el
.gitattributes
.gitignore *: add modern gettext 2023-11-14 00:47:11 +01:00
ABOUT-NLS
ar-lib
ChangeLog Daily bump. 2024-01-10 00:18:30 +00:00
ChangeLog.jit
ChangeLog.tree-ssa
compile
config-ml.in LoongArch: Reimplement multilib build option handling. 2023-09-15 10:42:12 +08:00
config.guess
config.rpath
config.sub
configure build: Add libgrust as compilation modules 2023-12-14 13:58:57 +01:00
configure.ac build: Add libgrust as compilation modules 2023-12-14 13:58:57 +01:00
COPYING
COPYING.LIB
COPYING.RUNTIME
COPYING3
COPYING3.LIB
depcomp
install-sh
libtool-ldflags
libtool.m4 Build: fix error in fixinclude configure 2023-11-22 11:54:33 +01:00
ltgcc.m4
ltmain.sh
ltoptions.m4
ltsugar.m4
ltversion.m4
lt~obsolete.m4
MAINTAINERS MAINTAINERS: Update my email address 2024-01-08 18:52:09 +00:00
Makefile.def Pass GUILE down to subdirectories 2024-01-09 08:02:31 -07:00
Makefile.in Pass GUILE down to subdirectories 2024-01-09 08:02:31 -07:00
Makefile.tpl Pass GUILE down to subdirectories 2024-01-09 08:02:31 -07:00
missing
mkdep
mkinstalldirs
move-if-change
multilib.am
README
SECURITY.txt SECURITY.txt: Drop "exploitable" in reference to hardening issues 2024-01-09 10:49:01 -05:00
symlink-tree
test-driver
ylwrap

This directory contains the GNU Compiler Collection (GCC).

The GNU Compiler Collection is free software.  See the files whose
names start with COPYING for copying permission.  The manuals, and
some of the runtime libraries, are under different terms; see the
individual source files for details.

The directory INSTALL contains copies of the installation information
as HTML and plain text.  The source of this information is
gcc/doc/install.texi.  The installation information includes details
of what is included in the GCC sources and what files GCC installs.

See the file gcc/doc/gcc.texi (together with other files that it
includes) for usage and porting information.  An online readable
version of the manual is in the files gcc/doc/gcc.info*.

See http://gcc.gnu.org/bugs/ for how to report bugs usefully.

Copyright years on GCC source files may be listed using range
notation, e.g., 1987-2012, indicating that every year in the range,
inclusive, is a copyrightable year that could otherwise be listed
individually.