GCC modified for the FreeChainXenon project
This patch gets CSE to re-use constants already inside a vector rather than re-materializing them.

Basically consider the following case:

    #include <stdint.h>
    #include <arm_neon.h>

    uint64_t
    test (uint64_t a, uint64x2_t b, uint64x2_t* rt)
    {
      uint64_t arr[2] = { 0x0942430810234076UL, 0x0942430810234076UL};
      uint64_t res = a | arr[0];
      uint64x2_t val = vld1q_u64 (arr);
      *rt = vaddq_u64 (val, b);
      return res;
    }

The actual behavior is inconsequential; however, notice that the same constants are used in the vector (arr and later val) and in the calculation of res.

The code we generate for this, however, is quite sub-optimal:

    test:
            adrp    x2, .LC0
            sub     sp, sp, #16
            ldr     q1, [x2, #:lo12:.LC0]
            mov     x2, 16502
            movk    x2, 0x1023, lsl 16
            movk    x2, 0x4308, lsl 32
            add     v1.2d, v1.2d, v0.2d
            movk    x2, 0x942, lsl 48
            orr     x0, x0, x2
            str     q1, [x1]
            add     sp, sp, 16
            ret
    .LC0:
            .xword  667169396713799798
            .xword  667169396713799798

Essentially we materialize the same constant twice. The reason for this is that the front-end lowers the constant extracted from arr[0] quite early on. If you look into the result of fre you'll find

    <bb 2> :
    arr[0] = 667169396713799798;
    arr[1] = 667169396713799798;
    res_7 = a_6(D) | 667169396713799798;
    _16 = __builtin_aarch64_ld1v2di (&arr);
    _17 = VIEW_CONVERT_EXPR<uint64x2_t>(_16);
    _11 = b_10(D) + _17;
    *rt_12(D) = _11;
    arr ={v} {CLOBBER};
    return res_7;

This makes sense for further optimization. However, come expand time, if the constant isn't representable in the target arch it will be assigned to a register again.

    (insn 8 5 9 2 (set (reg:V2DI 99)
            (const_vector:V2DI [
                    (const_int 667169396713799798 [0x942430810234076]) repeated x2
                ])) "cse.c":7:12 -1
         (nil))
    ...
    (insn 14 13 15 2 (set (reg:DI 103)
            (const_int 667169396713799798 [0x942430810234076])) "cse.c":8:12 -1
         (nil))
    (insn 15 14 16 2 (set (reg:DI 102 [ res ])
            (ior:DI (reg/v:DI 96 [ a ])
                (reg:DI 103))) "cse.c":8:12 -1
         (nil))

And since it's out of the immediate range of the scalar instruction used, combine won't be able to do anything here. This will then trigger the re-materialization of the constant twice.

To fix this, this patch extends CSE to be able to generate an extract for a constant from another vector, or to make a vector for a constant by duplicating another constant.

Whether this transformation is done or not depends entirely on the costing for the target for the different constants and operations.

I initially also investigated doing this in PRE, but PRE requires at least 2 BBs to work, does not currently have any way to remove redundancies within a single BB, and it did not look easy to support.

gcc/ChangeLog:

	* cse.c (add_to_set): New.
	(find_sets_in_insn): Register constants in sets.
	(canonicalize_insn): Use auto_vec instead.
	(cse_insn): Try materializing using vec_dup.
	* rtl.h (simplify_context::simplify_gen_vec_select,
	simplify_gen_vec_select): New.
	* simplify-rtx.c (simplify_context::simplify_gen_vec_select): New.

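
The two directions the commit message describes (extracting a scalar constant from a vector constant, and building a vector constant by duplicating a scalar) can be sketched at source level. This is only an analogy written by hand: the patch itself works on RTL inside CSE and is gated by target costs. The names `v2du`, `K`, `test_extract` and `test_dup` are hypothetical, and the GCC/Clang `vector_size` extension is assumed in place of `uint64x2_t` so that the sketch also builds on non-AArch64 hosts.

```c
#include <stdint.h>

/* Two 64-bit lanes.  Assumption: GCC/Clang vector extension, standing in
   for uint64x2_t from arm_neon.h.  */
typedef uint64_t v2du __attribute__ ((vector_size (16)));

/* The constant from the commit message's test case.  */
#define K 0x0942430810234076ULL

/* "Extract" direction: the scalar operand is read from lane 0 of the
   vector constant instead of being built a second time.  */
uint64_t
test_extract (uint64_t a, v2du b, v2du *rt)
{
  v2du val = { K, K };
  uint64_t k = val[0];   /* lane read, analogous to a vec_select */
  *rt = val + b;
  return a | k;
}

/* "Duplicate" direction: the vector constant is formed by broadcasting
   a scalar that is already available.  */
uint64_t
test_dup (uint64_t a, v2du b, v2du *rt)
{
  uint64_t k = K;
  v2du val = { k, k };   /* broadcast, analogous to a vec_duplicate */
  *rt = val + b;
  return a | k;
}
```

Both functions compute the same results as the original `test`; they differ only in which form of the constant is treated as the source. Whether either form is actually cheaper than materializing the constant twice depends on the target, which is why the patch leaves the decision to the cost model.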
||
---|---|---|
c++tools | ||
config | ||
contrib | ||
fixincludes | ||
gcc | ||
gnattools | ||
gotools | ||
include | ||
INSTALL | ||
intl | ||
libada | ||
libatomic | ||
libbacktrace | ||
libcc1 | ||
libcody | ||
libcpp | ||
libdecnumber | ||
libffi | ||
libgcc | ||
libgfortran | ||
libgo | ||
libgomp | ||
libiberty | ||
libitm | ||
libobjc | ||
liboffloadmic | ||
libphobos | ||
libquadmath | ||
libsanitizer | ||
libssp | ||
libstdc++-v3 | ||
libvtv | ||
lto-plugin | ||
maintainer-scripts | ||
zlib | ||
.dir-locals.el | ||
.gitattributes | ||
.gitignore | ||
ABOUT-NLS | ||
ar-lib | ||
ChangeLog | ||
ChangeLog.jit | ||
ChangeLog.tree-ssa | ||
compile | ||
config-ml.in | ||
config.guess | ||
config.rpath | ||
config.sub | ||
configure | ||
configure.ac | ||
COPYING | ||
COPYING.LIB | ||
COPYING.RUNTIME | ||
COPYING3 | ||
COPYING3.LIB | ||
depcomp | ||
install-sh | ||
libtool-ldflags | ||
libtool.m4 | ||
ltgcc.m4 | ||
ltmain.sh | ||
ltoptions.m4 | ||
ltsugar.m4 | ||
ltversion.m4 | ||
lt~obsolete.m4 | ||
MAINTAINERS | ||
Makefile.def | ||
Makefile.in | ||
Makefile.tpl | ||
missing | ||
mkdep | ||
mkinstalldirs | ||
move-if-change | ||
multilib.am | ||
README | ||
symlink-tree | ||
test-driver | ||
ylwrap |
This directory contains the GNU Compiler Collection (GCC).

The GNU Compiler Collection is free software. See the files whose names start with COPYING for copying permission. The manuals, and some of the runtime libraries, are under different terms; see the individual source files for details.

The directory INSTALL contains copies of the installation information as HTML and plain text. The source of this information is gcc/doc/install.texi. The installation information includes details of what is included in the GCC sources and what files GCC installs.

See the file gcc/doc/gcc.texi (together with other files that it includes) for usage and porting information. An online readable version of the manual is in the files gcc/doc/gcc.info*.

See http://gcc.gnu.org/bugs/ for how to report bugs usefully.

Copyright years on GCC source files may be listed using range notation, e.g., 1987-2012, indicating that every year in the range, inclusive, is a copyrightable year that could otherwise be listed individually.