GCC modified for the FreeChainXenon project
Roger Sayle 6a67fdcb3f i386: PR target/112992: Optimize mode for broadcast of constants.
The issue addressed by this patch is that when initializing vectors by
broadcasting integer constants, the compiler has the flexibility to
select the most appropriate vector mode to perform the broadcast, as
long as the resulting vector has an identical bit pattern.
For example, the following constants are all equivalent:
V4SImode {0x01010101, 0x01010101, 0x01010101, 0x01010101 }
V8HImode {0x0101, 0x0101, 0x0101, 0x0101, 0x0101, 0x0101, 0x0101, 0x0101 }
V16QImode {0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, ... 0x01 }
So instruction sequences that construct any of these can be used to
construct the others (with a suitable cast/SUBREG).
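
The equivalence above can be checked outside the compiler. The following is a minimal sketch in plain C, not code from the patch: it stores the three constants as ordinary arrays and compares their bytes. The function name broadcast_patterns_match is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of GCC): returns 1 if the V4SImode,
   V8HImode and V16QImode constants from the text have the same
   128-bit pattern when laid out in memory.  */
int broadcast_patterns_match(void)
{
    const uint32_t v4si[4] = { 0x01010101, 0x01010101,
                               0x01010101, 0x01010101 };
    const uint16_t v8hi[8] = { 0x0101, 0x0101, 0x0101, 0x0101,
                               0x0101, 0x0101, 0x0101, 0x0101 };
    uint8_t v16qi[16];

    /* Sixteen bytes of 0x01.  */
    memset(v16qi, 0x01, sizeof v16qi);

    /* All three objects are exactly 16 bytes wide.  */
    assert(sizeof v4si == 16 && sizeof v8hi == 16);

    return memcmp(v4si, v8hi, 16) == 0
           && memcmp(v8hi, v16qi, 16) == 0;
}
```

Because every byte of the pattern is identical, the comparison does not depend on the byte order of the host.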

On x86_64, it turns out that broadcasts of SImode constants are preferred,
as DImode constants often require a longer movabs instruction, and
HImode and QImode broadcasts require multiple uops on some architectures.
Hence, SImode is always at least tied for the shortest/fastest
implementation.
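
As a rough illustration of the constant arithmetic involved, and not the RTL code in i386-expand.cc, the sketch below narrows a 64-bit broadcast element to 32 bits when both halves agree, and widens 8-bit and 16-bit elements by replication. All three function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: if both 32-bit halves of a 64-bit broadcast
   element are equal, broadcasting that half as a 32-bit element
   yields the same vector.  Returns 1 and stores the half in *out
   on success, 0 otherwise.  */
int narrow_di_to_si(uint64_t elt, uint32_t *out)
{
    uint32_t lo = (uint32_t)elt;
    uint32_t hi = (uint32_t)(elt >> 32);

    if (lo != hi)
        return 0;
    *out = lo;
    return 1;
}

/* Hypothetical helper: replicate an 8-bit element into 16 bits.  */
uint16_t widen_qi_to_hi(uint8_t elt)
{
    return (uint16_t)(((unsigned)elt << 8) | elt);
}

/* Hypothetical helper: replicate a 16-bit element into 32 bits.  */
uint32_t widen_hi_to_si(uint16_t elt)
{
    return ((uint32_t)elt << 16) | elt;
}
```

With these helpers, 0x000c000c000c000c narrows to 0x000c000c, and 0x0c widens in two steps to 0x0c0c0c0c. These are the immediates that appear in the "After" listings in this message.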

Examples of this improvement can be seen in the testsuite.

gcc.target/i386/pr102021.c:
Before:
   0:   48 b8 0c 00 0c 00 0c    movabs $0xc000c000c000c,%rax
   7:   00 0c 00
   a:   62 f2 fd 28 7c c0       vpbroadcastq %rax,%ymm0
  10:   c3                      retq

After:
   0:   b8 0c 00 0c 00          mov    $0xc000c,%eax
   5:   62 f2 7d 28 7c c0       vpbroadcastd %eax,%ymm0
   b:   c3                      retq

and
gcc.target/i386/pr90773-17.c:
Before:
   0:   48 8b 15 00 00 00 00    mov    0x0(%rip),%rdx        # 7 <foo+0x7>
   7:   b8 0c 00 00 00          mov    $0xc,%eax
   c:   62 f2 7d 08 7a c0       vpbroadcastb %eax,%xmm0
  12:   62 f1 7f 08 7f 02       vmovdqu8 %xmm0,(%rdx)
  18:   c7 42 0f 0c 0c 0c 0c    movl   $0xc0c0c0c,0xf(%rdx)
  1f:   c3                      retq

After:
   0:   48 8b 15 00 00 00 00    mov    0x0(%rip),%rdx        # 7 <foo+0x7>
   7:   b8 0c 0c 0c 0c          mov    $0xc0c0c0c,%eax
   c:   62 f2 7d 08 7c c0       vpbroadcastd %eax,%xmm0
  12:   62 f1 7f 08 7f 02       vmovdqu8 %xmm0,(%rdx)
  18:   c7 42 0f 0c 0c 0c 0c    movl   $0xc0c0c0c,0xf(%rdx)
  1f:   c3                      retq

where, according to Agner Fog's instruction tables, vpbroadcastd is
slightly faster than vpbroadcastb on some microarchitectures, for example
Knights Landing.

2024-01-09  Roger Sayle  <roger@nextmovesoftware.com>
	    Hongtao Liu  <hongtao.liu@intel.com>

gcc/ChangeLog
	PR target/112992
	* config/i386/i386-expand.cc
	(ix86_convert_const_wide_int_to_broadcast): Allow call to
	ix86_expand_vector_init_duplicate to fail, and return NULL_RTX.
	(ix86_broadcast_from_constant): Revert recent change; Return a
	suitable MEMREF independently of mode/target combinations.
	(ix86_expand_vector_move): Allow ix86_expand_vector_init_duplicate
	to decide whether expansion is possible/preferable.  Only try
	forcing DImode constants to memory (and trying again) if calling
	ix86_expand_vector_init_duplicate fails with a DImode immediate
	constant.
	(ix86_expand_vector_init_duplicate) <case E_V2DImode>: Try using
	V4SImode for suitable immediate constants.
	<case E_V4DImode>: Try using V8SImode for suitable constants.
	<case E_V4HImode>: Fail for CONST_INT_P, i.e. use constant pool.
	<case E_V2HImode>: Likewise.
	<case E_V8HImode>: For CONST_INT_P try using V4SImode via widen.
	<case E_V16QImode>: For CONST_INT_P try using V8HImode via widen.
	<label widen>: Handle CONST_INTs via simplify_binary_operation.
	Allow recursive calls to ix86_expand_vector_init_duplicate to fail.
	<case E_V16HImode>: For CONST_INT_P try V8SImode via widen.
	<case E_V32QImode>: For CONST_INT_P try V16HImode via widen.
	(ix86_expand_vector_init): Move the attempt to use a broadcast for
	all_same with ix86_expand_vector_init_duplicate before using the
	constant pool.

gcc/testsuite/ChangeLog
	* gcc.target/i386/auto-init-8.c: Update test case.
	* gcc.target/i386/avx512f-broadcast-pr87767-1.c: Likewise.
	* gcc.target/i386/avx512f-broadcast-pr87767-5.c: Likewise.
	* gcc.target/i386/avx512fp16-13.c: Likewise.
	* gcc.target/i386/avx512vl-broadcast-pr87767-1.c: Likewise.
	* gcc.target/i386/avx512vl-broadcast-pr87767-5.c: Likewise.
	* gcc.target/i386/pr100865-1.c: Likewise.
	* gcc.target/i386/pr100865-10a.c: Likewise.
	* gcc.target/i386/pr100865-10b.c: Likewise.
	* gcc.target/i386/pr100865-2.c: Likewise.
	* gcc.target/i386/pr100865-3.c: Likewise.
	* gcc.target/i386/pr100865-4a.c: Likewise.
	* gcc.target/i386/pr100865-4b.c: Likewise.
	* gcc.target/i386/pr100865-5a.c: Likewise.
	* gcc.target/i386/pr100865-5b.c: Likewise.
	* gcc.target/i386/pr100865-9a.c: Likewise.
	* gcc.target/i386/pr100865-9b.c: Likewise.
	* gcc.target/i386/pr102021.c: Likewise.
	* gcc.target/i386/pr90773-17.c: Likewise.
2024-01-09 08:28:42 +00:00
.github Minor formatting fix for newly-added file from previous commit 2023-11-01 19:28:56 -04:00
c++tools Update copyright years. 2024-01-03 12:19:35 +01:00
config Daily bump. 2023-12-01 00:17:36 +00:00
contrib Daily bump. 2024-01-09 00:17:50 +00:00
fixincludes Daily bump. 2023-11-23 00:18:14 +00:00
gcc i386: PR target/112992: Optimize mode for broadcast of constants. 2024-01-09 08:28:42 +00:00
gnattools Update Copyright year in ChangeLog files 2024-01-03 11:35:18 +01:00
gotools Daily bump. 2023-11-04 00:16:45 +00:00
include Update copyright years. 2024-01-03 12:19:35 +01:00
INSTALL
libada Update copyright years. 2024-01-03 12:19:35 +01:00
libatomic Update copyright years. 2024-01-03 12:19:35 +01:00
libbacktrace Update copyright years. 2024-01-05 08:54:28 +01:00
libcc1 Update copyright years. 2024-01-03 12:19:35 +01:00
libcody Update Copyright year in ChangeLog files 2024-01-03 11:35:18 +01:00
libcpp Daily bump. 2024-01-05 00:18:48 +00:00
libdecnumber Update copyright years. 2024-01-03 12:19:35 +01:00
libffi Daily bump. 2023-10-27 00:17:12 +00:00
libgcc Update copyright years. 2024-01-03 12:19:35 +01:00
libgfortran Daily bump. 2024-01-08 00:16:43 +00:00
libgm2 Daily bump. 2024-01-06 00:18:04 +00:00
libgo libgo: update configure.ac to upstream GCC 2023-11-30 13:23:53 -08:00
libgomp Daily bump. 2024-01-09 00:17:50 +00:00
libgrust Daily bump. 2024-01-09 00:17:50 +00:00
libiberty Update copyright years. 2024-01-03 12:19:35 +01:00
libitm Daily bump. 2024-01-04 00:18:45 +00:00
libobjc Update copyright years. 2024-01-03 12:19:35 +01:00
libphobos Update copyright years. 2024-01-05 08:54:28 +01:00
libquadmath Daily bump. 2024-01-04 00:18:45 +00:00
libsanitizer Daily bump. 2024-01-03 00:17:41 +00:00
libssp Update copyright years. 2024-01-03 12:19:35 +01:00
libstdc++-v3 Daily bump. 2024-01-09 00:17:50 +00:00
libvtv Update copyright years. 2024-01-03 12:19:35 +01:00
lto-plugin Update copyright years. 2024-01-03 12:19:35 +01:00
maintainer-scripts Daily bump. 2023-11-14 12:23:39 +00:00
zlib Daily bump. 2023-10-23 00:16:43 +00:00
.dir-locals.el
.gitattributes
.gitignore *: add modern gettext 2023-11-14 00:47:11 +01:00
ABOUT-NLS
ar-lib
ChangeLog Daily bump. 2024-01-09 00:17:50 +00:00
ChangeLog.jit
ChangeLog.tree-ssa
compile
config-ml.in LoongArch: Reimplement multilib build option handling. 2023-09-15 10:42:12 +08:00
config.guess
config.rpath
config.sub
configure build: Add libgrust as compilation modules 2023-12-14 13:58:57 +01:00
configure.ac build: Add libgrust as compilation modules 2023-12-14 13:58:57 +01:00
COPYING
COPYING.LIB
COPYING.RUNTIME
COPYING3
COPYING3.LIB
depcomp
install-sh
libtool-ldflags
libtool.m4 Build: fix error in fixinclude configure 2023-11-22 11:54:33 +01:00
ltgcc.m4
ltmain.sh
ltoptions.m4
ltsugar.m4
ltversion.m4
lt~obsolete.m4
MAINTAINERS MAINTAINERS: Update my email address 2024-01-08 18:52:09 +00:00
Makefile.def build: Add libgrust as compilation modules 2023-12-14 13:58:57 +01:00
Makefile.in build: Add libgrust as compilation modules 2023-12-14 13:58:57 +01:00
Makefile.tpl Makefile.tpl: Avoid race condition in generating site.exp from the top level 2023-11-19 11:07:09 -05:00
missing
mkdep
mkinstalldirs
move-if-change
multilib.am
README
SECURITY.txt secpol: consistent indentation 2023-10-05 12:00:39 -04:00
symlink-tree
test-driver
ylwrap

This directory contains the GNU Compiler Collection (GCC).

The GNU Compiler Collection is free software.  See the files whose
names start with COPYING for copying permission.  The manuals, and
some of the runtime libraries, are under different terms; see the
individual source files for details.

The directory INSTALL contains copies of the installation information
as HTML and plain text.  The source of this information is
gcc/doc/install.texi.  The installation information includes details
of what is included in the GCC sources and what files GCC installs.

See the file gcc/doc/gcc.texi (together with other files that it
includes) for usage and porting information.  An online readable
version of the manual is in the files gcc/doc/gcc.info*.

See http://gcc.gnu.org/bugs/ for how to report bugs usefully.

Copyright years on GCC source files may be listed using range
notation, e.g., 1987-2012, indicating that every year in the range,
inclusive, is a copyrightable year that could otherwise be listed
individually.