c++/modules: use optimized crc32 from zlib

The current implementation of bytes::calc_crc computes the checksum one
byte at a time, which turns out to be quite slow: it accounts for 15% of
stream-in time for a modular Hello World.  We have a crc32_unsigned
version that processes 4 bytes at a time, which we could use here, but
since we bundle zlib we might as well use its highly optimized crc
routines, which can process up to 32 bytes at a time.

So this patch switches this hot code path to zlib's crc32.  This
reduces stream-in time for a modular Hello World by around 15% for me
with a release compiler.
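
For readers unfamiliar with the checksum involved, here is a minimal,
self-contained sketch of a bytewise CRC-32 using the same polynomial and
pre/post conditioning as zlib's crc32.  It is not the code from this
patch and not zlib's implementation (zlib processes many bytes per
step); the names make_crc_table, crc32_bytewise and calc_crc_sketch are
hypothetical, and calc_crc_sketch only mirrors the "skip the first 4
bytes" shape of bytes::calc_crc.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Build the 256-entry lookup table for the reflected CRC-32
// polynomial 0xEDB88320.
static std::array<std::uint32_t, 256>
make_crc_table ()
{
  std::array<std::uint32_t, 256> table {};
  for (std::uint32_t n = 0; n < 256; n++)
    {
      std::uint32_t c = n;
      for (int k = 0; k < 8; k++)
        c = (c & 1) ? 0xEDB88320u ^ (c >> 1) : c >> 1;
      table[n] = c;
    }
  return table;
}

// Hypothetical helper: one table lookup per input byte.  Passing the
// previous result as CRC continues a running checksum; pass 0 to start.
std::uint32_t
crc32_bytewise (std::uint32_t crc, const unsigned char *buf, std::size_t len)
{
  static const std::array<std::uint32_t, 256> table = make_crc_table ();
  crc ^= 0xFFFFFFFFu;
  for (std::size_t i = 0; i < len; i++)
    crc = table[(crc ^ buf[i]) & 0xFF] ^ (crc >> 8);
  return crc ^ 0xFFFFFFFFu;
}

// Hypothetical stand-in for bytes::calc_crc: checksum everything after
// a 4-byte slot at the start of the buffer.  Assumes len >= 4.
std::uint32_t
calc_crc_sketch (const unsigned char *buffer, std::size_t len)
{
  return crc32_bytewise (0, buffer + 4, len - 4);
}
```

Under these assumptions, crc32_bytewise (0, "123456789", 9) yields the
standard CRC-32 check value 0xCBF43926, and feeding the same bytes in
two calls gives the same result as one call.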

gcc/cp/ChangeLog:

	* Make-lang.in (CFLAGS-cp/module.o): Add $(ZLIBINC).
	* module.cc: Include <zlib.h>.
	(bytes::calc_crc): Use crc32 from zlib.
	(bytes_out::set_crc): Use crc32_combine from zlib.

Reviewed-by: Jason Merrill <jason@redhat.com>
Patrick Palka 2024-02-13 14:26:48 -05:00
parent cb76d7e476
commit 0eb9265fe7
2 changed files with 4 additions and 6 deletions

gcc/cp/Make-lang.in

@@ -55,7 +55,7 @@ c++.serial = cc1plus$(exeext)
 CFLAGS-cp/g++spec.o += $(DRIVER_DEFINES)
 CFLAGS-cp/module.o += -DHOST_MACHINE=\"$(host)\" \
-	-DTARGET_MACHINE=\"$(target)\"
+	-DTARGET_MACHINE=\"$(target)\" $(ZLIBINC)
 # In non-release builds, use a date-related module version.
 ifneq ($(DEVPHASE_c),)

gcc/cp/module.cc

@@ -233,6 +233,7 @@ Classes used:
 /* This TU doesn't need or want to see the networking. */
 #define CODY_NETWORKING 0
 #include "mapper-client.h"
+#include <zlib.h> // for crc32, crc32_combine
 #if 0 // 1 for testing no mmap
 #define MAPPED_READING 0
@@ -487,10 +488,7 @@ protected:
 unsigned
 bytes::calc_crc (unsigned l) const
 {
-  unsigned crc = 0;
-  for (size_t ix = 4; ix < l; ix++)
-    crc = crc32_byte (crc, buffer[ix]);
-  return crc;
+  return crc32 (0, (unsigned char *)buffer + 4, l - 4);
 }
 class elf_in;
@@ -717,7 +715,7 @@ bytes_out::set_crc (unsigned *crc_ptr)
 unsigned crc = calc_crc (pos);
 unsigned accum = *crc_ptr;
 /* Only mix the existing *CRC_PTR if it is non-zero. */
-accum = accum ? crc32_unsigned (accum, crc) : crc;
+accum = accum ? crc32_combine (accum, crc, pos - 4) : crc;
 *crc_ptr = accum;
 /* Buffer will be sufficiently aligned. */
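
The set_crc hunk relies on zlib's crc32_combine (crc1, crc2, len2),
which returns the CRC-32 of two concatenated blocks given only their
individual CRCs and the length of the second block.  The sketch below
illustrates that property in a self-contained way.  It is not zlib's
algorithm (zlib avoids the per-byte loop), and the names step,
crc32_simple and crc32_combine_slow are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Advance a raw CRC register by one input byte, bit by bit, using the
// reflected polynomial 0xEDB88320.
static std::uint32_t
step (std::uint32_t reg, unsigned char byte)
{
  reg ^= byte;
  for (int k = 0; k < 8; k++)
    reg = (reg & 1) ? 0xEDB88320u ^ (reg >> 1) : reg >> 1;
  return reg;
}

// Hypothetical helper: CRC-32 of a whole buffer, with the usual
// all-ones initial value and final inversion.
std::uint32_t
crc32_simple (const unsigned char *buf, std::size_t len)
{
  std::uint32_t reg = 0xFFFFFFFFu;
  for (std::size_t i = 0; i < len; i++)
    reg = step (reg, buf[i]);
  return reg ^ 0xFFFFFFFFu;
}

// Hypothetical helper: CRC of A followed by B, from CRC_A, CRC_B and
// the length of B.  Because the register update is linear over GF(2),
// pushing LEN_B zero bytes through a register seeded with CRC_A and
// then XORing in CRC_B gives the CRC of the concatenation.  This takes
// time linear in LEN_B.
std::uint32_t
crc32_combine_slow (std::uint32_t crc_a, std::uint32_t crc_b,
                    std::size_t len_b)
{
  std::uint32_t reg = crc_a;
  for (std::size_t i = 0; i < len_b; i++)
    reg = step (reg, 0);
  return reg ^ crc_b;
}
```

With these helpers, combining the CRCs of "1234" and "56789" (second
length 5) matches crc32_simple over "123456789".  In the patch, pos - 4
plays the role of the second block's length, since calc_crc skips the
first 4 bytes of the buffer.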