i386.c (ix86_expand_prologue): Optimize stack checking for leaf functions without dynamic stack allocation.

* config/i386/i386.c (ix86_expand_prologue): Optimize stack checking for
	leaf functions without dynamic stack allocation.
	* config/ia64/ia64.c (ia64_emit_probe_stack_range): Adjust.
	(ia64_expand_prologue): Likewise.
	* config/mips/mips.c (mips_expand_prologue): Likewise.
	* config/rs6000/rs6000.c (rs6000_emit_prologue): Likewise.
	* config/sparc/sparc.c (sparc_expand_prologue): Likewise.
	(sparc_flat_expand_prologue): Likewise.

From-SVN: r204450
This commit is contained in:
Eric Botcazou 2013-11-06 10:55:13 +00:00
parent f054ff5b7c
commit 0dca9cd86c
6 changed files with 165 additions and 89 deletions

View file

@ -1,3 +1,14 @@
2013-11-06 Eric Botcazou <ebotcazou@adacore.com>
* config/i386/i386.c (ix86_expand_prologue): Optimize stack checking for
leaf functions without dynamic stack allocation.
* config/ia64/ia64.c (ia64_emit_probe_stack_range): Adjust.
(ia64_expand_prologue): Likewise.
* config/mips/mips.c (mips_expand_prologue): Likewise.
* config/rs6000/rs6000.c (rs6000_emit_prologue): Likewise.
* config/sparc/sparc.c (sparc_expand_prologue): Likewise.
(sparc_flat_expand_prologue): Likewise.
2013-11-06 James Greenhalgh <james.greenhalgh@arm.com> 2013-11-06 James Greenhalgh <james.greenhalgh@arm.com>
* config/aarch64/arm_neon.h * config/aarch64/arm_neon.h
@ -28,12 +39,12 @@
2013-11-06 Christian Bruel <christian.bruel@st.com> 2013-11-06 Christian Bruel <christian.bruel@st.com>
* gcc/config/sh/sh-mem.cc (sh_expand_cmpnstr, sh_expand_cmpstr): * config/sh/sh-mem.cc (sh_expand_cmpnstr, sh_expand_cmpstr):
Factorize probabilities, Use adjust_address instead of Factorize probabilities, Use adjust_address instead of
adjust_automodify_address when possible. Enable for optimize. adjust_automodify_address when possible. Enable for optimize.
(sh_expand_strlen): New function. (sh_expand_strlen): New function.
* gcc/config/sh/sh-protos.h (sh_expand_strlen): Declare. * config/sh/sh-protos.h (sh_expand_strlen): Declare.
* gcc/config/sh/sh.md (strlensi): New pattern. * config/sh/sh.md (strlensi): New pattern.
(UNSPEC_BUILTIN_STRLEN): Define. (UNSPEC_BUILTIN_STRLEN): Define.
2013-11-06 Jakub Jelinek <jakub@redhat.com> 2013-11-06 Jakub Jelinek <jakub@redhat.com>
@ -305,19 +316,19 @@
2013-11-04 Wei Mi <wmi@google.com> 2013-11-04 Wei Mi <wmi@google.com>
* gcc/config/i386/i386.c (memory_address_length): Extract a part * config/i386/i386.c (memory_address_length): Extract a part
of code to rip_relative_addr_p. of code to rip_relative_addr_p.
(rip_relative_addr_p): New Function. (rip_relative_addr_p): New Function.
(ix86_macro_fusion_p): Ditto. (ix86_macro_fusion_p): Ditto.
(ix86_macro_fusion_pair_p): Ditto. (ix86_macro_fusion_pair_p): Ditto.
* gcc/config/i386/i386.h: Add new tune features about macro-fusion. * config/i386/i386.h: Add new tune features about macro-fusion.
* gcc/config/i386/x86-tune.def (DEF_TUNE): Ditto. * config/i386/x86-tune.def (DEF_TUNE): Ditto.
* gcc/doc/tm.texi: Generated. * doc/tm.texi: Generated.
* gcc/doc/tm.texi.in: Ditto. * doc/tm.texi.in: Ditto.
* gcc/haifa-sched.c (try_group_insn): New Function. * haifa-sched.c (try_group_insn): New Function.
(group_insns_for_macro_fusion): Ditto. (group_insns_for_macro_fusion): Ditto.
(sched_init): Call group_insns_for_macro_fusion. (sched_init): Call group_insns_for_macro_fusion.
* gcc/target.def: Add two hooks: macro_fusion_p and * target.def: Add two hooks: macro_fusion_p and
macro_fusion_pair_p. macro_fusion_pair_p.
2013-11-04 Kostya Serebryany <kcc@google.com> 2013-11-04 Kostya Serebryany <kcc@google.com>
@ -337,17 +348,17 @@
2013-11-04 Wei Mi <wmi@google.com> 2013-11-04 Wei Mi <wmi@google.com>
* gcc/config/i386/i386-c.c (ix86_target_macros_internal): Separate * config/i386/i386-c.c (ix86_target_macros_internal): Separate
PROCESSOR_COREI7_AVX out from PROCESSOR_COREI7. PROCESSOR_COREI7_AVX out from PROCESSOR_COREI7.
* gcc/config/i386/i386.c (ix86_option_override_internal): Ditto. * config/i386/i386.c (ix86_option_override_internal): Ditto.
(ix86_issue_rate): Ditto. (ix86_issue_rate): Ditto.
(ix86_adjust_cost): Ditto. (ix86_adjust_cost): Ditto.
(ia32_multipass_dfa_lookahead): Ditto. (ia32_multipass_dfa_lookahead): Ditto.
(ix86_sched_init_global): Ditto. (ix86_sched_init_global): Ditto.
(get_builtin_code_for_version): Ditto. (get_builtin_code_for_version): Ditto.
* gcc/config/i386/i386.h (enum target_cpu_default): Ditto. * config/i386/i386.h (enum target_cpu_default): Ditto.
(enum processor_type): Ditto. (enum processor_type): Ditto.
* gcc/config/i386/x86-tune.def (DEF_TUNE): Ditto. * config/i386/x86-tune.def (DEF_TUNE): Ditto.
2013-11-04 Vladimir Makarov <vmakarov@redhat.com> 2013-11-04 Vladimir Makarov <vmakarov@redhat.com>
@ -903,7 +914,7 @@
2013-10-30 Tobias Burnus <burnus@net-b.de> 2013-10-30 Tobias Burnus <burnus@net-b.de>
PR other/33426 PR other/33426
* gcc/tree-cfg.c (replace_loop_annotate): Replace warning by * tree-cfg.c (replace_loop_annotate): Replace warning by
warning_at. warning_at.
2013-10-30 Jason Merrill <jason@redhat.com> 2013-10-30 Jason Merrill <jason@redhat.com>
@ -1024,10 +1035,10 @@
2013-10-30 Christian Bruel <christian.bruel@st.com> 2013-10-30 Christian Bruel <christian.bruel@st.com>
* gcc/config/sh/sh-mem.cc (sh_expand_cmpnstr): New function. * config/sh/sh-mem.cc (sh_expand_cmpnstr): New function.
(sh_expand_cmpstr): Handle known align and schedule improvements. (sh_expand_cmpstr): Handle known align and schedule improvements.
* gcc/config/sh/sh-protos.h (sh_expand_cmpstrn): Declare. * config/sh/sh-protos.h (sh_expand_cmpstrn): Declare.
* gcc/config/sh/sh.md (cmpstrnsi): New pattern. * config/sh/sh.md (cmpstrnsi): New pattern.
2013-10-30 Martin Jambor <mjambor@suse.cz> 2013-10-30 Martin Jambor <mjambor@suse.cz>
@ -2303,7 +2314,7 @@
2013-10-24 Joern Rennecke <joern.rennecke@embecosm.com> 2013-10-24 Joern Rennecke <joern.rennecke@embecosm.com>
* gcc/config/arc/arc.c (arc_ccfsm_post_advance): Also handle * config/arc/arc.c (arc_ccfsm_post_advance): Also handle
TYPE_UNCOND_BRANCH. TYPE_UNCOND_BRANCH.
(arc_ifcvt) <case 1 and 2>: Check that arc_ccfsm_post_advance (arc_ifcvt) <case 1 and 2>: Check that arc_ccfsm_post_advance
changes statep->state. changes statep->state.
@ -2335,12 +2346,12 @@
2013-10-25 Christian Bruel <christian.bruel@st.com> 2013-10-25 Christian Bruel <christian.bruel@st.com>
* config.gcc (sh-*): Add sh-mem.o to extra_obj. * config.gcc (sh-*): Add sh-mem.o to extra_obj.
* gcc/config/sh/t-sh (sh-mem.o): New rule. * config/sh/t-sh (sh-mem.o): New rule.
* gcc/config/sh/sh-mem.cc (expand_block_move): Moved here. * config/sh/sh-mem.cc (expand_block_move): Moved here.
(sh_expand_cmpstr): New function. (sh_expand_cmpstr): New function.
* gcc/config/sh/sh.c (force_into, expand_block_move): Move to sh-mem.c. * config/sh/sh.c (force_into, expand_block_move): Move to sh-mem.c.
* gcc/config/sh/sh-protos.h (sh_expand_cmpstr): Declare. * config/sh/sh-protos.h (sh_expand_cmpstr): Declare.
* gcc/config/sh/sh.md (cmpstrsi, cmpstr_t): New patterns. * config/sh/sh.md (cmpstrsi, cmpstr_t): New patterns.
(rotlhi3_8): Rename. (rotlhi3_8): Rename.
2013-10-24 Jan-Benedict Glaw <jbglaw@lug-owl.de> 2013-10-24 Jan-Benedict Glaw <jbglaw@lug-owl.de>
@ -3184,7 +3195,7 @@
2013-10-16 Bill Schmidt <wschmidt@linux.vnet.ibm.com> 2013-10-16 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
* gcc/config/rs6000/vector.md (vec_unpacks_hi_v4sf): Correct for * config/rs6000/vector.md (vec_unpacks_hi_v4sf): Correct for
endianness. endianness.
(vec_unpacks_lo_v4sf): Likewise. (vec_unpacks_lo_v4sf): Likewise.
(vec_unpacks_float_hi_v4si): Likewise. (vec_unpacks_float_hi_v4si): Likewise.
@ -3970,8 +3981,8 @@
(anddi3_insn): Update type attribute. (anddi3_insn): Update type attribute.
(xordi3_insn): Likewise. (xordi3_insn): Likewise.
(one_cmpldi2): Likewise. (one_cmpldi2): Likewise.
* gcc/config/arm/vfp.md (movhf_vfp_neon): Update type attribute. * config/arm/vfp.md (movhf_vfp_neon): Update type attribute.
* gcc/config/arm/neon.md (neon_mov): Update type attribute. * config/arm/neon.md (neon_mov): Update type attribute.
(*movmisalign<mode>_neon_store): Likewise. (*movmisalign<mode>_neon_store): Likewise.
(*movmisalign<mode>_neon_load): Likewise. (*movmisalign<mode>_neon_load): Likewise.
(vec_set<mode>_internal): Likewise. (vec_set<mode>_internal): Likewise.

View file

@ -10657,8 +10657,12 @@ ix86_expand_prologue (void)
if (STACK_CHECK_MOVING_SP) if (STACK_CHECK_MOVING_SP)
{ {
ix86_adjust_stack_and_probe (allocate); if (!(crtl->is_leaf && !cfun->calls_alloca
allocate = 0; && allocate <= PROBE_INTERVAL))
{
ix86_adjust_stack_and_probe (allocate);
allocate = 0;
}
} }
else else
{ {
@ -10668,9 +10672,26 @@ ix86_expand_prologue (void)
size = 0x80000000 - STACK_CHECK_PROTECT - 1; size = 0x80000000 - STACK_CHECK_PROTECT - 1;
if (TARGET_STACK_PROBE) if (TARGET_STACK_PROBE)
ix86_emit_probe_stack_range (0, size + STACK_CHECK_PROTECT); {
if (crtl->is_leaf && !cfun->calls_alloca)
{
if (size > PROBE_INTERVAL)
ix86_emit_probe_stack_range (0, size);
}
else
ix86_emit_probe_stack_range (0, size + STACK_CHECK_PROTECT);
}
else else
ix86_emit_probe_stack_range (STACK_CHECK_PROTECT, size); {
if (crtl->is_leaf && !cfun->calls_alloca)
{
if (size > PROBE_INTERVAL && size > STACK_CHECK_PROTECT)
ix86_emit_probe_stack_range (STACK_CHECK_PROTECT,
size - STACK_CHECK_PROTECT);
}
else
ix86_emit_probe_stack_range (STACK_CHECK_PROTECT, size);
}
} }
} }

View file

@ -3206,61 +3206,54 @@ gen_fr_restore_x (rtx dest, rtx src, rtx offset ATTRIBUTE_UNUSED)
#define BACKING_STORE_SIZE(N) ((N) > 0 ? ((N) + (N)/63 + 1) * 8 : 0) #define BACKING_STORE_SIZE(N) ((N) > 0 ? ((N) + (N)/63 + 1) * 8 : 0)
/* Emit code to probe a range of stack addresses from FIRST to FIRST+SIZE, /* Emit code to probe a range of stack addresses from FIRST to FIRST+SIZE,
inclusive. These are offsets from the current stack pointer. SOL is the inclusive. These are offsets from the current stack pointer. BS_SIZE
size of local registers. ??? This clobbers r2 and r3. */ is the size of the backing store. ??? This clobbers r2 and r3. */
static void static void
ia64_emit_probe_stack_range (HOST_WIDE_INT first, HOST_WIDE_INT size, int sol) ia64_emit_probe_stack_range (HOST_WIDE_INT first, HOST_WIDE_INT size,
int bs_size)
{ {
/* On the IA-64 there is a second stack in memory, namely the Backing Store
of the Register Stack Engine. We also need to probe it after checking
that the 2 stacks don't overlap. */
const int bs_size = BACKING_STORE_SIZE (sol);
rtx r2 = gen_rtx_REG (Pmode, GR_REG (2)); rtx r2 = gen_rtx_REG (Pmode, GR_REG (2));
rtx r3 = gen_rtx_REG (Pmode, GR_REG (3)); rtx r3 = gen_rtx_REG (Pmode, GR_REG (3));
rtx p6 = gen_rtx_REG (BImode, PR_REG (6));
/* Detect collision of the 2 stacks if necessary. */ /* On the IA-64 there is a second stack in memory, namely the Backing Store
if (bs_size > 0 || size > 0) of the Register Stack Engine. We also need to probe it after checking
{ that the 2 stacks don't overlap. */
rtx p6 = gen_rtx_REG (BImode, PR_REG (6)); emit_insn (gen_bsp_value (r3));
emit_move_insn (r2, GEN_INT (-(first + size)));
emit_insn (gen_bsp_value (r3)); /* Compare current value of BSP and SP registers. */
emit_move_insn (r2, GEN_INT (-(first + size))); emit_insn (gen_rtx_SET (VOIDmode, p6,
gen_rtx_fmt_ee (LTU, BImode,
r3, stack_pointer_rtx)));
/* Compare current value of BSP and SP registers. */ /* Compute the address of the probe for the Backing Store (which grows
emit_insn (gen_rtx_SET (VOIDmode, p6, towards higher addresses). We probe only at the first offset of
gen_rtx_fmt_ee (LTU, BImode, the next page because some OS (eg Linux/ia64) only extend the
r3, stack_pointer_rtx))); backing store when this specific address is hit (but generate a SEGV
on other address). Page size is the worst case (4KB). The reserve
size is at least 4096 - (96 + 2) * 8 = 3312 bytes, which is enough.
Also compute the address of the last probe for the memory stack
(which grows towards lower addresses). */
emit_insn (gen_rtx_SET (VOIDmode, r3, plus_constant (Pmode, r3, 4095)));
emit_insn (gen_rtx_SET (VOIDmode, r2,
gen_rtx_PLUS (Pmode, stack_pointer_rtx, r2)));
/* Compute the address of the probe for the Backing Store (which grows /* Compare them and raise SEGV if the former has topped the latter. */
towards higher addresses). We probe only at the first offset of emit_insn (gen_rtx_COND_EXEC (VOIDmode,
the next page because some OS (eg Linux/ia64) only extend the gen_rtx_fmt_ee (NE, VOIDmode, p6, const0_rtx),
backing store when this specific address is hit (but generate a SEGV gen_rtx_SET (VOIDmode, p6,
on other address). Page size is the worst case (4KB). The reserve gen_rtx_fmt_ee (GEU, BImode,
size is at least 4096 - (96 + 2) * 8 = 3312 bytes, which is enough. r3, r2))));
Also compute the address of the last probe for the memory stack emit_insn (gen_rtx_SET (VOIDmode,
(which grows towards lower addresses). */ gen_rtx_ZERO_EXTRACT (DImode, r3, GEN_INT (12),
emit_insn (gen_rtx_SET (VOIDmode, r3, plus_constant (Pmode, r3, 4095))); const0_rtx),
emit_insn (gen_rtx_SET (VOIDmode, r2, const0_rtx));
gen_rtx_PLUS (Pmode, stack_pointer_rtx, r2))); emit_insn (gen_rtx_COND_EXEC (VOIDmode,
gen_rtx_fmt_ee (NE, VOIDmode, p6, const0_rtx),
/* Compare them and raise SEGV if the former has topped the latter. */ gen_rtx_TRAP_IF (VOIDmode, const1_rtx,
emit_insn (gen_rtx_COND_EXEC (VOIDmode, GEN_INT (11))));
gen_rtx_fmt_ee (NE, VOIDmode, p6,
const0_rtx),
gen_rtx_SET (VOIDmode, p6,
gen_rtx_fmt_ee (GEU, BImode,
r3, r2))));
emit_insn (gen_rtx_SET (VOIDmode,
gen_rtx_ZERO_EXTRACT (DImode, r3, GEN_INT (12),
const0_rtx),
const0_rtx));
emit_insn (gen_rtx_COND_EXEC (VOIDmode,
gen_rtx_fmt_ee (NE, VOIDmode, p6,
const0_rtx),
gen_rtx_TRAP_IF (VOIDmode, const1_rtx,
GEN_INT (11))));
}
/* Probe the Backing Store if necessary. */ /* Probe the Backing Store if necessary. */
if (bs_size > 0) if (bs_size > 0)
@ -3444,10 +3437,23 @@ ia64_expand_prologue (void)
current_function_static_stack_size = current_frame_info.total_size; current_function_static_stack_size = current_frame_info.total_size;
if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK) if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK)
ia64_emit_probe_stack_range (STACK_CHECK_PROTECT, {
current_frame_info.total_size, HOST_WIDE_INT size = current_frame_info.total_size;
current_frame_info.n_input_regs int bs_size = BACKING_STORE_SIZE (current_frame_info.n_input_regs
+ current_frame_info.n_local_regs); + current_frame_info.n_local_regs);
if (crtl->is_leaf && !cfun->calls_alloca)
{
if (size > PROBE_INTERVAL && size > STACK_CHECK_PROTECT)
ia64_emit_probe_stack_range (STACK_CHECK_PROTECT,
size - STACK_CHECK_PROTECT,
bs_size);
else if (size + bs_size > STACK_CHECK_PROTECT)
ia64_emit_probe_stack_range (STACK_CHECK_PROTECT, 0, bs_size);
}
else if (size + bs_size > 0)
ia64_emit_probe_stack_range (STACK_CHECK_PROTECT, size, bs_size);
}
if (dump_file) if (dump_file)
{ {

View file

@ -10994,8 +10994,17 @@ mips_expand_prologue (void)
if (flag_stack_usage_info) if (flag_stack_usage_info)
current_function_static_stack_size = size; current_function_static_stack_size = size;
if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK && size) if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK)
mips_emit_probe_stack_range (STACK_CHECK_PROTECT, size); {
if (crtl->is_leaf && !cfun->calls_alloca)
{
if (size > PROBE_INTERVAL && size > STACK_CHECK_PROTECT)
mips_emit_probe_stack_range (STACK_CHECK_PROTECT,
size - STACK_CHECK_PROTECT);
}
else if (size > 0)
mips_emit_probe_stack_range (STACK_CHECK_PROTECT, size);
}
/* Save the registers. Allocate up to MIPS_MAX_FIRST_STACK_STEP /* Save the registers. Allocate up to MIPS_MAX_FIRST_STACK_STEP
bytes beforehand; this is enough to cover the register save area bytes beforehand; this is enough to cover the register save area

View file

@ -21538,8 +21538,19 @@ rs6000_emit_prologue (void)
if (flag_stack_usage_info) if (flag_stack_usage_info)
current_function_static_stack_size = info->total_size; current_function_static_stack_size = info->total_size;
if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK && info->total_size) if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK)
rs6000_emit_probe_stack_range (STACK_CHECK_PROTECT, info->total_size); {
HOST_WIDE_INT size = info->total_size;
if (crtl->is_leaf && !cfun->calls_alloca)
{
if (size > PROBE_INTERVAL && size > STACK_CHECK_PROTECT)
rs6000_emit_probe_stack_range (STACK_CHECK_PROTECT,
size - STACK_CHECK_PROTECT);
}
else if (size > 0)
rs6000_emit_probe_stack_range (STACK_CHECK_PROTECT, size);
}
if (TARGET_FIX_AND_CONTINUE) if (TARGET_FIX_AND_CONTINUE)
{ {

View file

@ -5362,8 +5362,17 @@ sparc_expand_prologue (void)
if (flag_stack_usage_info) if (flag_stack_usage_info)
current_function_static_stack_size = size; current_function_static_stack_size = size;
if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK && size) if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK)
sparc_emit_probe_stack_range (STACK_CHECK_PROTECT, size); {
if (crtl->is_leaf && !cfun->calls_alloca)
{
if (size > PROBE_INTERVAL && size > STACK_CHECK_PROTECT)
sparc_emit_probe_stack_range (STACK_CHECK_PROTECT,
size - STACK_CHECK_PROTECT);
}
else if (size > 0)
sparc_emit_probe_stack_range (STACK_CHECK_PROTECT, size);
}
if (size == 0) if (size == 0)
; /* do nothing. */ ; /* do nothing. */
@ -5464,8 +5473,17 @@ sparc_flat_expand_prologue (void)
if (flag_stack_usage_info) if (flag_stack_usage_info)
current_function_static_stack_size = size; current_function_static_stack_size = size;
if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK && size) if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK)
sparc_emit_probe_stack_range (STACK_CHECK_PROTECT, size); {
if (crtl->is_leaf && !cfun->calls_alloca)
{
if (size > PROBE_INTERVAL && size > STACK_CHECK_PROTECT)
sparc_emit_probe_stack_range (STACK_CHECK_PROTECT,
size - STACK_CHECK_PROTECT);
}
else if (size > 0)
sparc_emit_probe_stack_range (STACK_CHECK_PROTECT, size);
}
if (sparc_save_local_in_regs_p) if (sparc_save_local_in_regs_p)
emit_save_or_restore_local_in_regs (stack_pointer_rtx, SPARC_STACK_BIAS, emit_save_or_restore_local_in_regs (stack_pointer_rtx, SPARC_STACK_BIAS,