PR target/86048
* config/i386/winnt.c (i386_pe_seh_cold_init): Do not emit negative
offsets for register save directives. Emit a second batch of save
directives, if need be, when the function accesses prior frames.
From-SVN: r261544
Accept at most a single constant for fma patterns.
gcc/
2018-03-21 Claudiu Zissulescu <claziss@synopsys.com>
* config/arc/fpu.md (fmasf4): Force operand to register.
(fnmasf4): Likewise.
gcc/testsuite
2018-03-21 Claudiu Zissulescu <claziss@synopsys.com>
* gcc.target/arc/fma-1.c: New test.
From-SVN: r261543
For ARC700, add padding if necessary to avoid a mispredict. A
return could happen immediately after the function start. A
call/return or return/return pair must be at least 6 bytes apart to
avoid a mispredict.
The old implementation did this very late in the compilation
process, so the additional nop instructions and/or the forcing of
some other instruction to take its long form were not taken into
account when generating brcc instructions. Thus, wrong code could
be generated.
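The 6-byte rule above can be sketched as a small calculation. This is only an illustration of the arithmetic, not the code in arc.c: the helper name return_padding_bytes and the constant MIN_RETURN_DISTANCE are hypothetical, and the sketch assumes the distance between the two instructions is already known in bytes.

```c
#include <stddef.h>

/* Minimum distance, in bytes, between a call/return or return/return
   pair on ARC700, as stated in the commit message.  */
#define MIN_RETURN_DISTANCE 6

/* Hypothetical helper: given the current distance in bytes between the
   two instructions, return how many bytes of padding are still needed.
   How the padding is realised (nops, long instruction forms) is not
   modelled here.  */
size_t
return_padding_bytes (size_t distance)
{
  return distance >= MIN_RETURN_DISTANCE
	 ? 0
	 : MIN_RETURN_DISTANCE - distance;
}
```

Doing this calculation before branch lengths are finalised is what lets the extra bytes be counted when brcc instructions are generated.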
gcc/
2017-03-24 Claudiu Zissulescu <claziss@synopsys.com>
* config/arc/arc-protos.h (arc_pad_return): Remove.
* config/arc/arc.c (machine_function): Remove force_short_suffix
and size_reason.
(arc_print_operand): Adjust printing of '&'.
(arc_verify_short): Remove conditional printing of short suffix.
(arc_final_prescan_insn): Remove reference to size_reason.
(pad_return): New function.
(arc_reorg): Call pad_return.
(arc_pad_return): Remove.
(arc_init_machine_status): Remove reference to force_short_suffix.
* config/arc/arc.md (vunspec): Add VUNSPEC_ARC_BLOCKAGE.
(attr length): When attribute iscompact is true, force the length to
2 regardless; when it is maybe, check whether we want to force the
instruction to be 4 bytes long.
(nopv): Change it to generate a 4-byte nop as well.
(blockage): New pattern.
(simple_return): Remove call to arc_pad_return.
(p_return_i): Likewise.
gcc/testsuite/
2017-03-24 Claudiu Zissulescu <claziss@synopsys.com>
* gcc.target/arc/pr9001107555.c: New file.
From-SVN: r261542
gcc/
2017-05-02 Claudiu Zissulescu <claziss@synopsys.com>
* config/arc/arc.c (atomic_exchangesi): EX instruction is default
for ARC700 and ARCv2.
From-SVN: r261539
This patch fixes an LRA cycling problem on the attached testcase.
The original insn was:
(insn 74 72 76 8 (set (reg:V2DI 287 [ _166 ])
(subreg:V2DI (reg/v/f:DI 112 [ d ]) 0)) 1060 {*aarch64_simd_movv2di}
(nil))
which IRA converted to:
(insn 74 72 580 8 (set (reg:V2DI 287 [ _166 ])
(subreg:V2DI (reg/v/f:DI 517 [orig:112 d ] [112]) 0)) 1060 {*aarch64_simd_movv2di}
(nil))
after creating loop allocnos. It happens that the ALLOCNO_WMODEs for
both 112 and 517 were not set to V2DI due to another bug that I'll post
a separate patch for, but we nevertheless got a valid allocation of
register 1.
LRA's first try at constraining the instruction gave:
Choosing alt 5 in insn 74: (0) ?w (1) r {*aarch64_simd_movv2di}
at which point all was good. But LRA later decided it needed
to spill r517:
Spill r517 after risky transformations
so the next constraint attempt gave:
Choosing alt 0 in insn 74: (0) =w (1) m {*aarch64_simd_movv2di}
which was still good. Then during inheritance we had:
Creating newreg=672 from oldreg=517, assigning class GENERAL_REGS to inheritance r672
Original reg change 517->672 (bb8):
74: r287:V2DI=r672:DI#0
Add inheritance<-original before:
939: r672:DI=r517:DI
Inheritance reuse change 517->672 (bb8):
620: r572:DI=r672:DI
REG_DEAD r672:DI
Use smallest class of POINTER_REGS and GENERAL_REGS
Creating newreg=673 from oldreg=517, assigning class POINTER_REGS to inheritance r673
Original reg change 517->673 (bb8):
936: r669:DI=r673:DI
Add inheritance<-original before:
940: r673:DI=r517:DI
("Use smallest class of POINTER_REGS and GENERAL_REGS" ought to
give GENERAL_REGS. That might be a missed optimisation, and probably
due to both classes having the same number of allocatable registers.
I'll look at that as a follow-on.)
Thus LRA created two inheritance registers for r517, one (r673)
that included the unallocatable x31 and another (r672) that didn't.
The r672 references included the paradoxical subreg in insn 74 but the
r673 ones didn't. LRA then allocated x30 to r673, which was a valid
choice.
Later LRA decided to "undo" the inheritance for insn 620, but because
of the double inheritance, it got confused as to what the original
situation was, and made insn 74 use the other inheritance register
instead of r517:
********** Undoing inheritance #2: **********
Inherit 11 out of 12 (91.67%)
Insn after restoring regs:
620: r572:DI=r517:DI
REG_DEAD r517:DI
Change reload insn:
74: r287:V2DI=r673:DI#0 <-------------------
Insn after restoring regs:
939: r517:DI=r673:DI
REG_DEAD r673:DI
This might be a bug in itself: we should probably look through sets
of other inheritance pseudos to find the "real" origin.
Either way, at this point we had a situation in which r673 was used in an
insn whose subreg was larger than the biggest_mode that r673 had when it
was allocated. While x30 was valid for the original biggest_mode, it
wasn't valid for this subreg use.
The next attempt to constrain insn 74 was:
Choosing alt 5 in insn 74: (0) ?w (1) r {*aarch64_simd_movv2di}
Creating newreg=684, assigning class GENERAL_REGS to r684
74: r287:V2DI=r684:V2DI
Inserting insn reload before:
951: r684:V2DI=r673:DI#0
where LRA reloaded the SUBREG rather than the SUBREG_REG. And it
then cycled trying the same thing when reloading the reload (and the
reload of the reload, etc.).
What it should be doing here is reloading the SUBREG_REG instead.
There's already code to cope with this case when the paradoxical
subreg falls outside the class (which isn't true here, since r673
is POINTER_REGS and POINTER_REGS includes x31). But I think we
should also test whether LRA is entitled to allocate the spanned
registers. Not doing that seems like a bug regardless of the above
missed optimisation and the mix-up undoing inheritance.
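The extra test described above can be modelled in a few lines. This is a simplified sketch, not the code in lra-constraints.c: the function span_is_allocatable and the bitmask representation are assumptions made for illustration, with bit N of the mask set when hard register N may be allocated.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the check: return true if every hard register
   in [FIRST_REG, FIRST_REG + NREGS) may be allocated.  Bit N of
   ALLOCATABLE is set when hard register N is allocatable.  */
bool
span_is_allocatable (uint64_t allocatable, unsigned first_reg,
		     unsigned nregs)
{
  for (unsigned i = 0; i < nregs; i++)
    {
      unsigned regno = first_reg + i;
      if (regno >= 64 || !(allocatable & (UINT64_C (1) << regno)))
	return false;
    }
  return true;
}
```

With x0 to x30 allocatable and x31 not, a one-register value in x30 passes the test, but a paradoxical subreg spanning x30 and x31 fails it, which is the situation insn 74 ended up in.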
2018-05-30 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* lra-constraints.c (simplify_operand_subreg): In the paradoxical
case, check whether the outer register overlaps an unallocatable
register, not just whether it fits the required class.
gcc/testsuite/
* g++.dg/torture/aarch64-vect-init-1.C: New test.
From-SVN: r261531
This patch generalises various places that used hwi rtx accessors so
that they can handle poly_ints instead. In many cases these changes
are by inspection rather than because something had shown them to be
necessary.
2018-06-12 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* poly-int.h (can_div_trunc_p): Add new overload in which all values
are poly_ints.
* alias.c (get_addr): Extend CONST_INT handling to poly_int_rtx_p.
(memrefs_conflict_p): Likewise.
(init_alias_analysis): Likewise.
* cfgexpand.c (expand_debug_expr): Likewise.
* combine.c (combine_simplify_rtx, force_int_to_mode): Likewise.
* cse.c (fold_rtx): Likewise.
* explow.c (adjust_stack, anti_adjust_stack): Likewise.
* expr.c (emit_block_move_hints): Likewise.
(clear_storage_hints, push_block, emit_push_insn): Likewise.
(store_expr_with_bounds, reduce_to_bit_field_precision): Likewise.
(emit_group_load_1): Use rtx_to_poly_int64 for group offsets.
(emit_group_store): Likewise.
(find_args_size_adjust): Use strip_offset. Use rtx_to_poly_int64
to read the PRE/POST_MODIFY increment.
* calls.c (store_one_arg): Use strip_offset.
* rtlanal.c (rtx_addr_can_trap_p_1): Extend CONST_INT handling to
poly_int_rtx_p.
(set_noop_p): Use rtx_to_poly_int64 for the elements selected
by a VEC_SELECT.
* simplify-rtx.c (avoid_constant_pool_reference): Use strip_offset.
(simplify_binary_operation_1): Extend CONST_INT handling to
poly_int_rtx_p.
* var-tracking.c (compute_cfa_pointer): Take a poly_int64 rather
than a HOST_WIDE_INT.
(hard_frame_pointer_adjustment): Change from HOST_WIDE_INT to
poly_int64.
(adjust_mems, add_stores): Update accordingly.
(vt_canonicalize_addr): Track polynomial offsets.
(emit_note_insn_var_location): Likewise.
(vt_add_function_parameter): Likewise.
(vt_initialize): Likewise.
From-SVN: r261530
Core issue 1331 - const mismatch with defaulted copy constructor
* class.c (check_bases_and_members): When checking a defaulted
function, mark it as deleted rather than giving an error.
* g++.dg/cpp0x/defaulted15.C (struct F): Remove dg-error.
* g++.dg/cpp0x/defaulted52.C: New test.
* g++.dg/cpp0x/defaulted53.C: New test.
* g++.dg/cpp0x/defaulted54.C: New test.
* g++.dg/cpp0x/defaulted55.C: New test.
* g++.dg/cpp0x/defaulted56.C: New test.
* g++.dg/cpp0x/defaulted57.C: New test.
* g++.dg/cpp0x/defaulted58.C: New test.
* g++.dg/cpp0x/defaulted59.C: New test.
* g++.dg/cpp0x/defaulted60.C: New test.
From-SVN: r261526
[testsuite]
2018-06-12 Will Schmidt <will_schmidt@vnet.ibm.com>
* gcc.target/powerpc/fold-vec-load-vec_xl-char.c: New testcase.
* gcc.target/powerpc/fold-vec-load-vec_xl-double.c: New testcase.
* gcc.target/powerpc/fold-vec-load-vec_xl-float.c: New testcase.
* gcc.target/powerpc/fold-vec-load-vec_xl-int.c: New testcase.
* gcc.target/powerpc/fold-vec-load-vec_xl-longlong.c: New testcase.
* gcc.target/powerpc/fold-vec-load-vec_xl-short.c: New testcase.
From-SVN: r261503
[gcc]
2018-06-12 Will Schmidt <will_schmidt@vnet.ibm.com>
* config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Add
BUILTIN_VEC_XST entries for pointer to double and long long.
From-SVN: r261502
Glibc 2.18 was changed by
commit ecbf434213c0333d81706074e4d107ac45011635
Author: Andreas Jaeger <aj@suse.de>
Date: Wed May 15 20:20:54 2013 +0200
Reserve new TLS field for x86 and x86_64
[BZ #10686]
* sysdeps/x86_64/tls.h (struct tcbhead_t): Add __private_ss
field.
* sysdeps/i386/tls.h (struct tcbhead_t): Likewise.
to reduce the size of __private_tm to make room for __private_ss, which
was supposed to be used for TARGET_THREAD_SPLIT_STACK_OFFSET:
typedef struct
{
void *tcb; /* Pointer to the TCB. Not necessarily the
thread descriptor used by libpthread. */
dtv_t *dtv;
void *self; /* Pointer to the thread descriptor. */
int multiple_threads;
uintptr_t sysinfo;
uintptr_t stack_guard;
uintptr_t pointer_guard;
int gscope_flag;
int __glibc_reserved1;
/* Reservation of some values for the TM ABI. */
void *__private_tm[4];
/* GCC split stack support. */
void *__private_ss;
} tcbhead_t;
But the offset of __private_ss for i386 was mistakenly set to 0x30
instead of 0x34, and libgcc/config/i386/morestack.S has:
cmpl %gs:0x30,%eax # See if we have enough space.
movl %eax,%gs:0x30 # Save the new stack boundary.
movl %eax,%gs:0x30 # Save the new stack boundary.
movl %ecx,%gs:0x30 # Save new stack boundary.
movl %eax,%gs:0x30
movl %gs:0x30,%eax
movl %eax,%gs:0x30
Since updating TARGET_THREAD_SPLIT_STACK_OFFSET would change the
split-stack ABI, glibc 2.28 has been changed by
commit 0221ce2a90be2d40fc90f0b5dcec77a1ec013f53
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Tue Jun 12 06:23:28 2018 -0700
i386: Change offset of __private_ss to 0x30 [BZ #23250]
to match GCC:
typedef struct
{
void *tcb; /* Pointer to the TCB. Not necessarily the
thread descriptor used by libpthread. */
dtv_t *dtv;
void *self; /* Pointer to the thread descriptor. */
int multiple_threads;
uintptr_t sysinfo;
uintptr_t stack_guard;
uintptr_t pointer_guard;
int gscope_flag;
int __glibc_reserved1;
/* Reservation of some values for the TM ABI. */
void *__private_tm[3];
/* GCC split stack support. */
void *__private_ss;
void *__glibc_reserved2;
} tcbhead_t;
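The two i386 offsets can be checked with offsetof. The sketch below is a model, not the glibc header: it assumes every field is 4 bytes wide on i386, uses uint32_t in place of pointers, int and uintptr_t so that it gives the same answer on any host, and drops the leading underscores from field names. The struct and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* i386 layout with four __private_tm slots (glibc 2.18 to 2.27).  */
struct tcbhead_i386_tm4
{
  uint32_t tcb, dtv, self, multiple_threads;
  uint32_t sysinfo, stack_guard, pointer_guard;
  uint32_t gscope_flag, glibc_reserved1;
  uint32_t private_tm[4];
  uint32_t private_ss;
};

/* i386 layout with three __private_tm slots (glibc 2.28).  */
struct tcbhead_i386_tm3
{
  uint32_t tcb, dtv, self, multiple_threads;
  uint32_t sysinfo, stack_guard, pointer_guard;
  uint32_t gscope_flag, glibc_reserved1;
  uint32_t private_tm[3];
  uint32_t private_ss;
  uint32_t glibc_reserved2;
};

/* Offset of the split-stack slot in the older layout.  */
size_t
private_ss_offset_tm4 (void)
{
  return offsetof (struct tcbhead_i386_tm4, private_ss);
}

/* Offset of the split-stack slot in the newer layout.  */
size_t
private_ss_offset_tm3 (void)
{
  return offsetof (struct tcbhead_i386_tm3, private_ss);
}
```

In this model the older layout puts the slot at 0x34 and the newer one at 0x30, and the trailing reserved field keeps the overall size the same.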
PR target/85990
* config/i386/gnu-user.h (TARGET_THREAD_SPLIT_STACK_OFFSET):
Update comments.
* config/i386/gnu-user64.h (TARGET_THREAD_SPLIT_STACK_OFFSET):
Likewise.
From-SVN: r261501
QuarkSE has lp_count width set to 16 bits. Update the compiler to
take this into account.
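One consequence of a narrower lp_count can be sketched as a range check. This is an assumption about how the width might matter, not a description of what arc.c does: the helper fits_lp_count is hypothetical and simply tests whether an iteration count is representable in a register of the given width.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: return true if ITERATIONS can be held in a
   loop-count register that is WIDTH_BITS wide.  */
bool
fits_lp_count (uint64_t iterations, unsigned width_bits)
{
  if (width_bits >= 64)
    return true;
  return iterations < (UINT64_C (1) << width_bits);
}
```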
gcc/
2018-06-12 Claudiu Zissulescu <claziss@synopsys.com>
* config/arc/arc-arch.h (arc_extras): New enum.
(arc_cpu_t): Add field extra.
(arc_cpu_types): Consider the extras.
* config/arc/arc-cpus.def: Add extras info.
* config/arc/arc-opts.h (processor_type): Consider extra field.
* config/arc/arc.c (arc_override_options): Handle extra field.
From-SVN: r261496
When we pass an mcpu to the compiler, two types of (hardware
configuration) flags are set:
1. Architecture specific, for example code-density is always enabled
for ARCHS architectures. These options override the corresponding
user options with the preset ones.
2. CPU specific, for example archs uses the LL64 option by
default. These options can be freely enabled or disabled.
Because of the above complexity, we need to emit diagnostics so that
users know when they do something that goes against the above
rules. Thus, I came up with the following set of rules:
1. Overriding a default architecture-specific hardware option: the
override is ignored and a warning is emitted;
2. Overriding a default CPU-specific hardware option: the override is
taken into account and a warning is emitted.
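The two rules can be written down as a small decision function. This is a sketch of the policy only, not the option handling in arc.c: the names hw_option_kind, option_result and resolve_hw_option are hypothetical, and each option is modelled as a single boolean.

```c
#include <stdbool.h>

/* Whether a hardware option's default comes from the architecture or
   from the selected CPU.  */
enum hw_option_kind { HW_OPTION_ARCH, HW_OPTION_CPU };

/* Effective option value, and whether a warning should be emitted.  */
struct option_result
{
  bool value;
  bool warn;
};

/* Hypothetical helper: PRESET is the default implied by mcpu, USER_SET
   says whether the user gave the option explicitly, and USER_VALUE is
   the value the user asked for.  */
struct option_result
resolve_hw_option (enum hw_option_kind kind, bool preset,
		   bool user_set, bool user_value)
{
  struct option_result r = { preset, false };

  /* Nothing to do if the user did not contradict the default.  */
  if (!user_set || user_value == preset)
    return r;

  /* Both rules warn; only CPU-specific options honour the override.  */
  r.warn = true;
  if (kind == HW_OPTION_CPU)
    r.value = user_value;
  return r;
}
```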
gcc/
2018-06-12 Claudiu Zissulescu <claziss@synopsys.com>
* config/arc/arc-arch.h: Update ARC_OPTX macro.
* config/arc/arc-options.def (ARC_OPTX): Introduce a new doc
field.
* config/arc/arc.c (arc_init): Update pic warning.
(irq_range): Update irq range parsing warnings.
(arc_override_options): Update various warning messages.
(arc_handle_aux_attribute): Likewise.
gcc/testsuite
2018-06-12 Claudiu Zissulescu <claziss@synopsys.com>
* gcc.target/arc/builtin_simdarc.c: Update test.
* gcc.target/arc/mulsi3_highpart-2.c: Likewise.
* gcc.target/arc/tumaddsidi4.c: Likewise.
From-SVN: r261495
In glibc, sysdeps/i386/nptl/tls.h has
typedef struct
{
void *tcb; /* Pointer to the TCB. Not necessarily the
thread descriptor used by libpthread. */
dtv_t *dtv;
void *self; /* Pointer to the thread descriptor. */
int multiple_threads;
uintptr_t sysinfo;
uintptr_t stack_guard;
uintptr_t pointer_guard;
int gscope_flag;
int __glibc_reserved1;
/* Reservation of some values for the TM ABI. */
void *__private_tm[4];
/* GCC split stack support. */
void *__private_ss;
} tcbhead_t;
and sysdeps/x86_64/nptl/tls.h has
typedef struct
{
void *tcb; /* Pointer to the TCB. Not necessarily the
thread descriptor used by libpthread. */
dtv_t *dtv;
void *self; /* Pointer to the thread descriptor. */
int multiple_threads;
int gscope_flag;
uintptr_t sysinfo;
uintptr_t stack_guard;
uintptr_t pointer_guard;
unsigned long int vgetcpu_cache[2];
int __glibc_reserved1;
int __glibc_unused1;
/* Reservation of some values for the TM ABI. */
void *__private_tm[4];
/* GCC split stack support. */
void *__private_ss;
long int __glibc_reserved2;
/* Must be kept even if it is no longer used by glibc since programs,
like AddressSanitizer, depend on the size of tcbhead_t. */
__128bits __glibc_unused2[8][4] __attribute__ ((aligned (32)));
void *__padding[8];
} tcbhead_t;
The offsets of __private_tm are
i386: 36 bytes
x32: 48 bytes
x86_64: 80 bytes
and the offsets of pointer_guard are:
i386: 24 bytes
x32: 28 bytes
x86_64: 48 bytes
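These offsets follow from the field widths of each ABI and can be reproduced with offsetof. The structs below are models, not the glibc headers: fixed-width integers stand in for pointers, uintptr_t and long (4 bytes on i386 and x32, 8 bytes on x86_64) so that the result does not depend on the host, only the fields up to __private_tm are included, and leading underscores are dropped. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of the i386 layout: every field is 4 bytes wide.  */
struct tcb_model_i386
{
  uint32_t tcb, dtv, self;
  int32_t multiple_threads;
  uint32_t sysinfo, stack_guard, pointer_guard;
  int32_t gscope_flag, glibc_reserved1;
  uint32_t private_tm[4];
};

/* Model of the x32 layout: x86_64 field order, 4-byte pointers.  */
struct tcb_model_x32
{
  uint32_t tcb, dtv, self;
  int32_t multiple_threads, gscope_flag;
  uint32_t sysinfo, stack_guard, pointer_guard;
  uint32_t vgetcpu_cache[2];
  int32_t glibc_reserved1, glibc_unused1;
  uint32_t private_tm[4];
};

/* Model of the x86_64 layout: 8-byte pointers and longs.  */
struct tcb_model_x86_64
{
  uint64_t tcb, dtv, self;
  int32_t multiple_threads, gscope_flag;
  uint64_t sysinfo, stack_guard, pointer_guard;
  uint64_t vgetcpu_cache[2];
  int32_t glibc_reserved1, glibc_unused1;
  uint64_t private_tm[4];
};

enum
{
  I386_POINTER_GUARD = offsetof (struct tcb_model_i386, pointer_guard),
  I386_PRIVATE_TM = offsetof (struct tcb_model_i386, private_tm),
  X32_POINTER_GUARD = offsetof (struct tcb_model_x32, pointer_guard),
  X32_PRIVATE_TM = offsetof (struct tcb_model_x32, private_tm),
  X86_64_POINTER_GUARD = offsetof (struct tcb_model_x86_64, pointer_guard),
  X86_64_PRIVATE_TM = offsetof (struct tcb_model_x86_64, private_tm)
};
```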
But config/linux/x86/tls.h had
#ifdef __x86_64__
#ifdef __LP64__
# define SEG_READ(OFS) "movq\t%%fs:(" #OFS "*8),%0"
# define SEG_WRITE(OFS) "movq\t%0,%%fs:(" #OFS "*8)"
# define SEG_DECODE_READ(OFS) SEG_READ(OFS) "\n\t" \
"rorq\t$17,%0\n\t" \
"xorq\t%%fs:48,%0"
# define SEG_ENCODE_WRITE(OFS) "xorq\t%%fs:48,%0\n\t" \
"rolq\t$17,%0\n\t" \
SEG_WRITE(OFS)
#else
// For X32.
# define SEG_READ(OFS) "movl\t%%fs:(" #OFS "*4),%0"
# define SEG_WRITE(OFS) "movl\t%0,%%fs:(" #OFS "*4)"
# define SEG_DECODE_READ(OFS) SEG_READ(OFS) "\n\t" \
"rorl\t$9,%0\n\t" \
"xorl\t%%fs:24,%0"
# define SEG_ENCODE_WRITE(OFS) "xorl\t%%fs:24,%0\n\t" \
"roll\t$9,%0\n\t" \
SEG_WRITE(OFS)
#endif
#else
# define SEG_READ(OFS) "movl\t%%gs:(" #OFS "*4),%0"
# define SEG_WRITE(OFS) "movl\t%0,%%gs:(" #OFS "*4)"
# define SEG_DECODE_READ(OFS) SEG_READ(OFS) "\n\t" \
"rorl\t$9,%0\n\t" \
"xorl\t%%gs:24,%0"
# define SEG_ENCODE_WRITE(OFS) "xorl\t%%gs:24,%0\n\t" \
"roll\t$9,%0\n\t" \
SEG_WRITE(OFS)
#endif
static inline struct gtm_thread *gtm_thr(void)
{
struct gtm_thread *r;
asm volatile (SEG_READ(10) : "=r"(r));
return r;
}
static inline void set_gtm_thr(struct gtm_thread *x)
{
asm volatile (SEG_WRITE(10) : : "r"(x));
}
static inline struct abi_dispatch *abi_disp(void)
{
struct abi_dispatch *r;
asm volatile (SEG_DECODE_READ(11) : "=r"(r));
return r;
}
static inline void set_abi_disp(struct abi_dispatch *x)
{
void *scratch;
asm volatile (SEG_ENCODE_WRITE(11) : "=r"(scratch) : "0"(x));
}
SEG_READ, SEG_WRITE, SEG_DECODE_READ and SEG_ENCODE_WRITE were correct
only for x86-64.
Update SEG_READ and SEG_WRITE to use the offset of __private_tm as base
and correct the offset of pointer_guard for x32. This patch doesn't
change the ABI of libitm.
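The effect of the change on the computed byte offsets can be sketched as follows. This models only the arithmetic, under the assumption that the new macros add the slot index times the word size to the __private_tm offset; the enum and function names are hypothetical, and the base offsets are the ones listed in this message.

```c
#include <stddef.h>

enum tls_abi { ABI_I386, ABI_X32, ABI_X86_64 };

/* Byte offset of __private_tm in the TCB for each ABI.  */
size_t
private_tm_base (enum tls_abi abi)
{
  switch (abi)
    {
    case ABI_I386:
      return 36;
    case ABI_X32:
      return 48;
    default:
      return 80;
    }
}

/* Size in bytes of one TLS slot: 8 on x86_64, 4 on i386 and x32.  */
size_t
slot_size (enum tls_abi abi)
{
  return abi == ABI_X86_64 ? 8 : 4;
}

/* Old scheme: SLOT is counted in words from the start of the TCB.  */
size_t
old_slot_offset (enum tls_abi abi, size_t slot)
{
  return slot * slot_size (abi);
}

/* New scheme (assumed): SLOT is counted in words from __private_tm.  */
size_t
new_slot_offset (enum tls_abi abi, size_t slot)
{
  return private_tm_base (abi) + slot * slot_size (abi);
}
```

On x86_64 the old slot 10 and the new slot 0 both land on byte 80, so nothing moves there. On i386 and x32 the old slot 10 gave byte 40, which is not the start of __private_tm on either ABI.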
PR libitm/85988
* config/linux/x86/tls.h (SEG_READ): Use the offset of
__private_tm as base.
(SEG_WRITE): Likewise.
(SEG_ENCODE_WRITE): Correct the offset of pointer_guard for x32.
(gtm_thr): Replace SEG_READ(10) with SEG_READ(0).
(set_gtm_thr): Replace SEG_WRITE(10) with SEG_WRITE(0).
(abi_disp): Replace SEG_DECODE_READ(11) with SEG_DECODE_READ(1).
(set_abi_disp): Replace SEG_ENCODE_WRITE(11) with
SEG_ENCODE_WRITE(1).
From-SVN: r261491
* gcc-interface/ada-tree.h (TYPE_RETURN_BY_DIRECT_REF_P): Change from
using TYPE_LANG_FLAG_4 to using TYPE_LANG_FLAG_0.
(TYPE_ALIGN_OK): Move around.
(TYPE_PADDING_FOR_COMPONENT): Remove superfluous parentheses.
* gcc-interface/decl.c (change_qualified_type): Move to...
(gnat_to_gnu_entity): Adjust comment.
* gcc-interface/gigi.h (change_qualified_type): ...here; make inline.
(ceil_pow2): Use ceil_log2.
* gcc-interface/utils.c (finish_subprog_decl): Add a couple of comments
and do not set TREE_SIDE_EFFECTS.
(handle_noreturn_attribute): Use change_qualified_type.
From-SVN: r261486
* gcc-interface/decl.c (gnat_to_gnu_entity) <E_Constant>: Do not get
the expression of a dispatch table that is not being defined.
<E_Record_Subtype>: Remove obsolete kludge.
From-SVN: r261483
* gcc-interface/decl.c (warn_on_field_placement): Use specific wording
for discriminants.
(warn_on_list_placement): New static function.
(components_to_record): Use it to warn on multiple fields in list.
From-SVN: r261480
* gcc-interface/decl.c (variant_desc): Add AUX field.
(gnat_to_gnu_entity) <discrete_type>: Do not call compute_record_mode
directly.
(reverse_sort_field_list): New static function.
(components_to_record): Place the variant part at the beginning of the
field list when there is an obvious order of increasing position.
(build_variant_list): Initialize it.
(create_variant_part_from): Do not call compute_record_mode directly.
(copy_and_substitute_in_layout): Likewise. Always sort the fields with
fixed position in order of increasing position, in the record and all
the variants, if any. Call reverse_sort_field_list.
* gcc-interface/utils.c (make_packable_type): Compute the sizes before
calling finish_record_type. Do not call compute_record_mode directly.
(finish_record_type): Overhaul final processing depending on REP_LEVEL
and call finish_bitfield_layout if it is equal to one or two.
From-SVN: r261479
* gcc.c: Document new %@{...} sequence.
(LINK_COMMAND_SPEC): Use it for the -L switches.
(cpp_unique_options): Use it for the -I switches.
(at_file_argbuf): New global variable.
(in_at_file): Likewise.
(alloc_args): Create at_file_argbuf.
(clear_args): Truncate at_file_argbuf.
(store_arg): If in_at_file, push the argument onto at_file_argbuf.
(open_at_file): New function.
(close_at_file): Likewise.
(create_at_file): Delete.
(do_spec_1) <'i'>: Use open_at_file/close_at_file.
<'o'>: Likewise.
<'@'>: New case.
(validate_switches_from_spec): Deal with %@{...} sequence.
(validate_switches): Likewise.
(driver::finalize): Call clear_args.
From-SVN: r261474