gcc/ada/
* gen_il-gen.adb (Put_Seinfo): Generate type
Seinfo.Type_Only_Enum based on type
Gen_IL.Internals.Type_Only_Enum. Automatically generating a copy
of the type will help keep them in sync. (Note that there are
no Ada compiler packages imported into Gen_IL.) Add a Type_Only
field to Field_Descriptor, so this information is available in
the Ada compiler (as opposed to just in the Gen_IL "compiler").
(One_Comp): Add initialization of the Type_Only field of
Field_Descriptor.
* gen_il-internals.ads (Image): Image function for
Type_Only_Enum.
* atree.ads (Node_To_Fetch_From): New function to compute which
node to fetch from, based on the Type_Only aspect.
* atree.adb (Get_Field_Value): Call Node_To_Fetch_From.
* treepr.adb (Print_Entity_Field): Call Node_To_Fetch_From.
(Print_Node_Field): Assert.
* sinfo-utils.adb (Walk_Sinfo_Fields,
Walk_Sinfo_Fields_Pairwise): Asserts.
gcc/ada/
* sem_case.adb (Composite_Case_Ops.Box_Value_Required): A new
function which takes a component type and returns a Boolean.
Returns True for the cases which were formerly forbidden as
components (these checks were formerly performed in the
now-deleted procedure
Check_Composite_Case_Selector.Check_Component_Subtype).
(Composite_Case_Ops.Normalized_Case_Expr_Type): Hoist this
function out of the Array_Case_Ops package because it has been
generalized to also do the analogous thing in the case of a
discriminated type.
(Composite_Case_Ops.Scalar_Part_Count): Return 0 if
Box_Value_Required returns True for the given type/subtype.
(Composite_Case_Ops.Choice_Analysis.Choice_Analysis.Component_Bounds_Info.
Traverse_Discrete_Parts): Return without doing anything if
Box_Value_Required returns True for the given type/subtype.
(Composite_Case_Ops.Choice_Analysis.Parse_Choice.Traverse_Choice):
If Box_Value_Required yields True for a given component type,
then check that the value of that component in a choice
expression is indeed a box (in which case the component is
ignored).
* doc/gnat_rm/implementation_defined_pragmas.rst: Update
documentation.
* gnat_rm.texi: Regenerate.
gcc/ada/
* einfo-utils.adb (Declaration_Node): Avoid returning the
following node kinds: N_Assignment_Statement, N_Integer_Literal,
N_Procedure_Call_Statement, N_Subtype_Indication, and
N_Type_Conversion. Assert that the result is in N_Is_Decl or
empty.
* gen_il-gen-gen_nodes.adb (N_Is_Decl): Modify to match the
things that Declaration_Node can return.
gcc/ada/
* libgnat/a-strunb.ads (Unbounded_String): Reference is never
null.
* libgnat/a-strunb.adb (Finalize): Copy reference while it needs
to be deallocated.
gcc/ada/
* sem_ch8.adb (Build_Class_Wide_Wrapper): Previous version split
in two subprograms to factorize its functionality:
Find_Suitable_Candidate, and Build_Class_Wide_Wrapper. These
routines are also placed in the new subprogram
Handle_Instance_With_Class_Wide_Type.
(Handle_Instance_With_Class_Wide_Type): New subprogram that
encapsulates all the code that handles instantiations with
class-wide types.
(Analyze_Subprogram_Renaming): Adjust code to invoke the new
nested subprogram Handle_Instance_With_Class_Wide_Type; adjust
documentation.
gcc/ada/
* einfo-utils.ads, einfo-utils.adb (Alias, Set_Alias,
Renamed_Entity, Set_Renamed_Entity, Renamed_Object,
Set_Renamed_Object): Add assertions that reflect how these are
supposed to be used and what they are supposed to return.
(Renamed_Entity_Or_Object): New getter.
(Set_Renamed_Object_Of_Possibly_Void): Setter that allows N to
be E_Void.
* checks.adb (Ensure_Valid): Use Renamed_Entity_Or_Object
because this is called for both cases.
* exp_dbug.adb (Debug_Renaming_Declaration): Use
Renamed_Entity_Or_Object because this is called for both cases.
Add assertions.
* exp_util.adb (Possible_Bit_Aligned_Component): Likewise.
* freeze.adb (Freeze_All_Ent): Likewise.
* sem_ch5.adb (Within_Function): Likewise.
* exp_attr.adb (Calculate_Header_Size): Call Renamed_Entity
instead of Renamed_Object.
* exp_ch11.adb (Expand_N_Raise_Statement): Likewise.
* repinfo.adb (Find_Declaration): Likewise.
* sem_ch10.adb (Same_Unit, Process_Spec_Clauses,
Analyze_With_Clause, Install_Parents): Likewise.
* sem_ch12.adb (Build_Local_Package, Needs_Body_Instantiated,
Build_Subprogram_Renaming, Check_Formal_Package_Instance,
Check_Generic_Actuals, In_Enclosing_Instance,
Denotes_Formal_Package, Process_Nested_Formal,
Check_Initialized_Types, Map_Formal_Package_Entities,
Restore_Nested_Formal): Likewise.
* sem_ch6.adb (Report_Conflict): Likewise.
* sem_ch8.adb (Analyze_Exception_Renaming,
Analyze_Generic_Renaming, Analyze_Package_Renaming,
Is_Primitive_Operator_In_Use, Declared_In_Actual,
Note_Redundant_Use): Likewise.
* sem_warn.adb (Find_Package_Renaming): Likewise.
* sem_elab.adb (Ultimate_Variable): Call Renamed_Object instead
of Renamed_Entity.
* exp_ch6.adb (Get_Function_Id): Call
Set_Renamed_Object_Of_Possibly_Void, because the defining
identifier is still E_Void at this point.
* sem_util.adb (Function_Call_Or_Allocator_Level): Likewise.
Remove redundant (unreachable) code.
(Is_Object_Renaming, Is_Valid_Renaming): Call Renamed_Object
instead of Renamed_Entity.
(Get_Fullest_View): Call Renamed_Entity instead of
Renamed_Object.
(Copy_Node_With_Replacement): Call
Set_Renamed_Object_Of_Possibly_Void because the defining entity
is sometimes E_Void.
* exp_ch5.adb (Expand_N_Assignment_Statement): Protect a call to
Renamed_Object with Is_Object to avoid assertion failure.
* einfo.ads: Minor comment fixes.
* inline.adb: Minor comment fixes.
* tbuild.ads: Minor comment fixes.
gcc/ada/
* sem_ch13.adb (Freeze_Entity_Checks): Perform same check on
predicate expression inside pragma as inside aspect.
* sem_util.adb (Is_Current_Instance): Recognize possible
occurrence of subtype as current instance inside the pragma
Predicate.
Set the 3 possible flags as individual bits and group them for options.
* flag-types.h (enum ranger_debug): Adjust values.
* params.opt (ranger_debug): Ditto.
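The bit layout described above can be sketched as follows. This is only an illustration: the enumerator names and values are hypothetical and are not copied from flag-types.h; the only thing taken from the entry is the idea of three independent bits plus grouped values built from them.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names and values; not the actual contents of flag-types.h.  */
enum debug_flags
{
  DEBUG_NONE  = 0,
  DEBUG_TRACE = 1 << 0,   /* each basic flag owns exactly one bit */
  DEBUG_CACHE = 1 << 1,
  DEBUG_GORI  = 1 << 2,
  /* Grouped values are ORs of the individual bits.  */
  DEBUG_TRACE_CACHE = DEBUG_TRACE | DEBUG_CACHE,
  DEBUG_ALL = DEBUG_TRACE | DEBUG_CACHE | DEBUG_GORI
};

/* Return true if every bit of FLAG is set in VALUE.  */
bool
debug_flag_set_p (unsigned value, unsigned flag)
{
  assert (flag != 0);
  return (value & flag) == flag;
}
```

With this layout a grouped option value can be tested one bit at a time, for example debug_flag_set_p (DEBUG_TRACE_CACHE, DEBUG_CACHE).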
These tests are testing Advanced SIMD codegen, so if the compiler or the
testsuite is forcing SVE they will fail.
This adds +nosve so that we always generate Advanced SIMD codegen.
gcc/testsuite/ChangeLog:
PR target/102907
* gcc.target/aarch64/shrn-combine-1.c: Disable SVE.
* gcc.target/aarch64/shrn-combine-2.c: Likewise.
* gcc.target/aarch64/shrn-combine-3.c: Likewise.
* gcc.target/aarch64/shrn-combine-4.c: Likewise.
* gcc.target/aarch64/shrn-combine-5.c: Likewise.
* gcc.target/aarch64/shrn-combine-6.c: Likewise.
* gcc.target/aarch64/shrn-combine-7.c: Likewise.
I was not careful with the fix for PR 102505 and did not craft the
check to satisfy the verifier carefully, which led to PR 102886.
(The verifier has the test structured differently and somewhat
redundantly, so I could not just copy it.)
This patch fixes it. I hope it is a quite obvious correction of an
oversight and so will commit it if it survives bootstrap and testing on
x86_64-linux and ppc64le-linux.
Testcase for this bug is gcc.dg/tree-ssa/sra-18.c (but only on
platforms with constant pools). I will backport the two fixes
to the release branches squashed.
gcc/ChangeLog:
2021-10-22 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/102886
* tree-sra.c (totally_scalarize_subtree): Fix the out-of-access
condition.
Just like PR 100382, here we have a DCE removing a
null pointer load which is still needed.
In this case, execute_fixup_cfg removes a store (correctly)
and then removes the null load (incorrectly) due to
not checking stmt_unremovable_because_of_non_call_eh_p.
This patch adds the check in a similar way to the patch
that fixed PR 100382.
gcc/ChangeLog:
* tree-ssa-dce.c (simple_dce_from_worklist):
Check stmt_unremovable_because_of_non_call_eh_p also
before removing the statement.
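The kind of guard described above can be illustrated with a toy model. This is a sketch under loud assumptions: struct toy_stmt, toy_removable_p, toy_dce and toy_dce_demo are hypothetical, the may_trap_eh field merely stands in for what stmt_unremovable_because_of_non_call_eh_p reports, and the loop is a single pass rather than GCC's worklist.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy statement; real gimple statements are far richer.  */
struct toy_stmt
{
  int num_uses;       /* remaining uses of the value defined here */
  bool side_effects;  /* e.g. a store or a call */
  bool may_trap_eh;   /* stand-in for the non-call EH check */
  bool removed;
};

/* Decide whether a statement with an unused result may be deleted.  */
bool
toy_removable_p (const struct toy_stmt *s)
{
  if (s->num_uses != 0 || s->side_effects)
    return false;
  /* The extra check: a possibly trapping statement must stay when the
     trap is observable through exception handling.  */
  if (s->may_trap_eh)
    return false;
  return true;
}

/* Remove every removable statement; return how many were removed.  */
int
toy_dce (struct toy_stmt *stmts, int n)
{
  int count = 0;
  assert (n >= 0);
  for (int i = 0; i < n; i++)
    if (!stmts[i].removed && toy_removable_p (&stmts[i]))
      {
        stmts[i].removed = true;
        count++;
      }
  return count;
}

/* Usage: an unused plain load, an unused trapping load, a used value.  */
int
toy_dce_demo (void)
{
  struct toy_stmt stmts[3] = {
    { 0, false, false, false },
    { 0, false, true, false },
    { 1, false, false, false }
  };
  int removed = toy_dce (stmts, 3);
  assert (!stmts[1].removed);  /* the trapping load survives */
  return removed;
}
```

Only the first statement is deleted in the demo; without the may_trap_eh test the second one would be deleted as well, which is the shape of the bug described above.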
Previous refactoring made it difficult to consider re-aligned
loads for unlimited cost model alignment peeling, so I
ditched that. Later refactoring made it easily possible again, so
the following patch reinstates this, which should fix the
observed regression on powerpc with altivec.
2021-10-25 Richard Biener <rguenther@suse.de>
PR tree-optimization/102905
* tree-vect-data-refs.c (vect_enhance_data_refs_alignment):
Use vect_supportable_dr_alignment again to determine whether
an access is supported when not aligned.
Similar for sqrt/sqrtl.
gcc/ChangeLog:
PR target/102464
* match.pd: Simplify (_Float16) sqrtf((float) a) to .SQRT(a)
when direct_internal_fn_supported_p, similar for sqrt/sqrtl.
gcc/testsuite/ChangeLog:
PR target/102464
* gcc.target/i386/pr102464-sqrtph.c: New test.
* gcc.target/i386/pr102464-sqrtsh.c: New test.
This fixes a latent issue exposed by now allowing VN_TOP in PHI
arguments. We may only use optimistic equality when merging values on
different edges, not when merging values on the same edge - in particular
we may not choose the undef value on any edge when there's a not undef
value as well.
2021-10-25 Richard Biener <rguenther@suse.de>
PR tree-optimization/102920
* tree-ssa-sccvn.h (expressions_equal_p): Add argument
controlling VN_TOP matching behavior.
* tree-ssa-sccvn.c (expressions_equal_p): Likewise.
(vn_phi_eq): Do not optimistically match VN_TOP.
* gcc.dg/torture/pr102920.c: New testcase.
This patch supports transforming, under fast-math, expressions like
_mm512_add_ph(x1, _mm512_fmadd_pch(a, b, _mm512_setzero_ph())) to
_mm512_fmadd_pch(a, b, x1).
It also supports transforming _mm512_add_ph(x1, _mm512_fmul_pch(a, b))
to _mm512_fmadd_pch(a, b, x1).
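The algebra behind both rewrites can be checked on a single complex lane with a scalar model. This is a minimal sketch, not the intrinsic API: struct cplx and the helpers mk, cadd, cmul, cfma and rewrites_agree are hypothetical names, and the exact equality is only claimed for small inputs whose products and sums are exactly representable.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical scalar model of one complex lane.  */
struct cplx { float re, im; };

struct cplx
mk (float re, float im)
{
  struct cplx c = { re, im };
  return c;
}

/* Element-wise add; on (re, im) pairs this is also complex addition.  */
struct cplx
cadd (struct cplx x, struct cplx y)
{
  return mk (x.re + y.re, x.im + y.im);
}

/* Complex multiplication.  */
struct cplx
cmul (struct cplx a, struct cplx b)
{
  return mk (a.re * b.re - a.im * b.im, a.re * b.im + a.im * b.re);
}

/* Complex multiply-add: a * b + c.  */
struct cplx
cfma (struct cplx a, struct cplx b, struct cplx c)
{
  return cadd (cmul (a, b), c);
}

/* Return true if both source forms equal the fused form a * b + x1.  */
bool
rewrites_agree (struct cplx x1, struct cplx a, struct cplx b)
{
  struct cplx fused = cfma (a, b, x1);
  struct cplx via_fma_zero = cadd (x1, cfma (a, b, mk (0.0f, 0.0f)));
  struct cplx via_mul = cadd (x1, cmul (a, b));
  return via_fma_zero.re == fused.re && via_fma_zero.im == fused.im
         && via_mul.re == fused.re && via_mul.im == fused.im;
}

/* Usage: x1 = 1+2i, a = 3-4i, b = 0.5+2i.  */
void
cplx_self_check (void)
{
  assert (rewrites_agree (mk (1.0f, 2.0f), mk (3.0f, -4.0f), mk (0.5f, 2.0f)));
}
```

In general the forms can differ in rounding and in the sign of zero, which is presumably why the transform is restricted to fast-math.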
gcc/ChangeLog:
* config/i386/sse.md (fma_<mode>_fadd_fmul): Add new
define_insn_and_split.
(fma_<mode>_fadd_fcmul): Likewise.
(fma_<complexopname>_<mode>_fma_zero): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512fp16-complex-fma.c: New test.
The behavior of the -mdisable-fpregs option is confusing in that it doesn't
disable the use of the floating-point registers in all situations.
The -msoft-float option disables the use of the floating-point registers in
all situations. The Linux kernel only needs to disable use of the
xmpyu instruction to avoid using the floating-point registers.
This change revises the -mdisable-fpregs option to disable the use of
the floating-point registers in all situations. It is now equivalent
to the -msoft-float option. A new -msoft-mult option is added to
disable use of the xmpyu instruction. The libgcc library can be
compiled with the -msoft-mult option to avoid using hardware integer
multiplication.
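For illustration, integer multiplication without a hardware multiply instruction can be done with shifts and adds. The routine below is only a sketch of that technique; soft_mul_u32 is a hypothetical name and this is not claimed to be the code libgcc uses.

```c
#include <assert.h>
#include <stdint.h>

/* Shift-and-add multiply: uses only shifts, adds and bit tests.
   The result is the product modulo 2^32.  */
uint32_t
soft_mul_u32 (uint32_t a, uint32_t b)
{
  uint32_t result = 0;
  while (b != 0)
    {
      if (b & 1)        /* low bit of the multiplier set: add this term */
        result += a;
      a <<= 1;          /* next power-of-two multiple of the multiplicand */
      b >>= 1;
    }
  return result;
}

/* Usage: a small product and one that wraps around.  */
void
soft_mul_self_check (void)
{
  assert (soft_mul_u32 (6, 7) == 42);
  assert (soft_mul_u32 (0xFFFFFFFFu, 2) == 0xFFFFFFFEu);
}
```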
2021-10-24 John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog:
* config/pa/pa-d.c (pa_d_handle_target_float_abi): Don't check
TARGET_DISABLE_FPREGS.
* config/pa/pa.c (fix_range): Use MASK_SOFT_FLOAT instead of
MASK_DISABLE_FPREGS.
(hppa_rtx_costs): Don't check TARGET_DISABLE_FPREGS. Adjust
cost of hardware integer multiplication.
(pa_conditional_register_usage): Don't check TARGET_DISABLE_FPREGS.
* config/pa/pa.h (INT14_OK_STRICT): Likewise.
* config/pa/pa.md: Don't check TARGET_DISABLE_FPREGS. Check
TARGET_SOFT_FLOAT in patterns that use xmpyu instruction.
* config/pa/pa.opt (mdisable-fpregs): Change target mask to
SOFT_FLOAT. Revise comment.
(msoft-mult): New option.
The 'G' constraint only matches a float zero.
2021-10-24 John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog:
* config/pa/pa.md: Don't use 'G' constraint in integer move patterns.
This patch cures the testsuite failure of bfin/20090914-3.c, which
currently FAILs on bfin-elf with "(test for excess errors)" due to:
20090914-3.c:3:1: warning: return type defaults to 'int' [-Wimplicit-int]
which is obviously not what this code was intended to test. Fixed by
turning the code into a function returning the final "fract32" result,
as simply specifying an "int" return type for main results in the
entire function being optimized away, because the result is unused.
2021-10-24 Roger Sayle <roger@nextmovesoftware.com>
gcc/testsuite/ChangeLog
* gcc.target/bfin/20090914-3.c: Tweak test case.
Move bind-c-intent-out-2.f90 to gfortran.dg/ubsan for -fsanitize=undefined.
PR fortran/9262
* gfortran.dg/bind-c-intent-out-2.f90: Moved to ...
* gfortran.dg/ubsan/bind-c-intent-out-2.f90
On x86_64, V1TI mode holds a 128-bit integer value in a (vector) SSE
register (where regular TI mode uses a pair of 64-bit general purpose
scalar registers). This patch improves the implementation of AND, IOR,
XOR and NOT on these values.
The benefit is demonstrated by the following simple test program:
typedef unsigned __int128 v1ti __attribute__ ((__vector_size__ (16)));
v1ti and(v1ti x, v1ti y) { return x & y; }
v1ti ior(v1ti x, v1ti y) { return x | y; }
v1ti xor(v1ti x, v1ti y) { return x ^ y; }
v1ti not(v1ti x) { return ~x; }
For which GCC currently generates the rather large:
and: movdqa %xmm0, %xmm2
movq %xmm1, %rdx
movq %xmm0, %rax
andq %rdx, %rax
movhlps %xmm2, %xmm3
movhlps %xmm1, %xmm4
movq %rax, %xmm0
movq %xmm4, %rdx
movq %xmm3, %rax
andq %rdx, %rax
movq %rax, %xmm5
punpcklqdq %xmm5, %xmm0
ret
ior: movdqa %xmm0, %xmm2
movq %xmm1, %rdx
movq %xmm0, %rax
orq %rdx, %rax
movhlps %xmm2, %xmm3
movhlps %xmm1, %xmm4
movq %rax, %xmm0
movq %xmm4, %rdx
movq %xmm3, %rax
orq %rdx, %rax
movq %rax, %xmm5
punpcklqdq %xmm5, %xmm0
ret
xor: movdqa %xmm0, %xmm2
movq %xmm1, %rdx
movq %xmm0, %rax
xorq %rdx, %rax
movhlps %xmm2, %xmm3
movhlps %xmm1, %xmm4
movq %rax, %xmm0
movq %xmm4, %rdx
movq %xmm3, %rax
xorq %rdx, %rax
movq %rax, %xmm5
punpcklqdq %xmm5, %xmm0
ret
not: movdqa %xmm0, %xmm1
movq %xmm0, %rax
notq %rax
movhlps %xmm1, %xmm2
movq %rax, %xmm0
movq %xmm2, %rax
notq %rax
movq %rax, %xmm3
punpcklqdq %xmm3, %xmm0
ret
With this patch we now generate the much more efficient:
and: pand %xmm1, %xmm0
ret
ior: por %xmm1, %xmm0
ret
xor: pxor %xmm1, %xmm0
ret
not: pcmpeqd %xmm1, %xmm1
pxor %xmm1, %xmm0
ret
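The two-instruction "not" sequence relies on the identity ~x == x ^ all_ones: pcmpeqd of a register with itself sets every bit, and pxor then flips every bit of the operand. A portable sketch of that identity follows; struct u128 and the helper names are hypothetical stand-ins for a 128-bit SSE register, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical portable stand-in for a 128-bit vector register.  */
struct u128 { uint64_t lo, hi; };

/* Models pcmpeqd %xmm1, %xmm1: a register compared with itself
   yields all bits set.  */
struct u128
all_ones (void)
{
  struct u128 r = { UINT64_MAX, UINT64_MAX };
  return r;
}

/* Models pxor.  */
struct u128
xor128 (struct u128 x, struct u128 y)
{
  struct u128 r = { x.lo ^ y.lo, x.hi ^ y.hi };
  return r;
}

/* NOT expressed as XOR with all-ones.  */
struct u128
not128 (struct u128 x)
{
  return xor128 (x, all_ones ());
}

/* Usage: compare against the ~ operator on each half.  */
void
not128_self_check (void)
{
  struct u128 x = { 0x0123456789abcdefULL, 0 };
  struct u128 r = not128 (x);
  assert (r.lo == ~x.lo && r.hi == ~x.hi);
}
```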
For my first few attempts at this patch I tried adding V1TI to the
existing VI and VI12_AVX_512F mode iterators, but these then have
dependencies on other iterators (and attributes), and so on until
everything ties itself into a knot, as V1TI mode isn't really a
first-class vector mode on x86_64. Hence I ultimately opted to use
simple stand-alone patterns (as used by the existing TF mode support).
2021-10-23 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/sse.md (<any_logic>v1ti3): New define_insn to
implement V1TImode AND, IOR and XOR on TARGET_SSE2 (and above).
(one_cmplv1ti2): New define_expand.
gcc/testsuite/ChangeLog
* gcc.target/i386/sse2-v1ti-logic.c: New test case.
* gcc.target/i386/sse2-v1ti-logic-2.c: New test case.