[PATCH 5/5] x86: support AVX10.1 vector size restrictions

H.J. Lu hjl.tools@gmail.com
Tue Aug 29 16:26:23 GMT 2023


On Fri, Aug 25, 2023 at 5:48 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> Recognize "/<number>" suffixes on both -march=+avx10.1 and the
> corresponding .arch directive, setting an upper bound on the vector size
> that insns may use. Such a restriction can be reset by setting a new base
> architecture, by using a suffix-less form, by disabling AVX10, or by
> enabling any other VEX/EVEX-based vector extension.
>
> While for most insns we can suppress their use with too wide operands
> via registers becoming unavailable (or in Intel syntax memory operand
> size specifiers not being recognized), mask register insns have to have
> their minimum required vector size specified in a new attribute. (Of
> course this new attribute could also be used on other insns.)
>
> Note that .insn continues to be permitted to emit EVEX{512,256} (and
> VEX256 ones) encodings regardless of vector size restrictions in place.
> Of course these can't be expressed using zmm (or ymm) operands then,
> but need using the EVEX.512.* forms (broadcast forms may be usable right
> now, but this may go away so shouldn't be relied upon). This is why no
> assertions should be added to build_{e,}vex_prefix().
> ---
> It is unclear whether Vsz is a good name for the new attribute: The spec
> leaves open how 256-bit embedded rounding is going to be expressed. Yet
> that may require some similar attribute ...
>
> --- a/gas/NEWS
> +++ b/gas/NEWS
> @@ -1,5 +1,7 @@
>  -*- text -*-
>
> +* Add support for Intel AVX10.1.
> +
>  * Add support for Intel PBNDKB instructions.
>
>  * Add support for Intel SM4 instructions.
> --- a/gas/doc/c-i386.texi
> +++ b/gas/doc/c-i386.texi
> @@ -213,6 +213,9 @@ accept various extension mnemonics.  For
>  @code{sm4},
>  @code{pbndkb},
>  @code{avx10.1},
> +@code{avx10.1/512},
> +@code{avx10.1/256},
> +@code{avx10.1/128},
>  @code{amx_int8},
>  @code{amx_bf16},
>  @code{amx_fp16},
> @@ -267,7 +270,11 @@ accept various extension mnemonics.  For
>  @code{svme} and
>  @code{padlock}.
>  Note that these extension mnemonics can be prefixed with @code{no} to revoke
> -the respective (and any dependent) functionality.
> +the respective (and any dependent) functionality.  Note further that the
> +suffixes permitted on @code{-march=avx10.<N>} enforce a vector length
> +restriction, i.e. despite these otherwise being "enabling" options, using
> +these suffixes will disable all insns with wider vector or mask register
> +operands.
>
>  When the @code{.arch} directive is used with @option{-march}, the
>  @code{.arch} directive will take precedent.
> @@ -1673,6 +1680,12 @@ an unconditional jump to the target.
>
>  Note that the sub-architecture specifiers (starting with a dot) can be prefixed
>  with @code{no} to revoke the respective (and any dependent) functionality.
> +Note further that @samp{.avx10.<N>} can be suffixed with a vector length
> +restriction (@samp{/256} or @samp{/128}, with @samp{/512} simply restoring the
> +default).  Despite these otherwise being "enabling" specifiers, using these
> +suffixes will disable all insns with wider vector or mask register operands.
> +On SVR4-derived platforms, the separator character @samp{/} can be replaced by
> +@samp{:}.
>
>  Following the CPU architecture (but not a sub-architecture, which are those
>  starting with a dot), you may specify @samp{jumps} or @samp{nojumps} to

Although CPUID bits in AVX10 spec may leave an impression that 128-bit,
256-bit and 512-bit vectors may be enabled independently.  But it also says

A “converged” version of Intel AVX10 with maximum vector lengths of 256
bits and 32-bit opmask registers will be supported across all Intel processors,
while 512-bit vector registers and 64-bit opmasks will continue to be supported
on some P-core processors.

Adding avx10.1/128 isn't necessary.

-- 
H.J.


More information about the Binutils mailing list