
Do all 64 bit intel architectures support SSSE3/SSE4.1/SSE4.2 instructions?

I searched the web and the Intel software manuals, but I am unable to confirm whether all Intel 64 architectures support up to SSSE3, SSE4.1, SSE4.2, AVX, etc. I want to know this so that I can target the minimum supported SIMD instruction set in my program. Please help.

Vikram Dattu asked Jan 28 '15 06:01




2 Answers

An x64 native (AMD64 or Intel 64) processor is only required to support SSE and SSE2.

SSE3 is supported by Intel Pentium 4 processors (“Prescott”), AMD Athlon 64 (“revision E”), AMD Phenom, and later processors. This means most, but not quite all, x64 capable CPUs should support SSE3.

Supplemental SSE3 (SSSE3) is supported by Intel Core 2 Duo, Intel Core i7/i5/i3, Intel Atom, AMD Bulldozer, AMD Bobcat, and later processors.

SSE4.1 is supported on Intel Core 2 (“Penryn”), Intel Core i7 (“Nehalem”), Intel Atom (Silvermont core), AMD Bulldozer, AMD Jaguar, and later processors.

SSE4.1 and SSE4.2 are supported on Intel Core i7 (“Nehalem”), Intel Atom (Silvermont core), AMD Bulldozer, AMD Jaguar, and later processors.

AVX is supported by Intel “Sandy Bridge”, AMD Bulldozer, AMD Jaguar, and later processors.

See this blog series.

A CPU with x64 native support but no SSE3 support is going to be a 'first-generation' 64-bit part, which isn't supported by Windows 8.1 x64 native due to its requirements for CMPXCHG16b, PrefetchW, and LAHF/SAHF. In practice, SSE3 is therefore highly likely on newer machines. Requiring SSSE3 or later is more restrictive, depending on exactly who you are targeting. For example, the Valve Hardware Survey puts SSE4.1 at 77% and SSE4.2 at 72% (anything from AMD or Intel with SSE4.1 is also going to have SSE3 and SSSE3).

UPDATE: Per the comment below, the support for SSE3 for PC gamers per the Valve survey is now 100%. SSSE3, SSE4.1, and SSE4.2 are all in the 97-98% range. AVX is around 92%. The current generation gaming consoles from Sony and Microsoft support up through AVX. The primary value of AVX is that you can use the /arch:AVX switch, which allows all SSE code generation to use the 3-operand VEX prefix, making register scheduling more efficient. See this blog post.

AVX2 is approaching 75% which is really good, but still potentially a blocker for a game to rely on without a fallback path. AVX2 is supported by Intel “Haswell”, AMD Excavator, and later processors. See this blog post.
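The "fallback path" idea can be sketched as a small selection routine. This is only an illustration under stated assumptions: the tier names, the `TIERS` list, and the `pick_code_path` function are hypothetical, and how you obtain the set of supported features depends on your platform.

```python
# Instruction-set tiers in the order this answer lists them, lowest first.
# SSE/SSE2 are the guaranteed x64 baseline, so "sse2" is the floor.
# (Hypothetical names, chosen for illustration only.)
TIERS = ["sse2", "sse3", "ssse3", "sse4.1", "sse4.2", "avx", "avx2"]


def pick_code_path(supported):
    """Return the highest tier whose lower tiers are all present as well.

    `supported` is a set of tier names reported for the CPU (assumed input).
    """
    best = TIERS[0]  # every x64 CPU has SSE2, so this path always exists
    for tier in TIERS[1:]:
        if tier not in supported:
            break  # stop at the first gap instead of skipping over it
        best = tier
    return best


print(pick_code_path({"sse2", "sse3", "ssse3", "sse4.1", "sse4.2", "avx"}))
```

With the set shown, the routine selects the AVX path; with an empty set it falls back to the SSE2 baseline, so a program built this way never requires more than the x64 minimum.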

Windows on ARM: Note that the x86 emulation for Windows on ARM64 only supports up to SSE4.1, and the x64 emulation in Windows 11 only supports up to SSE4.2. AVX/AVX2 is not supported for these platforms.

Chuck Walbourn answered Sep 27 '22 23:09


I have been trying to figure this out because I failed to compile third-party software that uses SSE. I found that this might be helpful:

cat /proc/cpuinfo

Then pay attention to the flags section

flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb invpcid_single pti intel_ppin ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc dtherm ida arat pln pts md_clear flush_l1d

I can see:

sse4_1 sse4_2

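If you are trying to write some code to detect this automatically, the following might be useful:

grep -m1 '^flags' /proc/cpuinfo | sed 's/.*: //' | tr ' ' '\n' | grep -xE 's?sse.*|pni'
sse
sse2
pni
ssse3
sse4_1
sse4_2

Here pni (Prescott New Instructions) is the name Linux uses for SSE3. Match whole flag names: an unanchored grep -o "sse.*" prints "sse3" for the ssse3 flag, which is easy to misread as SSE3.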
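If you would rather do the same check from a program, here is a minimal Python sketch. It assumes the Linux /proc/cpuinfo layout shown above; the `FLAG_NAMES` table and the `simd_features` function are hypothetical names made up for this example. It compares whole flag names, so "ssse3" is never mistaken for SSE3, and it treats "pni" as SSE3 because that is how Linux reports it.

```python
import re

# Linux flag names mapped to the names used in this thread.
# The kernel reports SSE3 as "pni" (Prescott New Instructions).
FLAG_NAMES = {
    "sse": "SSE", "sse2": "SSE2", "pni": "SSE3", "ssse3": "SSSE3",
    "sse4_1": "SSE4.1", "sse4_2": "SSE4.2", "avx": "AVX", "avx2": "AVX2",
}


def simd_features(cpuinfo_text):
    """Return the SIMD extensions named on the first 'flags' line."""
    match = re.search(r"^flags[ \t]*:[ \t]*(.*)$", cpuinfo_text, re.MULTILINE)
    if match is None:
        return []  # no flags line, e.g. not an x86 Linux system
    flags = set(match.group(1).split())
    # Whole-word membership test; the dict order gives the output order.
    return [name for flag, name in FLAG_NAMES.items() if flag in flags]
```

On a Linux machine you could call it as simd_features(open("/proc/cpuinfo").read()); on other systems the file does not exist, so you would need a different source for the flags.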
Kemin Zhou answered Sep 27 '22 22:09