
What is the status of the TSX-related Skylake erratum SKL-105?

As is well known, Intel had to disable TSX in the Haswell series of processors via a microcode update. This was due to a bug in the TSX implementation that could give erroneous results if these instructions were used.

What seems to be less well known is that there is apparently also an erratum affecting TSX on the newer Skylake architecture. Specifically, the erratum "SKL-105" mentioned here:

http://www.intel.com/content/www/us/en/processors/core/desktop-6th-gen-core-family-spec-update.html

It states that using TSX can lead to unpredictable system behavior, but it also notes that the BIOS may carry a fix. The question is then what this fix entails: does it disable TSX altogether, like the Haswell microcode "fix"? Googling "SKL105" gives no results, so the community seems to be generally unaware of it.

Some users have noticed the TSX feature getting "stealthily" disabled, seemingly without being aware of the erratum above:

https://www.reddit.com/r/hardware/comments/44k218/intel_disables_tsx_transactional_memory_again_in/

It would be strange if only certain variants of the CPUs were affected, since one would presume they all share the same microarchitecture and hence are equally affected by this bug.

By the way, such a microcode "fix" could operate in another, even stealthier way. I suppose it would be possible for a microcode update to still expose the presence of TSX (making it seem the feature was still enabled) but override the implementation of the TSX instructions with "dummy implementations" that never elide the locks and essentially execute the code the old-fashioned way. That would avoid the bug but also forgo the performance improvement TSX could offer. The only way to determine whether this had happened would be through performance measurements.

Does anyone have more info on the status of TSX in Skylake? In any case, it is strange that more info has not been released and that one has to guess what is affected and what is not, and indeed whether the feature is safe to use.

I have a 6700K and the feature is still there. But this also depends on whether the BIOS manufacturer has picked up the microcode updates, and I haven't actually measured the performance, so I can't rule out that it has been disabled in the way described in the previous paragraph.
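For reference, on Linux the "feature is still there" check can be approximated by looking for the `hle` and `rtm` flags in `/proc/cpuinfo`. The following is only a rough sketch: the helper name `tsx_flags` is made up for illustration, and an advertised flag says nothing about whether transactions actually commit, which is exactly the caveat above.

```python
def tsx_flags(cpuinfo_text):
    """Return the TSX feature flags ('hle', 'rtm') advertised by the first CPU.

    Hypothetical helper: it only inspects the 'flags' line of /proc/cpuinfo
    text and cannot tell whether transactions ever commit.
    """
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return {"hle", "rtm"} & set(value.split())
    return set()


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("TSX flags advertised:", sorted(tsx_flags(f.read())))
    except OSError:
        print("/proc/cpuinfo is not available on this system")
```

An empty result means neither flag is advertised. A non-empty result only means CPUID still reports the feature.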

asked Aug 10 '16 08:08 by Morty


1 Answer

As far as I know, it is supposedly fixed in the latest public microcode update pack, dated 2016-07-14. For Skylake, this would be revision 0x9d/0x9e of the Skylake base microcode (processor signatures 0x406e3 and 0x506e3).
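To compare a running Linux system against the signatures and revisions quoted above, something like the following sketch could be used. It rebuilds the CPUID signature from the family/model/stepping fields of `/proc/cpuinfo` and reads the `microcode` field. The function names and the choice of 0x9d as the threshold are my assumptions for illustration (0x9d is simply the lower of the two revisions mentioned), not anything published by Intel.

```python
# Signatures and revision quoted in the answer above; treating 0x9d as the
# minimum is an assumption made for this sketch.
SKYLAKE_SIGNATURES = {0x406E3, 0x506E3}
REPORTED_REVISION = 0x9D


def cpu_signature(family, model, stepping):
    """Build the CPUID leaf 1 EAX signature from display family/model/stepping."""
    base_family = min(family, 0xF)
    ext_family = family - base_family
    # Extended model bits are only used for family 6 and family 15 and above.
    ext_model = model >> 4 if family == 6 or family >= 0xF else 0
    return ((ext_family << 20) | (ext_model << 16) | (base_family << 8)
            | ((model & 0xF) << 4) | stepping)


def parse_first_cpu(cpuinfo_text):
    """Return (signature, microcode_revision) for the first CPU in /proc/cpuinfo text."""
    fields = {}
    for line in cpuinfo_text.splitlines():
        if not line.strip():
            if fields:
                break  # a blank line ends the first CPU block
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    signature = cpu_signature(int(fields["cpu family"]),
                              int(fields["model"]),
                              int(fields["stepping"]))
    return signature, int(fields["microcode"], 0)


def at_reported_revision(signature, revision):
    """True/False for the Skylake signatures above, None for any other CPU."""
    if signature not in SKYLAKE_SIGNATURES:
        return None
    return revision >= REPORTED_REVISION


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            sig, rev = parse_first_cpu(f.read())
        print(hex(sig), hex(rev), at_reported_revision(sig, rev))
    except (OSError, KeyError, ValueError):
        print("could not read x86 CPU details from /proc/cpuinfo")
```

A `True` result only shows that the loaded microcode is at or above the quoted revision. It does not show what the update does to TSX.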

This new TSX erratum seems to be present on Broadwell, too. I assume it was also fixed through the new batch of Broadwell-* microcode updates that were published along with the new Skylake microcode updates.

For Linux, which updates microcode through data sent by the bootloader, it is trivial to apply the update and it is already available in most (serious) distributions. For Windows, you need to pester your system vendor for an EFI/BIOS update.

Sorry, I don't have the means to test TSX with the latest Skylake/Broadwell microcode to check whether it is eliding locks or "always failing". As for disabling TSX, you must understand that it has a real impact on L3 effectiveness (it does not come for free!) and on power consumption, so it would make a lot of sense to have TSX disabled by the BIOS on anything with a smaller L3.

Interestingly enough, the information on the TSX "chicken bit" is not public, so we have no idea how to disable (or re-enable) it.

answered Oct 07 '22 14:10 by anonymous