I understand the part of the paper where they trick the CPU into speculatively loading part of the victim's memory into the CPU cache. The part I do not understand is how they retrieve it from the cache.


How does the Spectre attack read the cache it tricked the CPU into loading?

2 Answers

They don't retrieve it directly: bytes read out of bounds are never "retired" by the CPU, so the attacker cannot see them in the attack.

One attack vector is to do the "retrieval" one bit at a time. First the CPU cache is prepared (flushed where it has to be), and the branch predictor is "taught" that an if branch is taken while its condition depends on non-cached data. The CPU then speculatively executes the couple of lines inside the if scope, including an out-of-bounds access (yielding a byte B), and immediately accesses some authorized, non-cached array at an index that depends on one bit of the secret B (the attacker never sees B directly). Finally, the attacker reads the same authorized array at the index that corresponds to the B bit being zero. If that read is fast, the data was already in the cache, meaning the B bit is zero. If the read is (relatively) slow, the CPU had to load that data into its cache, meaning it had not been loaded earlier, so the B bit was one.

For instance, assume that Cond and all of ValidArray are not cached, and that LargeEnough is big enough to ensure the CPU will not load both ValidArray[ valid-index + 0 ] and ValidArray[ valid-index + LargeEnough ] into its cache in one shot:

if ( Cond ) {
   // the next 2 lines are only speculatively executed
   V = SomeArray[ out-of-bounds-attacked-index ]
   Dummy = ValidArray [ valid-index + ( V & bit ) * LargeEnough ]
}

// the next code is always retired (executed, not only speculatively)

t1 = get_cpu_precise_time()
Dummy2 = ValidArray [ valid-index ]
diff = get_cpu_precise_time() - t1

if (diff > SOME_CALCULATED_VALUE) {
   // slow reload: V & bit was nonzero, i.e. equal to bit (1, or 2, or 4, or ... 128)
}
else {
   // fast reload: V & bit was 0
}

where bit is tried successively: first 0x01, then 0x02, and so on up to 0x80. By measuring the "time" (number of CPU cycles) that the code after the if block takes for each bit, the value of V is revealed:

If ValidArray[ valid-index + 0 ] is in the cache, V & bit is 0.
Otherwise, V & bit is bit.
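The bit-by-bit procedure above can be sketched as a toy model in C. This is only a simulation under stated assumptions, not attack code: the cache is modelled as an array of flags, one per 64-byte line, valid-index is taken to be 0, LargeEnough is taken to be one cache line, and the names line_cached, flush_all, touch, is_cached and recover_byte are hypothetical helpers introduced here for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LINE_SIZE    64u            /* assumed cache line size in bytes */
#define LARGE_ENOUGH LINE_SIZE      /* one line apart is enough in this model */
#define NUM_LINES    (1u + 0x80u)   /* offsets 0 .. 0x80 * LARGE_ENOUGH */

/* Toy cache: one flag per line of ValidArray (hypothetical model). */
static bool line_cached[NUM_LINES];

static void flush_all(void)
{
    memset(line_cached, 0, sizeof line_cached);
}

static void touch(size_t byte_offset)
{
    assert(byte_offset / LINE_SIZE < NUM_LINES);
    line_cached[byte_offset / LINE_SIZE] = true;
}

static bool is_cached(size_t byte_offset)
{
    return line_cached[byte_offset / LINE_SIZE];
}

/* Rebuild the hidden byte from eight single-bit rounds (0x01 .. 0x80). */
static uint8_t recover_byte(uint8_t hidden_v)
{
    uint8_t recovered = 0;
    for (unsigned shift = 0; shift < 8; shift++) {
        uint8_t bit = (uint8_t)(1u << shift);

        flush_all();                                  /* prepare the cache */
        touch((size_t)(hidden_v & bit) * LARGE_ENOUGH); /* models the speculative access */

        /* Models the timed reload of ValidArray[valid-index]:
         * not cached means "slow", so this bit of V is 1. */
        if (!is_cached(0)) {
            recovered |= bit;
        }
    }
    return recovered;
}
```

Calling recover_byte(0xA5) walks through the eight rounds and returns the hidden value again. In a real attack the is_cached check would be replaced by the cycle-count measurement shown in the pseudocode.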

This takes time: for each bit the CPU L1 cache has to be prepared again, and the same bit has to be tried several times to minimize timing errors.
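One way to combine several tries of the same bit is to take the median of the measured cycle counts and compare it with the threshold. The following is a minimal sketch of that idea, not the method of any particular proof of concept; median_cycles and bit_is_set are hypothetical names, and the threshold has to be calibrated per machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* qsort comparator for unsigned 64-bit cycle counts. */
static int cmp_u64(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a;
    uint64_t y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/* Median of n cycle counts; sorts the caller's array in place. */
static uint64_t median_cycles(uint64_t *samples, size_t n)
{
    assert(n > 0);
    qsort(samples, n, sizeof samples[0], cmp_u64);
    return samples[n / 2];
}

/* A bit is reported as set when the typical reload was slow. */
static bool bit_is_set(uint64_t *samples, size_t n, uint64_t threshold)
{
    return median_cycles(samples, n) > threshold;
}
```

With made-up samples such as {40, 45, 300, 42, 41} and a threshold of 100, the single outlier of 300 cycles is ignored and the bit is reported as 0. The median is used here because one interrupted measurement can distort an average badly.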

Then the correct attack "offset" has to be determined to read an interesting area.

Clever attack, but not so easy to implement.

128

answered Jan 04 '23 08:01

Déjà vu

I would like to contribute one piece of information to the already existing answers, namely how the attacker can actually probe an array from the victim process in the probing phase. This is a problem, because Spectre (unlike Meltdown) runs in the victim's process and even through the cache the attacker cannot just query arrays from other processes.

In short: With Spectre the FLUSH+RELOAD attack needs KSM or another method for shared memory. That way the attacker (to my understanding) can replicate the relevant parts of the victim's memory in his own address space and thus will be able to query the cache for the access times on the probe array.

Long Explanation:

One big difference between Meltdown and Spectre is that in Meltdown the whole attack runs in the address space of the attacker. Thus, it's quite clear how the attacker can both cause changes to the cache and read the cache at the same time. With Spectre, however, the attack itself runs in the process of the victim. By using so-called gadgets, the attacker makes the victim execute code that encodes the secret data as the index of an access to a probe array, e.g. with a = array2[array1[x] * 4096].
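To make the role of the probe array concrete, here is a toy model in C of what the gadget a = array2[array1[x] * 4096] leaves behind and how the reload phase reads it back. It is a sketch under assumptions: the cache is simulated by one flag per 4096-byte slot of array2, and slot_cached, flush_probe_array, gadget_access and find_cached_slot are hypothetical names, not part of any real proof of concept.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STRIDE 4096u   /* spacing used in array2[array1[x] * 4096] */
#define SLOTS  256u    /* one slot per possible byte value */

/* Toy cache state of array2: one flag per slot (hypothetical model). */
static bool slot_cached[SLOTS];

/* Flush phase: no slot of array2 is cached afterwards. */
static void flush_probe_array(void)
{
    memset(slot_cached, 0, sizeof slot_cached);
}

/* Models the cache side effect of the speculative access
 * array2[secret * 4096]: exactly one slot becomes cached. */
static void gadget_access(uint8_t secret)
{
    size_t offset = (size_t)secret * STRIDE;
    assert(offset / STRIDE < SLOTS);
    slot_cached[offset / STRIDE] = true;
}

/* Reload phase: the index of the cached slot is the secret byte.
 * Returns -1 if nothing is cached. */
static int find_cached_slot(void)
{
    for (unsigned value = 0; value < SLOTS; value++) {
        if (slot_cached[value]) {
            return (int)value;
        }
    }
    return -1;
}
```

After flush_probe_array() and gadget_access(0x53), find_cached_slot() returns 0x53. In a real attack the check of slot_cached would be a timed memory access, which is only possible if the attacker can reach the same cache lines as array2.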

The proofs of concept that have been linked in other answers implement the basic branching/speculation concept of Spectre, but all code seems to run in the same process. Thus, of course, it is no problem to have gadget code access array2 and then read array2 again for probing. In a real-world scenario, however, the victim process would access array2, which is also located in the victim process.

Now, the problem, which in my opinion the paper does not explain well, is that the attacker has to be able to probe the cache for the array in the victim's address space (array2). Theoretically, this could be done either from within the victim again or from the attacker's address space.

The original paper only describes it vaguely, probably because it was clear to the authors:

"For the final phase, the sensitive data is recovered. For Spectre attacks using Flush+Reload or Evict+Reload, the recovery process consists of timing the access to memory addresses in the cache lines being monitored."

"To complete the attack, the adversary measures which location in array2 was brought into the cache, e.g., via Flush+Reload or Prime+Probe."

Accessing the cache for array2 from within the victim's address space would be possible, but it would require another gadget and the attacker would have to be able to trigger execution of this gadget. This seemed quite unrealistic to me, especially in Spectre-PHT.

In the paper "Detecting Spectre Attacks by identifying Cache Side-Channel Attacks using Machine Learning" I found my missing explanation:

"In order for the FLUSH+RELOAD attack to work in this case, three preconditions have to be met. [...] But most importantly the CPU must have a mechanism like Kernel Same-page Merging (KSM) [4] or Transparent Page Sharing (TPS) [54] enabled [10]."

"KSM allows processes to share pages by merging different virtual addresses into the same page, if they reference the same physical address. It thereby increases the memory density, allowing for a more efficient memory usage. KSM was first implemented in Linux 2.6.32 and is enabled by default [33]."

Page sharing through KSM thus explains how the attacker can time accesses to array2, which would normally be reachable only from within the victim's process.

answered Jan 04 '23 10:01

aufziehvogel


Tags:

cpu-architecture

security

cpu

spectre

side-channel-attacks

NoSenseEtAl
