
Ensuring Data completeness and validity on third party storage

I'm dealing with untrusted external storage and need to ensure the storage provider does not withhold any records in a query.

Example:

I have two trusted entities, TA and TB. These entities, and nobody else, should be able to alter the data stored in the cloud/untrusted storage. My solution is to equip TA and TB with key pairs for signing and to use a data structure comparable to a table with versions, say:

 Ver | Data | Signature       | Signee
  4  |  ... | (AAAAAAAAA)_TA  | TA
  3  |  ... | (ZZZZZZZZZ)_TB  | TB
  2  |  ... | (YYYYYYYYY)_TA  | TA
  1  |  ... | (XXXXXXXXX)_TA  | TA

So when I retrieve such a table from the storage provider, I can easily verify each signature and check whether the signee was allowed to change the table.

However, I would also like to check for record completeness. Say TA uploads version 4, but TB is only aware of all records up to Version 3. Now the storage provider may withhold Version 4 completely when TB queries it.

As there is no direct sidechannel between TA and TB, there is no way to exchange the current version. Is there a way to circumvent this?

I was thinking of periodically inserting dummy records to at least have some time certainty. However, this approach lacks scalability and would result in a lot of storage and signing overhead. What is the actual system property I am looking for? (It is hard to find research on something you do not know the name of.)

worenga asked Dec 06 '13 14:12



1 Answer

This problem is not fully solvable without dummy records:

Let's call the state when the current version is version 3 "state 3", and the state when the current version is version 4 "state 4". No matter how you sign these states - as long as the attacker is telling you "state 3 is the current one" (showing you the entire database as it was during state 3), you can't know whether this is true or whether state 4 has been created in the meantime.

Thus, you will have to periodically sign "no change" updates. You won't be able to avoid the signing overhead, but you don't have to store all of these. You make a separate "lastupdate" table:

 Signer | Last | Timestamp | Signature
  TA    |  4   | 2013-1... | (TA;4;2013-1...)_TA
  TB    |  3   | 2013-1... | (TB;3;2013-1...)_TB

meaning "Signer TA confirms that as of 2013-1..., the last version sent by me was 4". If the storage provider cannot show you a current confirmation from all signers that they didn't issue a newer version, you have to assume that it is hiding something (or something broke - either way, the data is not up to date). Any new signed statement replaces the older one from that signer, because the older one is irrelevant now.
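A minimal sketch of how a reader might check the "lastupdate" table, in Python. Everything named here is an assumption for illustration: the `signer;last;timestamp` message encoding, the `max_age` freshness window, the function names, and above all the use of HMAC with demo keys as a stand-in for real public-key signatures (the standard library has none; in the actual scheme a verifier would hold only public keys).

```python
import hashlib
import hmac
from datetime import datetime, timedelta, timezone

# ASSUMPTION: HMAC with demo keys stands in for real public-key signatures.
# In the scheme above a verifier would hold only the signers' public keys.
DEMO_KEYS = {"TA": b"demo-key-ta", "TB": b"demo-key-tb"}

def sign_lastupdate(signer, last, timestamp):
    """Sign the statement 'signer;last;timestamp' (hypothetical encoding)."""
    message = f"{signer};{last};{timestamp.isoformat()}".encode()
    return hmac.new(DEMO_KEYS[signer], message, hashlib.sha256).hexdigest()

def is_up_to_date(lastupdate_rows, visible_versions, now, max_age):
    """Return True only if every signer has a valid, recent confirmation
    and the query result contains the version each signer confirms.

    lastupdate_rows:  iterable of (signer, last, timestamp, signature)
    visible_versions: set of (signer, version) pairs present in the result
    """
    confirmed = set()
    for signer, last, timestamp, signature in lastupdate_rows:
        if signer not in DEMO_KEYS:
            continue  # unknown signer: ignore the row
        expected = sign_lastupdate(signer, last, timestamp)
        if not hmac.compare_digest(expected, signature):
            continue  # forged or corrupted row
        if now - timestamp > max_age:
            continue  # confirmation too old to prove freshness
        if (signer, last) not in visible_versions:
            return False  # a confirmed version is missing from the result
        confirmed.add(signer)
    # Fresh only if *all* trusted signers have confirmed recently.
    return confirmed == set(DEMO_KEYS)
```

The guarantee is only as tight as `max_age`: the provider can still replay a confirmation that is younger than the window, so the window bounds how stale the data can be.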

Now, if you don't have just one versioned "thing", but a large number of them, you don't necessarily have to have one such dummy log per "thing". For example, you could calculate a hash (or hash tree) over all the most current lines by your signer (e.g. "Thing A, Version 3. Thing B, Version 7. Thing C, Version 2.") and then just put the hash or the root of the hash tree in the lastupdate table.
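As a rough illustration of the flat-hash variant, here is a Python sketch that condenses one signer's latest versions into a single digest. The function name `state_digest`, the length-prefixed encoding, and the choice of SHA-256 are assumptions for this example, not part of the scheme described above.

```python
import hashlib

def state_digest(latest_versions):
    """Hash a signer's current state, e.g. {"Thing A": 3, "Thing B": 7}.

    Names are sorted so the digest does not depend on dict order, and each
    name is length-prefixed so different states cannot encode identically.
    """
    h = hashlib.sha256()
    for name in sorted(latest_versions):
        encoded = name.encode("utf-8")
        h.update(len(encoded).to_bytes(4, "big"))
        h.update(encoded)
        h.update(latest_versions[name].to_bytes(8, "big"))
    return h.hexdigest()
```

The signer would put this digest into the "lastupdate" row in place of a single version number. A reader recomputes it from the records the provider returned and compares. A hash tree would serve the same purpose while letting a reader check one "thing" without fetching all of them.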

If you really want to avoid additional signatures, and some things get updated all the time, you could include the hash and timestamp in the signatures of the version records you sign - the most current signed record would then be sufficient proof for freshness, and if it were too old, you could still use the "lastupdate" table. This is not worth the complexity, IMHO.

Jan Schejbal answered Oct 20 '22 15:10