Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is an SSTable?

People also ask

What is SSTable in Google?

Tablets are stored on Colossus, Google's file system, in SSTable format. An SSTable provides a persistent, ordered immutable map from keys to values, where both keys and values are arbitrary byte strings. Each tablet is associated with a specific Bigtable node.

What is a Cassandra SSTable?

SSTables are the immutable data files that Cassandra uses for persisting data on disk. As SSTables are flushed to disk from memtables or are streamed from other nodes, Cassandra triggers compactions which combine multiple SSTables into one. Once the new SSTable has been written, the old SSTables can be removed.

Why is SSTable immutable?

SSTables are immutable. Instead of overwriting existing rows with inserts or updates, Cassandra writes new timestamped versions of the inserted or updated data in new SSTables. Cassandra does not perform deletes by removing the deleted data: instead, Cassandra marks it with tombstones.

What is a Memtable?

1. A memtable is basically a write-back cache of data rows that can be looked up by key i.e. unlike a write-through cache, writes are batched up in the memtable until it is full, when a memtable is full, and it is written to disk as SSTable. Memtable is an in-memory cache with content stored as key/column.


Sorted Strings Table (borrowed from google) is a file of key/value string pairs, sorted by keys


"An SSTable provides a persistent,ordered immutable map from keys to values, where both keys and values are arbitrary byte strings. Operations are provided to look up the value associated with a specified key, and to iterate over all key/value pairs in a specified key range. Internally, each SSTable contains a sequence of blocks (typically each block is 64KB in size, but this is configurable). A block index (stored at the end of the SSTable) is used to locate blocks; the index is loaded into memory when the SSTable is opened. A lookup can be performed with a single disk seek: we first find the appropriate block by performing a binary search in the in-memory index, and then reading the appropriate block from disk. Optionally, an SSTable can be completely mapped into memory, which allows us to perform lookups and scans without touching disk."


A tablet is stored in the form of SSTables.

SSTable (directly mapped to GFS) is key-value based immutable storage. It stores chunks of data, each is of 64KB.

Definitions:

  • Index of the keys: key and starting location
  • Chunk is a storage unit in GFS, replica management are by chunk

  • SSTable (engl. Sorted Strings Table) is a file of key/value string pairs, sorted by keys.

  • An SSTable provides a persistent,ordered immutable map from keys to values, where both keys and values are arbitrary byte strings.

  • Internally, each SSTable contains a sequence of blocks (typically
    each block is 64KB in size, but this is configurable).


SSTable means "sorted string table" based on key-value pair.In Cassandra, SSTables are immutable and sorted by keys.