Guarantees in write ahead logging implementation

Tags: c, algorithm, file-io

If one issues a sequence of write(2) calls on Linux/Unix, separated by fdatasync(2), fsync(2), or sync(2), is it guaranteed that the first write() is committed to disk before the second write()? The following SO post seems to say that no such guarantee can be given, since multiple caching layers are involved. For database systems that guarantee consistency this matters: in WAL (write-ahead logging) recovery, the log records must be persisted on disk before the data itself is changed, so that after an application or system failure you can revert to the last known consistent state. How is this ensured or implemented in an actual database system?

asked May 24 '12 04:05

pjay

1 Answer

The sync() system call is practically no help whatsoever; it promises to schedule the write-to-disk operations, but that's about all.

The normal technique is to set the appropriate flags when you open() the disk file: O_DSYNC, O_RSYNC, O_SYNC. However, the fsync() and fdatasync() functions achieve very nearly the same effect. You can also look at direct I/O (O_DIRECT on Linux, directio() on Solaris), which is often supported, though it is not standardized by POSIX at all.
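
As a rough illustration of the two approaches just described, the sketch below appends a record either through a descriptor opened with O_DSYNC or through an ordinary descriptor followed by an explicit fdatasync(). The function names (write_all, append_with_odsync, append_with_fdatasync) are made up for this example, and it assumes a Linux-like system where both O_DSYNC and fdatasync() are available.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write all len bytes, retrying on short writes
 * and EINTR. Returns 0 on success, -1 on error. */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Variant A: request synchronized data writes at open() time.
 * Each write() returns only once the O/S reports the data as written. */
int append_with_odsync(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND | O_DSYNC, 0644);
    if (fd < 0)
        return -1;
    int rc = write_all(fd, buf, len);
    if (close(fd) != 0)
        rc = -1;
    return rc;
}

/* Variant B: ordinary open(), then an explicit fdatasync() after writing. */
int append_with_fdatasync(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0)
        return -1;
    int rc = write_all(fd, buf, len);
    if (rc == 0 && fdatasync(fd) != 0)
        rc = -1;            /* the data may not be durable: report failure */
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Variant B lets a caller batch several write() calls behind a single fdatasync(), which is one reason an explicit sync call can be preferable to O_DSYNC when many small records are written together.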

Ultimately, the DBMS relies on the O/S to ensure that data written and synchronized to disk is secure. As long as the device always returns what the DBMS last wrote, it isn't critical whether the data is on the actual disk yet or still held in a cache (because the cache is non-volatile, or something like that). If, on the other hand, you have NAS (network attached storage) that doesn't guarantee that what you last wrote (and were told was safe on disk) is returned when you read it, then your DBMS can suffer if it has to do recovery. So you choose where you store your DBMS with care, making sure the storage works sensibly. If the storage does not behave sufficiently like the hypothetical disk, you can end up losing data.

answered Oct 13 '22 11:10

Jonathan Leffler
