There is an interesting discussion here: “PostgreSQL’s fsync() surprise”. The PostgreSQL developers got some nasty surprises about how fsync() works on Linux, which means they could lose data without any errors being reported. NetBSD and OpenBSD can behave in the same way, but FreeBSD does not. It is a bit scary to think that even database developers have trouble writing data to disk safely.

Included in the article is a link to a post by Dave Chinner, the XFS maintainer, who points out that the close() call can return errors from previous writes. How many people handle errors from close()?
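To make the point about close() concrete, here is a minimal sketch in Python of a write path that checks every step, including fsync() and close(). The function name `durable_write` is hypothetical and chosen for illustration; it is not taken from PostgreSQL or from the linked posts. It also assumes, following the discussion above, that a failed fsync() should be treated as lost data rather than retried, since on Linux a second fsync() may report success even though the earlier data never reached the disk.

```python
import os


def durable_write(path: str, data: bytes) -> None:
    """Write data to path and raise OSError if any step reports a failure.

    Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write() may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Ask the kernel to flush the file to stable storage.
        # A failure here raises OSError (for example EIO).
        os.fsync(fd)
    except OSError:
        # Assumption: treat the data as lost and do not retry fsync().
        # Close the descriptor to avoid a leak, then report the first error.
        try:
            os.close(fd)
        except OSError:
            pass
        raise
    # close() can itself report an error from earlier writes,
    # so it is deliberately left outside any "ignore errors" handler.
    os.close(fd)
```

A caller would wrap `durable_write("/some/path", payload)` in its own `try`/`except OSError` and decide what to do with the failure; what matters is that an error from close() reaches the caller instead of being silently dropped.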

If the problem is scary, then the solution is even scarier: AIO+DIO (asynchronous I/O and direct I/O); here is an example of some AIO. It is interesting to see that AIO+DIO is still evolving, with new additions to the kernel like RWF_NOWAIT. Using it leads to non-portable code, as one of the PostgreSQL developers points out.

There is a post on how the different operating systems behave, with some interesting phrases in it: “EIO lies” and “EIO tells the truth”. In distributed systems, when a component “lies” it is known as a Byzantine fault, after the famous Byzantine Generals problem (see here or here). A real-world example of something that might be classified as a Byzantine fault is this. In practice people do not use the term Byzantine fault for these types of events; they are just classified as bugs. Perhaps talking about Byzantine faults only makes sense in the design phase.
