Finding Metadata Inconsistencies in Distributed File Systems via Cross-Node Operation Modeling

Authors: 

Fuchen Ma, Yuanliang Chen, Yuanhang Zhou, and Zhen Yan, Tsinghua University; Hao Sun, ETH Zurich; Yu Jiang, Tsinghua University

Abstract: 

Metadata consistency is crucial for distributed file systems (DFSes) as it ensures that different clients have a consistent view of the data. However, DFSes are inherently error-prone, leading to metadata inconsistencies. Though rare, such inconsistencies can have severe consequences, including data loss, service failures, and permission violations. Unfortunately, there is limited understanding of metadata inconsistency characteristics, let alone an effective method for detecting them.

This paper presents a comprehensive study of metadata inconsistencies reported over the past five years in four widely used DFSes. The study yields two key findings: 1) Metadata inconsistencies are mainly triggered by interrelated cross-node file operations rather than by system faults. 2) The root cause of these inconsistencies lies mainly in the metadata conflict resolution process. Inspired by these findings, we propose Horcrux, a fuzzing framework for detecting metadata inconsistencies in DFSes. Horcrux uses cross-node operation modeling to reduce the infinite space of input combinations to a manageable one. In this way, Horcrux captures implicit relationships between cross-node operations and triggers more conflict resolution logic. To date, Horcrux has detected 10 previously unknown metadata inconsistencies. In addition, Horcrux covers 20.29% to 146.21% more conflict resolution code than state-of-the-art tools.
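To make the idea of interrelated cross-node operations more concrete, the following is a minimal Python sketch. It is not Horcrux's implementation. It assumes a toy setting in which each node keeps its own view of the metadata, and every name in it (`generate_cross_node_ops`, `find_inconsistencies`, the operation names, the `mode` field) is hypothetical. The generator biases each new operation toward a path that a different node touched last, which is one simple way to narrow the input space to sequences likely to exercise conflict resolution. The checker then reports paths whose metadata differs between node views.

```python
import random
from typing import Dict, List, Optional, Tuple

# Hypothetical operation record: (node id, operation name, path).
Op = Tuple[int, str, str]

# Hypothetical set of metadata-changing operations.
OP_NAMES = ["create", "chmod", "truncate", "unlink"]


def generate_cross_node_ops(num_nodes: int, paths: List[str],
                            length: int, seed: int = 0) -> List[Op]:
    """Generate operations that prefer paths last touched by another node."""
    rng = random.Random(seed)
    ops: List[Op] = []
    last_node: Dict[str, int] = {}  # path -> node that touched it last
    for _ in range(length):
        node = rng.randrange(num_nodes)
        # Paths whose most recent operation came from a different node.
        related = [p for p, n in last_node.items() if n != node]
        path = rng.choice(related) if related else rng.choice(paths)
        ops.append((node, rng.choice(OP_NAMES), path))
        last_node[path] = node
    return ops


def find_inconsistencies(views: Dict[int, Dict[str, dict]]) -> List[str]:
    """Return the sorted paths whose metadata differs between node views."""
    all_paths = set()
    for view in views.values():
        all_paths.update(view)
    inconsistent = []
    for path in sorted(all_paths):
        # A missing entry is represented as None and counts as a difference.
        entries: List[Optional[dict]] = [v.get(path) for v in views.values()]
        if any(e != entries[0] for e in entries[1:]):
            inconsistent.append(path)
    return inconsistent
```

In a real fuzzer, the generated operations would be issued against actual DFS clients and the views would be collected from each node after the system settles. Here the views are plain dictionaries, so the sketch only illustrates the shape of the generation and checking steps.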
