Fast18 Sun
Fast18 Sun
Fast18 Sun
Storage Applications
Kuei Sun, Daniel Fryer, Joseph Chu, Matthew Lakier, Angela Demke Brown,
and Ashvin Goel, University of Toronto
https://www.usenix.org/conference/fast18/presentation/sun
100 16th USENIX Conference on File and Storage Technologies USENIX Association
Tool Name Structure Field Description
dumpe2fs super block s_creator_os index out of bound error during OS name lookup
dump.f2fs super block log_blocks_per_seg index out of bound error while building nat bitmap
super block segment_count_main null pointer dereference after calloc fails
super block cp_blkaddr double free error during error handling (no valid checkpoint)
summary block n_nats index out of bound error during nid lookup
inode i_namelen index out of bound error when adding null character to end of name
Table 5: List of segmentation faults found during type-specific corruption experiments.
USENIX Association 16th USENIX Conference on File and Storage Technologies 101
These results show that storage applications using our references to external objects. Our annotation language
generated library can provide reasonable performance overcomes this limitation by explicitly annotating point-
improvements without changing the file system code. ers, which defines how file system metadata reference
each other. We also provide support for address spaces,
so that address values can be mapped to user-specified
7 Related Work
physical locations on disk.
A large body of work has focused on storage-layer ap- Several projects have explored C extensions for ex-
plications that perform file-system specific processing pressing additional semantic information [19, 35, 29].
for improving performance or reliability. Semantically- CCured [19] enables type and memory safety, and the
smart disks [24] used probing to gather detailed knowl- Deputy Type System [35] prevents out-of-bound array
edge of file system behavior, allowing functionality or errors. Both projects annotate source code, perform
performance to be enhanced transparently at the block static analysis, and add runtime checks, but they are de-
layer. The probing was designed for Ext4-like file sys- signed for in-memory structures.
tems and would likely require changes for copy-on-write Formal specification approaches for file systems [1, 5]
and log-structured file systems. Spiffy annotations avoid require building a new file system from scratch, while
the need for probing, helping provide accurate block type our work focuses on building tools for existing file sys-
information based on runtime interpretation. tems. Chen et al. [5] use logical address spaces as ab-
I/O shepherding [12] improves reliability by using stractions for writing higher-level file system specifica-
file structure information to implement checksumming tions. This idea inspired our use of an address space type
and replication. Block type information is provided to for specifying pointers. Another method for specifying
the storage layer I/O shepherd by modifying the file pointers is by defining paths that enable traversing the
system and the buffer-cache code. Our approach en- metadata tree to locate a metadata object, such as finding
ables I/O shepherding without requiring these changes. the inode structure from an inode number [14, 10]. These
Also, unlike I/O shepherding, Spiffy allows interpreting approaches focus on the correctness of file-system oper-
block contents, enabling more powerful policies, such as ations at the virtual file system layer, whereas our goal is
caching the files of specific users. to specify the physical structures of file systems.
A type-safe disk extends the disk interface by expos-
ing primitives for block allocation and pointer relation- 8 Conclusion
ships [23], which helps enforce invariants such as pre-
venting access to unallocated blocks, but this interface Spiffy is an annotation language for specifying the on-
requires extensive file system modifications. We believe disk file system data structures. File system developers
that our runtime interpretation approach allows enforcing annotate their data structures using Spiffy, which enables
such type-safety invariants on existing file systems. generating a library that allows parsing and traversing file
Serialization of structured data has been explored system data structures correctly.
through interface languages such as ASN.1 [25] and Pro- We have shown the generality of our approach by an-
tocol Buffers [31], which allow programmers to define notating three vastly different file systems. The anno-
their data structures so that marshaling routines can be tated file system code serves as detailed documentation
generated for them. However, the binary serialization for the metadata structures and the relationships between
format for the structures is specified by the protocol and them. File-system aware storage applications can use the
not under the control of the programmer. As a result, Spiffy libraries to improve their resilience against pars-
these languages cannot be used to interpret the existing ing bugs, and to reduce the overall programming effort
binary format of a file system. needed for supporting file-system specific logic in these
Data description languages such as Hammer [21] and applications. Our evaluation suggests that applications
PADS [7] allow fine-grained byte-level data formats to using the generated libraries perform reasonably well.
be specified. However, they have limited support for non- We believe our approach will enable interesting applica-
sequential processing, and thus their parsers cannot inter- tions that require an understanding of storage structures.
pret file system I/O, where a graph traversal is required
rather than a sequential scan. Furthermore, with online Acknowledgements
interpretation, this traversal is performed on a small part
of the graph, and not on the entire data. We thank the anonymous reviewers and our shepherd,
Nail [3] shares many goals with our work. Its grammar André Brinkmann, for their valuable feedback. We spe-
provides the ability to specify arbitrarily computed fields. cially thank Michael Stumm, Ding Yuan, Mike Qin, and
It also supports non-linear parsing, but its scope is lim- Peter Goodman for their insightful suggestions. This
ited to a single packet or file, and so it does not support work was supported by NSERC Discovery.
102 16th USENIX Conference on File and Storage Technologies USENIX Association
References [18] M ESNIER , M., C HEN , F., L UO , T., AND A KERS , J. B. Differen-
tiated storage services. In Proc. of the Symposium on Operating
[1] A MANI , S., RYZHYK , L., AND M URRAY, T. Towards a fully Systems Principles (SOSP) (2011), pp. 57–70.
verified file system, 2012. EuroSys Doctoral Workshop 2012.
[19] N ECULA , G. C., M C P EAK , S., AND W EIMER , W. Ccured:
[2] BAIRAVASUNDARAM , L. N., RUNGTA , M., AGRAWA , N., type-safe retrofitting of legacy code. In Proceedings of the 29th
A RPACI -D USSEAU , A. C., A RPACI -D USSEAU , R. H., AND ACM SIGPLAN-SIGACT symposium on Principles of program-
S WIFT, M. M. Analyzing the effects of disk-pointer corrup- ming languages (New York, NY, USA, 2002), POPL ’02, ACM,
tion. In 2008 IEEE International Conference on Dependable Sys- pp. 128–139.
tems and Networks With FTCS and DCC (DSN) (2008), IEEE,
pp. 502–511. [20] OVERSTREET, K. Linux bcache, Aug. 2016. https://bcache.
evilpiepirate.org/.
[3] BANGERT, J., AND Z ELDOVICH , N. Nail: A practical tool for
parsing and generating data formats. In 11th USENIX Sympo- [21] PATTERSON , M., AND H IRSCH , D. Hammer parser generator,
sium on Operating Systems Design and Implementation (OSDI march 2014. https://github.com/UpstandingHackers/
14) (2014), pp. 615–628. hammer.
[4] B UCKEYE , B., AND L ISTON , K. Recovering deleted files in [22] RONACHER , A. Jinja2 documentation, 2011.
linux. http://collaboration.cmc.ec.gc.ca/science/ [23] S IVATHANU , G., S UNDARARAMAN , S., AND Z ADOK , E. Type-
rpn/biblio/ddj/Website/articles/SA/v11/i04/a9. safe disks. In Proc. of the USENIX Symposium on Operating
htm, 2006. Systems Design and Implementation (OSDI) (2006), pp. 15–28.
[5] C HEN , H., Z IEGLER , D., C HAJED , T., C HLIPALA , A., [24] S IVATHANU , M., P RABHAKARAN , V., P OPOVICI , F. I.,
K AASHOEK , M. F., AND Z ELDOVICH , N. Using crash hoare D ENEHY, T. E., A RPACI -D USSEAU , A. C., AND A RPACI -
logic for certifying the fscq file system. In Proceedings of the D USSEAU , R. H. Semantically-smart disk systems. In USENIX
25th Symposium on Operating Systems Principles (2015), ACM, Conference on File and Storage Technologies (FAST) (2003),
pp. 18–37. pp. 73–88.
[6] DANIAL , A. Cloc–count lines of code. Open source (2009). [25] S TEEDMAN , D. Abstract syntax notation one (ASN. 1): the tuto-
http://cloc.sourceforge.net/. rial and reference. Technology appraisals, 1993.
[7] F ISHER , K., AND WALKER , D. The pads project: an overview. [26] S TEFANOVICI , I., T HERESKA , E., O’S HEA , G., S CHROEDER ,
In Proceedings of the 14th International Conference on Database B., BALLANI , H., K ARAGIANNIS , T., ROWSTRON , A., AND
Theory (2011), ACM, pp. 11–17. TALPEY, T. Software-defined caching: Managing caches in
[8] F RYER , D., S UN , K., M AHMOOD , R., C HENG , T., B ENJAMIN , multi-tenant data centers. In Proceedings of the Sixth ACM Sym-
S., G OEL , A., AND B ROWN , A. D. Recon: Verifying file system posium on Cloud Computing (2015), ACM, pp. 174–181.
consistency at runtime. ACM Transactions on Storage 8, 4 (Dec. [27] T ECH N ET, M. How to convert fat disks to ntfs.
2012), 15:1–15:29. https://technet.microsoft.com/en-us/library/
[9] G AMMA , E. Design patterns: elements of reusable object- bb456984.aspx.
oriented software. Pearson Education India, 1995. [28] T OM WARREN. Apple is upgrading millions of
[10] G ARDNER , P., N TZIK , G., AND W RIGHT, A. Local reasoning iOS devices to a new modern file system today.
for the posix file system. In European Symposium on Program- https://www.theverge.com/2017/3/27/15076244/
ming Languages and Systems (2014), Springer, pp. 169–188. apple-file-system-apfs-ios-10-3-features. Ac-
cessed: 2017-03-27.
[11] G EDAK , C. Manage Partitions with GParted How-to. Packt Pub-
lishing Ltd, 2012. [29] T ORVALDS , L., T RIPLETT, J., AND L I , C. Sparse–a semantic
parser for c. see http://sparse.wiki.kernel.org (2007).
[12] G UNAWI , H. S., P RABHAKARAN , V., K RISHNAN , S., A RPACI -
D USSEAU , A. C., AND A RPACI -D USSEAU , R. H. Improv- [30] T S ’ O , T. E2fsprogs: Ext2/3/4 filesystem utilities. http://
ing file system reliability with I/O shepherding. In Proc. of e2fsprogs.sourceforge.net/, 2017.
the Symposium on Operating Systems Principles (SOSP) (2007), [31] VARDA , K. Protocol buffers: Google’s data interchange for-
pp. 293–306. mat. Google Open Source Blog, Available at least as early as
[13] G UNAWI , H. S., R AJIMWALE , A., A RPACI -D USSEAU , A. C., Jul (2008).
AND A RPACI -D USSEAU , R. H. SQCK: A declarative file sys- [32] W ILSON , A. The new and improved filebench. In Proceed-
tem checker. In Proc. of the USENIX Symposium on Operating ings of 6th USENIX Conference on File and Storage Technologies
Systems Design and Implementation (OSDI) (Dec. 2008). (2008). https://github.com/filebench/filebench/.
[14] H ESSELINK , W. H., AND L ALI , M. I. Formalizing a hierarchical [33] YANG , J., T WOHEY, P., E NGLER , D., AND M USUVATHI , M.
file system. Electronic Notes in Theoretical Computer Science Using model checking to find serious file system errors. ACM
259 (2009), 67–85. Transactions on Computer Systems (TOCS) 24, 4 (2006), 393–
[15] L EE , C., S IM , D., H WANG , J., AND C HO , S. F2fs: A new file 423.
system for flash storage. In 13th USENIX Conference on File and [34] Z ALEWSKI , M. American fuzzy lop. http://lcamtuf.
Storage Technologies (FAST 15) (2015), pp. 273–286. coredump.cx/afl/, 2016.
[16] L U , L., A RPACI -D USSEAU , A. C., A RPACI -D USSEAU , R. H., [35] Z HOU , F., C ONDIT, J., A NDERSON , Z., BAGRAK , I., E N -
AND L U , S. A study of Linux file system evolution. In Proc. NALS , R., H ARREN , M., N ECULA , G., AND B REWER , E.
of the USENIX Conference on File and Storage Technologies Safedrive: Safe and recoverable extensions using language-based
(FAST) (Feb. 2013). techniques. In Proceedings of the 7th symposium on Operating
[17] M A , A., D RAGGA , C., A RPACI -D USSEAU , A. C., AND systems design and implementation (2006), USENIX Associa-
A RPACI -D USSEAU , R. H. ffsck: The fast file system checker. tion, pp. 45–60.
In Proc. of the USENIX Conference on File and Storage Tech-
nologies (FAST) (Feb. 2013).
USENIX Association 16th USENIX Conference on File and Storage Technologies 103