Resources

NOTE: Neither the IOTTA TWG nor SNIA vouches for the accuracy or reliability of any of the traces or other information provided below. Please contact us regarding any broken or inaccurate links.

Traces


System Call Traces

Coda File System Traces
Traces of all system call activity on 33 machines, collected between February 1991 and March 1993. The full data set is 24 GB; only a subset of it is available on-line. The site also has pointers to the DFSTrace tools that were used to collect this data. [Mummert96] provides extensive information about the design and implementation of the trace tool.

Drew Roselli's Traces
Several months of traces from three different environments: an academic research cluster, an instructional cluster used for student programming assignments, and a web server. The traces were collected in late 1996 and early 1997. Only a portion of the full set of traces is available on-line. A USENIX paper and a UC Berkeley technical report describe and analyze the traces [Roselli98] [Roselli00].

Kentucky Traces
Traces from two workstations, one covering seven days and the other ten days. The traces were collected circa 1993. The web page describes the traces and gives contact information for requesting the trace data via e-mail. This data was used in [Griffioen94].

Mambo Suite
System call traces of several parallel I/O workloads [Uysal97].

Seer Traces
Traces collected in 1996-1997 on 9 laptops in a CS research environment over periods varying from 1 to 10 months. These traces were described in [Kuenning97].

Sprite Traces
Eight days of traces collected in an academic research cluster during 1991. These traces were initially described and analyzed in [Baker91].


Network File System Traces

Berkeley Auspex Traces
One week of NFS traffic from 236 clients accessing an Auspex server in late 1993. They were gathered by Cliff Mather by snooping Ethernet packets on four subnets. The clients are the desktop workstations of the University of California at Berkeley Computer Science Division.

Harvard NFS Traces
Dan Ellard's Harvard traces are an extensive set of NFS traces covering many months, collected from a campus e-mail server and a departmental file server. Contact Dan Ellard (ellard@eecs.harvard.edu) for more information. [Ellard03a] [Ellard03b]


Parallel Traces

Los Alamos National Laboratory (LANL) Traces
LANL has released static file tree data (fsstats), including aggregate information on capacity, file and directory sizes, filename lengths, link counts, etc. The fsstats cover 9 anonymous parallel filesystems, ranging from 16 TB to 439 TB in total capacity used, with file counts ranging from 2,024,729 to 43,605,555.

Other Data Sets

ECMWF Data Handling Log Traces
Traces from the storage landscape of the European Centre for Medium-Range Weather Forecasts (ECMWF) with over 100 PB of archives. Traces were collected from Jan 1, 2012, to May 20, 2014.

OLTP Application and Search Engine I/O Traces
The UMass trace repository holds several storage traces, including two I/O traces from OLTP applications run at financial institutions, and three I/O traces from a popular search engine.

Plan 9 File System Traces
A time series of "snapshots" of the contents of the Plan 9 file servers at Bell Labs, taken once per day over a number of years.

Yahoo! Traces
The Yahoo Webscope Program holds multiple datasets of traces collected from Yahoo servers. These datasets are only for non-commercial use.


Tools and Documentation

Microsoft Event Tracing

Stony Brook University DataSeries Documentation Wiki

Computer Storage Systems Research Discussion Forum

Traces and Snapshots Public Archive


Storage Research Centers

Carnegie Mellon University
Parallel Data Lab (PDL)

San Diego Supercomputer Center (SDSC)

University of Minnesota
Digital Technology Center (DTC)
DTC Intelligent Storage Consortium (DISC)

Storage Performance Council (SPC)


Papers and Publications

Papers Relating to Traces

[Harter11] Tyler Harter, Chris Dragga, Michael Vaughn, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau.
A File is Not a File: Understanding the I/O Behavior of Apple Desktop Applications.
Department of Computer Sciences, University of Wisconsin, Madison. 2011.

[Ellard03b] Daniel Ellard, Margo Seltzer.
NFS Tricks and Benchmarking Traps.
Proceedings of the FREENIX Technical Conference, San Antonio, Texas. June, 2003.

[Ellard03a] Daniel Ellard, Jonathan Ledlie, Pia Malkani, Margo Seltzer.
Passive NFS Tracing of Email and Research Workloads.
Proceedings of the Second Annual USENIX File and Storage Technologies Conference, pp. 203-216, San Francisco, CA. March, 2003.

[Roselli00] Drew Roselli, Jacob R. Lorch, Thomas E. Anderson.
A Comparison of File System Workloads.
Proceedings of the 2000 USENIX Technical Conference, pp. 44 - 54. San Diego, CA. June, 2000.

[Vogels99] Werner Vogels.
File system usage in Windows NT 4.0.
Proceedings of the 17th Symposium on Operating System Principles, pp. 93 - 109. Kiawah Island Resort, SC. December, 1999.

[Douceur99] John R. Douceur, William J. Bolosky.
A Large-Scale Study of File-System Contents.
Proceedings of SIGMETRICS '99, pp. 59 - 70. Atlanta, GA. May, 1999.

[Kuenning97] Geoffrey H. Kuenning and Gerald J. Popek.
Automated Hoarding for Mobile Computers.
Proceedings of the 16th ACM Symposium on Operating Systems Principles, St. Malo, France, October 5-8, 1997.

[Uysal97] Mustafa Uysal, Anurag Acharya, Joel Saltz.
Requirements of I/O Systems for Parallel Machines: An Application-driven Study.
Technical Report, CS-TR-3802, University of Maryland, College Park. May 1997.

[Mummert96] L. Mummert, M. Satyanarayanan.
Long Term Distributed File Reference Tracing: Implementation and Experience.
Software - Practice and Experience, Vol. 26, No. 6, pp. 705 - 736. June, 1996.

[Blackwell95] Trevor Blackwell, Jeffrey Harris, Margo Seltzer.
Heuristic Cleaning Algorithms in Log-Structured File Systems.
Proceedings of the 1995 USENIX Technical Conference, pp. 277 - 288. New Orleans, LA. January, 1995.

[Griffioen94] Jim Griffioen, Randy Appleton.
Reducing File System Latency using a Predictive Approach.
Proceedings of the Summer 1994 USENIX Technical Conference, pp. 197 - 207. Boston, MA. June, 1994.

[Chiang93] Chi-ming Chiang, Matt W. Mutka.
Characteristics of User File Usage Patterns.
Journal of Systems and Software, Vol. 23, No. 3, pp. 257 - 268. December, 1993.

[Ruemmler93] Chris Ruemmler, John Wilkes.
UNIX Disk Access Patterns.
Proceedings of the Winter 1993 USENIX Technical Conference, pp. 405 - 420. San Diego, CA. January, 1993.

[Ramakrishnan92] K.K. Ramakrishnan, Prabuddha Biswas, Ramakrishna Karedla.
Analysis of File I/O Traces in Commercial Computing Environments.
Proceedings of SIGMETRICS '92, pp. 78 - 90. Newport, RI. June, 1992.

[Roselli98] Drew Roselli, Thomas E. Anderson.
Characteristics of File System Workloads.
University of California Berkeley Computer Science Division Technical Report UCB//CSD-98-1029. 1998.

[Shirriff92] Ken Shirriff, John K. Ousterhout.
A Trace-Driven Analysis of Name and Attribute Caching in a Distributed System.
Proceedings of the Winter 1992 USENIX Technical Conference, pp. 315 - 332. San Francisco, CA. January, 1992.

[Miller91] Ethan L. Miller, Randy H. Katz.
Input/Output Behavior of Supercomputing Applications.
Proceedings of the 1991 Conference on Supercomputing, pp. 567 - 576. Albuquerque, NM. November, 1991.

[Baker91] M. Baker, J. Hartman, M. Kupfer, K. Shirriff, and J. Ousterhout.
Measurements of a Distributed File System.
Proceedings of the 13th ACM Symposium on Operating Systems Principles, pp. 198 - 212. October 1991.

[Bozman91] G.P. Bozman, H.H. Ghannad, E.D. Weinberger.
A trace-driven study of CMS file references.
IBM Journal of Research and Development, Vol. 35, No. 5/6, pp. 815 - 828. September/November, 1991.

[Bennet91] J. Michael Bennet, Michael A. Bauer, David Kinchlea.
Characteristics of Files in NFS Environments.
Proceedings of the 1991 ACM Symposium on Small Systems, pp. 33 - 40. 1991.

[Biswas90] P. Biswas, K.K. Ramakrishnan.
File Access Characterization of VAX/VMS Environments.
Proceedings of the 10th International Conference on Distributed Computing Systems, pp. 227 - 234. Paris, France. May, 1990.

[Floyd86] Rick Floyd.
Short-Term File Reference Patterns in a UNIX Environment.
University of Rochester Computer Science Technical Report #177. March, 1986.

[Ousterhout85] J. Ousterhout, H. Costa, D. Harrison, J. Kunze, M. Kupfer, J. Thompson.
A Trace-Driven Analysis of the UNIX 4.2BSD File System.
Proceedings of the 10th Symposium on Operating System Principles, pp. 15 - 24. Orcas Island, WA. December, 1985.

[Satyanarayanan81] M. Satyanarayanan.
A Study of File Sizes and Functional Lifetimes.
Proceedings of the 8th Symposium on Operating System Principles, pp. 96 - 108. Pacific Grove, CA. December, 1981.

[Smith81] A. J. Smith.
Analysis of Long Term File Reference Patterns for Application to File Migration Algorithms.
IEEE Transactions on Software Engineering, Vol SE-7, No. 4, pp. 403 - 417. July, 1981.

Publications That Cite iotta.snia.org

The following publications cite iotta.snia.org as a source of trace data used in their research. They are organized in reverse chronological order. This list attempts to be comprehensive but is not complete; feel free to contact us to suggest additional entries.


[Venkataraman et al., 2013]
Kalyana Sundaram Venkataraman, Tong Zhang, Wenzhe Zhao, Hongbin Sun, and Nanning Zheng. Scheduling algorithms for handling updates in shingled magnetic recording. In Proceedings of the 8th IEEE International Conference on Networking, Architecture, and Storage, pages 205–214, Xi'an, Shaanxi, China, July 2013. IEEE.
[He et al., 2013]
Wanhui He, Nong Xiao, Fang Liu, Zhiguang Chen, and Yinjin Fu. DL-Dedupe: Dual-Level Deduplication Scheme for Flash-Based SSDs, volume 7901 of Lecture Notes in Computer Science, pages 4–15. Springer-Verlag, Berlin, Germany, June 2013.
[Hu et al., 2013]
Yang Hu, Hong Jiang, Dan Feng, Lei Tian, Hao Luo, and Chao Ren. Exploring and exploiting the multilevel parallelism inside SSDs for improved performance and endurance. IEEE Transactions on Computers, 62(6):1141–1155, June 2013. (doi:10.1109/TC.2012.60)
[Wei et al., 2013]
Qingsong Wei, Lingfang Zeng, Jianxi Chen, and Cheng Chen. A popularity-aware buffer management to improve buffer hit ratio and write sequentiality for solid-state drive. IEEE Transactions on Magnetics, 49(6):2786–2793, June 2013. (doi:10.1109/TMAG.2013.2249579)
[Yang et al., 2013a]
Jingpei Yang, Ned Plasson, Greg Gillis, Nisha Talagala, Swaminathan Sundararaman, and Robert Wood. HEC: Improving endurance of high performance flash-based cache devices. In Proceedings of the 6th ACM International Systems and Storage Conference (SYSTOR), Haifa, Israel, June 2013. (doi:10.1145/2485732.2485743)
[Yang et al., 2013b]
Ming-Chang Yang, Yuan-Hao Chang, Che-Wei Tsao, and Po-Chun Huang. New ERA: New efficient reliability-aware wear leveling for endurance enhancement of flash storage devices. In 50th ACM/EDAC/IEEE Design Automation Conference, Austin, TX, June 2013. IEEE.
[Abdurrab et al., 2013]
Abdul R. Abdurrab, Tao Xie, and Wei Wang. DLOOP: A flash translation layer exploiting plane-level parallelism. In Proceedings of the 27th IEEE International Symposium on Parallel & Distributed Processing, pages 908–918, Boston, MA, May 2013. IEEE.
[Prada et al., 2013]
Laura Prada, Alejandro Calderón, Javier Garcia, J. Daniel, and Jesús Carretero. A novel black-box simulation model methodology for predicting performance and energy consumption in commodity storage devices. Simulation Modelling Practice and Theory, 34:48–63, May 2013. (doi:10.1016/j.simpat.2013.01.006)
[Qin et al., 2013]
Yi Qin, Dan Feng, Wei Tong, Jingning Liu, Yang Hu, and Zhiming Zhu. Per-file secure deletion combining with enhanced reliability for SSDs. In James J. Park, Hamid R. Arabnia, Cheonshik Kim, Weisong Shi, and Joon-Min Gil, editors, Proceedings of the 8th International Conference on Grid and Pervasive Computing (GPC), volume 7861 of Lecture Notes in Computer Science, pages 509–516, Seoul, South Korea, May 2013. Springer-Verlag. (doi:10.1007/978-3-642-38027-3_54)
[Talwadker and Voruganti, 2013]
Rukma Talwadker and Kaladhar Voruganti. Paragone: What's next in block I/O trace modeling. In Proceedings of the 29th IEEE Symposium on Mass Storage Systems and Technologies, Long Beach, CA, May 2013. (doi:10.1109/MSST.2013.6558436)
