NOTE: Neither the IOTTA TWG nor SNIA vouches for the accuracy or reliability of any of the trace or other information provided below. Please contact us regarding any broken or inaccurate links.
System Call Traces
Coda File System Traces
Traces of all system call activity on 33 machines collected between February 1991 and March 1993. The full data set is 24 GB; only a subset of it is available on-line. The site also has pointers to the DFSTrace tools that were used to collect this data. [Mummert96] provides extensive information about the design and implementation of the trace tool.
Drew Roselli's Traces
Several months of traces from three different environments: an academic research cluster, an instructional cluster used for student programming assignments, and a web server. Traces were collected in late 1996 and early 1997. Only a portion of the full set of traces is available on-line. A USENIX paper and a UC Berkeley Technical Report describe and analyze the traces [Roselli98] [Roselli00].
Traces from two workstations. One is seven days, the other ten days. The traces were collected circa 1993. The web page describes the traces and gives contact information for requesting the trace data via e-mail. This data was used in [Appleton94].
System call traces of several parallel I/O workloads [Uysal97].
Traces collected in 1996-97 on 9 laptops in a CS research environment over periods varying from 1 to 10 months. These traces were described in [Kuenning97].
Eight days of traces collected in an academic research cluster during 1991. These traces were initially described and analyzed in [Baker91].
Network File System Traces
Berkeley Auspex Traces
One week of NFS traffic from 236 clients accessing an Auspex server in late 1993. They were gathered by Cliff Mather by snooping Ethernet packets on four subnets. The clients are the desktop workstations of the University of California at Berkeley Computer Science Division.
Harvard NFS Traces
Dan Ellard's Harvard Traces
Extensive set of NFS traces covering many months, collected from a campus e-mail server and a departmental file server. Contact Dan Ellard (email@example.com) for more information. [Ellard03a] [Ellard03b]
Block I/O Traces
HP Lab Traces
The Storage Systems group at HP Labs has made over a decade's worth of I/O traces and trace software available. These traces were made on a variety of servers and workstations running several different applications.
Other Data Sets
Plan 9 File System Traces
This is a time series set of "snapshots" of the contents of the Plan 9 file servers at Bell Labs. One snapshot per day for a number of years.
Papers and Publications
NOTE: Most of the papers on this list are analyses of system behavior based on trace analysis. There is a much larger body of research that has used traces for other purposes (e.g., to drive simulations). Remaining work on this list includes adding annotations about the papers and locating online copies of many of these references.
[Baker91] M. Baker, J. Hartman, M. Kupfer, K. Shirriff, and J. Ousterhout.
Measurements of a Distributed File System.
Proceedings of the 13th ACM Symposium on Operating Systems Principles, pp. 198 - 212. October 1991.
[Bennet91] J. Michael Bennet, Michael A. Bauer, David Kinchlea.
Characteristics of Files in NFS Environments.
Proceedings of the 1991 ACM Symposium on Small Systems, pp. 33 - 40. 1991. http://dl.acm.org/citation.cfm?id=152431
[Biswas90] P. Biswas, K.K. Ramakrishnan.
File Access Characterization of VAX/VMS Environments.
Proceedings of the 10th International Conference on Distributed Computing Systems, pp. 227 - 234. Paris, France. May, 1990. http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=00089280
[Blackwell95] Trevor Blackwell, Jeffrey Harris, Margo Seltzer.
Heuristic Cleaning Algorithms in Log-Structured File Systems.
Proceedings of the 1995 USENIX Technical Conference, pp. 277 - 288. New Orleans, LA. January, 1995. http://www.usenix.org/publications/library/proceedings/neworl/blackwell.html
[Bozman91] G.P. Bozman, H.H. Ghannad, E.D. Weinberger.
A trace-driven study of CMS file references.
IBM Journal of Research and Development, Vol. 35, No. 5/6, pp. 815 - 828. September/November, 1991.
[Chiang93] Chi-ming Chiang, Matt W. Mutka.
Characteristics of User File Usage Patterns.
Journal of Systems and Software, Vol. 23, No. 3, pp. 257 - 268. December, 1993.
[Douceur99] John R. Douceur, William J. Bolosky.
A Large-Scale Study of File-System Contents.
Proceedings of SIGMETRICS '99, pp. 59 - 70. Atlanta, GA. May, 1999.
[Ellard03a] Daniel Ellard, Jonathan Ledlie, Pia Malkani, Margo Seltzer.
Passive NFS Tracing of Email and Research Workloads.
Proceedings of the Second Annual USENIX File and Storage Technologies Conference, pp. 203 - 216. San Francisco, CA. March, 2003.
[Ellard03b] Daniel Ellard, Margo Seltzer.
NFS Tricks and Benchmarking Traps.
Proceedings of the FREENIX Technical Conference, San Antonio, Texas. June 2003.
[Floyd86] Rick Floyd.
Short-Term File Reference Patterns in a UNIX Environment.
University of Rochester Computer Science Technical Report #177. March, 1986.
[Griffioen94] Jim Griffioen, Randy Appleton.
Reducing File System Latency using a Predictive Approach.
Proceedings of the Summer 1994 USENIX Technical Conference, pp. 197 - 207. Boston, MA. June, 1994.
[Kuenning97] Geoffrey H. Kuenning and Gerald J. Popek.
Automated Hoarding for Mobile Computers.
Proceedings of the 16th ACM Symposium on Operating Systems Principles, St. Malo, France, October 5-8, 1997.
[Miller91] Ethan L. Miller, Randy H. Katz.
Input/Output Behavior of Supercomputing Applications.
Proceedings of the 1991 Conference on Supercomputing, pp. 567 - 576. Albuquerque, NM. November, 1991.
[Mummert96] L. Mummert, M. Satyanarayanan.
Long Term Distributed File Reference Tracing: Implementation and Experience.
Software - Practice and Experience, Vol. 26, No. 6, pp. 705 - 736. June, 1996.
[Ousterhout85] J. Ousterhout, H. Costa, D. Harrison, J. Kunze, M. Kupfer, and J. Thompson.
A Trace-Driven Analysis of the UNIX 4.2BSD File System.
Proceedings of the 10th Symposium on Operating System Principles, pp. 15 - 24. Orcas Island, WA. December, 1985.
[Ramakrishnan92] K.K. Ramakrishnan, Prabuddha Biswas, Ramakrishna Karedla.
Analysis of File I/O Traces in Commercial Computing Environments.
Proceedings of SIGMETRICS '92, pp. 78 - 90. Newport, RI. June, 1992.
[Roselli98] Drew Roselli, Thomas E. Anderson.
Characteristics of File System Workloads.
University of California Berkeley Computer Science Division Technical Report UCB//CSD-98-1029. 1998.
[Roselli00] Drew Roselli, Jacob R. Lorch, Thomas E. Anderson.
A Comparison of File System Workloads.
Proceedings of the 2000 USENIX Technical Conference, pp. 44 - 54. San Diego, CA. June, 2000.
[Ruemmler93] Chris Ruemmler, John Wilkes.
UNIX Disk Access Patterns.
Proceedings of the Winter 1993 USENIX Technical Conference, pp. 405 - 420. San Diego, CA. January, 1993.
[Satyanarayanan81] M. Satyanarayanan.
A Study of File Sizes and Functional Lifetimes.
Proceedings of the 8th Symposium on Operating System Principles, pp. 96 - 108. Pacific Grove, CA. December, 1981.
[Shirriff92] Ken Shirriff, John K. Ousterhout.
A Trace-Driven Analysis of Name and Attribute Caching in a Distributed System.
Proceedings of the Winter 1992 USENIX Technical Conference, pp. 315 - 332. San Francisco, CA. January, 1992.
[Smith81] A. J. Smith.
Analysis of Long Term File Reference Patterns for Application to File Migration Algorithms.
IEEE Transactions on Software Engineering, Vol SE-7, No. 4, pp. 403 - 417. July, 1981.
[Uysal97] Mustafa Uysal, Anurag Acharya, Joel Saltz.
Requirements of I/O Systems for Parallel Machines: An Application-driven Study.
Technical Report, CS-TR-3802, University of Maryland, College Park. May 1997.
[Vogels99] Werner Vogels.
File system usage in Windows NT 4.0.
Proceedings of the 17th Symposium on Operating System Principles, pp. 93 - 109. Kiawah Island Resort, SC. December, 1999.
Tools and Documentation
Storage Research Centers
Carnegie Mellon University
Parallel Data Lab (PDL)
San Diego Supercomputer Center (SDSC)
University of Minnesota
Digital Technology Center (DTC)
Intelligent Storage Consortium (DISC)
Brigham Young University
Trace Distribution Center
Performance Evaluation Laboratory
Storage Performance Council (SPC)