This is a short summary I wrote after reading a conference paper from the ACM Digital Library.
Analysis of I/O Behaviour of Apple Desktop Applications
The iBench task suite consists of 6 applications running a total of 34 different tasks. The applications are grouped as iWork (Pages, Numbers, Keynote) and iLife (iPhoto, iTunes, iMovie). The I/O behaviour of these applications was analyzed.
The tasks in the iBench suite were:
iLife iPhoto: start, import, duplicate, edit, view, delete. The application worked on 400 2.5 MB photos imported as 12 MP pictures from a 1 GB flash card.
iLife iTunes: start, import and play mp3 album containing 10 songs, import and play a 3 minute long MPEG-4 movie.
iLife iMovie: start, import, add clip to project, export to 3 minute MPEG-4 movie.
iWork Pages: start, create, save, open, export 15 page documents with and without images in different formats.
iWork Numbers: start, generate and save, open, export a 5 page spreadsheet.
iWork Keynote: start, create slides with and without images, open, play, export 20 slides.
For the purpose of a particular case study, the Pages application, a word processor, was used to create a blank document, insert 15 JPEG images each of size 2.5 MB, and save as .doc.
The automated task trace was performed on a Mac Mini running Mac OS X Snow Leopard (version 10.6.2) and HFS+ (Hierarchical File System). The device had 2 GB of memory and a 2.26 GHz Intel Core Duo processor.
First, an instrumentation framework was built to monitor the system calls made by each traced application and to examine stack traces and in-kernel functions (paging). The framework was built on top of the DTrace tracing system of Mac OS X. Next, AppleScript was used to drive each application in an automated and repeatable manner.
The results are analyzed based on the overall behaviour of all the tasks in the iBench suite, and not just on the case study of Pages alone. Graphs are plotted for the traces and the observations noted.
The traces must be run on a single file-system image which contains snapshots of initial directories to be restored before each run. The snapshot must be used as a common benchmark for all the test runs done using the task bench. The simple snapshot makes the iBench task suite easy to use.
Analysis of Results
Nature of Files:
File Types – Out of 6 file categories (multimedia, productivity, SQLite files, plist files, strings files, and misc), multimedia files were generally opened most frequently. Measured by the number of bytes of I/O, iLife tasks accessed mostly multimedia files, while iWork tasks focussed on productivity files.
File Sizes – Applications tended to open many very small files (< 4 KB), especially iWork. This is because of the frequent use of .plist files that store preferences and settings. Most of the bytes accessed are in large files (> 10 MB), even though these files are opened rarely.
File Accesses – The majority of files were opened read-only or write-only. Read-only access was very common, while write-only access was common in some iLife tasks; iWork had fewer write-only accesses (but wrote a large number of bytes to those files). Read-write access was rare, but when present, it contributed a large portion of the total I/O, e.g. when exporting files.
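The difference between counting opens and counting bytes can be illustrated with a small sketch. It assumes a hypothetical trace format of (path, file size, bytes accessed) records; the function names, the middle bucket, and the sample records are invented for illustration and are not taken from the paper.

```python
from collections import Counter

KB = 1024
MB = 1024 * KB


def size_bucket(size):
    """Place a file size into one of three buckets.

    The "<4KB" and ">10MB" thresholds follow the summary above; the
    middle bucket is an assumption made only for this sketch.
    """
    if size < 4 * KB:
        return "<4KB"
    if size <= 10 * MB:
        return "4KB-10MB"
    return ">10MB"


def summarize(records):
    """Count opens and bytes accessed per size bucket.

    records: iterable of (path, file_size, bytes_accessed) tuples,
    a hypothetical trace format.
    """
    opens = Counter()
    bytes_accessed = Counter()
    for _path, file_size, accessed in records:
        bucket = size_bucket(file_size)
        opens[bucket] += 1
        bytes_accessed[bucket] += accessed
    return opens, bytes_accessed


if __name__ == "__main__":
    # Invented sample: many small plist opens, one large media file.
    sample = [("prefs%d.plist" % i, 2 * KB, 2 * KB) for i in range(20)]
    sample.append(("movie.m4v", 50 * MB, 50 * MB))
    opens, nbytes = summarize(sample)
    print("opens:", dict(opens))
    print("bytes:", dict(nbytes))
```

In a trace shaped like the sample, small files dominate the open counts while the single large file dominates the byte counts, which is the pattern described above.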
Sequentiality – Read sequentiality was high in iWork and lower in iLife, while write sequentiality was high in iLife and low in iWork. Many iLife tasks occasionally accessed a small header while accessing the rest of the file sequentially, so 95% of the bytes formed a nearly sequential run. In iWork, the access was more random. This knowledge is important for prefetching and other optimizations.
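One way to make the "nearly sequential" idea concrete is to measure what fraction of the bytes in a file's access trace falls in its longest sequential run. The sketch below is a simplified metric chosen for illustration, not necessarily the exact definition used in the paper; the (offset, length) trace format and the function name are assumptions.

```python
def longest_run_fraction(accesses):
    """Return the fraction of bytes that lie in the longest sequential run.

    accesses: list of (offset, length) pairs in the order they were issued.
    An access extends the current run only if it starts exactly where the
    previous access ended; otherwise a new run begins.
    """
    total = 0        # bytes accessed overall
    best = 0         # bytes in the longest run seen so far
    run = 0          # bytes in the current run
    expected = None  # offset that would continue the current run
    for offset, length in accesses:
        if offset != expected:
            run = 0  # a seek breaks the run
        run += length
        best = max(best, run)
        total += length
        expected = offset + length
    return best / total if total else 0.0


if __name__ == "__main__":
    # Invented trace: a small header read, then one long sequential sweep.
    trace = [(0, 50), (1000, 500), (1500, 450)]
    print(longest_run_fraction(trace))
```

With a trace like this, a single small header access costs only a few percent of sequentiality, which matches the header-plus-sweep pattern described for iLife.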
Preallocation – In iBench, hints were provided to the file system using pwrite (write a byte beyond EOF to signify future EOF). iPhoto and iMovie used preallocation hints. It is seen that these are mostly useless because they are followed by a single write call with all the data.
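The hint-then-write pattern described above can be sketched as follows. This is a minimal illustration assuming a POSIX system where Python's os.pwrite is available; the function name write_with_hint and the payload are hypothetical, and the code is not taken from iPhoto or iMovie.

```python
import os
import tempfile


def write_with_hint(path, data):
    """Write data to path, preceded by a one-byte preallocation hint.

    Returns (size_after_hint, final_size) so the effect of the hint
    can be observed.
    """
    if not data:
        raise ValueError("data must be non-empty")
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        # Hint: write a single byte at the last offset the file will have,
        # which extends the file to its final size.
        os.pwrite(fd, b"\0", len(data) - 1)
        size_after_hint = os.fstat(fd).st_size
        # The real data then arrives in one write starting at offset 0.
        # A robust version would loop on short writes.
        os.pwrite(fd, data, 0)
        return size_after_hint, os.fstat(fd).st_size
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "photo.bin")
        print(write_with_hint(target, b"x" * 4096))
```

Because the whole payload follows in one call, the file system learns the final size from that write anyway, which is why the summary calls the hint mostly useless in this pattern.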
Durability – Written data was flushed to storage using fsync. Half the tasks synchronized almost 100% of their written data, and two-thirds of the tasks synchronized more than 60%. iLife called fsync frequently compared to iWork. Calls to fsync were mostly associated with media. The majority of the calls synchronized small amounts of data; only a few iLife tasks synchronized more than 1 MB in a single fsync call. This shows that delayed writing is becoming less common, because application developers and library frameworks are calling small synchronous write routines very frequently.
Atomic Writes – Applications atomically updated files by writing to a temporary file and then using the rename or exchangedata call to atomically replace the old file. Rename deleted and replaced the original (this was used frequently), while exchangedata swapped the inode numbers of the new and old files (rarely used). Atomic writes using rename were frequent, but only a few of those calls moved files between directories, e.g. when saving a file at the user’s request.
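The rename-based variant of this pattern can be sketched as below. This is a generic illustration, not code from the traced applications: the function name atomic_write is hypothetical, and the exchangedata variant is not shown. The temporary file is created in the same directory as the target so that the rename stays within one file system.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the contents of path with data using write-temp-then-rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Push the new contents to storage before the rename makes
            # them visible under the final name.
            os.fsync(tmp.fileno())
        # os.replace maps to rename(2) on POSIX, which atomically
        # replaces the old file with the new one.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        target = os.path.join(tmp_dir, "document.txt")
        atomic_write(target, b"version 1")
        atomic_write(target, b"version 2")
        with open(target, "rb") as f:
            print(f.read())
```

A reader of the target file sees either the old contents or the new contents, never a partially written file, which is the property the applications rely on.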
Threads and Asynchrony:
Asynchronous Reads – These were used rarely, only by iLife tasks. But when used, they were used heavily for a large number of bytes.
Portion of I/O – Most applications issued I/O requests from many threads, especially iTunes and iPhoto, as their tasks are readily subdivided, e.g. importing a large number of tracks or photos. Only a few tasks performed all I/O from a single thread.
Responsibilities of I/O Threads – There were many threads created for reading, and also some for both read and write. But there were not too many threads for write alone.
Threads are required to perform long-latency operations in the background to keep the interface responsive. It is seen that multiple threads perform I/O. So it is important for file and storage systems to be thread-aware in order to allocate bandwidth to each thread more effectively.
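The multi-threaded pattern described above, where a readily subdivided task such as an import spreads its reads over several threads, can be sketched with a thread pool. This is an illustrative sketch only; the function name read_all, the worker count, and the per-thread accounting are assumptions and do not reflect how iTunes or iPhoto are implemented.

```python
import os
import tempfile
import threading
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def read_all(paths, workers=4):
    """Read every file in paths using a pool of worker threads.

    Returns (total_bytes, per_thread), where per_thread maps a thread
    identifier to the number of bytes that thread read.
    """
    per_thread = Counter()
    lock = threading.Lock()

    def read_one(path):
        with open(path, "rb") as f:
            count = len(f.read())
        # Record which thread issued this I/O.
        with lock:
            per_thread[threading.get_ident()] += count
        return count

    with ThreadPoolExecutor(max_workers=workers) as pool:
        total = sum(pool.map(read_one, paths))
    return total, per_thread


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        paths = []
        for i in range(8):
            path = os.path.join(tmp_dir, "track%d.bin" % i)
            with open(path, "wb") as f:
                f.write(b"\0" * 1000)
            paths.append(path)
        total, per_thread = read_all(paths)
        print(total, len(per_thread))
```

The per-thread byte counts are the kind of information a thread-aware file or storage system would need in order to divide bandwidth among threads.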