Some notes on the format of the Macintosh Finder's .DS_Store files.
The Mac OS X Finder stores information about how it displays directories and files in files named .DS_Store in each directory which it has touched. (This seems to be a departure from the pre-OSX method of storing all the information in one file at the root of each filesystem). The format is not documented by Apple. The information in this file is based on the reverse-engineering notes by Mark Mentovai published on the Mozilla wiki, and further investigation by me (Wim Lewis).
The .DS_Store file holds a series of records giving attributes of the files in the directory or of the directory itself (referred to as .). These records are stored in a B-tree, and the pages of the B-tree are stored in the file by a "buddy allocator" along with a small amount of metadata. The allocator also provides a level of indirection, from small integers to file offsets, presumably allowing blocks to be relocated as they grow and shrink.
The file is generally big-endian. Unless otherwise noted, an "integer" is a four-byte (32-bit) big-endian number, probably unsigned (but I'm not always sure about that).
A record has the following format:
Filename length, in characters, as an integer. (4 bytes)
Filename, in big-endian UTF-16. Presumably with the same Unicode composition rules as the HFS+ filesystem. (2 * length bytes)
Structure id or type, a FourCharCode indicating what property of the file this entry describes. (4 bytes)
Data type (4 bytes), indicating what kind of data field follows:
An integer (4 bytes)
A short integer? Still stored as four bytes, but the first two are always zero.
A boolean value, stored as one byte.
An arbitrary block of bytes, stored as an integer followed by that many bytes of data.
Four bytes, containing a FourCharCode.
A Unicode text string, stored as an integer character count followed by 2*count bytes of data in UTF-16.
A given structure id/type always seems to have the same type of data associated with it, and any structure id appears at most once per filename. Some only appear on files, and some only appear on directories (including the filename .).
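As a rough illustration of the record layout above, here is a minimal Python sketch of a record parser. The function name `parse_record` is hypothetical. The data type codes it tests for (`long`, `shor`, `bool`, `blob`, `type`, `ustr`) are the labels used in the record descriptions in these notes, matched to the six kinds of data field in the order listed; that mapping is an assumption, as is treating every integer as unsigned.

```python
import struct

def parse_record(buf, pos=0):
    """Parse one record starting at buf[pos].

    Returns ((filename, struct_id, data_type, value), next_pos).
    """
    (name_len,) = struct.unpack_from(">I", buf, pos)
    pos += 4
    filename = buf[pos:pos + 2 * name_len].decode("utf-16-be")
    pos += 2 * name_len
    struct_id = buf[pos:pos + 4].decode("ascii")
    data_type = buf[pos + 4:pos + 8].decode("ascii")
    pos += 8
    if data_type in ("long", "shor"):
        # Both occupy four bytes; for "shor" the top two are zero.
        (value,) = struct.unpack_from(">I", buf, pos)
        pos += 4
    elif data_type == "bool":
        value = buf[pos] != 0
        pos += 1
    elif data_type == "blob":
        (n,) = struct.unpack_from(">I", buf, pos)
        value = bytes(buf[pos + 4:pos + 4 + n])
        pos += 4 + n
    elif data_type == "type":
        value = buf[pos:pos + 4].decode("ascii")
        pos += 4
    elif data_type == "ustr":
        (n,) = struct.unpack_from(">I", buf, pos)
        value = buf[pos + 4:pos + 4 + 2 * n].decode("utf-16-be")
        pos += 4 + 2 * n
    else:
        raise ValueError(f"unknown data type {data_type!r}")
    return (filename, struct_id, data_type, value), pos
```

Returning the position of the next byte lets a caller parse consecutive records from the same node without knowing their lengths in advance.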
Information about a directory (its Finder window view settings, for example) is usually held in its parent directory's store file, unless the directory is at the root of its volume, in which case it's held in the directory itself with the filename ..
I've encountered the following record types:
blob, directories only. Indicates the background of the Finder window viewing this directory (in icon mode). The format depends on the kind of background:
- Default background
DefB, followed by eight unknown bytes, probably garbage.
- Solid color
ClrB, followed by an RGB value in six bytes, followed by two unknown bytes.
- Picture
PctB, followed by the length of the blob stored in the 'pict' record, followed by four unknown bytes. The 'pict' record points to the actual background image.
bool, directories only. Unknown meaning. Always seems to be 1, so presumably 0 is the default value.
blob, attached to files and directories. The file's icon location. Two 4-byte values representing the horizontal and vertical positions of the icon's center (not top-left). (Then, 6 bytes 0xff and 2 bytes 0?) For the purposes of the center, the icon only is taken into account, not any label. The icon's size comes from the icvo blob.
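A small sketch of decoding this blob, assuming the two position values are unsigned big-endian integers (the signedness is not established above). The name `parse_icon_location` is hypothetical; the trailing bytes are returned untouched since their meaning is unknown.

```python
import struct

def parse_icon_location(blob):
    """Return (x, y, trailing_bytes) for an icon-location blob."""
    x, y = struct.unpack_from(">II", blob, 0)
    return x, y, bytes(blob[8:])
```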
bool, attached to directories. Purpose unknown.
blob containing a binary plist. This contains the size and layout of the window (including whether optional parts like the sidebar or path bar are visible). This appeared in Snow Leopard (10.6).
ustr, containing a file's "Spotlight Comments".
blob, attached to files and directories. Unknown, may indicate the icon location when files are displayed on the desktop.
bool, attached to subdirectories. Indicates that the subdirectory is open (disclosed) in list view.
blob, directories only. Finder window information. The data begins with four two-byte values representing the top, left, bottom, and right edges of the rect defining the content area of the window. The next four bytes represent the view of the window: icnv is icon view; other values include Nlsv. The next four bytes are unknown, and are either zeroes or 00 01 00 00.
On Leopard (10.5), the view-type information seems to be ignored, but see vstl. On Snow Leopard (10.6), some more of this record's function seems to have been taken over by the plist records.
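The 16-byte window-information blob could be unpacked along these lines. This is only a sketch: the name `parse_window_info` is hypothetical, and reading the rect edges as signed 16-bit values (as in a classic QuickDraw Rect) is an assumption not confirmed by these notes.

```python
import struct

def parse_window_info(blob):
    """Split a window-information blob into rect, view code, and unknown tail."""
    # Assumption: edges are signed 16-bit, in top/left/bottom/right order.
    top, left, bottom, right = struct.unpack_from(">4h", blob, 0)
    view = blob[8:12].decode("ascii")   # FourCharCode such as "icnv"
    unknown = bytes(blob[12:16])        # zeroes or 00 01 00 00
    return {"top": top, "left": left, "bottom": bottom, "right": right,
            "view": view, "unknown": unknown}
```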
long, directories only. Finder window sidebar width, in pixels/points. Zero if collapsed.
shor, directories only. Finder window vertical height. If present, it overrides the height defined by the rect in fwi0. The Finder seems to create these (at least on 10.4) even though it will do the right thing for window height with only an fwi0 around; perhaps this is because the stored height is weird when accounting for toolbars and status bars.
blob, directories (and files?). Unknown. Probably two integers, and often the value 00 00 00 00 00 00 00 04.
blob, directories only. Unknown, usually all but the last two bytes are zeroes.
18- or 26-byte blob, directories only. Icon view options. There seem to be two formats for this blob.
If the first 4 bytes are "icvo", then 8 unknown bytes (flags?), then 2 bytes corresponding to the selected icon view size, then 4 unknown bytes 6e 6f 6e 65 (the text "none"; my guess is that this is the "keep arranged by" setting).
If the first 4 bytes are "icv4", then: two bytes indicating the icon size in pixels, typically 48; a 4CC indicating the "keep arranged by" setting (or none for none or grid for align to grid); another 4CC, such as rght, indicating the label position w.r.t. the icon; and then 12 unknown bytes (flags?).
Of the flag bytes, the low-order bit of the second byte is 1 if "Show item info" is checked, and the low-order bit of the 12th (last) byte is 1 if the "Show icon preview" checkbox is checked. The tenth byte usually has the value 4, and the remainder are zero.
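A sketch of decoding both layouts of the icon view options blob, following the field order given above. The name `parse_icon_view_options` and the dictionary keys are hypothetical, and the two flag bits are interpreted only for the "icv4" layout, which is the one with 12 flag bytes.

```python
import struct

def parse_icon_view_options(blob):
    """Decode an icon view options blob in either the icvo or icv4 layout."""
    magic = bytes(blob[:4])
    if magic == b"icvo":                      # 18-byte layout
        (size,) = struct.unpack_from(">H", blob, 12)
        return {"format": "icvo",
                "flags": bytes(blob[4:12]),
                "icon_size": size,
                "arrange_by": blob[14:18].decode("ascii")}
    if magic == b"icv4":                      # 26-byte layout
        (size,) = struct.unpack_from(">H", blob, 4)
        flags = bytes(blob[14:26])
        return {"format": "icv4",
                "icon_size": size,
                "arrange_by": blob[6:10].decode("ascii"),
                "label_position": blob[10:14].decode("ascii"),
                "flags": flags,
                # Low-order bit of the 2nd and of the 12th flag byte.
                "show_item_info": bool(flags[1] & 1),
                "show_icon_preview": bool(flags[11] & 1)}
    raise ValueError(f"unrecognised icon view options format {magic!r}")
```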
shor, directories only. Icon view text label (filename) size, in points.
40- or 48-byte blob, attached to directories and files. Unknown.
blob, directories only. Unknown. Possibly the scroll position in list view mode?
blob, directories only. List view options. Seems to contain the columns displayed in list view, their widths, and their sort ordering if any.
shor, directories only. List view text (filename) size, in points.
blob containing a binary plist. List view settings, perhaps supplanting the 'lsvo' record. Appeared in Snow Leopard (10.6). These list view settings are shared between list view and the list portion of coverflow view.
blob, directories only. Despite the name, this contains not a PICT image but an Alias record (see Inside Macintosh: Files) which resolves to the file containing the actual background image. See also the PctB form of the window background record, which refers to this record.
type, directories only. Indicates the style of the view (a FourCharCode such as Flwv, selecting among icon view, column/browser view, list view, and coverflow view) chosen by the Finder's "Always open in icon [or other style] view" checkbox. This appears to be a new addition to the Leopard (10.5) Finder.
The records are stored in a B-tree structure. The B-tree consists of a small master block containing a few statistics and a pointer to the root node; one or more leaf (external) nodes; and zero or more non-leaf (internal) nodes.
The header block is pointed to by the DSDB entry in the buddy allocator's directory. It is 20 bytes long and contains five integers:
The block number of the root node of the B-tree
The number of levels of internal nodes (tree height minus one --- that is, for a tree containing only a single, leaf, node this will be zero)
The number of records in the tree
The number of nodes in the tree (tree nodes, not including this header block)
Always 0x1000, probably the tree node page size
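Reading the five integers of this block is straightforward; the sketch below names the fields after the descriptions above. `BTreeHeader` and `parse_btree_header` are hypothetical names, and the integers are assumed to be unsigned.

```python
import struct
from collections import namedtuple

BTreeHeader = namedtuple(
    "BTreeHeader", "root_block levels record_count node_count page_size")

def parse_btree_header(block):
    """Decode the 20-byte B-tree header block into its five integers."""
    return BTreeHeader(*struct.unpack_from(">5I", block, 0))
```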
Individual nodes are either leaf nodes containing a bunch of records, or non-leaf (internal) nodes containing N records and N+1 pointers to child nodes.
Each node starts with two integers, P and count. If P is 0, then this is a leaf node and count is immediately followed by that many records. If P is nonzero, then this is an internal node, and count is followed by the block number of the leftmost child, then a record, then another block number, etc., for a total of count child pointers and count records. P is itself the rightmost child pointer, that is, it is logically at the end of the node.
This relies on 0 not being a valid value for a block number. As far as I can tell, 0 is a valid value for a block number but it always holds the block containing the buddy allocator's internal information, presumably because that block is allocated first.
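The node layout suggests a simple recursive in-order walk. The sketch below assumes two caller-supplied helpers, both hypothetical: `read_block(block_id)` returns the bytes of a block, and `parse_record(buf, pos)` returns a `(record, next_pos)` pair. It treats the first integer of a node as P (0 for a leaf, otherwise the rightmost child) and the second as count.

```python
import struct

def iter_records(read_block, parse_record, block_id):
    """Yield every record in the subtree rooted at block_id, in order."""
    node = read_block(block_id)
    p, count = struct.unpack_from(">II", node, 0)
    pos = 8
    for _ in range(count):
        if p != 0:
            # Internal node: a child pointer precedes each record.
            (child,) = struct.unpack_from(">I", node, pos)
            pos += 4
            yield from iter_records(read_block, parse_record, child)
        record, pos = parse_record(node, pos)
        yield record
    if p != 0:
        # P is logically the last (rightmost) child pointer.
        yield from iter_records(read_block, parse_record, p)
```

Passing the root block number from the B-tree header block as `block_id` would enumerate the whole tree.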
The ordering of records within the B-tree is by case-insensitive comparison of their filenames, secondarily sorted on the structure ID (record type) field. My guess is that the string comparison follows the same rules as HFS+ described in Apple's TN1150.
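A rough approximation of this ordering as a Python sort key. `str.lower()` stands in for the HFS+ comparison rules, which is an assumption; the real rules in TN1150 use their own case-folding table and will differ for some characters. The name `record_sort_key` is hypothetical.

```python
def record_sort_key(filename, struct_id):
    """Approximate B-tree ordering: filename ignoring case, then structure ID."""
    return (filename.lower(), struct_id)

records = [("Zeta", "vstl"), ("alpha", "vstl"), ("alpha", "fwi0")]
records.sort(key=lambda r: record_sort_key(*r))
```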
B-tree pages and other info are stored in blocks managed by a buddy allocator. The allocator maintains a list of the offsets and sizes of blocks (indexed by small integers) and a freelist. The allocator also stores a small amount of metadata, including a directory or table of contents which maps short strings to block numbers. The only entry in that table of contents maps the string DSDB ("desktop services database"?) to the B-tree's master block.
The buddy allocator is in charge of all but the first 36 bytes of the file, and manages a notional 2GB address space, although the file is of course truncated to the last allocated block. All its offsets are relative to file offset 4, that is, the byte just past a four-byte prefix. Another way to describe this is that the file consists of a four-byte header (always 00 00 00 01) followed by a 2GB buddy-allocated area, the first 32-byte block of which is allocated but does not appear on the buddy allocator's allocation list.
The 32-byte header has the following fields:
Four bytes that appear to be a magic number: 42 75 64 31 (the ASCII text "Bud1")
Offset to the allocator's bookkeeping information block
Size of the allocator's bookkeeping information block
A second copy of the offset; the Finder will refuse to read the file if this does not match the first copy. Perhaps this is a safeguard against corruption from an interrupted write transaction.
Sixteen bytes of unknown purpose. These might simply be the unused space at the end of the block, since the minimum allocation size is 32 bytes, as will be seen later.
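A sketch of validating the start of the file, combining the four-byte prefix with the 32-byte header. The bytes 42 75 64 31 spell "Bud1" in ASCII, and the code treats them as a required magic number, which is an assumption. `parse_file_header` is a hypothetical name.

```python
import struct

def parse_file_header(data):
    """Check the prefix and header; return (offset, size) of the bookkeeping block.

    The returned offset is relative to file offset 4, like all allocator offsets.
    """
    (prefix,) = struct.unpack_from(">I", data, 0)
    if prefix != 1:
        raise ValueError("missing 00 00 00 01 prefix")
    magic, offset, size, offset_copy = struct.unpack_from(">4sIII", data, 4)
    if magic != b"Bud1":
        raise ValueError("missing Bud1 marker")
    if offset != offset_copy:
        raise ValueError("the two copies of the offset disagree")
    return offset, size
```

The bookkeeping block would then be found at `data[offset + 4 : offset + 4 + size]`.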
The offset and size indicate where to find the block containing the rest of the buddy allocator's state. That block has the following fields:
- Block count
Integer. The number of blocks in the allocated-blocks list.
Four unknown bytes. Appear to always be 0.
- Block addresses
Array of integers. There are block count block addresses here, with unassigned block numbers represented by zeroes. This is followed by enough zeroes to round the section up to the next multiple of 256 entries (1024 bytes).
- Directory count
Integer, indicates the number of directory entries.
- Directory entries
Each consists of a 1-byte count, followed by that many bytes of name (in ASCII or perhaps some 1-byte superset such as MacRoman), followed by a 4-byte integer containing the entry's block number.
- Free lists
There are 32 freelists, one for each power of two from 2^0 to 2^31. Each freelist has a count followed by that many offsets.
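The bookkeeping block could be decoded roughly as follows. This sketch assumes the address array is padded to the smallest multiple of 256 entries that holds the block count (so a count that is already a multiple of 256 gets no extra padding), and that names decode as MacRoman. `parse_allocator_block` is a hypothetical name.

```python
import struct

def parse_allocator_block(block):
    """Return (addresses, directory, freelists) from the bookkeeping block."""
    count, _unknown = struct.unpack_from(">II", block, 0)
    pos = 8
    addresses = list(struct.unpack_from(f">{count}I", block, pos))
    # Assumption: skip padding up to a multiple of 256 entries.
    padded = -(-count // 256) * 256
    pos += 4 * padded

    (dir_count,) = struct.unpack_from(">I", block, pos)
    pos += 4
    directory = {}
    for _ in range(dir_count):
        n = block[pos]
        name = block[pos + 1:pos + 1 + n].decode("mac_roman")
        (block_id,) = struct.unpack_from(">I", block, pos + 1 + n)
        directory[name] = block_id
        pos += 1 + n + 4

    freelists = []
    for _ in range(32):                      # one list per power of two
        (n,) = struct.unpack_from(">I", block, pos)
        freelists.append(list(struct.unpack_from(f">{n}I", block, pos + 4)))
        pos += 4 + 4 * n
    return addresses, directory, freelists
```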
There are three different ways to refer to a given block. Most of the file uses what I call block numbers or block IDs, which are indexes into the block address table. Block ID 0 always seems to refer to the buddy allocator's metadata block itself.
The entries in the block address table are what I call block addresses. Each address is a packed offset+size. The least-significant 5 bits of the number indicate the block's size, as a power of 2 (from 2^5 to 2^31). If those bits are masked off, the result is the starting offset of the block (keeping in mind the 4-byte fudge factor). Since the lower 5 bits are unusable to store an offset, blocks must be allocated on 32-byte boundaries, and as a side effect the minimum block size is 32 bytes (in which case the least significant 5 bits are equal to 5).
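Unpacking a block address is a matter of masking. The sketch below returns the position in the file, adding the 4-byte fudge factor; `decode_block_address` is a hypothetical name.

```python
def decode_block_address(address):
    """Return (file_position, size_in_bytes) for a packed block address."""
    size = 1 << (address & 0x1F)      # low 5 bits: log2 of the block size
    offset = address & ~0x1F          # remaining bits: allocator offset
    return offset + 4, size           # offsets start after the 4-byte prefix
```

For example, the address 0x25 describes a 32-byte block at allocator offset 0x20, which is file position 0x24.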
The free lists contain actual block offsets, not "addresses". The size of the referenced blocks is implied by which freelist the block is in; a free block in freelist N is 2^N bytes long.
Although the header block is not explicitly allocated, the allocator behaves as if it is; other blocks are split in order to accommodate it in the buddy scheme, and its buddy block (at 0x20) is either on the freelist or allocated. (Usually it holds the B-tree's master block.)
Other than the 4-byte prefix and the 32-byte header block, every byte in the file is either in a block on the allocated blocks list, or is in a block on one of the free lists.
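This property can be turned into a consistency check. The sketch below, under the assumption that the notional address space is exactly 2^31 bytes and that zero entries in the address table are unassigned, verifies that the allocated and free blocks tile the space after the 32-byte header with no gaps or overlaps. The name `covers_address_space` is hypothetical.

```python
def covers_address_space(addresses, freelists, space=1 << 31, header=32):
    """True if allocated and free blocks exactly tile [header, space)."""
    # Allocated blocks: packed addresses (offset in high bits, log2 size in low 5).
    blocks = [(a & ~0x1F, 1 << (a & 0x1F)) for a in addresses if a != 0]
    # Free blocks: plain offsets; freelist n holds blocks of 2**n bytes.
    for n, offsets in enumerate(freelists):
        blocks.extend((off, 1 << n) for off in offsets)
    blocks.sort()
    pos = header
    for off, size in blocks:
        if off != pos:            # a gap or an overlap
            return False
        pos += size
    return pos == space
```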
Original reverse-engineering effort by Mark Mentovai, https://wiki.mozilla.org/DS_Store_File_Format. Some of the text describing record types has been copied from that wiki page.
Further investigation and documentation by Wim Lewis <email@example.com>.
Also thanks to Yvan BARTHÉLEMY for investigation and bugfixes.