NTFS Files and Data Storage

As with most file systems, the fundamental unit of storage in NTFS from the user's perspective is the file. A file is just a collection of any sort of data, and can contain anything: programs, text files, audio clips, database records, and thousands of other kinds of information. The operating system doesn't distinguish between types of files. The use of a particular file depends on how it is interpreted by applications that use it.

Within NTFS, all files are stored in pretty much the same way: as a collection of attributes. This includes the data in the file itself, which is just another attribute: technically, the "data attribute".

Note that to understand how NTFS stores files, one must first understand the basics of NTFS architecture, and in particular, it's good to comprehend what the Master File Table is and how it works. You may also wish to review the discussion of NTFS attributes, because understanding the difference between resident and non-resident attributes is important to making any sense at all of the rest of this page. ;^)

The way that data is stored in files in NTFS depends on the size of the file. The core structure of each file is based on the following information and attributes that are stored for each file:
These are the basic attributes; others may also be associated with a file (see this full discussion of attributes for details). If a file is small enough that all of its attributes can fit within the MFT record for the file, it is stored entirely within the MFT. Whether this happens or not depends largely on the size of the MFT records used on the volume.

If the file is too large for all of the attributes to fit in the MFT record, NTFS begins a series of "expansions" that move attributes out of the MFT and make them non-resident. The sequence of steps taken is something like this:
The data runs (extents) are where most file data in an NTFS volume is stored. These runs consist of blocks of contiguous clusters on the disk. The pointers in the data attribute(s) for the file contain a reference to the start of the run, and also the number of clusters in the run. Each run's position within the file is expressed as a virtual cluster number (VCN), and NTFS maps it to a logical cluster number (LCN) that gives the run's location on the volume. The use of a "pointer+length" scheme means that under NTFS, it is not necessary to read each cluster of the file in order to determine where the next one in the file is located. This method also reduces fragmentation of files compared to the FAT setup.
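
To make the "pointer+length" idea concrete, here is a minimal Python sketch of how a list of runs can be searched to find the cluster that holds a given part of a file. It is an illustration only, not the NTFS on-disk format: real NTFS stores runs in a compact variable-length encoding, which this sketch ignores. The names DataRun, start_lcn and vcn_to_lcn are made up for the example. "LCN" (logical cluster number) is the usual NTFS term for a cluster's position on the volume, while the VCN counts clusters from the start of the file.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DataRun:
    """One extent: a block of contiguous clusters (hypothetical, simplified)."""
    start_lcn: int  # volume cluster where the run begins (the "pointer")
    length: int     # number of contiguous clusters in the run (the "length")


def vcn_to_lcn(runs: List[DataRun], vcn: int) -> Optional[int]:
    """Return the volume cluster holding file cluster `vcn`, or None if out of range.

    Only the run list is examined; no data cluster has to be read to find
    where the next one lives.
    """
    first_vcn = 0  # file-relative cluster number at which the current run starts
    for run in runs:
        if first_vcn <= vcn < first_vcn + run.length:
            return run.start_lcn + (vcn - first_vcn)
        first_vcn += run.length
    return None


# A six-cluster file stored as two runs: 4 clusters at 1000, then 2 at 5000.
runs = [DataRun(start_lcn=1000, length=4), DataRun(start_lcn=5000, length=2)]
print(vcn_to_lcn(runs, 0))  # 1000
print(vcn_to_lcn(runs, 5))  # 5001
print(vcn_to_lcn(runs, 6))  # None (past the end of the file)
```

Note that the lookup cost grows with the number of runs, not the number of clusters, so a file stored in a few large runs is located in only a few steps no matter how big it is.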