SPARSE files considered harmful - please revert

John Vincent jpv50@hotmail.com
Mon May 19 19:26:00 GMT 2003


Hi,

I looked up sparse files on MSDN and found the following link:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/base/sparse_file_operations.asp

The most interesting thing is that a sparse file is only sparse if the zeros 
in the file are written with a special operation. I strongly suspect that 
the patch introduced in Cygwin to support sparse files is incorrect (or at 
least incomplete).

I've quoted the contents of the entry below; I hope this is helpful:

Sparse File Operations

To determine whether a file system supports sparse files, call the 
GetVolumeInformation function and examine the FILE_SUPPORTS_SPARSE_FILES bit 
flag.
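
As a rough illustration of the check described above, here is a minimal 
Python sketch using ctypes. The helper names supports_sparse and 
volume_fs_flags are made up for this example; the flag value 0x40 is the one 
from the Windows SDK headers. The GetVolumeInformationW call only works on a 
native Windows Python, so treat that part as untested outside Windows.

```python
import ctypes
import sys

# Bit in the file system flags returned by GetVolumeInformation
# (value taken from the Windows SDK headers).
FILE_SUPPORTS_SPARSE_FILES = 0x00000040


def supports_sparse(fs_flags: int) -> bool:
    """Return True if the FILE_SUPPORTS_SPARSE_FILES bit is set."""
    return bool(fs_flags & FILE_SUPPORTS_SPARSE_FILES)


def volume_fs_flags(root: str) -> int:
    """Hypothetical helper: fetch the file system flags for a volume root
    such as "C:\\\\". Windows only."""
    if sys.platform != "win32":
        raise OSError("GetVolumeInformationW is only available on Windows")
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    flags = ctypes.c_uint32(0)
    # Only the flags are requested; every other output buffer is NULL.
    ok = kernel32.GetVolumeInformationW(
        ctypes.c_wchar_p(root), None, 0, None, None,
        ctypes.byref(flags), None, 0)
    if not ok:
        raise ctypes.WinError(ctypes.get_last_error())
    return flags.value
```

On Windows, supports_sparse(volume_fs_flags("C:\\")) would then report 
whether the volume can hold sparse files at all.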


Most applications are not aware of sparse files and will not create sparse 
files. The fact that an application is reading a sparse file is transparent 
to the application. An application that is aware of sparse files should 
determine whether its data set is suitable to be kept in a sparse file. 
After that determination is made, the application must explicitly declare a 
file as sparse, using the FSCTL_SET_SPARSE control code.

After an application has set a file to be sparse, the application can use 
the FSCTL_SET_ZERO_DATA control code to set a region of the file to zero. In 
addition, the application can use the FSCTL_QUERY_ALLOCATED_RANGES control 
code to speed searches for nonzero data in the sparse file.

When you perform a write operation (with a function or operation other than 
FSCTL_SET_ZERO_DATA) whose data consists of nothing but zeros, zeros will be 
written to the disk for the entire length of the write. To zero out a range 
of the file and maintain sparseness, use FSCTL_SET_ZERO_DATA.
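
The two steps described so far (mark the file sparse, then zero a range with 
the special control code instead of writing zeros) can be sketched roughly as 
follows. This is only a sketch under assumptions: the function names 
zero_data_info and punch_zeros are invented for the example, the control code 
values are those from the Windows SDK headers, and the DeviceIoControl part 
requires a native Windows Python and a file opened for writing.

```python
import ctypes
import struct
import sys

# Control codes as defined in the Windows SDK headers.
FSCTL_SET_SPARSE = 0x000900C4
FSCTL_SET_ZERO_DATA = 0x000980C8


def zero_data_info(offset: int, length: int) -> bytes:
    """Pack a FILE_ZERO_DATA_INFORMATION structure: two 64-bit integers,
    FileOffset and BeyondFinalZero (the first byte past the zeroed range)."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    return struct.pack("<qq", offset, offset + length)


def _fsctl(handle: int, code: int, in_buf) -> None:
    """Issue one DeviceIoControl call with an optional input buffer."""
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    returned = ctypes.c_uint32(0)
    ok = kernel32.DeviceIoControl(
        ctypes.c_void_p(handle), ctypes.c_uint32(code),
        in_buf, ctypes.c_uint32(len(in_buf) if in_buf else 0),
        None, ctypes.c_uint32(0), ctypes.byref(returned), None)
    if not ok:
        raise ctypes.WinError(ctypes.get_last_error())


def punch_zeros(f, offset: int, length: int) -> None:
    """Hypothetical helper: mark an open binary file as sparse, then zero
    the given range so that its storage may be deallocated. Windows only."""
    if sys.platform != "win32":
        raise OSError("FSCTL_SET_SPARSE is only available on Windows")
    import msvcrt  # Windows-only module, imported lazily
    f.flush()
    handle = msvcrt.get_osfhandle(f.fileno())
    _fsctl(handle, FSCTL_SET_SPARSE, None)
    _fsctl(handle, FSCTL_SET_ZERO_DATA, zero_data_info(offset, length))
```

A call such as punch_zeros(f, 0, 1 << 20) on a file opened with mode "r+b" 
would zero the first megabyte. A plain f.write(b"\0" * n) would not have 
this effect, which is the point of the paragraph above.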

A sparseness-aware application may also set an existing file to be sparse. 
If an application sets an existing file to be sparse, it should then scan 
the file for regions which contain zeros, and use FSCTL_SET_ZERO_DATA to 
reset those regions, thereby possibly deallocating some physical disk 
storage. An application upgraded to sparse file awareness should perform 
this conversion.
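
The scanning step of such a conversion might look something like the 
following sketch. The function name zero_runs and the min_len threshold are 
assumptions made for this example and do not come from the MSDN entry; very 
short runs of zeros are unlikely to free any storage, so the sketch skips 
them.

```python
def zero_runs(data: bytes, min_len: int = 4096):
    """Return (offset, length) pairs for each run of zero bytes in data
    that is at least min_len bytes long."""
    runs = []
    start = None  # offset where the current run of zeros began, if any
    for i, byte in enumerate(data):
        if byte == 0:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_len:
                runs.append((start, i - start))
            start = None
    # A run may extend to the end of the buffer.
    if start is not None and len(data) - start >= min_len:
        runs.append((start, len(data) - start))
    return runs
```

Each (offset, length) pair found this way is a candidate range for 
FSCTL_SET_ZERO_DATA. A real converter would read the file in chunks and 
adjust the offsets accordingly rather than load it whole.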

When you perform a read operation from a zeroed-out portion of a sparse 
file, the operating system may not read from the hard drive. Instead, the 
system recognizes that the portion of the file to be read contains zeros, 
and it returns a buffer full of zeros without actually reading from the 
disk.

As with any other file, the system can write data to or read data from any 
position in a sparse file. Nonzero data being written to a previously zeroed 
portion of the file may result in allocation of disk space. Zeros being 
written over nonzero data (only with FSCTL_SET_ZERO_DATA) may result in a 
deallocation of disk space.

Note: It is up to the application to maintain sparseness by writing zeros 
with FSCTL_SET_ZERO_DATA.

Defragmenting tools that handle compressed files on NTFS file systems will 
correctly handle sparse files on NTFS volumes.


