This is the mail archive of the
mailing list for the Cygwin project.
Re: SPARSE files considered harmful - please revert
> 1) You are assuming behavior that isn't documented. I can imagine that
> the first block could occupy, say 16 blocks and depending on the size of
> the hole, there could be no fragmentation.
You are assuming an optimization that may or may not exist. In my example,
there is certainly no reason why the first block would occupy 16 blocks. I
already specified the hole is exactly one block size. At most the file
system may allocate 3 blocks, so the middle one could be filled later. But
even in that case you would still get fragmentation as a result. However,
the fragmentation would more likely result from a one block file being
written into the "reserved" space, before it is needed for the updated
sparse file. Either way, use of a sparse file for a file that is regularly
accessed in RW mode will result in fragmentation. The only question is how
fast it will fragment. That behavior depends on the filesystem, and how the
drivers are implemented.
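
To make the one-block-hole case concrete, here is a rough Python sketch of what I mean. The 4096-byte block size and the function names are my own assumptions for illustration; the real block size, and whether the hole is actually left unallocated, depend on the filesystem and driver.

```python
import os
import tempfile

BLOCK = 4096  # assumed block size; the real value depends on the filesystem


def write_with_hole(path, block=BLOCK):
    """Write one block, skip one block by seeking, then write another block."""
    with open(path, "wb") as f:
        f.write(b"A" * block)       # first block: real data
        f.seek(block, os.SEEK_CUR)  # one-block hole: nothing is written here
        f.write(b"B" * block)       # third block: real data
    return os.stat(path)


def fill_hole(path, block=BLOCK):
    """Later update: write real data into the hole, as a database might."""
    with open(path, "r+b") as f:
        f.seek(block)
        f.write(b"C" * block)
    return os.stat(path)


def report(label, st):
    # st_blocks counts 512-byte units on POSIX; it may be missing elsewhere.
    blocks = getattr(st, "st_blocks", None)
    allocated = "unknown" if blocks is None else blocks * 512
    print(label, "logical size:", st.st_size, "allocated bytes:", allocated)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "hole.dat")
        report("with hole:  ", write_with_hole(path))
        report("hole filled:", fill_hole(path))
```

If the filesystem honours the hole, the allocated size is smaller than the logical size after the first step. The block written by fill_hole() then has to be placed wherever the filesystem finds room, which is where the fragmentation comes from.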
Really sophisticated drivers might even do things like rewrite the file if
it is below a threshold size, just to fix fragmentation on the fly. I can
definitely say NTFS is
not that sophisticated. Even on disks with a large amount of free space, it
fragments at an alarmingly fast rate. I defragment a Linux partition once
every [...] years at most (by repartitioning and copying). Any more frequent
and there is no noticeable improvement in performance. For NTFS I find I need
to run the defragmenter every weekend for optimal performance.
> 2) Normal read/write behavior would not result in a file that has a
> sparse block. I think it is a rare program which writes beyond EOF. So
> this would normally be a non-issue.
Correct. I am only talking about why it is a bad idea to blindly convert all
files to sparse files. Such a conversion can be done with either GNU tar or
GNU cp.
The above fragmentation behavior is going to happen and does happen when
the file in question is a database file, since databases tend to contain
lots of blank space intended for adding new records.
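
For anyone unsure what "blindly convert" means in practice, the following is a minimal Python sketch of the general idea: every all-zero block in the source is replaced by a hole in the copy. This is not the actual GNU cp or GNU tar code; sparse_copy() and its block parameter are hypothetical names I made up for illustration.

```python
import os


def sparse_copy(src, dst, block=4096):
    """Copy src to dst, seeking over all-zero blocks instead of writing them.

    Returns the number of bytes that were skipped (left as holes).
    """
    zero = bytes(block)
    skipped = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(block)
            if not chunk:
                break
            if chunk == zero[:len(chunk)]:
                # All zeros: advance the file offset without writing anything.
                fout.seek(len(chunk), os.SEEK_CUR)
                skipped += len(chunk)
            else:
                fout.write(chunk)
        # Fix the final size in case the file ends in a hole.
        fout.truncate()
    return skipped
```

The copy reads back byte-for-byte identical to the original, which is why the conversion looks harmless. The difference only shows up later, when the database writes records into what used to be preallocated blank space and the filesystem has to find new blocks for them.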
> 3) What no one seems to be mentioning is that we are trying to emulate
> UNIX behavior here. If the above is an issue for Windows then it could
> also be an issue for UNIX.
It sounds like we are really on the same page, but discussing different
things. CYGWIN should definitely support creating sparse files in the
classical Unix method of seeking beyond the end of the file. From what I've
seen in this thread it already does, and that is not an issue. What I'm
arguing is that files should not be blindly converted into sparse files with
GNU tar -S, GNU cp --sparse=always, etc.
If, for example, you convert a database file into a sparse file, it is not
uncommon for the resulting fragmentation to slow database access by an order
of magnitude or more.