This is the mail archive of the cygwin mailing list for the Cygwin project.

RE: [ANNOUNCEMENT] Updated: rcs-5.8.1-1: The Revision Control System

Dear Dr. Zell,

> [ANNOUNCEMENT] Updated: rcs-5.8.1-1: The Revision Control System

Do you perhaps know whether this version fixes the file corruption
problem noticed earlier on this list for 5.8, or should we stay with
5.7 to avoid corrupting large RCS files?

Short problem description (what I think is happening):

For large files, the check-in tool ci decides to do file descriptor IO
instead of memory mapped IO.

Ci reads the start of the work file. This triggers the IO library to
fill its buffer, 64k in this case.

Ci dups the file descriptor and supplies it to a diff subprocess. The
file pointer is shared.

Diff reads the entire file. This places the file pointer at the end of
the file.

Ci rewinds the work file. The IO library sees that the start-of-file
position is in the current buffer and simply resets the buffer, without
seeking on the file descriptor.

Ci wants to read the entire work file to place it in the new rcs
file. At the end of the 64k buffer, the IO library reads the file
descriptor for the next 64k.  But the file pointer of the file
descriptor is still at the end of the file, so ci gets an EOF and
thinks the work file copy is complete.

Thus the content of the last version in the rcs file is truncated to
64k and the rcs file most likely becomes inconsistent, since the edit
scripts to reproduce older versions from the last version refer to
lines that are no longer present.
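
The sequence described above can be imitated with plain file descriptor
calls. The following Python sketch is only an illustration of the suspected
mechanism, not the actual ci or diff code: the function name simulate, the
200k file size and the real_seek switch are made up for the example, and the
"buffer" is just a bytes object standing in for the IO library's 64k buffer.

```python
import os
import tempfile

BUF = 64 * 1024  # buffer size mentioned in the description above


def simulate(size=200 * 1024, real_seek=False):
    """Return how many bytes a simulated 'ci' copies from the work file.

    Hypothetical helper: it mimics the suspected sequence with raw
    descriptor calls and does not use any RCS code.
    """
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * size)
        path = tmp.name

    fd = os.open(path, os.O_RDONLY)
    try:
        # "ci" reads the start of the work file; the buffer is filled.
        buffer = os.read(fd, BUF)

        # The descriptor is duplicated for "diff". Both descriptors
        # share one file offset.
        dup_fd = os.dup(fd)
        while os.read(dup_fd, BUF):  # "diff" reads to end of file
            pass
        os.close(dup_fd)

        if real_seek:
            # A true rewind moves the shared offset back to 0.
            os.lseek(fd, 0, os.SEEK_SET)
            buffer = b""
        # Otherwise the "rewind" only reuses the buffered data and
        # leaves the offset at end of file.

        copied = len(buffer)
        while True:
            chunk = os.read(fd, BUF)
            if not chunk:  # EOF
                break
            copied += len(chunk)
        return copied
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    print("buffer-only rewind:", simulate())
    print("real lseek rewind: ", simulate(real_seek=True))
```

With the buffer-only rewind the copy stops after the first 64k, while an
explicit lseek on the descriptor lets the whole file be copied.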

If you do a check-in followed by a check-out, you end up with a
truncated work file and a corrupted rcs file, lose recent work, and
must resort to backups.


Peter Wagemans
