Cygwin&Win32 file prefetch, block sizes?
Corinna Vinschen
corinna-cygwin@cygwin.com
Wed Apr 3 08:14:48 GMT 2024
On Apr 3 00:35, Martin Wege via Cygwin wrote:
> On Tue, Apr 2, 2024 at 3:17 PM Corinna Vinschen via Cygwin
> <cygwin@cygwin.com> wrote:
> >
> > On Apr 2 02:04, Martin Wege via Cygwin wrote:
> > > Hello,
> > >
> > > Is there any document which describes how Cygwin and Win32 file
> > > prefetch and readahead work, and which sizes are used (e.g. always
> > > read one full page even if only 16 bytes are requested?)?
> >
> > I'm not aware of any docs, but again, keep in mind that Cygwin is a
> > userspace DLL. We basically do what Windows does for low-level file
> > access.
> >
> > > Quick /usr/bin/stat /etc/profile returns "IO Block: 65536". Does that
> > > mean the file's block size is really 64k? Is this info per filesystem,
> > > or hardcoded in Cygwin?
> >
> > Hardcoded in Cygwin since 2017, based on a discussion in terms of
> > file access performance, especially when using stdio.h functions:
> >
> > https://cygwin.com/cgit/newlib-cygwin/commit/?id=7bef7db5ccd9c
>
> OUCH.
>
> While I can understand the motivation, FAT32 on multi-GB-devices
> having 64k block size, and Win32 API on Win95/98/ME/Win7 being
> optimized to that insane block size, it is absolutely WRONG with
> today's NTFS and even more so with ReFS. This only works if you stream
> files, but as soon as you are doing random read/writes the performance
> is terrible due to cache thrashing. That could explain the many
> complaints about Cygwin's IO performance.

The above patch *only* sets stat::st_blksize to 64K. Nothing else
happens!

This usually means that stdio.h functions use this size for their buffer
and readahead. It doesn't affect direct calls to read(2)/write(2) and
fread(3)/fwrite(3) at all!

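
To make the st_blksize point concrete, here is a minimal sketch in portable
POSIX C. The helper names io_block_size and fopen_with_bufsize are
hypothetical and are not part of Cygwin or newlib. The first helper reads the
same value that stat(1) prints as "IO Block". The second uses the standard
setvbuf(3) call so that an application doing random access can pick a smaller
stdio buffer for one stream, whatever st_blksize says.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Return the st_blksize reported for path, or -1 if stat() fails.
   This is the value stat(1) shows as "IO Block". */
long io_block_size(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return (long) st.st_blksize;
}

/* Open path for reading with an explicit, fully buffered stdio buffer
   of bufsize bytes instead of the library's default choice.
   Returns NULL on failure. */
FILE *fopen_with_bufsize(const char *path, size_t bufsize)
{
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return NULL;
    /* setvbuf() has to be called before any other operation on the
       stream. Passing NULL lets the library allocate the buffer. */
    if (setvbuf(fp, NULL, _IOFBF, bufsize) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}
```

For example, fopen_with_bufsize("data.bin", 4096) would request a 4K buffer
for that one stream, while io_block_size("/etc/profile") would return the
65536 quoted earlier in the thread.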
> So, what can be done? I'm not a benchmarking guru, so I'd like to
> propose to add a tunable called EXPERIMENTAL_PREFERRED_IO_BLKSIZE to

No.

We have two ways to handle this *iff* there's really a reason to
handle this.

- Either we just lower PREFERRED_IO_BLKSIZE to 4K or 8K, but that's
  kind of bad in terms of pipes, the clipboard, etc.

- So we keep PREFERRED_IO_BLKSIZE at 64K but don't use it for disk
  files. Rather, we read this info from the filesystem:

  https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/ntddk/ns-ntddk-_file_fs_sector_size_information

  If the filesystem is local and SSINFO_FLAGS_NO_SEEK_PENALTY is set, we
  could stick to 64K.

  Otherwise the PhysicalBytesPerSectorForPerformance member might be
  helpful I guess. Needs checking, of course.

  If this isn't any good, we can still fall back to
  FILE_FS_FULL_SIZE_INFORMATION as in fhandler_base::fstatvfs_by_handle,

  https://cygwin.com/cgit/newlib-cygwin/tree/winsup/cygwin/fhandler/disk_file.cc#n661

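
The second option could be sketched roughly as below. This is only the
decision logic in plain C, not Cygwin code and not a tested patch. The struct
sector_info and the function choose_blksize are hypothetical stand-ins for
whatever would actually be filled from FILE_FS_SECTOR_SIZE_INFORMATION. The
flag value 0x4 is an assumption that should be checked against ntddk.h, and
fallback_blksize stands for a value the caller obtained some other way, for
example from FILE_FS_FULL_SIZE_INFORMATION.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PREFERRED_IO_BLKSIZE  65536u        /* the 64K discussed in this thread */
#define NO_SEEK_PENALTY_FLAG  0x00000004u   /* assumed value of
                                               SSINFO_FLAGS_NO_SEEK_PENALTY */

/* Hypothetical summary of a sector size query for one disk file. */
struct sector_info {
    bool     query_ok;  /* the sector size query succeeded */
    bool     is_local;  /* filesystem is local, not a network share */
    uint32_t flags;     /* Flags member of the query result */
    uint32_t phys_bytes_per_sector_for_performance;
};

/* Pick the block size to report for a disk file. */
uint32_t choose_blksize(const struct sector_info *si,
                        uint32_t fallback_blksize)
{
    /* No usable sector info: use the caller's fallback. */
    if (si == NULL || !si->query_ok)
        return fallback_blksize;

    /* Local filesystem on a device without seek penalty: keep 64K. */
    if (si->is_local && (si->flags & NO_SEEK_PENALTY_FLAG))
        return PREFERRED_IO_BLKSIZE;

    /* Otherwise try the sector size the device prefers for performance. */
    if (si->phys_bytes_per_sector_for_performance != 0)
        return si->phys_bytes_per_sector_for_performance;

    return fallback_blksize;
}
```

Pipes, the clipboard and other non-disk handles would not go through this
function at all and would keep PREFERRED_IO_BLKSIZE.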
Corinna