Slow stat(2) performance on ClearCase MVFS

Earl Chew earl_chew@agilent.com
Sat Apr 18 19:02:00 GMT 2009


For quite a while now I've seen noticeably poor cygwin performance
on ClearCase MVFS drives with recursive commands like:

o grep -r
o find .
o rm -r

For example, executing 'find . -name "*.exe"' on a particular
MVFS directory tree here takes 8 mins (480 secs), but using
the strategy outlined in result 6 below reduces the time to 32 secs.


I did some digging on 1.5.25-15 and narrowed the issue down to the
performance of stat(2).



Some questions:

o Does it make sense to replace GetFileAttributes() with
   FindFirstFile() in all cases ?

o Is it possible for fhandler_base::fstat_fs() to always
   use fstat_by_name() only, and avoid using open_fs() and
   fstat_by_handle() ?





Here are timings using some simple benchmarking programs. Each
program has a simple 10000 iteration loop:


GetFileAttributes    Perform GetFileAttributes(argv[1])
FindFirstFile        Perform FindFirstFile(argv[1]), FindClose()
stat                 Perform stat(argv[1])


The results are measured in elapsed seconds using cygwin time(1)
on the following files:

NTFS     c:/WINDOWS/system32/drivers/etc/hosts
MVFS     v:/cerberus/daytona/lib/Makefile.mk


                          NTFS   MVFS
1. GetFileAttributes     0.66   10.5
2. FindFirstFile         0.33    1.2
3. stat(MSVC)            0.37    1.2
4. stat(CYGWIN-1.5.25)   1.47   20.3
5. stat(no open)          2.4   11.5
6. stat(no attr, open)    2.0    2.3


Results 2 and 3 show that the Win32 and MSVC functions perform
well, but that ClearCase MVFS can be expected to be three to four
times slower than native NTFS.

Result 1 shows that GetFileAttributes is nearly ten times
slower than FindFirstFile for MVFS, and twice as slow for NTFS.

Result 4 gives a baseline performance for stat(2) on a vanilla
1.5.25-15 system.

Result 5 shows a near doubling of MVFS performance over result 4
(20.3 secs down to 11.5 secs) by forcing fstat_by_name() instead of
fstat_by_handle():

--- fhandler_disk_file.cc.orig  2009-04-18 10:26:34.937500000 -0700
+++ fhandler_disk_file.cc       2009-04-18 10:27:04.484375000 -0700
@@ -356,7 +356,7 @@
         return fstat_by_name (buf);
        query_open (query_stat_control);
      }
-  if (!(oret = open_fs (open_flags, 0)) && get_errno () == EACCES)
+  if ((oret = 0) && !(oret = open_fs (open_flags, 0)) && get_errno () == EACCES)


Result 6 shows a roughly ninefold improvement in MVFS performance over
result 4 (20.3 secs down to 2.3 secs) by forcing fstat_by_name() and
also overriding GetFileAttributes() with a version built on
FindFirstFile():

--- path.cc.orig        2009-04-18 11:18:49.812500000 -0700
+++ path.cc     2009-04-18 11:19:01.625000000 -0700
@@ -4299,3 +4299,24 @@
      strcpy (bs, ".");
    return buf;
  }
+
+extern "C"
+DWORD GetFileAttributes (const TCHAR* path)
+{
+  for (const TCHAR* p = path; *p; ++p)
+    if (*p == '*' || *p == '?')
+       return INVALID_FILE_ATTRIBUTES;
+
+  WIN32_FIND_DATA findbuf;
+
+  HANDLE findhandle = FindFirstFile(path, &findbuf);
+
+  if (findhandle != INVALID_HANDLE_VALUE)
+    {
+      FindClose(findhandle);
+
+      return findbuf.dwFileAttributes;
+    }
+
+  return INVALID_FILE_ATTRIBUTES;
+}





