du reports half of correct file sizes

Brian Dessent brian@dessent.net
Sun Dec 26 06:32:00 GMT 2004


Ross Boulet wrote:

> I do not have POSIXLY_CORRECT set.  I tried the -b and -k
> options.  The -b option reports correctly, but the -k still
> seems to report half.
> 
> $ ls -l a*
> -rw-r--r--  1 rossboulet None    2 Dec 25 17:24 a
> -rw-r--r--  1 rossboulet None 3740 Dec 25 10:58 aaa
> 
> $ du -b a*
> 2       a
> 3740    aaa
> 
> $ du -k a*
> 1       a
> 2       aaa

You're right, this does seem to be a bug in du.  The du from
fileutils-4.1-2 reports the correct size, but the one from
coreutils-5.2.1-3 does not.

Looking into it, the offending code seems to be the following starting
at line 383 in du.c:

      size = (apparent_size
              ? sb->st_size
              : ST_NBLOCKS (*sb) * ST_NBLOCKSIZE);

`apparent_size' is true if the `--apparent-size' option was supplied,
in which case the size is taken directly from the `sb' stat buffer.
Otherwise the size is computed from the number of blocks allocated to
the file, multiplied by ST_NBLOCKSIZE.
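
To make the two branches concrete, here is a minimal sketch of the same
choice as a standalone function.  The helper name `du_size_bytes' and its
`nblocksize' parameter are hypothetical (they stand in for the
ST_NBLOCKSIZE macro), and it assumes ST_NBLOCKS (*sb) simply yields
sb->st_blocks, which may not hold on every platform.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper, not taken from du.c.  It mirrors the quoted
   expression: either the apparent size from the stat buffer, or the
   allocated block count times an assumed bytes-per-block value
   (the role ST_NBLOCKSIZE plays in du.c).  */
static long long
du_size_bytes (const struct stat *sb, int apparent_size, long long nblocksize)
{
  return apparent_size
         ? (long long) sb->st_size
         : (long long) sb->st_blocks * nblocksize;
}
```

With a stat buffer holding st_size = 3740 and st_blocks = 4, the apparent
size is 3740 bytes, while the block-based size depends entirely on the
multiplier passed in.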

The problem here is the definition of ST_NBLOCKSIZE, which gets defined
as 512 in system.h in coreutils under Cygwin.  Apparently that file
contains a set of #defines that choose the value based on platform, with
a default of 512.  On Cygwin the block size is 1024, as defined by
S_BLKSIZE in newlib's sys/stat.h.  Thus du multiplies the block count by
half the real block size and reports half the correct usage.
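
The arithmetic can be checked against the numbers Ross reported.  This is
only a sketch under two assumptions: that Cygwin reports st_blocks as the
file length rounded up to whole 1024-byte blocks (so 1 block for `a' and
4 blocks for `aaa'), and that du -k rounds the byte total up to whole
kilobytes.  The function names are hypothetical.

```c
/* Round a byte count up to whole 1024-byte units (assumed du -k
   rounding behaviour).  */
static long long
kib_rounded_up (long long bytes)
{
  return (bytes + 1023) / 1024;
}

/* Value du -k would print for a file occupying `nblocks' blocks when
   each block is counted as `nblocksize' bytes.  */
static long long
du_k_value (long long nblocks, long long nblocksize)
{
  return kib_rounded_up (nblocks * nblocksize);
}

/* aaa: 4 blocks * 512  = 2048 bytes -> 2  (what coreutils du printed)
   aaa: 4 blocks * 1024 = 4096 bytes -> 4  (the expected value)
   a:   1 block  * 512  =  512 bytes -> 1  (rounded up, so the error
                                            is hidden for tiny files)  */
```

Under those assumptions the 512 multiplier reproduces both lines of the
reported `du -k' output.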

I don't know why du doesn't just use the block size as returned in the
stat buffer (sb->st_blksize) rather than relying on ST_NBLOCKSIZE.  They
have a separate macro for that, ST_BLKSIZE, but for some reason it is not
used here.  The following patch makes that change:

--- du.c.orig   2004-12-25 22:08:55.138687500 -0800
+++ du.c        2004-12-25 22:09:16.966812500 -0800
@@ -382,7 +382,7 @@ process_file (FTS *fts, FTSENT *ent)
     {
       size = (apparent_size
              ? sb->st_size
-             : ST_NBLOCKS (*sb) * ST_NBLOCKSIZE);
+             : ST_NBLOCKS (*sb) * ST_BLKSIZE(*sb));
     }
 
   if (first_call)

Perhaps a better way would be to modify system.h such that ST_NBLOCKSIZE
is set to 1024 for Cygwin, since there could very well be other places
where ST_NBLOCKSIZE is used in relation to file lengths, in which case
other coreutils programs might have issues as well.
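
As a rough illustration of that alternative, a platform conditional along
these lines could do it.  This is a hypothetical sketch and not the actual
text of system.h, whose real conditionals may be arranged differently; the
only names taken from the discussion above are ST_NBLOCKSIZE and the
values 512 and 1024.  __CYGWIN__ is the macro predefined by Cygwin's
compiler.

```c
/* Hypothetical sketch only; not copied from coreutils' system.h.  */
#ifndef ST_NBLOCKSIZE
# ifdef __CYGWIN__
#  define ST_NBLOCKSIZE 1024  /* assumed to match S_BLKSIZE in newlib's sys/stat.h */
# else
#  define ST_NBLOCKSIZE 512   /* the default described above */
# endif
#endif
```

Fixing the macro in one place would cover every caller of ST_NBLOCKSIZE,
whereas the du.c patch only touches du.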

Since Corinna is the current (reluctant) coreutils maintainer, it is
really up to her to decide how to handle this and whether to pester the
upstream coreutils people.

Brian
