This is the mail archive of the
mailing list for the Cygwin project.
Re: readdir truncates file names whose UTF-8 representation is longer than 255 bytes
On Mar 2 06:56, Uri Simchoni wrote:
> I'm using Cygwin 1.7.7 in UTF-8 mode. I have a file whose name is composed of Hebrew characters, so its UTF-8 representation is longer than 255 bytes.
> Trying "ls -l" fails to list the file's attributes.
> Using a short C program that loops through a directory (readdir()/stat()) shows that readdir() truncates the file name.
> Is there any way around it? (using an environment variable, fstab, or a system call other than readdir - I want to keep UTF-8)
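The kind of check described in the quoted message can be sketched in C roughly as follows. This is only a minimal sketch, not the original poster's program: the helper name count_unstatable_entries is hypothetical, and it assumes that a name truncated by readdir() no longer matches a real file, so that stat() on it fails.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: walk `dirpath` with readdir() and stat() each entry.
   Returns the number of entries that could not be stat()ed, or -1 if the
   directory itself cannot be opened. */
int count_unstatable_entries(const char *dirpath)
{
    DIR *dir = opendir(dirpath);
    if (dir == NULL)
        return -1;

    int failures = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        char path[4096];
        int n = snprintf(path, sizeof path, "%s/%s", dirpath, entry->d_name);
        if (n < 0 || (size_t)n >= sizeof path) {
            /* Combined path does not fit in the buffer; count it as a failure. */
            failures++;
            continue;
        }

        struct stat st;
        if (stat(path, &st) != 0) {
            /* Assumption: a truncated d_name names no existing file. */
            fprintf(stderr, "stat failed for %zu-byte name: %s\n",
                    strlen(entry->d_name), strerror(errno));
            failures++;
        }
    }
    closedir(dir);
    return failures;
}
```

Calling count_unstatable_entries(".") in the affected directory should then report a non-zero count if any name came back truncated. Note that a dangling symlink also makes stat() fail, so a non-zero count is a hint rather than proof.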
I don't think there's a way around this, at least not an easy one for
Cygwin. The problem is that the dirent structure has no room for a
multibyte string of more than 255 bytes, while the underlying OS
provides filenames with up to 255 UTF-16 chars.
To support that, we would have to raise the size of a single dirent so
that it allows names of at least 512 bytes, but even that might be too
short; 1024 would be required. That's not exactly an easy change, so we
won't do that any time soon, I think.
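The size estimates can be checked with a little encoding arithmetic. The sketch below counts the UTF-8 bytes needed for a sequence of UTF-16 code units; the function name utf8_len_from_utf16 is hypothetical and this is not Cygwin code. Hebrew letters (U+0590 to U+05FF) take 2 bytes each in UTF-8, so 255 of them need 510 bytes, while other BMP characters can take 3 bytes each, which gives 765 bytes for 255 code units. That is why 512 can be too short while 1024 leaves room.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of bytes needed to encode `n` UTF-16 code
   units as UTF-8, not counting a terminating NUL. An unpaired surrogate is
   counted as 3 bytes (an assumption; real converters may handle it differently). */
size_t utf8_len_from_utf16(const uint16_t *s, size_t n)
{
    size_t bytes = 0;
    for (size_t i = 0; i < n; i++) {
        uint16_t u = s[i];
        if (u < 0x80) {
            bytes += 1;                      /* ASCII */
        } else if (u < 0x800) {
            bytes += 2;                      /* includes Hebrew, U+0590..U+05FF */
        } else if (u >= 0xD800 && u <= 0xDBFF && i + 1 < n &&
                   s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
            bytes += 4;                      /* surrogate pair: 2 units, 4 bytes */
            i++;
        } else {
            bytes += 3;                      /* rest of the BMP */
        }
    }
    return bytes;
}
```

A surrogate pair uses two code units for four bytes, so the worst case per code unit is the 3-byte BMP case.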
The only solution for now is to switch to another charset or to
shorten the filenames.
Sorry for not having better news,
Corinna Vinschen
Cygwin Project Co-Leader
Please send mails regarding Cygwin to cygwin AT cygwin DOT com