This is the mail archive of the cygwin@cygwin.com mailing list for the Cygwin project.



Re: problem report: gawk 3.1.1


Pieter Prinsloo wrote:
> Hi.
>
> Have a query/problem with gawk version 3.1.1-5 dated 17/Oct/2002.
> (although the problem as stated is for Cygwin, it can also be
> reproduced in Linux with gawk 3.1.0)
>
> Given the following example
> == short awk program file ==
> {
>    one =printf("%s",$1);
>    two =printf("%s",$2);
>    printf("LEFT=:right=:\n",$1,$2);
>    printf("left=:right=:\n",one,two);
> }
> ==

The above shouldn't even compile in awk.  awk isn't C or
perl, and printf() doesn't return a value (nor do you need
the semicolons).

Did you mean sprintf()?

It is usually best to use copy and paste to put your bash
session into questions of this sort.

I took your awk program and changed it to use sprintf():

  {
    one = sprintf("%s",$1);
    two = sprintf("%s",$2);
    printf("LEFT=:right=:\n",$1,$2);
    printf("left=:right=:\n",one,two);
  }

Even so, your code will only ever print out

LEFT=:right=:
left=:right=:
LEFT=:right=:
left=:right=:
LEFT=:right=:
left=:right=:

with the sample file that you quote, because the printf format
strings don't contain "%s" anywhere, so the extra arguments are
never printed.

Correcting this to:

  {
    one = sprintf("%s",$1);
    two = sprintf("%s",$2);
    printf("LEFT=: %s right=: %s\n",$1,$2);
    printf("left=: %s right=: %s\n",one,two);
  }

gives this:

LEFT=: left side of record 1  right=:  right side of record
left=: left side of record 1  right=:  right side of record
LEFT=: left side of record 2  right=:  right side of record
left=: left side of record 2  right=:  right side of record
LEFT=: left side of record 3  right=:  right side of record
left=: left side of record 3  right=:  right side of record

as its output.
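
For anyone who wants to try this without the original sample file
(which isn't shown here), the following is a minimal sketch.  The
two-record input and the "|" field separator are my own assumptions,
chosen only so that each field can contain spaces; plain awk is used,
and gawk should behave the same way.

```shell
# Hypothetical input: two records, fields separated by "|".
out=$(printf 'left 1|right 1\nleft 2|right 2\n' | awk -F'|' '{
  one = sprintf("%s", $1)   # sprintf returns the formatted string
  two = sprintf("%s", $2)
  printf("LEFT=: %s right=: %s\n", $1, $2)
  printf("left=: %s right=: %s\n", one, two)
}')
printf '%s\n' "$out"
```

Each input record should produce a "LEFT=:" line and an identical
"left=:" line, showing that sprintf() copied the fields unchanged.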

I suggest that you take a look at the gawk manual and read
the section about the BINMODE variable, which is used to
determine how gawk deals with line end conversion.

Remember that, in general, Cygwin sets up a UNIX-like file
system and file handling, so the line terminator is expected
to be "\n", whereas "MSDOS"-created files have "\r\n" line
terminators.  Under UNIX, if you pass a file with MSDOS line
terminators to gawk, it treats only the "\n" as the terminator,
not the "\r\n" pair.  In other words, each line read will end
with a "\r", and when the output is shown on a terminal this
may cause later text to overwrite all or part of what was
printed earlier on the same line.

Something like this:

$ echo -e "a b c \r" | gawk '{print NF}'
4

$

is correct behaviour because the "\r" character represents
a separate field.  And notice that

$ echo -e "a b c \r" | gawk '{print $0, NF}'
 4b c

$

is correct under UNIX/Linux.

In this case the "\r" causes the OFS space and the digit 4
to overwrite the "a " of "a b c ".
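
Because the "\r" is invisible on a terminal, one way to confirm it
is there is to compare record lengths.  This is only a sketch: it
uses printf rather than echo -e for portability and plain awk
(gawk should behave the same), and it assumes a UNIX-like setup
where "\n" alone ends the record.

```shell
# Same text, once with a DOS line ending and once with a UNIX one.
dos_len=$(printf 'a b\r\n' | awk '{ print length($0) }')
unix_len=$(printf 'a b\n' | awk '{ print length($0) }')
echo "DOS: $dos_len  UNIX: $unix_len"
```

The DOS record comes out one character longer, and that extra
character is the trailing "\r".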

The sed command that you mention works because it removes the
"\r" characters.
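
As an alternative to a separate sed step, the "\r" can also be
stripped inside the awk program itself.  This is a sketch of that
idea, not the sed command from your message; it uses plain awk,
and gawk should behave the same.

```shell
# Without stripping, the trailing "\r" counts as a fourth field.
raw_nf=$(printf 'a b c \r\n' | awk '{ print NF }')

# sub() edits $0 in place; awk then re-splits the record into fields.
fixed=$(printf 'a b c \r\n' | awk '{ sub(/\r$/, ""); print $0, NF }')

echo "raw NF: $raw_nf"
echo "fixed:  $fixed"
```

With the "\r" removed, NF drops from 4 to 3 and nothing gets
overwritten on the terminal.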

Note, in the above I'm assuming that you are using a full
Cygwin installation including bash.  You will possibly
get different results if you run gawk from a DOS box under
Windows.

HTH
--
Peter S Tillier
"Who needs perl when you can write dc and sokoban in sed?"




