This is the mail archive of the cygwin@cygwin.com mailing list for the Cygwin project.



Re: wget not behaving correctly


Sorry about the delay, have been too busy this week 8-(


cygwin-digest-help@sources.redhat.com wrote:
> Subject: Re: wget not behaving correctly
> Date: Sat, 8 Sep 2001 14:34:59 -0600
> From: "C. Porter Bassett" <porter@et.byu.edu>
> Reply-To: "C. Porter Bassett" <cporter@byu.edu>
> To: <cygwin@cygwin.com>
> 
> ----- Original Message -----
> From: "Hack Kampbjørn" <hack@hackdata.com>
> 
> >Yes this is the '?' in the URL. There can be two problems here depending
> >on which shell you use. First your shell may not send the '?' to wget
> >but directly complain that it cannot expand the globbing character '?'
> >to a filename, I cannot tell if this is the case as you did not include
> >full output of running the wget command.
> 
> OK, here it is:
> 1427:PWORK:~$ wget.exe -rd -A gif,jpg,png,bmp http://fan.theonering.net/rolozo/galleries.php?mode=2
> DEBUG output created by Wget 1.6 on cygwin32.
> 
> parseurl ("http://fan.theonering.net/rolozo/galleries.php?mode=2") -> host
> fan.theonering.net -> opath rolozo/galleries.php?mode=2 -> dir rolozo ->
> file galleries.php?mode=2 -> ndir rolozo
> newpath: /rolozo/galleries.php?mode=2
> Checking for fan.theonering.net.
> This is the first time I hear about host fan.theonering.net by that name.
> --14:27:54--  http://fan.theonering.net/rolozo/galleries.php?mode=2
>            => `fan.theonering.net/rolozo/galleries.php?mode=2'
> Connecting to fan.theonering.net:80... Created fd 3.
> connected!
> ---request begin---
> GET /rolozo/galleries.php?mode=2 HTTP/1.0
> User-Agent: Wget/1.6
> Host: fan.theonering.net
> Accept: */*
> 
> ---request end---
> HTTP request sent, awaiting response... HTTP/1.1 200 OK
> Date: Sat, 08 Sep 2001 20:35:39 GMT
> Server: Apache/1.3.19 (Unix) PHP/4.0.4pl1 mod_perl/1.25
> X-Powered-By: PHP/4.0.4pl1
> Connection: close
> Content-Type: text/html
> 
> Length: unspecified [text/html]
> fan.theonering.net/rolozo/galleries.php?mode=2: No such file or directory
> Closing fd 3
> 
> Cannot write to `fan.theonering.net/rolozo/galleries.php?mode=2' (No such
> file or directory).

OK, so the problem is that '?' is an illegal character in file names on
Windows file systems.

> 
> FINISHED --14:27:54--
> Downloaded: 0 bytes in 0 files
> 1427:PWORK:~$
> 

You can work around it with the --output-document option, but then you
lose the recursion.

What I use is this patch (it should apply cleanly to wget-1.6 too). Yes,
it's a crude hack, and it doesn't address all the problems with illegal
characters. Also, some of the code in wget doesn't expect the file to be
saved under a different name than on the web server, so you lose some
options too (IIRC --convert-links).

I keep forgetting that the wget-patch list isn't archived, and not every
patch is sent to the wget list. Sorry for sending you on a fruitless
search there 8-(

Index: src/url.c
===================================================================
RCS file: /pack/anoncvs/wget/src/url.c,v
retrieving revision 1.21.2.1
diff -u -r1.21.2.1 url.c
--- src/url.c   2000/12/17 19:28:20     1.21.2.1
+++ src/url.c   2001/02/03 15:53:24
@@ -1272,16 +1272,17 @@
          file = nfile;
        }
     }
-  /* DOS-ish file systems don't like `%' signs in them; we change it
-     to `@'.  */
-#ifdef WINDOWS
+  /* Windows file systems don't like `?' signs in them; we change it
+     to `@'. 
+     #### Note: nor are \ / : * " < > | allowed */
+#if defined(WINDOWS) || defined (__CYGWIN__)
   {
     char *p = file;
     for (p = file; *p; p++)
-      if (*p == '%')
+      if (*p == '?')
        *p = '@';
   }
-#endif /* WINDOWS */
+#endif /* WINDOWS or CYGWIN */
 
   /* Check the cases in which the unique extensions are not used:
      1) Clobbering is turned off (-nc).
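
For what it's worth, the note in the patch comment (that \ / : * " < > |
are not allowed either) could be handled with something along these
lines. This is only a rough sketch, not part of wget and not tested
against its sources: the function name sanitize_windows_filename is made
up for illustration, and it assumes it is handed a single file name
component, since it replaces '/' and '\' as well and would mangle a full
path.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not part of wget: replace, in place, every
   character that Windows file systems reject in a file name with `@'.
   FILE must be a single path component, because `/' and `\' are
   replaced too.  Returns the number of characters replaced.  */
int
sanitize_windows_filename (char *file)
{
  /* `?' from the patch, plus the characters listed in its note.  */
  static const char illegal[] = "?\\/:*\"<>|";
  int replaced = 0;
  char *p;

  assert (file != NULL);
  for (p = file; *p; p++)
    if (strchr (illegal, *p))
      {
        *p = '@';
        replaced++;
      }
  return replaced;
}
```

Called on a writable copy of "galleries.php?mode=2" it should leave
"galleries.php@mode=2" in the buffer. The same caveat as for the patch
applies: wget does not expect the local name to differ from the remote
one, so some options may still misbehave.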


-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn               hack@hackdata.com
HackLine                     +45 2031 7799
