Grab-a-Site

Sergey Okhapkin sos@prospect.com.ru
Tue Feb 4 01:36:00 GMT 1997


Hi!

Someone on this mailing list asked about a tool to download pages/trees from
a web site to local disk. I wrote such a tool as a Tcl script. Here are the
readme file and the script itself.

------------Grab-A-Site-1.0.tcl.README

			Grab-A-Site v1.0
			~~~~~~~~~~~~~~~~

1. What is it?

  This is a Tcl script to download ("grab") an entire WWW site or a selected
tree. You need Tcl 7.6 or later to run this script. Earlier versions of the
Tcl interpreter will not work because the script uses the "file delete"
command. The script was successfully tested with Tcl 7.6 and 8.0a2 on Win32
(Windows NT 4.0) and Linux platforms. Check the availability of these packages
for your platform at http://www.sunlabs.com/research/tcl . The script requires
the http 1.0 package from the Tcl 8.0a2 source distribution (file
lib/http.tcl). WARNING: this file is missing from the Tcl/Tk 8.0a2 binary
distribution for Windows.
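
The requirements above can be checked from tclsh before running the script.
The following is only a sketch, not part of Grab-A-Site: the proc name
tclIsAtLeast and its parameter minVersion are made up for illustration, and
the probe asks for any version of the http package rather than exactly 1.0.

```tcl
# Illustrative helper (not in Grab-A-Site): is this interpreter new enough?
proc tclIsAtLeast {minVersion} {
    # package vcompare returns -1, 0 or 1, much like string compare.
    return [expr {[package vcompare [info tclversion] $minVersion] >= 0}]
}

if {![tclIsAtLeast 7.6]} {
    puts stderr "Tcl 7.6 or later is required, found [info tclversion]"
}

# The http package may be absent (for example in a binary distribution
# that omits lib/http.tcl), so probe for it instead of failing outright.
if {[catch {package require http} err]} {
    puts stderr "http package not available: $err"
}
```

If the second message appears, copying lib/http.tcl from the source
distribution into the Tcl library directory is the likely fix.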

2. How to use it?

  Run "tclsh Grab-A-Site-1.0.tcl URL [download directory]",
or (on UNIX systems) "chmod +x Grab-A-Site-1.0.tcl" and run
"Grab-A-Site-1.0.tcl URL [download directory]",

where:
  URL - full URL of the requested file or directory tree. A trailing "/" is
        required after a directory name!
  Download directory - where to download to (defaults to the current
        directory). The script automatically creates a directory tree
        corresponding to the server's tree.

Examples:
  tclsh Grab-A-Site-1.0.tcl www.w3.org/pub/WWW/Protocols/ www.w3.org - download
  the /pub/WWW/Protocols/ tree from web site www.w3.org to local directory
  www.w3.org.

  Grab-A-Site-1.0.tcl www.favorite.place/misc/aaa.tar.gz - download file
  aaa.tar.gz from directory /misc of www site www.favorite.place to current
  directory

  Grab-A-Site-1.0.tcl http://www.xpics.com/ /pics/cool - grab entire site
  www.xpics.com to local directory /pics/cool :-)
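
To make the mapping in these examples concrete, here is a rough sketch of how
a URL and a download directory could be combined into a local file name. It
is not the code from the script: the proc name localPathFor is invented, only
an optional "http://" prefix is handled, and the default file name index.html
for directory URLs is an assumption.

```tcl
# Illustrative sketch: map a URL onto a file name under localdir.
proc localPathFor {url localdir {defaultname index.html}} {
    # Drop an optional leading "http://".
    if {[string match -nocase http://* $url]} {
        set url [string range $url 7 end]
    }
    # Everything from the first "/" onward is the server path.
    set slash [string first / $url]
    if {$slash < 0} {
        set rest /
    } else {
        set rest [string range $url $slash end]
    }
    # Split the server path at its last "/" into directory and file.
    set cut [string last / $rest]
    set dir [string range $rest 0 $cut]
    set file [string range $rest [expr {$cut + 1}] end]
    if {$file eq ""} {
        # A URL ending in "/" names a directory; assume a default page.
        set file $defaultname
    }
    return $localdir$dir$file
}

puts [localPathFor www.w3.org/pub/WWW/Protocols/ www.w3.org]
puts [localPathFor www.favorite.place/misc/aaa.tar.gz .]
puts [localPathFor http://www.xpics.com/ /pics/cool]
```

In this sketch a URL without the trailing "/" (www.w3.org/pub/WWW/Protocols)
is treated as a file named Protocols in /pub/WWW/, which is why the "/" after
a directory name matters.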

If the site has already been downloaded, the script performs an update check:
it downloads only the pages/files modified since the last download.
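
An update check of this kind is usually done with the HTTP If-Modified-Since
request header, built from the modification time of the local copy. A minimal
sketch, assuming that approach (the proc names httpDate and
ifModifiedSinceHeader are illustrative only):

```tcl
# Illustrative helper: format a time in seconds as an HTTP date in GMT.
proc httpDate {seconds} {
    return [clock format $seconds \
        -format "%a, %d %b %Y %H:%M:%S GMT" -gmt 1]
}

# Illustrative helper: return a header name/value list for an existing
# local copy, or an empty list when there is nothing to compare against.
proc ifModifiedSinceHeader {localFile} {
    if {![file exists $localFile]} {
        return {}
    }
    return [list If-Modified-Since [httpDate [file mtime $localFile]]]
}

puts [httpDate 0]
```

A server that honours the header answers "304 Not Modified" for unchanged
files, so nothing has to be transferred again.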

Hint - if you hate this script's long name, just rename it or make a symlink
:-)


3. Known bugs.

  - Grab-A-Site-1.0.tcl doesn't handle cgi requests properly.
  - No image map support.

4. How it works.

 I don't know exactly, but it works :-) Thanx to Dr. Ousterhout for his nice
Tcl language!
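
For a rough picture all the same: a grabber of this kind has to find the
links in every HTML page it fetches and then request each of them in turn.
A minimal sketch of the link-finding step, assuming links appear as
double-quoted href, src or background attributes. The proc name extractLinks
is illustrative, and "regexp -all -inline" needs a newer Tcl than 7.6.

```tcl
# Illustrative sketch: collect the values of href, src and background
# attributes from a chunk of HTML, in the order they appear.
proc extractLinks {html} {
    set links {}
    set re {(href|src|background)="([^"]+)"}
    # -inline returns, for every match: whole match, attribute name, value.
    foreach {whole attr ref} [regexp -all -inline -nocase -- $re $html] {
        lappend links $ref
    }
    return $links
}

puts [extractLinks {<a href="a.html"><img SRC="pic.gif">}]
```

Links generated by CGI requests or image maps are not covered by such a
pattern, which fits the known bugs listed in section 3.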

5. Status of this software.

  This package is released as public domain code. Feel free to redistribute
and modify this code without the author's permission.

6. ToDo

- error handling improvements.
- make section 3 (Known bugs) clean :-)
- Tk-based GUI interface. Is it really needed?

Sergey Okhapkin					FidoNet 2:5020/47
						E-mail: sos@prospect.com.ru

--------------------------------

 

--
Sergey Okhapkin
Moscow, Russia
Looking for a job.


begin 600 Grab-A-Site-1.0.tcl
M(R$O=7-R+V)I;B]T8VQS: T*#0IP86-K86=E(')E<75I<F4@:'1T<" Q+C -
M"@T*(R!#;VYF:6=U<F%T:6]N(&1A=&$-"G-E="!D969A=6QT;F%M92!I;F1E
M>"YH=&UL#0IS970@;6%X=&AR96%D<R U#0IS970@=&EM97-T86UP9FEL92 N
M1W)A8BU!+5-I=&4N=&EM97-T86UP#0H-"G!R;V,@:'1T<$-A;&QB86-K('MT
M;VME;GT@>PT*"6=L;V)A;"!N=&AR96%D<R!L;V-A;&1I<B!D969A=6QT;F%M
M90T*"75P=F%R(",P("1T;VME;B!S=&%T90T*"6EF('LD;G1H<F5A9',@/B P
M?2![:6YC<B!N=&AR96%D<R M,7T-"@ES970@=2!;0G)E86M5<FP@)'-T871E
M*'5R;"E=#0H)<W=I=&-H("UG;&]B("1S=&%T92AH='1P*2![#0H)*C(P,"H@
M>PT*"2,@;F]R;6%L(&-O;7!L971I;VX-"@D):68@6W-T<FEN9R!M871C:"!T
M97AT+VAT;6P@)'-T871E*'1Y<&4I72![#0H)"0ES965K("1S=&%T92@M8VAA
M;FYE;"D@, T*"0D)<')O8V5S<VAT;6P@)'1O:V5N#0H)"7T-"@D)8VQO<V4@
M)'-T871E*"UC:&%N;F5L*0T*"2 @(" @('T-"@DJ,S Q*B M#0H)*C,P,BH@
M>PT*"2,@<F5D:7)E8W1I;VX-"@D)8VQO<V4@)'-T871E*"UC:&%N;F5L*0T*
M"0EC871C:"![9FEL92!D96QE=&4@)&QO8V%L9&ER6VQI;F1E>" D=2 S75ML
M:6YD97@@)'4@-%U]#0H)"6%R<F%Y('-E="!M971A("1S=&%T92AM971A*0T*
M"0EH='1P1V5T0F]D>2!;<W1R:6YG('1R:6T@)&UE=&$H3&]C871I;VXI70T*
M"2 @(" @('T-"@DJ,S T*B![#0H)(R!.;W0@;6]D:69I960-"G!U=',@(D9I
M;&4@)'-T871E*'5R;"D@;F]T(&UO9&EF:65D(@T*"0EI9B!;<W1R:6YG(&UA
M=&-H('1E>'0O:'1M;" D<W1A=&4H='EP92E=('L-"@D)"7-E96L@)'-T871E
M*"UC:&%N;F5L*2 P#0H)"0EP<F]C97-S:'1M;" D=&]K96X-"@D)?0T*"0EC
M;&]S92 D<W1A=&4H+6-H86YN96PI#0H)(" @(" @?0T*"2HT*B M#0H)*C4J
M(" @>PT*"2,@97)R;W(-"@D)8VQO<V4@)'-T871E*"UC:&%N;F5L*0T*"0EC
M871C:"![9FEL92!D96QE=&4@)&QO8V%L9&ER6VQI;F1E>" D=2 S75ML:6YD
M97@@)'4@-%U]#0H)(" @("!]#0H)?0T*?0T*#0HC#0HC($)R96%K(&1O=VX@
M=7)L#0HC(%)E='5R;G,@82!L:7-T(&]F('5R;"!P87)T<PT*(PT*<')O8R!"
M<F5A:U5R;"!U<FP@>PT*"6EF('LA(%MR96=E>' @+6YO8V%S92!<#0H@>RAH
M='1P.B\O?&9T<#HO+WQG;W!H97(Z+R]\;6%I;'1O.GQN97=S.GQF:6QE.B]\
M=&5L;F5T.B\O*3\H6UXO.B-=*RD_*#HH6S M.5TK*2D_*%M>(UTJ+RD_*%M>
M+R-=*RD_*",N*BD_?2!<#0H)"21U<FP@7 T*"2 @("!X('!R;W1O(&AO<W0@
M>2!P;W)T('!A=&@@9FEL92!M87)K77T@>PT*"0EE<G)O<B B26YE<FYA;"!E
M<G)O<CH@56YS=7!P;W)T960@55),.B D=7)L(@T*"7T-"@ER971U<FX@6VQI
M<W0@)'!R;W1O("1H;W-T("1P;W)T("1P871H("1F:6QE70T*?0T*#0HC#0HC
M($=E="!B;V1Y(&]F('5R;"UP;VEN=&5D(&1O8W5M96YT+@T*(R!U<FP@+2!F
M=6QL(&1O8W5M96YT('5R; T*(PT*<')O8R!H='1P1V5T0F]D>2![=7)L?2![
M#0H)9VQO8F%L(&QO8V%L9&ER(&1E9F%U;'1N86UE(&YT:')E861S(&UA>'1H
M<F5A9',@<W1A<G1T:6UE#0H)(R!B<F5A:R!U<FP@:6YT;R!P871H(&%N9"!F
M:6QE;F%M90T*"7-E="!U(%M"<F5A:U5R;" D=7)L70T*"7-E="!P871H(%ML
M:6YD97@@)'4@,UT-"@ES970@9FEL92!;;&EN9&5X("1U(#1=#0H):68@>UMS
M=')I;F<@;&5N9W1H("1F:6QE72 ]/2 P?2![#0H)"7-E="!F:6QE("1D969A
M=6QT;F%M90T*"7T-"@ES970@;&8@)&QO8V%L9&ER)'!A=&@D9FEL90T*"6EF
M(%MF:6QE(&5X:7-T<R D;&9=('L-"B-P=71S('-T9&5R<B B1FEL92 D;&8@
M97AI<W1S(@T*"0ES970@9G1I;64@6V9I;&4@;71I;64@)&QF70T*"0EI9B![
M)&9T:6UE(#X]("1S=&%R='1I;65]('L-"@D)"2,@9FEL92!D;W=N;&]A9&5D
M(&EN('1H:7,@<V5S<VEO;@T*"0D)<F5T=7)N#0H)"7T-"@D)<V5T(&UO9&EF
M:65D(%ML:7-T($EF+4UO9&EF:65D+5-I;F-E(%P-"@D)("!;8VQO8VL@9F]R
M;6%T("1F=&EM92 M9F]R;6%T("(E82P@)60@)6(@)5D@)4@Z)4TZ)5,@1TU4
M(B M9VUT(#%=70T*"0EI9B!;8V%T8V@@>W-E="!C:&%N;F5L(%MO<&5N("1L
M9B!R*UU]72![#0H)"0EP=71S(")#86XG="!O<&5N("1L9B(-"@D)"7)E='5R
M;@T*"0E]#0H)"65X96,@=&]U8V@@)&QF#0H)?2!E;'-E('L-"@D)<V5T(&UO
M9&EF:65D(%ML:7-T($EF+4UO9&EF:65D+5-I;F-E(")4:'4L(# Q($1E8R Q
M.3<Q(#$V.C P.C P($=-5")=#0H)"6EF(%MC871C:"![9FEL92!M:V1I<B D
M;&]C86QD:7(D<&%T:'U=('L-"@D)"7!U=',@(D-A;B=T(&-R96%T92!D:7(@
M)&QO8V%L9&ER)'!A=&@B#0H)"0ER971U<FX-"@D)?0T*"0EI9B!;8V%T8V@@
M>W-E="!C:&%N;F5L(%MO<&5N("1L9B!W*UU]72![#0H)"0EP=71S(")#86XG
M="!C<F5A=&4@)&QF(@T*"0D)<F5T=7)N#0H)"7T-"@E]#0H)=VAI;&4@>R1N
M=&AR96%D<R ^/2 D;6%X=&AR96%D<WT@>PT*"0EV=V%I="!N=&AR96%D<PT*
M"7T-"@EP=71S("(D=7)L("T^("1L;V-A;&1I<B1P871H)&9I;&4B#0H)9F-O
M;F9I9W5R92 D8VAA;FYE;" M=')A;G-L871I;VX@8FEN87)Y("UB=69F97)S
M:7IE(#@Q.3(-"@EI9B!;8V%T8V@@>VAT='!?9V5T("1U<FP@+6-O;6UA;F0@
M:'1T<$-A;&QB86-K("UC:&%N;F5L("1C:&%N;F5L("UH96%D97)S("1M;V1I
M9FEE9'U=('L-"@D)8VQO<V4@)&-H86YN96P-"@D)<F5T=7)N#0H)?0T*"6EN
M8W(@;G1H<F5A9',-"GT-"@T*(PT*(R!,;V]K('5P(&%N9"!P<F]C97-S(&AY
M<&5R;&EN:W,@:6X@:'1M;"!D;V-U;65N= T*(PT*<')O8R!P<F]C97-S:'1M
M;"![<WT@>PT*"6=L;V)A;"!H;W-T#0H)=7!V87(@(S @)',@<W1A=&4-"@ES
M970@=2!;0G)E86M5<FP@)'-T871E*'5R;"E=#0H)<V5T(&9I;&4@6VQI;F1E
M>" D=2 S75ML:6YD97@@)'4@-%T-"@ER96=S=6(@+6%L;"![*%M>/%TJ*2@\
M*%M>/ETK*3XI?2!;<F5A9" D<W1A=&4H+6-H86YN96PI72![7#,@?2!T86=S
M#0H)<V5T('1A9W,@6W-P;&ET("1T86=S70T*(W!U=',@(E1A9W,@87)E.B(-
M"B-P=71S(%MJ;VEN("1T86=S70T*"69O<F5A8V@@=&%G("1T86=S('L-"@D)
M=7!D871E#0H)"6EF(%MR96=E>' @+6YO8V%S92![*&AR968](BDH6UXB72LI
M*"(I?2 D=&%G('@@>2!R969=('L-"@D)"6UA:V554DP@)')E9B D9FEL90T*
M"0E](&5L<V5I9B!;<F5G97AP("UN;V-A<V4@>RAB86-K9W)O=6YD/2(I*%M>
M(ETK*2@B*7T@)'1A9R!X('D@<F5F72![#0H)"0EM86ME55),("1R968@)&9I
M;&4-"@D)?2!E;'-E:68@6W)E9V5X<" M;F]C87-E('LH<W)C/2(I*%M>(ETK
M*2@B*7T@)'1A9R!X('D@<F5F72![#0H)"0EM86ME55),("1R968@)&9I;&4-
M"@D)?0T*"7T-"GT-"@T*(PT*(R!-86ME(&9U;&P@55),(&9R;VT@<&%R=&EA
M;"!P871H#0HC(')E9B M(&QI;FL@9G)O;2!54DP-"B,@9FEL92 M(&%B<V]L
M=71E(&YA;64@;V8@9FEL92P@8V]N=&%I;FEN9R!T:&ES(')E9BX-"B,-"G!R
M;V,@;6%K95523"![<F5F(&9I;&5]('L-"@EG;&]B86P@<')O=&]C;VP@:&]S
M="!P;W)T('!A=&@-"@ES970@=2!;0G)E86M5<FP@)')E9ET-"@ES970@8W!R
M;W1O8V]L(%ML:6YD97@@)'4@,%T-"@ES970@8VAO<W0@6VQI;F1E>" D=2 Q
M70T*"7-E="!C<&]R="!;;&EN9&5X("1U(#)=#0H)<V5T(&-P871H(%ML:6YD
M97@@)'4@,UU;;&EN9&5X("1U(#1=#0HC<'5T<R B;6%K95523"!P<F]T;R D
M8W!R;W1O8V]L(&@@)&-H;W-T(' @)&-P;W)T('!A=&@@)&-P871H(&9I;&4@
M)&9I;&4B#0H)(R!I9B!T:&5R92!I<R!N;R!P<F]T;V-O;"P@=&AI<R!I<R!A
M(&QI;FL@=&\@;&]C86P@9FEL90T*"6EF('M;<W1R:6YG(&QE;F=T:" D8W!R
M;W1O8V]L72 ]/2 P?2![#0H)"7-E="!C<&%T:" D8VAO<W0D8W!A=&@-"@D)
M:68@>UMS=')I;F<@;&5N9W1H("1C<&%T:%T@/3T@,'T@>R!R971U<FX@?0T*
M"0EI9B!;<W1R:6YG(&-O;7!A<F4@6W-T<FEN9R!I;F1E>" D8W!A=&@@,%T@
M(B\B72![#0H)"0ES970@<F5T(%MF:6QE(&1I<FYA;64@)'MF:6QE?6%=+R1C
M<&%T: T*"0E](&5L<V4@>PT*"0D)<V5T(')E=" D8W!A=&@-"@D)?0T*"7T@
M96QS92![#0H)"2,@4VMI<"!L:6YK<R!T;R!A;F]T:&5R(&AO<W1S#0H)"6EF
M('M;<W1R:6YG(&QE;F=T:" D8VAO<W1=(#T](#!]('MS970@8VAO<W0@)&AO
M<W1]#0H)"6EF(%MS=')I;F<@8V]M<&%R92 D8VAO<W0@)&AO<W1=('MR971U
M<FY]#0H)"6EF('M;<W1R:6YG(&QE;F=T:" D8W!O<G1=("$](#!]('MS970@
M8W!O<G0@.B1C<&]R='T-"@D):68@>UMS=')I;F<@;&5N9W1H("1C<&%T:%T@
M/3T@,'T@>W-E="!C<&%T:" B+R)]#0H)"7-E="!R970@)&-P;W)T)&-P871H
M#0H)?0T*"2,@<F5P;&%C92 O9&ER;F%M92\N+B\@=VET:" O#0HC<'5T<R B
M;6%K95523"!P871H("1P871H(')E=" D<F5T(@T*"7=H:6QE(%MR96=S=6(@
M>R];7B]=*R]<+EPN+WT@)')E=" B+R(@<F5T72![?0T*(W!U=',@(FUA:V55
M4DP@<&%T:" D<&%T:"!R970@)')E="(-"@DC(&EG;F]R92!F:6QE<R!N;W0@
M:6X@<W1A<G0@<&%T:"X-"@EI9B!;<W1R:6YG(&UA=&-H("1P871H*B D<F5T
M72![#0H)"6AT='!'971";V1Y(&AT=' Z+R\D:&]S="1R970-"@E]#0I]#0H-
M"FEF('LD87)G8R \(#%]('L-"@EP=71S(")5<V%G93H@)&%R9W8P(%523"!<
M6V]U=&9I;&5<72(-"@EE>&ET#0I]#0II9B![)&%R9V,@/3T@,GT@>PT*"7-E
M="!L;V-A;&1I<B!;;&EN9&5X("1A<F=V(#%=#0I](&5L<V4@>PT*"7-E="!L
M;V-A;&1I<B N#0I]#0IS970@55),(%ML:6YD97@@)&%R9W8@,%T-"@T*<V5T
M('4@6T)R96%K57)L("154DQ=#0IS970@<')O=&]C;VP@6VQI;F1E>" D=2 P
M70T*<V5T(&AO<W0@6VQI;F1E>" D=2 Q70T*<V5T('!O<G0@6VQI;F1E>" D
M=2 R70T*<V5T('!A=&@@6VQI;F1E>" D=2 S70T*<V5T(&9I;&4@6VQI;F1E
M>" D=2 T70T*:68@>UMS=')I;F<@;&5N9W1H("1P<F]T;V-O;%T@/3T@,'T@
M>W-E="!P<F]T;V-O;" B:'1T<#HO+R)]#0II9B![6W-T<FEN9R!L96YG=&@@
M)'!O<G1=("$](#!]('MS970@<&]R=" Z)'!O<G1]#0IS970@;G1H<F5A9',@
M, T*<V5T('1S(%MO<&5N("1T:6UE<W1A;7!F:6QE('==#0IC;&]S92 D=',-
M"G-E="!S=&%R='1I;64@6V9I;&4@;71I;64@)'1I;65S=&%M<&9I;&5=#0IF
M:6QE(&1E;&5T92 D=&EM97-T86UP9FEL90T*:'1T<$=E=$)O9'D@)'!R;W1O
M8V]L)&AO<W0D<&]R="1P871H)&9I;&4-"G!U=',@6W1I;64@>PT*=VAI;&4@
K>R1N=&AR96%D<R ^(#!]('L-"@EV=V%I="!N=&AR96%D<PT*?0T*?5T-"F4@
`
end

-
For help on using this list, send a message to
"gnu-win32-request@cygnus.com" with one line of text: "help".


