This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: How to get HTML page having embedded javascript


2009/2/5 TAJTHY Tamás:
> I have a problem. I'd like to get HTML pages, but not their plain
> source. If a page has embedded JavaScript that generates HTML code, I
> need the resulting HTML code. Right now I just run a Perl script which
> launches Firefox, and I copy the resulting page to the clipboard. But
> this is not a very nice solution, as I cannot detect when Firefox has
> finished downloading and processing the page.
>
> Is there a library which can do this? Can anyone give me some help on
> how to solve this?

The libraries are called Gecko (the Firefox engine) and WebKit.

You need to evaluate the JavaScript that produces the final HTML, as it
is interpreted client-side in the browser. The scripts do this by
manipulating the DOM built from the original HTML, but by hooking into
the layout renderer you should be able to get at some sort of final layout.
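
One way to approximate this without writing your own engine bindings is to
drive Gecko through a browser automation layer. The following is only a
minimal sketch, and it rests on assumptions that are not part of the original
question: that the third-party Selenium package and geckodriver are installed,
and that polling document.readyState is good enough as a "finished" signal.
The helper names wait_for_rendered_html and fetch_rendered_html are
hypothetical.

```python
import time


def wait_for_rendered_html(driver, timeout=30.0, poll=0.25):
    """Poll the page until document.readyState is 'complete', then
    return the serialized DOM (not the original page source).

    `driver` is any object with a Selenium-style execute_script() method.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = driver.execute_script("return document.readyState")
        if state == "complete":
            # outerHTML reflects the DOM after scripts have modified it.
            return driver.execute_script(
                "return document.documentElement.outerHTML"
            )
        if time.monotonic() >= deadline:
            raise TimeoutError("page did not finish loading in time")
        time.sleep(poll)


def fetch_rendered_html(url):
    """Load `url` in headless Firefox and return the rendered HTML.

    Assumption: Selenium 4 and geckodriver are available on this machine.
    """
    # Imported here so the polling helper above works without Selenium.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    options.add_argument("-headless")
    driver = webdriver.Firefox(options=options)
    try:
        driver.get(url)
        return wait_for_rendered_html(driver)
    finally:
        driver.quit()
```

Note that a readyState of "complete" only says the initial load has finished.
Scripts that keep changing the DOM afterwards (timers, background requests)
would need a page-specific condition instead, such as waiting for a known
element to appear.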

It looks like a nice project for the next two years or so. Maybe GSoC
would sponsor it, because Google already has such emulators.

This has nothing to do with Cygwin; ask about it on a web development list.
-- 
Reini Urban
http://phpwiki.org/              http://murbreak.at/


