This is the mail archive of the cygwin-apps@cygwin.com mailing list for the Cygwin project.



Re: [ITP] catdoc-0.93.3 - New package


* Mon 2004-02-16 Igor Pechtchanski <pechtcha <AT> cs.nyu.edu> list.cygwin-apps
* Message-Id: <Pine.GSO.4.56.0402162054400.26191 <AT> slinky.cs.nyu.edu>
| On Tue, 17 Feb 2004, Jari Aalto+mail.linux wrote:
| 
| > Extract text from MS-Word files, trying to preserve as many special
| > printable characters as possible. Catdoc doesn't attempt to analyze
| > Word file formatting, it just extracts readable text. Known to
| > support up to Word-97 format.
| >
| > http://freshmeat.net/projects/catdoc/
| 
| Question: how is this different from 'antiword'?
| 	Igor


catdoc is the "original". Essentially these two are the same.
I ran a few simple tests with both, and it looked like catdoc
preserved paragraph boundaries better than antiword
(which stuck lines together).

Why not have both?

Jari


-- 
http://tiny-tools.sourceforge.net/
Swatch @time   http://www.mir.com.my/iTime/itime.htm
               http://www.ryanthiessen.com/swatch/resources.htm
Use Licenses!  http://www.linuxjournal.com/article.php?sid=6225
Which Licence? http://www.linuxjournal.com/article.php?sid=4825
OSI Licences   http://www.opensource.org/licenses/

