Publisher review: Apache Nutch builds on Lucene Java, adding web-specific components such as a crawler, a link-graph database, and parsers for HTML and other document formats.
Nutch can run on a single machine, but works better on a Hadoop cluster.
Plugins are available to extend its functionality.
Apache Nutch 2.2.1 is listed in the Complete applications category and is developed by the Apache Software Foundation.
It runs on the following operating systems: Windows / Linux / Mac OS / BSD / Solaris. No system requirements are listed.
Operating system: Windows / Linux / Mac OS / BSD / Solaris