Class Summary
| Class | Description |
| HtmlIndexingFilter | Add raw HTML content of a document to the index. |
Package org.apache.nutch.indexer.html Description
Index raw HTML content.
The index-html plugin adds the field "rawcontent" to the index.
This field contains the raw (HTML) content of a document, converted to a String.
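
The behavior described above can be illustrated with a minimal, self-contained sketch. It does not use the Nutch API: the class RawContentSketch, the method addRawContent, and the plain Map standing in for an index document are all hypothetical names chosen for illustration. Only the field name "rawcontent" comes from the description. The charset passed to the String constructor is an assumption, since the description does not say how the bytes are decoded.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; this is NOT the Nutch HtmlIndexingFilter class.
class RawContentSketch {

    // Field name taken from the package description.
    static final String FIELD = "rawcontent";

    // Decodes the raw bytes of a fetched document and stores them under "rawcontent".
    // The Map is an assumed stand-in for an index document; the charset is an assumption.
    static Map<String, Object> addRawContent(Map<String, Object> doc, byte[] raw, Charset charset) {
        if (raw != null) {
            doc.put(FIELD, new String(raw, charset));
        }
        return doc;
    }

    public static void main(String[] args) {
        byte[] raw = "<html><body>Hello</body></html>".getBytes(StandardCharsets.UTF_8);
        Map<String, Object> doc = addRawContent(new HashMap<>(), raw, StandardCharsets.UTF_8);
        // The markup is kept as-is; no tags are stripped.
        System.out.println(doc.get(FIELD));
    }
}
```

Because the whole markup is stored unmodified, the field can be considerably larger than a parsed-text field for the same document.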