public class HtmlIndexingFilter extends java.lang.Object implements IndexingFilter

| Constructor and Description |
|---|
| HtmlIndexingFilter() |

| Modifier and Type | Method and Description |
|---|---|
| void | addIndexBackendOptions(Configuration conf) |
| NutchDocument | filter(NutchDocument doc, java.lang.String url, WebPage page) Adds fields or otherwise modifies the document that will be indexed for a parse. |
| Configuration | getConf() |
| java.util.Collection<WebPage.Field> | getFields() |
| void | setConf(Configuration conf) |
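
The `filter` contract summarized above (receive a document, add or modify fields, return the document) can be illustrated with a minimal, self-contained sketch. It does not use the Nutch classes: `SketchDocument`, `SketchFilter`, `HtmlFieldFilter`, and the field name `html` are hypothetical stand-ins invented for this illustration, and the page is reduced to a plain string.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FilterSketch {

    /** Hypothetical stand-in for NutchDocument: maps a field name to its values. */
    static class SketchDocument {
        private final Map<String, List<String>> fields = new LinkedHashMap<>();

        void add(String name, String value) {
            fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }

        List<String> get(String name) {
            return fields.getOrDefault(name, new ArrayList<>());
        }
    }

    /** Hypothetical stand-in for the filter contract: take a document, return it, possibly modified. */
    interface SketchFilter {
        SketchDocument filter(SketchDocument doc, String url, String pageContent);
    }

    /** Example filter that adds one field; the field name "html" is an assumption. */
    static class HtmlFieldFilter implements SketchFilter {
        @Override
        public SketchDocument filter(SketchDocument doc, String url, String pageContent) {
            if (pageContent != null) {
                doc.add("html", pageContent);
            }
            // Return the same instance so further filters can keep adding fields.
            return doc;
        }
    }

    public static void main(String[] args) {
        SketchDocument doc = new SketchDocument();
        doc.add("url", "http://example.com/");
        SketchDocument out = new HtmlFieldFilter()
                .filter(doc, "http://example.com/", "<html><body>hi</body></html>");
        System.out.println(out.get("html"));
    }
}
```

Returning the document rather than `void` lets a caller chain several filters over the same document; whether the real implementation behaves this way is not stated on this page.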

public NutchDocument filter(NutchDocument doc, java.lang.String url, WebPage page) throws IndexingException

Specified by: filter in interface IndexingFilter

Parameters:
doc - document instance for collecting fields
url - page URL

Throws: IndexingException

public void addIndexBackendOptions(Configuration conf)

public void setConf(Configuration conf)

Specified by: setConf in interface Configurable

public Configuration getConf()

Specified by: getConf in interface Configurable

public java.util.Collection<WebPage.Field> getFields()

Specified by: getFields in interface FieldPluggable

Copyright © 2019 The Apache Software Foundation