public class BasicIndexingFilter extends java.lang.Object implements IndexingFilter
| Constructor and Description |
|---|
| BasicIndexingFilter() |
| Modifier and Type | Method and Description |
|---|---|
| void | addIndexBackendOptions(Configuration conf) |
| NutchDocument | filter(NutchDocument doc, java.lang.String url, WebPage page) The BasicIndexingFilter filter object, which supports a configurable limit on the number of characters permitted in the title. See indexer.max.title.length in nutch-default.xml. |
| Configuration | getConf() Get the Configuration object. |
| java.util.Collection<WebPage.Field> | getFields() Gets all the fields for a given WebPage. Many datastores need to set up the mapreduce job by specifying the fields needed. |
| void | setConf(Configuration conf) Set the Configuration object. |
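The getFields() entry above says that datastores set up the mapreduce job by specifying the fields needed. One way to picture this is a job that collects the union of the fields every plugin asks for. The sketch below is a minimal illustration under that assumption and does not use Nutch classes: FieldUnionSketch, its Field enum, the FieldAware interface, and requiredFields are all hypothetical names.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for illustration; these are NOT the Nutch types.
class FieldUnionSketch {

    // Illustrative subset of page fields a datastore might hold.
    enum Field { BASE_URL, TITLE, TEXT, FETCH_TIME }

    // Mirrors the shape of getFields(): each plugin reports the fields it reads.
    interface FieldAware {
        Collection<Field> getFields();
    }

    // Collects the union of the fields requested by all plugins.
    static Set<Field> requiredFields(List<FieldAware> plugins) {
        Set<Field> needed = EnumSet.noneOf(Field.class);
        for (FieldAware plugin : plugins) {
            needed.addAll(plugin.getFields());
        }
        return needed;
    }

    public static void main(String[] args) {
        FieldAware titleFilter = () -> Arrays.asList(Field.TITLE, Field.BASE_URL);
        FieldAware textFilter = () -> Arrays.asList(Field.TEXT, Field.TITLE);
        // prints [BASE_URL, TITLE, TEXT]
        System.out.println(requiredFields(Arrays.asList(titleFilter, textFilter)));
    }
}
```

Fields that no plugin requests (FETCH_TIME here) are left out of the result, which is the point of declaring fields up front: the job only has to load what is actually used.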
public NutchDocument filter(NutchDocument doc, java.lang.String url, WebPage page) throws IndexingException
The BasicIndexingFilter filter object, which supports a configurable limit on the number of characters permitted in the title. See indexer.max.title.length in nutch-default.xml.
Specified by: filter in interface IndexingFilter
Parameters:
doc - The NutchDocument object
url - URL to be filtered for anchor text
page - WebPage object relative to the URL
Throws: IndexingException
public void addIndexBackendOptions(Configuration conf)
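As a rough illustration of the title-length limit described for filter above, the sketch below truncates a title to a maximum number of characters. It does not use Nutch classes: TitleLengthSketch and limitTitle are hypothetical names, and both the integer limit and the treatment of a negative value as "no limit" are assumptions made for this example rather than documented behavior.

```java
// Hypothetical helper for illustration; not part of the Nutch API.
class TitleLengthSketch {

    // Returns the title cut to at most maxTitleLength characters.
    // Assumption: a negative limit means "do not truncate".
    static String limitTitle(String title, int maxTitleLength) {
        if (maxTitleLength >= 0 && title.length() > maxTitleLength) {
            return title.substring(0, maxTitleLength);
        }
        return title;
    }

    public static void main(String[] args) {
        // prints "Apache Nutch"
        System.out.println(limitTitle("Apache Nutch crawler", 12));
    }
}
```

In a real deployment the limit would come from the indexer.max.title.length property in nutch-default.xml (or an override of it) rather than from a hard-coded argument.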
public void setConf(Configuration conf)
Set the Configuration object
Specified by: setConf in interface Configurable
public Configuration getConf()
Get the Configuration object
Specified by: getConf in interface Configurable
public java.util.Collection<WebPage.Field> getFields()
Gets all the fields for a given WebPage. Many datastores need to set up the mapreduce job by specifying the fields needed. All extensions that work on WebPage are able to specify what fields they need.
Specified by: getFields in interface FieldPluggable
Copyright © 2019 The Apache Software Foundation