public class LanguageIndexingFilter extends java.lang.Object implements IndexingFilter
IndexingFilter that adds a lang (language) field to the document.
It tries to find the language of the document by checking if
HTMLLanguageParser has added some language information.

| Constructor and Description |
|---|
| LanguageIndexingFilter() Constructs a new Language Indexing Filter. |

| Modifier and Type | Method and Description |
|---|---|
| void | addIndexBackendOptions(Configuration conf) |
| NutchDocument | filter(NutchDocument doc, java.lang.String url, WebPage page) Adds fields or otherwise modifies the document that will be indexed for a parse. |
| Configuration | getConf() |
| java.util.Collection<WebPage.Field> | getFields() |
| void | setConf(Configuration conf) |
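
The behaviour described in the class summary (add a lang field based on language information a parser may have left on the page) can be sketched roughly as follows. This is a minimal, self-contained illustration and not the Nutch implementation: plain maps stand in for NutchDocument and the WebPage metadata, and the class name `LangFieldSketch`, the metadata key `"language"`, and the `"unknown"` fallback are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in sketch: plain maps replace NutchDocument and the WebPage metadata.
class LangFieldSketch {

    // Hypothetical metadata key under which a parser is assumed to store the language.
    static final String LANGUAGE_KEY = "language";

    // Assumed fallback value when no language information is present.
    static final String UNKNOWN = "unknown";

    // Adds a "lang" field to the document, using parser-supplied metadata if present.
    static Map<String, String> filter(Map<String, String> doc,
                                      Map<String, String> pageMetadata) {
        String lang = pageMetadata.get(LANGUAGE_KEY);
        if (lang == null || lang.isEmpty()) {
            lang = UNKNOWN;
        }
        doc.put("lang", lang);
        return doc;
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put(LANGUAGE_KEY, "en");
        Map<String, String> doc = filter(new LinkedHashMap<>(), metadata);
        System.out.println(doc.get("lang")); // prints "en"
    }
}
```

The sketch returns the same document instance it was given, mirroring the shape of the filter signature listed above, in which the document is passed in and a NutchDocument is returned.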
public LanguageIndexingFilter()
public NutchDocument filter(NutchDocument doc, java.lang.String url, WebPage page) throws IndexingException
Description copied from interface: IndexingFilter
Adds fields or otherwise modifies the document that will be indexed for a parse.
Specified by: filter in interface IndexingFilter
Parameters:
doc - document instance for collecting fields
url - page url
Throws: IndexingException

public java.util.Collection<WebPage.Field> getFields()
Specified by: getFields in interface FieldPluggable

public void addIndexBackendOptions(Configuration conf)

public void setConf(Configuration conf)
Specified by: setConf in interface Configurable

public Configuration getConf()
Specified by: getConf in interface Configurable

Copyright © 2019 The Apache Software Foundation