public class TLDScoringFilter extends java.lang.Object implements ScoringFilter
| Constructor and Description |
|---|
| TLDScoringFilter() |
| Modifier and Type | Method | Description |
|---|---|---|
| void | distributeScoreToOutlinks(java.lang.String fromUrl, WebPage page, java.util.Collection<ScoreDatum> scoreData, int allCount) | Distribute score value from the current page to all its outlinked pages. |
| float | generatorSortValue(java.lang.String url, WebPage page, float initSort) | This method prepares a sort value for the purpose of sorting and selecting top N scoring pages during fetchlist generation. |
| Configuration | getConf() | |
| java.util.Collection<WebPage.Field> | getFields() | |
| float | indexerScore(java.lang.String url, NutchDocument doc, WebPage page, float initScore) | This method calculates a Lucene document boost. |
| void | initialScore(java.lang.String url, WebPage page) | Set an initial score for newly discovered pages. |
| void | injectedScore(java.lang.String url, WebPage page) | Set an initial score for newly injected pages. |
| void | setConf(Configuration conf) | |
| void | updateScore(java.lang.String url, WebPage page, java.util.List<ScoreDatum> inlinkedScoreData) | This method calculates a new score during table update, based on the values contributed by inlinked pages. |
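The float-returning methods (generatorSortValue, indexerScore) each take an initial value and return a possibly adjusted one, so several scoring filters can be applied one after another. The following is a minimal, self-contained sketch of that chaining idea only. It does not use the Nutch classes and does not reproduce what TLDScoringFilter actually computes: the BoostStep interface, the TLD_BOOST table and its values, and the helpers tldOf, exampleChain and runChain are all hypothetical names invented for illustration.

```java
import java.net.URI;
import java.util.List;
import java.util.Map;

public class ScoringChainSketch {

    /** Hypothetical stand-in for one filter's indexerScore-style step. */
    public interface BoostStep {
        float apply(String url, float initScore);
    }

    /** Invented per-TLD multipliers; these are NOT values taken from Nutch. */
    public static final Map<String, Float> TLD_BOOST = Map.of("org", 2.0f, "edu", 4.0f);

    /** Returns the last label of the URL's host, or "" if the URL has no host. */
    public static String tldOf(String url) {
        String host = URI.create(url).getHost();
        if (host == null) {
            return "";
        }
        int dot = host.lastIndexOf('.');
        return dot < 0 ? host : host.substring(dot + 1);
    }

    /** Builds a two-step example chain: a TLD-based multiplier, then a fixed multiplier. */
    public static List<BoostStep> exampleChain() {
        BoostStep tldStep = (url, score) -> score * TLD_BOOST.getOrDefault(tldOf(url), 1.0f);
        BoostStep tripleStep = (url, score) -> score * 3.0f;
        return List.of(tldStep, tripleStep);
    }

    /** Feeds each step the value returned by the previous step in the chain. */
    public static float runChain(List<BoostStep> chain, String url, float initScore) {
        float score = initScore;
        for (BoostStep step : chain) {
            score = step.apply(url, score);
        }
        return score;
    }

    public static void main(String[] args) {
        System.out.println(runChain(exampleChain(), "http://example.org/index.html", 1.0f));
        System.out.println(runChain(exampleChain(), "http://example.com/index.html", 1.0f));
    }
}
```

In this sketch a URL whose TLD is missing from the table passes through the first step unchanged, so only the later steps affect its value.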
public Configuration getConf()

Specified by: getConf in interface Configurable

public void setConf(Configuration conf)

Specified by: setConf in interface Configurable

public java.util.Collection<WebPage.Field> getFields()

Specified by: getFields in interface FieldPluggable

public void injectedScore(java.lang.String url,
WebPage page)
throws ScoringFilterException

Description copied from interface: ScoringFilter
Set an initial score for newly injected pages.

Specified by: injectedScore in interface ScoringFilter
Parameters:
url - URL of the page
page - new page. Filters will modify it in-place.
Throws: ScoringFilterException

public void initialScore(java.lang.String url,
WebPage page)
throws ScoringFilterException

Description copied from interface: ScoringFilter
Set an initial score for newly discovered pages.

Specified by: initialScore in interface ScoringFilter
Parameters:
url - URL of the page
Throws: ScoringFilterException

public float generatorSortValue(java.lang.String url,
WebPage page,
float initSort)
throws ScoringFilterException

Description copied from interface: ScoringFilter
This method prepares a sort value for the purpose of sorting and selecting top N scoring pages during fetchlist generation.

Specified by: generatorSortValue in interface ScoringFilter
Parameters:
url - URL of the page
page - WebPage object relative to the URL
initSort - initial sort value, or a value from previous filters in chain
Throws: ScoringFilterException

public void distributeScoreToOutlinks(java.lang.String fromUrl,
WebPage page,
java.util.Collection<ScoreDatum> scoreData,
int allCount)
throws ScoringFilterException

Description copied from interface: ScoringFilter
Distribute score value from the current page to all its outlinked pages.

Specified by: distributeScoreToOutlinks in interface ScoringFilter
Parameters:
fromUrl - URL of the source page
scoreData - a list of ScoreDatum
allCount - number of all collected outlinks from the source page
Throws: ScoringFilterException

public void updateScore(java.lang.String url,
WebPage page,
java.util.List<ScoreDatum> inlinkedScoreData)
throws ScoringFilterException

Description copied from interface: ScoringFilter
This method calculates a new score during table update, based on the values contributed by inlinked pages.

Specified by: updateScore in interface ScoringFilter
Parameters:
url - URL of the page
page - WebPage object relative to the URL
inlinkedScoreData - list of ScoreDatums for all inlinks pointing to this URL.
Throws: ScoringFilterException

public float indexerScore(java.lang.String url,
NutchDocument doc,
WebPage page,
float initScore)
throws ScoringFilterException

Description copied from interface: ScoringFilter
This method calculates a Lucene document boost.

Specified by: indexerScore in interface ScoringFilter
Parameters:
url - URL of the page
doc - document. NOTE: this already contains all information collected by indexing filters. Implementations may modify this instance, in order to store/remove some information.
initScore - initial boost value for the Lucene document.
Throws: ScoringFilterException

Copyright © 2019 The Apache Software Foundation