| Interface | Description |
|---|---|
| FetchSchedule | This interface defines the contract for implementations that manipulate fetch times and re-fetch intervals. |

| Class | Description |
|---|---|
| AbstractFetchSchedule | This class provides common methods for implementations of FetchSchedule. |
| AdaptiveFetchSchedule | This class implements an adaptive re-fetch algorithm. |
| CrawlStatus | |
| DbUpdateMapper | |
| DbUpdateReducer | |
| DbUpdaterJob | |
| DefaultFetchSchedule | This class implements the default re-fetch schedule. |
| FetchScheduleFactory | Creates and caches a FetchSchedule implementation. |
| GeneratorJob | |
| GeneratorJob.SelectorEntry | |
| GeneratorJob.SelectorEntryComparator | |
| GeneratorMapper | |
| GeneratorReducer | Reduce class for the generate step. The `reduce()` method writes a random integer to all generated URLs. |
| InjectorJob | This class takes a flat file of URLs and adds them to the set of pages to be crawled. |
| InjectorJob.UrlMapper | |
| MD5Signature | Default implementation of a page signature. |
| NutchWritable | |
| Signature | |
| SignatureComparator | |
| SignatureFactory | Factory class that instantiates a Signature implementation according to the current Configuration. |
| TextMD5Signature | Default implementation of a page signature. |
| TextProfileSignature | An implementation of a page signature. |
| URLPartitioner | Partitions URLs by host, domain name, or IP address, depending on the value of the parameter 'partition.url.mode', which can be 'byHost', 'byDomain', or 'byIP'. |
| URLPartitioner.FetchEntryPartitioner | |
| URLPartitioner.SelectorEntryPartitioner | |
| URLWebPage | |
| UrlWithScore | A writable, comparable container for a URL with a score. |
| UrlWithScore.UrlOnlyPartitioner | A partitioner by {url}. |
| UrlWithScore.UrlScoreComparator | Compares by {url,score}. |
| UrlWithScore.UrlScoreComparator.UrlOnlyComparator | Compares by {url}. |
| WebTableReader | Displays information about the entries of the webtable. |
| WebTableReader.WebTableRegexMapper | Filters the entries from the table based on a regex. |
| WebTableReader.WebTableStatCombiner | |
| WebTableReader.WebTableStatMapper | |
| WebTableReader.WebTableStatReducer | |
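
The URLPartitioner entry above describes grouping URLs by host, domain name, or IP address according to 'partition.url.mode'. The idea can be sketched roughly as follows. This is a minimal, self-contained illustration and not the Nutch implementation: the class `UrlPartitionSketch` and its methods `keyFor` and `partitionFor` are hypothetical names, the domain extraction is a naive "last two labels" assumption, and only the three mode strings are taken from the table.

```java
import java.net.InetAddress;
import java.net.URI;
import java.net.UnknownHostException;

/** Hypothetical sketch: maps a URL to a partition by host, domain, or IP. */
public class UrlPartitionSketch {

    /** Returns the grouping key for a URL under the given mode string. */
    public static String keyFor(String url, String mode) {
        String host = URI.create(url).getHost();
        if (host == null) {
            return url; // no host component: fall back to the whole URL
        }
        host = host.toLowerCase();
        switch (mode) {
            case "byDomain": {
                // Assumption: the domain is the last two labels of the host.
                // A real crawler would consult a public-suffix list instead.
                String[] labels = host.split("\\.");
                int n = labels.length;
                return n <= 2 ? host : labels[n - 2] + "." + labels[n - 1];
            }
            case "byIP":
                try {
                    // Requires a DNS lookup at call time.
                    return InetAddress.getByName(host).getHostAddress();
                } catch (UnknownHostException e) {
                    return host; // unresolved: fall back to the host name
                }
            case "byHost":
            default:
                return host;
        }
    }

    /** Hashes the grouping key into the range [0, numPartitions). */
    public static int partitionFor(String url, String mode, int numPartitions) {
        int hash = keyFor(url, mode).hashCode();
        // Mask the sign bit so the modulo result is never negative.
        return (hash & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://news.example.com/a",
            "http://www.example.com/b",
            "http://example.org/c"
        };
        for (String url : urls) {
            System.out.println(url + " -> partition "
                + partitionFor(url, "byDomain", 8));
        }
    }
}
```

Under this sketch, URLs that share a key always land in the same partition, which is the property that lets a crawler apply per-host or per-domain limits within a single reduce task.
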

| Enum | Description |
|---|---|
| InjectType | |

Copyright © 2019 The Apache Software Foundation