public abstract class RegexURLFilterBase extends java.lang.Object implements URLFilter
URL filter based on regular expressions.

The rules file contains one rule per line, in the form:

[+-]<regex>

where plus (+) means go ahead and index it and minus (-) means no.
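The rule format above can be illustrated with a small, self-contained sketch. It is not the class's actual implementation: `RuleFileSketch`, `Rule`, `parse`, and `accepts` are hypothetical names, and skipping blank lines, letting the first matching rule decide, and rejecting URLs that match no rule are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class RuleFileSketch {

    /** One parsed rule: sign is true for '+', false for '-'. (Hypothetical type.) */
    static final class Rule {
        final boolean sign;
        final Pattern pattern;

        Rule(boolean sign, String regex) {
            this.sign = sign;
            this.pattern = Pattern.compile(regex);
        }
    }

    /** Parses rules text with one rule per line, each in the form [+-]<regex>. */
    static List<Rule> parse(String rules) {
        List<Rule> parsed = new ArrayList<>();
        for (String line : rules.split("\\R")) {
            if (line.isEmpty()) {
                continue; // assumption: blank lines are skipped
            }
            char first = line.charAt(0);
            if (first != '+' && first != '-') {
                throw new IllegalArgumentException("Rule must start with + or -: " + line);
            }
            parsed.add(new Rule(first == '+', line.substring(1)));
        }
        return parsed;
    }

    /** Assumption: the first matching rule decides; a URL matching no rule is rejected. */
    static boolean accepts(List<Rule> rules, String url) {
        for (Rule rule : rules) {
            if (rule.pattern.matcher(url).find()) {
                return rule.sign;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Rule> rules = parse("-\\.(gif|jpg|png)$\n+^https?://example\\.org/");
        System.out.println(accepts(rules, "http://example.org/index.html")); // prints true
        System.out.println(accepts(rules, "http://example.org/logo.png"));   // prints false
    }
}
```

Putting the exclusion rule first matters under the first-match assumption: the image URL hits the minus rule before it can reach the plus rule.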
Fields inherited from interface URLFilter: X_POINT_ID

| Modifier | Constructor and Description |
|---|---|
| | RegexURLFilterBase() Constructs a new empty RegexURLFilterBase. |
| | RegexURLFilterBase(java.io.File filename) Constructs a new RegexURLFilter and initializes it with a file of rules. |
| protected | RegexURLFilterBase(java.io.Reader reader) Constructs a new RegexURLFilter and initializes it with a Reader of rules. |
| | RegexURLFilterBase(java.lang.String rules) Constructs a new RegexURLFilter and initializes it with a list of rules. |
| Modifier and Type | Method and Description |
|---|---|
| protected abstract RegexRule | createRule(boolean sign, java.lang.String regex) Creates a new RegexRule. |
| java.lang.String | filter(java.lang.String url) |
| Configuration | getConf() |
| protected abstract java.io.Reader | getRulesReader(Configuration conf) Returns the name of the file of rules to use for a particular implementation. |
| static void | main(RegexURLFilterBase filter, java.lang.String[] args) Filter the standard input using a RegexURLFilterBase. |
| void | setConf(Configuration conf) |
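The split between the abstract createRule method and the concrete filter method suggests a template-method design, in which the base class owns the rule list and subclasses pick the regular expression engine. The sketch below shows that idea with stand-in types only. `FilterTemplateSketch`, `Rule`, `FilterBase`, `JdkFilter`, `addRule`, `accept`, and `match` are hypothetical names, not the real API, and returning the URL on acceptance and null on rejection is an assumption about how filter behaves.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class FilterTemplateSketch {

    /** Hypothetical stand-in for a rule: a sign plus an engine-specific match test. */
    abstract static class Rule {
        private final boolean sign;

        Rule(boolean sign) {
            this.sign = sign;
        }

        boolean accept() {
            return sign;
        }

        abstract boolean match(String url);
    }

    /** Hypothetical stand-in for the abstract base filter. */
    abstract static class FilterBase {
        private final List<Rule> rules = new ArrayList<>();

        /** Subclasses choose the regex engine by building the rule here. */
        protected abstract Rule createRule(boolean sign, String regex);

        /** Adds one rule line of the form [+-]<regex>. */
        void addRule(String line) {
            if (line.isEmpty() || (line.charAt(0) != '+' && line.charAt(0) != '-')) {
                throw new IllegalArgumentException("Rule must start with + or -: " + line);
            }
            rules.add(createRule(line.charAt(0) == '+', line.substring(1)));
        }

        /** Assumption: returns the URL when accepted, null when rejected or unmatched. */
        String filter(String url) {
            for (Rule rule : rules) {
                if (rule.match(url)) {
                    return rule.accept() ? url : null;
                }
            }
            return null;
        }
    }

    /** Concrete subclass backed by java.util.regex. */
    static class JdkFilter extends FilterBase {
        @Override
        protected Rule createRule(boolean sign, String regex) {
            final Pattern pattern = Pattern.compile(regex);
            return new Rule(sign) {
                @Override
                boolean match(String url) {
                    return pattern.matcher(url).find();
                }
            };
        }
    }

    public static void main(String[] args) {
        FilterBase filter = new JdkFilter();
        filter.addRule("-^ftp:");
        filter.addRule("+.");
        System.out.println(filter.filter("http://example.org/"));     // prints http://example.org/
        System.out.println(filter.filter("ftp://example.org/file"));  // prints null
    }
}
```

Swapping in a different regular expression library would only require another subclass that overrides createRule; the rule parsing and the filtering loop stay in the base class.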
public RegexURLFilterBase()

public RegexURLFilterBase(java.io.File filename)
throws java.io.IOException,
java.lang.IllegalArgumentException
Parameters: filename - is the name of rules file.
Throws: java.io.IOException, java.lang.IllegalArgumentException

public RegexURLFilterBase(java.lang.String rules)
throws java.io.IOException,
java.lang.IllegalArgumentException
Parameters: rules - string with a list of rules, one rule per line
Throws: java.io.IOException, java.lang.IllegalArgumentException

protected RegexURLFilterBase(java.io.Reader reader)
throws java.io.IOException,
java.lang.IllegalArgumentException
Parameters: reader - is a reader of rules.
Throws: java.io.IOException, java.lang.IllegalArgumentException

protected abstract RegexRule createRule(boolean sign, java.lang.String regex)
Creates a new RegexRule.
Parameters:
sign - of the regular expression. A true value means that any URL matching this rule must be included, whereas a false value means that any URL matching this rule must be excluded.
regex - is the regular expression associated to this rule.

protected abstract java.io.Reader getRulesReader(Configuration conf) throws java.io.IOException
Parameters: conf - is the current configuration.
Throws: java.io.IOException

public java.lang.String filter(java.lang.String url)

public void setConf(Configuration conf)
Specified by: setConf in interface Configurable

public Configuration getConf()
Specified by: getConf in interface Configurable

public static void main(RegexURLFilterBase filter, java.lang.String[] args) throws java.io.IOException, java.lang.IllegalArgumentException
Parameters:
filter - is the RegexURLFilterBase to use for filtering the standard input.
args - some optional parameters (not used).
Throws: java.io.IOException, java.lang.IllegalArgumentException

Copyright © 2019 The Apache Software Foundation