public class JSParseFilter extends java.lang.Object implements ParseFilter, Parser
| Constructor and Description |
|---|
| JSParseFilter() |
| Modifier and Type | Method and Description |
|---|---|
| Parse | filter(java.lang.String url, WebPage page, Parse parse, HTMLMetaTags metaTags, org.w3c.dom.DocumentFragment doc) Scan the JavaScript looking for possible Outlinks. |
| Configuration | getConf() Get the Configuration object. |
| java.util.Collection<WebPage.Field> | getFields() Gets all the fields for a given WebPage. Many datastores need to set up the mapreduce job by specifying the fields needed. |
| Parse | getParse(java.lang.String url, WebPage page) Parse a JavaScript file and extract outlinks. |
| static void | main(java.lang.String[] args) Main method which can be run from the command line with the plugin option. |
| void | setConf(Configuration conf) Set the Configuration object. |
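
The summary says that filter and getParse scan JavaScript for possible outlinks. The following is a minimal, self-contained sketch of that general idea, not the plugin's actual implementation: it assumes that candidate links are quoted string literals beginning with `http://` or `https://`. The class name `JsOutlinkSketch` and the method `extractCandidateLinks` are hypothetical and are not part of Nutch.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch: collects URL-like string
// literals from a piece of JavaScript source text.
final class JsOutlinkSketch {

    // Assumption: a candidate link is a single- or double-quoted literal
    // that starts with http:// or https://. Group 1 is the opening quote,
    // group 2 is the candidate URL; \1 requires the same closing quote.
    private static final Pattern URL_LITERAL =
        Pattern.compile("([\"'])(https?://[^\"'\\s]+)\\1");

    // Returns the distinct candidate links in order of first appearance.
    static List<String> extractCandidateLinks(String script) {
        Set<String> seen = new LinkedHashSet<>();
        Matcher m = URL_LITERAL.matcher(script);
        while (m.find()) {
            seen.add(m.group(2));
        }
        return new ArrayList<>(seen);
    }
}
```

Calling `JsOutlinkSketch.extractCandidateLinks(scriptText)` on the text of a script returns the quoted absolute URLs it contains. A real filter would likely also need to handle relative URLs and resolve them against the page URL, which this sketch leaves out.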
public Parse filter(java.lang.String url, WebPage page, Parse parse, HTMLMetaTags metaTags, org.w3c.dom.DocumentFragment doc)
Scan the JavaScript looking for possible Outlinks.
Specified by:
filter in interface ParseFilter
Parameters:
url - URL of the WebPage to be parsed
page - WebPage object relative to the URL
parse - Parse object holding parse status
metaTags - within the HTMLMetaTags
doc - The DocumentFragment object
Returns:
Parse object with additional outlinks from JavaScript

public Parse getParse(java.lang.String url, WebPage page)
Parse a JavaScript file and extract outlinks.

public static void main(java.lang.String[] args) throws java.lang.Exception
Main method which can be run from the command line with the plugin option.
Parameters:
args -
Throws:
java.lang.Exception

public void setConf(Configuration conf)
Set the Configuration object.
Specified by:
setConf in interface Configurable

public Configuration getConf()
Get the Configuration object.
Specified by:
getConf in interface Configurable

public java.util.Collection<WebPage.Field> getFields()
Gets all the fields for a given WebPage. Many datastores need to set up the mapreduce job by specifying the fields needed. All extensions that work on WebPage are able to specify what fields they need.
Specified by:
getFields in interface FieldPluggable

Copyright © 2019 The Apache Software Foundation