java.lang.Object
  com.aspose.pdf.kit.PdfExtractor
Extracts images, text, and attachments from a PDF file.
Constructor Summary | |
PdfExtractor()
The constructor of the PdfExtractor object. |
Method Summary | |
void |
bindPdf(java.io.InputStream inputStream)
Binds a PDF stream for extraction. |
void |
bindPdf(java.lang.String inputFile)
Binds a PDF file for extraction. |
void |
extractAttachment()
Extracts attachments from a Pdf document. |
void |
extractImage()
Extracts images from a Pdf document. |
void |
extractText()
Extracts text from a Pdf document. |
void |
extractTextInRectangle(java.awt.Rectangle rec)
Extracts the text content of the page within the rectangle. |
void |
extractTextInRectangle(java.awt.Rectangle rec,
ExtractTextMode extMode)
Extracts the text content of the page within the rectangle. |
void |
getAllRectangleText(java.io.OutputStream outputStream)
Saves all text within the rectangle to a stream. |
void |
getAllRectangleText(java.lang.String outputFile)
Saves all text within the rectangle to a file. |
java.io.ByteArrayOutputStream[] |
getAttachment()
Saves all attachment files to streams. |
void |
getAttachment(java.lang.String outputPath)
Saves all attachment files to outputPath. |
java.util.ArrayList |
getAttachNames()
Gets the file names of all attachments. |
int |
getEndPage()
Gets endPage value. |
void |
getNextImage(java.io.OutputStream outputStream)
Saves image to stream with default image format - Jpeg. |
void |
getNextImage(java.io.OutputStream outputStream,
ImageType imageTypeName)
Saves image to stream with the given image format. |
void |
getNextImage(java.io.OutputStream outputStream,
java.lang.String imageTypeName)
Saves image to stream with the given image format name. |
void |
getNextImage(java.lang.String outputFile)
Saves image to file with default image format - Jpeg. |
void |
getNextImage(java.lang.String outputFile,
ImageType imageType)
Saves image to file with the given image format. |
void |
getNextImage(java.lang.String outputFile,
java.lang.String imageTypeName)
Saves image to file with the given image format name. |
java.lang.String |
getPassword()
Gets password. |
int |
getStartPage()
Gets startPage value. |
void |
getText(java.io.OutputStream outputStream)
Saves text to stream. |
void |
getText(java.lang.String outputFile)
Saves text to file. |
int |
getWordCount()
Returns the word count of the pdf document. |
boolean |
hasNextImage()
Returns whether more images are available. |
void |
setEndPage(int endPage)
Sets endPage value. |
void |
setPassword(java.lang.String password)
Sets the password used to decrypt the PDF file. |
void |
setStartPage(int startPage)
Sets startPage value. |
Methods inherited from class java.lang.Object |
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
public PdfExtractor()
Method Detail |
public void setStartPage(int startPage)
startPage
- The page of the PDF file at which extraction starts.
public int getStartPage()
public void setEndPage(int endPage)
endPage
- The page of the PDF file at which extraction ends.
public int getEndPage()
public void setPassword(java.lang.String password)
password
- The password of the input PDF file.
public java.lang.String getPassword()
public void bindPdf(java.lang.String inputFile) throws java.io.FileNotFoundException
inputFile
- The pdf file to be extracted.
java.io.FileNotFoundException
public void bindPdf(java.io.InputStream inputStream)
inputStream
- The pdf Stream to be extracted.
java.io.FileNotFoundException
public void extractImage() throws java.lang.Exception
[SampleCode]
PdfExtractor extractor = new PdfExtractor();
extractor.bindPdf(path + "Image.pdf");
extractor.extractImage();
String suffix = ".jpg";
int imageCount = 1;
while (extractor.hasNextImage()) {
    extractor.getNextImage(path + imageCount + suffix);
    imageCount++;
}
java.lang.Exception
public void getNextImage(java.lang.String outputFile) throws java.lang.Exception
outputFile
- The file path and name to save the image.
java.lang.Exception
public void getNextImage(java.lang.String outputFile, java.lang.String imageTypeName) throws java.lang.Exception
outputFile
- The file path and name to save the image.
imageTypeName
- Image format name of the extracted image
java.lang.Exception
getNextImage(String, ImageType)
public void getNextImage(java.lang.String outputFile, ImageType imageType) throws java.lang.Exception
outputFile
- The file path and name to save the image.
imageType
- Image format of the extracted image.
java.lang.Exception
getNextImage(OutputStream, ImageType)
public void getNextImage(java.io.OutputStream outputStream) throws java.lang.Exception
outputStream
- The stream to save the image.
java.lang.Exception
getNextImage(OutputStream, ImageType)
public void getNextImage(java.io.OutputStream outputStream, java.lang.String imageTypeName) throws java.lang.Exception
[SampleCode]
// Extract images with the given image format name (PNG)
PdfExtractor extractor = new PdfExtractor();
extractor.bindPdf(path + "Image.pdf");
extractor.extractImage();
String suffix = ".png";
int imageCount = 1;
while (extractor.hasNextImage()) {
    extractor.getNextImage(path + imageCount + suffix, "PNG");
    imageCount++;
}
outputStream
- The stream to save the image.
imageTypeName
- Image format name of the extracted image.
java.lang.Exception
getNextImage(OutputStream, ImageType)
public void getNextImage(java.io.OutputStream outputStream, ImageType imageTypeName) throws java.lang.Exception
[SampleCode]
// Extract images with the given image format (PNG)
PdfExtractor extractor = new PdfExtractor();
extractor.bindPdf(path + "Image.pdf");
extractor.extractImage();
String suffix = ".png";
int imageCount = 1;
while (extractor.hasNextImage()) {
    extractor.getNextImage(path + imageCount + suffix, ImageType.Png);
    imageCount++;
}
outputStream
- The stream to save the image.
imageTypeName
- Image format of the extracted image.
java.lang.Exception
public boolean hasNextImage()
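The samples above save images by file path, even under the stream overloads. A minimal sketch of the stream-based loop follows. To stay self-contained it is written against a small hypothetical interface, `ImageSource`, that mirrors only the documented `hasNextImage()` and `getNextImage(java.io.OutputStream)` signatures. `ImageSource`, `ImageDrain`, and `drain` are illustrative names and are not part of the library. With the real class, `bindPdf` and `extractImage()` would presumably be called first, as in the samples.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in that mirrors two documented PdfExtractor signatures.
interface ImageSource {
    boolean hasNextImage();
    void getNextImage(OutputStream outputStream) throws Exception;
}

final class ImageDrain {
    private ImageDrain() {}

    // Pulls every remaining image into its own in-memory buffer.
    static List<byte[]> drain(ImageSource source) throws Exception {
        List<byte[]> images = new ArrayList<>();
        while (source.hasNextImage()) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            source.getNextImage(buffer); // one image per call, as in the samples
            images.add(buffer.toByteArray());
        }
        return images;
    }
}
```

A thin adapter that forwards both methods to a `PdfExtractor` instance should be enough to use this with the library, assuming the stream overload behaves like the file overload shown in the samples.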
public void extractText() throws java.lang.Exception
[SampleCode]
PdfExtractor extractor = new PdfExtractor();
extractor.bindPdf(path + "Text.pdf");
extractor.extractText();
extractor.getText(path + "text.txt");
java.lang.Exception
public void getText(java.lang.String outputFile) throws java.lang.Exception
outputFile
- The file path and name to save the text.
java.lang.Exception
public void getText(java.io.OutputStream outputStream) throws java.lang.Exception
outputStream
- The stream to save the text.
java.lang.Exception
public int getWordCount()
[SampleCode]
PdfExtractor extractor = new PdfExtractor();
extractor.bindPdf(path + "Text.pdf");
extractor.extractText();
int wordCount = extractor.getWordCount();
public void extractTextInRectangle(java.awt.Rectangle rec, ExtractTextMode extMode) throws PdfViewerException, java.lang.Exception
rec
- The java.awt.Rectangle from which text is extracted.
The coordinate origin (0,0) is the top-left corner of the PDF page.
rec.width and rec.height give the width and height of the extraction area.
extMode
- The ExtractTextMode to use for extraction.
PdfViewerException
java.lang.Exception
public void extractTextInRectangle(java.awt.Rectangle rec) throws PdfViewerException, java.lang.Exception
rec
- The java.awt.Rectangle from which text is extracted.
The coordinate origin (0,0) is the top-left corner of the PDF page.
rec.width and rec.height give the width and height of the extraction area.
PdfViewerException
java.lang.Exception
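Because the rectangle is measured from the top-left corner, a region known in PDF default user space (origin at the bottom-left, y growing upward) needs its y coordinate flipped. The sketch below shows that conversion. `RectangleHelper` and `fromBottomLeft` are hypothetical names, and the sketch assumes both coordinate systems use the same units and that the page height is known; the source states neither.

```java
import java.awt.Rectangle;

final class RectangleHelper {
    private RectangleHelper() {}

    // Converts a box given by its lower-left (llx, lly) and upper-right (urx, ury)
    // corners in bottom-left-origin coordinates into a top-left-origin Rectangle.
    static Rectangle fromBottomLeft(int llx, int lly, int urx, int ury, int pageHeight) {
        if (urx < llx || ury < lly) {
            throw new IllegalArgumentException("upper-right corner must not be below or left of lower-left corner");
        }
        int width = urx - llx;
        int height = ury - lly;
        int top = pageHeight - ury; // distance from the top edge of the page
        return new Rectangle(llx, top, width, height);
    }
}
```

For example, on a page 792 units high, the box from (72, 692) to (300, 720) becomes `new Rectangle(72, 72, 228, 28)`. The result could then be passed to `extractTextInRectangle(rec)`.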
public void getAllRectangleText(java.lang.String outputFile) throws java.lang.Exception
outputFile
- The file path and name to save the texts.
java.lang.Exception
public void getAllRectangleText(java.io.OutputStream outputStream) throws java.lang.Exception
outputStream
- The stream to save the texts.
java.lang.Exception
public void extractAttachment() throws java.lang.Exception
java.lang.Exception
public void getAttachment(java.lang.String outputPath) throws java.io.IOException
[SampleCode]
PdfExtractor extractor = new PdfExtractor();
extractor.bindPdf(path + "Attach.pdf");
extractor.extractAttachment();
extractor.getAttachment(path);
outputPath
- The path to save the attachment.
java.io.IOException
public java.io.ByteArrayOutputStream[] getAttachment() throws java.io.IOException
[SampleCode]
PdfExtractor extractor = new PdfExtractor();
extractor.bindPdf(path + "Attach.pdf");
extractor.extractAttachment();
ArrayList names = extractor.getAttachNames();
ByteArrayOutputStream[] tempStreams = extractor.getAttachment();
for (int i=0; i
Returns:
- The stream array of the attachment files in the PDF document.
Throws:
java.io.IOException
public java.util.ArrayList getAttachNames()
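One way the names from `getAttachNames()` might be paired with the streams from `getAttachment()` and written to disk is sketched below. It assumes the two collections share the same order and that each name converts to a usable file name; the source confirms neither. `AttachmentWriter` and `writeAll` are hypothetical names, and the method takes plain Java types so that it does not depend on the library.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

final class AttachmentWriter {
    private AttachmentWriter() {}

    // Writes streams[i] into outputDir under the name names.get(i).
    // Assumption: names and streams are index-aligned.
    static int writeAll(List<?> names, ByteArrayOutputStream[] streams, Path outputDir)
            throws IOException {
        if (names.size() != streams.length) {
            throw new IllegalArgumentException("names and streams differ in length");
        }
        Files.createDirectories(outputDir);
        for (int i = 0; i < streams.length; i++) {
            // Keep only the last path segment so names containing directories are flattened.
            Path fileName = Paths.get(String.valueOf(names.get(i))).getFileName();
            if (fileName == null) {
                throw new IOException("attachment " + i + " has no usable file name");
            }
            Files.write(outputDir.resolve(fileName), streams[i].toByteArray());
        }
        return streams.length;
    }
}
```

With the library, a call might look like `AttachmentWriter.writeAll(extractor.getAttachNames(), extractor.getAttachment(), Paths.get(path))`, made after `extractAttachment()`.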