
How To Read A PDF File From A URL In Java

Learn to process a PDF document stored on the Net.

By V. Subhash

This article is based on a source code example sent by Gnostice DevTools member L. Santhanam to a customer who wanted to load PDF files stored on a website (Intranet or Internet).

PDFOne (for Java™) can load PDF documents from files, streams, and byte arrays. So, the trick here is to read the file off the Net and store it in a local file using the Java API. To read a Web resource, you need to use the classes in the java.net package. Here is what you need to do:

1. Create a java.net.URL object with the address of the PDF document.
2. Check the content type of the resource reached by the URL object.
3. If the content type is that of a PDF document, read the input stream of the PDF and save it to a file output stream.
4. Use PDFOne to process the PDF document saved in the file.
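The content-type test described above can be made a little more forgiving. URLConnection.getContentType() returns null when the type is not known, and servers sometimes append parameters such as "; charset=..." to the header value. The following is a minimal sketch of such a check, assuming a hypothetical helper class named ContentTypeCheck that is not part of PDFOne or of the original example:

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of PDFOne or the original article.
final class ContentTypeCheck {

    // Returns true when a Content-Type header value names the PDF media type.
    // Tolerates a missing header (null) and trailing parameters such as "; charset=...".
    static boolean isPdf(String contentType) {
        if (contentType == null) {
            return false;
        }
        int semicolon = contentType.indexOf(';');
        String mediaType = (semicolon >= 0)
                ? contentType.substring(0, semicolon)
                : contentType;
        return mediaType.trim().toLowerCase(Locale.ROOT).equals("application/pdf");
    }

    public static void main(String[] args) {
        System.out.println(isPdf("application/pdf"));                 // true
        System.out.println(isPdf("Application/PDF; charset=binary")); // true
        System.out.println(isPdf("text/html"));                       // false
        System.out.println(isPdf(null));                              // false
    }
}
```

With a check like this, the value returned by getContentType() could be passed straight to isPdf() without a separate guard against NullPointerException.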

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.ConnectException;
import java.net.URL;
import java.net.URLConnection;

import com.gnostice.pdfone.PdfDocument;

public class Read_PDF_From_URL {

  public static void main(String[] args) throws IOException {

    URL url1 = new URL("");

    byte[] ba1 = new byte[1024];
    int baLength;
    FileOutputStream fos1 = new FileOutputStream("download.pdf");

    try {
      // Contacting the URL
      System.out.print("Connecting to " + url1.toString() + " ... ");
      URLConnection urlConn = url1.openConnection();

      // Checking whether the URL contains a PDF
      if (!urlConn.getContentType().equalsIgnoreCase("application/pdf")) {
        System.out.println("FAILED.\n[Sorry. This is not a PDF.]");
      } else {
        try {
          // Read the PDF from the URL and save to a local file
          InputStream is1 = url1.openStream();
          while ((baLength = is1.read(ba1)) != -1) {
            fos1.write(ba1, 0, baLength);
          }
          fos1.flush();
          fos1.close();
          is1.close();

          // Load the PDF document and display its page count
          System.out.print("DONE.\nProcessing the PDF ... ");
          PdfDocument doc = new PdfDocument();
          try {
            doc.load("download.pdf");
            System.out.println("DONE.\nNumber of pages in the PDF is " +
                               doc.getPageCount());
            doc.close();
          } catch (Exception e) {
            System.out.println("FAILED.\n[" + e.getMessage() + "]");
          }
        } catch (ConnectException ce) {
          System.out.println("FAILED.\n[" + ce.getMessage() + "]\n");
        }
      }
    } catch (NullPointerException npe) {
      System.out.println("FAILED.\n[" + npe.getMessage() + "]\n");
    }
  }
}
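The read-and-save loop in the program above can also be written with try-with-resources, so that the input stream is closed even when the copy fails partway. The following is a minimal sketch, assuming Java 7 or later and a hypothetical helper class named StreamToFile. It uses only standard library calls and leaves the PDFOne step out. The demo in main() feeds it placeholder bytes from memory rather than a real download.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration; not part of PDFOne or the original article.
final class StreamToFile {

    // Copies every byte from 'in' to 'target', replacing any existing file,
    // and returns the number of bytes written. The stream is always closed.
    static long save(InputStream in, Path target) throws IOException {
        try (InputStream source = in) {
            return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder bytes standing in for a downloaded document.
        byte[] fake = "%PDF-1.4 placeholder".getBytes(StandardCharsets.US_ASCII);
        Path target = Files.createTempFile("download", ".pdf");
        long written = save(new ByteArrayInputStream(fake), target);
        System.out.println(written + " bytes saved to " + target);
        Files.delete(target);
    }
}
```

In the article's program, the stream passed to save() would be the one returned by urlConn.getInputStream(), and the target would be Paths.get("download.pdf"). Reading from the existing URLConnection also avoids opening a second connection to the server, which is what url1.openStream() does.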
