
How to extract content of PDF file using java?

Question asked by ms_cs in #Java on Apr 10, 2009
ms_cs · Apr 10, 2009
Rank B1 - LEADER
I want to extract the content of a PDF file and store it in a text file using Java. How can I do this conversion? Is there a built-in class available for this?
MaRo · Apr 10, 2009
Rank B2 - LEADER
First, read the PDF file specification carefully; you can find it at the Adobe PDF Developer Center (PDF Reference). Then read the file's metadata and scrape the actual text according to the spec.

The hard part is understanding the PDF file structure. Once you have recognized the metadata, the rest is much like reading a normal text file.
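
To give an idea of what "reading the metadata" can look like, here is a minimal sketch using only the JDK. It assumes a classic PDF layout: a `%PDF-x.y` header at the start and a `startxref` keyword near the end, followed by the byte offset of the cross-reference table. The class name, method names, and the embedded sample are made up for illustration, and the sketch does not parse the cross-reference table or any objects.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper class: reads two pieces of file-level PDF metadata.
class PdfStructureSketch {

    // Tiny hand-written sample used when no file path is given (illustrative only).
    static final String SAMPLE =
        "%PDF-1.4\n1 0 obj\n<< >>\nendobj\nxref\n0 1\n0000000000 65535 f \n"
        + "trailer\n<< /Size 1 >>\nstartxref\n30\n%%EOF\n";

    // Returns the version from the "%PDF-x.y" header, for example "1.4".
    static String readVersion(byte[] pdf) {
        // ISO-8859-1 maps every byte to exactly one char, so binary data is not mangled.
        String head = new String(pdf, 0, Math.min(pdf.length, 1024), StandardCharsets.ISO_8859_1);
        int start = head.indexOf("%PDF-");
        if (start < 0) {
            throw new IllegalArgumentException("No %PDF- header found");
        }
        int end = start + 5;
        while (end < head.length()
                && (Character.isDigit(head.charAt(end)) || head.charAt(end) == '.')) {
            end++;
        }
        return head.substring(start + 5, end);
    }

    // Returns the number written after the last "startxref" keyword,
    // which is the byte offset of the cross-reference table.
    static long readStartXref(byte[] pdf) {
        String text = new String(pdf, StandardCharsets.ISO_8859_1);
        int key = text.lastIndexOf("startxref");
        if (key < 0) {
            throw new IllegalArgumentException("No startxref keyword found");
        }
        int eof = text.indexOf("%%EOF", key);
        String number = (eof < 0)
            ? text.substring(key + "startxref".length())
            : text.substring(key + "startxref".length(), eof);
        return Long.parseLong(number.trim());
    }

    public static void main(String[] args) throws IOException {
        // Use the file given on the command line, or fall back to the built-in sample.
        byte[] pdf = args.length > 0
            ? Files.readAllBytes(Paths.get(args[0]))
            : SAMPLE.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println("PDF version: " + readVersion(pdf));
        System.out.println("xref table offset: " + readStartXref(pdf));
    }
}
```

From that offset a real reader would go on to parse the cross-reference table and the trailer dictionary to locate the page objects, which is where most of the work described in the specification begins.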


This might help https://www.planetpdf.com/developer/article.asp?ContentID=navigating_the_internal_struct
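
As a rough illustration of the "scrape the actual text" step, the sketch below pulls literal strings out of `(…) Tj` operators in a page content stream. It assumes the stream has already been located and is uncompressed, and that the font uses a simple encoding; many real PDFs compress their streams (for example with FlateDecode) or use custom font encodings, so this will not work on them as-is. The class and method names are hypothetical, and the `[...] TJ` array form and octal escapes are not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class: extracts strings shown with the Tj operator.
class TjTextSketch {

    // A literal string in parentheses (allowing backslash escapes), followed by Tj.
    private static final Pattern SHOW_TEXT =
        Pattern.compile("\\(((?:\\\\.|[^\\\\()])*)\\)\\s*Tj");

    // Returns every string shown with Tj, in order of appearance.
    static List<String> extractShownText(String contentStream) {
        List<String> result = new ArrayList<>();
        Matcher m = SHOW_TEXT.matcher(contentStream);
        while (m.find()) {
            result.add(unescape(m.group(1)));
        }
        return result;
    }

    // Resolves \( \) \\ and \n; other escapes (such as octal codes) are not decoded properly.
    static String unescape(String raw) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '\\' && i + 1 < raw.length()) {
                char next = raw.charAt(++i);
                out.append(next == 'n' ? '\n' : next);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hand-written content stream fragment, for illustration only.
        String content = "BT /F1 12 Tf 72 720 Td (Hello, PDF) Tj ET\n"
                       + "BT 72 700 Td (second \\(line\\)) Tj ET";
        for (String line : extractShownText(content)) {
            System.out.println(line);
        }
    }
}
```

The JDK itself has no built-in class for reading PDFs, so for anything beyond an experiment a dedicated library such as Apache PDFBox is likely a more practical route than hand-written parsing.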
