Java Developers
Community of Java Developers: Get Java Programming Help from fellow Java Developers across the world.
ms_cs • Apr 10, 2009

How to extract the content of a PDF file using Java?

I want to extract the content of a PDF file and store it in a text file using Java. How can I do this conversion? Is there a built-in class available for this?
MaRo • Apr 10, 2009
First, read the PDF file specification carefully; you can find it at the Adobe PDF Developer Center (PDF Reference). Then read the file's metadata and scrape the actual text according to the spec.

The hard part is understanding the PDF file structure. Once you have worked out the metadata, the rest is much like reading a normal text file.


This might help: https://www.planetpdf.com/developer/article.asp?ContentID=navigating_the_internal_struct
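
To make the "scrape the text according to the spec" step a bit more concrete, here is a rough sketch in plain Java. The JDK has no built-in class for reading PDFs, so the class and method names below (PdfTextSketch, extractText, saveText) are made up for illustration. It assumes you have already located a page's content stream and decoded it (real files usually compress these streams), and it only picks up literal strings shown with the Tj operator. Octal escapes, TJ arrays, hex strings and font encodings are all ignored, so treat it as a starting point, not a working converter. In practice most people use a library such as Apache PDFBox instead of parsing by hand.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for illustration; not part of the JDK or any library. */
final class PdfTextSketch {

    /**
     * Collects the literal strings shown by Tj operators in an already
     * decoded PDF content stream, one string per output line.
     */
    static String extractText(String contentStream) {
        StringBuilder out = new StringBuilder();
        int n = contentStream.length();
        int i = 0;
        while (i < n) {
            if (contentStream.charAt(i) != '(') {
                i++;
                continue;
            }
            // Parse a literal string: "(...)" with nested parentheses and backslash escapes.
            StringBuilder literal = new StringBuilder();
            int depth = 1;
            i++;
            while (i < n && depth > 0) {
                char ch = contentStream.charAt(i++);
                if (ch == '\\' && i < n) {
                    literal.append(unescape(contentStream.charAt(i++)));
                } else if (ch == '(') {
                    depth++;
                    literal.append(ch);
                } else if (ch == ')') {
                    depth--;
                    if (depth > 0) {
                        literal.append(ch);
                    }
                } else {
                    literal.append(ch);
                }
            }
            // Keep the string only if the next token is the Tj (show text) operator.
            int j = i;
            while (j < n && Character.isWhitespace(contentStream.charAt(j))) {
                j++;
            }
            if (contentStream.startsWith("Tj", j)) {
                out.append(literal).append('\n');
                i = j + 2;
            }
        }
        return out.toString();
    }

    /** Maps the character after a backslash to its meaning (octal escapes are not handled). */
    private static char unescape(char c) {
        switch (c) {
            case 'n': return '\n';
            case 'r': return '\r';
            case 't': return '\t';
            case 'b': return '\b';
            case 'f': return '\f';
            default:  return c; // covers \( \) and \\
        }
    }

    /** Writes the extracted text to a UTF-8 text file. */
    static void saveText(String contentStream, Path target) throws IOException {
        Files.write(target, extractText(contentStream).getBytes(StandardCharsets.UTF_8));
    }
}
```

Calling PdfTextSketch.saveText(decodedStream, Paths.get("out.txt")) would then write whatever Tj strings were found to out.txt, where decodedStream is the content stream text you extracted yourself.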
