ankur8819 • May 2, 2012

Reading data from .xls and .xlsx files using Apache POI

Hi CEans,
I was doing a POC for reading a spreadsheet in .xls and .xlsx format using Apache POI and thought it might be useful to you guys.
The code is the most basic version of reading from either format. Its applications are endless. Let me know if you have any specific requirement.

import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
 
import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.DateUtil;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;
 
public class ExcelRead {
 
    public static void main(String ar[]) {
        String fname = "B:\\AnkurTostudy\\poi-3.8\\TestSheet.xlsx";
        System.out.println(fname);
        Workbook workbook = null;
        Sheet sheet = null;
        try {
            InputStream in = new FileInputStream(fname);
            try {
                workbook = WorkbookFactory.create(in);
                sheet = workbook.getSheetAt(0);
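                // Row indices can have gaps, so loop up to the last row index
                // instead of relying on the count of physically defined rows.
                int lastRowNum = sheet.getLastRowNum();
 
                for (int rownum = 0; rownum <= lastRowNum; rownum++) {
                    Row row = sheet.getRow(rownum);
                    if (row == null) {
                        continue; // this row was never created in the sheet
                    }
 
                    // getLastCellNum() returns the last cell index plus one, or -1 for an empty row
                    int noOfColumns = row.getLastCellNum();
                    for (int colnum = 0; colnum < noOfColumns; colnum++) {
 
                        Cell cell = row.getCell(colnum);
                        if (cell == null) {
                            continue; // cells that were never filled in come back as null
                        }
                        switch (cell.getCellType()) {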
                        case Cell.CELL_TYPE_STRING:
                            System.out.println(cell.getStringCellValue());
                            break;
                        case Cell.CELL_TYPE_NUMERIC:
                            if (DateUtil.isCellDateFormatted(cell)) {
                                System.out.println(cell.getDateCellValue());
                            } else {
                                System.out.println(cell.getNumericCellValue());
                            }
                            break;
                        case Cell.CELL_TYPE_BOOLEAN:
                            System.out.println(cell.getBooleanCellValue());
                            break;
                        case Cell.CELL_TYPE_FORMULA:
                            System.out.println(cell.getCellFormula());
                            break;
                        default:
                            System.out.println("In Default");
                        }
                    }
                }
 
            } catch (InvalidFormatException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            } catch (IOException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
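            } finally {
                // Close the stream whether or not reading succeeded
                try {
                    in.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }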
 
        } catch (FileNotFoundException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }
}
Here is what the Build Path looks like.
Ankita Katdare • May 3, 2012
Thanks for the share ankur8819 😀 I am sure it will be helpful to CEans.
Awesome... Let me extract the information about Apache POI for the others:
The Apache POI Project's mission is to create and maintain Java APIs for manipulating various file formats based upon the Office Open XML standards (OOXML) and Microsoft's OLE 2 Compound Document format (OLE2). In short, you can read and write MS Excel files using Java. In addition, you can read and write MS Word and MS PowerPoint files using Java. Apache POI is your Java Excel solution (for Excel 97-2008). We have a complete API for porting other OOXML and OLE2 formats and welcome others to participate.

OLE2 files include most Microsoft Office files such as XLS, DOC, and PPT as well as MFC serialization API based file formats. The project provides APIs for the OLE2 Filesystem (POIFS) and OLE2 Document Properties (HPSF).

Office OpenXML Format is the new standards based XML file format found in Microsoft Office 2007 and 2008. This includes XLSX, DOCX and PPTX. The project provides a low level API to support the Open Packaging Conventions using openxml4j.
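
To make the OLE2 vs. OOXML distinction concrete, here is a rough sketch that tells the two containers apart by their leading bytes. It assumes only the well-known file signatures: OLE2 compound documents start with the bytes D0 CF 11 E0 A1 B1 1A E1, and OOXML files are ZIP packages starting with "PK" 03 04. The class FormatSniffer and its detect method are made-up names for illustration and are not part of Apache POI; any ordinary ZIP file would also be reported as OOXML by this check.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of Apache POI.
final class FormatSniffer {

    // OLE2 compound documents (.xls, .doc, .ppt) begin with this 8-byte signature.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // OOXML files (.xlsx, .docx, .pptx) are ZIP packages; a ZIP local file
    // header starts with "PK" followed by 0x03 0x04.
    private static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };

    // Classifies a file from its first bytes. A ZIP signature only shows that
    // the container is a ZIP, so "OOXML" here means "possibly OOXML".
    static String detect(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return "OLE2";
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return "OOXML";
        }
        return "UNKNOWN";
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    public static void main(String[] args) {
        byte[] zipStart = { 0x50, 0x4B, 0x03, 0x04, 0x14, 0x00 };
        System.out.println(detect(zipStart));
    }
}
```

To use it on a real file, read the first eight bytes from a FileInputStream and pass them to detect.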

For each MS Office application there exists a component module that attempts to provide a common high level Java api to both OLE2 and OOXML document formats. This is most developed for Excel workbooks (SS=HSSF+XSSF). Work is progressing for Word documents (HWPF+XWPF) and PowerPoint presentations (HSLF+XSLF).
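
A small sketch of what that common API (SS = HSSF + XSSF) means in practice: the same code writes and reads a cell whether the workbook is .xls or .xlsx. It assumes the POI jars (including poi-ooxml and its dependencies) are on the build path; CommonApiDemo and roundTrip are made-up names for illustration. The method declares throws Exception because the checked exceptions of WorkbookFactory.create differ between POI versions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

// Hypothetical demo class; the class and method names are illustrative only.
final class CommonApiDemo {

    // Writes one string cell through the format-neutral Workbook interface,
    // serializes the workbook to memory, then reads the cell back.
    static String roundTrip(Workbook workbook, String text) throws Exception {
        workbook.createSheet("demo").createRow(0).createCell(0).setCellValue(text);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        workbook.write(out);

        // WorkbookFactory inspects the bytes and returns the matching implementation.
        Workbook reloaded = WorkbookFactory.create(new ByteArrayInputStream(out.toByteArray()));
        return reloaded.getSheetAt(0).getRow(0).getCell(0).getStringCellValue();
    }

    public static void main(String[] args) throws Exception {
        // HSSFWorkbook produces the OLE2 (.xls) format, XSSFWorkbook the OOXML (.xlsx) format.
        System.out.println(roundTrip(new HSSFWorkbook(), "from .xls"));
        System.out.println(roundTrip(new XSSFWorkbook(), "from .xlsx"));
    }
}
```

Only the constructor call differs between the two formats; everything after it goes through the shared org.apache.poi.ss.usermodel interfaces, which is also why the ExcelRead code in the first post works for both file types.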

The project has recently added support for Outlook (HSMF). Microsoft opened the specifications to this format in October 2007. We would welcome contributions.

There are also projects for Visio (HDGF), TNEF (HMEF), and Publisher (HPBF).

As a general policy we collaborate as much as possible with other projects to provide this functionality. Examples include: Cocoon, for which there are serializers for HSSF; OpenOffice.org, with whom we collaborate in documenting the XLS format; and Tika / Lucene, for which we provide format interpreters. When practical, we donate components directly to those projects for POI-enabling them.
