read - Read an XLSX file in Java


I need to read an Excel 2007 XLSX file in a Java application. Does anyone know of a good API for this task?

This one may work for you; it can read and write Excel 2007 xlsx files: SmartXLS

Have you looked at POI?

Never mind:

HSSF is the POI Project's pure Java implementation of the Excel '97(-2007) file format. It does not support the new Excel 2007 .xlsx OOXML file format, which is not OLE2 based.

You could consider using a JDBC-ODBC bridge instead.

AFAIK there are no xlsx libraries available yet. But there are some for the old xls format:

One library is jxls, which internally uses the already mentioned POI.

Two other links: Handle Excel files, Java libraries to read and write Excel XLS document files.

I'm not very happy with any of the options, so I ended up requesting the file in Excel 97 format. POI works great for that. Thanks everyone for the help.

I don't know if it is up to date for Excel 2007, but for earlier versions I use JExcelAPI.

This might be a little late, but the POI beta now supports xlsx.

Try this:

Unzip the XLSX file
Read the XML files
Compose and use the data

Example code:

    // Not shown in this answer: the manager object, getInputStream(...),
    // getSAXParser(), getDefaultHandler(...), parseStrings(...), and the
    // constants COMMENTS and SHARED_STRINGS.

    // Entry point: parse shared strings, worksheets and comments in turn.
    public Workbook getTemplateData(String xlsxFile) {
        Workbook workbook = new Workbook();
        parseSharedStrings(xlsxFile);
        parseWorkesheet(xlsxFile, workbook);
        parseComments(xlsxFile, workbook);
        for (Worksheet worksheet : workbook.sheets) {
            worksheet.dimension = manager.getDimension(worksheet);
        }
        return workbook;
    }

    // Walk the zip entries and parse every comments XML part.
    private void parseComments(String tmpFile, Workbook workbook) {
        try {
            FileInputStream fin = new FileInputStream(tmpFile);
            final ZipInputStream zin = new ZipInputStream(fin);
            InputStream in = getInputStream(zin);
            while (true) {
                ZipEntry entry = zin.getNextEntry();
                if (entry == null)
                    break;
                String name = entry.getName();
                if (name.endsWith(".xml")) { //$NON-NLS-1$
                    if (name.contains(COMMENTS)) {
                        parseComments(in, workbook);
                    }
                }
                zin.closeEntry();
            }
            in.close();
            zin.close();
            fin.close();
        } catch (FileNotFoundException e) {
            System.out.println(e);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    private void parseComments(InputStream in, Workbook workbook) {
        try {
            DefaultHandler handler = getCommentHandler(workbook);
            SAXParser saxParser = getSAXParser();
            saxParser.parse(in, handler);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // SAX handler that attaches comment text to cells of the first sheet.
    private DefaultHandler getCommentHandler(Workbook workbook) {
        final Worksheet ws = workbook.sheets.get(0);
        return new DefaultHandler() {
            String lastTag = "";
            private Cell ccell;

            @Override
            public void startElement(String uri, String localName,
                    String qName, Attributes attributes) throws SAXException {
                lastTag = qName;
                if (lastTag.equals("comment")) {
                    String cellName = attributes.getValue("ref");
                    int r = manager.getRowIndex(cellName);
                    int c = manager.getColumnIndex(cellName);
                    Row row = ws.rows.get(r);
                    if (row == null) {
                        row = new Row();
                        row.index = r;
                        ws.rows.put(r, row);
                    }
                    ccell = row.cells.get(c);
                    if (ccell == null) {
                        ccell = new Cell();
                        ccell.cellName = cellName;
                        row.cells.put(c, ccell);
                    }
                }
            }

            @Override
            public void characters(char[] ch, int start, int length)
                    throws SAXException {
                String val = "";
                if (ccell != null && lastTag.equals("t")) {
                    for (int i = start; i < start + length; i++) {
                        val += ch[i];
                    }
                    if (ccell.comment == null)
                        ccell.comment = val;
                    else {
                        ccell.comment += val;
                    }
                }
            }
        };
    }

    // Walk the zip entries and parse the shared strings part.
    private void parseSharedStrings(String tmpFile) {
        try {
            FileInputStream fin = new FileInputStream(tmpFile);
            final ZipInputStream zin = new ZipInputStream(fin);
            InputStream in = getInputStream(zin);
            while (true) {
                ZipEntry entry = zin.getNextEntry();
                if (entry == null)
                    break;
                String name = entry.getName();
                if (name.endsWith(".xml")) { //$NON-NLS-1$
                    if (name.startsWith(SHARED_STRINGS)) {
                        parseStrings(in);
                    }
                }
                zin.closeEntry();
            }
            in.close();
            zin.close();
            fin.close();
        } catch (FileNotFoundException e) {
            System.out.println(e);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Walk the zip entries and parse every worksheet XML part.
    public void parseWorkesheet(String tmpFile, Workbook workbook) {
        try {
            FileInputStream fin = new FileInputStream(tmpFile);
            final ZipInputStream zin = new ZipInputStream(fin);
            InputStream in = getInputStream(zin);
            while (true) {
                ZipEntry entry = zin.getNextEntry();
                if (entry == null)
                    break;
                String name = entry.getName();
                if (name.endsWith(".xml")) { //$NON-NLS-1$
                    if (name.contains("worksheets")) {
                        Worksheet worksheet = new Worksheet();
                        worksheet.name = name;
                        parseWorksheet(in, worksheet);
                        workbook.sheets.add(worksheet);
                    }
                }
                zin.closeEntry();
            }
            in.close();
            zin.close();
            fin.close();
        } catch (FileNotFoundException e) {
            System.out.println(e);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public void parseWorksheet(InputStream in, Worksheet worksheet)
            throws IOException {
        // read sheet1 sharedStrings
        // styles, strings, formulas ...
        try {
            DefaultHandler handler = getDefaultHandler(worksheet);
            SAXParser saxParser = getSAXParser();
            saxParser.parse(in, handler);
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        }
    }

where the Workbook class is:

    public class Workbook {
        Integer id = null;
        public List<Worksheet> sheets = new ArrayList<Worksheet>();
    }

and the Worksheet class:

    public class Worksheet {
        public Integer id = null;
        public String name = null;
        public String dimension = null;
        public Map<Integer, Row> rows = new TreeMap<Integer, Row>();
        public Map<Integer, Column> columns = new TreeMap<Integer, Column>();
        public List<Span> spans = new ArrayList<Span>();
    }

and the Row class:

    public class Row {
        public Integer id = null;
        public Integer index = null;
        public Row tmpRow = null;
        public Style style = null;
        public Double height = null;
        public Map<Integer, Cell> cells = new TreeMap<Integer, Cell>();
        public String spans = null;
        public Integer customHeight = null;
    }

and the Cell class:

    public class Cell {
        public Integer id = null;
        public Integer rowIndex = null;
        public Integer colIndex = null;
        public String cellName = null;
        public String text = null;
        public String formula = null;
        public String comment = null;
        public Style style = null;
        public Object value = null;
        public Cell tmpCell = null;
    }

and the Column class:

    public class Column {
        public Integer index = null;
        public Style style = null;
        public String width = null;
        public Column tmpColumn = null;
    }

and the Span class:

    public class Span {
        Integer id = null;
        String topLeft = null;
        String bottomRight = null;
    }
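
The comment handler above calls manager.getRowIndex(cellName) and manager.getColumnIndex(cellName), which the answer does not show. As a rough sketch of what such helpers might look like, the hypothetical CellRefs class below turns an A1-style reference such as "AA10" into indices. It assumes zero-based results and plain references without "$" markers; the original manager may well behave differently.

```java
// Hypothetical stand-in for the unshown manager.getRowIndex/getColumnIndex.
// Assumption: indices are zero-based and references look like "B12".
final class CellRefs {
    private CellRefs() {}

    private static boolean isAsciiLetter(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    /** Number of leading column letters in the reference. */
    private static int letterCount(String cellName) {
        int i = 0;
        while (i < cellName.length() && isAsciiLetter(cellName.charAt(i))) {
            i++;
        }
        if (i == 0 || i == cellName.length()) {
            throw new IllegalArgumentException("Not an A1-style reference: " + cellName);
        }
        return i;
    }

    /** Zero-based column index: "A1" gives 0, "AA10" gives 26. */
    static int columnIndex(String cellName) {
        int letters = letterCount(cellName);
        int col = 0;
        for (int i = 0; i < letters; i++) {
            // Column letters form a base-26 number with digits A=1 .. Z=26.
            col = col * 26 + (Character.toUpperCase(cellName.charAt(i)) - 'A' + 1);
        }
        return col - 1;
    }

    /** Zero-based row index: "A1" gives 0, "AA10" gives 9. */
    static int rowIndex(String cellName) {
        int letters = letterCount(cellName);
        return Integer.parseInt(cellName.substring(letters)) - 1;
    }

    public static void main(String[] args) {
        System.out.println(columnIndex("AA10")); // prints 26
        System.out.println(rowIndex("AA10"));    // prints 9
    }
}
```

Zero-based indices are only a guess here; if the TreeMap keys are meant to match the row numbers in the sheet XML, drop the "- 1" adjustments.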

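You can use Apache Tika for that:

    import java.io.File;
    import java.io.IOException;
    import org.apache.tika.Tika;
    import org.apache.tika.exception.TikaException;

    // parseToString declares both checked exceptions, so pass them on.
    String parse(File xlsxFile) throws IOException, TikaException {
        return new Tika().parseToString(xlsxFile);
    }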

Tika uses Apache POI for parsing XLSX files.

Here are some usage examples for Tika.

Alternatively, if you want to handle each cell of the spreadsheet individually, here is one way to do this with POI:

    import java.io.File;
    import java.io.IOException;
    import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
    import org.apache.poi.ss.usermodel.Cell;
    import org.apache.poi.xssf.usermodel.XSSFWorkbook;

    void parse(File xlsx) {
        try (XSSFWorkbook workbook = new XSSFWorkbook(xlsx)) {
            // Handle each cell in each sheet
            workbook.forEach(sheet -> sheet.forEach(row -> row.forEach(this::handle)));
        } catch (InvalidFormatException | IOException e) {
            System.out.println("Can't parse file " + xlsx);
        }
    }

    void handle(Cell cell) {
        final String cellContent;
        // Newer POI releases replace these int constants with the
        // CellType enum (STRING, NUMERIC, BOOLEAN).
        switch (cell.getCellType()) {
            case Cell.CELL_TYPE_STRING:
                cellContent = cell.getStringCellValue();
                break;
            case Cell.CELL_TYPE_NUMERIC:
                cellContent = String.valueOf(cell.getNumericCellValue());
                break;
            case Cell.CELL_TYPE_BOOLEAN:
                cellContent = String.valueOf(cell.getBooleanCellValue());
                break;
            default:
                cellContent = "Don't know how to handle cell " + cell;
        }
        System.out.println(cellContent);
    }

I had to do this in .NET and couldn't find any API out there. My solution was to unzip the .xlsx and dive right into manipulating the XML. It's not so bad once you create your helper classes and such.
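
For a Java take on the same unzip-and-read-the-XML idea, here is a minimal sketch that uses only the JDK. The class name SharedStringsReader is made up for illustration. It assumes the shared strings part sits at the conventional path xl/sharedStrings.xml and that elements are unprefixed (si, t), which is what Excel normally writes; a robust reader would follow the package relationships instead. Phonetic runs, if present, would be concatenated along with the regular text.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: reads the shared string table of an .xlsx file.
final class SharedStringsReader {
    private SharedStringsReader() {}

    static List<String> read(String xlsxPath)
            throws IOException, SAXException, ParserConfigurationException {
        final List<String> strings = new ArrayList<>();
        // An .xlsx file is a zip archive, so ZipFile can open it directly.
        try (ZipFile zip = new ZipFile(xlsxPath)) {
            // Assumption: conventional part name; may differ in unusual files.
            ZipEntry entry = zip.getEntry("xl/sharedStrings.xml");
            if (entry == null) {
                return strings; // no shared strings part at that path
            }
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            try (InputStream in = zip.getInputStream(entry)) {
                parser.parse(in, new DefaultHandler() {
                    private StringBuilder current; // text of the <si> being read
                    private boolean inText;        // true while inside a <t>

                    @Override
                    public void startElement(String uri, String localName,
                            String qName, Attributes attributes) {
                        if (qName.equals("si")) {
                            current = new StringBuilder();
                        } else if (qName.equals("t")) {
                            inText = true;
                        }
                    }

                    @Override
                    public void characters(char[] ch, int start, int length) {
                        if (inText && current != null) {
                            current.append(ch, start, length);
                        }
                    }

                    @Override
                    public void endElement(String uri, String localName, String qName) {
                        if (qName.equals("t")) {
                            inText = false;
                        } else if (qName.equals("si") && current != null) {
                            // One <si> is one table entry, even if split into runs.
                            strings.add(current.toString());
                            current = null;
                        }
                    }
                });
            }
        }
        return strings;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.out.println("usage: SharedStringsReader <file.xlsx>");
            return;
        }
        read(args[0]).forEach(System.out::println);
    }
}
```

Cells in the worksheet XML that have t="s" store an index into this list rather than the text itself, so the table has to be read before the sheets can be resolved.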

There are a few "gotchas": for example, the nodes all have to be ordered the way Excel expects them, which I didn't find documented in the official docs. Excel also has its own date and time representation, so you'll need to write a conversion formula.
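
As an illustration of the date conversion, the sketch below turns an Excel serial number into a java.time.LocalDateTime. The class name ExcelDates is made up. It assumes the workbook uses the default 1900 date system (not the 1904 one) and only handles serials from 61 (1900-03-01) onward, which sidesteps Excel's fictitious 1900-02-29.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;

// Hypothetical helper for converting Excel serial dates.
// Assumption: the workbook uses the 1900 date system.
final class ExcelDates {
    private ExcelDates() {}

    // Serial 1 is nominally 1900-01-01, but Excel also counts a nonexistent
    // 1900-02-29, so for later dates the effective day zero is 1899-12-30.
    private static final LocalDate DAY_ZERO = LocalDate.of(1899, 12, 30);

    static LocalDateTime fromSerial(double serial) {
        if (serial < 61) {
            throw new IllegalArgumentException(
                "Serials before 1900-03-01 are not handled: " + serial);
        }
        long days = (long) Math.floor(serial);
        // The fractional part is the time of day as a fraction of 24 hours.
        long seconds = Math.round((serial - days) * 86_400);
        return DAY_ZERO.plusDays(days).atStartOfDay().plusSeconds(seconds);
    }

    public static void main(String[] args) {
        System.out.println(fromSerial(43831.5)); // prints 2020-01-01T12:00
    }
}
```

If POI is on the classpath anyway, its DateUtil.getJavaDate(double) does this kind of conversion for you.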

docx4j now covers xlsx as well.

"Why would you use docx4j to do this", I hear you ask, "rather than POI, which focuses on xlsx and binary xls?"

Probably because you like JAXB (as opposed to XML Beans), or you are already using docx4j for docx or pptx and need to be able to do some things with xlsx as well.

Another possible reason is that the XML Beans jar generated from the OpenXML schemas is too big for your purposes. (To get around this, POI offers a 'lite' subset: the 'big' ooxml-schemas-1.0.jar is 14.5 MB! But if you need to support arbitrary spreadsheets, you'll probably need the complete jar.) In contrast, the whole of docx4j/pptx4j/xlsx4j weighs in at about the same as POI's lite subset.

If you are processing spreadsheets only (i.e. not docx or pptx), and the preceding paragraph is not a concern for you, then you would probably be best off using POI.

POI 3.5 has added support for all the OOXML formats (docx, xlsx, etc.)

See the XSSF subproject.

Aspose.Cells for Java supports the XLSX format. You can find more details and further help in the Aspose.Cells for Java Documentation. Please see if this helps.

Disclosure: I work as a developer evangelist at Aspose.