
How can I check that all fonts used in a PDF are embedded, using Java and iText?

How can I check, with Java and iText, that every font used in a PDF file is embedded in that file? I have some existing PDF documents and would like to validate that they use only embedded fonts.

This would require checking that no PDF standard fonts are used and that any other fonts used are embedded in the file.
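The first half of that check can be sketched in plain Java. This is a minimal sketch, assuming the input is the value of a font dictionary's BaseFont entry without the leading slash; the class and method names are illustrative and not part of iText. A standard name only shows that a viewer could substitute the font, so the font file check is still needed to tell whether it is embedded.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class StandardFonts {
    // BaseFont names of the 14 standard PDF fonts.
    private static final Set<String> STANDARD_14 = new HashSet<>(Arrays.asList(
        "Times-Roman", "Times-Bold", "Times-Italic", "Times-BoldItalic",
        "Helvetica", "Helvetica-Bold", "Helvetica-Oblique", "Helvetica-BoldOblique",
        "Courier", "Courier-Bold", "Courier-Oblique", "Courier-BoldOblique",
        "Symbol", "ZapfDingbats"));

    /** Returns true if the BaseFont name is one of the 14 standard fonts. */
    public static boolean isStandard14(String baseFont) {
        return STANDARD_14.contains(baseFont);
    }

    public static void main(String[] args) {
        System.out.println(isStandard14("Helvetica-Bold"));
        System.out.println(isStandard14("ACaslonPro-Bold"));
    }
}
```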

When you create a Chunk, you declare which font it uses.
Create a BaseFont from the font you want to use and declare it as BaseFont.EMBEDDED.
Note that unless you set the subset option to true, the entire font is embedded.

Note that embedding fonts may infringe copyright.

The simplest answer is to open the PDF file in Adobe Acrobat and then:

click File
select Properties
click the Fonts tab

This shows a list of all fonts in the document. Any font that is embedded is shown with "(Embedded)" next to the font name.

For example:

ACaslonPro-Bold (Embedded)

where ACaslonPro-Bold is derived from the name of the file you embedded (for example, FontFactory.register("/path/to/ACaslonPro-Bold.otf",...

Take a look at the ListUsedFonts example from iText in Action.

http://itextpdf.com/examples/iia.php?id=287

It appears to print the fonts used in a PDF and whether they are embedded.

    /*
     * This class is part of the book "iText in Action - 2nd Edition"
     * written by Bruno Lowagie (ISBN: 9781935182610)
     * For more info, go to: http://itextpdf.com/examples/
     * This example only works with the AGPL version of iText.
     */
    package part4.chapter16;

    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.PrintWriter;
    import java.util.Set;
    import java.util.TreeSet;

    import part3.chapter11.FontTypes;

    import com.itextpdf.text.DocumentException;
    import com.itextpdf.text.pdf.PdfDictionary;
    import com.itextpdf.text.pdf.PdfName;
    import com.itextpdf.text.pdf.PdfReader;

    public class ListUsedFonts {

        /** The resulting PDF file. */
        public static String RESULT = "results/part4/chapter16/fonts.txt";

        /**
         * Creates a Set containing information about the fonts in the src PDF file.
         * @param src the path to a PDF file
         * @throws IOException
         */
        public Set<String> listFonts(String src) throws IOException {
            Set<String> set = new TreeSet<String>();
            PdfReader reader = new PdfReader(src);
            PdfDictionary resources;
            for (int k = 1; k <= reader.getNumberOfPages(); ++k) {
                resources = reader.getPageN(k).getAsDict(PdfName.RESOURCES);
                processResource(set, resources);
            }
            reader.close();
            return set;
        }

        /**
         * Extracts the font names from page or XObject resources.
         * @param set the set with the font names
         * @param resource the resources dictionary
         */
        public static void processResource(Set<String> set, PdfDictionary resource) {
            if (resource == null)
                return;
            PdfDictionary xobjects = resource.getAsDict(PdfName.XOBJECT);
            if (xobjects != null) {
                for (PdfName key : xobjects.getKeys()) {
                    processResource(set, xobjects.getAsDict(key));
                }
            }
            PdfDictionary fonts = resource.getAsDict(PdfName.FONT);
            if (fonts == null)
                return;
            PdfDictionary font;
            for (PdfName key : fonts.getKeys()) {
                font = fonts.getAsDict(key);
                String name = font.getAsName(PdfName.BASEFONT).toString();
                if (name.length() > 8 && name.charAt(7) == '+') {
                    name = String.format("%s subset (%s)", name.substring(8), name.substring(1, 7));
                } else {
                    name = name.substring(1);
                    PdfDictionary desc = font.getAsDict(PdfName.FONTDESCRIPTOR);
                    if (desc == null)
                        name += " nofontdescriptor";
                    else if (desc.get(PdfName.FONTFILE) != null)
                        name += " (Type 1) embedded";
                    else if (desc.get(PdfName.FONTFILE2) != null)
                        name += " (TrueType) embedded";
                    else if (desc.get(PdfName.FONTFILE3) != null)
                        name += " (" + font.getAsName(PdfName.SUBTYPE).toString().substring(1) + ") embedded";
                }
                set.add(name);
            }
        }

        /**
         * Main method.
         *
         * @param args no arguments needed
         * @throws DocumentException
         * @throws IOException
         */
        public static void main(String[] args) throws IOException, DocumentException {
            new FontTypes().createPdf(FontTypes.RESULT);
            Set<String> set = new ListUsedFonts().listFonts(FontTypes.RESULT);
            PrintWriter out = new PrintWriter(new FileOutputStream(RESULT));
            for (String fontname : set)
                out.println(fontname);
            out.flush();
            out.close();
        }
    }
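To turn the report produced by the listing above into a pass/fail check, the lines that do not indicate an embedded font can be filtered out. The following is a rough sketch that relies only on the text formats the listing emits ("subset (...)" and a trailing "embedded"); the class and method names are hypothetical. Lines ending in "nofontdescriptor" are treated as not embedded here, although for composite fonts the descriptor may sit on a descendant font instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FontReport {
    /** Returns the report lines that do not indicate an embedded font. */
    public static List<String> notEmbedded(Iterable<String> reportLines) {
        List<String> result = new ArrayList<>();
        for (String line : reportLines) {
            // Subset fonts are reported as "Name subset (ABCDEF)".
            boolean subset = line.endsWith(")") && line.contains(" subset (");
            // Fully embedded fonts are reported as "Name (kind) embedded".
            boolean embedded = line.endsWith(" embedded");
            if (!subset && !embedded) {
                result.add(line);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> report = Arrays.asList(
            "ACaslonPro-Bold subset (ABCDEF)",
            "Arial (TrueType) embedded",
            "Helvetica nofontdescriptor");
        System.out.println(notEmbedded(report));
    }
}
```

An empty result would mean every listed font was reported as embedded.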

I don't think this is an "iText" use case. Use PDFBox or jPod. These implement the PDF model and, as such, let you:

open the document
recurse from the document root through the object tree
check whether an object is a font object
check whether the font file is available
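The steps above can be sketched independently of any library. In this sketch plain Java Map and List objects stand in for PDF dictionaries and arrays, which is an assumption made for illustration; with PDFBox or jPod the library's own object classes would take their place, and the class name FontWalker is hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class FontWalker {
    /** Collects the BaseFont names of font dictionaries that have no font file. */
    public static List<String> findNotEmbedded(Object root) {
        List<String> result = new ArrayList<>();
        // Identity-based set so that cyclic references are visited only once.
        Set<Object> visited = Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        walk(root, visited, result);
        return result;
    }

    private static void walk(Object node, Set<Object> visited, List<String> result) {
        if (node instanceof Map) {
            if (!visited.add(node)) return;
            Map<?, ?> dict = (Map<?, ?>) node;
            // Composite fonts delegate to their descendants, which are visited below.
            if ("Font".equals(dict.get("Type"))
                    && !dict.containsKey("DescendantFonts")
                    && !hasFontFile(dict)) {
                result.add(String.valueOf(dict.get("BaseFont")));
            }
            for (Object child : dict.values()) walk(child, visited, result);
        } else if (node instanceof List) {
            if (!visited.add(node)) return;
            for (Object child : (List<?>) node) walk(child, visited, result);
        }
    }

    private static boolean hasFontFile(Map<?, ?> font) {
        Object desc = font.get("FontDescriptor");
        if (!(desc instanceof Map)) return false;
        Map<?, ?> d = (Map<?, ?>) desc;
        return d.containsKey("FontFile") || d.containsKey("FontFile2") || d.containsKey("FontFile3");
    }
}
```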

Checking whether only embedded fonts are used is much more complex (that is, fonts that are not embedded but are also never used are fine).
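To illustrate why "used" is harder than "declared": a font counts as used only if a page's content stream selects it with the Tf operator. The sketch below assumes the content stream has already been decoded into a String and scans it with a regular expression, which is a simplification; a real check needs a proper content stream parser, because the same characters can also occur inside string literals or inline images. The class name is hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UsedFontNames {
    // Matches a font selection such as "/F1 12 Tf": resource name, size, operator.
    private static final Pattern TF = Pattern.compile(
        "/([^\\s/\\[\\]()<>{}%]+)\\s+[-+]?(?:[0-9]+\\.?[0-9]*|\\.[0-9]+)\\s+Tf\\b");

    /** Returns the font resource names selected by Tf, in order of first use. */
    public static Set<String> usedFontResources(String decodedContent) {
        Set<String> names = new LinkedHashSet<>();
        Matcher m = TF.matcher(decodedContent);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        String content = "BT /F1 12 Tf (Hello) Tj ET BT /F2 9.5 Tf (World) Tj ET";
        System.out.println(usedFontResources(content));
    }
}
```

The returned names are resource keys, so they still have to be looked up in the Font entry of the page's resources to reach the actual font dictionaries.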

    import java.io.IOException;
    import java.util.Set;
    import java.util.TreeSet;

    import com.itextpdf.text.pdf.PdfArray;
    import com.itextpdf.text.pdf.PdfDictionary;
    import com.itextpdf.text.pdf.PdfName;
    import com.itextpdf.text.pdf.PdfReader;
    import com.itextpdf.text.pdf.PdfStream;

    // The class name is illustrative; the answer itself only gave the methods.
    public class ListNotEmbeddedFonts {

        /**
         * Creates a set containing information about the not-embedded fonts within the src PDF file.
         * @param src the path to a PDF file
         * @throws IOException
         */
        public Set<String> listFonts(String src) throws IOException {
            Set<String> set = new TreeSet<String>();
            PdfReader reader = new PdfReader(src);
            PdfDictionary resources;
            for (int k = 1; k <= reader.getNumberOfPages(); ++k) {
                resources = reader.getPageN(k).getAsDict(PdfName.RESOURCES);
                processResource(set, resources);
            }
            reader.close();
            return set;
        }

        /**
         * Finds out if the font is an embedded subset font.
         * @param name the BaseFont name, including the leading slash
         * @return true if the name denotes an embedded subset font
         */
        private boolean isEmbeddedSubset(String name) {
            // A subset name looks like "/ABCDEF+FontName".
            return name != null && name.length() > 8 && name.charAt(7) == '+';
        }

        private void processFont(PdfDictionary font, Set<String> set) {
            if (font == null)
                return;
            PdfName baseFont = font.getAsName(PdfName.BASEFONT);
            // Fonts without a BaseFont entry (such as Type 3 fonts) are skipped here
            // instead of causing a NullPointerException.
            if (baseFont == null)
                return;
            String name = baseFont.toString();
            if (isEmbeddedSubset(name))
                return;
            PdfDictionary desc = font.getAsDict(PdfName.FONTDESCRIPTOR);
            if (desc == null) {
                // No font descriptor: look at the descendant fonts of a composite font.
                PdfArray descendants = font.getAsArray(PdfName.DESCENDANTFONTS);
                if (descendants == null) {
                    set.add(name.substring(1));
                } else {
                    for (int i = 0; i < descendants.size(); i++) {
                        processFont(descendants.getAsDict(i), set);
                    }
                }
            } else if (desc.get(PdfName.FONTFILE) == null      // Type 1
                    && desc.get(PdfName.FONTFILE2) == null     // TrueType
                    && desc.get(PdfName.FONTFILE3) == null) {  // other font programs
                set.add(name.substring(1));
            }
        }

        /**
         * Extracts the names of the not-embedded fonts from page or XObject resources.
         * @param set the set with the font names
         * @param resource the resources dictionary
         */
        public void processResource(Set<String> set, PdfDictionary resource) {
            if (resource == null)
                return;
            PdfDictionary xobjects = resource.getAsDict(PdfName.XOBJECT);
            if (xobjects != null) {
                for (PdfName key : xobjects.getKeys()) {
                    // XObjects are streams; a form XObject keeps its fonts under its own Resources entry.
                    PdfStream xobject = xobjects.getAsStream(key);
                    if (xobject != null) {
                        processResource(set, xobject.getAsDict(PdfName.RESOURCES));
                    }
                }
            }
            PdfDictionary fonts = resource.getAsDict(PdfName.FONT);
            if (fonts == null)
                return;
            for (PdfName key : fonts.getKeys()) {
                processFont(fonts.getAsDict(key), set);
            }
        }
    }

The code above can be used to retrieve the fonts that are not embedded in a given PDF file. I improved the code from iText in Action so that it can also handle a font's DescendantFonts node.