DomXmlParserWhitespace.java - Parse XML File without Whitespaces
How to parse an XML file with the DOM API without including whitespaces between XML elements?
✍: FYIcenter
In many cases, whitespace is added to XML files before and after XML elements
to make the file more readable.
For example, the following XML file, User.xml, includes whitespace between its elements:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!-- Copyright (c) 2017 FYIcenter.com -->
<User>
  <ID>101</ID>
  <BirthDate>1970-01-01+00:01</BirthDate>
  <Name>Frank Y. Ivy</Name>
  <Sex> Male</Sex>
</User>
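To see what this whitespace looks like to the DOM API, here is a minimal sketch that parses the same content from an in-memory string and counts the whitespace-only text nodes directly under the root element. The class name WhitespaceNodes and the method countWhitespaceTextChildren are made up for this illustration, and the two-space indentation in the string is an assumption about how the file is laid out.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration only.
public class WhitespaceNodes {
    // Same content as User.xml, inlined so the sketch is self-contained.
    // The indentation is an assumption about the file layout.
    static final String XML =
        "<User>\n"
      + "  <ID>101</ID>\n"
      + "  <BirthDate>1970-01-01+00:01</BirthDate>\n"
      + "  <Name>Frank Y. Ivy</Name>\n"
      + "  <Sex> Male</Sex>\n"
      + "</User>\n";

    // Counts the direct children of the root element that are
    // text nodes containing nothing but whitespace.
    static int countWhitespaceTextChildren(String xml) {
        try {
            Node root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
            NodeList children = root.getChildNodes();
            int count = 0;
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() == Node.TEXT_NODE
                        && child.getTextContent().trim().isEmpty()) {
                    count++;
                }
            }
            return count;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Whitespace-only text nodes under User: "
            + countWhitespaceTextChildren(XML));
    }
}
```

Each line break between two child elements becomes its own text node, so code that walks the children of User has to skip these nodes unless the parser is told to drop them.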
If you want the DOM XML parser to ignore this whitespace, you need to do two things:
1. Add a DTD (Document Type Definition) that defines the element structure, as shown in UserDTD.xml:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!-- Copyright (c) 2017 FYIcenter.com -->
<!DOCTYPE User [
<!ELEMENT User (ID, BirthDate, Name, Sex)>
<!ELEMENT ID (#PCDATA)>
<!ELEMENT BirthDate (#PCDATA)>
<!ELEMENT Name (#PCDATA)>
<!ELEMENT Sex (#PCDATA)>
]>
<User>
  <ID>101</ID>
  <BirthDate>1970-01-01+00:01</BirthDate>
  <Name>Frank Y. Ivy</Name>
  <Sex> Male</Sex>
</User>
2. Tell the parser to ignore whitespace by calling setIgnoringElementContentWhitespace(true) on the DocumentBuilderFactory, as shown in DomXmlParserWhitespace.java:
// Copyright (c) 2017 FYIcenter.com
import java.io.*;
import javax.xml.parsers.*;
import org.w3c.dom.*;

public class DomXmlParserWhitespace {
  // Dots used as an indentation prefix to show node depth
  static String dot = "............................................................";

  // Usage: java DomXmlParserWhitespace <xml-file> <true|false>
  public static void main(String[] args) throws Exception {
    DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
    // args[1] decides whether element content whitespace is dropped
    f.setIgnoringElementContentWhitespace(Boolean.parseBoolean(args[1]));
    DocumentBuilder b = f.newDocumentBuilder();
    Document d = b.parse(new File(args[0]));
    System.out.println("Implementation class:\n "+d.getClass().getName());
    System.out.println("DOM object elements and text contents:");
    Node n = d.getDocumentElement();
    printText(n, 1);
  }

  // Print the node name (and text, for text nodes), then recurse into children
  public static void printText(Node n, int l) {
    String v = "";
    if (n.getNodeType()==Node.TEXT_NODE) v = n.getTextContent();
    System.out.println(dot.substring(0,l)+n.getNodeName()+":"+v);
    NodeList c = n.getChildNodes();
    for (int i=0; i<c.getLength(); i++) {
      printText(c.item(i),l+1);
    }
  }
}
Compile and run the example program, DomXmlParserWhitespace.java, with setIgnoringElementContentWhitespace(false):
>\fyicenter\jdk-1.8.0\bin\javac DomXmlParserWhitespace.java

>\fyicenter\jdk-1.8.0\bin\java DomXmlParserWhitespace UserDTD.xml false
Implementation class:
 com.sun.org.apache.xerces.internal.dom.DeferredDocumentImpl
DOM object elements and text contents:
.User:
..#text:
..ID:
...#text:101
..#text:
..BirthDate:
...#text:1970-01-01+00:01
..#text:
..Name:
...#text:Frank Y. Ivy
..#text:
..Sex:
...#text: Male
..#text:
Run it again with setIgnoringElementContentWhitespace(true):
>\fyicenter\jdk-1.8.0\bin\java DomXmlParserWhitespace UserDTD.xml true
Implementation class:
 com.sun.org.apache.xerces.internal.dom.DeferredDocumentImpl
DOM object elements and text contents:
.User:
..ID:
...#text:101
..BirthDate:
...#text:1970-01-01+00:01
..Name:
...#text:Frank Y. Ivy
..Sex:
...#text: Male
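The output shows that the Apache Xerces parser bundled with the JDK drops the whitespace-only text nodes between the child elements of User, because the DTD declares that User contains only elements. Whitespace inside #PCDATA elements, such as the leading space in " Male", is kept.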
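The same comparison can be made without reading the printed tree. The following is a minimal sketch, not part of the original example, that counts the direct children of the root element for the document with and without the DTD. The class name WhitespaceFlagCheck, the constants DTD and BODY, and the method countRootChildren are invented for this illustration, and the XML is inlined as strings with an assumed two-space indentation.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration only.
public class WhitespaceFlagCheck {
    // Internal DTD subset with the same declarations as UserDTD.xml.
    static final String DTD =
        "<!DOCTYPE User [\n"
      + "  <!ELEMENT User (ID, BirthDate, Name, Sex)>\n"
      + "  <!ELEMENT ID (#PCDATA)>\n"
      + "  <!ELEMENT BirthDate (#PCDATA)>\n"
      + "  <!ELEMENT Name (#PCDATA)>\n"
      + "  <!ELEMENT Sex (#PCDATA)>\n"
      + "]>\n";

    // Document body shared by both variants (indentation is assumed).
    static final String BODY =
        "<User>\n"
      + "  <ID>101</ID>\n"
      + "  <BirthDate>1970-01-01+00:01</BirthDate>\n"
      + "  <Name>Frank Y. Ivy</Name>\n"
      + "  <Sex> Male</Sex>\n"
      + "</User>\n";

    // Parses the XML string and returns the number of direct
    // child nodes (elements and text nodes) of the root element.
    static int countRootChildren(String xml, boolean ignoreWhitespace) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setIgnoringElementContentWhitespace(ignoreWhitespace);
            Document d = f.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return d.getDocumentElement().getChildNodes().getLength();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("With DTD, flag=false: "
            + countRootChildren(DTD + BODY, false));
        System.out.println("With DTD, flag=true:  "
            + countRootChildren(DTD + BODY, true));
        System.out.println("No DTD,   flag=true:  "
            + countRootChildren(BODY, true));
    }
}
```

With the JDK's built-in parser, this should report 9, 4, and 9 children, in line with the listings above: 4 elements plus 5 whitespace text nodes when nothing is ignored, and only the 4 elements when the flag is set and a DTD is present. The last case suggests why both steps are needed: without a DTD the parser has no content model to tell it which whitespace is ignorable, so the flag alone changes nothing.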
2017-12-13