Basic Concepts
Understanding DomTrip's core concepts will help you use the library effectively. This guide covers the fundamental ideas behind DomTrip's design and how they differ from traditional XML libraries.
The Lossless Philosophy
Traditional XML libraries focus on data extraction: they parse XML to get the information you need, often discarding formatting details such as whitespace and quote style in the process. DomTrip takes a different approach and puts preservation first.
String xml = "<project><version>1.0</version></project>";
// DomTrip approach (preservation-focused)
Editor editor = new Editor(Document.of(xml));
Element root = editor.root();
Element version = root.child("version").orElseThrow(); // fail fast instead of risking a null dereference
String value = version.textContent();
String result = editor.toXml(); // Identical to original if unchanged
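For contrast, here is a minimal sketch of the "traditional" behaviour described above. It round-trips a small document through the JDK's built-in DOM parser and an identity `Transformer`, not through DomTrip. The class name `JdkRoundTripSketch` and the sample input are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;

// Illustration only: JDK DOM parser plus identity Transformer, not DomTrip.
final class JdkRoundTripSketch {
    static String roundTrip(String xml) {
        try {
            // org.w3c.dom.Document is the JDK type, unrelated to DomTrip's Document.
            org.w3c.dom.Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }

    public static void main(String[] args) {
        // Single quotes and extra spaces inside the tag are formatting details.
        String original = "<dependency   scope='test'></dependency>";
        String output = roundTrip(original);
        System.out.println("in:  " + original);
        System.out.println("out: " + output);
        System.out.println("identical: " + original.equals(output));
    }
}
```

With the JDK's default implementation, the attribute quotes and the spacing inside the tag are normally rewritten, so the output no longer matches the input character for character. A preservation-first library aims to avoid exactly this kind of drift.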
Node Hierarchy
DomTrip uses a clean, type-safe node hierarchy that reflects XML structure:
Node (abstract base)
├── ContainerNode (abstract)
│   ├── Document (root container)
│   └── Element (XML elements)
└── Leaf Nodes
    ├── Text (text content, CDATA)
    ├── Comment (XML comments)
    └── ProcessingInstruction (PIs)
Why This Design?
- Memory Efficiency: Leaf nodes don't waste memory on unused children collections
- Type Safety: Attempting to add children to a text node fails at compile time
- Clear API: Child management methods only exist where they make sense
// ✅ This works - Element can have children
Element parent = Element.of("parent");
parent.addNode(Text.of("content"));
// Text nodes cannot have children (compile-time safety)
Text text = Text.of("content");
// text.addNode(...); // Would not compile
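The design rationale above can be sketched with a small stand-alone hierarchy. The classes below are hypothetical stand-ins that mirror the shape of the tree; they are not DomTrip's classes, and escaping is omitted to keep the example short.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins for the hierarchy above; these are not DomTrip's classes.
final class HierarchySketch {
    abstract static class Node {
        abstract String toXml();
    }

    // Only containers declare child-management methods.
    abstract static class Container extends Node {
        private final List<Node> children = new ArrayList<>();

        Container addNode(Node child) {
            children.add(child);
            return this;
        }

        List<Node> children() {
            return Collections.unmodifiableList(children);
        }
    }

    static final class Element extends Container {
        private final String name;

        Element(String name) {
            this.name = name;
        }

        @Override
        String toXml() {
            StringBuilder sb = new StringBuilder("<").append(name).append('>');
            for (Node child : children()) {
                sb.append(child.toXml());
            }
            return sb.append("</").append(name).append('>').toString();
        }
    }

    // A leaf holds no child list at all, so there is nothing to misuse.
    static final class Text extends Node {
        private final String content;

        Text(String content) {
            this.content = content;
        }

        @Override
        String toXml() {
            return content; // escaping omitted for brevity
        }
    }

    public static void main(String[] args) {
        Element parent = new Element("parent");
        parent.addNode(new Text("content"));
        System.out.println(parent.toXml()); // prints <parent>content</parent>
        // new Text("x").addNode(new Text("y")); // rejected by the compiler
    }
}
```

Because `addNode` is declared on the container type only, the leaf class neither allocates a child list nor exposes a method that would have to throw at runtime.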
Modification Tracking
Every node tracks whether it has been modified since parsing. This enables minimal-change serialization:
// Unmodified nodes use original formatting
Element unchanged = doc.root().child("groupId").orElseThrow();
Assertions.assertFalse(unchanged.isModified()); // false
// Modified nodes are rebuilt with inferred formatting
Element changed = doc.root().child("version").orElseThrow();
editor.setTextContent(changed, "2.0.0");
Assertions.assertTrue(changed.isModified()); // true
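One way to picture minimal-change serialization is a node that keeps the exact source text it was parsed from, plus a dirty flag. The sketch below is an assumption about the general technique, not DomTrip's implementation. `TrackedElement` and its members are hypothetical, and the rebuilt form uses plain default formatting where DomTrip would infer it.

```java
// Hypothetical sketch of minimal-change serialization; not DomTrip's implementation.
final class ModificationSketch {
    static final class TrackedElement {
        private final String name;
        private final String originalSource; // exact characters as parsed
        private String text;
        private boolean modified;

        TrackedElement(String name, String text, String originalSource) {
            this.name = name;
            this.text = text;
            this.originalSource = originalSource;
        }

        boolean isModified() {
            return modified;
        }

        void setText(String newText) {
            this.text = newText;
            this.modified = true; // from now on the original text is stale
        }

        String toXml() {
            // Unmodified nodes replay their source; modified nodes are rebuilt.
            return modified
                    ? "<" + name + ">" + text + "</" + name + ">"
                    : originalSource;
        }
    }

    public static void main(String[] args) {
        TrackedElement groupId =
                new TrackedElement("groupId", "com.example", "<groupId >com.example</groupId >");
        TrackedElement version =
                new TrackedElement("version", "1.0", "<version >1.0</version >");
        version.setText("2.0.0");

        System.out.println(groupId.toXml()); // prints <groupId >com.example</groupId >
        System.out.println(version.toXml()); // prints <version>2.0.0</version>
    }
}
```

Only the element that was touched loses its original spacing. Everything else is emitted byte for byte.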
Dual Content Storage
Text nodes store content in two forms:
- Decoded Content: For your application logic
- Raw Content: For preservation during serialization
// Original XML: <message>Hello &amp; goodbye</message>
String xml = "<message>Hello &amp; goodbye</message>";
Document doc = Document.of(xml);
Element element = doc.root();
// For your code - entities are decoded
String decoded = element.textContent(); // "Hello & goodbye"
// For serialization - entities are preserved in XML output
String result = doc.toXml(); // Contains "Hello &amp; goodbye"
This allows you to work with normal strings while preserving entity encoding.
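The two-form idea can be illustrated with a small text node that stores both strings side by side. This is a sketch under stated assumptions: `TextNode`, `decode`, and `encode` are hypothetical names, and only the five predefined XML entities are handled (no numeric character references).

```java
// Hypothetical sketch of dual content storage; not DomTrip's implementation.
final class DualContentSketch {
    static final class TextNode {
        private String raw;     // exactly as written in the source, entities intact
        private String decoded; // what application code reads

        private TextNode(String raw, String decoded) {
            this.raw = raw;
            this.decoded = decoded;
        }

        static TextNode fromRaw(String raw) {
            return new TextNode(raw, decode(raw));
        }

        String content() {
            return decoded;
        }

        String toXml() {
            return raw;
        }

        void setContent(String value) {
            this.decoded = value;
            this.raw = encode(value); // new content gets freshly escaped
        }
    }

    static String decode(String raw) {
        return raw.replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&apos;", "'")
                .replace("&amp;", "&"); // last, so "&amp;lt;" decodes to "&lt;"
    }

    static String encode(String text) {
        return text.replace("&", "&amp;") // first, so later escapes are not re-escaped
                .replace("<", "&lt;")
                .replace(">", "&gt;");
    }

    public static void main(String[] args) {
        TextNode node = TextNode.fromRaw("Hello &amp; goodbye");
        System.out.println(node.content()); // prints Hello & goodbye
        System.out.println(node.toXml());   // prints Hello &amp; goodbye
    }
}
```

Reads never touch the raw form, so an untouched node serializes with whatever entity spelling the author originally used.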
Attribute Handling
Attributes are first-class objects that preserve formatting details:
String xml = "<dependency scope='test'></dependency>";
Editor editor = new Editor(Document.of(xml));
Element element = editor.root();
// Access attribute as object for detailed information
Attribute scope = element.attributeObject("scope");
String value = scope.value(); // "test"
QuoteStyle quoteStyle = scope.quoteStyle(); // QuoteStyle.SINGLE
String whitespace = scope.precedingWhitespace(); // Whitespace before attribute
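To see why storing quote style and preceding whitespace matters, consider a minimal attribute model that can reproduce its own source text. The names below (`Attr`, `Quote`, `withValue`) are hypothetical and only illustrate the idea; they are not DomTrip's `Attribute` API.

```java
// Hypothetical sketch of a formatting-preserving attribute; not DomTrip's Attribute class.
final class AttributeSketch {
    enum Quote {
        SINGLE('\''),
        DOUBLE('"');

        final char ch;

        Quote(char ch) {
            this.ch = ch;
        }
    }

    static final class Attr {
        final String name;
        final String value;
        final Quote quote;
        final String precedingWhitespace;

        Attr(String name, String value, Quote quote, String precedingWhitespace) {
            this.name = name;
            this.value = value;
            this.quote = quote;
            this.precedingWhitespace = precedingWhitespace;
        }

        // Changing the value keeps the original quote style and spacing.
        Attr withValue(String newValue) {
            return new Attr(name, newValue, quote, precedingWhitespace);
        }

        String toXml() {
            return precedingWhitespace + name + "=" + quote.ch + value + quote.ch;
        }
    }

    public static void main(String[] args) {
        Attr scope = new Attr("scope", "test", Quote.SINGLE, " ");
        System.out.println("<dependency" + scope.toXml() + ">");
        // prints <dependency scope='test'>
        System.out.println("<dependency" + scope.withValue("compile").toXml() + ">");
        // prints <dependency scope='compile'>
    }
}
```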
Whitespace Management
DomTrip tracks whitespace at multiple levels:
1. Node-Level Whitespace
public abstract class Node {
    protected String precedingWhitespace; // Before the node
    // Note: followingWhitespace has been removed - whitespace is now stored
    // as precedingWhitespace of the next node for a cleaner model
}
2. Element-Level Whitespace
public class Element extends ContainerNode {
    private String openTagWhitespace; // Inside opening tag: <element >
    private String closeTagWhitespace; // Inside closing tag: </ element>
}
3. Intelligent Inference
For new content, DomTrip infers formatting from surrounding context:
// Existing structure with indentation
String xml = """
    <dependencies>
        <dependency>existing</dependency>
    </dependencies>
    """;
Editor editor = new Editor(Document.of(xml));
Element dependencies = editor.root();
// Adding new dependency automatically infers indentation
Element newDep = editor.addElement(dependencies, "dependency");
editor.setTextContent(newDep, "new");
String result = editor.toXml();
// Result uses same indentation as existing dependencies
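A simple inference rule is to copy the whitespace that precedes the last existing sibling. The sketch below models just that rule, assuming each child stores its own preceding whitespace as described in the node-level section. `Parent`, `Child`, `addParsed`, and `addInferred` are hypothetical names, and DomTrip's actual heuristics may be more elaborate.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of indentation inference; not DomTrip's implementation.
final class IndentInferenceSketch {
    static final class Child {
        final String precedingWhitespace; // whitespace between the previous node and this one
        final String name;
        final String text;

        Child(String precedingWhitespace, String name, String text) {
            this.precedingWhitespace = precedingWhitespace;
            this.name = name;
            this.text = text;
        }

        String toXml() {
            return precedingWhitespace + "<" + name + ">" + text + "</" + name + ">";
        }
    }

    static final class Parent {
        private final String name;
        private final String whitespaceBeforeClose;
        private final List<Child> children = new ArrayList<>();

        Parent(String name, String whitespaceBeforeClose) {
            this.name = name;
            this.whitespaceBeforeClose = whitespaceBeforeClose;
        }

        // Parsed content: the whitespace comes straight from the source text.
        void addParsed(String precedingWhitespace, String childName, String text) {
            children.add(new Child(precedingWhitespace, childName, text));
        }

        // New content: reuse the last sibling's whitespace.
        void addInferred(String childName, String text) {
            String inferred = children.isEmpty()
                    ? "\n  " // arbitrary fallback chosen for this sketch
                    : children.get(children.size() - 1).precedingWhitespace;
            children.add(new Child(inferred, childName, text));
        }

        String toXml() {
            StringBuilder sb = new StringBuilder("<").append(name).append('>');
            for (Child child : children) {
                sb.append(child.toXml());
            }
            return sb.append(whitespaceBeforeClose)
                    .append("</").append(name).append('>')
                    .toString();
        }
    }

    public static void main(String[] args) {
        Parent dependencies = new Parent("dependencies", "\n");
        dependencies.addParsed("\n    ", "dependency", "existing");
        dependencies.addInferred("dependency", "new");
        System.out.println(dependencies.toXml()); // both children are indented by four spaces
    }
}
```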
Configuration System
DomTrip behavior is controlled through DomTripConfig:
// Preset configurations
DomTripConfig defaults = DomTripConfig.defaults(); // Maximum preservation
DomTripConfig pretty = DomTripConfig.prettyPrint(); // Clean output
DomTripConfig minimal = DomTripConfig.minimal(); // Compact output
// Custom configuration
DomTripConfig custom = DomTripConfig.defaults()
    .withIndentString("  ") // 2 spaces
    .withCommentPreservation(true) // Keep comments
    .withDefaultQuoteStyle(QuoteStyle.DOUBLE); // Prefer double quotes
Navigation Patterns
DomTrip provides multiple ways to navigate XML structures:
1. Traditional Navigation
Element root = editor.getDocumentElement();
Element child = root.getChild("child-name");
List<Element> children = root.getChildren("item");
2. Optional-Based Navigation
String xml = "<root><child>value</child></root>";
Editor editor = new Editor(Document.of(xml));
Element root = editor.root();
Optional<Element> child = root.child("child");
child.ifPresent(element -> {
    // Safe navigation - no null checks needed
    String value = element.textContent();
    Assertions.assertEquals("value", value);
});
3. Stream-Based Navigation
String xml = createConfigXml();
Document doc = Document.of(xml);
Editor editor = new Editor(doc);
// Stream-based navigation
editor.root()
    .descendants()
    .filter(e -> e.name().equals("port"))
    .findFirst()
    .ifPresent(port -> System.out.println("Port: " + port.textContent()));
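For readers curious how a `descendants()` stream might be produced, here is a minimal sketch on a hypothetical `Elem` class. It assumes a lazy depth-first traversal in document order that excludes the starting element; whether DomTrip's own `descendants()` follows the same order is not confirmed by this guide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical sketch of a descendants() stream; not DomTrip's implementation.
final class DescendantsSketch {
    static final class Elem {
        final String name;
        final String text;
        final List<Elem> children = new ArrayList<>();

        Elem(String name, String text) {
            this.name = name;
            this.text = text;
        }

        Elem add(Elem child) {
            children.add(child);
            return this;
        }

        // Depth-first, document order; the element itself is not included.
        Stream<Elem> descendants() {
            return children.stream()
                    .flatMap(child -> Stream.concat(Stream.of(child), child.descendants()));
        }
    }

    public static void main(String[] args) {
        Elem config = new Elem("config", "")
                .add(new Elem("server", "")
                        .add(new Elem("host", "localhost"))
                        .add(new Elem("port", "8080")))
                .add(new Elem("port", "9090"));

        config.descendants()
                .filter(e -> e.name.equals("port"))
                .findFirst()
                .ifPresent(port -> System.out.println("Port: " + port.text)); // prints Port: 8080
    }
}
```

The nested `port` wins over the later top-level one because the traversal finishes each subtree before moving to the next sibling.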
4. Namespace-Aware Navigation
String xml = """
    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:custom="http://example.com/custom">
        <groupId>com.example</groupId>
        <custom:metadata>
            <custom:author>John Doe</custom:author>
        </custom:metadata>
    </project>
    """;
Document doc = Document.of(xml);
Editor editor = new Editor(doc);
Element root = doc.root();
// Find elements by qualified name (prefix:localName)
Element metadata = root.child("custom:metadata").orElseThrow();
Element author = metadata.child("custom:author").orElseThrow();
String authorName = author.textContent(); // "John Doe"
Error Handling
DomTrip uses specific exception types for better error handling:
String xml = "<root><child>value</child></root>";
// try-with-resources closes both streams even if parsing or writing fails
try (InputStream inputStream = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        ByteArrayOutputStream outputStream = new ByteArrayOutputStream()) {
    Document doc = Document.of(inputStream);
    doc.toXml(outputStream);
} catch (DomTripException e) {
    if (e.getCause() instanceof IOException) {
        // Handle I/O errors wrapped by DomTrip
        System.err.println("I/O error: " + e.getMessage());
    } else {
        // Handle parsing/encoding errors
        System.err.println("XML error: " + e.getMessage());
    }
} catch (IOException e) {
    // Closing a stream can itself fail
    System.err.println("I/O error while closing streams: " + e.getMessage());
}
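The cause check in the example above relies on a common pattern: a library wraps a checked `IOException` in its own unchecked exception and keeps the original as the cause. The sketch below shows that pattern in isolation. `XmlException`, `readAll`, `describe`, and `failingStream` are hypothetical names, not part of DomTrip.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of cause-preserving exception wrapping; not DomTrip's classes.
final class ErrorWrappingSketch {
    static final class XmlException extends RuntimeException {
        XmlException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Reads the whole stream, converting checked I/O failures into XmlException.
    static String readAll(InputStream in) {
        try (InputStream stream = in) {
            return new String(stream.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new XmlException("Failed to read XML input", e);
        }
    }

    // Callers can still tell I/O problems apart by inspecting the cause.
    static String describe(XmlException e) {
        return e.getCause() instanceof IOException ? "I/O error" : "XML error";
    }

    // A stream that always fails, used to trigger the I/O path.
    static InputStream failingStream() {
        return new InputStream() {
            @Override
            public int read() throws IOException {
                throw new IOException("disk unavailable");
            }
        };
    }

    public static void main(String[] args) {
        try {
            readAll(failingStream());
        } catch (XmlException e) {
            System.out.println(describe(e) + ": " + e.getCause().getMessage());
            // prints I/O error: disk unavailable
        }
    }
}
```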
Next Steps
Now that you understand the core concepts, explore specific features:
- 🔄 Lossless Parsing - Deep dive into preservation
- 📝 Formatting Preservation - How formatting is maintained
- 🌐 Namespace Support - Working with XML namespaces
- 🏗️ Builder Patterns - Creating complex XML structures