Basic Concepts
Understanding DomTrip's core concepts will help you use the library effectively. This guide covers the fundamental ideas behind DomTrip's design and how they differ from traditional XML libraries.
The Lossless Philosophy
Traditional XML libraries focus on data extraction: they parse XML to get the information you need and often discard formatting details in the process. DomTrip takes a different approach and puts preservation first.
String xml = "<project><version>1.0</version></project>";
// DomTrip approach (preservation-focused)
Editor editor = new Editor(Document.of(xml));
Element root = editor.root();
Element version = root.child("version").orElseThrow(); // fail fast instead of risking a null dereference
String value = version.textContent();
String result = editor.toXml(); // Identical to original if unchanged
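For contrast, here is a minimal sketch of the "traditional" behavior using only the JDK's built-in DOM parser and `Transformer` (not DomTrip). The class name `JdkRoundTripSketch` and the sample input are illustrative assumptions. On typical JDKs the data survives the round trip, but details such as the single quotes and the extra spaces inside the tag are normalized away.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;

// Hypothetical helper, JDK classes only; it does not use DomTrip.
class JdkRoundTripSketch {

    // Parses the input with the JDK DOM parser and serializes it straight back.
    static String roundTrip(String xml) {
        try {
            org.w3c.dom.Document dom = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(dom), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<project><version  scope='test' >1.0</version></project>";
        String result = roundTrip(xml);
        System.out.println(result);
        System.out.println("Identical to input: " + result.equals(xml));
    }
}
```

The element names, attribute value, and text are all still there, so nothing is lost from a data-extraction point of view. Only the formatting differs, and that difference is what a preservation-first library is meant to avoid.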
Node Hierarchy
DomTrip uses a clean, type-safe node hierarchy that reflects XML structure:
Node (abstract base)
├── ContainerNode (abstract)
│   ├── Document (root container)
│   └── Element (XML elements)
└── Leaf Nodes
    ├── Text (text content, CDATA)
    ├── Comment (XML comments)
    └── ProcessingInstruction (PIs)
Why This Design?
- Memory Efficiency: Leaf nodes don't waste memory on unused children collections
- Type Safety: Adding children to a text node is rejected at compile time
- Clear API: Child management methods only exist where they make sense
// ✅ This works - Element can have children
Element parent = Element.of("parent");
parent.addNode(Text.of("content"));
// Text nodes cannot have children (compile-time safety)
Text text = Text.of("content");
// text.addNode(...); // Would not compile
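The design points above can be sketched with a few toy classes. This is an illustration of the general pattern, not DomTrip's source: every class name here (`SketchNode`, `SketchContainer`, `SketchElement`, `SketchText`) is hypothetical, and text is written out without escaping to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base type: every node can serialize itself.
abstract class SketchNode {
    abstract String toXml();
}

// Only container nodes carry a child list and expose addNode.
abstract class SketchContainer extends SketchNode {
    protected final List<SketchNode> children = new ArrayList<>();

    SketchContainer addNode(SketchNode child) {
        children.add(child);
        return this;
    }
}

final class SketchElement extends SketchContainer {
    private final String name;

    SketchElement(String name) {
        this.name = name;
    }

    @Override
    String toXml() {
        StringBuilder sb = new StringBuilder("<" + name + ">");
        for (SketchNode child : children) {
            sb.append(child.toXml());
        }
        return sb.append("</").append(name).append(">").toString();
    }
}

// A leaf: no child list is allocated and there is no addNode to call.
final class SketchText extends SketchNode {
    private final String content;

    SketchText(String content) {
        this.content = content;
    }

    @Override
    String toXml() {
        return content; // no escaping in this sketch
    }
}

class HierarchySketch {
    public static void main(String[] args) {
        SketchNode parent = new SketchElement("parent").addNode(new SketchText("content"));
        System.out.println(parent.toXml());
        // new SketchText("x").addNode(...); // would not compile: no such method
    }
}
```

Because `addNode` lives on the container type only, misuse is caught by the compiler rather than by a runtime check.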
Modification Tracking
Every node tracks whether it has been modified since parsing. This enables minimal-change serialization:
// Unmodified nodes use original formatting
Element unchanged = doc.root().child("groupId").orElseThrow();
Assertions.assertFalse(unchanged.isModified()); // false
// Modified nodes are rebuilt with inferred formatting
Element changed = doc.root().child("version").orElseThrow();
editor.setTextContent(changed, "2.0.0");
Assertions.assertTrue(changed.isModified()); // true
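One way to picture minimal-change serialization is a node that keeps the exact source text it was parsed from, plus a flag. The sketch below is an assumption about how such tracking could work, not DomTrip's actual implementation; `TrackedElementSketch` and its fields are hypothetical names.

```java
// Hypothetical node that remembers its original source text.
class TrackedElementSketch {
    private final String name;
    private final String originalSource; // exact text captured at parse time
    private String text;
    private boolean modified;

    TrackedElementSketch(String name, String text, String originalSource) {
        this.name = name;
        this.text = text;
        this.originalSource = originalSource;
    }

    boolean isModified() {
        return modified;
    }

    void setText(String newText) {
        this.text = newText;
        this.modified = true; // any edit invalidates the cached source text
    }

    String toXml() {
        // Unmodified: replay the original bytes. Modified: rebuild the markup.
        return modified ? "<" + name + ">" + text + "</" + name + ">" : originalSource;
    }

    public static void main(String[] args) {
        TrackedElementSketch groupId =
                new TrackedElementSketch("groupId", "com.example", "<groupId >com.example</groupId >");
        System.out.println(groupId.toXml()); // original spacing is replayed

        TrackedElementSketch version =
                new TrackedElementSketch("version", "1.0.0", "<version >1.0.0</version >");
        version.setText("2.0.0");
        System.out.println(version.toXml()); // rebuilt from name and text
    }
}
```

The rebuilt form here uses a fixed layout; a real implementation would also have to choose formatting for the rebuilt node, which is where inference from surrounding context comes in.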
Dual Content Storage
Text nodes store content in two forms:
- Decoded Content: For your application logic
- Raw Content: For preservation during serialization
// Original XML: <message>Hello &amp; goodbye</message>
String xml = "<message>Hello &amp; goodbye</message>";
Document doc = Document.of(xml);
Element element = doc.root();
// For your code - entities are decoded
String decoded = element.textContent(); // "Hello & goodbye"
// For serialization - entities are preserved in XML output
String result = doc.toXml(); // Contains "Hello &amp; goodbye"
This allows you to work with normal strings while preserving entity encoding.
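The two-form idea can be sketched with a small holder that keeps the raw text as written in the XML (for example with an `&amp;` entity) next to its decoded value. `DualTextSketch` is a hypothetical class, and the decoder only handles the five predefined XML entities; it is a sketch of the concept rather than DomTrip's code.

```java
// Hypothetical text holder with a raw and a decoded form.
class DualTextSketch {
    private final String raw;     // exactly as written in the source XML
    private final String decoded; // what application code should see

    DualTextSketch(String raw) {
        this.raw = raw;
        this.decoded = decode(raw);
    }

    // Decodes only the five predefined XML entities.
    static String decode(String raw) {
        // &amp; is replaced last so that "&amp;lt;" decodes to "&lt;" and not "<".
        return raw.replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&apos;", "'")
                .replace("&amp;", "&");
    }

    String textContent() {
        return decoded;
    }

    String toXml() {
        return raw;
    }

    public static void main(String[] args) {
        DualTextSketch text = new DualTextSketch("Hello &amp; goodbye");
        System.out.println(text.textContent());
        System.out.println(text.toXml());
    }
}
```

Keeping the raw form means the serializer never has to guess how an entity was originally written.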
Attribute Handling
Attributes are first-class objects that preserve formatting details:
String xml = "<dependency scope='test'></dependency>";
Editor editor = new Editor(Document.of(xml));
Element element = editor.root();
// Access attribute as object for detailed information
Attribute scope = element.attributeObject("scope");
String value = scope.value(); // "test"
QuoteStyle quoteStyle = scope.quoteStyle(); // QuoteStyle.SINGLE
String whitespace = scope.precedingWhitespace(); // Whitespace before attribute
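To make "first-class" concrete, the following sketch models an attribute that remembers its quote character and the whitespace in front of it, so it can be written back exactly as it was read. `AttributeSketch` and `withValue` are hypothetical names, and the value is not escaped here; DomTrip's real `Attribute` class may differ.

```java
// Hypothetical immutable attribute that carries its own formatting.
final class AttributeSketch {
    private final String name;
    private final String value;
    private final char quote;                 // '\'' or '"'
    private final String precedingWhitespace; // text between the previous token and the name

    AttributeSketch(String name, String value, char quote, String precedingWhitespace) {
        this.name = name;
        this.value = value;
        this.quote = quote;
        this.precedingWhitespace = precedingWhitespace;
    }

    // Changing the value keeps the original quote style and spacing.
    AttributeSketch withValue(String newValue) {
        return new AttributeSketch(name, newValue, quote, precedingWhitespace);
    }

    String value() {
        return value;
    }

    String toXml() {
        return precedingWhitespace + name + "=" + quote + value + quote;
    }

    public static void main(String[] args) {
        AttributeSketch scope = new AttributeSketch("scope", "test", '\'', " ");
        System.out.println("<dependency" + scope.toXml() + ">");
        System.out.println("<dependency" + scope.withValue("compile").toXml() + ">");
    }
}
```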
Whitespace Management
DomTrip tracks whitespace at multiple levels:
1. Node-Level Whitespace
public abstract class Node {
    protected String precedingWhitespace; // Before the node
    // Note: followingWhitespace has been removed - whitespace is now stored
    // as precedingWhitespace of the next node for a cleaner model
}
2. Element-Level Whitespace
public class Element extends ContainerNode {
    private String openTagWhitespace;  // Inside opening tag: <element >
    private String closeTagWhitespace; // Inside closing tag: </ element>
}
3. Intelligent Inference
For new content, DomTrip infers formatting from surrounding context:
Document doc = Document.of(
        """
        <project>
            <groupId>com.example</groupId>
            <artifactId>my-app</artifactId>
        </project>
        """);
Editor editor = new Editor(doc);
Element artifactId = doc.root().child("artifactId").orElseThrow();
// Insert version between groupId and artifactId
// Whitespace is automatically inferred from surrounding elements
Element version = editor.insertElementBefore(artifactId, "version");
editor.setTextContent(version, "1.0.0");
String result = editor.toXml();
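A much simplified picture of the inference step: look at the whitespace in front of the sibling and reuse it for the new element. The sketch below works on plain strings, assumes `\n` line endings and elements without attributes, and uses the hypothetical names `IndentInferenceSketch`, `leadingWhitespace`, and `insertBefore`. DomTrip's own inference works on its node model and is likely more thorough.

```java
// Hypothetical string-level illustration of indentation inference.
class IndentInferenceSketch {

    // Returns the run of spaces and tabs that ends right before the given index.
    static String leadingWhitespace(String xml, int index) {
        int start = index;
        while (start > 0 && (xml.charAt(start - 1) == ' ' || xml.charAt(start - 1) == '\t')) {
            start--;
        }
        return xml.substring(start, index);
    }

    // Inserts newElementXml on its own line before <siblingTag>, copying the sibling's indent.
    static String insertBefore(String xml, String siblingTag, String newElementXml) {
        int index = xml.indexOf("<" + siblingTag + ">");
        if (index < 0) {
            throw new IllegalArgumentException("No <" + siblingTag + "> element found");
        }
        String indent = leadingWhitespace(xml, index);
        // The new element takes over the sibling's position; the sibling moves
        // to a new line that starts with the same indent.
        return xml.substring(0, index) + newElementXml + "\n" + indent + xml.substring(index);
    }

    public static void main(String[] args) {
        String xml = "<project>\n"
                + "    <groupId>com.example</groupId>\n"
                + "    <artifactId>my-app</artifactId>\n"
                + "</project>\n";
        System.out.print(insertBefore(xml, "artifactId", "<version>1.0.0</version>"));
    }
}
```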
Configuration System
DomTrip behavior is controlled through DomTripConfig:
// Preset configurations
DomTripConfig defaults = DomTripConfig.defaults(); // Maximum preservation
DomTripConfig pretty = DomTripConfig.prettyPrint(); // Clean output
DomTripConfig minimal = DomTripConfig.minimal(); // Compact output
// Custom configuration
DomTripConfig custom = DomTripConfig.defaults()
        .withIndentString("  ") // 2 spaces
        .withWhitespacePreservation(true) // Keep original whitespace
        .withCommentPreservation(true) // Keep comments
        .withDefaultQuoteStyle(QuoteStyle.DOUBLE); // Prefer double quotes
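Chained `with...` calls like these are commonly implemented as an immutable object whose setters return modified copies. Whether DomTripConfig works this way internally is an assumption; the sketch below, with the hypothetical class `ConfigSketch` and only two settings, shows the general pattern.

```java
// Hypothetical immutable configuration with copy-on-write "with" methods.
final class ConfigSketch {
    private final String indent;
    private final boolean preserveWhitespace;

    private ConfigSketch(String indent, boolean preserveWhitespace) {
        this.indent = indent;
        this.preserveWhitespace = preserveWhitespace;
    }

    static ConfigSketch defaults() {
        return new ConfigSketch("    ", true);
    }

    // Each "with" method returns a new instance and leaves the receiver untouched.
    ConfigSketch withIndent(String newIndent) {
        return new ConfigSketch(newIndent, preserveWhitespace);
    }

    ConfigSketch withPreserveWhitespace(boolean newValue) {
        return new ConfigSketch(indent, newValue);
    }

    String indent() {
        return indent;
    }

    boolean preserveWhitespace() {
        return preserveWhitespace;
    }

    public static void main(String[] args) {
        ConfigSketch base = ConfigSketch.defaults();
        ConfigSketch custom = base.withIndent("  ").withPreserveWhitespace(false);
        System.out.println("[" + base.indent() + "] " + base.preserveWhitespace());
        System.out.println("[" + custom.indent() + "] " + custom.preserveWhitespace());
    }
}
```

With this pattern a preset can be shared freely, because deriving a custom configuration never changes the preset itself.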
Navigation Patterns
DomTrip provides multiple ways to navigate XML structures:
1. Traditional Navigation
Element root = editor.getDocumentElement();
Element child = root.getChild("child-name");
List<Element> children = root.getChildren("item");
2. Optional-Based Navigation
String xml = "<root><child>value</child></root>";
Editor editor = new Editor(Document.of(xml));
Element root = editor.root();
Optional<Element> child = root.child("child");
child.ifPresent(element -> {
    // Safe navigation - no null checks needed
    String value = element.textContent();
    Assertions.assertEquals("value", value);
});
3. Stream-Based Navigation
String xml = createConfigXml();
Document doc = Document.of(xml);
Editor editor = new Editor(doc);
// Stream-based navigation
editor.root()
        .descendants()
        .filter(e -> e.name().equals("port"))
        .findFirst()
        .ifPresent(port -> System.out.println("Port: " + port.textContent()));
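A `descendants()`-style stream can be built recursively on top of a plain child list. The sketch below uses a hypothetical `TreeNodeSketch` class and assumes depth-first document order that excludes the starting node; DomTrip's actual traversal order and semantics are not confirmed here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical tree node with a recursive descendants stream.
final class TreeNodeSketch {
    final String name;
    final String text;
    final List<TreeNodeSketch> children = new ArrayList<>();

    TreeNodeSketch(String name, String text) {
        this.name = name;
        this.text = text;
    }

    TreeNodeSketch add(TreeNodeSketch child) {
        children.add(child);
        return this;
    }

    // Depth-first, in document order; the node itself is not included.
    Stream<TreeNodeSketch> descendants() {
        return children.stream()
                .flatMap(child -> Stream.concat(Stream.of(child), child.descendants()));
    }

    public static void main(String[] args) {
        TreeNodeSketch config = new TreeNodeSketch("config", "")
                .add(new TreeNodeSketch("server", "")
                        .add(new TreeNodeSketch("host", "localhost"))
                        .add(new TreeNodeSketch("port", "8080")));

        config.descendants()
                .filter(node -> node.name.equals("port"))
                .findFirst()
                .ifPresent(port -> System.out.println("Port: " + port.text));
    }
}
```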
4. Namespace-Aware Navigation
String xml =
        """
        <project xmlns="http://maven.apache.org/POM/4.0.0"
                 xmlns:custom="http://example.com/custom">
            <groupId>com.example</groupId>
            <custom:metadata>
                <custom:author>John Doe</custom:author>
            </custom:metadata>
        </project>
        """;
Document doc = Document.of(xml);
Editor editor = new Editor(doc);
Element root = doc.root();
// Find elements by qualified name (prefix:localName)
Element metadata = root.child("custom:metadata").orElseThrow();
Element author = metadata.child("custom:author").orElseThrow();
String authorName = author.textContent(); // "John Doe"
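A qualified name such as `custom:author` only has meaning through the namespace URI its prefix is bound to. As a point of comparison, the sketch below uses the JDK's namespace-aware DOM parser (not DomTrip) to look up an element by URI and local name, which works regardless of the prefix the document chose. The helper class `NamespaceLookupSketch` and the method `firstText` are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper built on the JDK DOM API; it does not use DomTrip.
class NamespaceLookupSketch {

    // Returns the text of the first element matching the URI and local name, or null.
    static String firstText(String xml, String namespaceUri, String localName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default in the JDK
            org.w3c.dom.Document dom = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList matches = dom.getElementsByTagNameNS(namespaceUri, localName);
            return matches.getLength() == 0 ? null : matches.item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Lookup failed", e);
        }
    }

    public static void main(String[] args) {
        // The prefix "c" is bound to the same URI that "custom" is bound to above.
        String xml = "<project xmlns=\"http://maven.apache.org/POM/4.0.0\""
                + " xmlns:c=\"http://example.com/custom\">"
                + "<groupId>com.example</groupId>"
                + "<c:metadata><c:author>John Doe</c:author></c:metadata>"
                + "</project>";
        System.out.println(firstText(xml, "http://example.com/custom", "author"));
    }
}
```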
Error Handling
DomTrip uses specific exception types for better error handling:
try {
    // Attempt to parse malformed XML
    String malformedXml = "<root><unclosed>";
    Document doc = Document.of(malformedXml);
    // This won't be reached due to parsing error
    Editor editor = new Editor(doc);
} catch (Exception e) {
    // Handle parsing errors gracefully
    System.err.println("XML parsing failed: " + e.getMessage());
    // Provide fallback or user-friendly error message
    System.out.println("Please check your XML syntax and try again.");
}
// Safe navigation with Optional
String xml = createConfigXml();
Document doc = Document.of(xml);
Editor editor = new Editor(doc);
editor.root()
        .descendant("nonexistent")
        .ifPresentOrElse(
                element -> System.out.println("Found: " + element.name()),
                () -> System.out.println("Element not found - using default behavior"));
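The two ideas in this section, catching a specific exception type and returning an `Optional` instead of failing, can be combined in one helper. The sketch below uses the JDK parser and its `SAXException` as a stand-in, since the exact DomTrip exception class is not named here; `SafeParseSketch` and `tryParse` are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper built on the JDK parser; it does not use DomTrip.
class SafeParseSketch {

    // Returns the parsed document, or an empty Optional when the XML is malformed.
    static Optional<org.w3c.dom.Document> tryParse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Replaces the default handler that prints to stderr; fatal errors still throw.
            builder.setErrorHandler(new DefaultHandler());
            return Optional.of(builder.parse(new InputSource(new StringReader(xml))));
        } catch (SAXException e) {
            // Malformed input is an expected condition, so report it as "no document".
            return Optional.empty();
        } catch (ParserConfigurationException | IOException e) {
            // Environment problems are not the caller's fault; surface them loudly.
            throw new IllegalStateException("Parser unavailable", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParse("<root><unclosed>").isPresent());
        System.out.println(tryParse("<root><closed/></root>").isPresent());
    }
}
```

Catching only the parse-related exception keeps unrelated failures, such as a misconfigured parser, from being silently treated as bad input.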
Next Steps
Now that you understand the core concepts, explore specific features:
- 🔄 Lossless Parsing - Deep dive into preservation
- 📝 Formatting Preservation - How formatting is maintained
- 🌐 Namespace Support - Working with XML namespaces
- 🏗️ Builder Patterns - Creating complex XML structures