Basic Concepts

Understanding DomTrip's core concepts will help you use the library effectively. This guide covers the fundamental ideas behind DomTrip's design and how they differ from traditional XML libraries.

The Lossless Philosophy

Traditional XML libraries focus on data extraction: they parse XML to get the information you need and often discard formatting details in the process. DomTrip takes a different approach: preservation first.

String xml = "<project><version>1.0</version></project>";

// DomTrip approach (preservation-focused)
Editor editor = new Editor(Document.of(xml));
Element root = editor.root();
Element version = root.child("version").orElseThrow(); // fail fast instead of risking a null dereference
String value = version.textContent();
String result = editor.toXml(); // Identical to original if unchanged
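
For contrast, the following is a minimal sketch of the extraction-focused approach using only the JDK's built-in DOM parser and Transformer (not DomTrip). The class name JdkRoundTripDemo and the sample input are made up for illustration. With the JDK's default implementation, the serialized result usually differs from the input, for example in attribute quoting and spacing, even though no data was changed.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Illustrative only: uses the JDK XML APIs, not DomTrip.
final class JdkRoundTripDemo {

    /** Parses the XML with the JDK DOM parser and serializes it straight back. */
    static String roundTrip(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }

    public static void main(String[] args) {
        // Extra spaces and single quotes are formatting details, not data
        String original = "<dependency   scope='test'></dependency>";
        String result = roundTrip(original);
        System.out.println("original:  " + original);
        System.out.println("result:    " + result);
        System.out.println("identical: " + original.equals(result));
    }
}
```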

Node Hierarchy

DomTrip uses a clean, type-safe node hierarchy that reflects XML structure:

Node (abstract base)
├── ContainerNode (abstract)
│   ├── Document (root container)
│   └── Element (XML elements)
└── Leaf Nodes
    ├── Text (text content, CDATA)
    ├── Comment (XML comments)
    └── ProcessingInstruction (PIs)

Why This Design?

  1. Memory Efficiency: Leaf nodes don't waste memory on unused children collections
  2. Type Safety: Adding children to a text node is a compile-time error
  3. Clear API: Child management methods only exist where they make sense

// ✅ This works - Element can have children
Element parent = Element.of("parent");
parent.addNode(Text.of("content"));

// Text nodes cannot have children (compile-time safety)
Text text = Text.of("content");
// text.addNode(...); // Would not compile
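
To illustrate why this layout gives compile-time safety, here is a simplified, hypothetical model of such a hierarchy. It is a sketch under assumptions, not DomTrip's actual source: the class NodeHierarchySketch and its nested types only mirror the names in the diagram above. The point is that the children list and addNode live on the container type alone, so leaf types have nothing to call.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical model for illustration; not DomTrip's real classes.
final class NodeHierarchySketch {

    /** Base type: holds only what every node needs. */
    abstract static class Node {
        ContainerNode parent;
    }

    /** Only containers carry a children list and child-management methods. */
    abstract static class ContainerNode extends Node {
        private final List<Node> children = new ArrayList<>();

        void addNode(Node child) {
            child.parent = this;
            children.add(child);
        }

        List<Node> nodes() {
            return Collections.unmodifiableList(children);
        }
    }

    static final class Element extends ContainerNode {
        final String name;

        Element(String name) {
            this.name = name;
        }
    }

    /** Leaf node: no children field, so there is no addNode method to call. */
    static final class Text extends Node {
        final String content;

        Text(String content) {
            this.content = content;
        }
    }

    public static void main(String[] args) {
        Element parent = new Element("parent");
        parent.addNode(new Text("content"));
        System.out.println("children: " + parent.nodes().size());
        // new Text("x").addNode(...) is rejected by the compiler: Text has no such method
    }
}
```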

Modification Tracking

Every node tracks whether it has been modified since parsing. This enables minimal-change serialization, where only modified nodes are regenerated and everything else is written back exactly as parsed:

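// doc is a parsed Document whose root contains <groupId> and <version>; editor is the Editor wrapping it
// Unmodified nodes use original formatting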
Element unchanged = doc.root().child("groupId").orElseThrow();
Assertions.assertFalse(unchanged.isModified()); // false

// Modified nodes are rebuilt with inferred formatting
Element changed = doc.root().child("version").orElseThrow();
editor.setTextContent(changed, "2.0.0");
Assertions.assertTrue(changed.isModified()); // true
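
One way to picture the mechanism is a node that keeps the exact source characters it was parsed from, plus a dirty flag. The sketch below is a hypothetical simplification (the class TrackedElementSketch and its fields are assumptions, not DomTrip internals): serialization returns the original text while the flag is clear and rebuilds the markup only after a change.

```java
// Hypothetical simplification of modification tracking; not DomTrip's implementation.
final class TrackedElementSketch {
    private final String name;
    private final String originalSource; // exact characters from the parsed input
    private String text;
    private boolean modified;

    TrackedElementSketch(String name, String text, String originalSource) {
        this.name = name;
        this.text = text;
        this.originalSource = originalSource;
    }

    void setText(String newText) {
        if (!newText.equals(text)) {
            text = newText;
            modified = true; // from now on the original source is stale
        }
    }

    boolean isModified() {
        return modified;
    }

    String toXml() {
        if (!modified) {
            return originalSource; // byte-for-byte what was parsed
        }
        // Rebuilt with default formatting and minimal escaping
        String escaped = text.replace("&", "&amp;").replace("<", "&lt;");
        return "<" + name + ">" + escaped + "</" + name + ">";
    }

    public static void main(String[] args) {
        TrackedElementSketch version =
                new TrackedElementSketch("version", "1.0", "<version >1.0</version >");
        System.out.println(version.toXml()); // unusual spacing is kept
        version.setText("2.0.0");
        System.out.println(version.toXml()); // rebuilt after the change
    }
}
```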

Dual Content Storage

Text nodes store content in two forms:

  1. Decoded Content: For your application logic
  2. Raw Content: For preservation during serialization

// Original XML: <message>Hello &amp; goodbye</message>
String xml = "<message>Hello &amp; goodbye</message>";
Document doc = Document.of(xml);
Element element = doc.root();

// For your code - entities are decoded
String decoded = element.textContent(); // "Hello & goodbye"

// For serialization - entities are preserved in XML output
String result = doc.toXml(); // Contains "Hello &amp; goodbye"

This allows you to work with normal strings while preserving entity encoding.
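
The idea can be sketched with a small class that holds both forms side by side. This is an assumed, simplified model (DualTextSketch is a made-up name, and only the five predefined XML entities are handled): reads return the decoded string, serialization returns the raw one, and setting new content re-encodes it.

```java
// Hypothetical sketch of dual content storage; handles only the predefined XML entities.
final class DualTextSketch {
    private String raw;     // as written in the XML source
    private String decoded; // what application code sees

    private DualTextSketch(String raw, String decoded) {
        this.raw = raw;
        this.decoded = decoded;
    }

    static DualTextSketch fromRaw(String raw) {
        return new DualTextSketch(raw, decode(raw));
    }

    String content() {
        return decoded;
    }

    String raw() {
        return raw;
    }

    /** New content has no original raw form, so it is encoded from scratch. */
    void setContent(String text) {
        decoded = text;
        raw = encode(text);
    }

    static String decode(String raw) {
        // &amp; is replaced last so that "&amp;lt;" becomes "&lt;", not "<"
        return raw.replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&apos;", "'")
                .replace("&amp;", "&");
    }

    static String encode(String text) {
        // & is replaced first so the entities produced below are not re-escaped
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        DualTextSketch text = DualTextSketch.fromRaw("Hello &amp; goodbye");
        System.out.println(text.content());
        System.out.println(text.raw());
    }
}
```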

Attribute Handling

Attributes are first-class objects that preserve formatting details such as quote style and preceding whitespace:

String xml = "<dependency scope='test'></dependency>";
Editor editor = new Editor(Document.of(xml));
Element element = editor.root();

// Access attribute as object for detailed information
Attribute scope = element.attributeObject("scope");

String value = scope.value(); // "test"
QuoteStyle quoteStyle = scope.quoteStyle(); // QuoteStyle.SINGLE
String whitespace = scope.precedingWhitespace(); // Whitespace before attribute
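
As a rough sketch of why storing these details matters, the hypothetical class below (AttributeSketch and its Quote enum are illustrative names, not DomTrip types) serializes an attribute with the quote character and leading whitespace it was given, so a value change does not disturb the surrounding formatting.

```java
// Hypothetical model of a formatting-preserving attribute; not DomTrip's Attribute class.
final class AttributeSketch {

    enum Quote {
        SINGLE('\'', "&apos;"),
        DOUBLE('"', "&quot;");

        final char ch;
        final String escape;

        Quote(char ch, String escape) {
            this.ch = ch;
            this.escape = escape;
        }
    }

    private final String name;
    private String value;
    private final Quote quote;
    private final String precedingWhitespace;

    AttributeSketch(String name, String value, Quote quote, String precedingWhitespace) {
        this.name = name;
        this.value = value;
        this.quote = quote;
        this.precedingWhitespace = precedingWhitespace;
    }

    void setValue(String value) {
        this.value = value;
    }

    String toXml() {
        // Escape only what is required inside this particular quote style
        String escaped = value.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(String.valueOf(quote.ch), quote.escape);
        return precedingWhitespace + name + "=" + quote.ch + escaped + quote.ch;
    }

    public static void main(String[] args) {
        AttributeSketch scope = new AttributeSketch("scope", "test", Quote.SINGLE, " ");
        scope.setValue("compile");
        System.out.println("<dependency" + scope.toXml() + "/>");
    }
}
```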

Whitespace Management

DomTrip tracks whitespace at multiple levels:

1. Node-Level Whitespace

public abstract class Node {
    protected String precedingWhitespace;  // Before the node
    // Note: followingWhitespace has been removed - whitespace is now stored
    // as precedingWhitespace of the next node for a cleaner model
}

2. Element-Level Whitespace

public class Element extends ContainerNode {
    private String openTagWhitespace;   // Inside opening tag: <element >
    private String closeTagWhitespace;  // Inside closing tag: </element >
}

3. Intelligent Inference

For new content, DomTrip infers formatting from surrounding context:

Document doc = Document.of(
        """
    <project>
        <groupId>com.example</groupId>
        <artifactId>my-app</artifactId>
    </project>
    """);

Editor editor = new Editor(doc);
Element artifactId = doc.root().child("artifactId").orElseThrow();

// Insert version between groupId and artifactId
// Whitespace is automatically inferred from surrounding elements
Element version = editor.insertElementBefore(artifactId, "version");
editor.setTextContent(version, "1.0.0");

String result = editor.toXml();
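
A simple inference strategy, sketched below under assumptions, is to give a newly inserted node the same preceding whitespace as the sibling it is placed before. The class WhitespaceInferenceSketch is hypothetical and much simpler than whatever DomTrip does internally, but it shows how storing whitespace on the following node makes this kind of inference straightforward.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of whitespace inference; not DomTrip's algorithm.
final class WhitespaceInferenceSketch {

    static final class Child {
        final String precedingWhitespace;
        final String xml;

        Child(String precedingWhitespace, String xml) {
            this.precedingWhitespace = precedingWhitespace;
            this.xml = xml;
        }
    }

    private final String name;
    private final String closingWhitespace; // whitespace before the closing tag
    private final List<Child> children = new ArrayList<>();

    WhitespaceInferenceSketch(String name, String closingWhitespace) {
        this.name = name;
        this.closingWhitespace = closingWhitespace;
    }

    /** Adds a child as it was parsed, with its original whitespace. */
    void addParsed(String precedingWhitespace, String xml) {
        children.add(new Child(precedingWhitespace, xml));
    }

    /** Inserts new content, copying the whitespace of the sibling it goes before. */
    void insertBefore(int index, String xml) {
        String inferred = children.get(index).precedingWhitespace;
        children.add(index, new Child(inferred, xml));
    }

    String toXml() {
        StringBuilder sb = new StringBuilder("<").append(name).append(">");
        for (Child child : children) {
            sb.append(child.precedingWhitespace).append(child.xml);
        }
        return sb.append(closingWhitespace).append("</").append(name).append(">").toString();
    }

    public static void main(String[] args) {
        WhitespaceInferenceSketch project = new WhitespaceInferenceSketch("project", "\n");
        project.addParsed("\n    ", "<groupId>com.example</groupId>");
        project.addParsed("\n    ", "<artifactId>my-app</artifactId>");
        project.insertBefore(1, "<version>1.0.0</version>");
        System.out.println(project.toXml());
    }
}
```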

Configuration System

DomTrip behavior is controlled through DomTripConfig:

// Preset configurations
DomTripConfig defaults = DomTripConfig.defaults(); // Maximum preservation
DomTripConfig pretty = DomTripConfig.prettyPrint(); // Clean output
DomTripConfig minimal = DomTripConfig.minimal(); // Compact output

// Custom configuration
DomTripConfig custom = DomTripConfig.defaults()
        .withIndentString("  ") // 2 spaces
        .withWhitespacePreservation(true) // Keep original whitespace
        .withCommentPreservation(true) // Keep comments
        .withDefaultQuoteStyle(QuoteStyle.DOUBLE); // Prefer double quotes

Navigation

DomTrip provides multiple ways to navigate XML structures:

1. Traditional Navigation

Element root = editor.getDocumentElement();
Element child = root.getChild("child-name");
List<Element> children = root.getChildren("item");

2. Optional-Based Navigation

String xml = "<root><child>value</child></root>";
Editor editor = new Editor(Document.of(xml));
Element root = editor.root();

Optional<Element> child = root.child("child");
child.ifPresent(element -> {
    // Safe navigation - no null checks needed
    String value = element.textContent();
    Assertions.assertEquals("value", value);
});

3. Stream-Based Navigation

String xml = createConfigXml();
Document doc = Document.of(xml);
Editor editor = new Editor(doc);

// Stream-based navigation
editor.root()
        .descendants()
        .filter(e -> e.name().equals("port"))
        .findFirst()
        .ifPresent(port -> System.out.println("Port: " + port.textContent()));
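
For readers curious how a descendants() stream can be built, here is a minimal sketch using java.util.stream. The Elem type is a hypothetical stand-in for an element, and the traversal order (depth-first, document order, excluding the starting element) is an assumption about how such a method would typically behave rather than a statement about DomTrip's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical sketch of a recursive descendants() stream; not DomTrip's code.
final class DescendantStreamSketch {

    static final class Elem {
        final String name;
        final String text;
        final List<Elem> children = new ArrayList<>();

        Elem(String name, String text) {
            this.name = name;
            this.text = text;
        }

        Elem add(Elem child) {
            children.add(child);
            return this; // allows chained construction
        }

        /** Depth-first stream of every element below this one, excluding itself. */
        Stream<Elem> descendants() {
            return children.stream()
                    .flatMap(child -> Stream.concat(Stream.of(child), child.descendants()));
        }
    }

    public static void main(String[] args) {
        Elem config = new Elem("config", "")
                .add(new Elem("server", "").add(new Elem("port", "8080")))
                .add(new Elem("database", "").add(new Elem("port", "5432")));

        config.descendants()
                .filter(e -> e.name.equals("port"))
                .findFirst()
                .ifPresent(port -> System.out.println("Port: " + port.text));
    }
}
```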

4. Namespace-Aware Navigation

String xml =
        """
    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:custom="http://example.com/custom">
        <groupId>com.example</groupId>
        <custom:metadata>
            <custom:author>John Doe</custom:author>
        </custom:metadata>
    </project>
    """;

Document doc = Document.of(xml);
Editor editor = new Editor(doc);

Element root = doc.root();

// Find elements by qualified name (prefix:localName)
Element metadata = root.child("custom:metadata").orElseThrow();
Element author = metadata.child("custom:author").orElseThrow();

String authorName = author.textContent(); // "John Doe"

Error Handling

DomTrip uses specific exception types for better error handling:

try {
    // Attempt to parse malformed XML
    String malformedXml = "<root><unclosed>";
    Document doc = Document.of(malformedXml);

    // This won't be reached due to parsing error
    Editor editor = new Editor(doc);
} catch (Exception e) {
    // Handle parsing errors gracefully
    System.err.println("XML parsing failed: " + e.getMessage());

    // Provide fallback or user-friendly error message
    System.out.println("Please check your XML syntax and try again.");
}

// Safe navigation with Optional
String xml = createConfigXml();
Document doc = Document.of(xml);
Editor editor = new Editor(doc);

editor.root()
        .descendant("nonexistent")
        .ifPresentOrElse(
                element -> System.out.println("Found: " + element.name()),
                () -> System.out.println("Element not found - using default behavior"));
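
The pattern of catching a specific parse exception and converting it to an Optional can be sketched with the JDK's own parser, since the exact DomTrip exception class names are not listed here. Everything below uses javax.xml and org.xml.sax rather than DomTrip. The helper name tryParse is made up, and Document in this block refers to org.w3c.dom.Document.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative only: JDK XML APIs, not DomTrip.
final class SafeParseSketch {

    /** Returns the parsed document, or empty if the input is not well-formed. */
    static Optional<Document> tryParse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr itself
            builder.setErrorHandler(new DefaultHandler());
            return Optional.of(builder.parse(new InputSource(new StringReader(xml))));
        } catch (SAXParseException e) {
            // Specific type: carries the location of the syntax error
            System.err.println("Malformed XML at line " + e.getLineNumber()
                    + ", column " + e.getColumnNumber() + ": " + e.getMessage());
            return Optional.empty();
        } catch (SAXException | IOException | ParserConfigurationException e) {
            // Anything else is an environment or configuration problem
            throw new IllegalStateException("XML parser failure", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParse("<root><unclosed>").isPresent());
        System.out.println(tryParse("<root><child>value</child></root>").isPresent());
    }
}
```

Catching the narrow type first keeps syntax errors, which callers can recover from, separate from failures that indicate a broken setup.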

Next Steps

Now that you understand the core concepts, explore specific features: