Basic Concepts

Understanding DomTrip's core concepts will help you use the library effectively. This guide covers the fundamental ideas behind DomTrip's design and how they differ from traditional XML libraries.

The Lossless Philosophy

Traditional XML libraries focus on data extraction: they parse XML to get the information you need and often discard formatting details in the process. DomTrip takes a different approach and puts preservation first.

String xml = "<project><version>1.0</version></project>";

// DomTrip approach (preservation-focused)
Editor editor = new Editor(Document.of(xml));
Element root = editor.root();
Element version = root.child("version").orElseThrow(); // fail fast instead of risking a null dereference
String value = version.textContent(); // "1.0"
String result = editor.toXml(); // Identical to original if unchanged
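
For contrast, the sketch below round-trips a small document through the JDK's built-in DOM parser and an identity Transformer. It is not DomTrip code, and the class name DomRoundTripSketch is made up for illustration. The W3C DOM does not record quote style or whitespace inside tags, so the serialized output is expected to differ from the input even though nothing was edited.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;

// Illustrative only: uses the JDK's DOM and Transformer APIs, not DomTrip.
class DomRoundTripSketch {
    // Parses the XML into a W3C DOM tree and serializes it straight back.
    static String roundTrip(String xml) {
        try {
            org.w3c.dom.Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }

    public static void main(String[] args) {
        // Single quotes and extra spaces inside tags are legal XML
        String original = "<project ><version   id='v'>1.0</version></project >";
        String result = roundTrip(original);
        System.out.println("input:     " + original);
        System.out.println("output:    " + result);
        System.out.println("identical: " + result.equals(original));
    }
}
```

The data survives the round trip, but the exact characters of the source do not. That gap is what a preservation-first model is meant to close.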

Node Hierarchy

DomTrip uses a clean, type-safe node hierarchy that reflects XML structure:

Node (abstract base)
├── ContainerNode (abstract)
│   ├── Document (root container)
│   └── Element (XML elements)
└── Leaf Nodes
    ├── Text (text content, CDATA)
    ├── Comment (XML comments)
    └── ProcessingInstruction (PIs)

Why This Design?

Memory Efficiency: Leaf nodes don't waste memory on unused child collections
Type Safety: Adding children to a text node is a compile-time error, not a runtime failure
Clear API: Child management methods only exist where they make sense

// ✅ This works - Element can have children
Element parent = Element.of("parent");
parent.addNode(Text.of("content"));

// Text nodes cannot have children (compile-time safety)
Text text = Text.of("content");
// text.addNode(...); // Would not compile
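
The idea behind this hierarchy can be sketched in a few lines of plain Java. The classes below are hypothetical stand-ins, not DomTrip's real types. They only show how putting child management on the container base class keeps it off leaf nodes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature of the hierarchy; not DomTrip's actual classes.
class HierarchySketch {
    abstract static class SketchNode {
        SketchContainer parent; // set when the node is attached
    }

    // Only containers own a child list and expose addNode.
    abstract static class SketchContainer extends SketchNode {
        private final List<SketchNode> children = new ArrayList<>();

        void addNode(SketchNode child) {
            child.parent = this;
            children.add(child);
        }

        List<SketchNode> children() {
            return List.copyOf(children);
        }
    }

    static final class SketchElement extends SketchContainer {
        final String name;

        SketchElement(String name) {
            this.name = name;
        }
    }

    // A leaf: no child list is allocated and no addNode method exists.
    static final class SketchText extends SketchNode {
        final String content;

        SketchText(String content) {
            this.content = content;
        }
    }

    static int demo() {
        SketchElement parent = new SketchElement("parent");
        parent.addNode(new SketchText("content"));
        // new SketchText("x").addNode(...) would be rejected by the compiler
        return parent.children().size();
    }

    public static void main(String[] args) {
        System.out.println("children: " + demo());
    }
}
```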

Modification Tracking

Every node tracks whether it has been modified since parsing. This enables minimal-change serialization:

// Sample input (illustrative)
String xml = "<project><groupId>com.example</groupId><version>1.0.0</version></project>";
Document doc = Document.of(xml);
Editor editor = new Editor(doc);

// Unmodified nodes use original formatting
Element unchanged = doc.root().child("groupId").orElseThrow();
Assertions.assertFalse(unchanged.isModified());

// Modified nodes are rebuilt with inferred formatting
Element changed = doc.root().child("version").orElseThrow();
editor.setTextContent(changed, "2.0.0");
Assertions.assertTrue(changed.isModified());
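
One way to picture minimal-change serialization is a node that keeps the exact source text it was parsed from, together with a modified flag. The sketch below is an assumption about the general technique, with made-up names (ModificationSketch, TrackedElement). It is not DomTrip's internal implementation.

```java
// Hypothetical sketch of modification tracking; not DomTrip internals.
class ModificationSketch {
    static final class TrackedElement {
        private final String name;
        private final String originalSource; // exact characters captured at parse time
        private String text;
        private boolean modified;

        TrackedElement(String name, String text, String originalSource) {
            this.name = name;
            this.text = text;
            this.originalSource = originalSource;
        }

        void setText(String newText) {
            this.text = newText;
            this.modified = true; // any edit invalidates the captured source
        }

        boolean isModified() {
            return modified;
        }

        // Unmodified: replay the source verbatim. Modified: rebuild the markup.
        // (Escaping of special characters is omitted to keep the sketch short.)
        String toXml() {
            if (!modified) {
                return originalSource;
            }
            return "<" + name + ">" + text + "</" + name + ">";
        }
    }

    public static void main(String[] args) {
        TrackedElement version =
                new TrackedElement("version", "1.0.0", "<version  >1.0.0</version >");
        System.out.println(version.toXml()); // original spacing is kept
        version.setText("2.0.0");
        System.out.println(version.toXml()); // rebuilt from name and text
    }
}
```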

Dual Content Storage

Text nodes store content in two forms:

Decoded Content: For your application logic
Raw Content: For preservation during serialization

// Original XML: <message>Hello &amp; goodbye</message>
String xml = "<message>Hello &amp; goodbye</message>";
Document doc = Document.of(xml);
Element element = doc.root();

// For your code - entities are decoded
String decoded = element.textContent(); // "Hello & goodbye"

// For serialization - entities are preserved in XML output
String result = doc.toXml(); // Contains "Hello &amp; goodbye"

This allows you to work with normal strings while preserving entity encoding.
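
A minimal sketch of the dual-storage idea, assuming a text node that keeps the raw source string and derives the decoded form from it. The names are hypothetical, and only the five predefined XML entities are handled (no numeric character references).

```java
// Hypothetical sketch of dual content storage; not DomTrip's Text class.
class DualContentSketch {
    static final class SketchText {
        final String raw;     // as written in the source, used for serialization
        final String decoded; // entity-free form for application logic

        SketchText(String raw) {
            this.raw = raw;
            this.decoded = decode(raw);
        }
    }

    // Decodes the five predefined XML entities. "&amp;" is replaced last so
    // that "&amp;lt;" becomes "&lt;" and is not decoded a second time.
    static String decode(String raw) {
        return raw.replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&apos;", "'")
                .replace("&amp;", "&");
    }

    public static void main(String[] args) {
        SketchText text = new SketchText("Hello &amp; goodbye");
        System.out.println(text.decoded); // Hello & goodbye
        System.out.println(text.raw);     // Hello &amp; goodbye
    }
}
```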

Attribute Handling

Attributes are first-class objects that preserve formatting details:

String xml = "<dependency scope='test'></dependency>";
Editor editor = new Editor(Document.of(xml));
Element element = editor.root();

// Access attribute as object for detailed information
Attribute scope = element.attributeObject("scope");

String value = scope.value(); // "test"
QuoteStyle quoteStyle = scope.quoteStyle(); // QuoteStyle.SINGLE
String whitespace = scope.precedingWhitespace(); // Whitespace before attribute
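
To show why these details matter for round-tripping, here is a small sketch of an attribute object that serializes itself from the stored quote style and preceding whitespace. The types Attr and Quote are invented for this example and only approximate what a preservation-aware attribute needs to carry.

```java
// Hypothetical sketch of a formatting-aware attribute; not DomTrip's Attribute.
class AttributeSketch {
    enum Quote {
        SINGLE('\''),
        DOUBLE('"');

        final char ch;

        Quote(char ch) {
            this.ch = ch;
        }
    }

    static final class Attr {
        final String name;
        final String rawValue;            // value as written, entities untouched
        final Quote quote;                // which quote character the source used
        final String precedingWhitespace; // whitespace between previous token and name

        Attr(String name, String rawValue, Quote quote, String precedingWhitespace) {
            this.name = name;
            this.rawValue = rawValue;
            this.quote = quote;
            this.precedingWhitespace = precedingWhitespace;
        }

        // Reproduces the attribute exactly as it appeared in the source.
        String toXml() {
            return precedingWhitespace + name + "=" + quote.ch + rawValue + quote.ch;
        }
    }

    public static void main(String[] args) {
        Attr scope = new Attr("scope", "test", Quote.SINGLE, " ");
        System.out.println("<dependency" + scope.toXml() + "></dependency>");
    }
}
```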

Whitespace Management

DomTrip tracks whitespace at multiple levels:

1. Node-Level Whitespace

public abstract class Node {
    protected String precedingWhitespace;  // Before the node
    // Whitespace after a node is stored as the precedingWhitespace of the
    // next node, so there is no separate followingWhitespace field
}

2. Element-Level Whitespace

public class Element extends ContainerNode {
    private String openTagWhitespace;   // Inside opening tag: <element >
    private String closeTagWhitespace;  // Inside closing tag: </element >
}

3. Intelligent Inference

For new content, DomTrip infers formatting from surrounding context:

// Existing structure with indentation
String xml =
        """
    <dependencies>
        <dependency>existing</dependency>
    </dependencies>
    """;

Editor editor = new Editor(Document.of(xml));
Element dependencies = editor.root();

// Adding new dependency automatically infers indentation
Element newDep = editor.addElement(dependencies, "dependency");
editor.setTextContent(newDep, "new");

String result = editor.toXml();
// Result uses same indentation as existing dependencies
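
The inference step can be approximated with plain string handling. The sketch below assumes the simplest possible rule, which is to reuse the whitespace that precedes the first existing child. It operates on the text between the parent's tags. DomTrip's real heuristics are not shown in this guide and may be more involved, and the helper names here are hypothetical.

```java
// Hypothetical sketch of indentation inference; not DomTrip's algorithm.
class IndentInferenceSketch {
    // Whitespace that precedes the first child, e.g. "\n    ".
    static String leadingWhitespace(String innerXml) {
        int i = 0;
        while (i < innerXml.length() && Character.isWhitespace(innerXml.charAt(i))) {
            i++;
        }
        return innerXml.substring(0, i);
    }

    // Appends childXml as the last child, indented like the first child, and
    // keeps the whitespace that sits before the parent's closing tag.
    static String appendChild(String innerXml, String childXml) {
        String indent = leadingWhitespace(innerXml);
        int end = innerXml.length();
        while (end > 0 && Character.isWhitespace(innerXml.charAt(end - 1))) {
            end--;
        }
        String body = innerXml.substring(0, end);
        String tail = innerXml.substring(end);
        return body + indent + childXml + tail;
    }

    public static void main(String[] args) {
        String inner = "\n    <dependency>existing</dependency>\n";
        String updated = appendChild(inner, "<dependency>new</dependency>");
        System.out.println("<dependencies>" + updated + "</dependencies>");
    }
}
```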

Configuration System

DomTrip behavior is controlled through DomTripConfig:

// Preset configurations
DomTripConfig defaults = DomTripConfig.defaults(); // Maximum preservation
DomTripConfig pretty = DomTripConfig.prettyPrint(); // Clean output
DomTripConfig minimal = DomTripConfig.minimal(); // Compact output

// Custom configuration
DomTripConfig custom = DomTripConfig.defaults()
        .withIndentString("  ") // 2 spaces
        .withCommentPreservation(true) // Keep comments
        .withDefaultQuoteStyle(QuoteStyle.DOUBLE); // Prefer double quotes

Navigation

DomTrip provides multiple ways to navigate XML structures:

Element root = editor.getDocumentElement();
Element child = root.getChild("child-name");
List<Element> children = root.getChildren("item");

String xml = "<root><child>value</child></root>";
Editor editor = new Editor(Document.of(xml));
Element root = editor.root();

Optional<Element> child = root.child("child");
child.ifPresent(element -> {
    // Safe navigation - no null checks needed
    String value = element.textContent();
    Assertions.assertEquals("value", value);
});

String xml = createConfigXml();
Document doc = Document.of(xml);
Editor editor = new Editor(doc);

// Stream-based navigation
editor.root()
        .descendants()
        .filter(e -> e.name().equals("port"))
        .findFirst()
        .ifPresent(port -> System.out.println("Port: " + port.textContent()));
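
For readers curious how a descendants() stream can be built, here is a self-contained sketch over a toy element type. It assumes depth-first document order and excludes the starting element. Whether DomTrip makes the same choices is not stated in this guide, and the El class is purely illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical sketch of stream-based descendant traversal; not DomTrip code.
class DescendantsSketch {
    static final class El {
        final String name;
        final String text;
        final List<El> children = new ArrayList<>();

        El(String name, String text) {
            this.name = name;
            this.text = text;
        }

        El add(El child) {
            children.add(child);
            return this; // allow chained construction
        }

        // Each child followed by that child's own descendants, recursively.
        Stream<El> descendants() {
            return children.stream()
                    .flatMap(c -> Stream.concat(Stream.of(c), c.descendants()));
        }
    }

    // Text of the first descendant with the given name, if any.
    static Optional<String> firstText(El root, String name) {
        return root.descendants()
                .filter(e -> e.name.equals(name))
                .findFirst()
                .map(e -> e.text);
    }

    public static void main(String[] args) {
        El config = new El("config", "")
                .add(new El("server", "")
                        .add(new El("host", "localhost"))
                        .add(new El("port", "8080")));
        firstText(config, "port").ifPresent(p -> System.out.println("Port: " + p));
    }
}
```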

String xml =
        """
    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:custom="http://example.com/custom">
        <groupId>com.example</groupId>
        <custom:metadata>
            <custom:author>John Doe</custom:author>
        </custom:metadata>
    </project>
    """;

Document doc = Document.of(xml);
Editor editor = new Editor(doc);

Element root = doc.root();

// Find elements by qualified name (prefix:localName)
Element metadata = root.child("custom:metadata").orElseThrow();
Element author = metadata.child("custom:author").orElseThrow();

String authorName = author.textContent(); // "John Doe"
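
A qualified name such as custom:metadata combines a prefix and a local name, and the prefix maps to a namespace URI through the xmlns declarations in scope. The helper below sketches that resolution with a plain map. The class and method names are made up, and the empty string is used as the key for the default namespace by assumption.

```java
import java.util.Map;

// Hypothetical sketch of qualified-name handling; not DomTrip's namespace API.
class QNameSketch {
    static String prefix(String qualifiedName) {
        int colon = qualifiedName.indexOf(':');
        return colon < 0 ? "" : qualifiedName.substring(0, colon);
    }

    static String localName(String qualifiedName) {
        int colon = qualifiedName.indexOf(':');
        return colon < 0 ? qualifiedName : qualifiedName.substring(colon + 1);
    }

    // Looks up the namespace URI for an element name. Unprefixed element
    // names fall back to the default namespace, stored under the key "".
    static String namespaceUri(Map<String, String> declarations, String qualifiedName) {
        return declarations.get(prefix(qualifiedName));
    }

    public static void main(String[] args) {
        Map<String, String> declarations = Map.of(
                "", "http://maven.apache.org/POM/4.0.0",
                "custom", "http://example.com/custom");
        System.out.println(localName("custom:metadata"));
        System.out.println(namespaceUri(declarations, "custom:metadata"));
        System.out.println(namespaceUri(declarations, "groupId"));
    }
}
```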

Error Handling

DomTrip uses specific exception types for better error handling:

String xml = "<root><child>value</child></root>";

// try-with-resources closes both streams even when parsing fails
try (InputStream inputStream = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        ByteArrayOutputStream outputStream = new ByteArrayOutputStream()) {
    Document doc = Document.of(inputStream);
    doc.toXml(outputStream);
} catch (DomTripException e) {
    if (e.getCause() instanceof IOException) {
        // Handle I/O errors
        System.err.println("I/O error: " + e.getMessage());
    } else {
        // Handle parsing/encoding errors
        System.err.println("XML error: " + e.getMessage());
    }
} catch (IOException e) {
    // close() declares IOException, so it must be handled here
    System.err.println("Failed to close stream: " + e.getMessage());
}
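
The cause check in the catch block relies on a common pattern: a library wraps the checked IOException in its own unchecked exception and keeps the original as the cause. The sketch below illustrates that pattern with a stand-in exception type, SketchXmlException. How DomTripException is actually defined is not covered here, so treat the details as assumptions.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of exception wrapping; SketchXmlException is not a DomTrip type.
class ErrorWrappingSketch {
    static final class SketchXmlException extends RuntimeException {
        SketchXmlException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Reads the whole stream as UTF-8 and converts I/O failures into the
    // library-style unchecked exception, preserving the original cause.
    static String readAll(InputStream in) {
        try {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new SketchXmlException("Failed to read XML input", e);
        }
    }

    // Callers can branch on the cause to separate I/O problems from XML problems.
    static String classify(SketchXmlException e) {
        return e.getCause() instanceof IOException ? "io" : "xml";
    }

    public static void main(String[] args) {
        InputStream broken = new InputStream() {
            @Override
            public int read() throws IOException {
                throw new IOException("simulated disk failure");
            }
        };
        try {
            readAll(broken);
        } catch (SketchXmlException e) {
            System.err.println(classify(e) + " error: " + e.getMessage());
        }
    }
}
```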

Next Steps

Now that you understand the core concepts, explore specific features:

🔄 Lossless Parsing - Deep dive into preservation
📝 Formatting Preservation - How formatting is maintained
🌐 Namespace Support - Working with XML namespaces
🏗️ Builder Patterns - Creating complex XML structures