Processing Instructions

DomTrip supports XML processing instructions (PIs) and preserves their exact formatting and content through parsing and serialization.

Overview

Processing instructions are special XML constructs that provide instructions to applications processing the XML document. They have the form <?target data?> and are commonly used for:

  • Stylesheet declarations - <?xml-stylesheet type="text/xsl" href="style.xsl"?>
  • Application directives - <?php echo "Hello World"; ?>
  • Processing hints - <?sort alpha-ascending?>
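
To make the <?target data?> form concrete, the sketch below splits each document-level PI into its target and data. It deliberately uses the JDK's built-in DOM parser (org.w3c.dom), not DomTrip, so it only illustrates the structure of a PI and says nothing about DomTrip's own API. The class name PiAnatomy and the method describe are hypothetical names chosen for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.ProcessingInstruction;

// Hypothetical helper for illustration; uses the JDK DOM parser, not DomTrip.
final class PiAnatomy {

    /** Lists "target -> data" for each document-level PI, in document order. */
    static List<String> describe(String xml) {
        try {
            org.w3c.dom.Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<String> out = new ArrayList<>();
            NodeList children = doc.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node node = children.item(i);
                // The XML declaration is not reported as a PI node by the DOM.
                if (node.getNodeType() == Node.PROCESSING_INSTRUCTION_NODE) {
                    ProcessingInstruction pi = (ProcessingInstruction) node;
                    out.add(pi.getTarget() + " -> " + pi.getData());
                }
            }
            return out;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\"?>"
                + "<?xml-stylesheet type=\"text/xsl\" href=\"style.xsl\"?>"
                + "<?sort alpha-ascending?>"
                + "<root/>";
        describe(xml).forEach(System.out::println);
    }
}
```

The target is the name immediately after "<?", and the data is everything after the following whitespace up to the closing "?>".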

Key Features

  • Perfect preservation - Original formatting and content maintained
  • Target and data access - Separate access to PI components
  • Modification support - Change target and data while preserving structure
  • Position awareness - Maintain PI placement in document structure

Basic Usage

Creating Processing Instructions

// Create a new processing instruction
ProcessingInstruction pi = ProcessingInstruction.of("xml-stylesheet", "type=\"text/xsl\" href=\"style.xsl\"");

// Access components
String target = pi.target(); // "xml-stylesheet"
String data = pi.data(); // "type=\"text/xsl\" href=\"style.xsl\""
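
The data of a PI is a single opaque string; the name="value" pairs in an xml-stylesheet PI are only a convention (often called pseudo-attributes) that the consuming application has to parse itself. The following is a minimal sketch of such parsing with a regular expression. PseudoAttributes and parse are hypothetical names, not part of DomTrip, and the pattern is a simplification that does not implement the full XML Name rules.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of DomTrip.
final class PseudoAttributes {

    // Matches name="value" or name='value'. Simplified: not a full XML Name check.
    private static final Pattern PAIR =
            Pattern.compile("([A-Za-z_][A-Za-z0-9_.-]*)\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)')");

    /** Extracts pseudo-attributes from PI data, keeping their original order. */
    static Map<String, String> parse(String data) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(data);
        while (m.find()) {
            // Group 2 holds a double-quoted value, group 3 a single-quoted one.
            String value = m.group(2) != null ? m.group(2) : m.group(3);
            result.put(m.group(1), value);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> attrs = parse("type=\"text/xsl\" href=\"style.xsl\"");
        System.out.println(attrs.get("type")); // text/xsl
        System.out.println(attrs.get("href")); // style.xsl
    }
}
```

The string passed to parse here is the same value that pi.data() returns in the snippet above.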

Parsing Documents with Processing Instructions

String xml =
        """
    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="transform.xsl"?>
    <?sort-order alpha-ascending?>
    <root>
        <data>content</data>
    </root>
    """;

Document doc = Document.of(xml);
Editor editor = new Editor(doc);

// Processing instructions are preserved exactly
String result = editor.toXml();
// Result maintains all PIs in their original positions

Working with Processing Instructions

Finding Processing Instructions

// Requires: java.util.List, java.util.Optional, java.util.stream.Collectors
String xmlWithPIs =
        """
    <?xml version="1.0"?>
    <?stylesheet type="text/xsl" href="style.xsl"?>
    <root>content</root>
    """;
Document doc = Document.of(xmlWithPIs);

// Find all processing instructions in document (XML declaration is stored separately)
List<ProcessingInstruction> pis = doc.nodes()
        .filter(node -> node instanceof ProcessingInstruction)
        .map(node -> (ProcessingInstruction) node)
        .collect(Collectors.toList());

// Find specific PI by target
Optional<ProcessingInstruction> stylesheet = doc.nodes()
        .filter(node -> node instanceof ProcessingInstruction)
        .map(node -> (ProcessingInstruction) node)
        .filter(pi -> "stylesheet".equals(pi.target()))
        .findFirst();

Modifying Processing Instructions

ProcessingInstruction pi = ProcessingInstruction.of("target", "old-data");

// Modify target and data
pi.target("new-target");
pi.data("new-data with parameters");

// Get updated content
String target = pi.target(); // "new-target"
String data = pi.data(); // "new-data with parameters"

Advanced Features

Processing Instructions with Special Characters

// Requires: java.io.ByteArrayInputStream, java.io.ByteArrayOutputStream,
// java.io.InputStream, java.nio.charset.StandardCharsets
String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<?note àáâãäå èéêë?>"
        + "<root><text>Special: àáâãäå èéêë</text></root>";

// Round-trip through UTF-8 byte streams
InputStream inputStream = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
Document doc = Document.of(inputStream);

ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
doc.toXml(outputStream);
// Special characters in the PI data and in the element text are preserved
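
The snippet above stops once the bytes are written. Whether the special characters survive also depends on decoding those bytes with the same charset that was used to encode them. The sketch below shows that step with JDK classes only; CharsetRoundTrip and its methods are hypothetical names for this example, and no DomTrip call is involved.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; JDK only, no DomTrip calls.
final class CharsetRoundTrip {

    /** Writes the text to a byte stream as UTF-8, as a serializer would. */
    static byte[] writeUtf8(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);
        return out.toByteArray();
    }

    /** Decodes bytes with an explicit charset instead of the platform default. */
    static String read(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Unicode escapes for "à é" keep this source file ASCII-only.
        String original = "<?note \u00e0 \u00e9?><root/>";
        byte[] bytes = writeUtf8(original);

        // Same charset on both sides: the text comes back unchanged.
        System.out.println(original.equals(read(bytes, StandardCharsets.UTF_8)));
        // Mismatched charset: the accented characters are garbled.
        System.out.println(original.equals(read(bytes, StandardCharsets.ISO_8859_1)));
    }
}
```

With the earlier snippet, the equivalent check would compare the original string against the contents of outputStream decoded as UTF-8.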

Position and Whitespace Preservation

String xml =
        """
    <?xml version="1.0"?>

    <?stylesheet href="style.css"?>

    <root/>
    """;

Document doc = Document.of(xml);

// Find stylesheet PI
Optional<ProcessingInstruction> stylesheet = doc.nodes()
        .filter(node -> node instanceof ProcessingInstruction)
        .map(node -> (ProcessingInstruction) node)
        .filter(pi -> "stylesheet".equals(pi.target()))
        .findFirst();

// Whitespace around PIs is preserved in the document structure

Common Use Cases

XML Stylesheet Declaration

// Add stylesheet PI to document
Document doc = Document.withRootElement("html");
Editor editor = new Editor(doc);

ProcessingInstruction stylesheet =
        ProcessingInstruction.of("xml-stylesheet", "type=\"text/xsl\" href=\"transform.xsl\"");

// Insert PI before root element
doc.addNode(stylesheet);

PHP Processing Instructions

String phpXml =
        """
    <?xml version="1.0"?>
    <?php
        $title = "Dynamic Title";
        echo "<title>$title</title>";
    ?>
    <html>
        <head></head>
        <body>Content</body>
    </html>
    """;

Document doc = Document.of(phpXml);
// PHP PI content is preserved exactly, including newlines and formatting

Application-Specific Instructions

// Custom processing instructions for application logic
ProcessingInstruction sortOrder = ProcessingInstruction.of("sort-order", "alpha-ascending");
ProcessingInstruction cacheHint = ProcessingInstruction.of("cache-duration", "3600");

// Add to document
Document doc = Document.withRootElement("data");
doc.addNode(sortOrder);
doc.addNode(cacheHint);

Best Practices

Do:

  • Use meaningful target names that identify the processing application
  • Include necessary data in a structured format
  • Preserve original formatting when possible
  • Use PIs for application-specific metadata

Avoid:

  • Using PIs for data that belongs in elements or attributes
  • Creating PIs with malformed syntax
  • Assuming all parsers will preserve PI content exactly
  • Using the reserved target name "xml" (in any letter case)
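
Two of the points above can be checked mechanically before a PI is created. The XML specification reserves the target "xml" in any letter case, and PI data cannot contain "?>" because that sequence ends the instruction. The following is a minimal sketch of such a pre-check. PiChecks and its methods are hypothetical names, not DomTrip API, and the target check is a simplification of the XML Name rules.

```java
// Hypothetical validation helper for illustration; not part of DomTrip.
final class PiChecks {

    /** True if the target is the reserved name "xml" in any letter case. */
    static boolean isReservedTarget(String target) {
        return "xml".equalsIgnoreCase(target);
    }

    /** True if the data can sit inside <?target data?> without closing it early. */
    static boolean isSafeData(String data) {
        return data != null && !data.contains("?>");
    }

    /** Throws IllegalArgumentException if target or data would yield a malformed PI. */
    static void requireValid(String target, String data) {
        // Simplified check: a real XML Name has further character restrictions.
        if (target == null || target.isEmpty()
                || target.chars().anyMatch(Character::isWhitespace)) {
            throw new IllegalArgumentException("PI target must be a non-empty name without whitespace");
        }
        if (isReservedTarget(target)) {
            throw new IllegalArgumentException("PI target \"" + target + "\" is reserved");
        }
        if (!isSafeData(data)) {
            throw new IllegalArgumentException("PI data must be non-null and must not contain \"?>\"");
        }
    }

    public static void main(String[] args) {
        requireValid("sort-order", "alpha-ascending"); // passes silently
        System.out.println(isReservedTarget("XML"));            // true
        System.out.println(isReservedTarget("xml-stylesheet")); // false
        System.out.println(isSafeData("echo 1; ?> trailing"));  // false
    }
}
```

Note that only the exact name "xml" is rejected here; a target such as "xml-stylesheet" is a different name and passes the check.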

Integration with Editor

Documents that contain processing instructions can be edited through DomTrip's Editor API like any other document:

// Create document and edit
Document doc = Document.withRootElement("config");
Editor editor = new Editor(doc);

// Editor operations modify the document
editor.addElement(editor.root(), "setting", "value");

// Document reflects changes
Element setting = doc.root().child("setting").orElse(null);

Performance Considerations

  • Memory efficient - PIs are stored as lightweight objects
  • Lazy parsing - PI content is parsed only when accessed
  • Minimal overhead - No performance impact when PIs are not used
  • Streaming friendly - Compatible with large document processing

Processing instructions in DomTrip combine exact preservation with programmatic access to target and data, which suits applications that must keep XML formatting intact while reading or modifying PI content.